JIT Compilation Gotchas: Mismatched Calling Conventions
I recently ran into an interesting issue with some JIT-compiled code that serves as a good reminder about the importance of consistent calling conventions, especially when working with mixed compilation environments.
The Setup
I am working on a system that uses LLVM to compile SQL queries to machine code. While debugging an issue, I added a function that logs double values along with file and line information:
extern "C" void logdouble(double x, decltype(__builtin_FILE()) file, decltype(__builtin_LINE()) line){
google::LogMessage(file, line, google::GLOG_INFO).stream() << x;
}
This is a utility function for logging from code that is generated and compiled at runtime (JIT-compiled). The code that generates IR calling this function looks something like this (with boilerplate removed to simplify it):
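// void_type, char_ptr_type, int32_type, double_ptr and TheModule come from the removed boilerplate.
llvm::IRBuilder<> *builder = getBuilder();
llvm::LLVMContext &context = getLLVMContext();
llvm::Type *double_t = llvm::Type::getDoubleTy(context);
// Load the double that should be logged.
llvm::Value *val_to_log = builder->CreateLoad(double_t, double_ptr);
// Declare the external function as void(double, char*, int32).
llvm::FunctionType *fType =
    llvm::FunctionType::get(void_type, {double_t, char_ptr_type, int32_type}, false);
std::string fName = "logdouble";
llvm::Function *f = llvm::Function::Create(fType, llvm::Function::ExternalLinkage, fName, TheModule);
// Emit the call, passing the value plus the file and line of this generator code.
builder->CreateCall(f, {val_to_log, CreateGlobalString(__builtin_FILE()), createInt32(__builtin_LINE())});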
The generated LLVM-IR looks something like this:
%i20 = load double, double* %i15, align 8
store i8* getelementptr inbounds ([64 x i8], [64 x i8]* @.str, i32 0, i32 0), i8** %globalStr, align 8
%i21 = load i8*, i8** %globalStr5, align 8
call void @logdouble(double %i20, i8* %i21, i32 276)
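The IR signature above is written by hand, so nothing forces it to agree with the C++ declaration of logdouble. A minimal sketch of compile-time checks that could sit next to the C++ definition is shown below. The aliases file_t and line_t and the helper line_arg_bits are hypothetical names introduced for illustration, and the sketch assumes a compiler that provides __builtin_FILE and __builtin_LINE (GCC or Clang).

```cpp
#include <limits>
#include <type_traits>

// Hypothetical aliases for the parameter types used by logdouble.
using file_t = decltype(__builtin_FILE());
using line_t = decltype(__builtin_LINE());

// The IR declares the first parameter as `double`: a 64-bit IEEE 754 value.
static_assert(std::numeric_limits<double>::is_iec559 && sizeof(double) == 8,
              "IR 'double' must match a 64-bit IEEE 754 C++ double");

// The IR declares the second parameter as i8*: a pointer to char.
static_assert(std::is_same<file_t, const char*>::value,
              "file parameter must be a const char*");

// The IR declares the third parameter as i32: a 32-bit integer.
static_assert(std::is_integral<line_t>::value && sizeof(line_t) == 4,
              "line parameter must be a 32-bit integer");

// Width in bits of the line parameter, assuming 8-bit bytes.
constexpr unsigned line_arg_bits() { return sizeof(line_t) * 8; }
```

These checks only cover the types. They would not have caught the bug described in this post, which comes from how the two sides pass those types, not from the types themselves.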
The Problem
Complete junk was being logged, and it took me a little while to realize that my call to the logging function itself was the problem. For example, I was seeing this:
I20250410 19:13:25.154513 1023324 my_file.cpp:228] 6.95334e-310
When I inspected the memory at the address contained in double_ptr, I saw 0x000000000000f83f, which is the IEEE 754 double representation of 1.5 (the bytes are shown in little-endian memory order; read as a 64-bit value it is 0x3FF8000000000000). However, when I inspected the value of x at a breakpoint in logdouble, I saw 6.95334e-310. So what was happening?
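To double-check a byte pattern like the one above, a small helper can dump the encoding of a double. This is a sketch for illustration only; bits_of and memory_bytes_hex are hypothetical names, not part of the system described here.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

static_assert(sizeof(double) == sizeof(std::uint64_t), "expects a 64-bit double");

// Returns the IEEE 754 bit pattern of d as a 64-bit integer.
std::uint64_t bits_of(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);  // well-defined way to reinterpret the bytes
    return bits;
}

// Returns the bytes of d in memory order as lowercase hex,
// which is how a debugger's raw memory view shows them.
std::string memory_bytes_hex(double d) {
    unsigned char raw[sizeof d];
    std::memcpy(raw, &d, sizeof d);
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (unsigned char b : raw) {
        out.push_back(digits[b >> 4]);
        out.push_back(digits[b & 0xF]);
    }
    return out;
}
```

For 1.5, bits_of returns 0x3FF8000000000000, and on a little-endian machine such as x86-64 memory_bytes_hex returns "000000000000f83f", matching what the debugger showed.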
When examining the assembly code for the JIT-compiled caller, I noticed something odd:
mov QWORD PTR [rsp+0x50],r15
fld QWORD PTR [rbx+r12*8]
movabs rax,0x7ffff3baa0e0
mov QWORD PTR [rsp+0x58],rax
mov rdi,QWORD PTR [rsp+0x58]
fstp QWORD PTR [rsp]
mov esi,0x114
call r13
The JIT-compiled code was using the old x87 floating-point stack (the fld and fstp instructions) and passing the double on the stack (the fstp to [rsp]) rather than in the expected SSE/AVX registers. According to the System V AMD64 ABI, the first floating-point argument should be passed in the %xmm0 register.
When I looked at the start of the logdouble function itself:
logdouble(double, char const*, unsigned int):
push rbp
mov rbp,rsp
push r15
push r14
push rbx
sub rsp,0x18
mov r14d,esi
lea rsi,[rip+0x10caed]
lea rbx,[rbp-0x30]
mov edx,0x443
mov r15,rdi
vmovsd QWORD PTR [rbp-0x20],xmm0
mov rdi,rbx
I could see it was expecting the double in %xmm0 (notice the vmovsd instruction that stores from xmm0 to the stack). This mismatch in calling conventions led to a subtle bug where the function received garbage values for the floating-point argument based on whatever happened to be in %xmm0.
So why was my JIT’d code not using SSE? A while ago, I was investigating a performance issue caused by AVX2 instructions, which is another story altogether, and hackily disabled all vector instructions in my JIT, including both SSE and AVX. Until now, this had not been a problem because I had never needed to pass floating-point values from JIT-compiled code to ahead-of-time compiled code.
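A cheap guard against this kind of accident is to inspect the feature string before handing it to the JIT. The sketch below assumes the features are expressed in LLVM's usual comma-separated "+feature"/"-feature" attribute format; how the features were actually disabled in my JIT is not shown in this post, and disables_baseline_sse is a hypothetical helper name.

```cpp
#include <sstream>
#include <string>

// Returns true if the attribute string turns off SSE or SSE2, which the
// System V AMD64 ABI relies on for passing floating-point arguments.
// Simplification: a later "+sse" that re-enables the feature is not considered.
bool disables_baseline_sse(const std::string& features) {
    std::istringstream in(features);
    std::string item;
    while (std::getline(in, item, ',')) {
        if (item == "-sse" || item == "-sse2") {
            return true;
        }
    }
    return false;
}
```

With a check like this, a string such as "-avx2,-avx" passes, while "-avx2,-avx,-sse2,-sse" would be rejected before any code is generated.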
The Solution
I re-enabled SSE and AVX instructions in my JIT and the generated code properly used the expected calling convention:
vmovsd xmm0,QWORD PTR [rbx+r12*8]
movabs rax,0x7ffff3baa0a0
mov QWORD PTR [rsp+0x50],rax
mov rdi,QWORD PTR [rsp+0x50]
movabs r13,0x7ffff584e520
mov esi,0xe4
call r13
Now the first argument (double x) is correctly passed in %xmm0, the second argument (file) in %rdi, and the third argument (line) in %esi, exactly as expected by the System V AMD64 ABI.
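It may look odd that the second argument lands in %rdi, the first integer register. The reason is that the System V AMD64 ABI hands out integer registers and SSE registers from two independent sequences. The toy model below sketches that rule for scalar arguments only; it ignores structs, varargs, and everything else in the real classification algorithm, and ArgClass and assign_registers are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified argument classes: integers and pointers versus float and double.
enum class ArgClass { Integer, Sse };

// Returns the register each argument would be passed in, or "stack"
// once the registers for its class are exhausted.
std::vector<std::string> assign_registers(const std::vector<ArgClass>& args) {
    static const char* const kIntRegs[] = {"rdi", "rsi", "rdx", "rcx", "r8", "r9"};
    static const char* const kSseRegs[] = {"xmm0", "xmm1", "xmm2", "xmm3",
                                           "xmm4", "xmm5", "xmm6", "xmm7"};
    std::size_t next_int = 0;  // next free integer register
    std::size_t next_sse = 0;  // next free SSE register
    std::vector<std::string> out;
    for (ArgClass c : args) {
        if (c == ArgClass::Integer) {
            out.push_back(next_int < 6 ? kIntRegs[next_int++] : "stack");
        } else {
            out.push_back(next_sse < 8 ? kSseRegs[next_sse++] : "stack");
        }
    }
    return out;
}
```

For logdouble, assign_registers({ArgClass::Sse, ArgClass::Integer, ArgClass::Integer}) yields xmm0, rdi, rsi. The 32-bit line argument uses the low half of rsi, which is why the assembly writes to esi.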
Key Takeaways
- Consistent compilation flags are crucial: When working with JIT compilation that interfaces with statically compiled code, make sure your compiler flags are consistent across both environments.
- Disabling SSE is a bad idea: SSE and SSE2 are part of the baseline x86_64 instruction set, and the System V AMD64 ABI passes floating-point arguments in SSE registers. Disabling them is silly and will break your ABI.
- Inspect the assembly: When dealing with subtle bugs in cross-module calls, especially in mixed JIT and AOT environments, looking at the actual assembly can quickly reveal calling convention mismatches.