JIT Compilation Gotchas: Mismatched Calling Conventions

I recently ran into an interesting issue with some JIT-compiled code that serves as a good reminder about the importance of consistent calling conventions, especially when working with mixed compilation environments.

The Setup

I am working on a system that uses LLVM to compile SQL queries to machine code. While debugging an issue, I added a function that logs double values along with file and line information:

extern "C" void logdouble(double x, decltype(__builtin_FILE()) file, decltype(__builtin_LINE()) line) {
    google::LogMessage(file, line, google::GLOG_INFO).stream() << x;
}

This is a utility function for logging from code that is generated and compiled at runtime (JIT-compiled). The code that generates the IR for a call to it looks something like this (with boilerplate removed to simplify it):

llvm::IRBuilder<> *builder = getBuilder();
llvm::LLVMContext &context = getLLVMContext();

llvm::Type *double_t = llvm::Type::getDoubleTy(context);

// Load the double to log; double_ptr comes from the omitted boilerplate.
llvm::Value *val_to_log = builder->CreateLoad(double_t, double_ptr);

// Signature of the callee: void logdouble(double, char *, int32).
llvm::FunctionType *fType =
    llvm::FunctionType::get(void_type, {double_t, char_ptr_type, int32_type}, false);

std::string fName = "logdouble";
llvm::Function *f =
    llvm::Function::Create(fType, llvm::Function::ExternalLinkage, fName, TheModule);

// Pass the value plus the file and line of this code-generation site.
builder->CreateCall(f, {val_to_log, CreateGlobalString(__builtin_FILE()), createInt32(__builtin_LINE())});

The generated LLVM-IR looks something like this:

%i20 = load double, double* %i15, align 8
store i8* getelementptr inbounds ([64 x i8], [64 x i8]* @.str, i32 0, i32 0), i8** %globalStr, align 8
%i21 = load i8*, i8** %globalStr5, align 8
call void @logdouble(double %i20, i8* %i21, i32 276)

The Problem

Complete junk was being logged, and it took me a little¹ while to realize that my call to the logging function itself was the problem. For example, I was seeing this:

I20250410 19:13:25.154513 1023324 my_file.cpp:228] 6.95334e-310

When I inspected the memory at the address contained in double_ptr, I saw 0x000000000000f83f. Those are the bytes in memory (little-endian) order; read as a 64-bit value they are 0x3ff8000000000000, which is the IEEE 754 double representation of 1.5. However, when I inspected the value of x at a breakpoint in logdouble, I saw 6.95334e-310. So what was happening?
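A quick way to double-check a byte dump like this is to copy the eight bytes, in memory order, into a double. The following is a minimal sketch, assuming a little-endian host with IEEE 754 doubles (true for x86-64); the names doubleFromBytes and kDumpedBytes are made up for illustration and are not part of the original system.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret eight bytes, given in memory order,
// as a double. Assumes a little-endian host with IEEE 754 doubles.
double doubleFromBytes(const std::array<std::uint8_t, 8>& bytes) {
    double value;
    static_assert(sizeof(value) == 8, "expects a 64-bit double");
    std::memcpy(&value, bytes.data(), sizeof(value));
    return value;
}

// The bytes seen at double_ptr, in memory order.
const std::array<std::uint8_t, 8> kDumpedBytes = {
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xf8, 0x3f};
```

Under those assumptions, doubleFromBytes(kDumpedBytes) yields 1.5, so the value in memory was fine and the corruption had to be happening somewhere between the load and the callee.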

When examining the assembly code for the JIT-compiled caller, I noticed something odd:

mov    QWORD PTR [rsp+0x50],r15
fld    QWORD PTR [rbx+r12*8]
movabs rax,0x7ffff3baa0e0
mov    QWORD PTR [rsp+0x58],rax
mov    rdi,QWORD PTR [rsp+0x58]
fstp   QWORD PTR [rsp]
mov    esi,0x114
call   r13

The JIT-compiled code was using the old x87 floating-point stack (with fld and fstp instructions) rather than the expected SSE/AVX registers. According to the System V AMD64 ABI, the first floating-point argument should be passed in the %xmm0 register.

When I looked at the start of the logdouble function itself:

logdouble(double, char const*, unsigned int):
    push   rbp
    mov    rbp,rsp
    push   r15
    push   r14
    push   rbx
    sub    rsp,0x18
    mov    r14d,esi
    lea    rsi,[rip+0x10caed]
    lea    rbx,[rbp-0x30]
    mov    edx,0x443
    mov    r15,rdi
    vmovsd QWORD PTR [rbp-0x20],xmm0
    mov    rdi,rbx

I could see it was expecting the double in %xmm0 (notice the vmovsd instruction that stores from xmm0 to the stack). The caller, however, had written the double to the stack at [rsp] and never touched %xmm0. This mismatch in calling conventions led to a subtle bug where the function received garbage for the floating-point argument: whatever happened to be in %xmm0 at the time of the call.
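As a side note, the garbage value itself carries a hint. The sketch below reinterprets a double as its raw 64-bit pattern and checks whether the upper half is 0x00007fff, the range where the addresses in the disassembly above (for example 0x7ffff3baa0e0) fall. This is only an inference that %xmm0 held leftover pointer-like data, not something I confirmed, and the function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Return the raw IEEE 754 bit pattern of a double.
std::uint64_t bitsOf(double value) {
    std::uint64_t bits;
    static_assert(sizeof(bits) == sizeof(value), "expects a 64-bit double");
    std::memcpy(&bits, &value, sizeof(bits));
    return bits;
}

// Hypothetical heuristic: is the upper 32 bits of the pattern 0x00007fff,
// i.e. does the double look like an address in the 0x7fff........ range?
bool looksLikeHighUserAddress(double value) {
    return (bitsOf(value) >> 32) == 0x7fffu;
}
```

The logged 6.95334e-310 is a subnormal that passes this check, while a genuine value such as 1.5 does not. A tiny subnormal showing up where an ordinary number was expected is often a sign that non-floating-point bits are being read as a double.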

So why was my JIT’d code not using SSE? A while ago, I was investigating a performance issue caused by AVX2 instructions, which is another story altogether, and hackily disabled all vector instructions in my JIT, including both SSE and AVX. Until now, this had not been a problem because I had never needed to pass floating-point values from JIT-compiled code to ahead-of-time-compiled code.

The Solution

I re-enabled SSE and AVX instructions in my JIT and the generated code properly used the expected calling convention:

vmovsd xmm0,QWORD PTR [rbx+r12*8]
movabs rax,0x7ffff3baa0a0
mov    QWORD PTR [rsp+0x50],rax
mov    rdi,QWORD PTR [rsp+0x50]
movabs r13,0x7ffff584e520
mov    esi,0xe4
call   r13

Now the first argument (double x) is correctly passed in %xmm0, the second argument (file) in %rdi, and the third argument (line) in %esi, exactly as the System V AMD64 ABI specifies.
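The register assignment follows from how the ABI hands out registers: integer and pointer arguments draw from one sequence, floating-point arguments from another, and the two are counted independently. The sketch below is a deliberately simplified model of that rule, covering only scalar arguments (no structs, varargs, or long double); ArgClass and assignRegisters are hypothetical names for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified argument classes: INTEGER covers integers and pointers,
// SSE covers float and double.
enum class ArgClass { Integer, Sse };

// Simplified model of System V AMD64 register assignment for scalar
// arguments. Each class consumes its own register sequence; once a
// sequence is exhausted, further arguments of that class go on the stack.
std::vector<std::string> assignRegisters(const std::vector<ArgClass>& args) {
    static const char* const kIntRegs[] = {"rdi", "rsi", "rdx", "rcx", "r8", "r9"};
    static const char* const kSseRegs[] = {"xmm0", "xmm1", "xmm2", "xmm3",
                                           "xmm4", "xmm5", "xmm6", "xmm7"};
    std::size_t nextInt = 0;
    std::size_t nextSse = 0;
    std::vector<std::string> out;
    for (ArgClass c : args) {
        if (c == ArgClass::Integer) {
            out.push_back(nextInt < 6 ? kIntRegs[nextInt++] : "stack");
        } else {
            out.push_back(nextSse < 8 ? kSseRegs[nextSse++] : "stack");
        }
    }
    return out;
}
```

For logdouble(double, char const*, int), the classes are Sse, Integer, Integer, which the model maps to xmm0, rdi, rsi. The 32-bit line number appears as esi in the disassembly because esi is the low half of rsi.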

Key Takeaways

Consistent compilation flags are crucial: When working with JIT compilation that interfaces with statically compiled code, make sure your compiler flags are consistent across both environments.
Disabling SSE is a bad idea: SSE and SSE2 are part of the x86_64 baseline, and the System V AMD64 ABI passes floating-point arguments in the XMM registers. Disabling them is silly² and will break your ABI.
Inspect the assembly: When dealing with subtle bugs in cross-module calls, especially in mixed JIT and AOT environments, looking at the actual assembly can quickly reveal calling convention mismatches.
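One cheap guard for the second takeaway is to sanity-check the target feature string before handing it to the JIT. The sketch below assumes the features are configured as a comma-separated "+feature,-feature" string with names like sse, sse2, avx, and avx2, which is the style LLVM uses for target attributes. How my JIT actually disabled them is not shown in this post, and disablesBaselineSse is a hypothetical helper.

```cpp
#include <sstream>
#include <string>

// Hypothetical guard: returns true if a comma-separated feature string
// (in the "+feature,-feature" style) explicitly disables SSE or SSE2,
// which the x86-64 calling convention relies on for floating-point
// arguments.
bool disablesBaselineSse(const std::string& features) {
    std::istringstream stream(features);
    std::string item;
    while (std::getline(stream, item, ',')) {
        if (item == "-sse" || item == "-sse2") {
            return true;
        }
    }
    return false;
}
```

With this check, a string such as "-avx2,-avx" passes, while "-avx2,-avx,-sse2,-sse" is flagged. It only catches explicit entries, so it is a tripwire rather than a full validation of the target configuration.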

¹ More than a little: I spent a chunk of time looking for the root issue in other parts of the system first.
² I learned my lesson.