Calling convention consistent with C++ methods

Basically, I’m working on a project that hooks into a C++ binary and exposes some of its methods through an API. The Windows x64 C calling convention states that if a struct is returned by value and it’s greater than 8 bytes, the caller passes a pointer to its storage in the first integer parameter register (rcx), and the function then returns that pointer. Unfortunately, C++ methods technically don’t follow this, because the ‘this’ pointer is always passed in that register, regardless of the return type. It would be very nice if there were a separate calling convention to handle this case, so the API could be the same as the C++ method. This probably isn’t a common enough case to warrant it, but it would be cool…

const BigStruct = extern struct {
    // > 8 bytes
};

const CppClass = extern struct {
    ...
    // The actual function
    fn returnBigStruct(self: *CppClass) callconv(.CppMethod) BigStruct {
        return BigStruct {...};
    }
    
    // Consistent with C++ methods
    fn shouldBe(self: *CppClass, out: *BigStruct) callconv(.C) *BigStruct {
        out.* = BigStruct {...};
        return out;
    }

    // Consistent with x64 C calling convention
    fn actuallyIs(out: *BigStruct, self: *CppClass) callconv(.C) *BigStruct {
        ...
    }
};

pub fn main() void {
    // API Consistent with C++
    var class = CppClass {...};
    const result: BigStruct = class.returnBigStruct();
    
    // What the API actually has to be
    var out: BigStruct = undefined;
    _ = class.shouldBe(&out);
}

The only problem I can see with this is that, for a function that isn’t attached to a struct, it’s ambiguous whether a first argument that is a pointer is supposed to be the ‘this’ pointer, but maybe it could just always be treated as such.

This calling convention doesn’t exist. C++ methods use the same calling convention as C functions, but C++ is hiding stuff, as usual. Whenever you look at a C++ method, you have to imagine the this pointer as the first argument.

struct A{
  BigStruct shouldBe(int x) {
    return BigStruct{...};
  }
};

In Zig, this would be:

const A = extern struct{
  pub fn shouldBe(this: *@This(), x: c_int) callconv(.C) BigStruct{
    return .{...};
  }
};

These will be equivalent down to the binary level. Now that C++23 lets you name the this pointer explicitly (explicit object parameters), you could actually write the same signature in C++ as in Zig.
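To illustrate the "imagine the this pointer as the first argument" idea at the source level, here is a minimal C++ sketch. The field names, the `base` member, and the helpers `shouldBeFree` and `callMember` are hypothetical and only there for illustration. It shows that a member function can be invoked with the object passed explicitly first; it does not demonstrate which registers a particular compiler uses.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for BigStruct: three 8-byte fields, so > 8 bytes.
struct BigStruct {
    long long a;
    long long b;
    long long c;
};
static_assert(sizeof(BigStruct) > 8, "BigStruct must be larger than 8 bytes");

struct A {
    int base;  // hypothetical member, used so the result depends on `this`

    // Member function: `this` is an implicit parameter.
    BigStruct shouldBe(int x) {
        return BigStruct{base, x, base + x};
    }
};

// The same logic as a free function with the object pointer written out
// as the first parameter.
BigStruct shouldBeFree(A* self, int x) {
    assert(self != nullptr);
    return BigStruct{self->base, x, self->base + x};
}

// Calls the member function through std::mem_fn, which takes the object
// (here a pointer to it) as its first argument.
BigStruct callMember(A* self, int x) {
    return std::mem_fn(&A::shouldBe)(self, x);
}
```

Both `callMember(&a, 5)` and `shouldBeFree(&a, 5)` produce the same struct for the same object, which is the source-level equivalence being described.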

That’s not what I’m talking about. I’m trying to create a wrapper around existing C++ functions where the Zig wrapper function can literally just have a jump instruction into the C++ function. The C++ function expects the ‘this’ pointer to be in rcx, but with the C calling convention it will actually be in rdx if the return value is > 8 bytes (because the return value pointer has to be in rcx).

How are you handling the name mangling done by C++?

If the target compiler is Clang, we may be able to use Clang’s ast-dump, something like the topic “Experimental tool to generate idiomatic Zig bindings from C++”. The MSVC compiler can also dump an AST, but I suppose it won’t be compatible with Clang’s, so you’d have to create wrappers for different compilers or just ditch MSVC, since GCC and Clang both use the same C++ name mangling, AFAIK.

No, it doesn’t. It expects this to be the first argument. If the function needs a pointer for the return value, it will be in rcx, and this will be in rdx.

Currently the only way to interface Zig with C++ is to create C++ code that exposes the C++ API with C linkage using extern "C" {}.
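A minimal sketch of such a shim, assuming a hypothetical `CppClass` with a `returnBigStruct` method (the `seed_` member and the exported name `CppClass_returnBigStruct` are invented for illustration). The wrapper takes the object pointer and an output pointer as ordinary C parameters, so the by-value struct return never crosses the language boundary:

```cpp
#include <cassert>

// Hypothetical plain-data struct, larger than 8 bytes.
struct BigStruct {
    long long a;
    long long b;
    long long c;
};

// Hypothetical C++ class whose method returns BigStruct by value.
class CppClass {
public:
    explicit CppClass(long long seed) : seed_(seed) {}

    BigStruct returnBigStruct() {
        return BigStruct{seed_, seed_ * 2, seed_ * 3};
    }

private:
    long long seed_;
};

// C-linkage shim: no name mangling, and both pointers are explicit
// parameters. The result is written through `out`, which is also returned.
extern "C" BigStruct* CppClass_returnBigStruct(CppClass* self, BigStruct* out) {
    assert(self != nullptr && out != nullptr);
    *out = self->returnBigStruct();
    return out;
}
```

The shim has the same shape as the `shouldBe(self, out)` signature from the original post. It only works when you can compile C++ code alongside the target, which is not the case when hooking an existing binary.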

What other languages commonly use to interface with C++ are generators that parse C++ header files and generate glue code. For example, see SWIG and Rust’s bindgen.

The thing I’m working on is a modding tool for a game, so most of the functions in the main executable don’t have symbols; I’m just hardcoding their addresses. The game does load some DLLs that I want to expose too, but there aren’t that many functions in them that I actually care about, so I figured I could just manually find their mangled names/ordinals.

How are you working around ASLR?

I’m just patching whatever flag in the executable enables it before startup.

OK, but FYI, Windows 10 and later will refuse to load an executable without ASLR.

Weird, I’m using Windows 10 and it works. I should probably still find a better solution though.
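One possible alternative to disabling ASLR is to keep the hardcoded addresses as offsets from the image base and rebase them at runtime. The sketch below shows only the arithmetic. The helper names and the preferred base `0x140000000` are assumptions for illustration, and the actual load base would have to come from the OS (on Windows, for example, from `GetModuleHandleW(nullptr)` for the main executable).

```cpp
#include <cassert>
#include <cstdint>

// Assumed preferred image base that the hardcoded addresses were taken
// against. Check the real value in the executable's PE header.
constexpr std::uint64_t kPreferredBase = 0x140000000ULL;

// Converts a hardcoded (static) address into an offset from the image base.
std::uint64_t toRva(std::uint64_t static_addr, std::uint64_t preferred_base) {
    assert(static_addr >= preferred_base);
    return static_addr - preferred_base;
}

// Computes where that same function lives once the image has been loaded
// at `actual_base` instead of its preferred base.
std::uint64_t rebase(std::uint64_t static_addr,
                     std::uint64_t preferred_base,
                     std::uint64_t actual_base) {
    return actual_base + toRva(static_addr, preferred_base);
}
```

With this approach the executable’s ASLR flag can stay untouched, since every address is recomputed from wherever the loader actually placed the image.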

C++ calling conventions are not standardized. It’s implementation-dependent, so each compiler can differ, and in some cases, optimizations can change exactly what happens, so it’s not guaranteed to be uniform. Anything you manage to get working is nonportable and subject to breakage on updates to the compiler.
