Call a variadic C function from generic Zig code without overhead

I have a project where I want to use Zig to generate C code for an embedded device which is not directly supported by Zig. In my Zig code I am calling a variadic function f that is provided by the C compiler.

Here is an example, where callExplicit corresponds to my application code and wrap1Explicit and wrap2Explicit are functions wrapping the C function (in reality they provide some additional functionality which is irrelevant to the example):

extern fn f(x: u8, ...) void;

export fn callExplicit(x: u8, y: u8) void {
    wrap1Explicit(x);
    wrap2Explicit(x, y);
}

inline fn wrap1Explicit(x: u8) void {
    f(x);
}

inline fn wrap2Explicit(x: u8, y: u8) void {
    f(x, y);
}

When compiled with zig build-obj -ofmt=c -OReleaseSmall test.zig (using Zig 0.14.0), this C code is generated:

void test_callExplicit__226(uint8_t const a0, uint8_t const a1) {
 unsigned int t0;
 f(a0);
 t0 = (unsigned int)a1;
 f(a0, t0);
 return;
}

When I write the wrapper functions by hand for a specific number of arguments, as with wrap1Explicit and wrap2Explicit above, the C code looks as expected: apart from the promotion of the variadic argument to unsigned int, f is called and nothing else.
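As far as I understand, the (unsigned int) cast in that output is not overhead: C applies the default argument promotions to variadic arguments anyway, and the generated code only makes this explicit. Here is a minimal C sketch of that assumption. sum_varargs is a hypothetical stand-in for f, and call_both_ways is likewise made up for illustration:

```c
#include <stdarg.h>
#include <stdint.h>

/* Hypothetical stand-in for f: adds up `count` variadic arguments.
   `count` is an int because the last named parameter of a variadic
   function should not have a type that is itself subject to promotion. */
static unsigned int sum_varargs(int count, ...) {
    va_list ap;
    unsigned int total = 0;
    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        /* A uint8_t argument arrives promoted, so it is read back as an
           int-sized type, never as uint8_t. */
        total += va_arg(ap, unsigned int);
    }
    va_end(ap);
    return total;
}

/* Passes the same values once with implicit promotion and once with the
   explicit cast seen in the generated code; returns the sum if both agree. */
static unsigned int call_both_ways(uint8_t x, uint8_t y) {
    unsigned int implicit_sum = sum_varargs(2, x, y);
    unsigned int explicit_sum = sum_varargs(2, (unsigned int)x, (unsigned int)y);
    return implicit_sum == explicit_sum ? implicit_sum : 0u;
}
```

Both calls hand the callee the same values, so I count the cast as part of the expected output and not as something to get rid of.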

However, when I try to generalize the wrapper functions to any number of arguments, the generated code is no longer as lean as in the hand-written case.

I first tried using @call:

export fn callGeneric1(x: u8, y: u8) void {
    wrapGeneric1(.{x});
    wrapGeneric1(.{ x, y });
}

inline fn wrapGeneric1(args: anytype) void {
    @call(.auto, f, args);
}

This results in:

typedef struct anon__lazy_44 nav__229_39;
struct anon__lazy_44 {
 uint8_t f0;
};
typedef struct anon__lazy_46 nav__229_41;
struct anon__lazy_46 {
 uint8_t f0;
 uint8_t f1;
};

void test_callGeneric1__229(uint8_t const a0, uint8_t const a1) {
 unsigned int t4;
 nav__229_39 t0;
 uint8_t t1;
 uint8_t t3;
 nav__229_41 t2;
 t0.f0 = a0;
 t1 = t0.f0;
 f(t1);
 t2.f0 = a0;
 t2.f1 = a1;
 t1 = t2.f0;
 t3 = t2.f1;
 t4 = (unsigned int)t3;
 f(t1, t4);
 return;
}

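That is, each argument tuple is materialized as a struct on the stack and its fields are copied into temporaries before the call. These allocations and copies look superfluous to me, and unfortunately the C compiler does not optimize them away.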

Instead of using @call, I also tried going back to calling f with an explicit number of arguments, but still inside a generic wrapper:

export fn callGeneric2(x: u8, y: u8) void {
    wrapGeneric2(1, .{x});
    wrapGeneric2(2, .{ x, y });
}

inline fn wrapGeneric2(n: comptime_int, args: anytype) void {
    switch (n) {
        1 => f(args[0]),
        2 => f(args[0], args[1]),
        else => @compileError("at most 2 arguments are supported"),
    }
}

but this generates the same C code as wrapGeneric1.

Is there another way to write a generic Zig function like wrapGeneric1 or wrapGeneric2 that produces the same C code as the hand-written wrappers in the first example?


And yes, f is printf. If there is no other way, I will consider reimplementing the formatting in Zig, along the lines of std.fmt.format.