Writing runtime-extensible code in Zig is massively inconvenient

Here’s a design choice that seriously undermines Zig’s goal to replace C: auto structs/unions/enums/optionals/functions/errors in Zig have no guaranteed representation at all. This would be fine and cool, if there weren’t programs made to be extended at runtime. Because their layout is generally undefined, the compiler can theoretically just randomize the order of fields on each invocation, or encrypt them with a per-ZCU key, meaning that passing them across ABI boundaries is illegal behavior. When trying to write such a program, you will face several challenges that might suck the joy of programming in Zig right out of you:

  • No slices
  • No errors
  • No allocators
  • No standard containers
  • No auto-callconv functions
  • No bit integers (I’m not sure why this is the case, since clang has _BitInt which is ABI-safe, but it is)
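
To make the layout point concrete, here is a minimal sketch (the `Auto` and `Extern` types are made up for illustration, and the asserted offsets assume a target where `u32` is 4-byte aligned). Only the `extern struct` has field offsets you can rely on; for the auto-layout struct, the language promises nothing about order or padding.

```zig
const std = @import("std");

// Hypothetical types, identical fields, different layout rules.
const Auto = struct { a: u8, b: u32, c: u8 };
const Extern = extern struct { a: u8, b: u32, c: u8 };

pub fn main() void {
    // Extern layout follows declaration order with C-style padding.
    std.debug.print("Extern: a@{d} b@{d} c@{d} size={d}\n", .{
        @offsetOf(Extern, "a"), @offsetOf(Extern, "b"),
        @offsetOf(Extern, "c"), @sizeOf(Extern),
    });
    // Auto layout: whatever the compiler picked for this build.
    // These numbers are observations, not guarantees.
    std.debug.print("Auto:   a@{d} b@{d} c@{d} size={d}\n", .{
        @offsetOf(Auto, "a"), @offsetOf(Auto, "b"),
        @offsetOf(Auto, "c"), @sizeOf(Auto),
    });
}
```
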

To use a real-world example, here’s a generic “options” struct, mapping a string key to a union value:

const std = @import("std");

pub const Options = struct {
    pub const Value = union(enum) {
        bytes: []u8,
        const_bytes: []const u8,
        array: []Value,
        signed: i64,
        unsigned: u64,
        float: f64,
        pointer: ?*anyopaque,
        boolean: bool,
    };

    map: std.StringHashMapUnmanaged(Value) = .{},

    pub fn deinit(o: *Options, allocator: std.mem.Allocator) void {
        o.map.deinit(allocator);
    }

    pub fn set(o: *Options, allocator: std.mem.Allocator, key: []const u8, value: Value) error{OutOfMemory}!void {
        return o.map.put(allocator, key, value);
    }

    pub fn get(o: *Options, key: []const u8) ?Value {
        return o.map.get(key);
    }
};

So far, so simple. There are several issues if you want to use it across ZCUs:

  • Options has auto layout
  • Value has auto layout
  • Value contains slices
  • std.StringHashMapUnmanaged(Value) has auto layout
  • set and get use the auto calling convention
  • set and get have parameters with no guaranteed representation

There are two main approaches to making this code ABI-safe. The first, easy one is to leave Options as it is and expose a separate ABI-safe wrapper:

pub const AbiSafeOptions = opaque {
    pub const Value = extern struct {
        // enum with native backing type
        pub const Tag = enum(u8) {
            bytes,
            const_bytes,
            array, 
            signed, 
            unsigned,
            float, 
            pointer,
            boolean,
        };
        pub const Payload = extern union {
            bytes: extern struct {ptr: [*]u8, len: usize},
            const_bytes: extern struct {ptr: [*]const u8, len: usize},
            array: extern struct {ptr: [*]Value, len: usize}, 
            signed: i64,
            unsigned: u64,
            float: f64,
            pointer: ?*anyopaque,
            boolean: bool,
        };
        tag: Tag,
        payload: Payload,
    };

    const Map = std.StringHashMapUnmanaged(Value);

    pub fn create() callconv(.c) ?*AbiSafeOptions {
        // global allocator
        const p = std.heap.smp_allocator.create(Map) catch return null;
        p.* = .{};
        return @ptrCast(p);
    }

    pub fn destroy(self: *AbiSafeOptions) callconv(.c) void {
        const options: *Map = @alignCast(@ptrCast(self));
        options.deinit(std.heap.smp_allocator);
        std.heap.smp_allocator.destroy(options);
    }

    pub fn set(self: *AbiSafeOptions, key_ptr: [*]const u8, key_len: usize, value: Value) callconv(.c) bool {
        const options: *Map = @alignCast(@ptrCast(self));
        // The map stores the key slice as-is, so the caller must keep the key memory alive.
        options.put(std.heap.smp_allocator, key_ptr[0..key_len], value) catch return false;
        return true;
    }

    pub fn get(self: *AbiSafeOptions, key_ptr: [*]const u8, key_len: usize, out_value: *Value) callconv(.c) bool {
        const options: *Map = @alignCast(@ptrCast(self));
        // Write through the out-pointer; a parameter itself cannot be assigned to.
        out_value.* = options.get(key_ptr[0..key_len]) orelse return false;
        return true;
    }
};

Let’s be frank: This code is ass. It:

  • is much longer than the code it replaces,
  • isn’t as safe as the code it replaces,
  • forces the use of a specific global allocator,
  • forces heap allocation of Map, since its size is also not part of the ABI,
  • forces an inconvenient tagged union that can’t be switched on as easily and lacks type checking,
  • generally goes against Zig’s goal of being a better C.
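
To illustrate the tagged-union point, here is a rough sketch of the glue needed to get back to something you can `switch` on. It uses trimmed-down copies of the two `Value` types so it stands alone; `toAbi` and `fromAbi` are hypothetical helper names, not part of the code above. Note that `fromAbi` has to trust that the tag and payload agree, which is exactly the type checking the hand-rolled version loses.

```zig
const std = @import("std");

// Trimmed-down stand-ins for Options.Value and AbiSafeOptions.Value.
const Value = union(enum) {
    const_bytes: []const u8,
    signed: i64,
    boolean: bool,
};

const AbiValue = extern struct {
    pub const Tag = enum(u8) { const_bytes, signed, boolean };
    pub const Payload = extern union {
        const_bytes: extern struct { ptr: [*]const u8, len: usize },
        signed: i64,
        boolean: bool,
    };
    tag: Tag,
    payload: Payload,
};

// Hypothetical helper: native tagged union -> ABI-safe pair of tag + payload.
fn toAbi(v: Value) AbiValue {
    return switch (v) {
        .const_bytes => |s| .{
            .tag = .const_bytes,
            .payload = .{ .const_bytes = .{ .ptr = s.ptr, .len = s.len } },
        },
        .signed => |i| .{ .tag = .signed, .payload = .{ .signed = i } },
        .boolean => |b| .{ .tag = .boolean, .payload = .{ .boolean = b } },
    };
}

// Hypothetical helper: nothing checks that the tag matches the payload written.
fn fromAbi(v: AbiValue) Value {
    return switch (v.tag) {
        .const_bytes => .{
            .const_bytes = v.payload.const_bytes.ptr[0..v.payload.const_bytes.len],
        },
        .signed => .{ .signed = v.payload.signed },
        .boolean => .{ .boolean = v.payload.boolean },
    };
}
```

Every variant has to be spelled out twice, and both functions have to be kept in sync with both type definitions by hand.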

This is the easy approach. The hard approach is to take, in this case, std.hash_map.HashMapUnmanaged, std.mem.Allocator, and slices, make them ABI-safe, and use those. Don’t bother.

Of course, some things can be improved with comptime magic:

However, you will quickly run into the fact that decls can’t be reified.
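
As one illustration of the kind of comptime helper that does work (a hypothetical sketch, with `AbiSlice` being a name I made up), a generic function can stamp out the `extern struct {ptr, len}` pairs used above. It is written with ordinary struct syntax inside a generic function rather than with `@Type`, because a reified type could not carry the `from` and `slice` decls.

```zig
const std = @import("std");

/// Hypothetical helper: an extern-layout stand-in for `[]T` or `[]const T`.
pub fn AbiSlice(comptime T: type, comptime is_const: bool) type {
    return extern struct {
        ptr: Ptr,
        len: usize,

        const Self = @This();
        pub const Ptr = if (is_const) [*]const T else [*]T;
        pub const Slice = if (is_const) []const T else []T;

        /// Split a native slice into its ABI-safe parts.
        pub fn from(s: Slice) Self {
            return .{ .ptr = s.ptr, .len = s.len };
        }

        /// Reassemble the native slice on the other side of the boundary.
        pub fn slice(self: Self) Slice {
            return self.ptr[0..self.len];
        }
    };
}
```

With this, `bytes: extern struct {ptr: [*]u8, len: usize}` could become `bytes: AbiSlice(u8, false)`, though every use site still needs an explicit conversion.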

As it stands now, trying to write a runtime-extensible program in Zig is, for the most part, worse than doing it in C. If the representation of slices and auto-layout stuff were defined for a given (compiler version, build options) pair, which is the de-facto behavior of the Zig compiler right now (there are plans to change that), it would be a lot less painful.

I’ve been looking into this and found a way to make it sorta work: I generate packed structs that exactly match the binary layout of the compiled structs. Right now it’s extremely inconvenient. Comptime is not allowed to have side effects, such as generating the sources for these packed structs. On top of that, function pointers have to be wrapped in a struct before they can be passed around using the normal Zig calling convention. At present, the only ways to do it are through the log (which I’ll never do) or by storing the relevant metadata (reflection data) in the binary and then exporting the code at runtime.

I’ve gotten it to work nicely with all my stuff. But I had to add a generate API commandline option. The real problem is stuff like slices. The only saving grace is that coincidentally the compiler is 100% consistent in slice layout regardless of how I’ve been using the slices.
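
For anyone wondering what "storing the reflection data in the binary" can look like, here is a minimal sketch of the idea, not my actual code: `Example`, `FieldLayout` and `layoutOf` are made-up names. It records the offsets the compiler happened to choose at comptime so that they can be dumped at runtime. The values are only valid for that exact build.

```zig
const std = @import("std");

// Hypothetical auto-layout struct whose layout we want to capture.
const Example = struct { a: u8, b: u64, c: u16 };

/// One field's placement as chosen by the compiler for this build.
const FieldLayout = extern struct { offset: usize, size: usize };

/// Collect offset and size for every field of T, in declaration order.
fn layoutOf(comptime T: type) [std.meta.fields(T).len]FieldLayout {
    var out: [std.meta.fields(T).len]FieldLayout = undefined;
    inline for (std.meta.fields(T), 0..) |f, i| {
        out[i] = .{ .offset = @offsetOf(T, f.name), .size = @sizeOf(f.type) };
    }
    return out;
}

pub fn main() void {
    // Evaluated at comptime, so the table ends up as constant data in the binary.
    const layout = comptime layoutOf(Example);
    inline for (std.meta.fields(Example), layout) |f, l| {
        std.debug.print("{s}: offset {d}, size {d}\n", .{ f.name, l.offset, l.size });
    }
}
```

A generator can turn that table into a struct definition with explicit padding, which is the part that currently needs a separate run of the program.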

Once the promised build time meta server arrives it should be possible to do this in build.zig at which point a tool could be built for it. Of course it would be much better if the compiler could store these memory layout options into a datafile which could be supplied to the compiler so it knew to respect the established layout of those structs.

this is definitely the reason for extern struct as a part of the language :slight_smile: . from the anger in your post, it sounds like you must be facing this issue with more frequency than me! do you have a proposed solution (maybe even just “these structs in std should be extern”)?

That’s smart, but unfortunately the language doesn’t allow it =/ even though you can definitely do it, it’s technically IB.

Yep! You’re totally right, that’s what I meant by

, right now it’s all consistent, by coincidence…

That would be nice, but as far as I’m aware, there are no plans for such a thing… the spec would still forbid you from using auto-layout types in that way.

I think the reason @cancername is triggered is because dynamic linking is practically impossible with Zig. I’ve spent quite a few hours and what I’ve come up with is a subpar hack, to provide something that is a common feature in all other languages I know of.

“All” code editors support plugins, “all” paint programs support plugins, “all” game engines support custom code, “all” OSes support dynamic drivers, and many of the best games support mods. Having the language not support such a basic concept without reverting to the C ABI is super annoying. (The reason I put “all” in quotation marks is that I’m quite aware it’s technically not true: drivers can be compiled together with the kernel in one go, etc. Please ignore the generalization and just accept my premise that almost all languages support this without losing language features.)

I agree, but unfortunately it is quite limited, and made for C compatibility, not intra-zig ABI guarantees… maybe a @abiLayout(.foobar_layout, T) builtin or something similar to guarantee a layout for that type would be a language-level solution?

I would really like my work-in-progress multimedia library to be runtime-extensible (i.e. dlopen + registerDecoder or something to that effect), because things like FFmpeg just aren’t… but it’s a pain, at least for now.
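
Roughly what I have in mind, as a hedged sketch: `Decoder`, `registerDecoder` and `findDecoder` are placeholder names and not my library's real API. Everything that crosses the boundary has to be spelled out in C ABI terms, so the interface is an `extern struct` of `callconv(.c)` function pointers instead of a normal Zig interface.

```zig
const std = @import("std");

/// Hypothetical plugin interface: only extern-compatible types may appear here.
pub const Decoder = extern struct {
    name: [*:0]const u8,
    decode: *const fn (input: [*]const u8, input_len: usize, out_len: *usize) callconv(.c) bool,
};

const max_decoders = 8;
var decoders: [max_decoders]*const Decoder = undefined;
var decoder_count: usize = 0;

/// Exported with the C calling convention so a dlopen'ed plugin can call it.
/// The Decoder must outlive the registry. Not thread-safe in this sketch.
pub export fn registerDecoder(d: *const Decoder) bool {
    if (decoder_count == max_decoders) return false;
    decoders[decoder_count] = d;
    decoder_count += 1;
    return true;
}

/// Host-side lookup; this part never crosses the ABI, so slices are fine.
pub fn findDecoder(name: []const u8) ?*const Decoder {
    for (decoders[0..decoder_count]) |d| {
        if (std.mem.eql(u8, std.mem.span(d.name), name)) return d;
    }
    return null;
}
```

No slices, no error unions, no allocator parameter: the plugin author gets C with Zig syntax.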

In my opinion, std.mem.Allocator is definitely a struct that should be usable across ZCUs, but the core team seems to disagree.
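
Until then, the workaround is a hand-written shim along these lines. This is a deliberately simplified sketch: `AbiAllocator` is a made-up name, and it only handles byte allocations with no alignment or resize support, which a real version would need.

```zig
const std = @import("std");

/// Hypothetical ABI-safe view of a std.mem.Allocator (bytes only, no alignment).
pub const AbiAllocator = extern struct {
    ctx: *anyopaque,
    alloc_fn: *const fn (ctx: *anyopaque, len: usize) callconv(.c) ?[*]u8,
    free_fn: *const fn (ctx: *anyopaque, ptr: [*]u8, len: usize) callconv(.c) void,

    /// The pointed-to Allocator must outlive the returned value.
    pub fn wrap(a: *const std.mem.Allocator) AbiAllocator {
        return .{ .ctx = @constCast(a), .alloc_fn = allocImpl, .free_fn = freeImpl };
    }

    fn allocImpl(ctx: *anyopaque, len: usize) callconv(.c) ?[*]u8 {
        const a: *const std.mem.Allocator = @ptrCast(@alignCast(ctx));
        // The error set cannot cross the boundary, so failure collapses to null.
        const mem = a.alloc(u8, len) catch return null;
        return mem.ptr;
    }

    fn freeImpl(ctx: *anyopaque, ptr: [*]u8, len: usize) callconv(.c) void {
        const a: *const std.mem.Allocator = @ptrCast(@alignCast(ctx));
        a.free(ptr[0..len]);
    }
};
```

The other side then has to wrap `AbiAllocator` back into a `std.mem.Allocator` with its own vtable if it wants to use std containers, so the allocator gets wrapped twice just to cross one function call.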

Yes, it’s clearly a hack and without a doubt IB. I’d still argue that it’s preferable to using extern struct and adding calling conventions to all methods simply to make shit work.

Since all the data needed to make a 100% identical data structure using packed struct is available at comptime, I’d argue that I’m merely recreating the steps that the compiler should have performed for me as a builtin.

Actually I would hate for std structs to become extern. I love the premise that the compiler/linker should be given as much agency to generate the optimal binary as possible.

I just need a way to pass these compiled layouts along so that we can build dynamically linked zig code.

Same! I would prefer having the option to say “I want this type, but with a guaranteed layout”, keeping the decls. I would not want C++ style runtime type info because it can increase binary size and, for example, accessing struct fields depends on runtime variables. To me, that is as good as generated functions.

Well I’ve seen no concrete plans, but I’ve seen a few issues closed with reference to such a build meta server thing being the proper solution. I can’t quite find it, but once I saw that response closing the issue it made a lot of sense. Build.zig is the right place to do such meta programming, not inserted into comptime of a “random” zig file.

I come from C# and I hate not having this available! But I totally accept the premise that it comes at a cost that is not always desirable. On top of that, there are many structs where such information would never be used or valuable.

I’m hoping for that meta build server thing, because that would let me create a build tool to provide exactly the relevant runtime type information for exactly the right structs in exactly the right format.

Please forgive my ignorance, but is there even a single systems programming language that provides a satisfactory solution?

My understanding is that Rust is only compatible if all compilation units are compiled using the same compiler version, and I can’t even get ABI-compatible code using MSVC and MinGW for C on Windows…

Which languages provide a satisfactory solution in your view? I can understand why the ZSF has hesitated to provide a stable ABI because it can be quite limiting to the compiler optimizations.

This is a significant limitation, but even this would be better than nothing at all.

Can you elaborate? In general, C code without bit-fields is easy to make ABI-safe. Or do you mean different libcs being incompatible?

I agree, having the compiler optimize struct layout, like sorting fields by size, is awesome. But it can still do that in the general case if the layout is only frozen optionally when requested by the user, as suggested here:

My previous struggles linking MinGW to MSVC libs:

(it hasn’t crashed yet but i’m still a bit scared :slight_smile: )

That makes sense. Mixing libcs is, uh, probably not a good idea

Satisfactory… well… that’s perhaps a mouthful.
But it is supported for all major language features in C/C++/Pascal/Rust/etc. Yes there may be limitations on which compiler/linker/versions you are required to use. But generally speaking it works!

Edit: What I’m asking for is not the .net style dynamic linked library that “just works™” across multiple programming languages and that even allows linking across .net versions. Just anything that will let third parties make a plugin for your binary using the normal features of the zig language.

Actually I don’t think it would be THAT hard to make a really strong Zig ABI that could in theory work across multiple zig versions and even be possible to interact with from C with a few hoops.

But honestly I think version 1 should be locked on both zig version and even relevant libraries, so that the dynamic library becomes self sustained without relying on stuff not being cut from the host.

The challenges of supporting multiple zig versions and library versions would be a sizable undertaking that adds a ton of code to the dynamic library to resolve methods in the host. That would quickly turn into a nightmarish endeavor.

Why lock yourself to some non-standard Zig ABI that may or may not happen (most likely won’t), instead of designing a well-defined protocol to talk over? Dynamic libraries are sort of a mess, and even the status quo (C ABI) isn’t perfect (stdbool?? bitfields??). I think it’s fair enough that Zig doesn’t try to pursue this effort.
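
By a protocol I mean something as simple as the following sketch. The framing (a little-endian `u32` length followed by the payload) and the names `encode` and `decode` are just assumptions for illustration. Both sides agree on bytes rather than on struct layout, so no compiler-chosen representation ever crosses the boundary.

```zig
const std = @import("std");

/// Hypothetical wire format: u32 little-endian length, then the payload bytes.
/// Returns the encoded message, or null if it does not fit in `buf`.
fn encode(buf: []u8, payload: []const u8) ?[]u8 {
    if (payload.len > std.math.maxInt(u32)) return null;
    if (buf.len < 4 or buf.len - 4 < payload.len) return null;
    std.mem.writeInt(u32, buf[0..4], @intCast(payload.len), .little);
    @memcpy(buf[4..][0..payload.len], payload);
    return buf[0 .. 4 + payload.len];
}

/// Returns the payload, or null if the message is truncated.
fn decode(msg: []const u8) ?[]const u8 {
    if (msg.len < 4) return null;
    const len = std.mem.readInt(u32, msg[0..4], .little);
    if (msg.len - 4 < len) return null;
    return msg[4..][0..len];
}
```
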

I think the only language that actually tries to have a stable ABI is Swift, and it’s certainly not simple either: How Swift Achieved Dynamic Linking Where Rust Couldn't - Faultlore

Because an unstable ABI is significantly better than none at all.

Those aren’t mutually exclusive. In fact, I would very much like to do both!

True. Nobody is calling for perfection, and Zig, as a pragmatic language, rejects it.

I don’t. I think Zig should be a straightforward improvement over C in as many cases as is reasonable. Adding something that the compiler already implements (and, perhaps more importantly, people are already relying on) to the spec is more than reasonable.

What Swift is doing is making the ABI stable in as many situations as possible. Zig isn’t making any effort to even support the possibility of having an ABI at all. These are leagues apart.

Coincidentally, this would also make saving space by omitting the safety-checking tag possible without that cursed hack, assuming there are safe and unsafe ABIs: @AbiLayout(.unsafe, union{...}).