Stricter function pointer casts

As an exercise I’m writing a wayland client implementation from scratch. Naturally this involves handling wayland events.

I currently have a system which involves storing a series of event handlers. These handlers have a context, which is not feasible to determine at comptime (primarily because this context can be any of a number of different types), so currently it is a *anyopaque. The struct which stores all these handlers looks something like the following:

struct {
    context: *anyopaque,
    call: *const fn (*anyopaque, u32, enum {}, []const u8) void,
}

I then call this with a code snippet similar to the following:

handler.call(handler.context, object, @enumFromInt(opcode), body);

My problem comes from the binding code itself. I end up with code that looks like the following:

self.handlers.items[wl_display].wl_display.set(.delete_id, .{
    .context = self,
    .call = @ptrCast(&Client.unbind),
});

With Client.unbind() defined as follows:

fn unbind(
    self: *Client,
    object: u32,
    opcode: wayland.display.event,
    body: []const u8,
) void

This is fine and works, however it does end up with type erasure of the function. That means if I were to forget one of the arguments for an event handler or accidentally get one of them swapped around, the code would still compile and simply end up with a nasty runtime error (I have had this happen in this specific case, fortunately it wasn’t too hard to find).

If I remove the @ptrCast(), I get the following compile error:

src/way2.zig:216:14: error: expected type '*const fn (*anyopaque, u32, protocols.wayland.display.event, []const u8) void', found '*const fn (*src.way2.Client, u32, protocols.wayland.display.event, []const u8) void'
            .call = &Client.unbind,
            ~^~~~~~~~~~~~~~~~~~~~~
src/way2.zig:216:14: note: pointer type child 'fn (*src.way2.Client, u32, protocols.wayland.display.event, []const u8) void' cannot cast into pointer type child 'fn (*anyopaque, u32, protocols.wayland.display.event, []const u8) void'
src/way2.zig:216:14: note: parameter 0 '*src.way2.Client' cannot cast into '*anyopaque'
src/way2.zig:216:14: note: pointer type child 'anyopaque' cannot cast into pointer type child 'src.way2.Client'

Ideally, I’d be able to remove the @ptrCast() to get stronger type guarantees, however it doesn’t seem possible to do that without changing the actual handling functions themselves to use anyopaque and use @ptrCast() within them, which I don’t like because I feel it impacts the readablity.

Is there another good workaround to get the strict compile time type checking?

Also it looks like the error message suffers from compile error hint for *anyopaque to *T due to function pointer parameter gets the type names backwards · Issue #13586 · ziglang/zig · GitHub

1 Like

I think this thread might help you: Callback with userdata Zig way

Thanks, I’d previously had the idea of writing a function to initialise my event handlers with stricter typing, however I couldn’t figure out how to do it correctly. I’ve now figured out how to do it correctly.

I’ve created a new comptime data structure, of type std.EnumArray(Interface, type), to convert from a wayland interface (which is an enum) to the corresponding event opcode enum for that interface. TBH the fact the you can do this in the first place is incredible, let alone using a lookup of the value as a function argument. Then I have the following piece of wizardry (as far as I’m concerned):

fn setHandler(
    self: *Client,
    comptime interface: wl.Interface,
    object: u32,
    opcode: wl.map.get(interface),
    context: anytype,
    call: *const fn (@TypeOf(context), u32, @TypeOf(opcode), []const u8) void,
) void {
    switch (self.handlers.items[object]) {
        interface => |*handlers| handlers.set(opcode, .{
            .context = context,
            .call = @ptrCast(call),
        }),
        else => unreachable,
    }
}

Which I can call as follows:

self.setHandler(wl_display, .wl_display, .delete_id, self, Client.unbind);

This will (as with the original example), set the Client.unbind function as the event handler for object 1 (wl_display) for the wl_display::delete_id wayland event.

AFAICT the only downside is that ZLS can’t determine the type of the opcode, so there’s no fancy editor completion for the opcode field.

1 Like

There is 1 argument missing.

I don’t really know what kind of type wl.map.get(interface) is, interface and opcode seem to form a sort of nested namespace, maybe you could replace it with something like this:

pub const interfaces = struct {
    pub const A = struct {
        pub const opcode1 = 1;
        pub const opcode2 = 2;
    };
    pub const B = struct {
        pub const opcode1 = 1;
        pub const opcode2 = 2;
    };
};

Or possibly use enums too. I don’t really know enough about wayland…

But if you can basically unroll the logic behind interface and opcode into something more static, involving less comptime logic, then it probably could be completed by zls, I just don’t know whether that works in your case, or whether it is worth the trouble.

I am also a fan of using function pointers with compatible functions that have a more concrete type.

I don’t really know what kind of type wl.map.get(interface) is

Sorry yes, I have an enum value for each opcode of every wayland interface. This enum is what is returned by the map lookup. I have something very similar to what you describe. The issue as I see it is that you cannot programmatically switch between what you’ve called interfaces.A and interfaces.B. What the map does (and it really needs a better name) is creates a table/map of an enum value for the interface to the corresponding type.

tl;dr the structure of the opcodes is way too deeply embedded into the code for it to be feasible to change without rewriting most of my code (currently at 1400 lines, excluding generated code).

The necessary wayland knowledge is the following:

  • wayland is composed of a bunch of protocols
  • each protocol contains a number of interfaces
  • these interfaces define a series of events, among other things
  • each event has its own opcode
  • every wayland “object” has a single defined interface

When I create a new object, I specify its interface and create an enum array of the needed opcode and the handling stuff above. Basically, it’s really helpful to specify the interface as an enum of some kind.

With this in mind, what I’ve done is take your example:

pub const interfaces = struct {
    pub const A = enum {
        opcode0,
        opcode1,
    };
    pub const B = enum {
        opcode0,
        opcode1,
    };
};

And added the following:

pub const Interface = enum {
    A,
    B,
};

The missing link was some kind of map between the Interface.A enum value and the interfaces.A type. Basically making it more static doesn’t really work because I’d need to completely modify the Interface enum everywhere it’s used, and I’m using it as the tag for a tagged union to store my event handlers. An excerpt from my code is below (shortened because the whole thing is generated by a python script and is almost 1200 lines long):

pub const EventHandlers = union(Interface) {
    wl_display: std.EnumArray(
        wayland.display.event,
        ?struct {
            context: *anyopaque,
            call: *const fn (*anyopaque, u32, wayland.display.event, []const u8) void,
        },
    ),
    wl_registry: std.EnumArray(
        wayland.registry.event,
        ?struct {
            context: *anyopaque,
            call: *const fn (*anyopaque, u32, wayland.registry.event, []const u8) void,
        },
    ),
};

If I were to change this to just be a lookup table using ints (using the largest opcode size, which is doable because it’s all generated code), which would probably emit the same, if not better, assembly I’d be able to remove the union entirely and probably remove the Interface type, however then I would lose a lot of benefits, like the ones I mentioned earlier. I’d also lose some type safety and have to rely a lot more on @intFromEnum().

Also I feel as though I’ve already dug a pretty deep comptime hole considering I have a function where I need to increase the eval branch quota to 2 million. Here it is:

fn createEventHandlers(interface: wl.Interface) wl.EventHandlers {
    @setEvalBranchQuota(2000000);
    return switch (interface) {
        inline else => |i| @unionInit(
            wl.EventHandlers,
            @tagName(i),
            std.meta.FieldType(wl.EventHandlers, i).initFill(null),
        ),
    };
}

I have absolutely no clue why backwards branching is even required in the first place, but apparently it is for std.meta.FieldType(). There’s probably a loop somewhere but idk, it works if the eval branch quota is 2 million.

1 Like

std.meta.FieldType() mostly calls std.meta.fields and then selects one of the results and from that the type, normally I would think that memoization should cause that to be reused, but you could try factoring out that call and reusing it for all the inline else branches, I don’t know whether that would actually reduce the branch quota but you could try it:

fn createEventHandlers(interface: wl.Interface) wl.EventHandlers {
    @setEvalBranchQuota(2000000);
    const fields = std.meta.fields(wl.EventHandlers);
    return switch (interface) {
        inline else => |i| @unionInit(
            wl.EventHandlers,
            @tagName(i),
            fields[@intFromEnum(i)].type.initFill(null),
        ),
    };
}

Not completely sure if I got the types correct here.
Maybe you could even use this:

@TypeOf(@field(@as(wl.EventHandlers, undefined), @tagName(i))).initFill(null),

My reasoning is that using builtins this may induce less branching cost (although it also does less type checking (which may be okay if other parts already ensure that the right values are provided))

But if it works and isn’t slowing down compilation then having a high branch quota might not be that bad, I also have seen code repeatedly set the branch quota within the loop, so maybe it also could be set within the inline else branch so that it actually adapts to the number of handlers.

Here is an example: zig/lib/std/meta.zig at ee9f00d673f2bccddc2751c328758a2820d2bb70 · ziglang/zig · GitHub

1 Like

Thanks, the first option did the trick. I personally wasn’t seeing any compilation slowdown (in fact from my one test, a completely clean compilation was 0.2 seconds faster with the original code), but even needing to use @setEvalBranchQuota() seemed a bit off to me. While it worked, if I add more fields (which will definitely happen in the future), I would’ve needed to increase that again by a few orders of magnitude and I suspect there would be a point where compilation times would definitely start to suffer.

I had actually tried before to get some kind of builtin combination as you described although I couldn’t quite get the right combination of builtins to get it to work. Regardless both of your options work, so thank you.

What’s particularly interesting to me is that the following works, no @setEvalBranchQuota() required.

fn createEventHandlers(interface: wl.Interface) wl.EventHandlers {
    return switch (interface) {
        inline else => |i| @unionInit(
            wl.EventHandlers,
            @tagName(i),
            std.meta.fields(wl.EventHandlers)[@intFromEnum(i)].type.initFill(null),
        ),
    };
}

This got me thinking, so I tried with std.meta.fieldInfo().type, which needed an eval branch quota of 1 million.

fn createEventHandlers(interface: wl.Interface) wl.EventHandlers {
    @setEvalBranchQuota(1000000);
    return switch (interface) {
        inline else => |i| @unionInit(
            wl.EventHandlers,
            @tagName(i),
            std.meta.fieldInfo(wl.EventHandlers, i).type.initFill(null),
        ),
    };
}

I guess it has something to do with the compile time type checks in those functions, but I have absolutely no clue after that, and 2 million is a lot more than the 167 interfaces I have.

1 Like