Zig equivalent to C idiom for structure with "tail" data

In C there’s a common idiom where you define a structure with a zero-length array as the last item. A pointer to such a struct acts like a header for a buffer.

Example from a SoC project:

struct TPropertyTag
{
	u32	nTagId;
	u32	nValueBufSize;
	u32	nValueLength;
	u8	ValueBuffer[0];
};

The caller is responsible for allocating the right amount of memory, then casting the void * to a TPropertyTag *. The advantage is that one struct definition can handle payload sizes that are only known at runtime. The obvious disadvantage is the utter lack of type safety.

In this case, interfacing with hardware, I have to use the layout dictated by the SoC, so I don’t have the option to redefine the API.

I’m trying to find a clean and idiomatic way to express this in Zig.

One approach would be to make a struct for each usage of the property tag. This is unwieldy, because there are many different “subtypes” of property tag that each have their own lengths. It’s impractical to define all of those structs, especially since the first 3 fields would be identical across all of them.

I can’t find a way to declare an array of unknown size. A many-item pointer doesn’t seem to do the job, since the memory layout needs to put the buffer immediately after the header.

Does anyone have a good example of this usage in Zig?

Here’s my attempt. It uses an idiom from std.MultiArrayList, where you have get/set methods and specify which field you’re getting or setting via an enum. I probably made this a little more general-purpose than necessary, but hopefully it can get you started.

const std = @import("std");
const Allocator = std.mem.Allocator;
const assert = std.debug.assert;

const TPropertyTag = struct {
    nTagId: u32,
    nValueBufSize: u32,
    nValueLength: u32,
    // ValueBuffer: [*]u8,
};

fn WrappedStruct(comptime T: type) type {
    const field_infos = @typeInfo(T).Struct.fields;
    // One enum tag per header field, plus an extra tag for the trailing buffer.
    var enumFields: [field_infos.len + 1]std.builtin.Type.EnumField = undefined;
    const decls = [_]std.builtin.Type.Declaration{};

    inline for (field_infos, 0..) |field, i| {
        enumFields[i] = .{
            .name = field.name,
            .value = i,
        };
    }

    enumFields[field_infos.len] = .{
        .name = "ValueBuffer",
        .value = field_infos.len,
    };

    const FieldEnum = @Type(.{
        .Enum = .{
            // The largest tag value is field_infos.len (the ValueBuffer tag).
            .tag_type = std.math.IntFittingRange(0, field_infos.len),
            .fields = &enumFields,
            .decls = &decls,
            .is_exhaustive = true,
        },
    });

    return struct {
        fn FieldType(comptime field: FieldEnum) type {
            return switch (field) {
                .ValueBuffer => []u8,
                else => @typeInfo(T).Struct.fields[@intFromEnum(field)].type,
            };
        }

        raw_memory: []align(@alignOf(T)) u8,

        pub fn get(self: *const @This(), comptime field: FieldEnum) FieldType(field) {
            return switch (field) {
                .ValueBuffer => self.raw_memory[@sizeOf(T)..],
                else => @field(@as(*T, @ptrCast(self.raw_memory.ptr)), @tagName(field)),
            };
        }

        pub fn set(self: *const @This(), comptime field: FieldEnum, value: FieldType(field)) void {
            assert(field != .ValueBuffer);
            @field(@as(*T, @ptrCast(self.raw_memory.ptr)), @tagName(field)) = value;
        }

        pub fn deinit(self: *const @This(), gpa: Allocator) void {
            gpa.free(self.raw_memory);
        }
    };
}

fn structWithBuffer(gpa: Allocator, comptime T: type, buffer_size: usize) !WrappedStruct(T) {
    return .{ .raw_memory = try gpa.alignedAlloc(u8, @alignOf(T), try std.math.add(usize, @sizeOf(T), buffer_size)) };
}

pub fn main() !void {
    const gpa = std.heap.c_allocator;
    var mem = try structWithBuffer(gpa, TPropertyTag, 4);
    defer mem.deinit(gpa);

    mem.set(.nTagId, 22);
    mem.set(.nValueBufSize, 21);
    mem.set(.nValueLength, 20);

    std.debug.print("{}\n", .{mem.get(.nTagId)}); // 22
    std.debug.print("{}\n", .{mem.get(.nValueBufSize)}); // 21
    std.debug.print("{}\n", .{mem.get(.nValueLength)}); // 20

    const buffer = mem.get(.ValueBuffer);
    buffer[0] = 'a';
    buffer[1] = 'b';
    buffer[2] = 'c';
    buffer[3] = 'd';
    std.debug.print("buffer: {s}\n", .{buffer});

    std.debug.print("{}\n", .{buffer.len}); // 4
}
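
If the reflection part looks mysterious: @typeInfo(T).Struct.fields is a comptime list of the struct’s field descriptions, and WrappedStruct just loops over it to build FieldEnum. Here’s a minimal, standalone sketch of what that introspection yields for the header struct (the names and types come straight from TPropertyTag):

const std = @import("std");

const TPropertyTag = struct {
    nTagId: u32,
    nValueBufSize: u32,
    nValueLength: u32,
};

pub fn main() void {
    // Each entry describes one field: its name, type, alignment, and so on.
    inline for (@typeInfo(TPropertyTag).Struct.fields) |f| {
        std.debug.print("{s}: {s}\n", .{ f.name, @typeName(f.type) });
    }
}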
3 Likes

Very interesting! I am going to need some time to study that to understand it.

At first look I can see how the memory layout matches what I was looking for. I don’t get the @typeInfo(T).Struct.fields part though…

This is on purpose. C’s variable-length arrays / structs were a mistake, because one of the main advantages of types is that they have a known size. When you use a variable-length array, it messes up a lot of optimizations. Sticking an unknown amount of data at the end of a struct basically destroys the idea of a type. At that point, you are dealing with a bunch of loosely defined bytes, of which only the first few have a predefined meaning.
Still, here’s a concrete example of what @Validark’s code does, without the metaprogramming.

fn foo() void {
  const PropertyTag = extern struct {
    nTagId: u32,
    nValueBufSize: u32,
    nValueLength: u32,
  };
  const Usage = extern struct {
    tag: PropertyTag,

    // Example payload
    a: f64,
    b: f32,
  };
  var concreteData: Usage = undefined; // ...
  _ = &concreteData;
  // Fill in Usage and, when you need to pass it to the API, give it a pointer
  // to the tag field (which comes first), or just @ptrCast the entire struct.
}

I don’t see why this would be unwieldy. Granted, there are many subtypes, but the code that uses a particular subtype needs to know what kind of data it is working with; otherwise, you wouldn’t even be able to initialize it.
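
Concretely, passing one of these to an API that only understands the header could look something like the sketch below; mailboxSend is a made-up stand-in for whatever the real entry point is:

const PropertyTag = extern struct {
    nTagId: u32,
    nValueBufSize: u32,
    nValueLength: u32,
};

const Usage = extern struct {
    tag: PropertyTag,

    // Example payload
    a: f64,
    b: f32,
};

// Stand-in for the real API call; it only knows about the common header.
fn mailboxSend(tag: *PropertyTag) void {
    _ = tag;
}

fn send(msg: *Usage) void {
    // `tag` is the first field of an extern struct, so &msg.tag is the same
    // address as @as(*PropertyTag, @ptrCast(msg)).
    mailboxSend(&msg.tag);
}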

2 Likes

Yeah… my solution was under the assumption that the pattern that @tracy wanted to use was the right one. I know nothing about the problem, for all I know the code is supposed to stick arbitrary strings in the extra buffer. Conceivably there could be a blend of the two techniques, but your solution is a lot better for situations where the real shape of the data is known.

2 Likes

VLAs are commonly used with GPU programming (HLSL, GLSL, Metal, WGSL all expose this functionality), although typically as actual arrays and not a variable payload as discussed here.

As the CPU needs to create data for structures with VLA members, making them easy to work with in Zig would be a welcome addition.

We use them in finance and trading for data locality all the time.

The reason some of us became interested in Zig was to get away from how paternalistic other languages have become. Andrew has said multiple times that Zig is meant to let you emit the code you want. The stance of not allowing VLAs for our own good, because some people don’t know what they are doing with them, seems to go against this ethos.

VLAs are a required thing. Rust has ?Sized for such things. If you are going to take away the C-style ones, something needs to replace them, or we just start allocating raw blocks and pretending they are on the back of the struct.

2 Likes

They weren’t designed into C, more like an “emergent feature”. In other words, a hack. :wink:

Strictly speaking, the size of the ‘tail’ of the structure is comptime-knowable. There’s a finite number of messages in the protocol, so I could create a struct for each. That’s actually nice for correctness, and it makes the details visible and explicit. It does create a new difficulty, though: since there’s no polymorphism in the type system, I have to pass the values of all the different structs as anyopaque to a common send/receive function. That’s where treating them as a common base struct with a variable-length array comes in handy.

2 Likes

Zig’s documentation claims “Zig is better at using C libraries than C is at using C libraries”, which is maybe a little suspect if it doesn’t support a required C99 feature? They were made optional in C11, though, so from that view maybe it’s off the hook, even if compiler support for VLAs is pretty common.

Reading old Zig issues, most of the criticism seems to be focused on the stack-allocation aspect of VLAs. I don’t want that feature either: it’s too easy to unexpectedly blow out the stack later, and I’d rather use Bounded containers.

Where VLA is interesting for me is the last member being variably sized. There are workarounds, like doing the pointer math in a function… and I think using generics on the type would work, so maybe lack of support is reasonable. I’m curious what the current recommended practice is.
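
For example, the pointer-math workaround I have in mind looks roughly like this (the header is declared extern so the layout is pinned, and nValueBufSize is assumed to hold the size of the trailing data):

const TPropertyTag = extern struct {
    nTagId: u32,
    nValueBufSize: u32,
    nValueLength: u32,
};

// Given a header pointer, build a slice over the bytes that immediately
// follow it, using the size stored in the header.
fn valueBuffer(tag: *TPropertyTag) []u8 {
    const base: [*]u8 = @ptrCast(tag);
    return (base + @sizeOf(TPropertyTag))[0..tag.nValueBufSize];
}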

It looks like Linus’ comment is about the stack side of VLAs, which I’m not advocating for.

What translate-C is doing is reasonable for a code generator (although readability isn’t great). A bit annoying when one wants to do this manually.

No problem. The example I used was minimal to avoid adding distracting detail.

The interface in question comes from dealing with a Broadcom SoC design. The CPU communicates with the GPU via this “mailbox” interface. A message for the mailbox is a structure in memory where one of the fields designates the message type and another specifies how many u32s are in the request payload. The GPU then overwrites the request parameters with a response.

So each different message type can have a different number of u32 request parameters and u32 response parameters.

An example of how a C++ framework deals with this interface is here. It’s C++ so there’s a lot of subclassing going on, but the memory layout is as I described originally: a header followed by a message-specific number of words.

In C, it’s more common to have a generic “supertype” which looks like a VLA struct. Then some (definitely cursed) pointer casting lets a caller pass a pointer to a more specific struct. (In C it’s also common to use unions to represent the dual nature of the request parameter / response value words, since they occupy the same bytes of memory.)

My ideal wish list would be:

  1. It’s easy to create the specific types for individual kinds of requests.
  2. Code that creates & interacts with those types just uses ordinary field access.
  3. I can pass any of those types in to a “send” function that is not specific to the type.
  4. I don’t have to copy the bytes of the messages, just using pointers to the structures.
  5. I don’t have to sacrifice all type safety.

I can get 1–4 with the C idioms, at the expense of type safety. In C++ I can get all 5 of those. I’m still trying to figure out how to get all of them in Zig. My original request might have been going down the wrong path… perhaps “pretending” the specific structs are VLAs in a generic send function isn’t going to get me my wish list.

1 Like

Is this reasonable for your use case? You’ll have the union overhead of the biggest struct, but I think it gives you the right memory layout.

const TPropertyTag = extern struct {
    nTagId: u32,
    nValueBufSize: u32,
    nValueLength: u32,
    Value: extern union {
        Simple: extern struct { nValue: u32 },
        SetCursorInfo: extern struct {
            // ... message-specific fields ...
        },
    },
};
1 Like

The idea is essentially:

allocate @sizeOf(Struct) + trailing_size bytes with alignment @alignOf(Struct), cast it to *Struct, and then make sure that you can still free the entire allocated slice (so either store the size of the trailing data in the header, or make it so the size of the trailing data can be calculated when it’s time to free).

Here’s an example (from this help thread in Discord):

const std = @import("std");

pub const Il2CppString = extern struct {
    len: usize,
    // 0-length dummy field for trailing UTF-16 data
    data: [0]u16,

    pub fn getData(self: *Il2CppString) []u16 {
        return @as([*]u16, &self.data)[0..self.len];
    }

    pub fn init(ally: std.mem.Allocator, s: []const u8) !*Il2CppString {
        const utf16_len = try std.unicode.calcUtf16LeLen(s);
        const full_byte_len = @sizeOf(Il2CppString) + (utf16_len * 2);
        const string_bytes = try ally.alignedAlloc(u8, @alignOf(Il2CppString), full_byte_len);
        const string = @as(*Il2CppString, @ptrCast(string_bytes.ptr));
        string.* = .{
            .len = utf16_len,
            .data = undefined,
        };
        const data_slice = @as([*]u16, @ptrCast(&string.data))[0..utf16_len];
        // catch unreachable since we already know `s` is valid UTF-8 from calcUtf16LeLen above
        _ = std.unicode.utf8ToUtf16Le(data_slice, s) catch unreachable;
        return string;
    }

    pub fn deinit(self: *Il2CppString, ally: std.mem.Allocator) void {
        const byte_len = @sizeOf(Il2CppString) + self.len * 2;
        const byte_slice = @as([*]align(@alignOf(Il2CppString)) u8, @ptrCast(self))[0..byte_len];
        ally.free(byte_slice);
    }
};

test Il2CppString {
    var string = try Il2CppString.init(std.testing.allocator, "hello");
    defer string.deinit(std.testing.allocator);

    const hello_utf16 = std.unicode.utf8ToUtf16LeStringLiteral("hello");
    try std.testing.expectEqualSlices(u16, hello_utf16, string.getData());
}

Here’s a more complicated example from the standard library: Objects with header first and payload after? - #13 by squeek502

5 Likes

Correct.

I think that send function is what I was missing, specifically the compile-time checking even though the parameter is anytype. Combined with the extern union from @pdoane, I think that gets me what I was looking for… and no need for VLAs for this case after all.
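
For anyone finding this later, here’s a rough sketch of the shape I mean; the message subtype, tag values, and function names are illustrative, not the real mailbox code:

const PropertyTag = extern struct {
    nTagId: u32,
    nValueBufSize: u32,
    nValueLength: u32,
};

// A hypothetical message subtype; the real ones come from the mailbox protocol.
const GetTemperature = extern struct {
    tag: PropertyTag,
    nTemperatureId: u32,
    nValue: u32,
};

// Accepts a pointer to any message struct and checks at compile time that it
// is an extern struct whose first field is the PropertyTag header.
fn send(msg: anytype) void {
    const T = @typeInfo(@TypeOf(msg)).Pointer.child;
    const fields = @typeInfo(T).Struct.fields;
    comptime {
        if (@typeInfo(T).Struct.layout != .Extern)
            @compileError("mailbox messages must be extern structs");
        if (fields[0].type != PropertyTag)
            @compileError("the first field must be the PropertyTag header");
    }
    const tag: *PropertyTag = &@field(msg, fields[0].name);
    // ... hand `tag` (and @sizeOf(T)) to the real mailbox write here ...
    _ = tag;
}

pub fn main() void {
    var msg = GetTemperature{
        .tag = .{ .nTagId = 0, .nValueBufSize = 8, .nValueLength = 0 }, // placeholder tag id
        .nTemperatureId = 0,
        .nValue = 0,
    };
    send(&msg);
}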

2 Likes