Add metadata to types

I’m writing a deserialiser. Given a type, it can decode bytes into that type.
Some values have the same representation in Zig but different wire formats. For example, a bytearray might be prefixed by 0x01 and an array of u8 by 0x02, but both are decoded as []const u8.

For now I just use a switch statement: when 0x01 or 0x02 is found, decode into []const u8. Is there a way to add metadata to a type so I could optimize this at comptime?

Something like this:

const Field = struct {
    field_format: enum { Str, Int, Bin },
    child: type,
};
const JustSomeStruct = struct { a: []Field{.child = u8, .field_format = .Bin}, b: f64 };
const value = decodeAs(&buffer, JustSomeStruct);

Here a is declared through Field, which carries both the type to decode into (u8) and the format it will be found in (.Bin). This way decodeAs has some extra context and might remove some unused branches.

Something like this.
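A minimal sketch of one way to get this effect in valid Zig, assuming the per-field format is stored in a declaration next to the fields rather than inside the field type. The names `Format`, `FORMATS`, `formatOf` and `decodeBytes` are hypothetical, and the one-byte length after the prefix is an assumed wire layout, not something from the real format:

```zig
const std = @import("std");

// Hypothetical wire formats; prefix values follow the description above
// (0x01 for a bytearray, 0x02 for an array of u8).
const Format = enum(u8) { bin = 0x01, arr = 0x02 };

const JustSomeStruct = struct {
    a: []const u8,
    b: f64,

    // Assumed convention: per-field wire format lives in a declaration.
    pub const FORMATS = .{ .a = Format.bin };
};

/// Looks up the declared wire format of a field at comptime.
/// Fails compilation if the type declares no format for that field.
fn formatOf(comptime T: type, comptime field_name: []const u8) Format {
    if (!@hasDecl(T, "FORMATS")) {
        @compileError("type has no FORMATS declaration");
    }
    if (!@hasField(@TypeOf(T.FORMATS), field_name)) {
        @compileError("no wire format declared for field " ++ field_name);
    }
    return @field(T.FORMATS, field_name);
}

/// Decodes one byte slice for the given field.
/// Assumed layout: [prefix byte][length byte][payload bytes].
fn decodeBytes(
    comptime T: type,
    comptime field_name: []const u8,
    buffer: []const u8,
) ![]const u8 {
    // Known at comptime, so only this one prefix is compared at runtime.
    const expected = comptime formatOf(T, field_name);
    if (buffer.len < 2) return error.Truncated;
    if (buffer[0] != @intFromEnum(expected)) return error.UnexpectedPrefix;
    const len: usize = buffer[1];
    if (buffer.len < 2 + len) return error.Truncated;
    return buffer[2 .. 2 + len];
}

test "prefix is checked against the declared format" {
    const ok = [_]u8{ 0x01, 3, 'a', 'b', 'c' };
    try std.testing.expectEqualStrings("abc", try decodeBytes(JustSomeStruct, "a", &ok));

    const wrong = [_]u8{ 0x02, 3, 'a', 'b', 'c' };
    try std.testing.expectError(error.UnexpectedPrefix, decodeBytes(JustSomeStruct, "a", &wrong));
}
```

Because the expected format is resolved at comptime, the decoder for field a never needs a switch over every possible prefix; it only compares against the one prefix that field declares.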

I’m not familiar with any comptime technique that will deserialize this data in the way you described. Are tagged unions insufficient?

const std = @import("std");

const SomeStruct = union(enum(u2)) {
    bytearray: u8 = 0x01,
    arrayofu8s: [8]u8 = 0x02,

    fn decode(buffer: []const u8) SomeStruct {
        return switch (buffer[0]) {
            0x01 => .{ .bytearray = buffer[1] },
            0x02 => .{ .arrayofu8s = buffer[1..][0..8].* },
            else => unreachable,
        };
    }
};

test SomeStruct {
    // one issue is that now these structs could have large sizes
    std.debug.print("size of SomeStruct: {d}\n", .{@sizeOf(SomeStruct)}); // prints 9
    const buffer1 = [_]u8{ 1, 10 };
    const result1 = SomeStruct.decode(&buffer1);
    std.debug.print("{}", .{result1}); // test.SomeStruct{ .bytearray = 10 }
    const buffer2 = [_]u8{ 2, 1, 2, 3, 4, 5, 6, 7, 8 };
    const result2 = SomeStruct.decode(&buffer2);
    std.debug.print("{}", .{result2}); // test.SomeStruct{ .arrayofu8s = { 1, 2, 3, 4, 5, 6, 7, 8 } }
}

Perhaps you could provide more details about the data that you have. Is the prefix a header byte? Are you trying to work with variable length lists or static arrays? Are you consuming bytes in a stream?


See the TAGS pattern: "Tags", issue #1099 in ziglang/zig on GitHub.


Would you mind expanding on how this pattern would be used in practice? It looks neat, but I’m not sure I understand what it would do.

TAGS adds metadata to struct fields.


Let's say that we want to generate SQL statements that create the database tables from our structs.

pub const Client = struct {
    id: u64,
    name: []const u8,
    phone: ?[]const u8,

    pub const TAGS = .{
        .id = .{ .db_type = "bigint"},
        .name = .{ .db_type = "varchar(80)"},
        .phone = .{ .db_type = "varchar(20)"},
    };
};

What we need is the following statement:

CREATE TABLE Client(
    id bigint not null, 
    name varchar(80) not null,
    phone varchar(20) null
);

From the Zig struct definition we have the name of the struct, the names of its fields, and the nullability of each field (whether it is optional or not), but we are missing the database types.
TAGS holds the additional per-field metadata that we require. .db_type is the type of the field in the database.
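As a rough sketch of how TAGS could be consumed, the following builds that statement at comptime. It assumes Zig 0.14 or later, where the `@typeInfo` tags are lowercase (`.optional`; older versions spell it `.Optional`). `createTableSql` and `shortTypeName` are hypothetical helpers written for this example, not part of any library:

```zig
const std = @import("std");

pub const Client = struct {
    id: u64,
    name: []const u8,
    phone: ?[]const u8,

    pub const TAGS = .{
        .id = .{ .db_type = "bigint" },
        .name = .{ .db_type = "varchar(80)" },
        .phone = .{ .db_type = "varchar(20)" },
    };
};

/// Strips the namespace from @typeName, e.g. "main.Client" becomes "Client".
fn shortTypeName(comptime T: type) []const u8 {
    const full = @typeName(T);
    const dot = std.mem.lastIndexOfScalar(u8, full, '.') orelse return full;
    return full[dot + 1 ..];
}

/// Builds a CREATE TABLE statement from T's fields and T.TAGS at comptime.
fn createTableSql(comptime T: type) []const u8 {
    const sql = comptime blk: {
        var s: []const u8 = "CREATE TABLE " ++ shortTypeName(T) ++ "(\n";
        const fields = std.meta.fields(T);
        for (fields, 0..) |field, i| {
            // Metadata comes from TAGS; nullability comes from the field type.
            const db_type: []const u8 = @field(T.TAGS, field.name).db_type;
            const nullability: []const u8 =
                if (@typeInfo(field.type) == .optional) "null" else "not null";
            // No trailing comma after the last column.
            const sep: []const u8 = if (i + 1 < fields.len) ",\n" else "\n";
            s = s ++ "    " ++ field.name ++ " " ++ db_type ++ " " ++ nullability ++ sep;
        }
        s = s ++ ");";
        break :blk s;
    };
    return sql;
}

test "CREATE TABLE is generated from TAGS" {
    const expected =
        \\CREATE TABLE Client(
        \\    id bigint not null,
        \\    name varchar(80) not null,
        \\    phone varchar(20) null
        \\);
    ;
    try std.testing.expectEqualStrings(expected, createTableSql(Client));
}
```

A field that has no entry in TAGS makes `@field` fail at compile time, so missing metadata is caught before the program runs.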
