Lizpack: (another) Zig MessagePack Library

I’ve been suffering from some not-written-by-me syndrome lately as I learn to do more things with Zig.

I use MessagePack a lot for inter-process messaging (mainly with Redis), so here is a nice little MessagePack implementation I wrote: lizpack.

It's only about 700 LOC even with tests, which really highlights the power of Zig generics.

The user simply writes a struct to define their message format, and it is serialized and deserialized accordingly:

const std = @import("std");

const lizpack = @import("lizpack");

test {
    const CustomerComplaint = struct {
        user_id: u64,
        status: enum(u8) {
            received,
            reviewed,
            awaiting_response,
            finished,
        },
    };

    var out: [1000]u8 = undefined;
    const expected: CustomerComplaint = .{ .user_id = 2345, .status = .reviewed };
    const slice: []u8 = try lizpack.encode(expected, &out);
    try std.testing.expectEqual(expected, lizpack.decode(@TypeOf(expected), slice));
}

Heavily inspired by msgspec

v0.2.0 adds a ton of new features!

  • Decode and encode variable length data structures (slices etc.) using an allocator.
  • Customize the format of types that could correspond to multiple MessagePack Types.

Default Formats

| Zig Type | MessagePack Type |
|---|---|
| bool | bool |
| null | nil |
| u3, u45, i6 | integer |
| ?T | nil or T |
| enum | integer |
| [N]T | N length array of T |
| [N:x]T | N+1 length array of T ending in x |
| [N]u8 | str |
| @Vector(N, T) | N length array of T |
| struct | map, str: field value |
| union (enum) | map (single key-value pair) |
| []T | N length array of T |
| [:x]T | N+1 length array of T ending in x |
| []u8 | str |
| [:x]u8 | str ending in x |
| *T | T |

str is the default MessagePack type for []u8 because it is the smallest for short slices.
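To put numbers on that claim, here is a small sketch of the header overhead the MessagePack spec defines for str and bin payloads. The function names `strHeaderLen` and `binHeaderLen` are made up for this illustration and are not part of lizpack.

```zig
const std = @import("std");

/// Header bytes MessagePack spends on a str payload of `len` bytes.
/// Hypothetical helper for illustration only.
fn strHeaderLen(len: usize) usize {
    if (len <= 31) return 1; // fixstr: the length lives in the format byte
    if (len <= 0xff) return 2; // str 8
    if (len <= 0xffff) return 3; // str 16
    return 5; // str 32
}

/// Header bytes MessagePack spends on a bin payload of `len` bytes.
/// Hypothetical helper for illustration only.
fn binHeaderLen(len: usize) usize {
    if (len <= 0xff) return 2; // bin 8 (the spec has no "fixbin")
    if (len <= 0xffff) return 3; // bin 16
    return 5; // bin 32
}

test "str only wins for short payloads" {
    try std.testing.expectEqual(@as(usize, 1), strHeaderLen(5));
    try std.testing.expectEqual(@as(usize, 2), binHeaderLen(5));
    // From 32 bytes upward the two headers cost the same.
    try std.testing.expectEqual(strHeaderLen(100), binHeaderLen(100));
}
```

So the saving is one byte per value, and only for payloads of up to 31 bytes.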

Unsupported types:

| Zig Type | Reason |
|---|---|
| union (untagged) | Decoding cannot determine active field, and neither can you. |
| error | I can add this, if someone asks. Perhaps as str? |

Note: pointer types require allocation to decode.
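The reason is that the decoded bytes need somewhere to live. A rough sketch of the idea, using only the standard library: `decodeShortStrAlloc` is a hypothetical helper, not lizpack's API, and it only handles the fixstr and str 8 formats.

```zig
const std = @import("std");

/// Hypothetical helper (not lizpack's API): decodes one MessagePack
/// fixstr or str 8 value into a newly allocated slice owned by the caller.
fn decodeShortStrAlloc(allocator: std.mem.Allocator, bytes: []const u8) ![]u8 {
    if (bytes.len == 0) return error.EndOfInput;
    const format = bytes[0];
    var header_len: usize = 1;
    var payload_len: usize = 0;
    if ((format & 0xe0) == 0xa0) {
        // fixstr: 101xxxxx, the low 5 bits hold the length.
        payload_len = format & 0x1f;
    } else if (format == 0xd9) {
        // str 8: the byte after the format byte holds the length.
        if (bytes.len < 2) return error.EndOfInput;
        header_len = 2;
        payload_len = bytes[1];
    } else {
        return error.UnsupportedFormat;
    }
    if (bytes.len < header_len + payload_len) return error.EndOfInput;
    // The result length is only known at runtime, hence the allocation.
    return allocator.dupe(u8, bytes[header_len .. header_len + payload_len]);
}

test "decode a fixstr into owned memory" {
    const encoded = [_]u8{ 0xa3, 'b', 'o', 'b' };
    const name = try decodeShortStrAlloc(std.testing.allocator, &encoded);
    defer std.testing.allocator.free(name);
    try std.testing.expectEqualStrings("bob", name);
}
```

The caller owns the returned slice and frees it with the same allocator.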

Customizing Formats

You can customize how types are formatted in MessagePack:

| Zig Type | Available Encodings |
|---|---|
| enum | string, int |
| []u8, [N]u8 | string, bin, array |
| struct | map, array |
| union (enum) | map (single key-value pair), active field |

See the examples directory for how to do it.
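To give a feel for why the struct layout matters, here is a hand-rolled sketch of the bytes the MessagePack spec implies for a two-field struct under each layout. Everything in it (`Writer`, `encodeAsMap`, `encodeAsArray`) is hypothetical and written against the spec, not lizpack's API. It also assumes the user id is emitted as a uint 16 and the status as a positive fixint, so lizpack's exact output may differ.

```zig
const std = @import("std");

/// Minimal fixed-buffer writer for this sketch (hypothetical, not lizpack's API).
const Writer = struct {
    buf: []u8,
    len: usize = 0,

    fn byte(self: *Writer, b: u8) void {
        self.buf[self.len] = b;
        self.len += 1;
    }

    fn fixstr(self: *Writer, s: []const u8) void {
        std.debug.assert(s.len <= 31); // fixstr holds at most 31 bytes
        self.byte(0xa0 | @as(u8, @intCast(s.len)));
        for (s) |c| self.byte(c);
    }

    fn uint16(self: *Writer, v: u16) void {
        self.byte(0xcd); // uint 16 format byte, big-endian payload follows
        self.byte(@intCast(v >> 8));
        self.byte(@truncate(v));
    }
};

/// Map layout: every value is preceded by its field name.
fn encodeAsMap(w: *Writer, user_id: u16, status: u8) void {
    std.debug.assert(status <= 127); // must fit a positive fixint
    w.byte(0x82); // fixmap with 2 entries
    w.fixstr("user_id");
    w.uint16(user_id);
    w.fixstr("status");
    w.byte(status);
}

/// Array layout: values only, in field order.
fn encodeAsArray(w: *Writer, user_id: u16, status: u8) void {
    std.debug.assert(status <= 127); // must fit a positive fixint
    w.byte(0x92); // fixarray with 2 elements
    w.uint16(user_id);
    w.byte(status);
}

test "array layout drops the field names" {
    var map_buf: [32]u8 = undefined;
    var map_writer = Writer{ .buf = &map_buf };
    encodeAsMap(&map_writer, 2345, 1);

    var array_buf: [32]u8 = undefined;
    var array_writer = Writer{ .buf = &array_buf };
    encodeAsArray(&array_writer, 2345, 1);

    try std.testing.expectEqual(@as(usize, 20), map_writer.len);
    try std.testing.expectEqual(@as(usize, 5), array_writer.len);
}
```

The map layout is self-describing and tolerant of field reordering. The array layout is much smaller, but both sides must agree on field order.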

v0.3.0 adds a zero-error encoding API: types whose maximum encoded length is known at compile time can be encoded into a std.BoundedArray, so encoding cannot fail.

Enjoy the bliss of zero errors!!!

test "basic example bounded" {
    const CustomerComplaint = struct {
        user_id: u64,
        status: enum(u8) {
            received,
            reviewed,
            awaiting_response,
            finished,
        },
    };

    // look mom! no errors!
    const expected: CustomerComplaint = .{ .user_id = 2345, .status = .reviewed };
    const slice: []const u8 = lizpack.encodeCustomBounded(expected, .{}).slice();
    try std.testing.expectEqual(expected, lizpack.decode(@TypeOf(expected), slice));
}

It also continues my ruthless abuse of generics / reflection!

/// Returns longest possible length of MessagePack encoding for type T.
/// Raises compile error for unbounded types (slices).
pub fn largestEncodedSize(comptime T: type, format_options: FormatOptions(T)) usize {
    return switch (@typeInfo(T)) {
        .bool => 1, // see Spec, bools are one byte
        .int => switch (@typeInfo(T).int.signedness) {
            .unsigned => switch (@typeInfo(T).int.bits) {
                0...7 => 1, // pos fix int
                8 => 2, // uint 8
                9...16 => 3, // uint 16
                17...32 => 5, // uint 32
                33...64 => 9, // uint 64
                else => unreachable, // message pack supports only up to 64 bit ints
            },
            .signed => switch (@typeInfo(T).int.bits) {
                0...8 => 2, // int 8 TODO: optimize using pos/neg fix int?
                9...16 => 3, // int 16,
                17...32 => 5, // int 32
                33...64 => 9, // int 64
                else => unreachable, // message pack supports only up to 64 bit ints
            },
        },
        .float => switch (@typeInfo(T).float.bits) {
            32 => 5, // f32
            64 => 9, // f64
            else => unreachable, // message pack supports only 32 and 64 bit floats
        },
        .array => switch (@typeInfo(T).array.child) {
            u8 => switch (format_options) {
                .bin, .str => 5 + @typeInfo(T).array.len, // TODO: don't assume bin_32 and str_32
                .array => 5 + 2 * @typeInfo(T).array.len, // TODO: don't assume array_32
            },
            else => 5 + @typeInfo(T).array.len * largestEncodedSize(@typeInfo(T).array.child, format_options),
        },
        .optional => largestEncodedSize(@typeInfo(T).optional.child, format_options),
        .vector => 5 + largestEncodedSize(@typeInfo(T).vector.child, format_options) * @typeInfo(T).vector.len, // TODO: don't assume array_32
        .@"struct" => switch (format_options.layout) {
            .map => blk: {
                var size: usize = 5; // TODO: don't assume map_32
                inline for (comptime std.meta.fields(T), comptime std.meta.fields(@TypeOf(format_options.fields))) |field, field_option| {
                    size += 5 + field.name.len; // TODO: don't assume str_32
                    size += largestEncodedSize(field.type, @field(format_options.fields, field_option.name));
                }
                break :blk size;
            },
            .array => blk: {
                var size: usize = 5; // TODO: don't assume array_32
                inline for (comptime std.meta.fields(T), comptime std.meta.fields(@TypeOf(format_options.fields))) |field, field_option| {
                    size += largestEncodedSize(field.type, @field(format_options.fields, field_option.name));
                }
                break :blk size;
            },
        },
        .@"enum" => switch (format_options) {
            .str => blk: {
                comptime assert(@typeInfo(T).@"enum".is_exhaustive); // TODO: only exhaustive enums supported
                break :blk 5 + largestFieldNameLength(T); // TODO: don't assume str_32
            },
            .int => blk: {
                const TagInt = @typeInfo(T).@"enum".tag_type;
                break :blk largestEncodedSize(TagInt, void{});
            },
        },
        .@"union" => switch (format_options.layout) {
            .map => blk: {
                const size: usize = 1; // assumes fixmap
                var largest_field_size: usize = 0;
                inline for (std.meta.fields(T), std.meta.fields(@TypeOf(format_options.fields))) |field, field_option| {
                    const field_size: usize = 5 + field.name.len + largestEncodedSize(field.type, @field(format_options.fields, field_option.name)); // TODO: don't assume str_32
                    if (field_size > largest_field_size) {
                        largest_field_size = field_size;
                    }
                }
                break :blk size + largest_field_size;
            },
            .active_field => blk: {
                var largest_field_size: usize = 0;
                inline for (std.meta.fields(T), std.meta.fields(@TypeOf(format_options.fields))) |field, field_option| {
                    const field_size = 5 + field.name.len + largestEncodedSize(field.type, @field(format_options.fields, field_option.name)); // TODO: don't assume str_32
                    if (field_size > largest_field_size) {
                        largest_field_size = field_size;
                    }
                }
                break :blk largest_field_size;
            },
        },
        .pointer => switch (@typeInfo(T).pointer.size) {
            .One => largestEncodedSize(@typeInfo(T).pointer.child, format_options), // assumes the pointer's format options are those of its child
            else => @compileError("type: " ++ @typeName(T) ++ " not supported."),
        },
        else => @compileError("type: " ++ @typeName(T) ++ " not supported."),
    };
}
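
For readers who want to play with the idea in isolation, here is a much smaller sketch of the same technique, restricted to integers. `maxIntEncodedSize` is a hypothetical name, not part of lizpack. It uses `@bitSizeOf` and `std.math.minInt` instead of `@typeInfo` so that it does not depend on the `@typeInfo` field names, and like the listing above it conservatively never uses fixint for signed types.

```zig
const std = @import("std");

/// Upper bound on the MessagePack size of any value of integer type T.
/// Hypothetical, simplified sketch: integers only.
fn maxIntEncodedSize(comptime T: type) usize {
    const bits = @bitSizeOf(T);
    const signed = std.math.minInt(T) < 0;
    if (bits > 64) @compileError("MessagePack integers are at most 64 bits");
    if (!signed and bits <= 7) return 1; // positive fixint
    if (bits <= 8) return 2; // uint 8 / int 8
    if (bits <= 16) return 3; // uint 16 / int 16
    if (bits <= 32) return 5; // uint 32 / int 32
    return 9; // uint 64 / int 64
}

test "bounds follow the spec's integer formats" {
    try std.testing.expectEqual(@as(usize, 1), maxIntEncodedSize(u7));
    try std.testing.expectEqual(@as(usize, 2), maxIntEncodedSize(u8));
    try std.testing.expectEqual(@as(usize, 2), maxIntEncodedSize(i6));
    try std.testing.expectEqual(@as(usize, 9), maxIntEncodedSize(u64));
}

test "the bound can size a buffer at compile time" {
    // Because the result is comptime-known, it can be used as an array length,
    // which is what makes an encode API without a "buffer too small" error possible.
    const Buffer = [maxIntEncodedSize(u64)]u8;
    try std.testing.expectEqual(@as(usize, 9), @sizeOf(Buffer));
}
```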

v0.4.0

  • removes the encodeCustom / decodeCustom API; encode() / decode() now take an options argument instead (.{} for the defaults)
  • adds format customization for vectors, similar to arrays
  • fixes bug in signed integer serialization

v0.5.0

  • adds support for encoding / decoding arrays / slices of two-field structs as MessagePack maps.

Example:

test "maps" {
    const RoleItem = struct {
        username: []const u8, // key
        role: enum { admin, plebeian }, // value

    };

    const roles: []const RoleItem = &.{
        .{ .username = "sarah", .role = .admin },
        .{ .username = "bob", .role = .plebeian },
    };

    const format: lizpack.FormatOptions(@TypeOf(roles)) = .{ .layout = .map_item_first_field_is_key };

    const expected_bytes: []const u8 = &.{
        (lizpack.Spec.Format{ .fixmap = .{ .n_elements = 2 } }).encode(),
        (lizpack.Spec.Format{ .fixstr = .{ .len = 5 } }).encode(),
        's',
        'a',
        'r',
        'a',
        'h',
        0,
        (lizpack.Spec.Format{ .fixstr = .{ .len = 3 } }).encode(),
        'b',
        'o',
        'b',
        1,
    };
    var out: [1000]u8 = undefined;
    const encoded = try lizpack.encode(roles, &out, .{ .format = format });
    try std.testing.expectEqualSlices(u8, expected_bytes, encoded);
}
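
For anyone curious what those expected bytes look like as raw numbers, here is a standard-library-only sketch that builds the same fixmap by hand from the MessagePack spec. `encodeAsFixmap` is a hypothetical helper, not lizpack's API, and the `role` field is a plain `u8` standing in for the enum's integer tag.

```zig
const std = @import("std");

const RoleItem = struct {
    username: []const u8, // becomes the map key (fixstr)
    role: u8, // becomes the map value (positive fixint); stands in for the enum tag
};

/// Hypothetical helper (not lizpack's API): writes `items` as one MessagePack
/// fixmap into `out` and returns the part of `out` that was used.
fn encodeAsFixmap(items: []const RoleItem, out: []u8) ![]u8 {
    if (items.len > 15) return error.TooManyEntries; // fixmap holds at most 15 pairs
    if (out.len < 1) return error.NoSpaceLeft;
    var n: usize = 0;
    out[n] = 0x80 | @as(u8, @intCast(items.len)); // fixmap: 1000xxxx
    n += 1;
    for (items) |item| {
        if (item.username.len > 31) return error.KeyTooLong; // fixstr holds at most 31 bytes
        if (item.role > 127) return error.ValueTooLarge; // positive fixint is 0..127
        const needed = 1 + item.username.len + 1;
        if (out.len - n < needed) return error.NoSpaceLeft;
        out[n] = 0xa0 | @as(u8, @intCast(item.username.len)); // fixstr: 101xxxxx
        n += 1;
        for (item.username) |c| {
            out[n] = c;
            n += 1;
        }
        out[n] = item.role; // positive fixint: 0xxxxxxx
        n += 1;
    }
    return out[0..n];
}

test "two key-value pairs as a fixmap" {
    const roles = [_]RoleItem{
        .{ .username = "sarah", .role = 0 },
        .{ .username = "bob", .role = 1 },
    };
    const expected = [_]u8{
        0x82, // fixmap, 2 entries
        0xa5, 's', 'a', 'r', 'a', 'h', 0x00,
        0xa3, 'b', 'o', 'b', 0x01,
    };
    var out: [32]u8 = undefined;
    const encoded = try encodeAsFixmap(&roles, &out);
    try std.testing.expectEqualSlices(u8, &expected, encoded);
}
```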