How to free only the overhead for the arena allocator?

I’ve got a recursive descent parser for MessagePack that needs to allocate memory for the values it decodes.

The user provides their type, and the generics take care of the rest. Whatever I return, the user will just have to clean up themselves.

pub fn decodeAlloc(allocator: std.mem.Allocator, comptime T: type, in: []const u8) error{ OutOfMemory, Invalid }!T {
    var fbs = std.io.fixedBufferStream(in);
    var arena = std.heap.ArenaAllocator.init(allocator);
    errdefer arena.deinit();
    const res = decodeAny(T, fbs.reader(), fbs.seekableStream(), arena.allocator()) catch |err| switch (err) {
        error.OutOfMemory => return error.OutOfMemory,
        error.Invalid => return error.Invalid,
        error.EndOfStream => return error.Invalid,
    };
    if (fbs.pos != fbs.buffer.len) return error.Invalid;
    return res;
}

test "decode slice bools" {
    const decoded = try decodeAlloc(std.testing.allocator, []bool, &.{ 0b10010011, 0xc3, 0xc2, 0xc3 });
    defer std.testing.allocator.free(decoded);
    const expected: []const bool = &.{ true, false, true };
    try std.testing.expectEqualSlices(bool, expected, decoded);
}

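Did you spot the bug? The arena allocator has “overhead”. It requests memory from the child allocator in buffers, each with a small header it uses to track what it has allocated, and the slices it hands out point into those buffers. On the happy path, decodeAlloc never calls arena.deinit(), so those buffers leak, and the caller cannot free the result with the original allocator because the slice is not an allocation that allocator made.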
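To make that overhead visible, here is a small sketch, assuming a recent Zig standard library where std.heap.ArenaAllocator exposes queryCapacity(). The helper name arenaBytesFor is made up for illustration and is not part of the parser.

```zig
const std = @import("std");

// Hypothetical helper: allocates `n` bools from a fresh arena and reports how
// many usable bytes the arena is holding from its child allocator.
// queryCapacity() does not count the arena's own bookkeeping headers.
fn arenaBytesFor(child: std.mem.Allocator, n: usize) !usize {
    var arena = std.heap.ArenaAllocator.init(child);
    defer arena.deinit(); // releases every buffer the arena obtained
    _ = try arena.allocator().alloc(bool, n);
    return arena.queryCapacity();
}

test "arena holds at least what was requested" {
    // The arena grabs a whole buffer, so the capacity is at least 3 bytes.
    const held = try arenaBytesFor(std.testing.allocator, 3);
    try std.testing.expect(held >= 3);
}
```

Only arena.deinit() gives those buffers back to the child allocator, which is why returning a bare slice from an arena leaves the caller with nothing valid to free.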

How do I fix this? The primary reason I am using an arena is that a stream can decode happily without errors and still not end at the correct byte. In that case I want to return an error, and the arena lets me free everything decoded so far in one go. Is this a flaw in my decoding API? Should I just not return an error if the stream is too long, i.e. delete this line?

if (fbs.pos != fbs.buffer.len) return error.Invalid;

Maybe I could add an is_end: bool parameter to my recursive descent parser, but then I have to clutter every single parsing function with a check to see if it’s at the end.

I see std.json just returns the entire arena along with the parsed value.

Well, it makes my API uglier, but maybe the memory management is easier for the user, since they don’t have to write their own deinit methods.
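For reference, a minimal sketch of that std.json pattern, assuming a recent Zig standard library where std.json.parseFromSlice returns a std.json.Parsed(T) wrapper. The helper name parseBools is made up for illustration.

```zig
const std = @import("std");

// Hypothetical helper, only for illustration: parses a JSON array of bools.
// The returned std.json.Parsed([]bool) owns an arena holding every allocation
// made during parsing, so the caller frees everything with one deinit().
fn parseBools(allocator: std.mem.Allocator, text: []const u8) !std.json.Parsed([]bool) {
    return std.json.parseFromSlice([]bool, allocator, text, .{});
}

test "std.json returns the value together with its arena" {
    const parsed = try parseBools(std.testing.allocator, "[true, false, true]");
    defer parsed.deinit(); // frees the arena and everything allocated in it
    const expected: []const bool = &.{ true, false, true };
    try std.testing.expectEqualSlices(bool, expected, parsed.value);
}
```

The caller reads the result through parsed.value and never needs to know how many allocations the parser made.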

pub fn Decoded(comptime T: type) type {
    return struct {
        arena: *std.heap.ArenaAllocator,
        value: T,
        pub fn deinit(self: @This()) void {
            const allocator = self.arena.child_allocator;
            self.arena.deinit();
            allocator.destroy(self.arena);
        }
    };
}

pub fn decodeAlloc(allocator: std.mem.Allocator, comptime T: type, in: []const u8) error{ OutOfMemory, Invalid }!Decoded(T) {
    var fbs = std.io.fixedBufferStream(in);
    const arena = try allocator.create(std.heap.ArenaAllocator);
    errdefer allocator.destroy(arena);
    arena.* = .init(allocator);
    errdefer arena.deinit();
    const res = decodeAny(T, fbs.reader(), fbs.seekableStream(), arena.allocator()) catch |err| switch (err) {
        error.OutOfMemory => return error.OutOfMemory,
        error.Invalid => return error.Invalid,
        error.EndOfStream => return error.Invalid,
    };
    if (fbs.pos != fbs.buffer.len) return error.Invalid;
    return Decoded(T){ .arena = arena, .value = res };
}

test "decode slice bools" {
    const decoded = try decodeAlloc(std.testing.allocator, []bool, &.{ 0b10010011, 0xc3, 0xc2, 0xc3 });
    defer decoded.deinit();
    const expected: []const bool = &.{ true, false, true };
    try std.testing.expectEqualSlices(bool, expected, decoded.value);
}

The Zig std lib wins again!