I’ve got a recursive descent parser for MessagePack, and I want it to allocate memory for the decoded value.
The user provides their type, and the generics take care of the rest. Whatever I return, the user has to clean up themselves.
    pub fn decodeAlloc(allocator: std.mem.Allocator, comptime T: type, in: []const u8) error{ OutOfMemory, Invalid }!T {
        var fbs = std.io.fixedBufferStream(in);
        var arena = std.heap.ArenaAllocator.init(allocator);
        errdefer arena.deinit();
        const res = decodeAny(T, fbs.reader(), fbs.seekableStream(), arena.allocator()) catch |err| switch (err) {
            error.OutOfMemory => return error.OutOfMemory,
            error.Invalid => return error.Invalid,
            error.EndOfStream => return error.Invalid,
        };
        if (fbs.pos != fbs.buffer.len) return error.Invalid;
        return res;
    }

    test "decode slice bools" {
        const decoded = try decodeAlloc(std.testing.allocator, []bool, &.{ 0b10010011, 0xc3, 0xc2, 0xc3 });
        defer std.testing.allocator.free(decoded);
        const expected: []const bool = &.{ true, false, true };
        try std.testing.expectEqualSlices(bool, expected, decoded);
    }
Did you spot the bug? The arena allocator has “overhead”: besides the memory it hands out, it allocates some extra bookkeeping from the backing allocator to track what it has allocated. On the happy path, decodeAlloc never calls arena.deinit() (the errdefer only runs on errors), so I leak this extra memory.
How do I fix this? The main reason I am using an arena is that decoding can succeed and still leave unread bytes in the input. If the stream does not end at the correct byte, I want to return an error, and the arena lets me free everything decoded so far in one go. Is this a flaw in my decoding API? Should I just not return an error if the stream is too long, i.e. delete this line?
    if (fbs.pos != fbs.buffer.len) return error.Invalid;
Maybe I could add an is_end: bool parameter to my recursive descent parser, but then I have to clutter every single parsing function with a check to see if it's at the end.
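One direction that might avoid both options is to hand the arena back together with the decoded value, similar in spirit to how std.json.Parsed pairs a value with the arena that backs it. Below is a minimal sketch under stated assumptions: Decoded(T), decodeOwned and decodeBoolsStub are hypothetical names, and the stub only stands in for decodeAny so that the example is self-contained. It is not the real decoder.

```zig
const std = @import("std");

/// Hypothetical wrapper (not part of the original code): owns the arena
/// that backs `value`, so the caller frees everything with one deinit().
pub fn Decoded(comptime T: type) type {
    return struct {
        arena: *std.heap.ArenaAllocator,
        value: T,

        pub fn deinit(self: @This()) void {
            const backing = self.arena.child_allocator;
            self.arena.deinit(); // frees every buffer the arena obtained
            backing.destroy(self.arena); // frees the arena struct itself
        }
    };
}

/// Hypothetical stand-in for decodeAny: treats each input byte as one bool
/// (0xc3 = true, 0xc2 = false) and allocates the result from `arena`.
fn decodeBoolsStub(arena: std.mem.Allocator, in: []const u8) error{ OutOfMemory, Invalid }![]bool {
    const out = try arena.alloc(bool, in.len);
    for (in, out) |byte, *b| {
        b.* = switch (byte) {
            0xc3 => true,
            0xc2 => false,
            else => return error.Invalid,
        };
    }
    return out;
}

pub fn decodeOwned(allocator: std.mem.Allocator, in: []const u8) error{ OutOfMemory, Invalid }!Decoded([]bool) {
    // The arena lives on the heap so it can outlive this function.
    const arena = try allocator.create(std.heap.ArenaAllocator);
    errdefer allocator.destroy(arena);
    arena.* = std.heap.ArenaAllocator.init(allocator);
    errdefer arena.deinit();

    const value = try decodeBoolsStub(arena.allocator(), in);
    // A trailing-bytes check could return error.Invalid here;
    // the errdefers above would then release everything.
    return .{ .arena = arena, .value = value };
}

test "decoded value is released through its arena" {
    const decoded = try decodeOwned(std.testing.allocator, &.{ 0xc3, 0xc2, 0xc3 });
    defer decoded.deinit();
    const expected: []const bool = &.{ true, false, true };
    try std.testing.expectEqualSlices(bool, expected, decoded.value);
}
```

With this shape the caller calls decoded.deinit() instead of allocator.free(decoded), and the end-of-input check can stay in the outer function rather than being threaded through every parsing function. The trade-off is that the return type changes from T to a wrapper around T.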