Tying down logic and memory management

Hi everyone,

I have been using Python for a while and when trying to do things in Zig, I often find myself in some sort of analysis paralysis.

In this case, I am trying to implement a very basic HTTP/1.1 server as part of my learning process and to satisfy my curiosity. While implementing the routing and responses, I found myself in the following situation:

const Header = struct {
    name: []const u8,
    value: []const u8,
};

const Response = struct {
    start_line: []const u8,
    headers: []const Header,
    message_body: []const u8,
};

fn router(r: Request) !void {
    var response: Response = undefined;
    if (std.mem.eql(u8, r.path, "/bla")) {
        // version 1
        var header_list = std.ArrayList(Header).init(allocator);
        const content_length = try std.fmt.allocPrint(allocator, "{}", .{r.start_line.request_target.len - 6});
        try header_list.append(Header{ .name = "Content-Type", .value = "text/plain" });
        try header_list.append(Header{ .name = "Content-Length", .value = content_length });
        response = Response{
            .start_line = .{ .status_code = 200, .reason_phrase = "OK" },
            .headers = header_list.toOwnedSlice(),
            .message_body = r.start_line.request_target[6..],
        };

        // version 2 (alternative to version 1)
        response = try route(allocator, r);
    }
    _ = try stream_out.print("{}\n", .{response});
}

fn route(a: Allocator, r: Request) !Response {
    var header_list = std.ArrayList(Header).init(a);
    const content_length = try std.fmt.allocPrint(a, "{}", .{r.start_line.request_target.len - 6});
    try header_list.append(Header{ .name = "Content-Type", .value = "text/plain" });
    try header_list.append(Header{ .name = "Content-Length", .value = content_length });
    return Response{
        .start_line = .{ .status_code = 200, .reason_phrase = "OK" },
        .headers = header_list.toOwnedSlice(),
        .message_body = r.start_line.request_target[6..],
    };
}

The code above is not exactly what I have, except for the function body, which is almost the same.


I feel a bit stuck because I am trying to find the best solution, but nothing really convinces me. So maybe you could advise me on how to proceed. This is not just a technical question but also a philosophical one.


After all my reading, I see the following options:

  1. Move the _ = try stream_out.print("{}\n", .{response}); into the route(). This fixes the lifetime issues since I never need to access response outside of route(). However, it feels a bit ugly, because later on I might want to add extra headers, which would require managing them within my “business logic.” It makes things trivial, but I don’t really learn anything new this way.

  2. Use an arena allocator. This would be the lazy approach, as I wouldn’t need to manage memory as long as I don’t run out of it. Potentially inefficient. I’m not at the point where I’m creating so many resources that I might run out of memory, but that is still something I don’t learn.

  3. Allocate the entire Response structure and then release everything. This seems inefficient, because I’d be allocating copies of constants that already live in the program’s static memory.

  4. Use “bookkeeping”, which I only read about yesterday: have a structure that records all the pointers I need to free.

Any thoughts? Feel free to share links to anything you consider insightful.

P.S. I know there have been a few similar questions here on the forum. I’ll probably re-read them later today.

If you’re always going to allocate two headers, there’s no point in using an ArrayList, just allocate the two headers directly.
The Response start_line field is not what you’re showing in the code, as you declared it as being a []const u8. Anyways, since every status code can be mapped to a single reason phrase, there’s no point in storing both. It’s a waste and it creates opportunities for these two fields to go out of sync. Store just the code, and have a function that returns the phrase given a code.
But there’s something better than everything I wrote here. Just take a Writer as a parameter. Instead of allocating, just write the response directly into the writer.
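A minimal sketch of both suggestions together, assuming Zig 0.11 or newer (for `@intFromEnum`) and a release where `std.io.fixedBufferStream` is still available for the test. `StatusCode`, `reasonPhrase`, and `writeEcho` are hypothetical names, not something from the original code:

```zig
const std = @import("std");

// Hypothetical enum: only the code is stored, the phrase is derived from it.
const StatusCode = enum(u16) {
    ok = 200,
    not_found = 404,

    fn reasonPhrase(self: StatusCode) []const u8 {
        return switch (self) {
            .ok => "OK",
            .not_found => "Not Found",
        };
    }
};

// Writes the whole response straight into `writer`; nothing is heap-allocated,
// so there is no lifetime to track afterwards.
fn writeEcho(writer: anytype, status: StatusCode, body: []const u8) !void {
    try writer.print("HTTP/1.1 {d} {s}\r\n", .{ @intFromEnum(status), status.reasonPhrase() });
    try writer.writeAll("Content-Type: text/plain\r\n");
    try writer.print("Content-Length: {d}\r\n\r\n", .{body.len});
    try writer.writeAll(body);
}

test "echo response is written without allocating" {
    var buf: [128]u8 = undefined;
    var stream = std.io.fixedBufferStream(&buf);
    try writeEcho(stream.writer(), .ok, "hello");
    try std.testing.expectEqualStrings(
        "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nContent-Length: 5\r\n\r\nhello",
        stream.getWritten(),
    );
}
```

In the server, the same function would presumably receive the connection’s writer instead of the fixed buffer used in the test.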


what?? Using an arena IS managing memory, the entire point is to condense lifetimes to a singular backing buffer. It’s perfectly valid to choose an easier solution, it’s not lazy.

you could do that, yes it’s inefficient.

that is a solution.

You’re making a toy http server to learn, the only criterion to measure solutions against is how much you learn from them, the only way to know that is to try them all :3.

To extract something more generic from what @LucasSantos91 said, if you can trivially extract information from another piece of information, you only need to store one of them.
It’s perfectly fine to write code that accounts for a future requirement, but it’s also not necessary.
You can leave decisions up to the discretion of the caller/user, they tend to have more circumstantial information that helps make decisions.


I hope you are enjoying your Zig journey. It can be very frustrating at the start, but once you get going with it, it’s just so nice to write and read!

I’d suggest that arenas are perfectly suited to this type of problem, where everything in the response has the same lifetime. Once the response is completed, there is usually no need for any of the memory related to that response to hang around. And if there are a few things that do need to have a lifetime outside of the response, use a different allocator for those few.
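As a rough sketch of that idea: one arena per response, freed in a single call. `buildHeaders` is a hypothetical helper, and the test uses `std.testing.allocator` as the backing allocator only because it reports leaks:

```zig
const std = @import("std");

const Header = struct {
    name: []const u8,
    value: []const u8,
};

// Hypothetical helper: builds the two headers with whatever allocator it is given.
// With an arena there is no need to remember which strings were heap-allocated.
fn buildHeaders(a: std.mem.Allocator, body: []const u8) ![]Header {
    const headers = try a.alloc(Header, 2);
    headers[0] = .{ .name = "Content-Type", .value = "text/plain" };
    headers[1] = .{
        .name = "Content-Length",
        .value = try std.fmt.allocPrint(a, "{d}", .{body.len}),
    };
    return headers;
}

test "one arena per response" {
    var arena = std.heap.ArenaAllocator.init(std.testing.allocator);
    // Frees the header slice and the formatted string together.
    defer arena.deinit();

    const headers = try buildHeaders(arena.allocator(), "hello");
    try std.testing.expectEqualStrings("5", headers[1].value);
}
```

If `buildHeaders` were called with a general-purpose allocator instead, the caller would have to free `headers[1].value` and the slice separately, which is exactly the bookkeeping the arena avoids.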

But I think you are coming up against one of the big differences between programs written in low-level languages and what you are used to. In Python, it’s perfectly normal to build a response totally in memory and send that out all at once.

But when writing this in Zig, what @LucasSantos91 suggested could be a better approach. You just stream your response as you build it into a write buffer that is sized nicely for your underlying network transport. Once the buffer fills, it gets flushed automatically. Write some more response and repeat. For all your code would care it would be the same as writing to a string buffer, except it is a string buffer that sends itself across a TCP/IP connection.

The Zig standard library has a lot of what you need under std.io with things like BufferedWriter. Have a look and see if that helps simplify what you are trying to do.

Happy tinkering.


Sorry for that confusing example. I’m currently experimenting with different approaches simultaneously. To post an example here, I tried to Frankenstein various parts of my code together, and it didn’t work out that well.

I’m absolutely with you regarding using an enum for the status code. I just didn’t get to that at the moment of posting.

I’ve implemented that approach, but there’s a minor drawback: if there’s something in between the server and the client that needs a header from the server to let the response through, then I have to work with []u8 to add it. And maybe something else can occur. So, as part of my learning, I’m trying to return the entire structure instead.


That was a really poor choice of wording on my part. What I meant by “lazy” is that I would avoid getting into the nitty-gritty details of memory management that would be required if I didn’t use an arena allocator.

Thanks, I guess that’s basically what I’ll need to do in the end. I’m a bit averse to trying out things without knowing the outcomes in advance, which is not helping me here.

I guess for me it’s more about not really knowing what I’m doing or how to approach learning in this area. So the requirements end up being rather frivolous, shifting, and open-ended.

Perhaps in this case, the best thing would be to have a header iterator that progressively parses the request headers. Then you do what you need to with them, and you finish by writing a response to a Writer.

I absolutely love Zig. I guess the only problem is that I forgot how to learn. I also have way more responsibilities now, so it’s always a time crunch :)

This was my big worry, I couldn’t find an example that would pass multiple allocators, but I was thinking about that.

Yeah, lifetimes and types ([_]u8 vs []u8) hit especially hard after Python.

Thanks!

So, RFC 9112 - HTTP/1.1 states that headers are typically parsed into a hash table. That’s why I’m trying to simulate how to make it extendable.

Here is my current code for this. It feels a bit strange but it works: a partially allocated response is returned from the function to the outer scope, and then freed at a later point.

const AllocHeader = struct {
    name: []const u8,
    name_alloc: bool,
    value: []const u8,
    value_alloc: bool,
    allocator: std.mem.Allocator,

    pub fn deinit(self: AllocHeader) void {
        if (self.name_alloc) {
            self.allocator.free(self.name);
        }
        if (self.value_alloc) {
            self.allocator.free(self.value);
        }
    }
};

const AllocHttpResponse = struct {
    start_line: StatusLine,
    fields: []AllocHeader,
    fields_alloc: bool,
    message_body: ?[]const u8 = null,
    allocator: std.mem.Allocator,

    pub fn deinit(self: AllocHttpResponse) void {
        for (self.fields) |field| {
            field.deinit();
        }
        if (self.fields_alloc) {
            self.allocator.free(self.fields);
        }
    }

    pub fn format(
        self: @This(),
        _: []const u8,
        _: std.fmt.FormatOptions,
        writer: anytype,
    ) !void {
        try writer.print("{s}\r\n", .{self.start_line});
        for (self.fields) |field| {
            try writer.print("{s}: {s}\r\n", .{ field.name, field.value });
        }
        _ = try writer.write("\r\n");
        if (self.message_body) |message_body| {
            try writer.print("{s}", .{message_body});
        }
    }
};

And then it is being called like that:

fn handler(...) {
        const response = try route_get_echo(allocator, request);
        // defer so the response is also freed when print() fails
        defer response.deinit();
        try stream_out.print("{s}", .{response});
}

Edit: In the end I have really messed up my original example.

That’s not a hash table…
There is std.StringHashMap, you wouldn’t need to store the name in the value since it’s stored in the keys.
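A small sketch of what that could look like, assuming the managed `std.StringHashMap` API (`init`, `put`, `get`, `deinit`). Note that the map only stores the slices: it neither copies nor frees the strings, and lookups are case-sensitive, whereas HTTP field names are not:

```zig
const std = @import("std");

test "headers keyed by name" {
    // Key: field name, value: field value. The map does not own the strings.
    var headers = std.StringHashMap([]const u8).init(std.testing.allocator);
    defer headers.deinit();

    try headers.put("Content-Type", "text/plain");
    try headers.put("Content-Length", "5");

    try std.testing.expectEqualStrings("text/plain", headers.get("Content-Type").?);
    // Absent headers come back as null instead of an error.
    try std.testing.expect(headers.get("Server") == null);
}
```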

This is a great use case for an arena, you wouldn’t need to keep track of if the name or value have been allocated, you’d just deinit the arena when you’re done and the allocations go poof. Ofc you wouldn’t need to store the allocator, though you shouldn’t need to now either.

Regardless, if you use the same allocator for every header, you should store it once outside the headers instead of keeping a copy in each one.

Sorry, my current code is really confusing, I didn’t really implement that far, yet. I started with “something” and I am trying to learn how everything works.

Specifically because I don’t need to do anything with header fields yet, I decided to first start with [10][]u8. But then of course this limits the number of headers I can add. So an array seemed like a “good next step”.

That being said, I appreciate you pointing that out.

if you use the same allocator for every header, you should store it outside the header, instead of a copy with each header.

Thanks, that makes sense. I am sort of curious whether I can create the AllocHeader in AllocHttpResponse.init(). Then I could actually guarantee that the allocator passed to the AllocHttpResponse is the one used for allocating the AllocHeader.

You can also just have the caller deal with managing the allocators, in most cases they only have one, maybe some sub arenas.

That is what std is moving towards.
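One way to read that, sketched under assumptions: the struct stores no allocator at all, and the caller hands the same one to both `initOwned` and `deinit`. Both function names are hypothetical, and this version copies both strings, so no `name_alloc`/`value_alloc` flags are needed:

```zig
const std = @import("std");

const Header = struct {
    name: []const u8,
    value: []const u8,

    // The allocator is a parameter, not a field; the caller decides which one to use.
    fn initOwned(a: std.mem.Allocator, name: []const u8, value: []const u8) !Header {
        const name_copy = try a.dupe(u8, name);
        // If copying the value fails, do not leak the name copy.
        errdefer a.free(name_copy);
        return .{ .name = name_copy, .value = try a.dupe(u8, value) };
    }

    // Must be called with the same allocator that was passed to initOwned.
    fn deinit(self: Header, a: std.mem.Allocator) void {
        a.free(self.name);
        a.free(self.value);
    }
};

test "caller supplies the allocator at each call" {
    const a = std.testing.allocator;
    const h = try Header.initOwned(a, "Content-Length", "5");
    defer h.deinit(a);
    try std.testing.expectEqualStrings("5", h.value);
}
```

The trade-off is that nothing stops a caller from passing a different allocator to `deinit`; the struct stays smaller, but that guarantee becomes the caller’s responsibility.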