HTTP Server with new std.Io

I decided to try Zig (I come from a Golang background) and have been reading up on the new Io interface, which seems quite exciting. To understand it better I put together this basic HTTP server. It seems to run fine, even for concurrent requests. The focus is not on the caveman-like HTTP request parsing, but rather on handling new connections with io.async to support simultaneous connections (if the Io implementation supports concurrency). I assume that before std.Io, this would have been done with a manual thread pool or event loop?

Please let me know if there is a major flaw in this implementation or room for improvement, thanks!

const std = @import("std");
const Io = std.Io;
const net = std.Io.net;
const fmt = std.fmt;
const Allocator = std.mem.Allocator;

pub fn main(init: std.process.Init) !void {
    const portText = init.environ_map.get("PORT") orelse "8080";
    const port = try fmt.parseInt(u16, portText, 10);
    var allocator = std.heap.stackFallback(1024, init.arena.allocator());
    try startServer(allocator.get(), init.io, port);
}

fn startServer(allocator: Allocator, io: Io, port: u16) !void {
    std.log.info("Starting server on {d}", .{port});
    const address: net.IpAddress = try .parseIp4("0.0.0.0", port);
    var server = try address.listen(io, .{ .reuse_address = true });
    while (true) {
        std.log.debug("Waiting for connection", .{});
        const stream = try server.accept(io);
        _ = io.async(handleStream, .{ allocator, io, stream });
    }
}

fn handleStream(allocator: Allocator, io: Io, stream: net.Stream) !void {
    _ = allocator;
    defer stream.close(io);

    // read request
    var buf: [100]u8 = undefined;
    var reader = stream.reader(io, &buf);
    try reader.interface.fillMore();
    const request = reader.interface.buffered();
    std.log.debug("request (N={d}): {s}", .{ request.len, request });

    // simulate db work
    try io.sleep(.fromSeconds(2), .awake);

    // write response
    var writer = stream.writer(io, &.{});
    const body =
        \\{
        \\  "name": "Daniel", 
        \\  "age": 28
        \\}
    ;
    const response = fmt.comptimePrint(
        \\HTTP/1.1 200 OK
        \\Content-Length: {d}
        \\Content-Type: application/json
        \\
        \\{s}
    , .{ body.len, body });
    try writer.interface.writeAll(response);
}

As I am reviewing it, the first thing I realize is that I probably need a separate allocator per stream handler to avoid race conditions on the global allocator?

Here’s how it used to work in the server binary used to serve the standard library documentation:

while (true) {
    const connection = try http_server.accept();
    _ = std.Thread.spawn(.{}, accept, .{ &context, connection }) catch |err| {
        std.log.err("unable to accept connection: {s}", .{@errorName(err)});
        connection.stream.close();
        continue;
    };
}

I think this is probably intentionally crude and a large-scale server would have indeed used std.Thread.Pool.

This is probably true from what I’ve read; however, the standard library does have some dedicated thread-safe allocators:

To just parrot the documentation and offer nothing more helpful, it seems like the SmpAllocator maintains some low-level allocation primitives per thread, without having to have an entire separate allocator for each thread.
The thread-safe FixedBufferAllocator just uses atomic loads/stores, since it doesn’t actually have to interact with virtual memory.

Both of those are probably better than doing it manually, and by using them in conjunction you can probably hack together a solution that works with StackFallbackAllocator.
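To make the thread-safe FixedBufferAllocator concrete, here is a minimal sketch using the stable std.heap API (this part is independent of the new std.Io, so the names should be reliable):

```zig
const std = @import("std");

pub fn main() !void {
    var buffer: [4096]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buffer);

    // threadSafeAllocator() serves allocations from the same fixed buffer
    // using atomic operations, so multiple threads/tasks can share it
    // without a mutex. Frees other than the most recent allocation are no-ops.
    const allocator = fba.threadSafeAllocator();

    const slice = try allocator.alloc(u8, 16);
    defer allocator.free(slice);
    std.debug.print("allocated {d} bytes\n", .{slice.len});
}
```

Note that a FixedBufferAllocator never touches the OS: when the buffer is exhausted, allocations simply fail with error.OutOfMemory.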


You probably want each request to have its own arena. init.gpa will always be thread safe, and putting an ArenaAllocator on top of it works here because each request’s arena is only ever touched by its own task. If you have a static upper bound, a FixedBufferAllocator is great, and can be made by each task on its own stack.
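A minimal sketch of the per-request arena idea, using the stable std.heap API (the handler name and signature are made up for illustration):

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;

// `gpa` is the shared thread-safe base allocator (e.g. init.gpa or
// std.heap.smp_allocator). Each request builds a private arena on top
// of it, so per-request allocations need no individual frees and no
// synchronization among tasks.
fn handleRequest(gpa: Allocator) !void {
    var arena_state = std.heap.ArenaAllocator.init(gpa);
    defer arena_state.deinit(); // one bulk free when the request ends

    const arena = arena_state.allocator();
    const scratch = try arena.alloc(u8, 256); // never freed individually
    _ = scratch;
}
```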

Some important things:
You must always await or cancel a future. The implementation can only clean up and reuse the resources used for the future once it knows you have gotten the result. This can be done via a group or other mechanism.
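As an illustration of that obligation in the accept loop from the original post, here is a sketch. The std.Io future API is still evolving, so the await method and its shape here are assumptions, not a definitive reference:

```zig
// Inside startServer's accept loop (sketch; std.Io API may differ):
var future = io.async(handleStream, .{ allocator, io, stream });

// Awaiting immediately like this serializes the loop, which defeats the
// concurrency. In a real server you would collect futures in a group (or
// similar) and await/cancel them there; this only shows that every future
// must eventually be awaited or canceled.
future.await(io) catch |err| {
    std.log.err("request failed: {s}", .{@errorName(err)});
};
```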

What behaviour do you want when you reach the concurrency limit? There is no artificial limit by default (you can set one if you want), but there is a practical limit from your hardware.
Currently, you will just max out your CPU, leaving pending connections to time out.
Usually you will want to gracefully tell clients that you don’t have resources available.
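One hypothetical way to enforce such a limit is an atomic in-flight counter; when a slot can’t be acquired, the handler would answer 503 Service Unavailable and close the connection. The limit, names, and overall shape below are made up for illustration (std.atomic.Value is stable std API):

```zig
const std = @import("std");

const max_in_flight = 256; // arbitrary illustrative limit
var in_flight = std.atomic.Value(u32).init(0);

// Returns true if the caller acquired a slot and may serve the request;
// on false, the caller should send a 503 response and close the stream.
fn tryAcquireSlot() bool {
    var current = in_flight.load(.monotonic);
    while (current < max_in_flight) {
        // cmpxchgWeak returns null on success, or the observed value on failure.
        current = in_flight.cmpxchgWeak(
            current,
            current + 1,
            .monotonic,
            .monotonic,
        ) orelse return true;
    }
    return false;
}

fn releaseSlot() void {
    _ = in_flight.fetchSub(1, .monotonic);
}
```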

When your server exits, you probably want to finish outstanding requests if possible, or otherwise gracefully end requests early. This can be done with cancellation.
Currently you will just hard-kill the application, including tasks handling requests, leaving clients to time out.


There’s a good example of a server with the Io interface in std.Build.WebServer. If you ignore all the build-related parts, the start(), serve() and accept() functions get you a decent starting point.


Aaaa, hidden gem!


What about a threadlocal fallback allocator in the same server setup, backed by the thread-safe SMP allocator? From a newcomer’s perspective this seems like a pretty ideal allocator for 99% of basic web servers:

threadlocal var main_allocator = std.heap.stackFallback(8192, std.heap.smp_allocator);

Unless you are using threads directly, don’t use threadlocal.

Io implementations can, and do, use other units of concurrency, such as Evented, which uses green threads/fibres: basically user-made threads on top of OS threads.

threadlocal is for OS threads; multiple fibres may share an OS thread, including its thread locals, and they may not always run on the same thread either.


Thanks again for any feedback, I’m really learning so much. So should my allocator choice be influenced by my Io implementation choice? That is, use threadlocal if Threaded, otherwise a normal var?

No it shouldn’t.

And threadlocal is not an allocator, it is a kind of global variable.

Even with Threaded you should not use it because your code should not depend on the Io implementation you use.

I think this is the key that is not so obvious to a newcomer - you should (at least often) be open to a swap-in of a different Io. E.g., you might want to use green threads, another might want to use (your code with) OS threads only, or single-threaded, even. (By “another”, I’m assuming a lib or someone who wants to use your code a little differently, but it could even be you, wanting to profile your own design running on async green threads versus running on OS threads – the hope is that such a swap-in is easy-peasy, which is why Io is designed as it is, to the point of avoiding function coloring.)

The same is sort of true with allocators, though there’s no tight coupling of allocator choice and Io choice, of course. But I often use the debug allocator in tests, to confirm no leaks, and then make a different choice for production. Sometimes, especially with something like an arena, it doesn’t make a lot of sense to assume that somebody will want to swap in any other kind of allocator, but they may want to try a different kind of arena, or another allocator that offers the .reset() function, and so behaves “that way”.
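A hypothetical shape for that debug/production swap, selecting the allocator by build mode (the function name is made up; std.heap.DebugAllocator and std.heap.smp_allocator are the stable names in recent Zig releases):

```zig
const std = @import("std");
const builtin = @import("builtin");

// Global state for the leak-checking allocator; only used in Debug builds.
var debug_allocator_state: std.heap.DebugAllocator(.{}) = .init;

// Callers only ever see std.mem.Allocator, so swapping the backing
// implementation requires no changes elsewhere in the code.
pub fn backingAllocator() std.mem.Allocator {
    return switch (builtin.mode) {
        .Debug => debug_allocator_state.allocator(),
        else => std.heap.smp_allocator,
    };
}
```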

A consideration which hadn’t yet occurred to me.

Perhaps the Io abstraction needs a concept of a “task local” variable to go with it? Nothing wrong with threadlocal, but you’re right, as of 0.16 it’s much easier to use it incorrectly.

How would that be any different from an argument to the function you are calling via async or concurrent?


For anyone else who wants to reach for the fixed buffer allocator or stack fallback because you don’t want the allocator to request/free memory from the OS every time you allocate/free a few bytes, take note: many of the std allocators (like the SMP allocator) already have a built-in cache that mitigates this. Something that seems like “duh” now that I think about it, but I missed it initially.


The same way that a threadlocal or global var is different from a local, surely.

It’s possible for a language to demand strictly local state, but I’m glad Zig doesn’t force me to wear that particular straitjacket.

But I’m just spitballing over here. I’ll want some experience with the new way of doing things before developing obnoxiously confident opinions about it.


Sure, I just don’t see a use case for them, TBH I barely see the point of even threadlocal.

And if there does exist a use case for such a feature, I think there is a good chance threadlocal already fulfils it, or could fulfil it. Io implementations could even support setting up thread-local storage on their own unit of concurrency! I am not entirely sure that is possible, but it might be.


429 Too Many Requests :scream: