The impact of async (non-blocking I/O) on network services

Will networked services see a change in performance once async/await is added to the language?

You are asking for a prediction of how the performance of network services will change if async/await is implemented. The future is hard to predict, in particular whether async/await will be added to Zig at all.
See Andrew Kelley’s answer about async in the 2024 Zig roadmap.

If the async model is ever reintroduced to the language in some form, I imagine its performance will be comparable to using callbacks (TigerBeetle, Bun) or completions (libxev-based projects), since any good implementation will use the same async I/O interfaces on the major platforms (io_uring, kqueue, IOCP). So adding async support would largely be a usability improvement, not so much a capability or performance one.

I’d like to note that async/await and async I/O are actually orthogonal. The first is about coroutines (fibers, green threads…) and the second is about mechanisms like IOCP and io_uring. One can certainly use async I/O inside coroutines. As for the performance impact, my speculation is that coroutines require context switching, which adds some overhead, even though the context of a coroutine is much smaller than that of an OS thread.

Not necessarily. As @tensorush points out, there’s a distinction here between non-blocking IO (sometimes called ‘evented’) and async.

Async is one strategy for using non-blocking IO. It lets you suspend a function until the resource is ready, then resume/await that function.
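To make the suspend/resume idea concrete, here is a minimal sketch. It uses a Python generator as a stand-in for a coroutine, not Zig's async. The names `greet`, `mailbox` and `demo` are made up for illustration. Note that no I/O is involved at all: the function simply gives up control until the thing it is waiting for is available.

```python
def greet(mailbox):
    """Coroutine stand-in: suspends while the mailbox is empty."""
    while not mailbox:
        # Suspension point: control goes back to whoever is driving us.
        yield "waiting"
    return "hello, " + mailbox.pop(0)


def demo():
    mailbox = []
    frame = greet(mailbox)             # create the suspended function
    assert next(frame) == "waiting"    # run until the first suspension
    mailbox.append("zig")              # the "resource" becomes ready
    try:
        next(frame)                    # resume from the suspension point
    except StopIteration as done:
        return done.value              # the coroutine's return value
```

Calling `demo()` drives the coroutine by hand; an event loop would do the same driving whenever a resource became ready.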

There are many others. The simplest for a single-threaded system is arguably polling: an event loop periodically polls resources and calls the appropriate functions when they are ready. This is often engineered with callbacks, but that isn’t actually necessary. If the consumer is statically known, so that you don’t have to dynamically subscribe to and unsubscribe from a resource, then callbacks might be overkill.

I’m using the term “polling” somewhat generally here, to abstract over the various OS-specific ways to be alerted when a resource is ready without blocking the thread in the process.
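As a rough sketch of polling without callbacks, the Python below uses the standard `selectors` module, which wraps whichever readiness mechanism the OS offers. The names `poll_once`, `control` and `payload` are hypothetical. The two consumers are known up front, so a plain branch does the dispatch and there is no callback registry.

```python
import selectors
import socket


def poll_once(sel, control, payload, timeout=0.0):
    """Poll both sockets once; return (source, bytes) for each ready one."""
    handled = []
    for key, _events in sel.select(timeout):
        data = key.fileobj.recv(4096)
        # Consumers are statically known, so branch directly.
        if key.fileobj is control:
            handled.append(("control", data))
        elif key.fileobj is payload:
            handled.append(("payload", data))
    return handled


def demo():
    control, control_peer = socket.socketpair()
    payload, payload_peer = socket.socketpair()
    sel = selectors.DefaultSelector()
    try:
        for sock in (control, payload):
            sock.setblocking(False)
            sel.register(sock, selectors.EVENT_READ)
        idle = poll_once(sel, control, payload)   # nothing is ready yet
        payload_peer.send(b"ping")
        ready = poll_once(sel, control, payload, timeout=1.0)
        return idle, ready
    finally:
        sel.close()
        for sock in (control, control_peer, payload, payload_peer):
            sock.close()
```

With a timeout of zero, `poll_once` never blocks the thread; it just reports what is ready right now.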

If you’re using an event loop library that is based on callbacks, then yes, you’ll need to use them. But an event loop can be as simple as while (true). You’ll want to add exponential backoff, which sleeps for progressively longer slices of time when there is nothing to do; otherwise you’ll pin a CPU and annoy the kernel. Add that, and you have a credible event loop.
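Here is one way that loop might look, sketched in Python rather than Zig. The function name `serve` and its parameters (`min_sleep`, `max_sleep`, and the injectable `sleep`) are assumptions made for the example. The socket is non-blocking, so an empty read raises `BlockingIOError` instead of stalling the thread.

```python
import socket
import time


def serve(rx, max_chunks, min_sleep=0.001, max_sleep=0.1, sleep=time.sleep):
    """Read max_chunks chunks from a non-blocking socket with backoff.

    Returns (chunks, sleeps), where sleeps lists each delay that was used.
    """
    chunks = []
    sleeps = []
    delay = min_sleep
    while len(chunks) < max_chunks:
        try:
            data = rx.recv(4096)
        except BlockingIOError:
            # Nothing to do: sleep, then double the delay up to the cap.
            sleeps.append(delay)
            sleep(delay)
            delay = min(delay * 2, max_sleep)
            continue
        chunks.append(data)
        delay = min_sleep  # work arrived, so go back to polling eagerly
    return chunks, sleeps


def demo():
    rx, tx = socket.socketpair()
    rx.setblocking(False)
    calls = []

    def fake_sleep(seconds):
        # Stand-in for time.sleep: deliver data during the third idle sleep.
        calls.append(seconds)
        if len(calls) == 3:
            tx.send(b"hi")

    try:
        return serve(rx, 1, min_sleep=1.0, max_sleep=4.0, sleep=fake_sleep)
    finally:
        rx.close()
        tx.close()
```

Resetting the delay whenever data arrives keeps latency low under load, while the cap bounds how long the loop can sleep through newly arrived work.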

Zig is a good language to try a focused solution like this. There are some more advanced solutions, like libxev, which provide affordances for something more complex, and sometimes that will be the right choice.

Async is not an optimization, so performance must be found elsewhere. It’s a control-flow primitive, and one which is independent of non-blocking, evented I/O. If we do get async back, the first place I’m going to apply it has nothing to do with networking at all.
