Io Event Loop Architecture

I’m trying to figure out how to use 0.16.0’s Io. I haven’t been able to find any examples of the basic event loop I think I need. Hopefully someone here can guide me a bit.

I’ve been incrementally porting my C library to Zig. I did some work with Readers and Writers and found the whole experience to be a joy. Next on the list is replacing the networking functionality. My current approach is messy with recursive mutexes, and I don’t want to port it directly. I’d like to fully utilize std.Io in the best way possible.

In this part of the library, I have Devices that I’m discovering with UDP and connecting to using TCP. The library allows a user to send commands and get replies to these commands. I’m going to ignore the UDP part for this discussion because I expect it will become obvious once the TCP is laid out.

Background:

  • There’s an init function that gets called to initialize the library. This will initialize the library’s context (e.g. Io and Allocator) as a global struct. This would also be the place to start the task with Io.
  • The library user can call a function to discover devices, and get a list of structs back. The user typically picks one from the list and frees the others. There’s a bunch of things that go into this device struct (from the UDP packet), but the important one is the resolved network address to use to open a TCP socket.
  • The user calls device.open(), which creates and opens a socket.
  • Various library functions wrap device.do_transfer(…) which sends some command bytes to the socket and waits for a matching command response. There needs to be a mutex to make sure simultaneous calls don’t intermix the outgoing framing and messages. There needs to be some kind of list/queue/event/something for each function call to wait for the matched reply. Replies come in the same order as outgoing commands.
  • Simultaneously with commands, there’s incoming stream data, which gets delivered to a callback registered on the device. I mention this because I think it means I can’t just wait for the reply with the device mutex locked in do_transfer(). I think I need some central device read thread that processes incoming data and notifies listeners when their data is available.
  • The user calls device.close() which needs to wait for any ongoing do_transfer calls then close the socket.
  • The user calls device.free()
  • The user calls library deinit() that shuts down any threads started in init() and frees any other resources. This is where I’d want to print error messages about devices that weren’t freed and so on.

My question is how do I organize things?

My best guess is:

  • Create a single task in init() using concurrent to read from all sockets. This task has access to a global list of sockets and associated devices. There’s some global mutex to protect this list.
  • Each device has a mutex and singly linked list of structs with reply buffer details, a status byte, and a std.Io.Event.
  • The device’s do_transfer() locks the device mutex, creates an std.Io.Event and adds it along with buffer details to the end of the linked list. It writes data to the socket. It then unlocks the mutex. It then waits on the Event.
  • When the main task gets a reply from the device’s socket it locks the device mutex, finds the first entry in the list, writes the reply to the buffer pointer, sets the status byte, and notifies the std.Io.Event to wake up.
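
The per-device bookkeeping in this guess can be sketched independently of std.Io. The following is a minimal illustration in Python rather than Zig, so it makes no claims about the std.Io API; `Pending`, `Device`, `send`, and `on_reply` are hypothetical names, and `threading.Event` stands in for std.Io.Event. It assumes, as stated above, that replies arrive in the same order as the commands.

```python
import threading
from collections import deque


class Pending:
    """One in-flight command waiting for its reply (hypothetical name)."""

    def __init__(self):
        self.event = threading.Event()
        self.reply = None


class Device:
    def __init__(self, send):
        self._send = send              # callable that writes bytes to the socket
        self._lock = threading.Lock()
        self._pending = deque()        # FIFO: replies arrive in command order

    def do_transfer(self, command):
        entry = Pending()
        with self._lock:
            # Enqueue and write under one lock so queue order matches wire order.
            self._pending.append(entry)
            self._send(command)
        # Wait outside the lock so the reader and other callers are not blocked.
        entry.event.wait()
        return entry.reply

    def on_reply(self, data):
        """Called by the reader once a complete reply has been parsed."""
        with self._lock:
            entry = self._pending.popleft()
        entry.reply = data
        entry.event.set()
```

The caller only holds the mutex while enqueuing and writing, so incoming stream data can still be processed while several transfers are waiting for their replies.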

Unknowns:

  • How do I create this list of sockets for the task to simultaneously select from?
  • How do I know which device the read comes from?
  • How do I add and remove sockets from this list while the select is running?
  • How do I signal to this task that it’s time to exit?
  • How do I wait for the task to exit?

Is this an efficient approach, or am I way off base?

Thanks for any pointers you can provide.


Unfortunately, this kind of architecture goes against the spirit of std.Io. In the std.Io model, you would have a separate task/future/thread/coroutine for each connection/socket. In each task, you handle the reads with the blocking reader API, then use a queue to communicate with the rest of the system as needed.

If you try to build an event loop out of std.Io, you will have a bad experience. std.Io.Batch allows this to some degree, but the version in Zig 0.16 doesn’t cover many networking needs. Maybe it will be better in Zig 0.17 or 0.18.
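
As a rough illustration of the one-task-per-connection shape, here is a sketch in Python rather than Zig, so none of it should be read as std.Io API. A thread stands in for the task, `queue.Queue` for the queue, and the frame layout (one kind byte, one length byte, then the payload) is invented purely for the example.

```python
import queue
import socket
import threading


def recv_exact(sock, n):
    """Blocking read of exactly n bytes; returns None if the peer closes first."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            return None
        buf += chunk
    return buf


def reader(sock, replies, on_stream):
    """One reader per connection: blocking reads, then dispatch by frame kind."""
    while True:
        header = recv_exact(sock, 2)
        if header is None:
            break
        kind, length = header[0:1], header[1]
        payload = recv_exact(sock, length)
        if payload is None:
            break
        if kind == b"R":
            replies.put(payload)   # command reply: hand to the waiting caller
        else:
            on_stream(payload)     # stream data: deliver to the callback
    replies.put(None)              # sentinel: connection closed


def demo():
    device_side, library_side = socket.socketpair()
    replies = queue.Queue()
    stream = []
    task = threading.Thread(target=reader,
                            args=(library_side, replies, stream.append))
    task.start()
    device_side.sendall(b"S\x03abc" + b"R\x02ok")
    first = replies.get()
    device_side.close()            # EOF makes the reader leave its loop
    task.join()
    library_side.close()
    return first, stream, replies.get()
```

Because each reader owns exactly one socket, there is no shared socket list to select over and no question of which device a read came from.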

If you want to use the event loop approach, looking outside of std.Io would be better.


For such a project, use Zig as the ‘best C’.

Separate your C project into two layers (if possible):

  • High level
  • Network Layer

Port your high-level layer to Zig.

Zig has excellent interop with C code, so use your C network layer instead of Io…

In 0.16.0 you will need Mutex, Semaphore, …, and time-outs from Io.
Thread is still not in Io.
You can also use “C” structs for network addresses or use Io.net; it’s up to you.

Recently I’ve shipped an nng (nanomsg-ng) wrapper built on top of the new std.Io APIs, and I ran into some similar questions while implementing a recv poller.

I have a Poller implementation that multiplexes recv operations across multiple nng sockets.
My use case only required recv-side polling, so I can’t really comment on integrating send operations into the same mechanism.

If you’re implementing a recv Poller, I’d strongly recommend using std.Io.Select.concurrent() rather than std.Io.Select.async().

I initially used std.Io.Select.async(), but ran into:

  • Receive operations would sometimes not be observed until a scheduler switch happened.
  • Communication tests between pollers became sensitive to task scheduling and would occasionally stall.

std.Io.Select.concurrent() behaved much more predictably for this use case.

For shutdown, I use nng’s aio_cancel / aio_stop APIs.

After stopping all AIOs, I wait for completion using std.Io.Group.
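
The stop-then-wait sequence can be sketched generically. This is a Python illustration, not nng or std.Io code: `ReaderGroup` is a hypothetical helper, `shutdown()` stands in for cancelling the pending receive, and joining the threads stands in for waiting on the group. It assumes the platform wakes a blocked `recv()` when the socket is shut down locally, as Linux and macOS do.

```python
import socket
import threading


class ReaderGroup:
    """Tracks reader threads so they can be stopped and awaited together."""

    def __init__(self):
        self._members = []             # (socket, thread) pairs

    def spawn(self, sock, handle):
        thread = threading.Thread(target=self._run, args=(sock, handle))
        self._members.append((sock, thread))
        thread.start()

    @staticmethod
    def _run(sock, handle):
        while True:
            data = sock.recv(4096)
            if not data:               # EOF or local shutdown: leave the loop
                return
            handle(data)

    def stop_and_wait(self):
        # Step 1: cancel every pending receive.
        for sock, _ in self._members:
            try:
                sock.shutdown(socket.SHUT_RDWR)
            except OSError:
                pass                   # already disconnected
        # Step 2: wait for every reader to finish, then release its socket.
        for sock, thread in self._members:
            thread.join()
            sock.close()
        self._members.clear()
```

Cancelling everything first and only then waiting keeps shutdown time bounded by the slowest reader instead of the sum of all of them.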


Thanks. That was the insight I was missing. I will give this a try and see how far I can get. I think it should map pretty cleanly.

Thanks for the heads up. I would rather build my architecture around what std.Io expects rather than try to force my expectations to fit, at least to start.

Yes, that’s a good idea if I can’t get std.Io to work. I do have some low level cross platform networking parts in my existing code that I could reuse. My hope was to ditch all of my old code (and old problems) and go for a pure Zig implementation.

Thanks! Cancellation is obviously the solution, but I just wasn’t seeing it. I feel a bit stupid really.
