The new `Io` abstraction

Beware, the following is a long-winded reflection on the current discussion on async and IO abstraction. It has no purpose other than to organize my thoughts, and it comes from someone with a limited understanding of what is being achieved.

In his last live stream, Andrew K. shared his reflection (intention?) to start working on async/await again and started introducing a new concept: the IO interface.

My understanding is that this Io interface is an inversion of control. It’s an abstraction that would be injected into any function that wants to do IO: reading/writing a file, sending/receiving on the network, etc. This interface would be similar to the Allocator interface in principle and would abstract any IO operation, so that an implementation could be plugged in independently of whatever you’re trying to do with this Io.

(We will set aside the complexity of implementing an Io abstraction that would satisfy all the constraints that any IO might require and trust the zig core team to figure out those pesky details. :wink: )

Andrew gave the example of a jpeg library that reads and writes jpeg files. You want the library to give you access to jpeg files, but in the same way that you don’t want the library to impose a way to allocate memory on you, you don’t want it to impose a way to do IO either. Is this going to be a file on a disk, or is it going to be buffers in memory? Are the IOs going to be handled in a single-threaded event loop, or are they going to be managed by a thread pool? etc. This will depend on the implementation of the Io interface, which the final consuming program can choose depending on its requirements.
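To make the jpeg example concrete, here is a minimal sketch in Python (used only as a stand-in; none of this is the actual `std.Io` API). The `Io` protocol, `MemoryIo`, `DiskIo` and `looks_like_jpeg` are all hypothetical names invented for illustration. The only real-world fact relied on is that a JPEG stream starts with the bytes `FF D8`.

```python
from typing import Protocol


class Io(Protocol):
    """Hypothetical Io interface: a fixed, known set of operations."""

    def read_all(self, name: str) -> bytes: ...
    def write_all(self, name: str, data: bytes) -> None: ...


class MemoryIo:
    """Io implementation backed by in-memory buffers."""

    def __init__(self):
        self.files = {}

    def read_all(self, name):
        return self.files[name]

    def write_all(self, name, data):
        self.files[name] = bytes(data)


class DiskIo:
    """Io implementation backed by files on disk."""

    def read_all(self, name):
        with open(name, "rb") as f:
            return f.read()

    def write_all(self, name, data):
        with open(name, "wb") as f:
            f.write(data)


JPEG_SOI = b"\xff\xd8"  # JPEG start-of-image marker


def looks_like_jpeg(io: Io, name: str) -> bool:
    """'Library' code: it only knows the Io interface, not where bytes live."""
    return io.read_all(name)[:2] == JPEG_SOI
```

The caller decides which implementation to hand in, for example `looks_like_jpeg(MemoryIo(), ...)` in a test and `looks_like_jpeg(DiskIo(), ...)` in a real program; the library function does not change.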

With this idea, the jpeg library is abstracted away from those concerns and can just rely on a standard interface to read and write “stuff” from some source or to some sink. In reality this goes beyond async/await and can be used to solve all kinds of problems. Is your OS posix or not? You don’t really care anymore. Want to intercept an IO operation so you can debug it or compute some statistics? Add a layer in the IO abstraction, etc.
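The "add a layer" idea can be sketched as a wrapper that forwards to another implementation while keeping statistics. This is a Python stand-in under assumptions: `MemoryIo`, `CountingIo`, `copy_file` and the `read_all`/`write_all` method names are hypothetical and not taken from any real Zig API.

```python
class MemoryIo:
    """Hypothetical in-memory Io implementation."""

    def __init__(self):
        self.files = {}

    def read_all(self, name):
        return self.files[name]

    def write_all(self, name, data):
        self.files[name] = bytes(data)


class CountingIo:
    """Hypothetical layer: same interface, forwards to `inner`, counts bytes."""

    def __init__(self, inner):
        self.inner = inner
        self.bytes_read = 0
        self.bytes_written = 0

    def read_all(self, name):
        data = self.inner.read_all(name)
        self.bytes_read += len(data)
        return data

    def write_all(self, name, data):
        self.bytes_written += len(data)
        self.inner.write_all(name, data)


def copy_file(io, src, dst):
    """Code under observation: unaware that it is being measured."""
    io.write_all(dst, io.read_all(src))
```

Because `CountingIo` exposes the same methods as the implementation it wraps, `copy_file` runs unchanged whether or not the extra layer is present.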

In my view, introducing this Io abstraction would bring zig closer to what developing JS in a browser looks like, where any IO is handled by your runtime and all you need to do is “await” it. The difference is that, instead of being a feature of the language itself, with keywords and function coloring and whatnot, it’s just an API in the Io interface.

There is a notion of stackful, stackless and green thread that popped up in the discussion but that’s for people interested in language design. A smolbrain like me is only interested in the impact from a zig developer perspective.

As a developer, I often agonize about how to design my APIs, and most of the time this agonizing is really about IO. The most general way of dealing with it is indeed the reader/writer interface, which I don’t really like using. Having to take an anytype and just expect the user to know what you want is really not great. People have rationalized it in many ways on this forum, but the lack of comptime interfaces in zig just makes this clunky. Most of the time, when I need to “parse” (or decode) a file, I just end up taking a slice containing the entire file and returning some sort of struct referencing data in that slice. That way my function is not involved in doing any IO, or even any allocation if the file format allows the data structures to be upper-bounded.
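The "take the whole file as a slice and return views into it" approach can be sketched like this in Python, as a rough analogue only. The file format here (a 4-byte `DEMO` magic, a little-endian u16 length, then the payload) is made up for the example, as are `Record` and `parse_record`.

```python
import struct
from dataclasses import dataclass

# Hypothetical header: 4-byte magic, then payload length as little-endian u16.
HEADER = struct.Struct("<4sH")


@dataclass
class Record:
    payload: memoryview  # a view into the caller's buffer, not a copy


def parse_record(data) -> Record:
    """Parse without doing any IO; the payload is not copied."""
    view = memoryview(data)
    if len(view) < HEADER.size:
        raise ValueError("truncated header")
    magic, length = HEADER.unpack_from(view, 0)
    if magic != b"DEMO":
        raise ValueError("bad magic")
    end = HEADER.size + length
    if end > len(view):
        raise ValueError("truncated payload")
    return Record(payload=view[HEADER.size:end])
```

The parser never opens a file or reads a socket, so the caller is free to obtain the bytes however it likes. The returned payload stays tied to the input buffer, which mirrors the lifetime coupling you get when a Zig struct references data in the input slice.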

However, coming back to the async/await subject, it seems to me that the Io abstraction is only going to solve one side of the async coin: IO contention. The other use of async is CPU contention, when you really want to multi-thread your code because some function is CPU bound and you want your other cores to do stuff in the meantime. I fail to see how this can be generalized, or maybe I misunderstand Andrew’s intention. Maybe that is a completely different problem that should be handled separately, in a different leg of the async story?


In the async-await-demo branch, there are two example implementations of std.Io provided: std.Io.EventLoop (M:N “green threads”) and std.Thread.Pool.

In both cases, if you use Io.async or Io.go to run a function, that will be multiplexed onto a thread, so it is appropriate for both I/O and CPU work.
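As a rough illustration of that point, here is a Python stand-in where the same calling code runs tasks either inline or on a thread pool, depending on which implementation is injected. `InlineIo`, `PoolIo`, `async_`, `checksum` and `checksum_all` are hypothetical names, not the `Io.async` or `Io.go` signatures from the branch. Note also that in standard CPython builds, threads do not run pure-Python CPU work in parallel, so this only shows the shape of the API, not the speedup.

```python
from concurrent.futures import Future, ThreadPoolExecutor


class InlineIo:
    """Hypothetical implementation: runs the task right away on the caller's thread."""

    def async_(self, fn, *args):
        fut = Future()
        try:
            fut.set_result(fn(*args))
        except Exception as exc:
            fut.set_exception(exc)
        return fut


class PoolIo:
    """Hypothetical implementation: multiplexes tasks onto a thread pool."""

    def __init__(self, workers=4):
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def async_(self, fn, *args):
        return self._pool.submit(fn, *args)

    def close(self):
        self._pool.shutdown(wait=True)


def checksum(data: bytes) -> int:
    """Stand-in for CPU-bound work."""
    return sum(data) % 65521


def checksum_all(io, chunks):
    """Start one task per chunk, then wait for all of them in order."""
    futures = [io.async_(checksum, chunk) for chunk in chunks]
    return [f.result() for f in futures]
```

`checksum_all` does not know or care whether the tasks ran inline or on worker threads; that choice belongs to whoever constructed the `io` value.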


I agree with this concern, but from what I understand of the IO interface, it won’t be an anytype, it will be like Allocator where you have a known set of methods that you can call on it. This avoids the anytype issue raised here.


2 posts were split to a new topic: User-friendlines of anytype

Sorry for being late to the party. I like the idea. It’s also close to something I was brainstorming over Christmas, but I couldn’t get it done during the break, so it’s left unfinished (and forgotten).

I’ve tried many different approaches and came to the conclusion that the typical Promise/Future is a poor fit for Zig, and that it’s basically inevitable to do coroutines with an intrusive Loop/Queue. I did a quick PoC and also came to the conclusion that there should be some my.io.* namespace. In my case it was a flat struct that accepted anytype in every method so it would be user-extensible, but maybe I went too far with that. I can elaborate a bit more if you wish, but IMHO it’s a bit off-topic now.

Anyway, the next thing I wanted to implement in that PoC was structured concurrency because in my opinion, it’s something which has to be provided by the framework, otherwise it’s too easy to shoot yourself in the foot.

How is this Io abstraction going to solve it? Sorry I did not watch the whole stream so I might be missing something but I skimmed through the codebase and examples and I don’t see any notion of nurseries or anything like that.

What I mean is that you have some “spawn point” where you can create new coroutines, but they will all be cancelled if there is any uncaught error in that block, and it should also play nicely with catch/defer/errdefer.
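A very rough sketch of such a spawn point, written in Python with plain threads and cooperative cancellation. This is not the PoC from the repo and not any Zig API; `Nursery`, `spawn`, `worker`, `failing` and `run_demo` are hypothetical names, and "cancelled" here only means that siblings are asked to stop via a shared event.

```python
import threading


class Nursery:
    """Hypothetical spawn point: joins every child on exit and asks
    the siblings to stop as soon as one child fails."""

    def __init__(self):
        self.cancelled = threading.Event()
        self._threads = []
        self._errors = []
        self._lock = threading.Lock()

    def spawn(self, fn, *args):
        def run():
            try:
                fn(self.cancelled, *args)
            except Exception as exc:
                with self._lock:
                    self._errors.append(exc)
                self.cancelled.set()  # cooperative cancel for siblings

        thread = threading.Thread(target=run)
        self._threads.append(thread)
        thread.start()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            self.cancelled.set()  # the block itself failed
        for thread in self._threads:
            thread.join()  # no child outlives the block
        if exc_type is None and self._errors:
            raise self._errors[0]  # surface the first child error
        return False


def worker(cancelled, log, name):
    # Event.wait returns True as soon as cancellation is requested.
    if cancelled.wait(timeout=5.0):
        log.append(name + " cancelled")
    else:
        log.append(name + " done")


def failing(cancelled):
    raise ValueError("boom")


def run_demo():
    log = []
    try:
        with Nursery() as nursery:
            nursery.spawn(worker, log, "a")
            nursery.spawn(worker, log, "b")
            nursery.spawn(failing)
    except ValueError as err:
        log.append("caught " + str(err))
    return log
```

The property that matters is that leaving the `with` block means every child has finished, and an error in one child reaches the caller's ordinary error handling instead of being lost.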

BTW: I think even Kotlin now has support for something like that.
BTW: The repo is cztomsik/nio.zig on GitHub, but there’s not much to see; it’s a very bare PoC of coroutines.


I also want to add a question: how will the supporting parts, like mutexes for example, be passed around? Will they be just an empty struct full of no-op functions for some implementations?
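To picture what the question is asking about (this is not how `std.Io` actually answers it), here is a small Python sketch in which the lock comes from the injected implementation, so a single-threaded one can hand out a no-op. `SingleThreadedIo`, `ThreadedIo`, `mutex` and `bump` are hypothetical names.

```python
import threading
from contextlib import nullcontext


class SingleThreadedIo:
    """Hypothetical: no other thread exists, so locking can be a no-op."""

    def mutex(self):
        return nullcontext()


class ThreadedIo:
    """Hypothetical: tasks may run on several threads, so use a real lock."""

    def mutex(self):
        return threading.Lock()


def bump(mutex, counter):
    """Shared code: always 'locks', whatever the lock turns out to be."""
    with mutex:
        counter["n"] += 1
```

The shared code is written once, and the cost of synchronization is only paid when the chosen implementation needs it.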

I found the EventLoop implementation and created a perma-link for everyone who is interested: zig/lib/std/Io/EventLoop.zig at 570b5cac4a79ad96c529d62b9c48b72cf8ae13d8 · ziglang/zig · GitHub

And here should be the ThreadPool one: zig/lib/std/Thread/Pool.zig at 570b5cac4a79ad96c529d62b9c48b72cf8ae13d8 · ziglang/zig · GitHub
