Libucw fastbufs are similar to the new Zig IO interface but 30 years older

The new IO interface design is similar to libucw fastbufs. The original fastbuf implementation is from 1997!

From the documentation:

Generally speaking, a fastbuf consists of a buffer and a set of callbacks. All front-end functions operate on the buffer and if the buffer becomes empty or fills up, they ask the corresponding callback to handle the situation. Back-ends then differ just in the definition of the callbacks.
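To make the "buffer plus callbacks" idea concrete, here is a minimal C sketch of the read side. All names (`sketch_buf`, `sb_getc`, `chunk_source`, and so on) are hypothetical and chosen for illustration; they do not reproduce libucw's actual declarations. The front-end function only touches the buffer, and the back-end is nothing more than a refill callback.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this is not libucw's actual API. */
struct sketch_buf {
    unsigned char *pos;   /* next unread byte in the buffer */
    unsigned char *end;   /* one past the last valid byte */
    /* Back-end callback: refill the buffer, return 0 at end of stream. */
    int (*refill)(struct sketch_buf *sb);
    void *backend;        /* back-end private state */
};

/* Front-end: works on the buffer, asks the callback only when it runs dry. */
static int sb_getc(struct sketch_buf *sb)
{
    if (sb->pos == sb->end && !sb->refill(sb))
        return -1;        /* end of stream */
    return *sb->pos++;
}

/* One possible back-end: hands out a string in chunks of up to 4 bytes. */
struct chunk_source {
    const char *data;
    size_t len, off;
    unsigned char chunk[4];
    int refills;          /* how many times the callback delivered data */
};

static int chunk_refill(struct sketch_buf *sb)
{
    struct chunk_source *src = sb->backend;
    size_t n = src->len - src->off;
    if (n == 0)
        return 0;
    if (n > sizeof src->chunk)
        n = sizeof src->chunk;
    memcpy(src->chunk, src->data + src->off, n);
    src->off += n;
    src->refills++;
    sb->pos = src->chunk;
    sb->end = src->chunk + n;
    return 1;
}

static void chunk_open(struct sketch_buf *sb, struct chunk_source *src,
                       const char *s)
{
    src->data = s;
    src->len = strlen(s);
    src->off = 0;
    src->refills = 0;
    sb->pos = sb->end = src->chunk;   /* start empty, first read refills */
    sb->refill = chunk_refill;
    sb->backend = src;
}
```

A different back-end (a file, a socket) would only swap out the `refill` function; `sb_getc` stays the same.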

I find it interesting to compare both implementations. For example, fastbufs allow seeking over the backing streams and reuse the same buffer for both a reader and a writer.

I also remember hearing claims that fastbufs were 10x faster than the stdio.h implementations at the time, but I can’t find a source for that now.

That’s a nice find, thanks for sharing. I’m actually trying to understand why Zig has moved to this new interface. I don’t see the connection between it and the new Io interface, or why the old Reader interface couldn’t be used.

The old interfaces did work, but the new reader/writer interfaces are more efficient due to:

  1. embedding the buffer into the interface, instead of keeping it as an implementation detail.
  2. better vtable functions that allow for more optimised implementations.

They are related to I/O in general, since they are used to do streaming I/O.

But the new Io interface is not the reason reader/writer were upgraded.

I have a talk about this coming out next week, and the motivation is in the release notes for the respective PR, but to summarize it here:

  • The old interface was generic, poisoning structs that contain them and forcing all functions to be generic as well with anytype. The new interface is concrete.
    • Bonus: the concreteness removes temptation to make APIs operate directly on networking streams, file handles, or memory buffers, giving us a more reusable body of code. For example, http.Server after the change no longer depends on std.net - it operates only on streams now.
  • The old interface passed errors through rather than defining its own set of error codes. This made errors in streams about as useful as anyerror. The new interface carefully defines precise error sets for each function with actionable meaning.
  • The new interface has the buffer in the interface, rather than as a separate “BufferedReader” / “BufferedWriter” abstraction. This is more optimizer friendly, particularly for debug mode.
  • The new interface supports high level concepts such as vectors, splatting, and direct file-to-file transfer, which can propagate through an entire graph of readers and writers, reducing syscall overhead, memory bandwidth, and CPU usage.
  • The new interface has “peek” functionality - a buffer awareness that offers API convenience for the user as well as simplicity for the implementation.
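As an illustration of the last two points (buffer in the interface, and "peek"), here is a rough C sketch rather than Zig, so nothing in it should be read as the actual `std.Io.Reader` API. The struct layout and the names `reader_peek`, `reader_toss`, and `mem_stream` are assumptions made up for this example. The point it tries to show: once the buffer is part of the concrete interface type, peeking is written once and works for every back-end, which only has to supply one "give me more bytes" function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical concrete reader: the buffer is part of the interface. */
struct reader {
    unsigned char *buf;   /* caller-provided storage */
    size_t cap;           /* size of buf */
    size_t seek;          /* index of the next unread byte */
    size_t end;           /* index one past the last buffered byte */
    /* Back-end hook: write up to max bytes to dst, return 0 at end of stream. */
    size_t (*stream)(struct reader *r, unsigned char *dst, size_t max);
    void *ctx;            /* back-end private state */
};

/* Return a pointer to the next n bytes without consuming them,
 * or NULL if n exceeds the buffer or the stream ends first. */
static const unsigned char *reader_peek(struct reader *r, size_t n)
{
    if (n > r->cap)
        return NULL;
    if (r->end - r->seek < n) {
        /* Slide unread bytes to the front, then top up from the back-end. */
        memmove(r->buf, r->buf + r->seek, r->end - r->seek);
        r->end -= r->seek;
        r->seek = 0;
        while (r->end < n) {
            size_t got = r->stream(r, r->buf + r->end, r->cap - r->end);
            if (got == 0)
                return NULL;
            r->end += got;
        }
    }
    return r->buf + r->seek;
}

/* Consume n bytes; the caller must have peeked at least n bytes first. */
static void reader_toss(struct reader *r, size_t n)
{
    r->seek += n;
}

/* Example back-end: serves a memory region at most 3 bytes per call. */
struct mem_ctx {
    const char *data;
    size_t len, off;
};

static size_t mem_stream(struct reader *r, unsigned char *dst, size_t max)
{
    struct mem_ctx *m = r->ctx;
    size_t n = m->len - m->off;
    if (n > max)
        n = max;
    if (n > 3)
        n = 3;
    memcpy(dst, m->data + m->off, n);
    m->off += n;
    return n;
}
```

Note that `mem_stream` knows nothing about peeking; a parser can inspect a 4-byte header with `reader_peek(&r, 4)` and decide what to do before consuming anything.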

I’m not surprised to see this fastbufs thing. One of the key points in my talk is that every programming language has an equivalent of this. What’s interesting is whether or not the language manages to get the buffer into the interface. Among all the languages I looked at, C alone manages to do it - and you don’t need libucw fastbufs. Just regular libc does it. But there’s a critical failure - because the interface functions are in a separate compilation unit from the usage code, everything is opaque to the optimizer!

What I discovered is that Zig is the only one among its peers (C, C++, Go, Rust) to get this detail right.

Quite frankly, this is imo both an advantage AND a disadvantage.

It’s an advantage in the sense that you want to have a buffer somewhere in 99% of the cases (and, in my experience, if you wrap readers/writers with other readers/writers, you want it in only one of them).

It’s a disadvantage when you don’t want one in the other 1%. But I hope that statically passing in &.{} during initialisation will be enough in optimisation modes to remove the buffer handling code.
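Here is a small C sketch of what a zero-length buffer means behaviourally for a writer that carries its buffer in the interface. It is not Zig's `std.Io.Writer`; the struct and the names `writer_write`, `writer_flush`, and `sink_drain` are hypothetical. With capacity 0, every non-empty write falls through to the back-end immediately. Whether an optimizer can then delete the buffering branch entirely is a separate question that this sketch does not demonstrate.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical concrete writer with the buffer in the interface. */
struct writer {
    unsigned char *buf;   /* caller-provided storage, may be NULL if cap is 0 */
    size_t cap;           /* size of buf; 0 means unbuffered */
    size_t end;           /* number of bytes currently buffered */
    /* Back-end hook: consume len bytes. */
    void (*drain)(struct writer *w, const unsigned char *data, size_t len);
    void *ctx;            /* back-end private state */
};

static void writer_flush(struct writer *w)
{
    if (w->end) {
        w->drain(w, w->buf, w->end);
        w->end = 0;
    }
}

static void writer_write(struct writer *w, const void *data, size_t len)
{
    if (len == 0)
        return;
    if (len <= w->cap - w->end) {      /* fast path: fits in the buffer */
        memcpy(w->buf + w->end, data, len);
        w->end += len;
        return;
    }
    writer_flush(w);
    if (len >= w->cap) {               /* too big to buffer, or cap is 0 */
        w->drain(w, data, len);
        return;
    }
    memcpy(w->buf, data, len);
    w->end = len;
}

/* Example back-end: appends to a fixed array and counts drain calls. */
struct sink {
    char out[64];
    size_t len;
    int drains;
};

static void sink_drain(struct writer *w, const unsigned char *data, size_t len)
{
    struct sink *s = w->ctx;
    if (len > sizeof s->out - s->len)  /* clamp instead of overflowing */
        len = sizeof s->out - s->len;
    memcpy(s->out + s->len, data, len);
    s->len += len;
    s->drains++;
}
```

Writing three 2-byte pieces through an 8-byte buffer reaches the sink once (on flush); the same three writes through a zero-capacity writer reach it three times, with identical output.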

This is pretty awesome tbh.

Wow, I just discovered {f} formatting that calls a format method, awesome 😀

I’m pretty sure that is a new feature.

I checked, and this was already available in 0.14 at least; it is actually described in the doc comment of the format function.

Edit: even found it in 0.8

well color me pleasantly surprised

The format function is an old feature and used to be called automatically/implicitly. The {f} format option is a new feature: it makes it explicit within the format string whether a format function should be called:

std.Io.Writer.printValue

if (!is_any and std.meta.hasMethod(T, "format") and fmt.len == 0) {
    // after 0.15.0 is tagged, delete this compile error and its condition
    @compileError("ambiguous format string; specify {f} to call format method, or {any} to skip it");
}

The format function also has different functionality from before so I think making it explicit with this compile error helps people with updating their code to the new way things work.
