Why does the buffer increase automatically with std.Io.Writer?

Hello,
This is my first forum post. Sorry if I’m missing some information; I will try to provide it if necessary. I started out with the new Zig version, 0.15.1, and created a simple hello-world application that prints to the console. However, I noticed that when I create a buffer of type [1024]u8, pass that buffer to the writer, and retrieve the std.Io.Writer interface, I can still print more than 1024 characters. I have created a Git repository with additional comments and some thoughts. I hope it’s okay to attach links :slight_smile:
https://codeberg.org/Kassiopeia/zig_print_stdout_0.15.1

In the main.zig file I have a function **fn print_bigger_than_buffer_size(w: *std.Io.Writer) void** that prints the character a to stdout 2048 times. I also tried it with 1024 times, in case my math was (or still is) wrong. When I counted the characters after flushing the writer, the correct number of characters had been written to the console (I used an online character counting tool).

I tried looking through the standard library documentation for version 0.15.1 but couldn’t figure it out; the print function on std.Io.Writer didn’t give me any clues. I also looked into std.fs.File.writer, and the only thing I could find was its description: “Defaults to positional reading; falls back to streaming.” So I thought the writer might be using streaming?

Maybe someone can explain to me what is going on behind the scenes. Don’t get me wrong, I like that I can print 2048 characters even though I only created a buffer of size 1024, but I want to understand how this happens. Please keep in mind that I only started programming in Zig when version 0.15.0-dev came out, so I’m very new. Also, if someone has the time, they can check the comments I added in my source code and tell me if I got anything wrong and need to look into a topic again. Thank you~ :grinning_cat_with_smiling_eyes:

Edit: I suppose the buffer is not increased automatically, because that shouldn’t be possible, right? So something else is going on. Maybe the buffer is flushed when it reaches its maximum size (I cannot confirm this yet; I tried reading the source code but I have more questions than answers)? :thinking:

The purpose of the buffer is batching.

If you have access to a Linux machine, try this:

  1. Compile your example program with a very small buffer, and then run it with strace. This tool shows you all the syscalls it makes. In particular pay attention to the write calls that it sends to the operating system.
  2. Compile your example program with a very large buffer, and then again run it with strace, observing the difference in write calls.

Let me know how it goes!

The buffer is just a temporary storage for output to build up in before being drained by the underlying writer implementation, which in this case would be a File.Writer to stdout. Writes first go to this buffer, and are only written to stdout when there is no room left, at which point the buffer is also emptied so new input can build up in it again.

As andrew says, this reduces the number of write syscalls, at least in the case of File.Writer.
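To make that concrete, here is a minimal sketch, assuming Zig 0.15.1 and the same std.fs.File.stdout().writer(&buf) setup as in the question. The helper name printRepeated and the 16-byte buffer size are made up for illustration and are not taken from the linked repository. It pushes 2048 bytes through a 16-byte buffer and then reports the buffer's capacity on stderr, which should still be 16: the slice you pass in never grows, it only gets drained and reused.

```zig
const std = @import("std");

// Hypothetical helper (not from the original repo): writes `byte` to `w` `count` times.
fn printRepeated(w: *std.Io.Writer, byte: u8, count: usize) !void {
    var i: usize = 0;
    while (i < count) : (i += 1) {
        // Each call stages one byte in the buffer; when the buffer is full,
        // the underlying writer implementation drains it before continuing.
        try w.writeByte(byte);
    }
}

pub fn main() !void {
    // Deliberately tiny buffer so that it fills up many times.
    var buf: [16]u8 = undefined;
    var file_writer = std.fs.File.stdout().writer(&buf);
    const stdout = &file_writer.interface;

    try printRepeated(stdout, 'a', 2048);
    // Push out whatever is still sitting in the buffer.
    try stdout.flush();

    // Goes to stderr, so it does not disturb a byte count taken on stdout.
    std.debug.print("\nbuffer capacity after writing 2048 bytes: {d}\n", .{stdout.buffer.len});
}
```

Running this under strace with a few different buffer sizes should show the number of write calls going down as the buffer gets larger, while the total number of bytes written stays the same.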

From your post, I guess you are new to programming?

If not, I would recommend digging into what various language runtimes do “under the hood” for you, since knowing that is quite important for learning Zig (and it is pretty much the same everywhere, even if it comes in a different dressing from time to time).

And if yes, I don’t think that Zig (or any statically memory managed language for that matter) is a good first language for various reasons, but good luck if you decide to do so anyway.

If it’s the only output of your program, there is a really simple way of counting the characters: the --bytes option of wc.

Given the following program which just prints Hello to stdout:

const std = @import("std");

pub fn main() !void {
    var buf: [10]u8 = undefined;
    var writer = std.fs.File.stdout().writer(&buf);
    const stdout = &writer.interface;
    try stdout.writeAll("Hello");
    try stdout.flush();
}

If you pipe its output into wc --bytes, you will get 5 as output, since the program wrote 5 bytes (one for each character of the string “Hello”; there is no newline character, which would be an additional byte).

In case you don’t know how piping works, you use the | character on your console.

So given that you compile the program into a file called output, you can do it by typing in ./output | wc --bytes (the ./ says “in the current directory”).

If you haven’t compiled it yet and only have the source in a file like output.zig, you can do this with zig run output.zig | wc --bytes.

No copying from your console to an online tool needed, which will save you a lot of time long term if you start doing stuff like this.

I’m starting to see what is going on. Together with the comment from n0s4, I was able to figure out what the buffer does. The large-buffer example with strace made it especially clear: when I print some text followed by the character “C” 4096 times, the writev syscall is not called 4096 times but just 3 times (using a 2048-byte buffer).

pwritev(1, [{iov_base="A bigger [2048]u8 buffer!\nCCCCCC"..., iov_len=2048}, {iov_base="C", iov_len=1}], 2, 0) = -1 ESPIPE (Illegal seek)
writev(1, [{iov_base="A bigger [2048]u8 buffer!\nCCCCCC"..., iov_len=2048}, {iov_base="C", iov_len=1}], 2A bigger [2048]u8 buffer!
CCCC....) = 2049
writev(1, [{iov_base="CCCC..."..., iov_len=2048}, {iov_base="C", iov_len=1}], 2CCCC....) = 2049
writev(1, [{iov_base="CCC....) = 24

I updated my Git repo with the information I gathered, for anyone else who has the same question. Thank you guys :grinning_cat: I guess now I can start looking into how this works by reading the standard library source code!

Hey, I’m not new to programming. It’s just that when I learned C and C++ at school, the details didn’t really matter because we weren’t expected to think about these things. All that mattered was that the logic of the program worked. I also used to program a microcontroller in assembly a long time ago, but since that microcontroller didn’t have a display or anything else to write to, it was on the simpler side of things (getting input from switches and a potentiometer, driving LEDs, etc.; it wasn’t an Arduino). Learning Zig is more about getting a deeper look into how things work, and I really love the details that I don’t get when using PowerShell or C# at work (I’m a sysadmin, not a SWE).

And thank you for the wc command recommendation. I totally forgot that this tool exists :grin: