Why BufferedReader

I was having a look at Readers and BufferedReaders: the simplest example of opening a file and reading its contents in limited steps to print it, without potentially running out of memory in the process.
So I found

// `std` is @import("std") and `stdout` is a writer to standard output,
// both defined earlier in the book's example.
var file = try std.fs.cwd().openFile(
    "ZigExamples/file-io/lorem.txt", .{}
);
defer file.close();
// Wrap the file reader in a BufferedReader and get a reader interface to it.
var buffered = std.io.bufferedReader(file.reader());
var bufreader = buffered.reader();

// Destination buffer, zeroed so the bytes past the line we read are null when printed below.
var buffer: [1000]u8 = undefined;
@memset(buffer[0..], 0);

// Read up to the first '\n' (or EOF) into `buffer`.
_ = try bufreader.readUntilDelimiterOrEof(
    buffer[0..], '\n'
);
try stdout.print("{s}\n", .{buffer});

from Chapter 13, Filesystem and Input/Output (IO), of Introduction to Zig

And: what's the point of a buffered reader if we still need to create another buffer, and that's what we actually use?

I'm clearly missing something.

1 Like

To give you the choice of where that buffer exists (global, stack, heap) and what its lifetime is.
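For instance (reusing bufreader from the snippet above, and assuming some allocator is available), the same call works whether the destination buffer lives on the stack or on the heap:

// Stack-allocated destination buffer: gone when the enclosing scope ends.
var stack_buf: [1000]u8 = undefined;
_ = try bufreader.readUntilDelimiterOrEof(stack_buf[0..], '\n');

// Heap-allocated destination buffer: you decide its lifetime and when to free it.
const heap_buf = try allocator.alloc(u8, 1000);
defer allocator.free(heap_buf);
_ = try bufreader.readUntilDelimiterOrEof(heap_buf, '\n');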

A buffered reader can greatly improve performance in this case.
To figure out why, you need to look at how readUntilDelimiterOrEof works internally:

while (true) {
    const byte: u8 = try self.readByte();
    if (byte == delimiter) return;
    try writer.writeByte(byte); // a fixedBufferStream over the output buffer you passed into the function
}

As you can see, it reads one byte at a time, checks if it is the delimiter, and appends it to the output buffer.

Now if you use the raw file reader, each call to readByte is going to be one syscall to the OS, which is quite expensive.

With a buffered reader, it requests larger chunks from the operating system at a time and places them into its internal buffer. Then readByte is just a cheap lookup into that buffer.
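To make that concrete, here is a minimal sketch of the idea (my own simplification, not the actual std.io.BufferedReader source; it assumes std is @import("std") and a file reader underneath):

// Simplified sketch of how a buffered reader serves readByte cheaply.
const TinyBufferedReader = struct {
    unbuffered: std.fs.File.Reader, // the raw file reader underneath
    buf: [4096]u8 = undefined,      // internal buffer filled by large reads
    start: usize = 0,               // index of the next byte to hand out
    end: usize = 0,                 // number of valid bytes currently in buf

    fn readByte(self: *TinyBufferedReader) !u8 {
        if (self.start == self.end) {
            // Buffer empty: one syscall refills up to 4096 bytes at once.
            self.end = try self.unbuffered.read(self.buf[0..]);
            self.start = 0;
            if (self.end == 0) return error.EndOfStream;
        }
        // Every other call is just an index and an increment, no syscall.
        const byte = self.buf[self.start];
        self.start += 1;
        return byte;
    }
};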

4 Likes

I think I'm beginning to understand.
Does the BufferedReader handle incrementally reading chunks of the file as we need more?
I don't see a free method for BufferedReader, though.
I'm still not sure why I would use BufferedReader. If I don't need to change the contents and just need to get the whole thing, let's read it into an array and skip the buffering; and if I do have to do some work on the contents, why not use some read with an allocator?

I don't know why this is such a strange concept to me.

Yes.

The buffer is inside the reader itself.

You can change its size with a comptime config parameter.
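For example (assuming a Zig version that provides std.io.bufferedReaderSize; the 64 KiB figure is arbitrary):

// A 64 KiB internal buffer instead of the default 4 KiB.
var buffered = std.io.bufferedReaderSize(64 * 1024, file.reader());
const bufreader = buffered.reader(); // used exactly like in the first example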

Yes, if you need the whole thing, you could dump everything into a buffer without a buffered reader, but then what are you going to do with it? In the example given, you're reading until a certain delimiter. If you read 1 GB of data and the delimiter is found in the first character, you just wasted a bunch of space and time reading the whole thing. On the other hand, reading one byte at a time is inefficient, as pointed out by @IntegratedQuantum. The reader handles incrementally pulling in data in an optimal way. It is more space efficient, avoids unnecessarily reading data past the point you're interested in, and avoids overloading the bus with data that may only be needed later. If you're doing I/O in parallel, multiple threads can be pulling data without a single thread hogging the entire bus.
Also, the act of processing data itself usually requires some form of tracking where you are in the data stream. The reader keeps track of that for you. It’s worth using a reader even if you’re just planning on processing a memory buffer. You can think of it as a slice, with convenience methods for subslicing.
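For example, a small sketch of a reader over plain memory using std.io.fixedBufferStream (assuming that API; it wraps a slice and tracks the position for you):

// A reader over an in-memory buffer: the stream remembers how far you've read.
var fbs = std.io.fixedBufferStream("hello\nworld\n");
const r = fbs.reader();
var line_buf: [32]u8 = undefined;
while (try r.readUntilDelimiterOrEof(&line_buf, '\n')) |line| {
    std.debug.print("line: {s}\n", .{line}); // prints "hello", then "world"
}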

The reader and writer interfaces are just Zig’s version of streams. You can get a lot more information reading about streams in C++.

The real object is the BufferedReader. The .reader() method provides an interface, which you can think of as just a fancy pointer to the real thing, equipped with nice convenience methods.
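In code terms (a rough sketch of the relationship; the type names in the comments are simplified, not the exact std.io definitions):

var buffered_state = std.io.bufferedReader(file.reader());
// buffered_state is the real BufferedReader struct; it owns the internal buffer.
const r = buffered_state.reader();
// r is a std.io.Reader wrapping a pointer to buffered_state, adding convenience
// methods such as readByte and readUntilDelimiterOrEof.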

1 Like

This is easier to understand if you have a rough estimate of how fast or slow certain operations are. I just recently watched a great video about data oriented design that summarized these times.

I can’t remember the numbers or find that video, but the gist of it is that IO (even from SSD) is slow, compared to iterating through memory.

If you read from a file byte by byte, your OS will load one page from the file and keep it in memory, so it's not in that category of bad. But since read is a system call, the CPU state needs to be saved, the kernel entered, the work done, the CPU state restored, and only then does control return to your loop. So the difference between reading buffered or not is almost always many times larger than anything you might do with the data you read. There is a whole lot of research going on just to reduce system call usage (see eBPF), so that's a real thing.

From then on, you will never read unbuffered again (unless you're reading from a terminal). But that is if you're in the C, C++, Rust or Zig world, where there is no interpreter or VM between you and all these issues. You wouldn't bother in JS, Python or Java. Well no, they all do buffered reading for you anyway.

The incomplete list of expensive operations from bad to better is:

  • Disk IO
  • System Calls
  • RAM (missed all caches)
  • L(n) cache read
  • …
  • L(0) cache
  • Float ops (and other complex instructions)
  • Integer/Logic

And independent of this kind of hardware-related performance atrocity, there are algorithmic sins, where you use bubble sort instead of something reasonable.

In most cases you don't pay attention to anything more esoteric than trying to reduce system calls, unless you're writing libraries, which typically benefit more from micro-optimizations. But you definitely do not read files byte by byte if you value your reputation and somebody is looking your way.

Ah yes, the point: once you've got a feel for these issues, you won't be surprised when you see people doing strange things with buffers, memory-mapped IO, scatter/gather IO and all kinds of async IO. Most of it is about avoiding context switches or semaphores (similar issues, but worse).

4 Likes

Program:

const std = @import("std");

pub fn main() !void {
    // Toggle this to switch between the unbuffered and buffered reader.
    const buffer = false;
    const unbuffered = std.io.getStdIn().reader();
    var buffered_state = std.io.bufferedReader(unbuffered);
    const buffered = buffered_state.reader();

    const r = if (buffer) buffered else unbuffered;

    // Drain stdin one byte at a time until EOF.
    while (true) {
        _ = r.readByte() catch break;
    }
}

Syscalls without BufferedReader:

read(0, "\0", 1)                        = 1
read(0, "\0", 1)                        = 1
read(0, "\0", 1)                        = 1
read(0, "\0", 1)                        = 1
read(0, "\0", 1)                        = 1
read(0, "\0", 1)                        = 1
read(0, "\0", 1)                        = 1
read(0, "\0", 1)                        = 1
read(0, "\0", 1)                        = 1
read(0, "\0", 1)                        = 1
read(0, "\0", 1)                        = 1
read(0, "\0", 1)                        = 1
read(0, "\0", 1)                        = 1
read(0, "\0", 1)                        = 1
read(0, "\0", 1)                        = 1
read(0, "\0", 1)                        = 1
[...]
read(0, "\0", 1)                        = 1
read(0, "\0", 1)                        = 1
read(0, "\0", 1)                        = 1
read(0, "", 1)                          = 0

Syscalls with BufferedReader:

read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
read(0, "", 4096)                       = 0
3 Likes

The part about the syscalls I get; what throws me off is:

  1. That when using a BufferedReader we still need to create another buffer, and that's what we pass to the read functions. That seems redundant since, at least as I imagine it, it could just return a slice into the BufferedReader's own buffer.

  2. There is no way to manage memory with BufferedReader. Let's say I have, as mentioned earlier, a 1 GB file. With BufferedReader it reads a chunk, I do whatever I need with that text, and now I need another chunk. Either BufferedReader overwrites the chunk to fetch another one (perfect for the use cases where I don't need the old chunk anymore because I've done everything I needed with it, and that way the max memory usage is the chunk size), OR it reads another chunk and appends it (perfect for use cases where I need to read a file incrementally but still need whatever came before), eventually reading the whole file with 1 GB of memory usage.

Both are legitimate and common use cases, but it seems to me that since I have no control over memory freeing, in the second case I would rather use a readWithAlloc (probably with an arena) instead of bufferedReader; and for the first use case it would be simpler to also skip the BufferedReader and just work with an array buffer.
(And yes, going back to the syscalls thing: in both cases read in whole chunks, then work with them.)

So obviously I'm missing something about why you would use BufferedReader.
(I've read the source in the Zig documentation and still don't get it.)

BufferedReader is a general-purpose utility; there is no goal to satisfy every possible case. Copying memory is much cheaper than a syscall.

Then there's no need for a BufferedReader; just read into your [N]u8 buffer.

BufferedReader is not about memory usage.

The second case doesn't happen, as there is no buffer growing at all.
Using BufferedReader is the same as reading into an array, but you get convenience functions with a tracked position, like readByte.

BufferedReader is a convenience utility for reading in small(er) chunks using a larger underlying buffer. The purpose is to minimize the number of underlying read calls, not just to proxy the reader for fun.

What if you need to pass your reader to other APIs and they do things like read the reader byte by byte? (like std.io.Reader.streamUntilDelimiter does)
What if you just want to read the reader byte by byte?
And in both cases you want to minimize the number of underlying read calls?
In the first case you have no control over the code, and in the second case you will end up reinventing the BufferedReader wheel.
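A small sketch of the first case (reusing the file path from the first post, and assuming a Zig version where std.io.Reader.streamUntilDelimiter exists): the API below pulls bytes one at a time, and the buffered reader is what keeps that from becoming one syscall per byte.

var file = try std.fs.cwd().openFile("ZigExamples/file-io/lorem.txt", .{});
defer file.close();

var buffered = std.io.bufferedReader(file.reader());

// streamUntilDelimiter reads the reader byte by byte internally; with the
// buffered reader in between, those per-byte reads hit the internal buffer
// instead of turning into one read() syscall each.
var line_buf: [4096]u8 = undefined;
var fbs = std.io.fixedBufferStream(&line_buf);
try buffered.reader().streamUntilDelimiter(fbs.writer(), '\n', line_buf.len);
std.debug.print("first line: {s}\n", .{fbs.getWritten()});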

3 Likes

The buffer inside the BufferedReader is a cache. It's been established that you don't want to make small reads from the OS. But what if your code only needs one byte at the moment? You want to cache a chunk and use parts of it. That's what the BufferedReader does for you.
You pass the destination memory to the reader. If you're reading something small, it reads a whole bunch for you and caches it. It gives you the one byte you asked for, but the next reads will have more data at the ready, without needing syscalls.
If it gave you a slice into its cache instead, it would cause all sorts of trouble. The next usages of the reader would overwrite what the slice points at, and you wouldn't be able to read more than the cache size at a time.

3 Likes

If you need access to arbitrary previous parts of the file then it is likely easier to just read the entire file into memory and then begin processing it; you can do that via:
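For example, a sketch using std.fs.Dir.readFileAlloc (assuming that function is available in your Zig version; the path and the 1 GiB size limit are placeholders):

const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // Read the whole file into a single heap allocation (up to the given limit).
    const data = try std.fs.cwd().readFileAlloc(
        allocator, "ZigExamples/file-io/lorem.txt", 1024 * 1024 * 1024,
    );
    defer allocator.free(data);

    // `data` is now a []u8 you can index and slice at will.
    std.debug.print("{d} bytes read\n", .{data.len});
}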


If the file is too big to read in entirely and you both need to stream over it and access individual parts later, you basically have to stream over the file using a BufferedReader and copy the parts you want to keep for later.

But ideally you would be able to avoid having such a data file in production and instead create an optimized file that only contains what is needed; that itself might involve streaming over the file and writing out a smaller file.
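A rough sketch of that kind of filtering pass (the file names, the line-oriented format and the "keep:" prefix are made up for illustration):

const std = @import("std");

pub fn main() !void {
    const src = try std.fs.cwd().openFile("big-input.txt", .{});
    defer src.close();
    const dst = try std.fs.cwd().createFile("small-output.txt", .{});
    defer dst.close();

    var buffered_in = std.io.bufferedReader(src.reader());
    var buffered_out = std.io.bufferedWriter(dst.writer());

    // Stream line by line; only the lines we care about get written out.
    var line_buf: [4096]u8 = undefined;
    while (try buffered_in.reader().readUntilDelimiterOrEof(&line_buf, '\n')) |line| {
        if (std.mem.startsWith(u8, line, "keep:")) {
            try buffered_out.writer().print("{s}\n", .{line});
        }
    }
    try buffered_out.flush();
}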


You could also use memory mapping, but that is less portable and involves its own tricky semantics.

1 Like