Hi, I know this question has been asked before here, but the answer is old now and no longer applies to the latest version of Zig, so I was wondering how to use the new Reader interface obtained from calling file.reader(buffer). I would also like to understand what the buffer is doing, and whether this is the preferred way of doing this or there are alternatives.
You want something like this:
var read_buffer: [1024]u8 = undefined;
var reader = file.reader(&read_buffer);
while (true) {
    const line = reader.interface.takeDelimiterInclusive('\n') catch |err| switch (err) {
        error.EndOfStream => break,
        error.ReadFailed => |e| return reader.err orelse e,
        else => |e| return e,
    };
    std.log.info("Read: {s}", .{line});
}
The buffer needs to be as large as the longest line you expect in the file. It’s used to accumulate data until the reader reaches the expected character, newline in this case.
Some things to be aware of:
The new Reader/Writer interfaces don’t contain a pointer to the implementation. Instead, if an implementation has state it needs to access, it does pointer arithmetic under the assumption that the interface pointer it receives is a pointer to a field of its state type.
Just be aware of that so you don’t break that assumption, or if you do get weird results, you know what to check first.
This kind of interface is called an ‘intrusive interface’, there are pros and cons.
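To make the intrusive-interface point concrete, here is a minimal sketch (the filename example.txt is just an assumption for illustration). The key rule: always work through a pointer to the interface field, never a copy of it, because the implementation recovers its state from the interface pointer's address:

```zig
const std = @import("std");

pub fn main() !void {
    var file = try std.fs.cwd().openFile("example.txt", .{});
    defer file.close();

    var buf: [1024]u8 = undefined;
    var file_reader = file.reader(&buf);

    // OK: a pointer to the interface field keeps it inside its parent
    // struct, so the implementation can find its state next to it.
    const r: *std.Io.Reader = &file_reader.interface;
    _ = r;

    // BROKEN: copying the interface by value moves it out of the
    // parent struct, so the pointer arithmetic back to the state
    // would read garbage.
    // var copy = file_reader.interface; // don't do this
}
```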
You can provide a zero-length buffer to disable buffering, e.g. &.{} (my preference), "" or &[0]u8{}. Writers with a non-zero buffer size should have flush called at whatever point you need to ensure all data is written.
Some APIs may have requirements around the buffer size, for example all the peek* and take* functions on Reader require at least a non-zero buffer length.
You should also, where applicable, have your own buffer requirements to simplify your own code.
Also think about the maximum data you need at a time when considering buffer size.
If you’re not sure, just pick a number, probably something on the order of kilo/megabytes. But ofc consider your use case.
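The flush point above is the one that bites people most often; a small sketch of the writer side (the filename out.txt is just an assumption for illustration):

```zig
const std = @import("std");

pub fn main() !void {
    var file = try std.fs.cwd().createFile("out.txt", .{});
    defer file.close();

    // Non-zero buffer: writes accumulate here until the buffer fills.
    var write_buf: [4096]u8 = undefined;
    var file_writer = file.writer(&write_buf);
    const w = &file_writer.interface;

    try w.writeAll("hello\n");
    try w.print("{d} lines written\n", .{1});

    // Without this, bytes still sitting in write_buf never reach the file.
    try w.flush();
}
```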
And it will only work properly if the last line also ends with a newline.
This is not related to buffer size, but rather to the semantics of the function.
There are takeDelimiterExclusive and takeDelimiter as alternatives.
If you use the inclusive variant, then you probably care about the delimiter actually existing.
The exclusive and plain variant will count the end of stream/file as though it were a delimiter which is probably more appropriate here.
The plain takeDelimiter has an odd return type, Error!?[]const u8, which makes it nice to use in a while loop with trivial try error propagation.
while (try reader.interface.takeDelimiter('\n')) |line| {}
To get the same behavior with the exclusive version, you would need to switch on the error.
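For comparison, a sketch of the exclusive variant with the error switch spelled out (the filename example.txt is just an assumption for illustration):

```zig
const std = @import("std");

pub fn main() !void {
    var file = try std.fs.cwd().openFile("example.txt", .{});
    defer file.close();

    var buf: [1024]u8 = undefined;
    var file_reader = file.reader(&buf);

    while (true) {
        // Exclusive: the returned slice does not include '\n', and a
        // final line without a trailing newline is still returned once.
        const line = file_reader.interface.takeDelimiterExclusive('\n') catch |err| switch (err) {
            error.EndOfStream => break, // nothing left to read
            error.ReadFailed => return file_reader.err orelse error.ReadFailed,
            else => |e| return e,
        };
        std.log.info("Read: {s}", .{line});
    }
}
```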
Not sure why it has not been mentioned yet, but you can also just allocate the line on the fly, so you don't have to recompile every time you find out you need a bigger buffer to store the line (but still enforce a maximum limit to make sure a line does not take GBs of memory). Here is a funny program that prints itself (if you are lucky enough for @src().file to work): Godbolt
const std = @import("std");
const Io = std.Io;

// 1MB line is too much?
const maximum_line_bytes = 1 * 1024 * 1024;

pub fn main() !void {
    // TODO: upgrade to 0.16 and use std.process.Init.{arena/gpa}
    const gpa = std.heap.page_allocator;
    const self_src_file = try std.fs.cwd().openFile(@src().file, .{});

    // At the source, usually the buffer size is OS-related.
    // Here, I just pick a page-sized buffer.
    var self_src_reader_buf: [4096]u8 = undefined;
    var self_src_reader = self_src_file.reader(&self_src_reader_buf);
    const r = &self_src_reader.interface;

    // A sink that allocates output buffer space.
    // If you can estimate how large each line is (say, 80 bytes),
    // you could also initCapacity here so that there is no reallocation.
    var line_buf: Io.Writer.Allocating = .init(gpa);

    // infinite loop with an index
    for (0..std.math.maxInt(usize)) |i| {
        // stream from the file reader into the line buffer writer
        _ = r.streamDelimiterLimit(&line_buf.writer, '\n', .limited(maximum_line_bytes)) catch |err| switch (err) {
            error.StreamTooLong => @panic("do something here if line too long?"),
            else => return err,
        };
        const line = line_buf.written();
        std.log.info("line {}: {s}", .{ i, line });
        line_buf.clearRetainingCapacity();

        // end of stream condition, refer to streamDelimiterLimit docs
        if (r.bufferedLen() == 0) break;
        // skip the delimiter, refer to streamDelimiterLimit docs
        r.toss(1);
    }
}
EDIT: forgot to print the last line of the file