How to read line by line using the new Io.Reader

Hello,

Yesterday I upgraded from Zig 0.14.1 to master, which as far as I can tell is basically 0.15.0.

The upgrade went smoothly, with the exception of trying to replace reader.readAllAlloc.

My old code is:

pub fn read(allocator: std.mem.Allocator, stdin: *std.Io.Reader) ![]u8 {
    if (raw) {
        // TODO: make this use the new std.Io.Reader
        const reader = std.fs.File.stdin().deprecatedReader();
        return try reader.readAllAlloc(allocator, std.math.maxInt(usize));
    } else {
        var bytes: std.Io.Writer.Allocating = .init(allocator);
        errdefer bytes.deinit();
        _ = try stdin.streamDelimiter(&bytes.writer, '\n');
        return try bytes.toOwnedSlice();
    }
}

I tried to use std.Io.Reader.readRemaining, but it just kept returning zero bytes. It may be relevant that I’m reading from a raw stdin. The reader.readAllAlloc code worked perfectly fine.

You are probably creating the stdin reader with .reader; use .readerStreaming instead.

stdio streams aren’t normal files and don’t accept certain operations, so the new reader/writer reflect that with different modes. FYI, the deprecated ones are analogous to the streaming_reading mode.

It does detect when the reader/writer is in the wrong mode, but only after you try to use it, resulting in the first read being 0; after that it will be in streaming mode and will work.

The modes are streaming, positional (default), positional_reading, and streaming_reading.

The _reading modes avoid the sendFile optimisation, which copies files from one place to another in a single syscall (as opposed to a series of read/write syscalls) if the writer implements it; if not, it changes modes and returns 0.

The positional modes use an internally tracked position for reads, whereas the streaming modes use whatever position the file resource has.

Which is to say: with positional, each reader reads from its own position (which stdio doesn’t accept); with streaming, they all share the position via the OS.
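
To make that concrete, here is a minimal sketch (my own illustration, using only calls that appear in this thread) of the wrong and right stdin setup:

const std = @import("std");

pub fn main() !void {
    var buf: [4096]u8 = undefined;

    // positional (the default from .reader) is rejected by stdin:
    // the first read returns 0 bytes before the fallback kicks in
    // var stdin_reader = std.fs.File.stdin().reader(&buf);

    // streaming shares the OS file position, which is what pipes
    // and terminals require, so reads work from the first call
    var stdin_reader = std.fs.File.stdin().readerStreaming(&buf);
    const reader = &stdin_reader.interface;

    // prove it reads: take one byte from stdin
    const first = try reader.takeByte();
    std.debug.print("first byte: {c}\n", .{first});
}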


Hm, okay, I have swapped my reader for readerStreaming, but I still get the same result. I am using allocRemaining with .unlimited.

Never mind, it works. I wrote a little test and that works, so I must be having problems somewhere else.

I have managed to make it work but with a strange requirement. My program constantly reads from the raw stdin in a loop. My function now looks like this:

/// Returns the typed characters as a slice.
/// Caller owns returned memory.
pub fn read(allocator: std.mem.Allocator, stdin: *std.Io.Reader) ![]u8 {
    var bytes: std.Io.Writer.Allocating = .init(allocator);
    errdefer bytes.deinit();
    if (raw) {
        _ = try stdin.streamRemaining(&bytes.writer);
    } else {
        _ = try stdin.streamDelimiter(&bytes.writer, '\n');
    }
    return try bytes.toOwnedSlice();
}

My loop looks something like this:

pub fn main() !void {
    const stdin_file = std.fs.File.stdin();
    defer stdin_file.close();
    var stdin_buf: [4 * 1024]u8 = undefined;
    // ...
    while (true) {
        // ...

        // why does this have to be inside the loop???
        // if it is not, i get no crash, but the reader simply stops reading any bytes
        var stdin_reader = stdin_file.readerStreaming(&stdin_buf);
        const reader = &stdin_reader.interface;

        const in = try tty.read(arena.allocator(), reader);
        _ = in;
        // ...
    }
}

Simply moving the reader interface creation outside the loop just makes the program stop receiving input. Is this correct or a bug?

That is most certainly a bug.

Where that bug is coming from, I have no idea.

Try stepping through it with a debugger; that should show you exactly what it’s doing.

Will do. I’ve never tried to debug something like this with gdb, but I’ll try.

It may take me a few days though, because I’ll get my wisdom tooth removed in a few minutes :sweat_smile:

Things I have found:

  • The very first call of my read function works. Every one after that does not.
  • This has the same issue: it reads once and then no longer blocks and no longer reads. I’d be thankful if someone sees some obvious misuse I just don’t know about yet, or could reproduce it:
pub fn main() !void {
    var alloc_impl: std.heap.DebugAllocator(.{}) = .init;
    defer _ = alloc_impl.deinit();
    const allocator = alloc_impl.allocator();

    var stdin_file: std.fs.File = .stdin();
    defer stdin_file.close();
    var stdin_buf: [4 * 1024]u8 = undefined;
    var stdin_reader = stdin_file.readerStreaming(&stdin_buf);
    const reader = &stdin_reader.interface;

    var stdout_file: std.fs.File = .stdout();
    defer stdout_file.close();
    var stdout_buf: [4 * 1024]u8 = undefined;
    var stdout_writer = stdout_file.writer(&stdout_buf);
    const writer = &stdout_writer.interface;

    while (true) {
        var bytes: std.Io.Writer.Allocating = .init(allocator);
        defer bytes.deinit();
        _ = try reader.streamDelimiter(&bytes.writer, '\n');

        try writer.print("{s}\n", .{bytes.written()});
        try writer.flush();
    }
}

const std = @import("std");

I have no idea about the new IO yet, but it seems strange that you need 3 buffers just to copy data from stdin to stdout.

I assume it is just for exploration purposes, trying to mimic the rough setup in their actual project, which is doing stuff in-between.

Even then, buffers are best on the ends of the pipeline, not in the middle. There is a nice API to manipulate the buffers in the interface, as opposed to creating another buffer that you read into.

Also worth noting that the new interfaces can detect when you’re piping data directly between files and optimise that to a file copy, bypassing buffers entirely. This is only supported on Linux at the moment, but any OS that lets you do file copies via just handles should be supported in the future.
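
For illustration, a minimal sketch of that situation (the file names are placeholders, and whether the zero-copy path is actually taken depends on the platform, as noted above):

const std = @import("std");

pub fn main() !void {
    const src_file = try std.fs.cwd().openFile("in.txt", .{});
    defer src_file.close();
    const dst_file = try std.fs.cwd().createFile("out.txt", .{});
    defer dst_file.close();

    var in_buf: [4096]u8 = undefined;
    var src = src_file.reader(&in_buf);
    var out_buf: [4096]u8 = undefined;
    var dst = dst_file.writer(&out_buf);

    // streams until the end of the source; on Linux this can become a
    // single file copy instead of a series of buffered read/write cycles
    _ = try src.interface.streamRemaining(&dst.interface);
    try dst.interface.flush();
}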


I think I might just be misunderstanding the new API in some way; how would you write that code?

Also, you were right in assuming that I’m mimicking my larger project. I wouldn’t do this any other way in a smaller project though.

Basically a simpler streamDelimiter, as it isn’t interacting with a writer and doesn’t support limiting the amount read:

pub fn main() !void {
    var in_buf: [1024]u8 = undefined;
    var in = std.fs.File.stdin().readerStreaming(&in_buf);
    var out_buf: [1024]u8 = undefined;
    var out = std.fs.File.stdout().writerStreaming(&out_buf);

    // fills the buffer with at least 1 byte, then returns the buffered bytes;
    // doesn't advance the buffer
    while (in.interface.peekGreedy(1)) |buffered| {
        // find the end of line, if there is one
        const end = std.mem.indexOfScalar(u8, buffered, '\n') orelse buffered.len;

        // unbuffer the bytes up to and including the newline, if present;
        // we need to do this because we are interacting with the underlying buffer,
        // and the implementation has no way to know how much we actually used otherwise
        defer in.interface.toss(@min(end + 1, buffered.len));

        const definitely_did_stuff = buffered[0..end];

        try out.interface.writeAll(definitely_did_stuff);
        // re-emit the newline we tossed, if there was one
        if (end < buffered.len) try out.interface.writeByte('\n');
    } else |err| {
        // peekGreedy reports end of input as an error
        if (err != error.EndOfStream) return err;
    }

    // flush when you're finished: writes buffered bytes to the implementation;
    // it's a no-op on Writer.Allocating, because that only writes to its buffer
    try out.interface.flush();
}

const std = @import("std");

Really recommend reading the std source, as it’s quite simple (at least the interfaces are) and well documented.

streamDelimiter does not consume the delimiter itself, so after the first time it will just continue to read/write 0 bytes. It should work if you consume the '\n' afterwards:

        _ = try reader.streamDelimiter(&bytes.writer, '\n');
        _ = try reader.takeByte();
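
Putting that fix into the earlier demo, a minimal sketch (the end-of-stream handling is my assumption; check the std source for streamDelimiter's exact behaviour when input ends without a delimiter):

pub fn main() !void {
    var alloc_impl: std.heap.DebugAllocator(.{}) = .init;
    defer _ = alloc_impl.deinit();
    const allocator = alloc_impl.allocator();

    var stdin_buf: [4 * 1024]u8 = undefined;
    var stdin_reader = std.fs.File.stdin().readerStreaming(&stdin_buf);
    const reader = &stdin_reader.interface;

    var stdout_buf: [4 * 1024]u8 = undefined;
    var stdout_writer = std.fs.File.stdout().writer(&stdout_buf);
    const writer = &stdout_writer.interface;

    while (true) {
        var bytes: std.Io.Writer.Allocating = .init(allocator);
        defer bytes.deinit();

        _ = reader.streamDelimiter(&bytes.writer, '\n') catch |err| switch (err) {
            error.EndOfStream => break, // no more input
            else => return err,
        };
        // consume the delimiter so the next call makes progress
        _ = reader.takeByte() catch |err| switch (err) {
            error.EndOfStream => break,
            else => return err,
        };

        try writer.print("{s}\n", .{bytes.written()});
        try writer.flush();
    }
}

const std = @import("std");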

Thank you so much, that at least fixes the issue from my little demo. Now I just gotta figure out why it breaks with readRemaining.

/// Returns the typed characters as a slice.
/// Caller owns returned memory.
pub fn read(allocator: std.mem.Allocator, stdin: *std.Io.Reader) ![]u8 {
    if (raw) {
        return stdin.adaptToOldInterface().readAllAlloc(allocator, std.math.maxInt(usize));
    } else {
        var bytes: std.Io.Writer.Allocating = .init(allocator);
        errdefer bytes.deinit();
        _ = try stdin.streamDelimiter(&bytes.writer, '\n');
        _ = try stdin.takeByte();
        return try bytes.toOwnedSlice();
    }
}

This worked in both !raw and raw. If I switch the adaptToOldInterface line out for just streamRemaining, it only reads once, unless I reinitialize the actual reader itself.