3 buffers needed for stream to stream copy?

Hello! I’m a complete noob to Zig.
I have this piece of code that works as a cat alternative:

const std = @import("std");

// 4KB buffer
const W_BUF: usize = 4096;
const R_BUF: usize = 4096;
const C_BUF: usize = 4096;

pub fn zat() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // get args
    const args = try std.process.argsAlloc(allocator);
    defer std.process.argsFree(allocator, args);

    const arglen = args.len;
    if (arglen != 2) {
        std.debug.print("Usage: {s} <filename> \n", .{args[0]});
        return;
    }

    var out_buf: [W_BUF]u8 = undefined;
    var writer = std.fs.File.stdout().writer(&out_buf);
    const stdout = &writer.interface;

    var read_buf: [R_BUF]u8 = undefined;

    const filename = args[1];
    const file = try std.fs.cwd().openFile(filename, .{});
    defer file.close();

    var fr = file.reader(&read_buf);
    const reader = &fr.interface;

    var cbuf: [C_BUF]u8 = undefined;

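    while (true) {
        // readSliceShort returns fewer bytes than requested only at end of stream.
        const rb = try reader.readSliceShort(&cbuf);
        if (rb == 0) break;
        // Write the bytes that were just copied into cbuf.
        try stdout.writeAll(cbuf[0..rb]);
    }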

    try stdout.flush();
    //std.debug.print("{} bytes written\n", .{n});
}

I wanted to know whether using three buffers is really the best way to go about writing something like this. If there’s anything else I’m missing, please feel free to point it out as well.

You can use the stream() function:

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // get args
    const args = try std.process.argsAlloc(allocator);
    defer std.process.argsFree(allocator, args);

    const arglen = args.len;
    if (arglen != 2) {
        std.debug.print("Usage: {s} <filename> \n", .{args[0]});
        return;
    }

    var out_buf: [W_BUF]u8 = undefined;
    var writer = std.fs.File.stdout().writer(&out_buf);
    const stdout = &writer.interface;

    var read_buf: [R_BUF]u8 = undefined;

    const filename = args[1];
    const file = try std.fs.cwd().openFile(filename, .{});
    defer file.close();

    var fr = file.reader(&read_buf);
    const reader = &fr.interface;

    while (true) {
        _ = reader.stream(stdout, .unlimited) catch |err| switch (err) {
            error.EndOfStream => break,
            else => return err,
        };
    }
    try stdout.flush();
}

Here are some improvements:

const std = @import("std");

// 4KB buffer
const W_BUF: usize = 4096;
const R_BUF: usize = 4096;

pub fn main() !void {
    var gpa = std.heap.DebugAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // get args
    const args = try std.process.argsAlloc(allocator);
    defer std.process.argsFree(allocator, args);

    const arglen = args.len;
    if (arglen != 2) {
        std.debug.print("Usage: {s} <filename> \n", .{args[0]});
        return;
    }

    var out_buf: [W_BUF]u8 = undefined;
    var writer = std.fs.File.stdout().writerStreaming(&out_buf);
    const stdout = &writer.interface;
    errdefer stdout.flush() catch {};

    const filename = args[1];
    const file = try std.fs.cwd().openFile(filename, .{});
    defer file.close();

    var read_buf: [R_BUF]u8 = undefined;
    var fr = file.reader(&read_buf);
    const reader = &fr.interface;

    _ = try reader.streamRemaining(stdout);

    try stdout.flush();
    //std.debug.print("{} bytes written\n", .{n});
}
  • std.heap.GeneralPurposeAllocator → std.heap.DebugAllocator
  • reader.streamRemaining instead of the loop
  • stdout: writer → writerStreaming. This starts the writer in streaming mode, which allows for better syscalls.

Note that, depending on factors such as the OS, it may be that no buffer is used at all.
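To make the second bullet concrete, here is a minimal sketch that runs the explicit stream() loop and streamRemaining side by side on in-memory streams. It assumes the Zig 0.15 std.Io interfaces (Reader.fixed, Writer.fixed, Writer.buffered); copyWithLoop and the sample data are hypothetical names used only for illustration.

```zig
const std = @import("std");

/// Hypothetical helper: copies everything from `reader` to `writer` with an
/// explicit loop, stopping when the reader reports EndOfStream.
fn copyWithLoop(reader: *std.Io.Reader, writer: *std.Io.Writer) !usize {
    var total: usize = 0;
    while (true) {
        total += reader.stream(writer, .unlimited) catch |err| switch (err) {
            error.EndOfStream => break,
            else => return err,
        };
    }
    return total;
}

pub fn main() !void {
    const data = "hello, zat\n";

    // Variant A: the manual loop.
    var out_a: [64]u8 = undefined;
    var reader_a: std.Io.Reader = .fixed(data);
    var writer_a: std.Io.Writer = .fixed(&out_a);
    const n_loop = try copyWithLoop(&reader_a, &writer_a);

    // Variant B: a single streamRemaining call.
    var out_b: [64]u8 = undefined;
    var reader_b: std.Io.Reader = .fixed(data);
    var writer_b: std.Io.Writer = .fixed(&out_b);
    const n_one = try reader_b.streamRemaining(&writer_b);

    // Both variants should copy the same bytes.
    std.debug.assert(n_loop == n_one);
    std.debug.assert(std.mem.eql(u8, writer_a.buffered(), writer_b.buffered()));
}
```

With a file reader and a stdout writer the call looks the same; only the construction of the reader and writer differs.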


Thanks, everyone! :)


Linux, macOS, FreeBSD: it depends on where you’re reading from.
Concrete (regular) files don’t need any buffers; ‘pipe’ files need the writer to have a buffer. There could be more caveats that I can’t remember right now.

On other systems: the writer needs a buffer, the reader does not.


I think this is not true on Linux if the filename arg is a pipe:

test.zig contains the code from my previous reply. The following command runs without error as is, but fails if you set W_BUF to 0:

zig run test.zig -- <(cat test.zig)

Somewhat related: I don’t really like readers/writers being able to switch between positional and streaming at runtime.


I forgot about those lol, you’re right.

I feel the same. I somewhat understand why: it makes them ‘just work’. I think a good middle ground would be to explicitly choose an auto mode that keeps the current behaviour.
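A rough sketch of what that could look like, purely for illustration: RequestedMode, ActiveMode and resolve are hypothetical names and not part of std. The idea is that an explicit positional or streaming request never changes at runtime, and only auto is allowed to fall back.

```zig
const std = @import("std");

/// Hypothetical: what the caller asks for up front.
const RequestedMode = enum { positional, streaming, auto };

/// Hypothetical: what the reader or writer actually does.
const ActiveMode = enum { positional, streaming };

/// Hypothetical helper: picks the active mode once it is known whether the
/// handle is seekable (a pipe, for example, is not).
fn resolve(requested: RequestedMode, seekable: bool) error{Unseekable}!ActiveMode {
    switch (requested) {
        // Streaming works on any handle.
        .streaming => return ActiveMode.streaming,
        // An explicit positional request fails instead of silently switching.
        .positional => {
            if (!seekable) return error.Unseekable;
            return ActiveMode.positional;
        },
        // Auto keeps the current behaviour: try positional, fall back to streaming.
        .auto => {
            if (seekable) return ActiveMode.positional;
            return ActiveMode.streaming;
        },
    }
}

pub fn main() !void {
    // A pipe-like handle with auto falls back to streaming.
    const mode = try resolve(.auto, false);
    std.debug.assert(mode == .streaming);
}
```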
