A less limited streamUntilDelimiter

This expands upon Reader Delimiters and Missing streamUntilDelimiterOrEof. More specifically:

This is, in my opinion, the biggest problem with streamUntilDelimiter, since it limits its use to text and other data that don’t use all 256 possible byte values (which rules out large binary files).

That said, I think streamUntilDelimiterOrEof should NOT be added to the standard library, since streamUntilDelimiter was made to replace all those variations in the first place.

Instead, I think that the delimiter parameter should be optional (?u8 instead of u8). This way, no new functions will be added and the changes that are made are almost non-breaking.

Consider the following:

    // Download the latest zig master
    // https://ziglang.org/download/index.json {master.x86_64-linux.tarball}
    const f = try std.fs.cwd().createFile("zig-linux-x86_64-master.tar.xz", .{});
    defer f.close();

    if (false) {
        // Method 1: trust the advertised size.
        // size = https://ziglang.org/download/index.json {master.x86_64-linux.size}
        const buffer = try alloc.alloc(u8, size);
        defer alloc.free(buffer);

        _ = try dl_req.readAll(buffer);
        try f.writeAll(buffer);
    } else {
        // Method 2: size unknown, so read byte by byte until EOF.
        // ArrayList.init does not return an error, so no `try` here.
        var buffer = std.ArrayList(u8).init(alloc);
        defer buffer.deinit();

        while (true) {
            const byte = dl_req.reader().readByte() catch |err| switch (err) {
                error.EndOfStream => break,
                else => |e| return e,
            };
            try buffer.writer().writeByte(byte);
        }

        try f.writeAll(buffer.items);
    }

The first method assumes that the download provider reports an accurate size for the file you are downloading (thankfully, ziglang.org does).

The second method assumes that the download provider does not provide an accurate size. streamUntilDelimiter provides 99% of the functionality needed to read the entire file, only to be blocked by if (byte == delimiter) return;. Thus, I must reimplement parts of it to achieve the functionality I need.

If streamUntilDelimiter were fn(std.io.Reader, anytype, ?u8, ?usize), the second method would look like:

    dl_req.reader().streamUntilDelimiter(buffer.writer(), null, null) catch |err| switch (err) {
        error.EndOfStream => {},
        else => |e| return e,
    };
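To make the proposed semantics concrete, here is a rough sketch written as a free function rather than a change to std. The name streamUntilDelimiterOpt is made up for illustration, and I am assuming a Zig version (roughly 0.11 through 0.14) where std.io.fixedBufferStream, readByte and writeByte are available. It is not the actual std implementation, just the same loop with the delimiter check made optional.

```zig
const std = @import("std");

/// Hypothetical free-function version of the proposed signature.
/// `delimiter == null` means "never stop on a byte value", so the only
/// ways out are error.EndOfStream, error.StreamTooLong, or a write error.
fn streamUntilDelimiterOpt(
    reader: anytype,
    writer: anytype,
    delimiter: ?u8,
    optional_max_size: ?usize,
) !void {
    var written: usize = 0;
    while (true) {
        if (optional_max_size) |max| {
            if (written >= max) return error.StreamTooLong;
        }
        // readByte returns error.EndOfStream once the reader is exhausted.
        const byte = try reader.readByte();
        if (delimiter) |d| {
            if (byte == d) return;
        }
        try writer.writeByte(byte);
        written += 1;
    }
}

pub fn main() !void {
    // Binary-looking input containing bytes a text delimiter could collide with.
    const input = [_]u8{ 0x00, '\n', 0xff, 'z' };
    var in = std.io.fixedBufferStream(&input);
    var out_buf: [16]u8 = undefined;
    var out = std.io.fixedBufferStream(&out_buf);

    streamUntilDelimiterOpt(in.reader(), out.writer(), null, null) catch |err| switch (err) {
        error.EndOfStream => {}, // expected: this is how "read everything" ends
        else => |e| return e,
    };
    std.debug.print("copied {d} bytes\n", .{out.getWritten().len});
}
```

With a null delimiter, EndOfStream stops being an exceptional condition and becomes the normal way the call finishes, which is why the caller swallows it.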

I haven’t followed this closely, but streaming until EOF seems like a common enough use case that it might even be worth keeping a variant for.

I think that’s compatible with what @cactusbento is proposing.

To @cactusbento’s point, you could just compose the two, like a function bind where the bound arguments for the 2nd and 3rd parameters are null. Then you’d have your EOF version built directly from the above example.

If I am reading this correctly, what the original post is concerned about is the number of function variants that are created and how that splits the control flow if you don’t know your sizes up front. Personally, if I were implementing this, I would prefer composition, and from a library-design perspective I tend to agree with what is being proposed.

Full disclosure, I am not an expert on the internals of these two functions. I cannot currently speak to any potential problems beyond the API and control-flow concerns.

The one issue I see with this from a design perspective is that your catch now has to handle errors that cannot occur when valid sizes are provided. So for instance, is it necessary to have an error that can be returned signaling EOF when the size of the file is already (correctly) known? Probably not, but that may not be a critical factor in the end.

So if a valid size is provided, you could say that the EOF signal is unreachable. We have ways to canonically handle this in Zig, but I tend to not encourage design patterns that make unreachable more appealing than it should be.

Just food for thought - interesting stuff. I still tend to agree at my current altitude on this subject.

I would say that it depends on how you interpret the case where io.Reader’s read function successfully reads 0 bytes. Currently, this is treated as an EOF signal, and I agree with the status quo. This will also allow the programmer (who’s using the io.Reader interface) to decide whether hitting EOF is a valid behavior of their program.
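As a small illustration of that status quo, here is a sketch that loops on read until it reports 0 bytes. The helper name countUntilEof is made up, and I am assuming a Zig version (roughly 0.11 through 0.14) where std.io.fixedBufferStream and the generic reader's read are available.

```zig
const std = @import("std");

/// Counts bytes by calling `read` until it reports 0 bytes,
/// which the reader interface treats as end of stream.
fn countUntilEof(reader: anytype) !usize {
    var chunk: [4]u8 = undefined;
    var total: usize = 0;
    while (true) {
        const n = try reader.read(&chunk);
        if (n == 0) break; // a successful 0-byte read is the EOF signal
        total += n;
    }
    return total;
}

pub fn main() !void {
    var in = std.io.fixedBufferStream("hello, zig");
    const total = try countUntilEof(in.reader());
    std.debug.print("read {d} bytes before EOF\n", .{total});
}
```

Helpers like readByte turn that 0-byte read into error.EndOfStream, and the caller then decides whether that is a failure or just the end of the data.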

If a valid size is already provided, I would encourage people to stay away from streamUntilDelimiter and use something like the first method in the OP.
EOF would still be a valid signal if the stream were to be somehow cut off.
As for unreachable, I also would prefer that the conclusion to use unreachable be unreachable.
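A sketch of what I mean by the known-size path still treating EOF as a real signal. The helper readKnownSize is hypothetical, and this assumes a Zig version (roughly 0.11 through 0.14) where the generic reader provides readNoEof, which fails with error.EndOfStream if the stream ends before the buffer is full.

```zig
const std = @import("std");

/// Reads exactly `size` bytes. A stream that ends early surfaces as
/// error.EndOfStream instead of being silently accepted.
fn readKnownSize(alloc: std.mem.Allocator, reader: anytype, size: usize) ![]u8 {
    const buffer = try alloc.alloc(u8, size);
    errdefer alloc.free(buffer);
    try reader.readNoEof(buffer);
    return buffer;
}

pub fn main() !void {
    const alloc = std.heap.page_allocator;
    var in = std.io.fixedBufferStream("only 13 bytes");

    // The advertised size (64) is larger than what the stream delivers.
    if (readKnownSize(alloc, in.reader(), 64)) |data| {
        defer alloc.free(data);
        std.debug.print("got {d} bytes\n", .{data.len});
    } else |err| switch (err) {
        error.EndOfStream => std.debug.print("stream was cut off early\n", .{}),
        else => |e| return e,
    }
}
```

Here EndOfStream is handled as a normal error rather than marked unreachable, since a truncated download can happen even when the advertised size is correct.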

Shouldn’t these be more like StreamUntilCondition, where the condition is a generic function argument (a template in C++): either a simple byte check, a substring check, or a more complex regex? Let the inliner do its work. (If it can’t inline a generic function argument, that needs to be fixed before APIs are set.)
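For the simple byte-check case, a rough sketch of what that could look like. The names streamUntilCondition and isNewline are hypothetical, nothing like this exists in std as far as I know, and I am assuming a Zig version (roughly 0.11 through 0.14) with std.io.fixedBufferStream.

```zig
const std = @import("std");

/// Hypothetical: copy bytes until `shouldStop(byte)` returns true.
/// The matching byte is consumed and not written, mirroring how
/// streamUntilDelimiter treats its delimiter.
fn streamUntilCondition(
    reader: anytype,
    writer: anytype,
    comptime shouldStop: fn (u8) bool,
) !void {
    while (true) {
        const byte = try reader.readByte();
        if (shouldStop(byte)) return;
        try writer.writeByte(byte);
    }
}

fn isNewline(byte: u8) bool {
    return byte == '\n';
}

pub fn main() !void {
    var in = std.io.fixedBufferStream("first line\nsecond line");
    var out_buf: [32]u8 = undefined;
    var out = std.io.fixedBufferStream(&out_buf);

    try streamUntilCondition(in.reader(), out.writer(), isNewline);
    std.debug.print("{s}\n", .{out.getWritten()});
}
```

Because the predicate is a comptime parameter, each call site gets its own specialization that the compiler is free to inline. A substring or regex check would need state carried between bytes, so it would not fit this stateless fn (u8) bool shape without some kind of matcher object.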