Should read return !?usize?

So I was happily coding along when Copilot hallucinated this code:

var buf: [8192]u8 = undefined;
while (try de.reader().read(&buf)) |n| {
    _ = try stdout.write(buf[0..n]);
}

This obviously doesn’t compile because read() returns !usize. But it could work if read() returns null instead of 0. Personally I feel like returning “null bytes read” doesn’t really make sense, but the resulting code does look more concise than the currently functional alternative:

var buf: [8192]u8 = undefined;
while (true) {
    const n = try de.reader().read(&buf);
    if (n == 0) break;
    _ = try stdout.write(buf[0..n]);
}

What do you guys think?

According to the read(2) Linux manual page:

On success, the number of bytes read is returned (zero indicates end of file),

That means zero is a valid return value for a read operation.

A nullable has a different meaning.

I’d say that since read returns an integer length, it should be allowed to return 0. It’s supposed to be close to the C ABI version, and 0 meaning end of stream is its convention.
But maybe a separate function that returns !?[]u8 would be a convenient wrapper that could be a “method” of the reader (for the common case of slicing buf):

fn readSlice(reader: anytype, comptime T: type, buf: []T) !?[]T {
    const nread = try reader.read(buf);
    if (nread == 0) return null;
    return buf[0..nread];
}

separate function version:

var buf: [8192]u8 = undefined;
while (try readSlice(de.reader(), u8, &buf)) |slice| {
    _ = try stdout.write(slice);
}

as a “method” of reader the type argument could be omitted, too:

var buf: [8192]u8 = undefined;
while (try de.reader().readSlice(&buf)) |slice| {
    _ = try stdout.write(slice);
}

var buf: [8192]u8 = undefined;
while (de.reader().read(&buf)) |n| {
    _ = try stdout.write(buf[0..n]);
} else |err| return err;

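I think this would also compile. With the current read() it never stops at end of stream, though: 0 is not an error, so the loop keeps running.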

In any case, I agree with the ?usize. Just because Linux uses 0 as end of stream doesn’t mean Zig has to do the same. Consider an asynchronous reader. If there’s some data ready to read, it reads it. If there isn’t, we would like it to return immediately, in which case it should return 0. But the current interface does not allow this, because 0 already means end of stream. Such a reader is forced to block and wait for at least one byte to be ready before returning.
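To make that concrete, here is a minimal sketch of what the null-as-end-of-stream convention could look like. ChunkReader and drain are hypothetical names made up for this example, not anything in std, and an empty chunk stands in for "nothing ready yet". It assumes Zig 0.11 or later for the two-argument @memcpy.

```zig
const std = @import("std");

/// Hypothetical reader for illustration only; not a std type.
/// An empty chunk models "no data ready right now".
const ChunkReader = struct {
    chunks: []const []const u8,
    next: usize = 0,

    /// null => end of stream; 0 => nothing ready yet; n => n bytes copied.
    fn read(self: *ChunkReader, buf: []u8) error{BufferTooSmall}!?usize {
        if (self.next == self.chunks.len) return null;
        const chunk = self.chunks[self.next];
        if (chunk.len > buf.len) return error.BufferTooSmall;
        self.next += 1;
        @memcpy(buf[0..chunk.len], chunk);
        return chunk.len;
    }
};

/// Drains the reader and returns the total number of bytes seen.
fn drain(r: *ChunkReader) !usize {
    var buf: [16]u8 = undefined;
    var total: usize = 0;
    while (try r.read(&buf)) |n| {
        total += n; // n may be 0 here without ending the loop
    }
    return total;
}

test "a zero-length read does not end the loop" {
    var r = ChunkReader{ .chunks = &[_][]const u8{ "ab", "", "cde" } };
    try std.testing.expectEqual(@as(usize, 5), try drain(&r));
}
```

A real non-blocking loop would yield or wait on a zero-length read instead of spinning; the sketch only shows that 0 and end of stream become separate outcomes.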

I’ve found it useful to distinguish between 0, meaning “polling succeeded and nothing was read”, and null, meaning “polling did not find anything ready to read from”.

Not sure what anyone should conclude from that, but it seemed relevant.

I don’t understand the difference. Neither case is an error:

        /// Returns the number of bytes read. It may be less than buffer.len.
        /// If the number of bytes read is 0, it means end of stream.
        /// End of stream is not an error condition.
        pub fn read(self: *Self, buffer: []u8) Error!usize {

For a low-level language, it’s better to stay close to the C/OS API.

The difference lies in the difference between reading (which blocks) and polling (which always returns).

When polling, it’s useful and important to distinguish between “this has finished reading” and “this did not happen to read”.
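A rough sketch of that distinction, keeping 0 as end of stream and using null for "not ready". PollSource and its ready flag are hypothetical, invented for this example only, and the code assumes Zig 0.11 or later.

```zig
const std = @import("std");

/// Hypothetical non-blocking source for illustration only; not a std type.
const PollSource = struct {
    data: []const u8,
    pos: usize = 0,
    ready: bool = false,

    /// null  => nothing was ready, try again later
    /// 0     => end of stream
    /// n > 0 => n bytes were copied into buf
    fn poll(self: *PollSource, buf: []u8) ?usize {
        if (!self.ready) return null;
        const n = @min(buf.len, self.data.len - self.pos);
        @memcpy(buf[0..n], self.data[self.pos..][0..n]);
        self.pos += n;
        return n;
    }
};

test "not ready, data, then end of stream" {
    var src = PollSource{ .data = "hi" };
    var buf: [8]u8 = undefined;
    try std.testing.expect(src.poll(&buf) == null); // did not happen to read
    src.ready = true;
    try std.testing.expectEqual(@as(?usize, 2), src.poll(&buf));
    try std.testing.expectEqual(@as(?usize, 0), src.poll(&buf)); // finished reading
}
```

The caller can retry later on null and stop on 0, which a plain usize return cannot express.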

Still, the question is “are we going to provide an interface similar to the C/OS one, or a new one?”

It already is a new interface. The OS doesn’t have the notion of Zig errors. I see no value in mimicking the OS.

Andrew is also refactoring I/O with the aim of producing a ‘colorblind’ interface which works the same way with blocking as with polling. I’m interested in what he comes up with, and it’s a safe bet that the part of the current Reader interface we’re discussing here will work differently.

I use null to represent ‘not ready on poll’, but that decision was downstream of readers returning 0 when there’s nothing left to read. If they returned null I’d have done something else.

I don’t think 0 is much of a ‘magic number’ here; (bytes_read == 0) is a fairly coherent expression. I grant the point that control flow would be nicer with a null, however.