Would I need to flush w here? It feels like I would, but it does work without it. Do I only need to do it if I write to a file?
I’m having a hard time wrapping my head around why I would need two buffers, one for the reader and one for the writer. As far as I can tell, reader reads from the file and dumps the output to it’s buffer and then it will write the contents of that buffer into writer’s buffer. It feels like I should be able to just use reader’s buffer to get the line. Am I misunderstanding something?
The following caused me a great deal of headache:
var fr = file.reader(&r_buf);
const r = &fr.interface;
This works as expected with streamDelimiter, however every other attempt I had crashed and burned with ReadFailed
First I tried with
var r = file.reader(&r_buf).interface;
I believe this failed, because r’s type is now Reader and r.streamDelimeter expects *Reader, so I assume no casting is made in the background, which is fine, but then I specify (&r).streamDelimiter, which has the same result, I then try
var r = &file.reader(&r_buf).interface
But for some reason, the type becomes *const Reader, which I assume is the root cause for my pain.
I guess my third question is: Why aren’t these two readers the same:
var fr = file.reader(&r_buf);
const r = &fr.interface;
var r = &file.reader(&r_buf).interface
?
The error trace I got wasn’t helpful to me at all, do I just read them wrong or is it still WIP? (it’s just ReadFailed being passed down and NotOpenForReading being the root error)
The point is that this reader implementation (in fact, all Reader implementations which I’m aware of), expect to be fields on a larger struct which the interface uses as its context for perform its operations. The Reader struct, if you jump into the source code, does not contain a context pointer.
When you write var r = file_reader.interface, you are making a copy of the interface into r. This is why your code doesn’t work: in this case r has lost its parent struct.
Unfortunately, as it currently stands it’s a bit tough to write good runtime errors messages for this kind of bug, because the source of the issue is effectively that your program has done a bit of pointer arithmetic to jump from interface expecting to land in file_reader but has failed.
Fortunately, once you fix your understanding, this kind of bug is very easy to avoid writing.
Your other issue, where you try to write a one-liner, suffers from the problem that Zig temporaries (the return value of reader(&buf) in this case) are always const.
Yes, you only need to flush when writing. When using a Writer, you are actually only writing to the buffer, which automatically flushes its contents to the underlying implementation when it gets full. When you are done writing, you will likely have some “leftover” data in the buffer that was not enough to trigger this automatic flush, so must do so manually.
You often don’t, but these are implementation-dependent details. Specifically with the File type, there is functionality to stream directly using the sendFile family of functions, though you will still need to provide a buffer as a fallback when this is unsupported by the system.
As already explained better by other, this is because of how Zig defaults to const when accessing a field or return value, but you require it to be a var in order to access its child field by non-const pointer,. You figured out the fix to assign the field first to a var, and then accessing its child field. This is just an idiosyncrasy of the language, not specific to the new Io, though it is common area where it becomes an issue.
I am unsure which specific iteration was causing the error, but I assume it was something related to a copy-by-value of the interface, which means whatever it was pointing to when it used @fieldParentPtr was junk memory, which is undefined behavior. In your case it gave an error, but it could just as easily segfault.
// Bad
const fr = file.reader(&r_buf);
var r = fr.interface;
// Good
var fr = file.reader(&r_buf);
const r = &fr.interface;
Both of these will compile, but only the latter will actually work. The former is trying to read memory from somewhere on the current stack by offsetting from where r is stored to get the parent File.Reader implementation, but that is not where it is.
Welcome to the forum. The code looks pretty good. It’s done mostly well.
Just to address the points you raised:
Any writer with a pending buffer on top of a storage should be flushed so that any data cached in the buffer can be written to the underneath storage, but the one you used .fixed is a pure buffer based writer with no underneath storage. Flushing is a no-op in this case.
You need two buffers because the r_buf is a fixed size buffer caching the read data. The data to be processed might be more than r_buf is holding. The w_buf is for processing the data. Imagine r_buf has a size of 8 bytes. It would take many rounds of reading of 8 bytes to process one line. The data pending processing needs to be held somewhere.
Others have addressed it.
There’s a minor problem that might cause problems. The .fixed writer buffer is set to a fixed size of 4096. If there’s a line longer than 4096, streamDelimiter will fail with the WriteFailed error.
There’s another pre-made writer in the std that can grow its buffer, std.Io.Writer.Allocating. For example,
var file = try std.fs.cwd().openFile(filename, .{ .mode = .read_only });
defer file.close();
var line = std.Io.Writer.Allocating.init(alloc);
defer line.deinit();
var r_buf: [4096]u8 = undefined;
var fr: std.fs.File.Reader = file.reader(&r_buf);
const r = &fr.interface;
while (true) {
_ = r.streamDelimiter(&line.writer, '\n') catch |e| {
if (e == error.EndOfStream) break else return e;
};
_ = reader.toss(1);
std.debug.print("{s}\n", .{ line.written() });
line.clearRetainingCapacity();
}
if (line.written().len > 0) { // remaining text beyond the last '\n'
std.debug.print("{s}\n", .{ line.written() });
}
you dont need 2 buffers, you dont need a writer either. see takeDelimiter which requires the readers’ buffer is large enough to contain a whole line, or whatever it is you are delimiting. If the buffer is too small you will get an error, in that aspect streamDelimiter is more robust, if the buffer is too small it can just empty it to the writer and keep searching, whereas takeDelimiter needs to keep it, as it returns a slice of the buffer, hence why it errors if it’s too small.
The cause of the error, as others have stated, was most likely your copying of the interface causing undefined behaviour.
However, for future reference, as Reader and Writer are non-generic they have a fixed set of errors that the implementation cannot add to.
Instead, implementations can store their specific error in a field which you can inspect when you encounter an error, and return a non specific interface error (I think you were confused by this). Implementations are not required to do this but many do, including File.Reader/Writer which actually store multiple to differentiate between different underlying operations which is very useful as they often fall back to other operations which could also error.
I think others have adequately covered your other questions.
Flushing the writer: .fixed is a pure buffer based writer with no underneath storage. Flushing is a no-op in this case.
Two buffers: Writer’s buffer has to be large enough for the whole string I’m trying to handle at once, while reader’s only have to fit chunks. The way I wrote it doesn’t make a lot of sense as I reached for the more advanced, flexible solution, and then didn’t take advantage of it.
var reader: In my failed attempts I’ve either copied the interface value or got a constant to it.
Error didn’t help: The error/stack wasn’t very useful due to how I made the error. (Copied a var / undefined behavior)
if you use a writer that does have an underlying destination, it doesn’t need a buffer at all, likewise readers don’t need a buffer either. You do this by passing a zero length slice which you can create with &.{}.
I should have mentioned, zig has plans to catch this and similar mistakes add safety checks for pointer casting · Issue #2414 · ziglang/zig · GitHub which will tell you what the actual problem was in your case. I was pointing out the error/stack trace would be helpful if it wasn’t for the UB, as well as how you can access the underlying error in code to act on it.
If r is a var it means that you can change where the pointer points to, which is rarely what you want to do in typical code.
If fr is a var (which it should) then the &fr.interface expression will default to type *std.io.Reader even if you save it like this: const r = &fr.interface;
The benefit of typing it explicitly, is mainly for clarity for the reader and so that people are less likely to make the mistakes of:
not taking the address of the interface-struct with & (using fr.interface and thus copying it, which isn’t allowed because it needs to stay a field within the file.reader var)
defining the fr = file.reader as a const, which would mean that the file reader can’t work properly on its local memory (because it needs to be a var to do that) and that taking the address of the interface would result in a *const std.Io.Reader (which also doesn’t allow to modify the contents of the std.fs.File.Reader.interface field of type std.Io.Reader)
So overall I disagree with calling your case good, because it makes it far more likely for people to make any one of those mistakes and there is no need to make r into a var.
It only needs to be a var if you actually want to change its value dynamically (based on runtime decisions) to point to a different reader (which most people wouldn’t want to do and could figure out themselves).
I think from a teaching standpoint it is best to use the var only with the reader implementation (which requires it) and the const with explicit type annotation, that way the code has the highest robustness against accidents like typos dropping the &.