Hi, I’m a new Zig user. I’m loving the simplicity of Zig. This is the way.
I’m building a bitcoin blockchain parser as my first Zig project, and I want to pipe a whole .dat file (128 MiB) in from the command line:
cat blk00000.dat | ./parce
or
./parce < blk00000.dat
In my normal read loop I read one block at a time and stop when all the bytes in the file have been read.
So, I know where this goes wrong:
const file_size = (try file.stat()).size;
My read loop condition is:
while (bytes_read < file_size) {
It seems that the file handle that std.io.getStdIn() returns does not give a regular number when I try to get the file size.
I have been looking through the std library code, and I was thinking it had something to do with the io mode, so I tried changing it to evented. But I was not having any luck with that. Perhaps that is not the reason, or perhaps I just did it wrong.
I was also thinking that my whole approach to detecting the end of file is unhelpful, and that there is a more straightforward way to do this in Zig. The .dat files are pretty huge binary files, but there is probably a better way to detect EOF.
I normally don’t write help posts, but with Zig being a new language and all, I am having trouble finding the info which might put me on track.
If you really want to stream bytes, then you’re going to need to rethink the whole idea of trying to get the file size. It’s a stream of bytes, and it may still be in the process of being actively populated by the time your program is loaded.
Edit: In the streaming case, you’ll likely want to catch error.EndOfStream for EOF, and catch error.BrokenPipe in case there’s some other reason to stop (for example piping into something like head)
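To make that concrete, here is a minimal sketch of such a loop, assuming the pre-async std.io API used elsewhere in this thread; the field sizes and the print call are placeholders, not the actual parser:

```zig
const std = @import("std");

pub fn main() !void {
    const reader = std.io.getStdIn().reader();
    const stdout = std.io.getStdOut().writer();

    while (true) {
        // Fixed-size reads fail with error.EndOfStream when the input runs out,
        // so a clean EOF between blocks can be caught right here.
        const magic = reader.readIntLittle(u32) catch |err| switch (err) {
            error.EndOfStream => break,
            else => return err,
        };
        const size = try reader.readIntLittle(u32);

        stdout.print("magic={x} size={d}\n", .{ magic, size }) catch |err| switch (err) {
            error.BrokenPipe => break, // e.g. our output is piped into `head`
            else => return err,
        };

        // Skip over the block payload instead of parsing it in this sketch.
        try reader.skipBytes(size, .{});
    }
}
```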
const in = std.io.getStdIn();
const in_stat = try in.stat();
if (in_stat.kind == .NamedPipe or in_stat.kind == .File) { // piped or redirected input
    try read(in);
}
So, it seems that in the first case, in is of kind .NamedPipe:
cat blk00000.dat | ./parce
And in the latter case, in is a .File:
./parce < blk00000.dat
I tried using .isTty(), but it did not trigger as true in my case. Perhaps I used it wrong.
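For what it’s worth, isTty() returning false for piped or redirected input is the expected behavior: stdin only counts as a TTY when it is attached to an interactive terminal. So it can still be used as a pipe detector, just with the condition inverted. A sketch, assuming the same std API as the snippets above:

```zig
const std = @import("std");

pub fn main() !void {
    const in = std.io.getStdIn();
    if (in.isTty()) {
        // stdin is an interactive terminal: nothing was piped or redirected in
        std.debug.print("usage: ./parce < blk00000.dat\n", .{});
        return;
    }
    // stdin is a pipe (cat blk00000.dat | ./parce)
    // or a redirected file (./parce < blk00000.dat)
    // try read(in);
}
```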
Thank you all for your replies.
Edit: To keep all things in one place: I also changed my read loop condition so that it does not depend on the file size at all. Now it just reads until no bytes are read.
while (try readBlock(&magic_bytes, &block_size, &raw_block, in_stream, allocator) > 0) {
I would turn this around over its head: always reading as from a stream of bytes (so, you just don’t inquire for any file size at all) and splitting that into chunks based on whatever condition you know about the format. For typical Unix files this would be “splitting into lines using the \n terminator”; for your case, it might be “accumulating data until you have a 1 MB block, or until EOF”.
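A sketch of that chunked approach, again assuming the same std.io API as the snippets above (the 1 MB chunk size is just the example from the paragraph, not something the block format requires):

```zig
const std = @import("std");

pub fn main() !void {
    const reader = std.io.getStdIn().reader();
    var buf: [1024 * 1024]u8 = undefined; // accumulate up to 1 MB at a time

    while (true) {
        // readAll fills the buffer completely unless EOF arrives first
        const n = try reader.readAll(&buf);
        if (n == 0) break; // nothing left: EOF
        // ... process buf[0..n] here ...
        if (n < buf.len) break; // a short read means we hit EOF
    }
}
```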