Struggling to simply read from a file?

Hi,

Allow me to start by saying that I’m by no means an inexperienced programmer; I’m coming to Zig from a C++ and Rust background. I really like the way of thinking and how explicit all the code is, but I have to admit that IO is currently tripping me up a lot. For the life of me, I cannot work out how to read the contents of a file into a buffer. I know from trying master that this is going to change even more in an upcoming Zig version, but this is with 0.15.2. The code below panics, and no matter which reader methods I try, I simply cannot get it to work :( Any help is most certainly appreciated; I’m sure I’m missing something small. My first ever experience with Zig was causing memory corruption in 10 lines by trying to return a local []const u8 from a function, but now I fully get how strings work and love them.

const std = @import("std");

pub fn main() !void {
    const file = try std.fs.cwd().openFile("test.txt", .{ .mode = .read_only });
    defer file.close();
    var buffer: [1024]u8 = undefined;
    const file_reader = file.reader(&buffer);
    var reader = file_reader.interface;
    try reader.readSliceAll(&buffer);
    std.debug.print("{s}\n", .{buffer});
}

Thanks in advance!

It would be helpful to see the panic message, otherwise I will be guessing just based on the code.

My guess is that you are hitting an assert triggered by aliasing: you are mixing up the uses of the different buffers. In the code example you provided, you use the same buffer for both the reader’s internal buffer and the destination of readSliceAll, which is likely the cause of your problem.
There are a few ways around this:

Use the internal buffer directly

You can access the reader’s internal buffer and fill it using methods on the Reader: calling fillMore will fill the buffer, and you can then access the buffered bytes from the Reader with buffered.

Read into another buffer

You can create 2 buffers and use them:

    var buffer: [1024]u8 = undefined;
    const file_reader = file.reader(&buffer);
    var reader = file_reader.interface;
    var copy_buffer: [1024]u8 = undefined;
    try reader.readSliceAll(&copy_buffer);
    std.debug.print("{s}\n", .{&copy_buffer});

Note: The buffers can be different sizes

Stream into a Writer

This is a more specialized use case, but if you want to read the whole file into a newly allocated buffer, you can use Writer.Allocating and stream from the reader using Reader.stream. You could also use a FixedBufferAllocator if you don’t want heap allocation but want to fill a bigger buffer.
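
If it helps, here is a minimal sketch of that approach for 0.15.x. I’m using streamRemaining here (which streams until end of stream) rather than calling stream in a loop; treat the exact method names as something to double-check against the std docs.

```zig
const std = @import("std");

pub fn main() !void {
    var gpa: std.heap.DebugAllocator(.{}) = .init;
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const file = try std.fs.cwd().openFile("test.txt", .{});
    defer file.close();

    var read_buffer: [1024]u8 = undefined;
    var file_reader = file.reader(&read_buffer);
    const reader = &file_reader.interface;

    // Writer.Allocating grows a heap-allocated buffer as bytes are
    // written into it.
    var allocating: std.Io.Writer.Allocating = .init(allocator);
    defer allocating.deinit();

    // Pump everything left in the reader into the allocating writer.
    _ = try reader.streamRemaining(&allocating.writer);

    const contents = try allocating.toOwnedSlice();
    defer allocator.free(contents);
    std.debug.print("{s}\n", .{contents});
}
```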

Hi,

Sorry about that, I definitely should’ve posted the panic. I tried updating my code to use .buffered, but it still doesn’t work and gives the same panic:

thread 663050 panic: switch on corrupt value
/home/quin/.zvm/0.15.2/lib/std/fs/File.zig:1344:17: 0x1141f78 in readVec (std.zig)
        switch (r.mode) {
                ^
/home/quin/.zvm/0.15.2/lib/std/Io/Reader.zig:1074:29: 0x113efbc in fillMore (std.zig)
    _ = try r.vtable.readVec(r, &bufs);
                            ^
/home/quin/test.zig:9:24: 0x113d514 in main (test.zig)
    try reader.fillMore();
                       ^
/home/quin/.zvm/0.15.2/lib/std/start.zig:627:37: 0x113de59 in posixCallMainAndExit (std.zig)
            const result = root.main() catch |err| {
                                    ^
/home/quin/.zvm/0.15.2/lib/std/start.zig:232:5: 0x113d351 in _start (std.zig)
    asm volatile (switch (native_arch) {
    ^
???:?:?: 0x0 in ??? (???)

Code:

const std = @import("std");

pub fn main() !void {
    const file = try std.fs.cwd().openFile("test.txt", .{ .mode = .read_only });
    defer file.close();
    var buffer: [1024]u8 = undefined;
    const file_reader = file.reader(&buffer);
    var reader = file_reader.interface;
    try reader.fillMore();
    std.debug.print("{s}\n", .{reader.buffered()});
}

And reader should be a pointer, not a copy.

var reader = &file_reader.interface;

Hey, your issue is with how you use the file_reader.interface field. You have to store it as a pointer, as it uses @fieldParentPtr to access the implementation for each method in the interface.

So if you change your code to this, it works:

const std = @import("std");

pub fn main() !void {
    const file = try std.fs.cwd().openFile("test.txt", .{ .mode = .read_only });
    defer file.close();
    var buffer: [1024]u8 = undefined;
    var file_reader = file.reader(&buffer);
    var reader = &file_reader.interface;
    try reader.fillMore();
    std.debug.print("{s}\n", .{reader.buffered()});
}

Notice the & in front of file_reader.interface?

file_reader should be a var because it will be mutated (for example indirectly through using its .interface).

The second line copies the Io.Reader value out of the file_reader instance. Doing this breaks its invariant of being located at the memory that is accessible via file_reader.interface: Io.Reader has a vtable, and the functions within that vtable use @fieldParentPtr to convert the pointer to file_reader.interface into a pointer to file_reader. This doesn’t work when you make a copy of .interface; you have to take a reference to the field instead.

Overall those two lines should look like this:

var file_reader = file.reader(&buffer);
const reader: *std.Io.Reader = &file_reader.interface;
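
The reason a copy breaks is worth seeing in isolation. Here is a standalone sketch of the @fieldParentPtr pattern (my own toy types, not the actual std.Io.Reader code):

```zig
const std = @import("std");

const Interface = struct {
    // vtable-style function pointer; the implementation recovers its
    // parent struct from the interface pointer.
    readFn: *const fn (*Interface) u32,

    fn read(i: *Interface) u32 {
        return i.readFn(i);
    }
};

const Impl = struct {
    value: u32,
    interface: Interface,

    fn read(i: *Interface) u32 {
        // Walks back from the field pointer to the containing Impl.
        // Only valid if `i` really points at impl.interface.
        const impl: *Impl = @fieldParentPtr("interface", i);
        return impl.value;
    }
};

pub fn main() void {
    var impl: Impl = .{ .value = 42, .interface = .{ .readFn = Impl.read } };

    // Correct: a pointer to the field inside `impl`.
    const iface = &impl.interface;
    std.debug.print("{}\n", .{iface.read()}); // 42

    // Broken: `copy` lives outside `impl`, so @fieldParentPtr would
    // compute a garbage parent pointer -- undefined behavior.
    // var copy = impl.interface;
    // _ = copy.read(); // don't do this
}
```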
Ah, I see! This was the solution that fixed all of the problems for me. The missing & was part of the problem for sure, but I also had to make file_reader a var and reader a const, and now it all works.

Thanks again! And just FYI, I don’t need the type hint here for some reason :)

The explicit type isn’t needed, but it helps keep people from making file_reader a const: if they do, they get a compile error right there. Without the explicit type you can make file_reader a const and not get the compile error directly; instead you get it later, when calling methods that expect a *Io.Reader but receive a *const Io.Reader.

So basically adding the explicit type is helpful to avoid typos or misuse, like for example forgetting the & in front of file_reader.interface.

Ah, makes sense! I see it now. Thanks a ton for all your help, I’m loving the Zig community so far :)

I’m still struggling with this a bit. In the tool I’m writing, I want to be able to read files larger than a few megabytes, and if I try to do that with a local stack buffer, I’ll run out of stack space pretty quickly. I decided to instead try readAlloc, because it seems like exactly what I want. However, it returns error.EndOfStream when you, well, get to the end of the stream. But that’s not the fatal condition my try treats it as; I want to just use the result we get when that error occurs. If I try a catch |err| switch (err) and just return the value, I get an error that the value hasn’t been declared yet, which makes sense. I also tried returning a value from a labeled block, but got the same result. What’s the correct way to do this?

const std = @import("std");

const max_file_size = 50 * 1024 * 1024;

pub fn main() !void {
    const file = try std.fs.cwd().openFile("test.txt", .{ .mode = .read_only });
    defer file.close();
    var reader_buf: [1024]u8 = undefined;
    var file_reader = file.reader(&reader_buf);
    const reader: *std.Io.Reader = &file_reader.interface;
    const result = try reader.readAlloc(std.heap.page_allocator, max_file_size);
    defer std.heap.page_allocator.free(result);
    std.debug.print("{s}\n", .{result});
}

This is another way to do it. It’s written for version 0.15.1, but after checking the 0.15.2 docs, I don’t see any reason why it shouldn’t work there as well.

pub fn main() !void {
    var buffer: [2084]u8 = undefined;
    const content = try std.fs.cwd().readFile("file.txt", &buffer);
    std.debug.print("{s}", .{content});
}

If you need an allocator, you could do it like this.

pub fn main() !void {
    var gpa: std.heap.DebugAllocator(.{}) = .init;
    defer _ = gpa.deinit();

    const allocator = gpa.allocator();

    const content = try std.fs.cwd().readFileAlloc(allocator, "file.txt", 64 * 1024);
    defer allocator.free(content);

    std.debug.print("{s}", .{content});
}
Thanks a ton, I totally overlooked those convenient little APIs in std.fs! Is there a more IO-centric solution though? I want to not only support files on disk but also reading from stdin.
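
In 0.15.x stdin is reachable as std.fs.File.stdin(), and since it’s just a File it gets the same buffered Reader as a file opened from disk. A sketch (the 50 MiB limit is an arbitrary choice of mine):

```zig
const std = @import("std");

pub fn main() !void {
    const allocator = std.heap.page_allocator;

    var buffer: [1024]u8 = undefined;
    // stdin is just a File, so the same Reader machinery applies.
    var stdin_reader = std.fs.File.stdin().reader(&buffer);
    const reader = &stdin_reader.interface;

    // Read everything piped in, up to the limit.
    const contents = try reader.allocRemaining(allocator, .limited(50 * 1024 * 1024));
    defer allocator.free(contents);

    std.debug.print("read {d} bytes\n", .{contents.len});
}
```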

Also, don’t forget that @embedFile() exists

Useful for ingesting file contents that are static (like AoC exercises), so no fiddling with IO needed
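
For example (the path is resolved relative to the source file, and the contents become a compile-time constant; input.txt is a hypothetical file here):

```zig
const std = @import("std");

// The file is baked into the binary at compile time;
// `data` is a *const [N:0]u8, coercible to []const u8.
const data = @embedFile("input.txt");

pub fn main() void {
    std.debug.print("{s}\n", .{data});
}
```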

Yes, the value is not declared in the error branch because the function returned an error; the result is either an error or a value, never both.
But I found the problem with the EndOfStream, and a possible solution. It looks like you are almost there. Btw, thank you for asking, because I struggle with this whole IO thing as well.

When you look at the doc of readAlloc, you can see that it is a shorthand for readSliceAll.

pub fn readSliceAll(r: *Reader, buffer: []u8) Error!void {
    const n = try readSliceShort(r, buffer);
    if (n != buffer.len) return error.EndOfStream;
}

The documentation also states the following:

If the provided buffer cannot be filled completely, error.EndOfStream is returned instead.

As I understand it: the function reads the file chunk by chunk and at some point reaches the end of the file, but it does not stop there; it keeps wanting to read another chunk until the allocated buffer is full.

The problem is that too much memory is requested: readAlloc tries to read exactly max_file_size bytes, so the length you pass must match the exact file size.

I got it with:

const stat = try file.stat();
const size: usize = @intCast(stat.size);

This worked for me.
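
Put together, the stat-based approach looks roughly like this (a sketch; note that stat.size can be wrong for special files, such as those under /proc):

```zig
const std = @import("std");

pub fn main() !void {
    const allocator = std.heap.page_allocator;

    const file = try std.fs.cwd().openFile("test.txt", .{});
    defer file.close();

    var reader_buf: [1024]u8 = undefined;
    var file_reader = file.reader(&reader_buf);
    const reader = &file_reader.interface;

    // Ask the OS for the exact size, so readAlloc reads exactly that
    // many bytes and never hits error.EndOfStream.
    const stat = try file.stat();
    const size: usize = @intCast(stat.size);

    const contents = try reader.readAlloc(allocator, size);
    defer allocator.free(contents);

    std.debug.print("{s}\n", .{contents});
}
```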

I also tried it with allocRemaining, which seems to be more flexible and maybe what you want with max_file_size:

Transfers all bytes from the current position to the end of the stream, up to limit, returning them as a caller-owned allocated slice.

If limit would be exceeded, error.StreamTooLong is returned instead.

Personally, I’d avoid assigning the interface to a variable at all, even a pointer. Just do file_reader.interface.readSliceAll(&buffer). It’s longer, but I feel it’s perfectly readable, and it prevents this error from occurring in the first place.

Wow, allocRemaining was indeed what I wanted, thank you! Although I think I’ll eventually use the stat method to be able to read files of virtually unlimited size (short of a file that takes up more memory than your system has, of course).

const std = @import("std");

const max_file_size = 50 * 1024 * 1024;

pub fn main() !void {
    const allocator = std.heap.page_allocator;
    const file = try std.fs.cwd().openFile("test.txt", .{ .mode = .read_only });
    defer file.close();
    var reader_buf: [1024]u8 = undefined;
    var file_reader = file.reader(&reader_buf);
    const reader: *std.Io.Reader = &file_reader.interface;
    const result = try reader.allocRemaining(allocator, std.Io.Limit.limited(max_file_size));
    defer allocator.free(result);
    std.debug.print("{s}\n", .{result});
}

Thanks for all your help everyone, once again I absolutely love this community and am finding it endlessly more welcoming than Rust’s.

I agree, this is the style I’ve started preferring as well. Not for allocators, because .allocator() is a function call, but .interface is just a field, and it’s very easy to forget an & when copying to a pointer and get a very confusing segfault.