(Note, this is a variation on this article, invited into docs with the release of 0.16.)
As always, reply/comment all you want, and I’ll endeavor to make changes accordingly.
Motivation, and migration
Zig 0.16 brought big changes to I/O. std.fs references are deprecated everywhere, so a basic understanding of std.Io is in order. Zig’s new I/O model brings I/O into alignment with the Allocator model: all of Zig’s “file system, networking, timers, synchronization, and pretty much everything that can block [is moved] into a new std.Io interface. All code that performs I/O will need access to an Io instance, similar to how all code that allocates memory needs access to an Allocator instance.” ref.
What used to be std.fs.cwd().openFile(path, .{}) becomes std.Io.Dir.cwd().openFile(io, path, .{}), and likewise throughout the interface: the io arg is in every function that does I/O. This allows you to choose your I/O model (synchronous, asynchronous - coroutines, threads, etc.) with just a swap of the Io you instantiate and provide to all of your I/O calls.
Most of this article is an attempt to provide basic examples and discussion around several typical file (or file-like) I/O use cases. If you’re looking for an std.Io overview, including async and concurrency, look here.
Getting an io object
First things first: there are many ways to get an io object, but two are common. The first is within a test:
test "within a test" {
    const io = std.testing.io;
    // ...
}
and, alternately, within a main. Here we’ll assume use of juicy main:
pub fn main(init: std.process.Init) !void {
    const io = init.io;
    // ...
}
I’ll take the “within a test” approach throughout the remainder of this journey…
Reading a whole file
First, let’s simply read a whole file:
var buf: [10240]u8 = undefined; // must be big enough for entire file
const io = std.testing.io;
const contents = try std.Io.Dir.readFile(std.Io.Dir.cwd(), io, "test-filename", &buf);
var tok = std.mem.tokenizeSequence(u8, contents, "\n");
while (tok.next()) |line| {
    std.debug.print("line: {s}\n", .{line});
}
This variant uses readFile() to read the entire (text) file, or as much as can fit into buf. It returns contents, but it’s important to realize that contents is just a slice of buf - no memory is magically materialized. Still, you want to use contents, not buf directly, because contents is a proper slice whose .len corresponds to the data actually read. Note that no error is returned if buf is not big enough for the file; rather, only buf.len bytes are read, to fill buf, and the remainder of the file remains unread. Thus, in real code, you should either be confident that you know the length of the file, or check the length of the result (contents.len < buf.len means the whole file fit) and deal with it accordingly.
The above requires a preallocated buffer (and, implicitly, knowledge of the file's length). A variant, readFileAlloc(), exists as well. For this, familiarity with Zig’s Allocators is helpful.
// ... alternatively ...
const allocator = std.testing.allocator;
const contents = try std.Io.Dir.readFileAlloc(std.Io.Dir.cwd(), io, "test-filename", allocator, .unlimited);
defer allocator.free(contents); // or free elsewise; caller owns readFileAlloc()'s return buffer!
// ...
Be careful to read the caveats of this implementation in the doc comments, and consider readFileAllocOptions() as well. Also note that readFileAlloc*() is implemented to handle “files” for which file size isn’t known (e.g., network streams, terminal input, etc.) (see…).
Errors
The above example just propagates all errors, with try. This is irresponsible: errors should most often be unwrapped and treated thoughtfully, especially since error.Canceled needs to be propagated for (async task) cancelation to work properly. In the examples below, we’ll see some more realistic error handling.
Reading line-by-line
The above file was treated like a ‘\n’-delimited text file, and we’ll continue assuming “text files” for a while. Rather than reading the whole file, what if we just wanted to read line-by-line (and thus reduce our memory requirement for each read operation)…
const io = std.testing.io;
if (std.Io.Dir.cwd().openFile(io, "test-filename", .{
    .mode = .read_only, // optional args, used here for clarity
    .lock = .exclusive,
})) |file| { // or catch, rather than if
    defer file.close(io);
    var buf: [1024]u8 = undefined; // must be big enough for longest line
    var reader: std.Io.File.Reader = file.reader(io, &buf);
    while (try reader.interface.takeDelimiter('\n')) |line| { // not advisable to auto-propagate; see below...
        std.debug.print("line: {s}\n", .{line});
    }
} else |err| switch (err) {
    error.FileNotFound, error.AccessDenied => {
        std.debug.print("unable to open file: {}\n", .{err});
        // loop back to try another or something
    },
    else => |e| return e, // don't continue; rather, bomb out
}
This example uses takeDelimiter() and includes some (but not enough) error handling. First, note that takeDelimiter() is a function of Io.Reader, but our reader, here, is actually an Io.File.Reader. Io.XX.Reader objects carry an interface, which must be used to access functions like takeDelimiter. One common mistake, though, involves assigning that interface improperly:
const reader = &file_reader.interface; // right way
//const reader = file_reader.interface; // BAD!! BAD!!! DON'T DO!
(The reference is necessary because the interface needs its connection to the parent reader (File.Reader, in this case), and a copy would isolate it from its parent.) To avoid this, one good pattern is to just always use reader.interface.foo() - that is, always type out the whole thing. Sometimes this is too verbose, given the context, so, if you need to create a const, make sure it’s a const reference, and not a copy. In the line-by-line reading example, above, you see the verbose reader.interface.takeDelimiter() pattern.
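To make the copy hazard concrete, here is a minimal, self-contained sketch of the mechanism as I understand it: the implementation functions behind an interface recover their parent struct from the address of the interface field (via @fieldParentPtr), so the interface only works in place. Parent and Interface below are hypothetical stand-ins, not std.Io types, and the real implementation may differ in detail.

```zig
const std = @import("std");

// Hypothetical stand-in for an Io.Reader-style interface: it holds only a
// function pointer and knows nothing about the struct that embeds it.
const Interface = struct {
    getFn: *const fn (*Interface) u32,

    fn get(self: *Interface) u32 {
        return self.getFn(self);
    }
};

// Hypothetical stand-in for a concrete reader such as File.Reader.
const Parent = struct {
    value: u32,
    interface: Interface,

    fn init(value: u32) Parent {
        return .{ .value = value, .interface = .{ .getFn = getImpl } };
    }

    fn getImpl(i: *Interface) u32 {
        // Recover the enclosing Parent from the address of its `interface` field.
        // This is only valid if `i` really points inside a Parent.
        const parent: *Parent = @alignCast(@fieldParentPtr("interface", i));
        return parent.value;
    }
};

test "interface must be used by reference" {
    var p = Parent.init(42);

    const iface = &p.interface; // right way: still points inside `p`
    try std.testing.expectEqual(@as(u32, 42), iface.get());

    // var copy = p.interface; // BAD: `copy` is not inside any Parent,
    // _ = copy.get();         // so getImpl would compute a bogus parent pointer
}
```

A copy of p.interface lives at a different address, so the pointer arithmetic in getImpl would land on unrelated memory; that is why only the reference form is safe.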
Error handling: if the openFile() fails, the code switches to handle FileNotFound and AccessDenied errors as redeemable - perhaps the user is given a chance to choose another file(name). But if any other errors are returned, the final else just propagates the error (which may rise to the top and result in a panic). However, this example neglects takeDelimiter() errors - we’ll handle those better below. Also note: the earlier example, with std.Io.Dir.readFile() never explicitly called openFile(), so did not need the defer file.close(io) that is essential here in this example.
This example also uses a stack buffer, buf - this time just 1024 bytes large; this suggests that we know that the lines of the file are less than 1024 bytes each, or else a whole line would not fit into buf when takeDelimiter() tried to read the line, and takeDelimiter() would return StreamTooLong. Note that tossBuffered() does NOT need to be called, even if a line is 1023 bytes long, because each subsequent call to takeDelimiter() will assume responsibility for that (line will be invalidated at that point, since line is just a slice reference into buf, and buf must be available for the next read). The use of an allocator, rather than the stack, is demonstrated a couple of places elsewhere in this article.
Better error handling… and a touch of async I/O
The above example relied on try to auto-propagate on errors returned from takeDelimiter(). More often, for code that has the concrete reader (or writer), such as our File.Reader, it is more appropriate to return reader.err.?. This is essential, for instance, for error.Canceled, which must be propagated for cancelation (in async/concurrent contexts) to work properly:
// ... alternatively ...
while (reader.interface.takeDelimiter('\n')) |result| if (result) |line| {
    std.debug.print("line: {s}\n", .{line});
} else break else |err| switch (err) {
    error.ReadFailed => {
        std.debug.print("read failed, discontinuing!\n", .{});
        return reader.err.?; // return the specific error; especially essential for error.Canceled
    },
    else => return err, // StreamTooLong could be handled explicitly, but `else` propagation is not illegal
}
// ...
In this case, the “compound-optional” result of takeDelimiter() could be:
- actual data, when unwrapped, or
- null (since result is an optional), or
- an error
If takeDelimiter() succeeds with data, result is unwrapped to line, and you can process this line that was read from the file. If result is null, the else break bit breaks out of the while loop. And, finally, if result is an error, it’s switched upon. That’s the work that this line does:
} else break else |err| switch (err) {
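The same control flow can be tried in isolation, without any file. The sketch below uses a hypothetical Counter type whose next() returns an error union of an optional, mirroring the shape of takeDelimiter()'s result; nothing in it comes from std.Io.

```zig
const std = @import("std");

// Hypothetical source of values: yields 1, 2, 3, then null.
// The error set exists only to give the loop an error branch to handle.
const Counter = struct {
    n: u8 = 0,

    fn next(self: *Counter) error{Broken}!?u8 {
        if (self.n == 3) return null; // "end of stream"
        self.n += 1;
        return self.n;
    }
};

test "while over an error union of an optional" {
    var c: Counter = .{};
    var sum: u32 = 0;

    // `while` unwraps the error union; the inner `if` unwraps the optional.
    // The first `else` belongs to the `if` (null => break), the second
    // `else |err|` belongs to the `while` (error => switch).
    while (c.next()) |result| if (result) |value| {
        sum += value;
    } else break else |err| switch (err) {
        error.Broken => return err,
    }

    try std.testing.expectEqual(@as(u32, 6), sum);
}
```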
A common pattern involves the top-level code, which creates the reader or writer, carefully checking reader.err or writer.err, like this, when handling error.ReadFailed or error.WriteFailed. Lower-level code, which just takes the reader.interface or writer.interface, can merely try *read(), or the like, and let that ReadFailed or WriteFailed propagate up. The creator of the concrete reader or writer knows what it is (e.g., a File.Reader), and knows if it might need to propagate .err (which might be error.Canceled, e.g.). So, for instance:
foo(&file_writer.interface) catch |err| switch (err) {
    error.WriteFailed => return file_writer.err.?,
};
… but, within the implementation of foo(), try writer.writeAll(...) might simply propagate any WriteFailed (or other error) if that code is in no position to do more than propagate.
See also “More on async”, below.
More approaches to reading…
Yet another approach is to use takeDelimiterInclusive() or takeDelimiterExclusive():
// ... alternatively ...
const rif = &reader.interface; // careful to take the address &! (const: the pointer itself is never reassigned)
while (rif.takeDelimiterInclusive('\n')) |line| {
    std.debug.print("line: {s}", .{line});
} else |err| switch (err) {
    error.ReadFailed => return reader.err.?,
    error.EndOfStream => { // process tail...
        const line = try rif.take(rif.end - rif.seek); // eek! better catch ReadFailed again!
        std.debug.print("final line: {s}\n", .{line});
    },
    error.StreamTooLong => return err, // or just else => return err,
}
// ...
Here, we don’t have an optional result, so don’t have the if-check for a null result, but assign to line directly. Instead, if the end of the stream is reached before a delimiter is reached, EndOfStream is returned. In this case, there may be “tail” data if the file’s last byte was not a \n - then, the last line of the file would be missed if not for that tail handler.
Using the “stream” pattern
Streaming is a typical approach, and often relies on allocating memory along the way:
if (std.Io.Dir.cwd().openFile(io, "test-filename", .{})) |file| {
    defer file.close(io);
    var gpa = std.heap.DebugAllocator(.{}){}; // or just use std.testing.allocator directly
    defer _ = gpa.deinit();
    const alloc = gpa.allocator();
    var line = std.Io.Writer.Allocating.init(alloc);
    defer line.deinit();
    var buf: [64]u8 = undefined; // somewhat arbitrary buffer size
    var reader: std.Io.File.Reader = file.reader(io, &buf);
    while (reader.interface.streamDelimiter(&line.writer, '\n')) |written_count| {
        _ = written_count;
        reader.interface.toss(1); // move past the delimiter
        std.debug.print("line: {s}\n", .{line.written()});
        line.clearRetainingCapacity(); // reset the buffer
    } else |err| switch (err) {
        error.ReadFailed, error.WriteFailed => return reader.err.?, // in this case, WriteFailed err detail is in reader.err
        error.EndOfStream => {
            if (line.written().len > 0) {
                std.debug.print("tail: {s}\n", .{line.written()});
            }
        },
        else => return err,
    }
} else |err| switch (err) {
    error.FileNotFound, error.AccessDenied => {
        std.debug.print("unable to open file: {}\n", .{err});
        // loop back to try another or something
    },
    else => return err, // don't continue; rather, bomb out
}
This example employs a DebugAllocator; lines in the file could be any arbitrary length as the Writer.Allocating will allocate more space as needed when streaming from the file.
This code block is the most involved, of course, but offers the advantage of handling completely unknown file sizes with completely unknown line sizes. Note the many essential defer lines to clean up (after openFile, DebugAllocator, and std.Io.Writer.Allocating.init()).
Byte-wise
The above examples read chunks of data from a file according to some delimiter (‘\n’ or end-of-file). Reading in more granular chunks is straightforward:
// ...
const byte = try reader.interface.takeByte();
std.debug.print("Byte: {}\n", .{byte});
const int = try reader.interface.takeInt(u32, .little);
std.debug.print("u32 int: {}\n", .{int});
More reading options
readSliceAll, readSliceShort, readSliceEndian, takeStruct, takeStructPointer, as well as peek*(), discard*(), toss(), *alloc() variants, and others are all worth investigation. With peek* and take* functions, in particular, you can simplify code that reads non-uniform data by taking advantage of the reader’s buffering.
Writing
A simple buffer-centric writer (for testing, e.g.) can be constructed with fixed() (fixed() exists for Reader, as well, by the way):
var buf: [1024]u8 = undefined;
var writer: std.Io.Writer = .fixed(&buf);
Note that this code, the code that creates the writer, will be responsible for calling flush() (below), and responsible for NOT propagating error.WriteFailed errors, but, instead, unwrapping those errors; if, in an async/concurrent context, error.Canceled is an error return, then it usually should be propagated in order for cancelation to work correctly. I’ll keep the following code in “simple” form for illustrative purposes, though…
Writing follows in expected ways (note the use of writer, below, not writer.interface, because writer, created above, is an instance of Io.Writer, not a specific implementation):
try writer.writeByte(byte);
try writer.writeInt(u32, int, .little);
Note that several functions, such as write(), return usize, indicating “bytes transferred”. The entire payload may not have been transferred yet, so don’t rely on the return value matching the number of bytes in a buffer sent to such functions. Other functions, such as writeAll(), do not return bytes transferred, but, rather, call drain() repeatedly until all bytes are transferred. See *drain() functions for more.
Importantly, after a block of writing functions, be sure to flush():
try writer.flush(); // useless for our .fixed() writer, but...
Note that flush() is useless (no-op) for our fixed() buffer-destined writer, but a more typical, e.g., File.Writer, would require flush() at the finish.
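As a self-contained round trip that needs no file, the sketch below writes a byte and a u32 into a fixed() writer and reads them back with a fixed() reader. It assumes Writer.buffered() is available to expose the bytes written so far (I believe it is, but verify against your Zig release); the values are arbitrary.

```zig
const std = @import("std");

test "fixed writer and reader round trip" {
    var buf: [16]u8 = undefined;
    var writer: std.Io.Writer = .fixed(&buf);

    try writer.writeByte('A');
    try writer.writeInt(u32, 0x01020304, .little);
    try writer.flush(); // no-op for a fixed writer, kept for habit's sake

    // Assumption: buffered() returns the slice of `buf` written so far.
    const written = writer.buffered();
    // Little-endian: least significant byte of the u32 comes first.
    try std.testing.expectEqualSlices(u8, "A\x04\x03\x02\x01", written);

    // Read the same bytes back through a fixed reader.
    var reader: std.Io.Reader = .fixed(written);
    try std.testing.expectEqual(@as(u8, 'A'), try reader.takeByte());
    try std.testing.expectEqual(@as(u32, 0x01020304), try reader.takeInt(u32, .little));
}
```

Because both ends are plain std.Io.Writer and std.Io.Reader values, the calls go through writer and reader directly, with no .interface hop.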
See also print() and print*() variants, sendFile() and send*() variants, and splat*() variants.
More on async
Especially significant, with cancelable I/O, are async variants, and their proper error handling. Note, from the 0.16 release notes: “Future, Group, and Batch APIs all support requesting cancelation. When cancelation is requested, the request may or may not be acknowledged. Acknowledged cancelation requests cause I/O operations to return error.Canceled…” That resource goes on to coach your decision between propagating, calling io.recancel(), or declaring unreachable via io.swapCancelProtection(). An async version of the above openFile() step might do something like this:
var open_task = io.async(Io.Dir.openFile, .{ .cwd(), io, "test-filename", .{} });
defer if (open_task.cancel(io)) |file| file.close(io) else |_| {};
// ... continue with the read operations
You can read elsewhere about how cancelation is equivalent to awaiting, in the sense that the above code does not continue until the file is open. But deferring a cancelation after creating an Io task is preferred to await() in order to ensure resource clean-up. (Note, that is, that the file may have been opened successfully just before the task was canceled, and so requires an immediate close()!)
Sources
- pedropark99’s zig book, chapter 12 (but note that, at the time of this writing, references here were still to 0.15 std.fs)
- williamw520’s I/O “basics” article (but note that, at the time of this writing, references here were still to 0.15 std.fs)
- And, of course, above all: the official Zig documentation and source code - don’t skip it, it’s very valuable.