I was wondering if there’s a write-up on working with the file system.
How does one work with directories, files - how are we supposed to do standard operations with them, what are the best practices, etc.
For example, it’s not quite clear to me ATM, how best to pass a directory path to a function which would enumerate and load certain files from it.
If I simply pass it as a string, how do I handle separators if I want my code to be cross-platform? Should I use Path (from std/Build/Cache/Path.zig) instead of a string? Maybe something from std/fs/path.zig ?
The examples I’ve seen don’t give much explanation, they’re just that - short examples. For instance paths in them are just strings, and they don’t include any separators.
I would love to see something more substantial.
std.fs.path handles paths as strings, without making any filesystem calls.
std.fs.Dir provides the file and directory operations (mkdir, delete, rename), walk, which returns an iterator over directory entries, and createFile and openFile, which return a file handle.
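For the original question (passing a directory to a function that enumerates certain files), one option is to pass a std.fs.Dir instead of a string. Below is a minimal sketch, assuming a Zig 0.12 to 0.14 era standard library, where a Dir must be opened with `.iterate = true` before it can be iterated. `countFilesWithExtension` is a hypothetical helper, not something from std.

```zig
const std = @import("std");

/// Hypothetical helper (not part of std): counts the regular files directly
/// inside `dir` whose extension equals `ext` (including the dot, e.g. ".zig").
fn countFilesWithExtension(dir: std.fs.Dir, ext: []const u8) !usize {
    var count: usize = 0;
    var it = dir.iterate();
    while (try it.next()) |entry| {
        if (entry.kind != .file) continue;
        // std.fs.path only does string work here; no filesystem call is made.
        if (std.mem.eql(u8, std.fs.path.extension(entry.name), ext)) count += 1;
    }
    return count;
}

pub fn main() !void {
    // The Dir has to be opened with `.iterate = true` to be iterable.
    var dir = try std.fs.cwd().openDir(".", .{ .iterate = true });
    defer dir.close();

    const n = try countFilesWithExtension(dir, ".zig");
    std.debug.print("{d} .zig file(s) in the current directory\n", .{n});
}
```

Since the function receives an already opened Dir, it never has to build or split path strings, so the separator question does not come up inside it.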
However, if a general post / article / in-depth explanation exists, I would still love to read it, because I have other questions too.
For example (and this is just one example), Dir is an essential struct if you want to work with files (create, open, etc.).
In order to create a Dir you always start with std.fs.cwd(), at least that’s what all the examples show.
But what if I want to use absolute paths and don’t care about current directory? std.fs.Dir.openDir() expects an existing Dir (self), so in order to use it I must have a valid Dir first, and std.fs.cwd() seems to be the only function that doesn’t need a Dir to open a Dir.
Which means you MUST always start with std.fs.cwd(). Seems strange to me.
I feel that I could benefit greatly if I understood the line of thinking behind the design of this API.
Even if you want to specify an absolute path, std.fs.cwd().openXXX (where XXX is Dir or File) accepts that path and returns the corresponding std.fs.Dir or file handle.
Consider the following case:
You call ls with an absolute path from some arbitrary directory.
It lists the files under the specified absolute path, no matter where you called it from.
Calling std.fs.cwd().openXXX with an absolute path seems to give the same result.
I’ve only checked this on macOS.
Other platforms may behave differently.
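A rough sketch of that comparison in Zig, assuming a 0.12 to 0.14 era standard library: the same directory is opened through cwd() once by a relative path and once by an absolute one. `countEntries` is a hypothetical helper, and realpathAlloc is only used here to obtain an absolute path that exists on the machine running it (it is not available on every target, WASI for example).

```zig
const std = @import("std");

/// Hypothetical helper (not part of std): opens `path` through cwd() and
/// returns the number of entries in it. `path` may be relative or absolute.
fn countEntries(path: []const u8) !usize {
    var dir = try std.fs.cwd().openDir(path, .{ .iterate = true });
    defer dir.close();

    var n: usize = 0;
    var it = dir.iterate();
    while (try it.next()) |_| n += 1;
    return n;
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;

    // Resolve "." to an absolute path so we have one that is known to exist.
    const abs = try std.fs.cwd().realpathAlloc(allocator, ".");
    defer allocator.free(abs);

    const via_relative = try countEntries(".");
    const via_absolute = try countEntries(abs);
    std.debug.print("{s}: {d} entries via absolute path, {d} via \".\"\n", .{ abs, via_absolute, via_relative });
}
```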
One random point of clarification since I’ve seen it trip people up:
The APIs in std.fs.Dir can handle both relative and absolute paths
/ works on every platform if you’re using the std.fs APIs (the lower-level Windows-specific APIs may need \, but the higher-level cross-platform APIs will take care of transforming them for you under the hood).
By leaning into using std.fs.Dir, you get:
better protection against Time Of Check, Time Of Use bugs
your code will likely just work when targeting WASI without any extra effort
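To illustrate the two points about separators and Dir, here is a small sketch, again assuming a Zig 0.12 to 0.14 era standard library. `writeNested` and the "fs-demo" directory name are made up for the example. The paths passed to the Dir functions use / throughout, and std.fs.path.join is shown for the case where a path with the native separator is wanted.

```zig
const std = @import("std");

/// Hypothetical helper (not part of std): creates nested directories and a
/// file under `dir`, writing "/" in the paths regardless of the host platform.
fn writeNested(dir: std.fs.Dir) !void {
    try dir.makePath("assets/textures");
    var file = try dir.createFile("assets/textures/readme.txt", .{});
    defer file.close();
    try file.writeAll("hello\n");
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;

    // "fs-demo" is an arbitrary scratch directory chosen for this sketch.
    var demo = try std.fs.cwd().makeOpenPath("fs-demo", .{});
    defer demo.close();
    try writeNested(demo);

    // join inserts the platform's native separator between the components.
    const native = try std.fs.path.join(allocator, &.{ "assets", "textures", "readme.txt" });
    defer allocator.free(native);

    // The file created with "/" is reachable through the natively joined path.
    try demo.access(native, .{});
    std.debug.print("created {s}\n", .{native});
}
```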
Is this guaranteed on all platforms, now and in the future, including WASM and whatnot? I haven’t seen any discussion of this point, this is why I was asking for a write-up on fs.
I’m not against using Dir, quite the opposite. I just don’t get the reasoning that makes cwd() the mandatory part of Dir initialization.
What if I want to start with some arbitrary absolute path, why make a reference to the current working directory? Once again, I would love to read about the “why”, not just “how to”.
The one platform it likely won’t work on is UEFI but I’m not sure that the std.fs API works with UEFI anyway. Every other platform that I’m aware of supports / as the path separator, and any cross-platform std.fs API not properly handling / as a path separator should be treated as a bug IMO.
Note also that fs/test.zig actually tests that both / and \ work on Windows as of this commit (on Windows, all tests that use testWithAllSupportedPathTypes are run once with / as the separator and again with \ as the separator).
If you’d like to avoid going through cwd, then you can use one of the std.fs.<something>Absolute functions. The fact that users may have reason to avoid cwd is why those functions exist, but it’s also worth noting that many of the Absolute-suffixed functions are currently nothing but wrappers around the corresponding Dir functions, so in practical terms you might not always actually avoid the cwd() call.
On POSIX systems, cwd() returns a Dir with an fd containing a special constant, AT_FDCWD. The fact that this is not a real fd can cause problems, and there are bugs if you try to use the return value of cwd() with certain APIs. How best to handle this is still up in the air.
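As a sketch of what using one of these functions can look like (assuming a Zig 0.12 to 0.14 era standard library), the hypothetical helper below picks std.fs.openDirAbsolute for absolute paths and falls back to cwd() otherwise. In the std versions I have looked at, openDirAbsolute asserts that its argument is absolute, which is why the helper checks first. Treat that as an assumption about your version rather than a guarantee.

```zig
const std = @import("std");

/// Hypothetical helper (not part of std): opens `path` without the caller
/// having to mention cwd() for absolute paths.
fn openDirAnywhere(path: []const u8) !std.fs.Dir {
    if (std.fs.path.isAbsolute(path)) {
        // Only call the Absolute variant with a path known to be absolute.
        return std.fs.openDirAbsolute(path, .{});
    }
    return std.fs.cwd().openDir(path, .{});
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;

    // Obtain some absolute path that exists on this machine.
    const abs = try std.fs.cwd().realpathAlloc(allocator, ".");
    defer allocator.free(abs);

    var dir = try openDirAnywhere(abs);
    defer dir.close();
    std.debug.print("opened {s}\n", .{abs});
}
```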
On WASI, there is no actual CWD, so by default Zig just returns the fd of the first preopen. If this is not what you want, you can customize the behavior of cwd() for WASI via a root std_options decl and providing a wasiCwd function.
This means that a cwd() call is effectively free, which makes it a nice common-denominator starting point for the Dir APIs, even if the cwd doesn’t end up being used (i.e. you’re using an absolute path). If cwd() wasn’t a cheap call (or could fail), then it seems pretty likely that cwd() calls wouldn’t be so ubiquitous.
One useful part of the mental model here is that absolute paths are relative to a particular file system. In the simplest case, /etc/passwd on my machine is not the same as /etc/passwd on your machine.
I had to model this situation in rust-analyzer. While it doesn’t do this currently, I architected the thing such that it can function in “remote-server” mode, where a powerful machine provides code analysis for source code stored on the client machines. So this created a need to model absolute paths not globally, but rather as “an absolute path on this particular machine/file system”. The obvious way to do this is to represent a path as a (FileSystem, String) pair, but threading a filesystem parameter everywhere didn’t seem like a good idea for “wishful thinking” future needs. The key realisation there was that, if you want to use an absolute path, you get that path from some other file, and you can use that existing file. So paths are modeled as (FileDescriptor, String), and the difference from the FileSystem approach is that you always have a FileDescriptor already.
Which is to say: you can think of a Dir as a file descriptor that gives you the capability of:
reaching any subfile, using a relative path
or, reaching any file in the same universe that this Dir comes from, using an absolute path.
This gets rid of ambient authority of absolute paths.
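To make the (FileDescriptor, String) idea concrete in Zig terms, here is a minimal sketch, assuming a Zig 0.12 to 0.14 era standard library. `DirPath` is a hypothetical type invented for this example, not something from std, and "fs-demo" and "notes.txt" are arbitrary names.

```zig
const std = @import("std");

/// Hypothetical type (not part of std): a path that is only meaningful
/// together with the Dir it is resolved against.
const DirPath = struct {
    root: std.fs.Dir,
    sub_path: []const u8,

    /// Opens the file that this (Dir, path) pair refers to.
    fn open(self: DirPath) !std.fs.File {
        return self.root.openFile(self.sub_path, .{});
    }
};

pub fn main() !void {
    // "fs-demo" is an arbitrary scratch directory chosen for this sketch.
    var demo = try std.fs.cwd().makeOpenPath("fs-demo", .{});
    defer demo.close();
    (try demo.createFile("notes.txt", .{})).close();

    const p = DirPath{ .root = demo, .sub_path = "notes.txt" };
    var file = try p.open();
    defer file.close();
    std.debug.print("size: {d} bytes\n", .{try file.getEndPos()});
}
```

A function that accepts a `DirPath` cannot open anything unless somebody handed it a Dir first, which is the point being made about ambient authority.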