Random file type "unknown" when iterating over sshfs mount directory

the setup is this:

  • I have a directory inside a tree that is mounted with sshfs
  • I use the following to iterate over it:
var idir = try std.fs.cwd().openIterableDir(dir_path, .{});
defer idir.close();
var dir_iter = idir.iterate();
while (try dir_iter.next()) |entry| {
    std.debug.print("dbg: {s} {any}\n", .{ entry.name, entry.kind });
    ....
    // opening and reading those files here work just fine
}

now if I run the built program once, I see all the expected results, such as

dbg: file1 fs.file.File.Kind.file
dbg: file2 fs.file.File.Kind.file
...

but if I run the program again within a short timeframe (several seconds), then I start getting unknown entry types:

dbg: file1 fs.file.File.Kind.unknown
dbg: file2 fs.file.File.Kind.unknown
...

I can still open and read those files though, no problems there, only the kind is broken

after some time entries start to show correctly again

this problem does not occur on “normal” mounts

Am I initializing the directory iterator incorrectly?
or is it a std lib bug?

Can you tell me which operating system you’re on? I’m going to guess some flavor of Linux, but there are some ports to Windows and macOS that are cluttering up my search attempts.

From the looks of this post, you’re using some version of Zig 0.11, yes?

sorry, was tired, forgot those details.

zig: 0.11.0
OS: tested on Ubuntu and NixOS, kernels 5.15.0 and 6.1.82

Okay, I got around to finding something - looks like the problem is with getdents64. See ziglang/zig issue #5123 on GitHub: "Dir.Iterator doesn't return file-type correctly from getdents64".

You can see it getting called and used here:

const rc = linux.getdents64(self.dir.fd, &self.buf, self.buf.len);
...
const linux_entry = @as(*align(1) linux.dirent64, @ptrCast(&self.buf[self.index]));
...
const entry_kind: Entry.Kind = switch (linux_entry.d_type) {
    // lots - I'm truncating them for this post
    else => .unknown,
};

It looks like this is a known issue.

thank you!

the issue seems to be pretty old. Is it even an issue, or a feature inherited from getdents64?
both would be fine, I just wish the warning from getdents64’s man page would also make it into zig’s std lib


Well, I’d definitely say it’s an issue of some sort - can we do something about it? Maybe not. I think this post probably explains why it’s not getting looked into more (from that link):

I think that this, returning Unknown, is a sane default because adding an additional call for each entry in those cases would probably impact performance quite a bit since filesystems without d_type support aren't likely storing that metadata anywhere close.

The common approach seems to be to have an explicit function call on the entry to get the file type that does the stat call if needed.

It looks like there’s some speculation about the performance for something like that (I don’t have a good reason to doubt this from here, but I can’t confirm it either). Looks like another call to stat the file is the most reliable thing to do if you really need that info. Kinda weird though - good question btw.
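For what it's worth, here is a rough sketch of that "stat only if needed" fallback, written against the Zig 0.11.0 std API used earlier in this thread. The helper name resolveKind and the "." path are made up for illustration and are not part of std. It assumes Dir.statFile is acceptable for your use case: as far as I can tell it follows symlinks, so a symlink would be reported as the kind of its target, and a dangling symlink would return an error.

```zig
const std = @import("std");

/// Hypothetical helper (not part of std): returns the kind reported by the
/// directory iterator, and only falls back to a stat call when the
/// iterator reported `.unknown`.
fn resolveKind(dir: std.fs.Dir, entry: std.fs.IterableDir.Entry) !std.fs.File.Kind {
    if (entry.kind != .unknown) return entry.kind;
    // Extra syscall per entry, paid only when the iterator gave no file type.
    // Assumption: statFile follows symlinks, so this yields the target's kind.
    const stat = try dir.statFile(entry.name);
    return stat.kind;
}

pub fn main() !void {
    // "." is a placeholder; substitute the sshfs-mounted directory.
    var idir = try std.fs.cwd().openIterableDir(".", .{});
    defer idir.close();

    var dir_iter = idir.iterate();
    while (try dir_iter.next()) |entry| {
        const kind = try resolveKind(idir.dir, entry);
        std.debug.print("dbg: {s} {any}\n", .{ entry.name, kind });
    }
}
```

On a mount where the kind is already filled in, this adds no extra calls, so the cost only shows up in the cases where .unknown comes back.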