Walk over a directory given via stdin

While writing a simple program that takes input from stdin, I noticed that I can do

my_program < folder/

and my stdin would have .kind == .directory.
Now I find this interesting, but how can I walk the directory that is passed in? It's a std.Io.File and not a std.Io.Dir, and just doing:

const dir: std.Io.Dir = .{.handle = stdin.handle};

feels highly illegal (even though it worked)
Is there any way to use this directory in a “legal” way?
Did I maybe just miss a method that would help me here?

I mean, std.Io.File and std.Io.Dir are just thin wrappers around file descriptors (on Linux; not sure how applicable this would be on Windows or other OSes), so that probably will work there just fine, and there is nothing “illegal” about it per se. If it still feels illegal to you, you could construct your Dir like this (as long as you’re targeting Linux):

const stdin_dir: std.Io.Dir = .{ .handle = std.os.linux.STDIN_FILENO };

There will be no real difference, but it might feel cleaner perhaps?


Anyway, the ability to redirect directories into stdin is news to me! Might be useful in the future.


I disagree, there is something very illegal here.
Not doing

const dir: std.Io.Dir = .{.handle = stdin.handle};

should be punishable! /s


It will be punished in the future with a compile error, assuming they don't change plans to remove the T{} syntax.


In which case const dir = std.Io.Dir{.handle = stdin.handle}; will be punished, not the original one.

I am sorry about that.
I added the const dir = later; otherwise I would have directly written const dir: std.Io.Dir = .
I too agree that what I wrote should be illegal. /s


Maybe you are looking for std.Io.Dir.walk() or std.Io.Dir.openDir()? I started writing my own copy tool, which is currently on pause. Here is the recursive logic for reading directories, in case it is helpful:

/// Starts the recursive reading process.
pub fn recurse(self: *const CopyEngine, dir_path: []const u8, anchor: []const u8) !void {
    const start_dirs = self.buffers.dirs_list_len();

    {
        var current_dir = try Io.Dir.cwd().openDir(
            self.init.io,
            dir_path,
            .{ .iterate = true },
        );
        // Scoped defer: the handle is closed exactly once, on success or
        // error, and always before recursing. (A function-level errdefer plus
        // a manual close would close it twice if a later `try` failed.)
        defer current_dir.close(self.init.io);

        var iter = current_dir.iterate();
        while (try iter.next(self.init.io)) |entry| {
            const src_path = try Io.Dir.path.join(self.init.gpa, &.{ dir_path, entry.name });
            switch (entry.kind) {
                .file => try self.buffers.file.append(self.init.gpa, src_path),
                .directory => try self.buffers.directory.append(self.init.gpa, src_path),
                .sym_link => try self.buffers.sym_link.append(self.init.gpa, src_path),

                else => {
                    std.log.debug("Skipping: {s}\nUnsupportedKind: {s}", .{
                        src_path,
                        @tagName(entry.kind),
                    });
                    self.init.gpa.free(src_path);
                },
            }
        }
    }

    // If this is the end of a branch, create the directory path
    if (start_dirs == self.buffers.dirs_list_len()) {
        try self.mkpath_strategy.exec(&self.init, dir_path, anchor);
    }

    //  Recurse into subdirectories
    while (self.buffers.dirs_list_len() > start_dirs) {
        const dir = self.buffers.directory.pop();
        std.debug.assert(dir != null);
        std.log.debug("Popping and recursing on: {s}", .{dir.?});
        defer self.init.gpa.free(dir.?);

        self.recurse(dir.?, anchor) catch |err| {
            std.log.err("{s}: {s}\n", .{ @errorName(err), dir.? });
            // Check for logical end of a branch
            if (self.buffers.dirs_list_len() == start_dirs) {
                try self.mkpath_strategy.exec(&self.init, dir_path, anchor);
            }
        };
    }
}

That's the thing. I would like to use walk(), but I have a std.Io.File and not a std.Io.Dir.
But it seems it is not as illegal as it feels. I also got the idea from somewhere else to do:

var stdinDir: std.Io.Dir = .{.handle = stdin.handle};
const dir = try stdinDir.openDir(io, ".", .{.iterate = true});

That way I might at least catch some errors before actually calling .walk().

The re-opening seems redundant to me. You’ve already got a valid descriptor. If there were a problem with it, you’d catch an error on the .stat() you presumably called on it to get the .kind. Iterating the stdin directory seems to work just fine, so I really don’t think an extra open does anything useful:

pub fn main(init: std.process.Init) !void {
    const io = init.io;
    const stdin = std.Io.File.stdin();
    const stat = try stdin.stat(io);
    std.debug.print("kind: {t}\n", .{stat.kind});

    if (stat.kind == .directory) {
        const dir: std.Io.Dir = .{ .handle = stdin.handle };
        var it = dir.iterate();
        while (try it.next(io)) |e| {
            std.debug.print("entry: {s}\n", .{e.name});
        }
    }
}

const std = @import("std");

That is of course true. I didn’t think about that.

Yes, re-opening is not needed. Just passing the handle is enough.

Now this got me thinking whether std.Io.File.stdin() should actually be moved to a std.Io.stdin() returning a tagged union (either a File or a Dir), based on whether stdin is a directory or not… it’s probably kind of a niche use case, but sounds valid enough to me, given that it’s possible to redirect a directory like that.
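To make the idea concrete, here is a rough sketch of what such a tagged union could look like in user code today. Stdin and openStdin are hypothetical names, not part of std, and the parameter type std.Io for io is an assumption; everything else reuses only the calls already shown earlier in this thread (std.Io.File.stdin(), .stat(io), .kind, and wrapping the handle in a std.Io.Dir).

```zig
const std = @import("std");

/// Hypothetical wrapper (not part of std): stdin as either a File or a Dir.
const Stdin = union(enum) {
    file: std.Io.File,
    dir: std.Io.Dir,
};

/// Hypothetical helper: stat stdin once and pick the matching wrapper.
/// Assumes `io` has type `std.Io`, as `init.io` does in the example above.
fn openStdin(io: std.Io) !Stdin {
    const stdin = std.Io.File.stdin();
    const stat = try stdin.stat(io);
    const result: Stdin = switch (stat.kind) {
        // Reuse the descriptor we already have; no second open.
        .directory => .{ .dir = .{ .handle = stdin.handle } },
        else => .{ .file = stdin },
    };
    return result;
}

pub fn main(init: std.process.Init) !void {
    const io = init.io;
    switch (try openStdin(io)) {
        .dir => |dir| {
            var it = dir.iterate();
            while (try it.next(io)) |e| {
                std.debug.print("entry: {s}\n", .{e.name});
            }
        },
        .file => std.debug.print("stdin is not a directory\n", .{}),
    }
}
```

Run as my_program < folder/ this should take the .dir branch; with a regular file or a pipe it takes the .file branch. The switch forces the caller to handle both cases, which is the main benefit over checking .kind by hand.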

I guess that's true, but Dir has some safeguards which we are bypassing when initializing the struct this way. OpenOptions contains iterate = false as a default, which got me worried until I fully understood what we are doing here: basically bypassing the whole dirOpenDirPosix():

/// This function is also used for WASI when libc is linked.
fn dirOpenDirPosix(
    userdata: ?*anyopaque,
    dir: Dir,
    sub_path: []const u8,
    options: Dir.OpenOptions,
) Dir.OpenError!Dir {
    const t: *Threaded = @ptrCast(@alignCast(userdata));
    _ = t;

    if (is_windows) {
        const sub_path_w = try sliceToPrefixedFileW(dir.handle, sub_path, .{});
        return dirOpenDirWindows(dir, sub_path_w.span(), options);
    }

    var path_buffer: [posix.PATH_MAX]u8 = undefined;
    const sub_path_posix = try pathToPosix(sub_path, &path_buffer);

    var flags: posix.O = switch (native_os) {
        .wasi => .{
            .read = true,
            .NOFOLLOW = !options.follow_symlinks,
            .DIRECTORY = true,
        },
        else => .{
            .ACCMODE = .RDONLY,
            .NOFOLLOW = !options.follow_symlinks,
            .DIRECTORY = true,
            .CLOEXEC = true,
        },
    };

    if (@hasField(posix.O, "PATH") and !options.iterate)
        flags.PATH = true;

    const mode: posix.mode_t = 0;

    const syscall: Syscall = try .start();
    while (true) {
        const rc = openat_sym(dir.handle, sub_path_posix, flags, mode);
        switch (posix.errno(rc)) {
            .SUCCESS => {
                syscall.finish();
                return .{ .handle = @intCast(rc) };
            },
            .INTR => {
                try syscall.checkCancel();
                continue;
            },
            .INVAL => return syscall.fail(error.BadPathName),
            .ACCES => return syscall.fail(error.AccessDenied),
            .LOOP => return syscall.fail(error.SymLinkLoop),
            .MFILE => return syscall.fail(error.ProcessFdQuotaExceeded),
            .NAMETOOLONG => return syscall.fail(error.NameTooLong),
            .NFILE => return syscall.fail(error.SystemFdQuotaExceeded),
            .NODEV => return syscall.fail(error.NoDevice),
            .NOENT => return syscall.fail(error.FileNotFound),
            .NOMEM => return syscall.fail(error.SystemResources),
            .NOTDIR => return syscall.fail(error.NotDir),
            .PERM => return syscall.fail(error.PermissionDenied),
            .NXIO => return syscall.fail(error.NoDevice),
            .ILSEQ => return syscall.fail(error.BadPathName),
            .FAULT => |err| return syscall.errnoBug(err),
            .BADF => |err| return syscall.errnoBug(err), // File descriptor used after closed.
            .BUSY => |err| return syscall.errnoBug(err), // O_EXCL not passed
            else => |err| return syscall.unexpectedErrno(err),
        }
    }
}