How do I match glob patterns in Zig?

Hi, I’m trying to walk through directories while respecting .gitignore.
How can I match the glob patterns found in a .gitignore file, such as .zig-cache/ and tmp-*?

If this feature isn’t in the standard library yet, do I have to learn C in order to use this feature?

Thank you very much.

If it’s helpful as a starting point, here’s how you’d recursively iterate a directory (in this case the current working directory):

const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer std.debug.assert(gpa.deinit() == .ok);
    const allocator = gpa.allocator();

    var dir = try std.fs.cwd().openDir(".", .{ .iterate = true });
    defer dir.close();

    var walker = try dir.walk(allocator);
    defer walker.deinit();

    while (try walker.next()) |entry| {
        std.debug.print("{s}\n", .{entry.path});
    }
}

Thank you for the answer. I was able to get that far, until I realized that I don’t know how to filter by glob patterns in Zig.

Do I need to hook into C in order to do that? Something like this example: Regular Expressions in Zig?

I don’t think the Zig std library has this glob filtering feature. Please correct me if I’m wrong.

Glob matching is rather simple compared to regular expressions, so there shouldn’t be a need to pull in a C library. Here’s a quick direct port of the algorithm from this post (my port is mostly untested, so you’ll definitely want to check my work):

fn match(pattern: []const u8, name: []const u8) bool {
    var pattern_i: usize = 0;
    var name_i: usize = 0;
    var next_pattern_i: usize = 0;
    var next_name_i: usize = 0;
    while (pattern_i < pattern.len or name_i < name.len) {
        if (pattern_i < pattern.len) {
            const c = pattern[pattern_i];
            switch (c) {
                '?' => { // single-character wildcard
                    if (name_i < name.len) {
                        pattern_i += 1;
                        name_i += 1;
                        continue;
                    }
                },
                '*' => { // zero-or-more-character wildcard
                    // Try to match at name_i.
                    // If that doesn't work out,
                    // restart at name_i+1 next.
                    next_pattern_i = pattern_i;
                    next_name_i = name_i + 1;
                    pattern_i += 1;
                    continue;
                },
                else => { // ordinary character
                    if (name_i < name.len and name[name_i] == c) {
                        pattern_i += 1;
                        name_i += 1;
                        continue;
                    }
                },
            }
        }
        // Mismatch. Maybe restart.
        if (next_name_i > 0 and next_name_i <= name.len) {
            pattern_i = next_pattern_i;
            name_i = next_name_i;
            continue;
        }
        return false;
    }
    // Matched all of pattern to all of name. Success.
    return true;
}
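
Since the port above is described as mostly untested, a few test blocks are a cheap way to check it. This is only a sketch: it assumes the tests are pasted into the same file as match, below the function, and run with zig test (drop the first line if that file already imports std). The test names and sample file names are made up for illustration.

```zig
const std = @import("std");
const expect = std.testing.expect;

test "literal and '*' patterns" {
    try expect(match(".zig-cache", ".zig-cache"));
    try expect(match("tmp-*", "tmp-123"));
    try expect(match("tmp-*", "tmp-")); // '*' may match zero characters
    try expect(!match("tmp-*", "tmp"));
    try expect(match("*.zig", "main.zig"));
    try expect(!match("*.zig", "main.zig.bak"));
}

test "'?' matches exactly one character" {
    try expect(match("a?c", "abc"));
    try expect(!match("a?c", "ac"));
    try expect(!match("?", ""));
}

test "edge cases" {
    try expect(match("", ""));
    try expect(match("*", ""));
    // Unlike .gitignore, '*' here also matches '/', so it is safer to
    // match against an entry's base name than against its full path.
    try expect(match("*.zig", "src/main.zig"));
}
```

Note that a .gitignore entry such as .zig-cache/ carries a trailing slash meaning "directories only"; match knows nothing about that, so the slash would have to be stripped before calling it.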

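The walker returned by dir.walk() still descends into every subdirectory, including ignored ones such as .zig-cache, so filtering on entry.path only hides their contents after the fact. If you care about that, you may want to write a custom std.fs.Dir.Walker implementation that does not recurse into subdirectories ignored by the glob.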
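
Here is one rough way that idea could look, written against the same std.fs API as the directory-walking example earlier in the thread. Everything in it is an assumption for illustration, not an existing std API: globMatch is a compact recursive stand-in so the sketch compiles on its own (it can be slow on patterns with many '*'; the iterative match above is the better choice), and isIgnored and walkFiltered are hypothetical helpers. It only handles plain name patterns plus the trailing-slash "directories only" rule; negation (!), **, comments, and patterns containing slashes are not handled.

```zig
const std = @import("std");

/// Compact recursive glob matcher supporting '*' and '?' (stand-in only).
fn globMatch(pattern: []const u8, name: []const u8) bool {
    if (pattern.len == 0) return name.len == 0;
    const rest = pattern[1..];
    if (pattern[0] == '*') {
        // Let '*' consume 0, 1, 2, ... characters of name.
        var skip: usize = 0;
        while (skip <= name.len) : (skip += 1) {
            if (globMatch(rest, name[skip..])) return true;
        }
        return false;
    }
    if (name.len == 0) return false;
    if (pattern[0] != '?' and pattern[0] != name[0]) return false;
    return globMatch(rest, name[1..]);
}

/// Returns true if any pattern matches the entry's base name.
/// A trailing '/' restricts a pattern to directories.
fn isIgnored(patterns: []const []const u8, name: []const u8, is_dir: bool) bool {
    for (patterns) |raw| {
        const dir_only = std.mem.endsWith(u8, raw, "/");
        if (dir_only and !is_dir) continue;
        const pattern = if (dir_only) raw[0 .. raw.len - 1] else raw;
        if (globMatch(pattern, name)) return true;
    }
    return false;
}

/// Prints the tree below dir, never opening ignored directories.
/// The explicit error set is needed because the function is recursive.
fn walkFiltered(dir: std.fs.Dir, patterns: []const []const u8, depth: usize) anyerror!void {
    var it = dir.iterate();
    while (try it.next()) |entry| {
        const is_dir = entry.kind == .directory;
        if (isIgnored(patterns, entry.name, is_dir)) continue;

        for (0..depth) |_| std.debug.print("  ", .{});
        std.debug.print("{s}\n", .{entry.name});

        if (is_dir) {
            var child = try dir.openDir(entry.name, .{ .iterate = true });
            defer child.close();
            try walkFiltered(child, patterns, depth + 1);
        }
    }
}

pub fn main() !void {
    // Hard-coded here; a real tool would read these lines from .gitignore.
    const patterns = [_][]const u8{ ".zig-cache/", "tmp-*" };

    var dir = try std.fs.cwd().openDir(".", .{ .iterate = true });
    defer dir.close();

    try walkFiltered(dir, &patterns, 0);
}
```

Because ignored directories are skipped before openDir is called, their contents are never read at all, which is the main difference from filtering the output of dir.walk().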


Thank you so much for putting in the time and energy to help me :pray:
I was so used to having things handed to me in other languages (via their std library or community libraries) that I didn’t even consider solving the problem myself.
Thank you for making me realize my shortcomings.


If you want something ready-made, there is this library, which has a PCRE regex implementation:
Fluent
