EazyArgs: a simple argument parsing library with a comptime twist

Hello hello!

I am fully aware that there are several argument-parsing libraries already, but nevertheless it’s time for another one!

I wrote this out of frustration with zig-clap’s API design, which I do not enjoy much, and I replicated EasyArgs’ brilliant idea (which uses the C preprocessor) with compile-time struct generation! The idea is just to define what you want to parse in an anonymous struct like this:

const definitions = .{
    .required = .{
      Arg(u32, "limit", "Limits are meant to be broken"),
      Arg([]const u8, "username", "who are you dear?"),
    },
    .optional = .{
      // type, field_name, short, default, description
      OptArg(u32, "break", "b", 100, "Stop before the limit"),
    },
    .flag = .{
      // default as false, type is bool
      // field_name, short, description
      Flag("verbose", "v", "Print a little, print a lot"),
    }
};

const arguments: InputStruct = try parseArgs(allocator, definitions, stdout, stderr);

and then the compiler will generate a struct with those field names and fill them with whatever arguments you provided! :))

It only works on the current master version of Zig (0.16) due to the @Type function split-up! Also, I don’t know how ergonomic it feels; if you want to try it out or check the code and tell me about pain points or bugs, I would be very happy!

Thank you and have a nice day :slight_smile:

10 Likes

Welcome to Ziggit @PauSolerValades!

There’s room for as many as we want to write.

This approach seems promising to me, it could likely be extended to support ‘commands’ by returning a union of the commands, each one of which is struct-constructed the way you’re already doing. Quite Zon-like in a way.

Reified types have the limitation that it isn’t possible to add declarations, so they can’t have member functions. But options work well as plain-old-data, so not much of a limitation in practice here.

2 Likes

thank you for your reply and the edit :smiley:

Actually I was not familiar with ZON until you mentioned it, and indeed it looks very similar. With commands added (I already had commands in sight when I finished the argument parsing), these are the sketches I was planning to try to implement. It would convert this:

const definitions = .{
    .flags = .{
        eaz.Flag("verbose", "v", "Enable detailed debug logging"),
        eaz.Flag("version", "V", "Print version and exit"),
    },
    .commands = .{
        .commit = .{ // this is a definition all over again
            .required = .{ ... },
            .flags = .{ ... },
            .options = .{ ... },
        },
        .push = .{
            .required = .{
                Arg([]const u8, "remote", "The remote registry (e.g. origin)"),
                Arg([]const u8, "branch", "The branch to push"),
            },
            .flags = .{
                Flag("force", "f", "Force overwrite remote branch"),
            },
            .options = .{},
        },
    },
    .required = .{ // mix command with normal arg (made up arg)
        Arg([]const u8, "path", "Check status of this specific path"),
    },
    .options = .{}, 
};

into this, using the union as you said

const Result = struct {
    verbose: bool,
    path: []const u8,
    cmd: union(enum) {
        commit: struct {
            message: []const u8,
            amend: bool,
        },
        push: struct {
            remote: []const u8,
            branch: []const u8,
            force: bool
        },
    },
};

which actually looks very clean imo, i’ll get to do fs!

I never even thought about the possibility of generating declarations on the generated struct, and I struggle to come up with a well-thought-out use for that anyway. If one comes to mind, please tell me :))

1 Like

This deserves a book, or a long chapter in a book anyway.

For what you’re doing it would be possible to generate the help string as a .format method on the constructed type, then help would just be

try stdout.print("{f}", .{opts});

Of course there’s another way to do that, there always is, and individually they tend to seem sufficient.

The effect is that reified types are second-class subjects. It has to be container.function(&instance, args...), it can never be instance.function(args...) like for “real” types. I find that unsatisfactory.
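A minimal sketch of that calling convention, assuming Zig 0.14 or later for the `.@"struct"` type-info field. The helper name `countEnabledFlags` and the `Parsed` struct are made up for illustration and are not part of the library; a hand-written struct stands in for a reified one only to keep the example short.

```zig
const std = @import("std");

// Hypothetical helper: counts the bool fields that are set to true.
// It is a free function that receives the instance as a parameter, which is
// the only option when the type itself cannot carry declarations.
fn countEnabledFlags(opts: anytype) usize {
    var count: usize = 0;
    inline for (@typeInfo(@TypeOf(opts)).@"struct".fields) |field| {
        if (field.type == bool) {
            if (@field(opts, field.name)) count += 1;
        }
    }
    return count;
}

// Stand-in for a parser result (assumption: plain-old-data fields only).
const Parsed = struct {
    verbose: bool,
    force: bool,
    username: []const u8,
};

test "helper is called as function(instance), not instance.function()" {
    const opts = Parsed{ .verbose = true, .force = false, .username = "pau" };
    try std.testing.expectEqual(@as(usize, 1), countEnabledFlags(opts));
}
```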

1 Like

This does come with the downside of never being able to see the generated struct definition yourself… I wonder if there could be an api that would take in a struct and a set of customizations so you could still have a human-readable struct definition but the parsing options (required, positional, etc), could still be applied.

hello :))

yep, totally. It was very difficult to convince myself that the stuff had worked and that the function actually returned a real struct. If you check src/main.zig in the repo you’ll find my attempts to convince myself that the output was indeed a struct despite not appearing in the code anywhere xD

then i realized that not seeing the struct written in code was kinda the point of the library, not doing it yourself, but it took some time to settle in haha

1 Like

okay gotcha it would be more elegant the way you say for sure.

In fact, the printUsage function (which prints the help) does exactly this: it needs the object and the writer. So your example is literally a thing in the library :rofl:

1 Like

This is exactly what my library does (not posting here so as not to hijack OP’s thread). Overall it works pretty well. It has the same declarative nature seen here, but with concrete types.

@PauSolerValades fun project! Welcome to the Zig CLI club!

4 Likes

let’s go CLI club!!! haha

actually I would be very curious on what other people did on this topic! do you have the link of your library (and others if you have) somewhere? If you don’t want to post it here I can message you and you send me the link? Thank you

1 Like

Sure thing! My library is cova. The repo has links at the bottom to a few other Zig CLI libs as well.

You can also look at some of the Zig Awesome Lists and Zigistry.dev for other examples.

3 Likes

Hello again!

I am very satisfied to say that I’ve made some progress regarding this, and despite needing still a bit more love, it is actually done! Now command parsing has been added like this in a simplified git example.

    const gitu_def = .{
        // set global arguments for the whole program
        .flags = .{ Flag("verbose", "v", "Enable verbose logging") },
        .optional = .{ Opt([]const u8, "config", "c", "~/.gituconfig", "Path to config file") },

        .commands = .{
            .init = .{ // simple command with 1 positional argument
                .required = .{ Arg([]const u8, "path", "Where to create the repository") },
                .flags = .{ Flag("bare", "b", "Create a bare repository") },
            },
            .commit = .{ // just options and flags
                .optional = .{ Opt([]const u8, "message", "m", "Default Message", "Commit message") },
                .flags = .{ Flag("amend", "a", "Amend the previous commit") },
            },
            .remote = .{ 
                .commands = .{ // nested subcommands !
                    .add = .{
                        .required = .{  // multiple required args // gitu remote add <name> <url>
                            Arg([]const u8, "name", "Remote name (e.g. origin)"),
                            Arg([]const u8, "url", "Remote URL"),
                        },
                        .optional = .{ Opt([]const u8, "track", "t", "master", "Branch to track") },
                    },
                    .show = .{
                        .required = .{ Arg([]const u8, "name", "Remote name to inspect") },
                    },
                },
            },
        },
    };
     

This definition complies with the following rules:

  1. No .commands and .required at the same level.
  2. No .commands can appear if a .required has already appeared.
  3. Every label must contain only its proper type (e.g., in .required you can only put Arg)
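As an illustration of rule 1, here is a rough sketch of how such a check could be written at comptime. It assumes Zig 0.14 or later (for `@FieldType` and `.@"struct"`), the function name `levelsAreValid` is hypothetical, and plain strings stand in for the Arg/Flag entries. It is not the library's actual validation code.

```zig
const std = @import("std");

// Returns false when any level declares both .commands and .required.
// Every nested command is checked recursively.
fn levelsAreValid(comptime Def: type) bool {
    if (!@hasField(Def, "commands")) return true; // leaf level, nothing to clash
    if (@hasField(Def, "required")) return false; // rule 1 violated at this level
    const Commands = @FieldType(Def, "commands");
    inline for (@typeInfo(Commands).@"struct".fields) |command| {
        if (!levelsAreValid(command.type)) return false;
    }
    return true;
}

test "commands and required cannot share a level" {
    // Strings are placeholders for the real Arg/Flag definitions.
    const good = .{
        .flags = .{"verbose"},
        .commands = .{
            .init = .{ .required = .{"path"} },
            .remote = .{ .commands = .{ .show = .{ .required = .{"name"} } } },
        },
    };
    const bad = .{
        .required = .{"path"},
        .commands = .{ .init = .{ .flags = .{"bare"} } },
    };
    const good_ok = comptime levelsAreValid(@TypeOf(good));
    const bad_ok = comptime levelsAreValid(@TypeOf(bad));
    try std.testing.expect(good_ok);
    try std.testing.expect(!bad_ok);
}
```

In a real parser the `false` branches would more likely be `@compileError` calls, so a bad definition fails the build instead of returning a value.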

Now, regarding the parsing, I had several meltdowns while messing around with the question of “how can you validly parse the arguments”, and I ended up with two implementations. The first is a freestyle, no-rules parser in the style of the GNU/Linux utilities, which requires an allocator. It parses all the variations in flag and option positioning, like gitu init -v -b "path", gitu init -v "path" -b as well as gitu -v init "path" -b (at least I hope)

    //convert the args into a slice
    const args = try init.minimal.args.toSlice(init.arena.allocator()); 
    const arguments = argz.parseArgs(init.gpa, gitu_def, args, stdout, stderr) catch |err| {
        switch (err) {
            ParseErrors.HelpShown => try stdout.flush(),
            else => try stderr.flush(),
        }    
        std.process.exit(0);
    };

The other implementation is POSIX “compliant” (I still have to check against the source what POSIX compliance really means, to test how compliant it is), which always requires the following order at every level: flags/options first, then required arguments. That means the only valid way to write the above example is gitu -v init -b "here": the -v flag is defined globally, so it must come right after the program name, and -b right after the command. In exchange for this rigidity, it iterates over the arguments just once, so it’s much faster (not that the difference is noticeable, but you know)

    // also, you can do it strict posix
    var iter = init.minimal.args.iterate(); 
    const arguments = argz.parseArgsPosix(gitu_def, &iter, stdout, stderr) catch |err| {
        switch (err) {
            ParseErrors.HelpShown => try stdout.flush(),
            else => try stderr.flush(),
        }    
        std.process.exit(0);
    };

An arbitrary number of subcommands can be nested too, and they are reified as tagged unions, with the enum tags being the names in the .commands tuples. This part specifically has been written with lots of sweat and tears and with inline and comptime shenanigans, but I am confident that it works :slight_smile:

Lastly, the help message is not done and the POSIX implementation has to be tested and validated, but if you want to play around and break it I would be very happy!

I also wanted to ask a question. Do you think it’s better to shorten the names to three letters?

  • .required -> .req
  • .optional -> .opt
  • .flag -> .flg
  • .commands -> .cmd

It would make the definition shorter, but I am not sure.

Thank you for reading this and have a nice day!!

1 Like

Please use real words, not arbitrary shortenings. The goal is clarity of reading, not the minimal amount of characters to type.

1 Like

Yeah, that’s good reasoning! Thank you :slight_smile:

1 Like

Hello everyone!

I’ve released the library :DD At this time, the parseValue function works under all conditions with all the functionalities, while parseValuePosix still has some things to figure out design-wise. If you want to give it a try and give me some feedback I will be very happy!

Regarding POSIX compliance, if you are reading this I would love for you to let me know what you think about these features:

  1. Mutually exclusive flags: This can be done at the definition level as this:
const def = .{
    .flags = .{ 
        .exclusive = .{ 
            Flag("verbose", "v", "Print a lot"), 
            Flag("quiet", "q", "Don't print") 
        },
        Flag("normal", "n", "This is a normal flag"),
    }
};

or be left to the user to handle with normal logic, which seems more straightforward to me

if(args.verbose and args.quiet) print("Those two flags are mutually exclusive");

What do you think is best for this?

  2. Long format: technically, POSIX does not allow a long name for an option, so utility --verbose should not work, but utility -v should. I think I won’t implement this at all, but would you be annoyed if a library that claims to be POSIX compliant also accepted long-form arguments?

That’s all, thank you very much! :slight_smile:

1 Like

This reminds me of my own. I find that generating unions using comptime is such an obviously right choice in Zig, and I couldn’t find any other parsers that did this.

2 Likes

hell yeah! Despite being kinda tricky to implement at the beginning (lots of looking in std to actually understand how to create the enum from the tuple labels to reify the union), I think being able to access all subcommands with a switch feels super nice and feels a lot like Zig!

const definition = .{
    .flags = .{ Flag("verbose", "v", "Enable detailed logging") },
    .commands = .{
        .query = .{
            .required = .{ Arg([]const u8, "statement", "The SQL statement to run") },
            .optional = .{
                Opt(u32, "limit", "l", 100, "Max rows to return"),
                Opt([]const u8, "format", "f", "table", "Output format"),
            },
        },
        .backup = .{
            .required = .{ Arg([]const u8, "path", "Destination file path") },
            .flags = .{ Flag("compress", "z", "GZIP compress the output") },
        },
    },
};

const args = try argz.parseArgsPosix(definition, iter, stdout, stderr);

switch (args.cmd) {
    .query => |q| {
        try stdout.print("Running SQL: \"{s}\"\n", .{q.statement});
        // doSomeStuff(); 
        try stdout.print("Limit: {d} | Format: {s}\n", .{ q.limit, q.format });
    },
    .backup => |b| {
        try stdout.print("Backing up to: {s}\n", .{b.path});
        if (b.compress) {
            try stdout.print("(Compression enabled)\n", .{});
        }
        // doSomeStuff()
    },
}
1 Like

Totally, the switch syntax was also my end goal!

Regarding the type generation: My union generation is also a recursive abomination of comptime-creation-helper-functions haha.

1 Like

yeah, I think there’s no way other than building a freaking recursive monster to parse it! In any case, it’s a love-hate relationship: I cannot think of another intuitive way to code this other than recursion!

“Totally, the switch syntax was also my end goal!”

Not for me tbh, it was a pleasant discovery once I checked that it was finished! I just thought that it made no sense for it not to be a union

The only thing that annoys me about this approach is accessing a single command, args.cmd.query.statement: when you nest two or three commands it gets very verbose (args.cmd.entry.cmd.start.projectid), but you have a very nice switch. The only other approach that allowed direct access was to build the definition to get something like this

const Args = struct {
    verbose: bool,
    query: ?QueryCmd,
    backup: ?BackupCmd,
};

which absolutely loses all the glamour that the original definition has, and the access is, well, if statements :frowning:

if (args.entry) |e| {
    handleEntry(e);
} else if (args.project) |p| {
    handleProject(p);
} 

and i think this latter one has a much worse memory layout, another win for the union approach!
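The layout claim can be checked with a small size comparison. This is only a sketch: `QueryCmd` and `BackupCmd` are assumed payloads modeled on the fields used earlier in the thread, and exact sizes depend on the target and compiler, so the test only asserts which shape is smaller.

```zig
const std = @import("std");

// Assumed payloads, modeled on the query/backup example above.
const QueryCmd = struct { statement: []const u8, limit: u32 };
const BackupCmd = struct { path: []const u8, compress: bool };

// Tagged-union shape: storage for the largest payload plus a tag.
const UnionArgs = struct {
    verbose: bool,
    cmd: union(enum) { query: QueryCmd, backup: BackupCmd },
};

// Optional-fields shape: storage for every payload, each with its own null flag.
const OptionalArgs = struct {
    verbose: bool,
    query: ?QueryCmd,
    backup: ?BackupCmd,
};

test "the union shape takes less space than the optional shape" {
    try std.testing.expect(@sizeOf(UnionArgs) < @sizeOf(OptionalArgs));
}
```

The gap widens with every extra command, since the optional shape grows by a full payload each time while the union only grows if the new payload is the largest one.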

1 Like

Yup. My personal way to justify the recursive monster was the hope that one day, incremental compilation will do all that once and not touch it again unless necessary.

1 Like

Hello everyone!

The library is essentially done, and with some improvements.

  1. Added a non-allocating freestyle GNU parsing function, parseValue(definition, args, stdout, stdin), which uses a buffer of 256; parseValueAllocator(gpa, definition, args, stdout, stdin) is what the normal function used to be.
  2. In the POSIX function you can merge flags together (-v -A -a -> -vaA) and concatenate a value to its option: -o3 gives the option -o the value 3.
  3. Help message now adapts and prints descriptions provided in the definition like this:
 const gitu_def = .{
        // what's your programs name and what does it do
        .name = "gitu",
        .description = "Gitu - A simple git for example purposes.",
        
        // set global arguments for the whole program
        .flags = .{ Flag("verbose", "v", "Enable verbose logging") },
        .options = .{ Opt([]const u8, "config", "c", "~/.gituconfig", "Path to config file") },

        .commands = .{
            .init = .{ // simple command with 1 positional argument
                .required = .{ Arg([]const u8, "path", "Where to create the repository") },
                .flags = .{ Flag("bare", "b", "Create a bare repository") },
                .description = "Creates a new repository",
            },
            .commit = .{ // just options and flags
                .options = .{ Opt([]const u8, "message", "m", "Default Message", "Commit message") },
                .flags = .{ Flag("amend", "a", "Amend the previous commit") },
                .description = "Commits changes"
            },
            .remote = .{ 
                .commands = .{ // nested subcommands !
                    .add = .{
                        .required = .{  // multiple required args // gitu remote add <name> <url>
                            Arg([]const u8, "name", "Remote name (e.g. origin)"),
                            Arg([]const u8, "url", "Remote URL"),
                        },
                        .options = .{ Opt([]const u8, "track", "t", "master", "Branch to track") },
                        .description = "Add a new remote",
                    },
                    .show = .{
                        .required = .{ Arg([]const u8, "name", "Remote name to inspect") },
                        .description = "Show current remote"
                    },
                },
                .description = "Interacts with the server (remote)",
            },
        },
    };

which produces:

$: gitu
Gitu - A simple git for example purposes.

Usage: gitu [options] [commands]

Options:
  -v, --verbose         Enable verbose logging
  -c, --config          Path to config file

Commands:
  init                  Creates a new repository
  commit                Commits changes
  remote                Interacts with the server (remote)
    add                   Add a new remote
    show                  Show current remote

$: gitu remote -h
Usage: eazy_args remote [commands]

Description: Interacts with the server (remote)

Commands:
  add                   Add a new remote
  show                  Show current remote

$: gitu commit -h
Usage: eazy_args commit [options]

Description: Commits changes

Options:
  -a, --amend           Amend the previous commit
  -m, --message         Commit message
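Going back to point 2 of the list above, the merged flags and the glued value can be pictured with a small standalone sketch. The names `Cluster`, `takesValue` and `splitCluster` are hypothetical, and the set of value-taking letters is passed in by hand here; the library derives that from the definition and may well do it differently.

```zig
const std = @import("std");

// Result of splitting one short-option cluster such as "-vaA" or "-vo3".
const Cluster = struct {
    flags: []const u8, // boolean flags seen before any value-taking option
    option: ?u8 = null, // the value-taking option letter, if there is one
    value: []const u8 = "", // text glued to that option; empty if it comes in the next argument
};

fn takesValue(letter: u8, value_options: []const u8) bool {
    for (value_options) |candidate| {
        if (candidate == letter) return true;
    }
    return false;
}

// Returns null when arg is not a short-option cluster (e.g. "path" or "--long").
fn splitCluster(arg: []const u8, value_options: []const u8) ?Cluster {
    if (arg.len < 2 or arg[0] != '-' or arg[1] == '-') return null;
    const letters = arg[1..];
    for (letters, 0..) |letter, i| {
        if (takesValue(letter, value_options)) {
            // Everything after the first value-taking option is its value.
            return .{ .flags = letters[0..i], .option = letter, .value = letters[i + 1 ..] };
        }
    }
    return .{ .flags = letters };
}

test "merged flags and glued values" {
    const merged = splitCluster("-vaA", "o").?;
    try std.testing.expectEqualStrings("vaA", merged.flags);
    try std.testing.expect(merged.option == null);

    const glued = splitCluster("-vo3", "o").?;
    try std.testing.expectEqualStrings("v", glued.flags);
    try std.testing.expect(glued.option.? == 'o');
    try std.testing.expectEqualStrings("3", glued.value);
}
```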

I’ve decided that mutually exclusive flags will not be supported and will have to be handled by the user with a few if statements. I think supporting them clutters the definition a lot, and the main point of this library is that the definition is cleaner and simpler than writing the struct.
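A user-side check of that kind could look like the sketch below. The error set `FlagError` and the function `checkVerbosity` are made-up names, and the anonymous structs in the test stand in for whatever the parser returns.

```zig
const std = @import("std");

const FlagError = error{MutuallyExclusive};

// Hypothetical check to run right after parsing. `args` can be any value
// that has `verbose` and `quiet` bool fields.
fn checkVerbosity(args: anytype) FlagError!void {
    if (args.verbose and args.quiet) return FlagError.MutuallyExclusive;
}

test "verbose and quiet cannot be combined" {
    try checkVerbosity(.{ .verbose = true, .quiet = false });
    try std.testing.expectError(
        FlagError.MutuallyExclusive,
        checkVerbosity(.{ .verbose = true, .quiet = true }),
    );
}
```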

The only two things that annoy me are the following:

  • API names: I don’t feel the names are very good or intuitive. I have to think about what I could rename them to so that the differences between them are easier to see.
  • POSIX behaviour with commands: I don’t really like that, if a flag is in the root of the definition but there is a subcommand, the flag has to be specified between the program name and the command (e.g. gitu -v commit -m "hello"). I think the intuitive behaviour should be gitu commit -v -m "hello" or even gitu commit -m "hello" -v, that is, to prioritize the subcommands before the flags. I think I know how to implement it by wrapping the iterator in a custom one which allows me to “peek” at what the next argument is. If the stars align I will create a parseArgsPosixErgonomic (awful name, I know) reusing code from the normal POSIX parse function.
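The peeking wrapper mentioned in the last point could be sketched roughly like this. `Peekable` and `SliceArgs` are hypothetical names, and the sketch assumes the wrapped iterator exposes `next()` returning `?[]const u8`; the real process argument iterator may need a small adapter to fit that shape.

```zig
const std = @import("std");

// Wraps any iterator with `next() ?[]const u8` and adds one-item lookahead.
fn Peekable(comptime Inner: type) type {
    return struct {
        inner: Inner,
        peeked: ?[]const u8 = null,

        const Self = @This();

        // Returns the next argument without consuming it.
        pub fn peek(self: *Self) ?[]const u8 {
            if (self.peeked == null) self.peeked = self.inner.next();
            return self.peeked;
        }

        // Returns the next argument and advances.
        pub fn next(self: *Self) ?[]const u8 {
            if (self.peeked) |arg| {
                self.peeked = null;
                return arg;
            }
            return self.inner.next();
        }
    };
}

// Stand-in iterator over a fixed slice, used here instead of real process arguments.
const SliceArgs = struct {
    args: []const []const u8,
    index: usize = 0,

    pub fn next(self: *SliceArgs) ?[]const u8 {
        if (self.index >= self.args.len) return null;
        defer self.index += 1;
        return self.args[self.index];
    }
};

test "peek does not consume the argument" {
    var it = Peekable(SliceArgs){ .inner = .{ .args = &.{ "commit", "-v" } } };
    try std.testing.expectEqualStrings("commit", it.peek().?);
    try std.testing.expectEqualStrings("commit", it.next().?);
    try std.testing.expectEqualStrings("-v", it.next().?);
    try std.testing.expect(it.next() == null);
}
```

With this, the parser can peek at the next argument, decide whether it names a subcommand, and only then consume it.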

Thank you for reading and have a nice day! :smiley:

4 Likes