Interface method, returns value based on the implementation

I have a ValueParser interface with a parse method. I want this method to return a value whose type depends on the implementation of the interface.

Here is my ValueParser interface and an implementation IntParser.

pub const ValueParser = struct {
    ptr: *anyopaque,
    parseFn: *const fn (*anyopaque, []const u8) ParseError!void,

    const Self = @This();

    pub fn init(ptr: *anyopaque) Self {
        const gen = struct {
            pub fn parse(pointer: *anyopaque, value: []const u8) ParseError!void {
                const PtrType = @TypeOf(pointer);
                const PtrInfo = @typeInfo(PtrType);

                const self: PtrType = @ptrCast(@alignCast(pointer));

                return try PtrInfo.Pointer.child.parse(self, value);
            }
        };

        return .{
            .ptr = ptr,
            .parseFn = gen.parse,
        };
    }

    pub fn parse(self: *Self, value: []const u8) ParseError!void {
        return self.parseFn(self, value);
    }
};

pub const IntParser = struct {
    pub fn parse(self: *IntParser, value: []const u8) ParseError!usize {
        _ = self;

        return try fmt.parseInt(usize, value, 10);
    }

    pub fn parser(self: *IntParser) ValueParser {
        return ValueParser.init(self);
    }
};

I want the parse function to return the value produced by IntParser, so that I can use it like this:

fn parseTest(parser: *ValueParser, value: []const u8) ParseError!void {
    const val = try parser.parse(value);

    // Type should be inferred based on provided implementation of ValueParser
    std.debug.print("{any}", .{@TypeOf(val)});
}

So far, all of the interfaces I have seen have methods that return anyerror!void, FooError!void, or a concrete type such as !u8.

I cannot for the life of me figure out how to do this. Any help and suggestions would be appreciated.

You can’t without knowing the underlying implementation, which you don’t know with vtable interfaces.

It sounds like you need a generic interface instead.


Hi @Maranix, welcome to Ziggit!

Is there a reason you are trying to use a function-pointer / vtable based interface?

The problem with your interface is that it defines a function pointer that returns ParseError!void, while your implementation wants to return ParseError!usize. You need some way to support the implementations' different value types.

One way would be to use type erasure. For example, you could change

parseFn: *const fn (*anyopaque, []const u8) ParseError!void,

to:

parseFn: *const fn (*anyopaque, []const u8, []u8) ParseError!void,

where the third parameter serves as the memory destination that the value is written to. On the usage side, you would then be able to write a function that specifies the type, like this:

pub fn parse(self: *ValueParser, value: []const u8, comptime Result:type) ParseError!Result {
    var res:Result = undefined;
    try self.parseFn(self.ptr, value, std.mem.asBytes(&res));
    return res;
}

By the way, your gen.parse function doesn’t work: PtrType is just *anyopaque, and PtrInfo.Pointer.child, which is anyopaque, carries no information about what it points to. You can’t use *anyopaque or anyopaque to get to a concrete implementation, because that information has been erased. (Below is a changed version.)

The way to use *anyopaque is to pass such a pointer to some associated function that can be called with it.
For example, somewhere in your program the type information isn’t lost yet; at that place you also create something that allows you to undo the type erasure and then use the concrete implementation to do something.

In your case that place is where the instance of the ValueParser is created (where the int parser is packaged as a ValueParser).
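As a rough sketch of that idea (not the changed version referred to in this post), the wrapper the original gen.parse was aiming for can be generated inside init, where the concrete pointer type is still known. The name parseErased and the base field are assumptions made for this illustration, and init is assumed to receive a *const pointer to a type with a matching parse method:

```zig
const std = @import("std");

const ParseError = error{ Overflow, InvalidCharacter, DestinationTooSmall };

const ValueParser = struct {
    ptr: *const anyopaque,
    parseFn: *const fn (*const anyopaque, []const u8, []u8) ParseError!void,

    // Assumption: `ptr` is a `*const T` where `T` has a matching `parse` method.
    pub fn init(ptr: anytype) ValueParser {
        const Ptr = @TypeOf(ptr);
        const gen = struct {
            fn parseErased(erased: *const anyopaque, input: []const u8, dest: []u8) ParseError!void {
                // Undo the erasure: `Ptr` is still known here at compile time.
                const self: Ptr = @ptrCast(@alignCast(erased));
                return self.parse(input, dest);
            }
        };
        return .{ .ptr = ptr, .parseFn = &gen.parseErased };
    }

    pub fn parse(self: ValueParser, input: []const u8, dest: []u8) ParseError!void {
        // Pass the stored implementation pointer, not the interface itself.
        return self.parseFn(self.ptr, input, dest);
    }
};

const IntParser = struct {
    base: u8 = 10, // hypothetical field, only here so that `self` is actually used

    pub fn parse(self: *const IntParser, input: []const u8, dest: []u8) ParseError!void {
        if (dest.len < @sizeOf(usize)) return error.DestinationTooSmall;
        const n = try std.fmt.parseInt(usize, input, self.base);
        // Copy the bytes instead of casting, so the destination's alignment doesn't matter.
        @memcpy(dest[0..@sizeOf(usize)], std.mem.asBytes(&n));
    }
};

pub fn main() !void {
    const int_parser = IntParser{};
    const vp = ValueParser.init(&int_parser);

    var out: usize = undefined;
    try vp.parse("123456", std.mem.asBytes(&out));
    std.debug.print("number: {d}\n", .{out});
}
```

The generated function is the only place that knows both the erased pointer and the concrete type, which is why it has to be created inside init rather than recovered later from *anyopaque.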


I changed a few things. The parseFn now has three parameters. The ValueParser.init function takes anytype, which is something very different from *anyopaque: it expects to be called with the self pointer of a concrete implementation, and it uses that to create an instance of the ValueParser interface containing a function pointer to the concrete implementation's parse function.
(Think of this init function as a small helper that converts a pointer to the concrete implementation into a fat pointer that is an instance of the interface. The types have been erased, but the .parseFn function pointer is associated with the .ptr pointer, and once it is called with it, it can restore the original type information, because it assumes that it will be called with the right self pointer.)

The IntParser.parse implementation uses const value_ptr = std.mem.bytesAsValue(usize, dest); to turn the destination bytes into a single element pointer *usize; it then parses the string and stores the result at that address. If the given destination is too small, the parse function returns error.DestinationTooSmall.

const std = @import("std");
const fmt = std.fmt;

const ParseError = error{
    Overflow,
    InvalidCharacter,
    DestinationTooSmall,
};

pub const ValueParser = struct {
    ptr: *const anyopaque,
    parseFn: *const fn (parser: *const anyopaque, input: []const u8, destination: []u8) ParseError!void,

    const Self = @This();

    pub fn init(ptr: anytype) Self {
        return .{
            .ptr = ptr,
            .parseFn = @ptrCast(&std.meta.Child(@TypeOf(ptr)).parse),
        };
    }

    pub fn parse(self: *const Self, value: []const u8, dest: []u8) ParseError!void {
        return self.parseFn(self.ptr, value, dest);
    }
};

pub const IntParser = struct {
    pub fn parse(_: *const IntParser, value: []const u8, dest: []u8) ParseError!void {
        if (dest.len < @sizeOf(usize)) return error.DestinationTooSmall;
        const value_ptr = std.mem.bytesAsValue(usize, dest);
        value_ptr.* = try fmt.parseInt(usize, value, 10);
    }

    pub fn parser(self: *const IntParser) ValueParser {
        return ValueParser.init(self);
    }
};

pub fn main() !void {
    const usize_parser = IntParser{};
    const parser: ValueParser = usize_parser.parser();

    var usize_val: usize = undefined;
    try parser.parse("123456", std.mem.asBytes(&usize_val));

    std.debug.print("number: {d}\n", .{usize_val});
}
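For comparison, here is a minimal sketch of how the typed wrapper suggested earlier in this post could sit on top of the erased function pointer, so that the caller names the result type instead of handling bytes. The names parseAs and parseErased and the base field are hypothetical, and the wrapper only checks the size of the destination, not the actual type:

```zig
const std = @import("std");

const ParseError = error{ Overflow, InvalidCharacter, DestinationTooSmall };

const ValueParser = struct {
    ptr: *const anyopaque,
    parseFn: *const fn (*const anyopaque, []const u8, []u8) ParseError!void,

    // Hypothetical typed wrapper: the caller states which type it expects.
    pub fn parseAs(self: ValueParser, value: []const u8, comptime Result: type) ParseError!Result {
        var res: Result = undefined;
        try self.parseFn(self.ptr, value, std.mem.asBytes(&res));
        return res;
    }
};

const IntParser = struct {
    base: u8 = 10, // hypothetical field, only here so that the pointer is used

    fn parseErased(ptr: *const anyopaque, input: []const u8, dest: []u8) ParseError!void {
        const self: *const IntParser = @ptrCast(@alignCast(ptr));
        if (dest.len < @sizeOf(usize)) return error.DestinationTooSmall;
        const n = try std.fmt.parseInt(usize, input, self.base);
        @memcpy(dest[0..@sizeOf(usize)], std.mem.asBytes(&n));
    }

    pub fn parser(self: *const IntParser) ValueParser {
        return .{ .ptr = self, .parseFn = &parseErased };
    }
};

pub fn main() !void {
    const int_parser = IntParser{};
    const p = int_parser.parser();

    const n = try p.parseAs("123456", usize);
    std.debug.print("number: {d}\n", .{n});
}
```

Asking for a type smaller than usize fails with error.DestinationTooSmall, but asking for a different type of the same or larger size would silently reinterpret the bytes, so this only works when the caller really knows which parser is behind the interface.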

There are probably a bunch of things that could be improved about this, but it depends on what you really intend to accomplish. For example, if the set of parsers you need is closed (meaning it is known at compile time which parsers will be needed), it may make more sense to switch to something based on duck typing and comptime code, using concrete types and generic programming, instead of runtime polymorphism (through type-erased interfaces).

To figure out what solution would be best, you would have to describe in more detail what your specific requirements are, how/when you intend to use your parsers and what is known statically at compile time and what is only known at run-time.

This isn’t directly possible in this way.
If you switch to parser: anytype, you could use generic programming, and parsers could return concrete values. The type of the value would then be known from what the given concrete parser implementation returns.
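A minimal sketch of the anytype approach, assuming two hypothetical parsers (this IntParser and BoolParser are simplified stand-ins, and parseAndPrint is a made-up name). Because the function is instantiated per concrete parser type, @TypeOf(val) is known at compile time:

```zig
const std = @import("std");

// Hypothetical concrete parsers; each one returns its own value type.
const IntParser = struct {
    pub fn parse(_: IntParser, text: []const u8) !usize {
        return std.fmt.parseInt(usize, text, 10);
    }
};

const BoolParser = struct {
    pub fn parse(_: BoolParser, text: []const u8) !bool {
        if (std.mem.eql(u8, text, "true")) return true;
        if (std.mem.eql(u8, text, "false")) return false;
        return error.InvalidBool;
    }
};

// Duck typing: any value with a suitable `parse` method is accepted,
// and the type of `val` follows from the parser that was passed in.
fn parseAndPrint(parser: anytype, text: []const u8) !void {
    const val = try parser.parse(text);
    std.debug.print("{s}: {any}\n", .{ @typeName(@TypeOf(val)), val });
}

pub fn main() !void {
    try parseAndPrint(IntParser{}, "20");
    try parseAndPrint(BoolParser{}, "true");
}
```

The trade-off is that the concrete parser type must be known at each call site, so such parsers cannot be stored together in one runtime field the way a vtable interface can.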

With a type-erased interface, you could instead add methods to the interface that do more than just parsing and also use the value in some way, or you could use the type-erased interface in a context where you already know what the concrete type will be.

The compiler can’t infer a type that depends on runtime data. Once the type information has been erased, only the programmer can know whether a specific interface value has a specific type. For example, a language parser could read a type declaration, know that it expects an int, and then try parsing an int value; if that fails, it knows that something other than an int value was written in the code it tried to parse.


A possible solution would be to use a tagged union of the types you need.

Without more context for what you’re doing, it’s hard to give good advice


Thank you very much for such a detailed reply. My understanding of interfaces in Zig was wrong. After reading through everything, I now see why it wasn’t working. I guess using an interface was not the right decision on my part.

To figure out what solution would be best, you would have to describe in more detail what your specific requirements are, how/when you intend to use your parsers and what is known statically at compile time and what is only known at run-time.

I apologize for not communicating my intentions and requirements thoroughly. I started learning this language seriously this month and decided to make a couple of CLI tools/applications in order to become proficient at it. My intention was to create a lightweight command-line argument parsing library for personal use (I know hardcoding flags etc. would be much simpler, but I wanted something I can reuse in the future as well).

The purpose behind creating the ValueParser interface was for parsing, validating and enforcing the type of flag values. So, let’s say I have a Flag struct.

Example Struct:

pub const Flag = struct {
    long: []const u8,
    short: []const u8,
    parser: ValueParser, // parses and validates the flag value
};

Here long and short are the usual flag names, prefixed with an identifier (generally - for short and -- for long), and parser is the field described by the comment. If I wanted to get the value of a passed argument such as --value=20, I could use the defined Flag to parse the value and return it, or save it in a map or something (it doesn’t really matter at this point).

Edit: I forgot to add that parsing values for a defined Flag would be done at runtime, while the Flag definitions would be done at comptime (if possible). By definition, I mean defining the supported flags (structs) for the application.

Usage:

pub fn main() !void {
    const helpFlag: Flag = .{
        .long = "value",
        .short = "v",
        .parser = IntParser // This is an implementation of ValueParser, which strictly enforces that this flag only accepts integer values, anything else will result in an error.
    };

    const valueArg = "--value=20";

    // I haven't split key and value here just to keep the example simple.
    // consider that I will most likely be providing just "20" as `[]const u8` type to parser.
    const result = helpFlag.parser(valueArg);

    std.debug.print("value: {d}", .{result});
}

I guess in this case it would be better to use a generic function and a tagged union to provide the parsers to the struct, either as a field or as a function parameter. The problem with this approach, I think, is retrieving the parsed value from the union.

Example:

pub const ValueParser = union(enum) {
    Int: usize,
    Bool: bool,

    const Self = @This();

    pub fn parse(self: Self, value: []const u8) !void {
        switch(self) {
            .Int => |int_val| {
            // Do something with value
           },
           ...
        }
    }

    // How do I retrieve the value with the correct type from here?
};

Really appreciate the feedback and suggestions. Looking forward to further discussion on this topic.


I have considered them as well but I think retrieving the correct value would be problematic with tagged unions. See my post above for more information and context on what I am trying to achieve.
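One possible way around the retrieval problem, sketched under the assumption that the parser kind and the parsed value are kept as two separate types (Kind, Value and parseKind are hypothetical names, not taken from any post in this thread): the union only carries the result, and a switch on the active tag gives back a correctly typed payload.

```zig
const std = @import("std");

// Which parser a flag uses; this can be stored in a runtime field.
const Kind = enum { int, boolean };

// What parsing produces; the active tag records the type at runtime.
const Value = union(Kind) {
    int: usize,
    boolean: bool,
};

fn parseKind(kind: Kind, text: []const u8) !Value {
    switch (kind) {
        .int => return Value{ .int = try std.fmt.parseInt(usize, text, 10) },
        .boolean => {
            if (std.mem.eql(u8, text, "true")) return Value{ .boolean = true };
            if (std.mem.eql(u8, text, "false")) return Value{ .boolean = false };
            return error.InvalidBool;
        },
    }
}

pub fn main() !void {
    const v = try parseKind(.int, "20");

    // Each branch captures the payload with its concrete type.
    switch (v) {
        .int => |n| std.debug.print("int: {d}\n", .{n}),
        .boolean => |b| std.debug.print("bool: {}\n", .{b}),
    }

    // When the expected kind is known from context, the field can be
    // accessed directly; this is safety-checked in debug builds.
    const n: usize = v.int;
    std.debug.print("direct: {d}\n", .{n});
}
```

The type is still not inferred by the compiler here; the caller either handles every case in the switch or asserts the expected case through direct field access.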

This is a bit of an example of how you could use comptime programming to create a parser. Note that an actual command-line parser would operate on a list of strings instead of one long string.

For example the shell would be responsible for string quotes and escaping.
At least I think so, I am not an expert on command line parsing.
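To illustrate the list-of-strings point, here is a small sketch that walks over already-separated arguments and splits each one at the first '='. The Arg struct and splitArg are hypothetical helpers, and the hard-coded array stands in for the arguments a real program would receive from the process:

```zig
const std = @import("std");

// Hypothetical helper type: one `--name=value` argument, split in two.
const Arg = struct {
    name: []const u8,
    value: []const u8,
};

// Returns null for anything that is not of the form `--name=value`.
fn splitArg(arg: []const u8) ?Arg {
    if (!std.mem.startsWith(u8, arg, "--")) return null;
    const body = arg[2..];
    const eq = std.mem.indexOfScalar(u8, body, '=') orelse return null;
    return Arg{ .name = body[0..eq], .value = body[eq + 1 ..] };
}

pub fn main() void {
    // Stand-in for the argument list; quoting and escaping were already
    // handled before the program sees these strings.
    const args = [_][]const u8{ "--value=20", "--name=hello world", "positional" };

    for (args) |arg| {
        if (splitArg(arg)) |kv| {
            std.debug.print("flag '{s}' = '{s}'\n", .{ kv.name, kv.value });
        } else {
            std.debug.print("not a long flag: '{s}'\n", .{arg});
        }
    }
}
```

Working per argument also means a value may contain spaces without any extra handling, which the single-string approach cannot express.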

It probably also would make sense to take a look at the various command line parsing projects you can find in this forum and how they are implemented.

I haven’t tested this a whole lot, or refined it, if I were to spend more time on it I probably would find ways to improve it, but I have already been nerd-sniped into spending too much time on this… :upside_down_face:

const std = @import("std");

// this is a comptime only type,
// because `value: type` can't exist at runtime
pub const Flag = struct {
    long: [:0]const u8,
    short: [:0]const u8,
    value: type,
};
pub const Flags = []const Flag;

pub fn ParseResult(comptime Definition: Flags) type {
    var fields: [Definition.len]std.builtin.Type.StructField = undefined;
    for (Definition, &fields) |d, *dest| {
        dest.* = .{
            .name = d.long,
            .alignment = @alignOf(d.value),
            .type = d.value,
            .default_value = null,
            .is_comptime = false,
        };
    }
    return @Type(.{ .Struct = .{
        .layout = .auto,
        .fields = &fields,
        .decls = &.{},
        .is_tuple = false,
    } });
}

pub fn Parser(comptime Definition: Flags) type {
    return struct {
        const Self = @This();

        failed: bool,
        msg: []const u8,
        name: []const u8,
        remaining: []const u8,

        pub const default: Self = .{
            .failed = false,
            .msg = "",
            .name = "",
            .remaining = "",
        };

        fn fail(self: *Self, msg: []const u8, remaining: []const u8) !void {
            self.failed = true;
            self.msg = msg;
            self.remaining = remaining;
            return error.ParseFailed;
        }

        pub fn parse(self: *Self, input: []const u8) !ParseResult(Definition) {
            var res: ParseResult(Definition) = undefined;

            var remaining = input;
            while (true) {
                if (remaining.len == 0) return res;
                if (remaining[0] == ' ') {
                    remaining = remaining[1..];
                    continue;
                }

                const Mode = enum { short, long, unexpected };
                var mode: Mode = .unexpected;
                // startsWith avoids slicing past the end of a short input.
                if (std.mem.startsWith(u8, remaining, "--")) {
                    mode = .long;
                    remaining = remaining[2..];
                } else if (std.mem.startsWith(u8, remaining, "-")) {
                    mode = .short;
                    remaining = remaining[1..];
                }
                if (mode == .unexpected)
                    try self.fail("expected - but got unexpected characters starting from:", remaining);

                if (std.mem.indexOfScalar(u8, remaining, '=')) |equals_index| {
                    const name = remaining[0..equals_index];
                    if (remaining.len <= equals_index + 1) try self.fail("expected value but got end of input", remaining);
                    remaining = remaining[equals_index + 1 ..];

                    inline for (Definition) |d| {
                        if ((mode == .long and std.mem.eql(u8, d.long, name)) or
                            (mode == .short and std.mem.eql(u8, d.short, name)))
                        {
                            self.name = name;
                            @field(res, d.long) = try self.parseValue(d.value, &remaining);
                        }
                    }
                    if (std.mem.eql(u8, self.name, "")) {
                        self.name = name;
                        try self.fail("unknown flag:", remaining);
                    } else {
                        self.name = "";
                    }
                } else {
                    try self.fail("expected name= but got unexpected characters starting from:", remaining);
                }
            }
        }

        fn parseValue(self: *Self, comptime Value: type, remaining: *[]const u8) !Value {
            switch (Value) {
                usize => {
                    const end = std.mem.indexOfScalar(u8, remaining.*, ' ') orelse remaining.len;
                    const value = std.fmt.parseInt(usize, remaining.*[0..end], 10) catch {
                        try self.fail("parsing usize value failed:", remaining.*);
                        unreachable;
                    };
                    remaining.* = remaining.*[end..];
                    return value;
                },
                else => @compileError("type '" ++ @typeName(Value) ++ "' is not supported"),
            }
        }
    };
}

pub fn parser(comptime Definition: Flags) Parser(Definition) {
    return Parser(Definition).default;
}

pub fn main() !void {
    var p = parser(&.{
        .{ .long = "value", .short = "v", .value = usize },
    });

    // NOTE: a proper command-line parser wouldn't operate on `[]const u8`, but instead on `[]const []const u8`
    // because the program is given a list of arguments not a single string argument

    const result = p.parse("--value=20") catch |err| {
        std.debug.print("error: {}\n", .{err});
        std.debug.print("\nparser\nmsg:{s}\nname:{s}\nremaining:{s}\n", .{ p.msg, p.name, p.remaining });
        std.process.exit(1);
    };

    std.debug.print("result: {}\n", .{result.value});
}

At least I think so, I am not an expert on command line parsing.

Neither am I, but we’ll see how it goes :stuck_out_tongue:

It probably also would make sense to take a look at the various command line parsing projects you can find in this forum and how they are implemented.

I took a peek at the repositories of zig-clap and zig-cli before posting here, but I couldn’t come up with something on my own (at least not something whose workings I understood).

I haven’t tested this a whole lot, or refined it, if I were to spend more time on it I probably would find ways to improve it

Thanks for this, this looks pretty close to what I want.

but I have already been nerd-sniped into spending too much time on this… :upside_down_face:

LOL, I was hard-stuck on this problem for the last 3-4 days. I looked around almost everywhere and had no choice but to seek help here, but that’s all part of the process.

So, if my understanding of this generic code is correct: basically, you are programmatically creating a struct and its fields at comptime from a slice of Flag, where the name of each field corresponds to the Flag identifier (in this case long) and the type of the field is the same as the value field of the Flag.

pub fn ParseResult(comptime Definition: Flags) type {
    var fields: [Definition.len]std.builtin.Type.StructField = undefined;
    for (Definition, &fields) |d, *dest| {
        dest.* = .{
            .name = d.long,
            .alignment = @alignOf(d.value),
            .type = d.value,
            .default_value = null,
            .is_comptime = false,
        };
    }
    return @Type(.{ .Struct = .{
        .layout = .auto,
        .fields = &fields,
        .decls = &.{},
        .is_tuple = false,
    } });
}
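To check that reading, it can help to compare against a hand-written struct. The sketch below assumes two flags, value: usize and a hypothetical verbose: bool, and writes out by hand the kind of struct that ParseResult is described as generating; HandWritten is a made-up name for this illustration.

```zig
const std = @import("std");

// Hand-written stand-in for the generated type: one field per flag,
// named after `long` and typed after `value`.
const HandWritten = struct {
    value: usize,
    verbose: bool,
};

pub fn main() void {
    // The fields can be inspected at comptime, just like those of a generated struct.
    inline for (std.meta.fields(HandWritten)) |field| {
        std.debug.print("{s}: {s}\n", .{ field.name, @typeName(field.type) });
    }

    // @field accesses a field by its comptime-known name, which is how the
    // parser in the previous post stores each parsed value.
    var res: HandWritten = undefined;
    @field(res, "value") = 20;
    @field(res, "verbose") = true;
    std.debug.print("value = {d}, verbose = {}\n", .{ res.value, res.verbose });
}
```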