An Alternative to `anytype` Duck-Typing

There’s been a lot of good discussion recently surrounding `anytype` and duck-typing in Zig. These discussions have been going on for years, but some of the more recent ones here on Ziggit (such as “A better anytype?” and “Zig’s Comptime Is Bonkers Good”) inspired me to try using Zig’s comptime to solve a few of the issues that were mentioned.

Working through this, I came up with a basic Comptime Type Constraint pattern. In short, this is just a comptime function that takes a Type and returns the same Type if all requirements are met (and raises a compile error otherwise). The focus here is readability and reuse. Readability, in this case, refers both to readers of the code and to the error messages that are generated. Reuse simply means the ability to easily apply the pattern in multiple places.

This pattern isn’t entirely new (the stdlib uses comptime validation functions in several areas), but I think it’s underrepresented. Specifically, using the validation function within a function signature is something I haven’t seen much of, if ever.

With that said, here’s a basic demo of the pattern. Feel free to play with the arg Type and compile to see the pattern in action:

const std = @import("std");
const log = std.log;
const mem = std.mem;
const meta = std.meta;

pub fn main() !void {
    // Change my Type and recompile to see the `IsFoo()` Comptime Type Constraint in action.
    // Types: Foo, FooTwo, BadFoo, Bar, Baz
    const arg: BadFoo = .{};
    someFn(@TypeOf(arg), arg);
}

/// Just a demo Function using our Comptime Type Constraint (`IsFoo()`).
pub fn someFn(ArgT: type, arg: IsFoo(ArgT)) void {
    log.debug("Good Foo: {}", .{ arg });
    return;
}

/// Our Comptime Type Constraint.
/// This ensures the provided Type is a Struct that contains:
/// - Field: `foo: bool`
/// - Function: `doFoo([]const u8) bool`
pub fn IsFoo(CheckT: type) type {
    var check_msg: []const u8 = "The Type `" ++ @typeName(CheckT) ++ "` must be a Struct with the field `foo: bool` and function `doFoo([]const u8) bool`.";
    // Check the Type
    const raw_info = @typeInfo(CheckT);
    if (raw_info != .Struct) @compileError(check_msg);
    const info = raw_info.Struct;
    var good: bool = true;
    // Check for a Field
    checkFoo: {
        for (info.fields) |field| {
            if (!mem.eql(u8, field.name, "foo")) continue;
            if (field.type != bool) break;
            break :checkFoo;
        }
        good = false;
        check_msg = check_msg ++ "\n- Missing Field: `foo: bool`";
    }
    // Check for a Declaration
    checkDoFoo: {
        declCheck: for (info.decls) |decl| {
            if (!mem.eql(u8, decl.name, "doFoo")) continue;
            const DeclT = @TypeOf(@field(CheckT, "doFoo"));
            const decl_info = @typeInfo(DeclT);
            if (decl_info != .Fn) break;
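            // Require exactly one parameter, and it must be `[]const u8`.
            // (A zero-parameter `doFoo` must not pass, and extra parameters must not hit `unreachable`.)
            const params = decl_info.Fn.params;
            if (params.len != 1 or params[0].type != []const u8) break :declCheck;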
            if (decl_info.Fn.return_type != bool) break;
            break :checkDoFoo;
        }
        good = false;
        check_msg = check_msg ++ "\n- Missing Fn: `doFoo([]const u8) bool`";
    }
    if (!good) @compileError(check_msg);
    return CheckT;
}

/// The most basic Struct matching the IsFoo Constraint.
pub const Foo = struct {
    foo: bool = false,

    pub fn doFoo(arg: []const u8) bool {
        return mem.eql(u8, arg, "foo");
    }
};

/// Another Struct matching the IsFoo Constraint with additional Fields and Declarations.
pub const FooTwo = struct {
    foo: bool = false,
    other_field: []const u8 = "I can have other fields...",

    pub const other_decl: []const u8 = "...and other declarations too!";

    pub fn doFoo(arg: []const u8) bool {
        return mem.eql(u8, arg, "FOO_TWO");
    }
};

/// This Struct nearly matches the IsFoo Constraint, but has a bad signature for `doFoo()`.
pub const BadFoo = struct {
    foo: bool = false,

    pub fn doFoo(arg: []const u8) []const u8 {
        return if (!mem.eql(u8, arg, "foo")) "BAD FOO!" else "foo";
    }
};

/// This Struct is similar to `BadFoo`, but has a bad declaration name instead.
pub const Bar = struct {
    foo: bool = false,

    pub fn doBar(arg: []const u8) bool {
        return mem.eql(u8, arg, "bar");
    }
};

/// This Struct is missing ALL requirements of the IsFoo Constraint.
pub const Baz = struct {
    baz: []const u8 = "",

    pub fn doBaz(arg: []const u8) bool {
        return mem.eql(u8, arg, "baz");
    }
};

This pattern improves readability and reuse in two ways:

  1. For both code readers and writers, it provides a succinct place to see exactly what requirements a Type must have when reading a function signature. Since this is just a function itself, it can be reused across several functions easily. It also shows up easily in tools like ZLS so you don’t have to blow up the Doc Comment of every function where this Constraint applies just to clarify what the Type of some Parameter must be.
  2. It lifts errors about the Type to the call site instead of leaving them buried within the function. This can help a lot when debugging and trying to figure out exactly which call has a troublesome parameter.

There are also downsides, of course. For one, this isn’t compatible with anytype because the Type must be provided directly, which adds more parameters to a function. I typically avoid anytype in favor of providing the Type anyway, but it’s understandable that not everyone feels the same way.

This pattern also only applies to functions. These constraints couldn’t be directly used as Types for fields or declarations, though they could help in Type validation. You could also extend this to automatically create vtables (that could be used anywhere), but that’s a separate discussion beyond the scope of this small solution.
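As a minimal sketch of the “help in Type validation” idea, a constraint can validate the field type of a generic container, since the container itself is produced by a function. The `IsFoo` below is a deliberately simplified stand-in that only checks names with `@hasField` and `@hasDecl`, and `Holder` is a hypothetical name that doesn’t appear in the demo above:

```zig
const std = @import("std");

/// Simplified stand-in for the constraint: only checks that `foo` and `doFoo` exist by name.
fn IsFoo(comptime T: type) type {
    if (!@hasField(T, "foo") or !@hasDecl(T, "doFoo")) {
        @compileError("`" ++ @typeName(T) ++ "` must have a field `foo` and a declaration `doFoo`.");
    }
    return T;
}

/// Hypothetical generic container. The constraint runs once the compiler
/// resolves the field types of `Holder(T)`.
fn Holder(comptime T: type) type {
    return struct {
        inner: IsFoo(T),
    };
}

const Foo = struct {
    foo: bool = false,

    pub fn doFoo(arg: []const u8) bool {
        return std.mem.eql(u8, arg, "foo");
    }
};

pub fn main() void {
    const holder: Holder(Foo) = .{ .inner = .{} };
    std.debug.print("foo = {}\n", .{holder.inner.foo});
}
```

Swapping `Foo` for a type without `foo` or `doFoo` should surface the constraint’s message instead of a field-access error deeper in the code.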

Does this pattern properly address the issues I mentioned? Are there other downsides I’m missing?

Something like this has crossed my mind once or twice but it’s cool seeing it actually realized. My only concern would be compile-time overhead if the same type gets checked for the same constraints multiple times, but I’m pretty sure the result gets cached.

This could even compose like so:

pub fn doIO(
    Stream: type,
    stream: IsReader(IsWriter(Stream)),
    // We know that these error sets exist from the type constraints:
) (Stream.WriteError || Stream.ReadError)!void {
    _ = stream; // Placeholder body; `stream` is unused in this sketch.
}
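To make the composition idea concrete, here is a small sketch under stated assumptions: `IsReader` and `IsWriter` are hypothetical constraints that only check for declarations by name, and `Loopback` and `echo` are made-up names for illustration. None of these come from the standard library.

```zig
const std = @import("std");

/// Hypothetical constraint: the type must declare `ReadError` and `read`.
fn IsReader(comptime T: type) type {
    if (!@hasDecl(T, "ReadError") or !@hasDecl(T, "read")) {
        @compileError("`" ++ @typeName(T) ++ "` must declare `ReadError` and `read`.");
    }
    return T;
}

/// Hypothetical constraint: the type must declare `WriteError` and `write`.
fn IsWriter(comptime T: type) type {
    if (!@hasDecl(T, "WriteError") or !@hasDecl(T, "write")) {
        @compileError("`" ++ @typeName(T) ++ "` must declare `WriteError` and `write`.");
    }
    return T;
}

/// Made-up single-byte stream that satisfies both constraints.
const Loopback = struct {
    slot: ?u8 = null,

    pub const ReadError = error{Empty};
    pub const WriteError = error{Full};

    pub fn write(self: *Loopback, byte: u8) WriteError!void {
        if (self.slot != null) return error.Full;
        self.slot = byte;
    }

    pub fn read(self: *Loopback) ReadError!u8 {
        const byte = self.slot orelse return error.Empty;
        self.slot = null;
        return byte;
    }
};

/// Writes one byte and reads it back. Both constraints are nested in the signature,
/// so the merged error set in the return type can rely on both declarations existing.
fn echo(
    comptime Stream: type,
    stream: *IsReader(IsWriter(Stream)),
    byte: u8,
) (Stream.WriteError || Stream.ReadError)!u8 {
    try stream.write(byte);
    return stream.read();
}

pub fn main() !void {
    var stream: Loopback = .{};
    const byte = try echo(Loopback, &stream, 'z');
    std.debug.print("echoed: {c}\n", .{byte});
}
```

Because each constraint returns the type it was given, nesting them works without any extra glue; the trade-off is that only the first failing constraint reports its error.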

I also think that having some friction for writing generic code is not a bad thing. ‘Genericising’ your code can be very tempting when the language makes it easy, but there is overhead introduced not just in compiler complexity, but also mental overhead for the programmer.

This is the kind of stuff I had in mind, love it. Having some constraints that make it easy to tell what the code actually wants.

I just want to point out that more recently, the strategy followed by the standard library appears to not be to introduce superfluous “type shape” validation logic and to instead let compile errors guide the user toward figuring out what the expectations are. See for example this commit, which removed such validation logic from ArrayHashMap, and this recently merged PR which did the same for HashMap.

The issue linked by the PR also demonstrates a problem with having a validation function that is separate from the implicit validation at the usage site: it’s easy for the two to grow subtly out of sync and for the validation function to be more strict than it actually needs to be.

Most problems with anytype can be classified as “tooling issues”, either in the form of bad/confusing/insufficient compile errors, or insufficient information provided by tools like autodoc or ZLS.

I can’t find the exact issue it was discussed in, but for compile errors I know there have been talks about making sure module boundaries are taken into account when omitting error traces, so that even if an error occurs deep inside std.fmt it should still highlight the first usage outside of the std module instead of requiring -freference-trace to start seeing traces from user code. This would probably help a lot of users more easily make sense of compile errors and fix their code.

For tools like autodoc or ZLS, I’m not going to be as bold as to say that fixing it would be trivial, but at least for simple cases it should definitely be possible to analyze how an anytype parameter is used inside a function, by looking at expressions like duck.speak("quack"); or const x: usize = duck.num_feathers;, and then provide hints to the caller like “duck should expose a speak method that takes a *const [5:0]u8 and returns void” or “duck should have a num_feathers field of a type coercible to usize”.

@castholm appreciate the insight as always! And, as is typical of my Zig journey, I’m always a step behind, haha. The first commit you referenced was exactly the kind of function I had in mind when writing what you quoted.

I must say, my initial reaction to this decision is to disagree. I know the standard answer is “just document what the Type needs to be,” but doing that across more than one function gets tedious and still isn’t as readable as a single constraint function in my opinion. Moreover, relying on the compiler for duck-typing errors quickly turns into a game of whack-a-mole wherein we may have to compile several times over to figure out exactly what our Type is missing if we don’t understand the documentation.

With a constraint, the documentation and error messaging are all handled together. The “right” solution (obviously just imo) to that PR would’ve been to make the constraint function less strict so as to be inclusive of its actual requirements.
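As a rough sketch of the whack-a-mole point, a constraint can gather every unmet requirement before raising a single compile error. The helper name `missingFromFoo` is hypothetical, and the checks are simplified to name-only lookups via `@hasField` and `@hasDecl`:

```zig
const std = @import("std");

/// Hypothetical helper that lists every unmet requirement instead of stopping at the first.
/// It is only meant to be called in a comptime context.
fn missingFromFoo(comptime T: type) []const u8 {
    var msg: []const u8 = "";
    if (!@hasField(T, "foo")) msg = msg ++ "\n- Missing Field: `foo`";
    if (!@hasDecl(T, "doFoo")) msg = msg ++ "\n- Missing Fn: `doFoo`";
    return msg;
}

/// Simplified constraint built on the helper: one compile error carrying the full list.
fn IsFoo(comptime T: type) type {
    const missing = missingFromFoo(T);
    if (missing.len != 0) {
        @compileError("`" ++ @typeName(T) ++ "` is not a valid Foo:" ++ missing);
    }
    return T;
}

const Foo = struct {
    foo: bool = false,

    pub fn doFoo(arg: []const u8) bool {
        return std.mem.eql(u8, arg, "foo");
    }
};

const Baz = struct {
    baz: []const u8 = "",
};

pub fn main() void {
    // `Foo` passes the constraint, so `Checked` is simply `Foo`.
    const Checked = IsFoo(Foo);
    // `Baz` fails both checks; print the report rather than triggering the compile error.
    const report = comptime missingFromFoo(Baz);
    std.debug.print("{s} is fine; Baz is missing:{s}\n", .{ @typeName(Checked), report });
}
```

Keeping the report separate from the `@compileError` call also makes the constraint itself testable, since the report can be inspected without failing the build.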

That said, the idea of module boundary consideration during error parsing will help quite a bit. Hopefully tooling improves in kind as the core team and contributors like yourself continue to grind away. Thanks again for highlighting the current opinion on this in regard to the stdlib!

I agree to some extent that this is also an issue with tooling, and I’m sure that in 1 or 2 years, if anytype remains, a lot of the problems I have with it will mostly vanish. With that said, I still think that anytype is too loose and too easy to write, and we know how things turn out when you rely on programmers to provide documentation. I’m not convinced the solution is to rely exclusively on type constraint functions, but at the same time it’s difficult to say that anytype is a good solution as is.

What I truly love about Zig is its smart use of friction as a design element. Making the right thing to do the easy path, while still leaving you the freedom to deviate at the expense of friction, is really an amazing balance that brings me joy every time I use Zig. The issue is that I don’t see much friction with anytype at all. I don’t see how the path of using anytype is better than the one where you specify what you want out of the box.

Now, I’m not a compiler designer or a maintainer, and I don’t have the knowledge, experience, or big picture of what anytype enables that makes it worth it. There are probably a ton of good reasons why it exists. At the same time, as a user, I just don’t enjoy it, and I don’t think it’s the best Zig can do.

A while back we already had a similar discussion (and other people had similar discussions or ideas before that). This topic tries to put type checking in the parameter types, which comes with the downside of not being compatible with anytype parameters.

This is why people have experimented with putting type constraint checking in the return type instead:
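The linked post isn’t reproduced here, so the following is only a guess at the general shape of the idea, not that post’s code. `RequiresFoo` and `callDoFoo` are hypothetical names, and the check is simplified to a single `@hasDecl` lookup:

```zig
const std = @import("std");

/// Hypothetical return-type constraint: yields `Ret` only when `T` declares `doFoo`.
fn RequiresFoo(comptime T: type, comptime Ret: type) type {
    if (!@hasDecl(T, "doFoo")) {
        @compileError("`" ++ @typeName(T) ++ "` must declare `doFoo`.");
    }
    return Ret;
}

/// The parameter stays `anytype`; the constraint is evaluated while the
/// return type is resolved for each distinct argument type.
fn callDoFoo(arg: anytype, text: []const u8) RequiresFoo(@TypeOf(arg), bool) {
    return @TypeOf(arg).doFoo(text);
}

const Foo = struct {
    foo: bool = false,

    pub fn doFoo(arg: []const u8) bool {
        return std.mem.eql(u8, arg, "foo");
    }
};

pub fn main() void {
    const foo: Foo = .{};
    std.debug.print("doFoo -> {}\n", .{callDoFoo(foo, "foo")});
}
```

Compared with the parameter-type version, the caller no longer passes the Type separately, at the cost of the requirement being a little less visible next to the parameter it applies to.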

I completely forgot about that post, but reading through it (and seeing that I had already “liked” it) I remember really liking the approach. It just wasn’t applicable to my project at the time and I hadn’t grasped comptime quite as well yet.

Moving the constraint function to the return type is probably better than the one I proposed overall. Thanks for linking that!
