Implementing Generic Concepts on Function Declarations

– edited – See below (post 4) for an updated version of this using function prototyping –

I’ve been thinking about how to implement more transparent constraints for anytype parameters, and in particular about ways to expose them as part of a function declaration. So I came up with an example that uses conditional statements in the return type.

Basically, the first argument is evaluated as the constraint, and the second argument is evaluated as the result type. This has some similarity to “Enable If” in C++…


const std = @import("std");

fn RequiresInteger(comptime constraint: type, comptime result: type) type {    
    return switch (@typeInfo(constraint)) {
        .Int => { 
            return result;
        },
        else => { 
            @compileError("Constraint: must be integer type.");
        }
    };
}

fn genericFunction(x: anytype) RequiresInteger(@TypeOf(x), bool) {
    return true;
}

test "Successful Requirement" {
    const x: usize = 0;
    std.debug.assert(genericFunction(x));
}

test "Failed Requirement" {
    const x: struct { } = undefined;
    std.debug.assert(genericFunction(x));
}

Now, this is a very simple example, but the Requires function can be arbitrarily more complicated. You can check for all kinds of things with Zig’s reflection capabilities.
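
As a rough illustration of a slightly more involved requirement, here is a minimal sketch that also inspects the integer's signedness and bit width. The names RequiresUnsigned and countOnes are made up for this example, and the code assumes the same Zig version as the rest of this thread (capitalized @typeInfo tags such as .Int; newer releases rename them).

```zig
const std = @import("std");

// Hypothetical constraint: accepts only unsigned integers that are at least
// `min_bits` wide, and reports the offending type in the error message.
fn RequiresUnsigned(comptime constraint: type, comptime min_bits: u16, comptime result: type) type {
    return switch (@typeInfo(constraint)) {
        .Int => |info| {
            if (info.signedness != .unsigned) {
                @compileError("Constraint: must be unsigned, found " ++ @typeName(constraint));
            }
            if (info.bits < min_bits) {
                @compileError("Constraint: integer too narrow, found " ++ @typeName(constraint));
            }
            return result;
        },
        else => {
            @compileError("Constraint: must be integer type, found " ++ @typeName(constraint));
        },
    };
}

// The requirement is still visible in the declaration itself.
fn countOnes(x: anytype) RequiresUnsigned(@TypeOf(x), 8, usize) {
    return @popCount(x);
}

test "unsigned requirement" {
    try std.testing.expect(countOnes(@as(u32, 0b1011)) == 3);
}
```

Passing an i32 or a u4 to countOnes would stop compilation with the corresponding message instead of failing somewhere inside the body.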

The value of this is that it moves the requirements out of the function’s body and into the declaration. ZLS perfectly displays the full declaration with the requirement clause as the return, so it’s quite friendly to language tools - when typing genericFunction(), here is what ZLS shows:

fn genericFunction(x: anytype) RequiresInteger(@TypeOf(x), bool)

The error message is quite nice too:

main.zig:9:13: error: Constraint: must be integer type.
            @compileError("Constraint: must be integer type.");
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
main.zig:14:47: note: called from here
fn genericFunction(x: anytype) RequiresInteger(@TypeOf(x), bool) {

Thoughts?

That’s a really cool idea.

However I think it gets a bit unreadable in more complicated cases. Let’s say we want a struct with a few functions.
I think in this case it gets a bit convoluted and it’s hard to find the return type:

RequiresStructWithFunctions(.{.functionA, .functionB}, .{fn(@TypeOf(x)) bool, fn(@TypeOf(x)) usize}, @TypeOf(x), bool)

Would it be better if the return type was in the beginning? Maybe a structure like this would be more readable:

Returns(bool, requiresStructWithFunctions(.{.functionA, .functionB}, .{fn(@TypeOf(x)) bool, fn(@TypeOf(x)) usize}, @TypeOf(x)))

Where Returns would be:

fn Returns(comptime T: type, comptime _: void) type {
    return T;
}
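
To make the shape of this idea concrete, here is a minimal sketch of a requirement function that returns void and is passed as the second argument of Returns. The helper requiresDecls, the Example struct, and useBoth are all hypothetical names invented for this illustration, and the check only looks at declaration names, not at their signatures.

```zig
const std = @import("std");

// The second argument exists only so that the requirement is evaluated
// at compile time; its value is discarded.
fn Returns(comptime T: type, comptime _: void) type {
    return T;
}

// Hypothetical requirement: stops compilation and names the first
// declaration that T is missing. T is assumed to be a container type.
fn requiresDecls(comptime T: type, comptime names: []const []const u8) void {
    inline for (names) |name| {
        if (!@hasDecl(T, name)) {
            @compileError(@typeName(T) ++ " is missing declaration '" ++ name ++ "'");
        }
    }
}

const Example = struct {
    pub fn functionA(_: Example) bool {
        return true;
    }
    pub fn functionB(_: Example) usize {
        return 42;
    }
};

// The return type comes first, the requirement second.
fn useBoth(x: anytype) Returns(bool, requiresDecls(@TypeOf(x), &.{ "functionA", "functionB" })) {
    return x.functionA() and x.functionB() == 42;
}

test "Returns with a void requirement" {
    try std.testing.expect(useBoth(Example{}));
}
```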

I was just drafting a post (my first here) asking about constraining what shape of arrays a function accepts. Here’s some sample code for background (any improvements to it are very welcome):

// What can be put here to indicate [_][2] array is expected?
//            |
//            |
//           \|/
fn doIt(arr: anytype) @TypeOf(arr[0][0]) {
    const T = @TypeOf(arr[0][0]);
    var res: T = 0.0;
    for (arr) |xy| {
        res += xy[0] + xy[1];
    }
    return res / @as(T, arr.len);
}

const expect = @import("std").testing.expect;

test "averaging" {
    const arr = [_][2]f32{
        [_]f32{ 0, 0 },
        [_]f32{ 1, 2 },
        [_]f32{ 2, 4 },
        [_]f32{ 3, 6 },
        [_]f32{ 4, 8 },
    };

    try expect(doIt(arr) == 6);
    try expect(doIt(&arr) == 6);
}

anytype function args

I am very new to Zig and (coming from Python, Java, Rust, etc.) functions accepting anytype are not very informative in the small completion pop-ups in my editor. Concepts feel like the right idea, but if they are going to constrain the accepted args, shouldn’t they be written where the args’ type is specified, instead of where the return type is specified?

I’m glad you’re seeing that there’s something here worth looking at - I’m definitely open to moving things around because I can agree that it’s hard to see the return type in the example you provided. In fact, your example is very similar to the next idea I had which allows for concept composition:


First, we can declare a function prototype like so. In this case, it’s just a high-level wrapper for the return type and preconditions…

fn Prototype(comptime result: type, comptime constraint: bool) type {
    if (!constraint){ 
        @compileError("Failed function prototype constraints.");
    }
    return result;
}

This function isn’t strictly necessary but for the sake of clarity here it is…

fn Returns(comptime result: type) type {
    return result;
}

We can make two different concepts here…

fn isInteger(comptime T: type) bool {    
    return switch (@typeInfo(T)) {
        .Int => {  return true; }, else => {  return false; }
    };
}

fn isStruct(comptime T: type) bool {    
    return switch (@typeInfo(T)) {
        .Struct => {  return true; }, else => {  return false; }
    };
}

And all of that leads up to function prototyping that allows for concept composition like so…

fn genericFunction(x: anytype) Prototype(
    Returns(bool), isInteger(@TypeOf(x)) or isStruct(@TypeOf(x))
){
    return true;
}

I think there’s something here if we can keep work-shopping it!

Yeah, that one is even better and more flexible.
Sadly, though, this means we lose information about which of the conditions failed.
That might be fine though.
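
One possible way to recover part of that information, sketched here under the assumption that the caller is willing to spell out the expectation by hand: a variant of Prototype that takes a message. PrototypeMsg is a hypothetical name, not something proposed in this thread, and it still cannot say which individual condition failed; it only reports the combined expectation and the offending type.

```zig
const std = @import("std");

// Hypothetical variant of Prototype that carries a caller-supplied message.
fn PrototypeMsg(comptime result: type, comptime constraint: bool, comptime message: []const u8) type {
    if (!constraint) {
        @compileError("Failed function prototype constraints: " ++ message);
    }
    return result;
}

fn isInteger(comptime T: type) bool {
    return @typeInfo(T) == .Int;
}

fn isStruct(comptime T: type) bool {
    return @typeInfo(T) == .Struct;
}

fn genericFunction(x: anytype) PrototypeMsg(
    bool,
    isInteger(@TypeOf(x)) or isStruct(@TypeOf(x)),
    "expected an integer or a struct, found " ++ @typeName(@TypeOf(x)),
) {
    return true;
}

test "prototype with message" {
    try std.testing.expect(genericFunction(@as(usize, 0)));
}
```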

There are ways to do this in canonical Zig that show up in the standard library quite often. One way to do it is to use a slice for the first argument instead of a fixed size array and then just have another comptime parameter as the type. If you want more feedback on how to do it that way, I think opening a new thread would be the best way to get feedback on your specific issue.
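
For reference, a minimal sketch of that slice-plus-type-parameter style applied to the averaging example. The name average is chosen for this illustration, T is assumed to be a floating-point type, and @floatFromInt assumes Zig 0.11 or later.

```zig
const std = @import("std");

// The element type is an explicit comptime parameter, and the slice type
// states directly that every row must be a pair.
fn average(comptime T: type, pairs: []const [2]T) T {
    var res: T = 0;
    for (pairs) |xy| {
        res += xy[0] + xy[1];
    }
    // Divides by the number of rows, as the original doIt does.
    return res / @as(T, @floatFromInt(pairs.len));
}

test "averaging with an explicit element type" {
    const arr = [_][2]f32{
        .{ 0, 0 },
        .{ 1, 2 },
        .{ 2, 4 },
        .{ 3, 6 },
        .{ 4, 8 },
    };
    // A pointer to the array coerces to a slice of pairs.
    try std.testing.expect(average(f32, &arr) == 6);
}
```

The trade-off is that the caller has to name the element type, but the declaration now documents the expected shape without any anytype.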

Going off of the topic of this thread, we could prototype your function this way:

fn Prototype(comptime result: type, comptime constraint: bool) type {
    if (!constraint){ 
        @compileError("Failed function prototype constraints.");
    }
    return result;
}

fn isPair(comptime T: type) bool {
    return switch (@typeInfo(T)) {
        .Array => |arr| {
            return (arr.len == 2);
        }, 
        else => return false
    };
}

fn doIt(arr: anytype) Prototype(
    @TypeOf(arr[0][0]), isPair(@TypeOf(arr[0]))
){
    const T = @TypeOf(arr[0][0]);
    var res: T = 0.0;
    for (arr) |xy| {
        res += xy[0] + xy[1];
    }
    return res / @as(T, arr.len);
}

Yeah I was thinking about that. You know, we could have a mix of hard and soft constraints.

A hard constraint is one that raises a compile error with a direct message or returns true. That way, we could have messages if we really wanted to enforce something specific.

Otherwise, a soft constraint could be one that just returns bool without raising a compile error. This would allow for further concept composition to occur and would not necessarily stop the compilation if the concept isn’t fulfilled.

That way, you could easily have a mix of both.
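
A minimal sketch of what that mix might look like, with invented names (requireInteger, integersOnly, numbersOnly) and the thread's Zig version assumed for the @typeInfo tags. The soft concepts only answer yes or no; the hard one wraps a soft concept and adds a specific message.

```zig
const std = @import("std");

fn Prototype(comptime result: type, comptime constraint: bool) type {
    if (!constraint) {
        @compileError("Failed function prototype constraints.");
    }
    return result;
}

// Soft constraints: only report whether the concept holds.
fn isInteger(comptime T: type) bool {
    return @typeInfo(T) == .Int;
}

fn isFloat(comptime T: type) bool {
    return @typeInfo(T) == .Float;
}

// Hard constraint (hypothetical): returns true or stops compilation
// with a message that names the offending type.
fn requireInteger(comptime T: type) bool {
    if (!isInteger(T)) {
        @compileError("expected an integer type, found " ++ @typeName(T));
    }
    return true;
}

fn integersOnly(x: anytype) Prototype(@TypeOf(x), requireInteger(@TypeOf(x))) {
    return x;
}

// Soft constraints compose with `or`; a hard one would fail before the
// second alternative is ever considered.
fn numbersOnly(x: anytype) Prototype(@TypeOf(x), isInteger(@TypeOf(x)) or isFloat(@TypeOf(x))) {
    return x;
}

test "hard and soft constraints" {
    try std.testing.expect(integersOnly(@as(u8, 3)) == 3);
    try std.testing.expect(numbersOnly(@as(f32, 1.5)) == 1.5);
}
```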

But wouldn’t that add visual noise to the code? I assume you’d probably need both variants for many functions, leading to isIntegerSoft and isIntegerHard or something like that.
Additionally, I think it would cause confusion: for example, when you accidentally use isIntegerHard(...) or isStructHard(...) instead of the soft variants, you will get a compiler error.

I mean, it might - the thing is, OR statements already kind of do what I was referring to here because it is a form of a soft constraint. Likewise, nothing stops someone from just raising a compile error wherever they’d like, so if they really wanted to it’s obviously not prohibited lol.

I’d like to see some more examples of where the readability becomes a bigger problem. Because the composition can occur internally to a function as well… like in the case I wrote above, you could just have:

isInteger(T) or isStruct(T) -> isIntegerOrStruct(T)

That way, embedded statements can be more easily addressed and named.
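
Spelled out as code, the naming step above could look like the following sketch. It reuses the Prototype, Returns, isInteger and isStruct helpers from earlier in the thread (written here in a shortened form) and assumes the same Zig version.

```zig
const std = @import("std");

fn Prototype(comptime result: type, comptime constraint: bool) type {
    if (!constraint) {
        @compileError("Failed function prototype constraints.");
    }
    return result;
}

fn Returns(comptime result: type) type {
    return result;
}

fn isInteger(comptime T: type) bool {
    return @typeInfo(T) == .Int;
}

fn isStruct(comptime T: type) bool {
    return @typeInfo(T) == .Struct;
}

// The composed concept gets its own name, so the declaration stays short.
fn isIntegerOrStruct(comptime T: type) bool {
    return isInteger(T) or isStruct(T);
}

fn genericFunction(x: anytype) Prototype(
    Returns(bool), isIntegerOrStruct(@TypeOf(x))
) {
    return true;
}

test "named composition" {
    const S = struct { a: u8 };
    try std.testing.expect(genericFunction(@as(i32, -1)));
    try std.testing.expect(genericFunction(S{ .a = 1 }));
}
```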

nothing stops someone from just raising a compile error

Yeah but it is about convention. Like how do I know which function is throwing a compile-error as a hard constraint? And which function can be composed with others?
Let’s say I have a function

fn doSomething(x: anytype) Prototype(
    Returns(void), isInteger(@TypeOf(x))
) {...}

And want to modify it, so it also accepts floats:

fn doSomething(x: anytype) Prototype(
    Returns(void), isInteger(@TypeOf(x)) or isFloat(@TypeOf(x))
) {...}

Then boom, compiler error because isInteger was a hard constraint.

Right, I’m agreeing with you. I could have been more clear about that :)

In a sense, given a function prototype, if we only have one constraint, that constraint is a hard constraint. So for instance:

Prototype(…, isFloat(T)) → isFloat is a hard constraint because it must be satisfied.

Prototype(…, isFloat(T) or isInteger(T)) → isFloat is a soft constraint because it is optional.

So yeah, I’d agree with your point about keeping compile errors out of the concepts by convention because it raises the issue you’re referring to.

– edited for further clarification –

Regarding readability, I mean given what you have already brought up, I just want to see more examples of this in action to actually see the readability. I think your first example was good regarding the struct with functions and I’d like to take a crack at solving some more of these to see if this idea looks good in practice.

I just want to see more examples of this in action

I can try to find some examples of using anytype throughout my code.

I sometimes have cases where the return type already implicitly checks the restrictions. For example this vector dot-product:

pub fn dot(self: anytype, other: @TypeOf(self)) @typeInfo(@TypeOf(self)).Vector.child {
	return @reduce(.Add, self*other);
}

So the Prototype pattern would probably decrease readability here.

This one is a bit fancy: essentially it compares against a pointer to something that contains a .pos field of matching type, but that pointer may be optional, so this is implemented recursively:

pub fn equals(self: ChunkPosition, other: anytype) bool {
	if(@typeInfo(@TypeOf(other)) == .Optional) {
		if(other) |notNull| {
			return self.equals(notNull);
		}
		return false;
	} else if(@typeInfo(@TypeOf(other)) == .Pointer) {
		return std.meta.eql(self, other.pos); // other must have a .pos field.
	} else @compileError("Unsupported");
}

Probably something like that, but that doesn’t quite reflect the recursive nature:

pub fn equals(self: ChunkPosition, other: anytype) Prototype(
    Returns(bool),
    isOptional(@TypeOf(other))
    or (
        isPointer(@TypeOf(other))
        and hasFieldOfType(@TypeOf(other), .pos, ChunkPosition)
    )
) {

It is not obvious from the call signature that the optional would need to be an optional pointer with the given struct field.


These are all the problematic cases I could find in my code. Apart from that I usually just have simple requirements, like integers or tuples.
In one case (a JSON parser) I have a lot of possible types, but that would just be some work writing down all the possible cases.
One case that might be more complicated is the use of reader/writer, but I think an isReader/isWriter function would make sense there.
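
A very rough sketch of what an isReader concept could check, assuming that "reader" simply means "a container type, or a pointer to one, that declares read". That definition, along with CountingReader and readSome, is an assumption made for this example and does not reflect how std.io defines readers; the @typeInfo tags follow the Zig version used in this thread.

```zig
const std = @import("std");

fn Prototype(comptime result: type, comptime constraint: bool) type {
    if (!constraint) {
        @compileError("Failed function prototype constraints.");
    }
    return result;
}

// Hypothetical concept: looks through pointers, then asks whether the
// container declares something called `read`. The signature is not checked.
fn isReader(comptime T: type) bool {
    return switch (@typeInfo(T)) {
        .Pointer => |ptr| isReader(ptr.child),
        .Struct, .Union, .Enum, .Opaque => @hasDecl(T, "read"),
        else => false,
    };
}

// Toy reader that fills the destination with increasing byte values.
const CountingReader = struct {
    next: u8 = 0,

    pub fn read(self: *CountingReader, dest: []u8) usize {
        for (dest) |*b| {
            b.* = self.next;
            self.next +%= 1;
        }
        return dest.len;
    }
};

fn readSome(reader: anytype, dest: []u8) Prototype(usize, isReader(@TypeOf(reader))) {
    return reader.read(dest);
}

test "isReader" {
    var r = CountingReader{};
    var buf: [4]u8 = undefined;
    try std.testing.expect(readSome(&r, &buf) == 4);
    try std.testing.expect(buf[3] == 3);
    try std.testing.expect(!comptime isReader(u32));
}
```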

Thanks for digging these up!

Yes, I think your “isReader/isWriter” point is spot on. I can picture this being more useful for high-level interfaces, such as isForwardIterator, etc…

One example I have is for implementing something like the strategy/factory pattern where we inject dependencies into a builder that returns a struct with our desired components. Likewise, iterator interfaces and general compound types that need to have several fields in one spot would help. I also think this is helpful in cases where *anyopaque member variables are involved as well. Since we’re losing type information, we can add constraints to the interface if we so choose.

In terms of this:

pub fn dot(self: anytype, other: @TypeOf(self)) @typeInfo(@TypeOf(self)).Vector.child {
	return @reduce(.Add, self*other);
}

That return type is quite gnarly as is, so I don’t think much besides a comptime helper function to unpack that would be helpful. So for instance:

fn ElementType(comptime T: type) type {
    return @typeInfo(T).Vector.child;
}

But that essentially is its own constraint. It has to be a vector for that to even work so it’s probably not super useful here.

In your second example of equals, the only thing that comes to mind right now is the following…

fn hasPointer(comptime T: type) bool {
    return switch (@typeInfo(T)) {
        .Optional => |opt| { 
            return hasPointer(opt.child);
        },
        .Pointer => { 
            return true;
        },
        else => false        
    };
}

And in the .Pointer segment, you could add your concepts to form the pointer constraint. Of course, we’d want to rename that concept at that point, but the general point is still there.

– edited to finish the example –

So for the equals concept, it could be like this…

fn hasPointerToChunkPosition(comptime T: type) bool {
    return switch (@typeInfo(T)) {
        .Optional => |opt| { 
            return hasPointerToChunkPosition(opt.child);
        },
        .Pointer => |ptr| { 
            return hasFieldOfType(ptr.child, .pos, ChunkPosition);
        },
        else => false        
    };
}
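
The hasFieldOfType helper is used above but never defined in this thread, so here is one way it might be written. This is a sketch under assumptions: the field name is taken as an enum literal to match the .pos call sites, only struct types are considered, and ChunkPosition and Chunk are stand-in definitions invented for the test. @Type(.EnumLiteral) and the capitalized tags follow the Zig version used in this thread.

```zig
const std = @import("std");

// Hypothetical implementation: true when T is a struct with a field named
// like the enum literal `field` whose type is exactly F.
fn hasFieldOfType(comptime T: type, comptime field: @Type(.EnumLiteral), comptime F: type) bool {
    return switch (@typeInfo(T)) {
        .Struct => |info| {
            inline for (info.fields) |f| {
                if (comptime (std.mem.eql(u8, f.name, @tagName(field)) and f.type == F)) {
                    return true;
                }
            }
            return false;
        },
        else => false,
    };
}

// Stand-in types for the example.
const ChunkPosition = struct { x: i32, y: i32, z: i32 };
const Chunk = struct { pos: ChunkPosition };

fn hasPointerToChunkPosition(comptime T: type) bool {
    return switch (@typeInfo(T)) {
        .Optional => |opt| {
            return hasPointerToChunkPosition(opt.child);
        },
        .Pointer => |ptr| {
            return hasFieldOfType(ptr.child, .pos, ChunkPosition);
        },
        else => false,
    };
}

test "hasPointerToChunkPosition" {
    try std.testing.expect(comptime hasPointerToChunkPosition(*Chunk));
    try std.testing.expect(comptime hasPointerToChunkPosition(?*const Chunk));
    try std.testing.expect(!comptime hasPointerToChunkPosition(*u32));
    try std.testing.expect(!comptime hasPointerToChunkPosition(Chunk));
}
```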

I wanted to post a link to a project that does something similar to this idea and is much more fleshed out.

I was not aware of this library at the time of making this post. It takes a different syntactical approach but has some interesting features. Check it out if you are interested in this stuff!

This is blowing my mind

I made something similar recently. I didn’t end up putting the concept/trait requirements in the function declaration, but it would be interesting to see if there is a way to do it that is still readable.

@permutationlock, thanks for sharing your work!

This will be a long response, but here we go…

Readability is both the cause and the requirement for this issue. Fundamentally, anytype creates an opaque interface at the declaration level - an LSP that shows you function declarations cannot give you any more information beyond “it’s a type”.

This has a long history outside of Zig - a famous example from C++ is:

template<typename T, typename R> R foo(T&& x);

And, I would argue, this is only getting worse. From cppreference, this is a 2023 definition of std::optional’s transform: std::optional<T>::transform - cppreference.com

template<class F> constexpr auto transform(F&& f) &;

Now, they have a mechanism with templates called “terse syntax” with auto which was a step in the right direction:

void foo(SomeConcept auto x); // SomeConcept stands in for any named concept

At least now, you could tell that x needs to satisfy the concept. This is very helpful because an LSP will show you this at the call site as you’re typing.

I had a great conversation with @booniepepper yesterday, and we drafted a small (but useful) tweak that was added to the 1.0 milestones the same day. It should help the compiler print more helpful invocation messages (thank you @booniepepper for the example that made it into the issue): Provide full function invocation during comptime. · Issue #17402 · ziglang/zig · GitHub

We talked about the “scope” of the problem we’re trying to solve. Fundamentally, the idea behind explicit constraints at the call site is that we would hopefully get fewer compile errors in general, because the programmer knows what should go into a function in the first place. In other words, the compile error itself isn’t my explicit target.

Another benefit is of course avoiding a world where the only indicator is implicit duck-typing (if you have a variable named count, we’ll just go ahead and use it). Ironically, that’s how many of the constraints still work - we just put it upfront so you can read it in one place.

“Announce that a duck is required and you’ll have less compile errors when a duck shows up.”
~ Lao Tzu (probably)

The issue I am interested in is first contact: the first time I read a function declaration, do I know what this thing needs? This helps greatly with documentation. If I see that something requires a “readable” object, then I can go look up what constitutes readable.

Now, that said, I don’t think people want tremendous boilerplate to make sure you have a field called “my_int” in your struct, so it has to be manageable. Furthermore, as good at generics as Zig is, I don’t think generics are actually the heart of Zig. I think being a sensible C replacement is more in line with the goals, and generics happen to be a part of that strategy.

Here are some related issues that are still open.

@AndrewCodeDev Thanks for pointing me to those issues, I assumed that this must have been discussed a lot in the past but hadn’t seen those before. My library was just an exploration of what I could do with what is currently available and without changing the language features: I am not even advocating that type traits in the way I’ve implemented them are the way to go.

I’ll admit that I personally don’t usually use LSPs, so I wasn’t thinking from that perspective.

I tried using my library with the style presented here; we could have something like:

fn myGenericFunction(comptime T: type) Returns(
    MyReturnType,
    trait.implements(MyTrait).assert(T)
) {
    // function body
}

Where Returns is a hack like

pub fn Returns(comptime T: type, comptime _: void) type { return T; }

Combined with your proposal to print the full function declaration in compile time error messages, this could get close to having concepts/traits with nice error messages, without adding complexity to the language.

@permutationlock That’s another really cool idea. Very interesting stuff.

I’d like to workshop this a bit more, so I’ll come up with some more ideas and post here as we move forward.

One thing I’d recommend is changing the name “Returns” to “Contract” and then if the user wants, use the Returns function in one of my previous posts for added readability.

fn myGenericFunction(comptime T: type) Contract(
    trait.implements(MyTrait).assert(T),
    Returns(MyReturnType) // I personally like returns here
) {
    // function body
}

One of the reasons I like using boolean logic is that it naturally supports intuitive composition. I haven’t read much into your library so far, but how would you intend to handle or statements? I suppose you could create a custom object that has the or statement inside of it and use the meta/traits imports from std. That said, I like on-site composability because it reduces boilerplate.

I’ll think about it more. I think I have a nice idea in mind that would allow composition. I like the idea of changing the name to Contract and putting Returns there. It feels a little weird calling a function just for labeling, but then again all of this is a little hacky.