How does the compiler know two generic types are the same?

How does the compiler know two instances of a generic type (a type created by a function) are the same?

const std = @import("std");

pub const MyStruct = struct {
    data: std.BoundedArray(u8, 100),

    pub fn init() MyStruct {
        return MyStruct{
            // How does the compiler know this is the same bounded array type as the struct field?
            .data = std.BoundedArray(u8, 100){},
        };
    }
};

If this is based only on memoizing the function signature and args, couldn’t I defeat it by using a type-generating function that depends on a random number generator?

The compiler caches the result of comptime functions based on the input parameters. This is an important optimization to avoid having many duplicates of e.g. ArrayList(u8).
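As a quick illustration (a minimal sketch, assuming a reasonably recent Zig where type values can be compared with == at comptime), two calls with the same argument give back the very same type:

const std = @import("std");

comptime {
    // Both calls name one and the same struct type, so there is only
    // ever a single ArrayList(u8) in the whole program.
    std.debug.assert(std.ArrayList(u8) == std.ArrayList(u8));
}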

There are even easier ways to defeat this than random numbers. Since anonymous types are different each time, you can also just pass opaque{} in there. See for example C/C++ macro challenge #1: BOOST_PP_COUNTER - #4 by ianprime0509
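Here’s a rough sketch of the opaque{} trick (Wrap is a made-up helper, not anything from std):

const std = @import("std");

fn Wrap(comptime T: type) type {
    return struct {
        // Keep T as a declaration rather than a field, since opaque types have no size.
        pub const Inner = T;
    };
}

comptime {
    // Every `opaque {}` expression declares a brand-new type, so the two calls
    // below receive different arguments and give back different struct types.
    std.debug.assert(Wrap(opaque {}) != Wrap(opaque {}));
}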

Why would you want to defeat this though?

3 Likes

I’ve thought about this, and related matters, quite a bit actually. Using memorization for comptime parameters has a lot of bang for the buck, but I’ve come to believe that a deeper process is needed. I haven’t finished writing all of that up, though, or even thought it through completely.

Part of it all is that I don’t have a solid enough grasp of how comptime caching in type-returning functions works. There are limits on how state can be kept around during comptime, but it isn’t clear to me that those limits are strict enough to guarantee that type-returning functions are deterministic in their inputs, or how the compiler might detect the situation where that isn’t the case, if it even does.

I would just call this intended behavior. Might be useful for more than Ziggit challenges, even. Can’t think of how offhand, though.

But it should work in the sense that if you make an opaque{} A and assign it to B, then TypeFunction(A) and TypeFunction(B) should return the same type. So passing a fresh opaque{} on each call is just a special case of passing different types to a type-returning function, and getting distinct types back in return.
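A small sketch of that distinction (the body of TypeFunction below is made up; only the A/B behavior is the point):

const std = @import("std");

fn TypeFunction(comptime T: type) type {
    return struct {
        pub const Arg = T;
    };
}

comptime {
    const A = opaque {};
    const B = A; // B is just another name for the exact same type as A...
    std.debug.assert(TypeFunction(A) == TypeFunction(B)); // ...so the results match.

    // A fresh opaque{} is a different type, so the result differs too.
    std.debug.assert(TypeFunction(A) != TypeFunction(opaque {}));
}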

I think this post and its topic are required reading to get a deeper grasp of how comptime works:

1 Like

Misconception alert! :wink:

In modern Zig, function memoization is purely an optimization: it does not impact language semantics and will not be a part of the language specification.

Instead, the compiler determines whether types are equivalent by comparing two things:

  • Where are the types declared in source code?
  • What values do the type definitions “capture” – meaning, what is the value of any identifier used within the type’s definition but declared outside of it?

Here’s a simple example:

fn Foo(comptime val: anytype) type {
    const T = @TypeOf(val);
    return struct { x: T };
}

The struct declaration in this function captures the value of the local variable T, so whether two types Foo(x) and Foo(y) are equivalent depends on what T is in each case. For instance, Foo("hello") == Foo("world"), and Foo(1) == Foo(2), but Foo(1) != Foo(1.0).
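Those (in)equalities can be checked directly with comptime assertions, reusing the Foo above plus const std = @import("std");:

comptime {
    // "hello" and "world" are both *const [5:0]u8, so T is the same type in both calls.
    std.debug.assert(Foo("hello") == Foo("world"));
    // 1 and 2 are both comptime_int.
    std.debug.assert(Foo(1) == Foo(2));
    // 1 is comptime_int but 1.0 is comptime_float, so T differs and the types differ.
    std.debug.assert(Foo(1) != Foo(1.0));
}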

10 Likes

Right, I think the question here was how does the compiler know when memorization of parameters is or isn’t a valid optimization to apply. It is if the function is a deterministic function of its inputs, but Zig isn’t Haskell so that would need to be determined.

The term is actually “Memoization”

2 Likes

Memorization is the older term, and still used by a number of respectable authors, Roberto Ierusalimschy in particular. I prefer it.