Comptime type inference

Given a generic function:

fn f(x: anytype) MightDependOn(@TypeOf(x)) {
    ...
}

how can we infer its result type given concrete parameter types?
There’s a standard trick:

fn infer(f: anytype, value: anytype) type {
    return @TypeOf(f(value));
}

and I believe this works, but it requires a value. What if you only have a type? I tried this:

fn infer(f: anytype, V: type) type {
    const value: V = undefined;
    return @TypeOf(f(value));
}

but it doesn’t always work. For example, if f can return types:

fn f(x: usize) type {
    return [x]u8;
}

then the above inference code will error like this:

error: use of undefined value here causes illegal behavior

Granted, in this case f was not generic so more direct inference is available, but it’s not difficult to embed it in a generic example.
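For instance, a sketch of such an embedding (g is a made-up wrapper, not from the original code):

```zig
const std = @import("std");

// The non-generic f above, wrapped in an anytype parameter so that
// direct inference from the signature is no longer available.
fn g(x: anytype) type {
    return [x]u8;
}

test "generic embedding" {
    // With a comptime-known argument this is fine...
    try std.testing.expect(g(4) == [4]u8);
    // ...but infer(g, usize) hits the same "use of undefined value"
    // error, because computing the result type requires evaluating
    // [undefined]u8.
}
```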

I think it’s because you’re basically trying to do [undefined]u8.

fn infer(f: anytype, V: type) type {
    const value: V = undefined; // <-- here you define a comptime-known value as undefined
    return @TypeOf(f(value));
}

// So your 'f' function is resolved as (pseudo-code):
fn f(x = undefined) type {
    return [undefined]u8; // <-- this is illegal behavior
}

You can reproduce this simply by doing:

// example.zig
pub fn main() void {
    const a: [undefined]u8 = undefined;
    _ = a;
}

If you run zig run example.zig, you get the exact same error message.

1 Like

According to the documentation of std.builtin.Type.Fn, its return_type field will no longer be optional in the future. I can't imagine how it could be made non-optional, unless perhaps in the future an expression itself could become a type value.
At least for now, I see no reason not to just use MightDependOn(T)
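For illustration, the direct route looks like this (MightDependOn here is a stand-in implementation, since the original wasn't shown):

```zig
const std = @import("std");

// Stand-in for the question's hypothetical MightDependOn.
fn MightDependOn(comptime T: type) type {
    return if (T == u8) u16 else T;
}

fn f(x: anytype) MightDependOn(@TypeOf(x)) {
    return x;
}

test "call the type function directly" {
    // No value needed: the return-type computation is just a function
    // of the type, so call it with the type.
    try std.testing.expect(MightDependOn(u8) == u16);
    try std.testing.expect(@TypeOf(f(@as(u8, 1))) == u16);
}
```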

1 Like

That’s right:

[undefined]u8

is definitely an illegal type. But the idea was that the compiler would be able to compute

@TypeOf(f(v))

without evaluating f(v). And it definitely does in some cases, for example

fn f(x: anytype) @TypeOf(x) {
    @panic("nope");
}

is amenable to this kind of inference as long as the type of x you infer against is not comptime-only.
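A minimal sketch of that claim (the test is made up here, but matches the experiments later in the thread):

```zig
const std = @import("std");

fn f(x: anytype) @TypeOf(x) {
    @panic("nope");
}

test "inference without evaluation" {
    // The result type of f against u32 is resolved from the signature
    // alone, so the @panic never fires.
    const x: u32 = 0;
    try std.testing.expect(@TypeOf(f(x)) == u32);
}
```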

1 Like

The behavior I’m observing is consistent with the following claim:
When evaluating the expression

@TypeOf(f(x))

at compile time (necessarily), the compiler first analyzes f to compute its return type against values of type @TypeOf(x); call it R. If the result of this analysis is not a comptime-only type (type, comptime_int, etc.) then it directly evaluates the whole expression to R without having to actually compute f(x). But if R is comptime-only, then the compiler doesn't do this optimization, and it actually computes f(x).

Anyone familiar enough with the compiler implementation to know whether this sounds correct and if maybe perhaps the optimization could be extended to comptime-only return types? Or, better yet, a way to directly expose the result R of the analysis, which is what I’m trying to emulate anyway.

I think you may want more type inference than Zig intends you to have.
To me it seems like Zig tries to encourage people to just write the comptime functions that compute the return type of a function explicitly, instead of trying to get around that and have that inferred instead.

So I wonder if this is a case where you are trying to do something, the language tries to steer you away from.

I think one reason to avoid unbounded inference in a language is that it would allow writing programs that don't contain any types anymore (or big chains of inferred types), so a small change somewhere could lead to type changes somewhere else, and the cause could be quite difficult to track down. So I think Zig limits inference so that these chains can't form; otherwise it would be possible to create code whose types you can't know without deep analysis.

Having these manually written type computation functions also puts a limit to the complexity of the code that is encountered by the compiler, making the job that needs to be done by the compiler simpler (probably making it simpler to make the compiler fast).

I don’t know the details of how it is implemented, but one reason why I think it has to run f(x) is that that function could hit a @compileError, panic at comptime, exceed an @setEvalBranchQuota limit, etc.

That said, would be interesting to hear the perspective of someone familiar with the internals.

1 Like

Let me add that I’m on board with Zig’s priorities, the point about type explosion is well taken, and I wouldn’t want this feature if it requires compromises. I’m just trying to understand if that’s really the case.
I don’t think the compiler literally has to execute function bodies to trigger @compileError. Just branch analysis is enough. This is surely the case, because we know that something like

fn f() void {
    @panic("!");
}

compiles, so it must not be called at compile time, but

fn f() void {
    @compileError("!");
}

won’t compile. If the former is not called at compile time, then I can’t imagine the latter would be, so how would the compiler know to treat them differently?
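A minimal pair illustrating the claim (both functions are made up for illustration):

```zig
const std = @import("std");

fn panics() void {
    @panic("!"); // fine: this is a runtime trap, it only fires if executed
}

test "panic body still compiles" {
    // Referencing the function forces it to be analyzed, yet @panic
    // in the body causes no compile error.
    try std.testing.expect(@TypeOf(panics) == fn () void);
}

// By contrast, a body containing @compileError("!") fails as soon as
// the function is semantically analyzed.
```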

1 Like

Hmm, yes I think you are right with that.

@compileError triggers when it is reached during semantic analysis, but comptime / compile time execution of functions also happens during semantic analysis.

From:

After we have generated all of our ZIR, then we have Sema. This is the heart of the compiler – it’s the stage that performs semantic analysis, which includes type checking, comptime code execution, most error messages, etc. Sema interprets the ZIR which AstGen emitted and turns it into AIR (Analyzed Intermediate Representation), a much more simple and low-level IR which is sent to the code generator. CodeGen is actually interleaved with Sema: after a function is semantically analyzed, it’s immediately sent to the code generator.

For the rest read the linked post. So I think I would rephrase:

I think it runs whatever comptime parts there are within the function (if there are any).

Not completely sure, that is just my best guess at the moment.
I think for a better answer I would have to dig into the implementation myself. (Which I want to do eventually, but haven’t done yet)

1 Like

Changed the category from ‘Help’ to ‘Explain’ as it seems more appropriate now.

I did some more experimenting and thought I would share my findings.

To start with, I was exploring the behavior of value-based inference by calling this function:

fn infer(f: anytype, value: anytype) type {
    return @TypeOf(f(value));
}

I found that if ‘value’ is not comptime-known then you will always get a compiler error:

error: unable to resolve comptime value

even in cases where using a comptime-known ‘value’ of the same type wouldn’t result in a compiler error.
However, I found that if you inline it:

inline fn inferInline(f: anytype, value: anytype) type {
    return @TypeOf(f(value));
}

then, with a non-comptime-known ‘value’, it sometimes doesn’t error, but not always. The behavior of infer and inferInline is the same when ‘value’ is comptime-known.

My other main finding is: it seems to be genuinely true that a distinction is made based on whether the type that f returns against some particular input type is comptime-only (type, comptime_int etc) or not. In the former case, f will actually be called at comptime, while in the latter, it will not.

test "simple concrete return type" {
    const Test = struct {
        fn f(value: anytype) u32 { // concrete return type
            _ = value;
            @panic("you shall not pass!");
        }
    };
    try std.testing.expect(infer(Test.f, void) == u32);
    try std.testing.expect(inferInline(Test.f, void) == u32);

    try std.testing.expect(infer(Test.f, 0) == u32);
    try std.testing.expect(inferInline(Test.f, 0) == u32);

    const x: u32 = 0;
    try std.testing.expect(infer(Test.f, x) == u32);
    try std.testing.expect(inferInline(Test.f, x) == u32);

    // try std.testing.expect(infer(Test.f, std.testing.random_seed) == u32); // error: unable to resolve comptime value
    try std.testing.expect(inferInline(Test.f, std.testing.random_seed) == u32);

    try std.testing.expect(Test == Test);
}

test "simple comptime-only return type" {
    const Test = struct {
        fn f(x: anytype) type { // comptime-only return type
            _ = x;
            @panic("you shall not pass!");
        }
    };
    // try std.testing.expect(infer(Test.f, void) == u32); // error: encountered @panic at comptime
    // try std.testing.expect(inferInline(Test.f, void) == u32); // error: encountered @panic at comptime

    // try std.testing.expect(infer(Test.f, 0) == u32); // error: encountered @panic at comptime
    // try std.testing.expect(inferInline(Test.f, 0) == u32); // error: encountered @panic at comptime

    // const x: u32 = 0;
    // try std.testing.expect(infer(Test.f, x) == u32); // error: encountered @panic at comptime
    // try std.testing.expect(inferInline(Test.f, x) == u32); // error: encountered @panic at comptime

    // try std.testing.expect(infer(Test.f, std.testing.random_seed) == u32); // error: unable to resolve comptime value
    // try std.testing.expect(inferInline(Test.f, std.testing.random_seed) == u32); // error: unable to resolve comptime value

    try std.testing.expect(Test == Test);
}

test "dependent return type" {
    const Test = struct {
        fn f(x: anytype) @TypeOf(x) { // whether it's comptime-only varies with x itself
            @panic("you shall not pass!");
        }
    };
    // try std.testing.expect(infer(Test.f, void) == u32); // error: encountered @panic at comptime
    // try std.testing.expect(inferInline(Test.f, void) == u32); // error: encountered @panic at comptime

    // try std.testing.expect(infer(Test.f, 0) == u32); // error: encountered @panic at comptime
    // try std.testing.expect(inferInline(Test.f, 0) == u32); // error: encountered @panic at comptime

    const x: u32 = 0;
    try std.testing.expect(infer(Test.f, x) == u32);
    try std.testing.expect(inferInline(Test.f, x) == u32);

    // try std.testing.expect(infer(Test.f, std.testing.random_seed) == u32); // error: unable to resolve comptime value
    try std.testing.expect(inferInline(Test.f, std.testing.random_seed) == u32);

    try std.testing.expect(Test == Test);
}

test "dependent return type switcheroo" {
    const Test = struct {
        fn f(x: anytype) if (@TypeOf(x) == type) u32 else comptime_int { // naturally
            @panic("you shall not pass!");
        }
    };
    try std.testing.expect(infer(Test.f, void) == u32);
    try std.testing.expect(inferInline(Test.f, void) == u32);

    // try std.testing.expect(infer(Test.f, 0) == u32); // error: encountered @panic at comptime
    // try std.testing.expect(inferInline(Test.f, 0) == u32); // error: encountered @panic at comptime

    // const x: u32 = 0;
    // try std.testing.expect(infer(Test.f, x) == u32); // error: encountered @panic at comptime
    // try std.testing.expect(inferInline(Test.f, x) == u32); // error: encountered @panic at comptime

    // try std.testing.expect(infer(Test.f, std.testing.random_seed) == u32); // error: unable to resolve comptime value
    // try std.testing.expect(inferInline(Test.f, std.testing.random_seed) == u32); // error: unable to resolve comptime value

    try std.testing.expect(Test == Test);
}

1 Like

My conclusions:

  1. If the return type of f against some input type V (i.e. the sought-for result of the call to infer / inferInline) is a comptime-only type, then it must be resolved, and in order to resolve it, f must be called.
  2. If f must be called and ‘value’ is not comptime-known, we of course get ‘error: unable to resolve comptime value’.
  3. Since the return type of infer is comptime-only, to compile it you must call it (see point 1. again), hence it fails on not-comptime-known values.
  4. inferInline calls get inlined before resolution, so you shed a whole ‘this function returns type’ barrier. @TypeOf is not a simple function that returns types, it’s a compiler directive.
  5. If the return type of f against some input type V is not a comptime-only type, then we don’t have to resolve it, and we don’t have to call f.
  6. The compiler does actually compute the return type of f against any relevant input types V before attempting to resolve it (in the case that it’s comptime-only). That’s the only way I can explain the behavior in the two dependent return type tests. But, the type resolution is required to happen before it can assign a value to the @TypeOf. To me, it seems like it has enough information to supply a value to the @TypeOf even if resolution fails. (That doesn’t mean it’s a good idea!).

The compile error is because type is actually always a comptime value. Your infer returns a comptime value, so its argument cannot be a runtime value.
inferInline allows runtime values, which is actually due to the special behavior of @TypeOf. This “runtime value” isn’t actually a true runtime value. For example, if you enter an expression with side effects into @TypeOf, it won’t actually be evaluated and produce the side effects. Therefore, the “runtime value” here is essentially a compile-time “expression literal,” whose type is determined by compile-time “peer type resolution.”
However, this unevaluated “expression literal” is a feature unique to @TypeOf and cannot be used elsewhere in the language.

1 Like

Thanks!
Can you explain why the expression

@TypeOf([x]u8)

(whose value is ‘type’) cannot be resolved if x is known only at runtime, or if it’s ‘undefined’ at comptime?

The length of an array is part of its distinct type, and types must all be resolved at comptime. Therefore your example of [x]u8 where x is a runtime value is impossible; the compiler needs to know the length in order to resolve the type at comptime. Slices are the closest analog in this scenario that can be created with runtime-known lengths.
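A small sketch of that contrast (n and buf are made up for illustration):

```zig
const std = @import("std");

test "runtime length needs a slice, not an array" {
    var n: usize = 3;
    n += 1; // ensure n is runtime-known
    var buf: [16]u8 = undefined;
    buf[0] = 'a'; // touch buf so it is genuinely mutated
    // const a: [n]u8 = undefined; // would not compile: an array length
    //                             // must be comptime-known
    const s: []u8 = buf[0..n]; // fine: a slice's length is a runtime value
    try std.testing.expect(s.len == 4);
    try std.testing.expect(@TypeOf(s) == []u8);
}
```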

3 Likes

That’s certainly true, but my question is a little subtler.
As npc1054657282 says, in many cases the expression

@TypeOf(<expression>)

is able to be evaluated without evaluating <expression>. In fact, this capability is the raison d’être of @TypeOf: for example we can evaluate (at comptime)

@TypeOf(std.testing.random_seed)

even though the inner expression cannot be evaluated.

Consider expressions of the form

switch (std.testing.random_seed) {
    1234 => <leg0>,
    else => <leg1>,
}

What does the compiler do when you stick that in a @TypeOf? Does it bail just because it doesn’t know the value of std.testing.random_seed? Not a chance. It doesn’t know which leg to use, but it does know that the result of the @TypeOf can be computed from the types of the legs, so it goes and tries to work them out. Proof: compiling

@TypeOf(switch (std.testing.random_seed) {
    1234 => std.testing.random_seed,
    else => @as(f32, 3.141),
})

gives an error:

src/root.zig:576:36: error: incompatible types: 'u32' and 'f32'
    try std.testing.expect(@TypeOf(switch (std.testing.random_seed) {
                                   ^~~~~~
src/root.zig:577:28: note: type 'u32' here
        1234 => std.testing.random_seed,
                ~~~~~~~~~~~^~~~~~~~~~~~
src/root.zig:578:17: note: type 'f32' here
        else => @as(f32, 3.141),
                ^~~~~~~~~~~~~~~

which is expected: the compiler has worked out the types of the legs, and now helpfully tells us that they’re different, no dice. A nice glimpse into the inner workings.

Let’s fix this:

@TypeOf(switch (std.testing.random_seed) {
    1234 => @as(f32, @floatFromInt(std.testing.random_seed)),
    else => 3.141,
})

Now the compiler is happy, and the value of the @TypeOf expression is f32.

OK, let’s run the same experiment where we allow our legs to be of comptime-only type.

@TypeOf(switch (std.testing.random_seed) {
    1234 => u32,
    else => |x| x,
})

This gives the exact same sort of error as when the legs had types u32 and f32:

src/root.zig:576:36: error: incompatible types: 'type' and 'u32'
    try std.testing.expect(@TypeOf(switch (std.testing.random_seed) {
                                   ^~~~~~
src/root.zig:577:17: note: type 'type' here
        1234 => u32,
                ^~~
src/root.zig:578:21: note: type 'u32' here
        else => |x| x,
                    ^

only now, one of the legs has type type.
OK, so just like before, the compiler can see the types of the legs, one happens to be comptime-only, that’s cool, anyway we can’t proceed because they’re different. Let’s help it out and replace one of them with a value of type noreturn, which coerces to any type (including comptime-only apparently):

@TypeOf(switch (std.testing.random_seed) {
    1234 => u32,
    else => unreachable,
})

This compiles just fine and evaluates to type.
Now let’s get a little more ambitious and make that leg be of the right type, namely type. Should be fine, right?

@TypeOf(switch (std.testing.random_seed) {
    1234 => u32,
    else => f32,
})

Wrong!

src/root.zig:590:36: error: value with comptime-only type 'type' depends on runtime control flow
    try std.testing.expect(@TypeOf(switch (std.testing.random_seed) {
                                   ^~~~~~
src/root.zig:590:55: note: runtime control flow here
    try std.testing.expect(@TypeOf(switch (std.testing.random_seed) {
                                           ~~~~~~~~~~~^~~~~~~~~~~~
src/root.zig:590:36: note: types are not available at runtime

It knows the legs have the same type, it even tells us right here that any evaluation of the expression would be of type type. But, unlike the not-comptime-only case, it can’t skip to the @TypeOf expression. It demands a value for the inner expression, and everybody agrees that it can’t have that.

Note that in none of my testing am I actually instantiating <expression> at comptime (i.e. I’m not writing const x = <expression> at top level or anything like that); I’m just doing

test "<expression>" {
    _ = @TypeOf(<expression>);
}

My high-level read on this situation is this.
Compiler expression type analysis pipeline:

  1. Type inference: Compute type of expression (using Peer Type Resolution if necessary).
  2. Comptime-only value resolution: If the result is a comptime-only type, evaluate expression.

@TypeOf implementation:

  1. Perform type analysis on expression (both steps)
  2. Get type from analyzed expression

If this summary is correct, my question really boils down to this:
Logically, @TypeOf does not need the type analysis pipeline to do comptime-only value resolution, it only needs it to do type inference. Why not skip step 2 in that case?

I’m asking from a design choices perspective, not a practicality perspective - I can guess that actually changing the compiler to behave this way, even if it were viable, would be a lot of effort for basically zero reward. Probably nobody (including me) would ever use it.