Result Location Semantics

Result Location semantics added to the Language Reference, authored by @mlugg
This explains “the primary mechanism of type inference in the language”.

5 Likes

Note that RLS has never been called or referred to as “r-value semantics”, and if you’re referring to the concept of lvalue and rvalue as in C, those concepts don’t quite map neatly to Zig (or at least, their definitions become more subtle), and aren’t really related to RLS in any case. I think the term “r-value” in this post could be a bit confusing!

4 Likes

Yes, I was referring to the lvalue concept (but wrote rvalue) from CPL (the ancestor of BCPL, B, and C).
I removed it from the post. Yes, it was confusing, and the rvalue was wrong.

I’ve read through the linked section. The part about location semantics makes sense, but I find the table confusing/mystifying. 1. What does it mean that “x has no result location”? Does that just mean the expression doesn’t “move” x? 2. What does it mean that the Parent Expression is ptr? I get that a lot of those deal with pointers (or arrays, etc.), but not all of them do, or at least not obviously to me.

I ask not to be critical, but to understand better. I read about RLS occasionally here in the threads, but I want to understand what the big deal is.

The compiler must allocate a location to store the result (cannot optimize it using an existing location).

I cannot find it anywhere. Parent Result Type is used in the first table, and you can replace it with Result Type; there is no special meaning for “parent”.

2 Likes

This video explains them all:

  • parameter reference optimization
  • result location semantics

and their current problems.

Sorry, I meant Result Location in the second table.

In the second table, the Result Location column has two values: - and ptr.

  • “-” means nothing (e.g. in var val: T = x => result location of x is &val)
  • “ptr” is just a name for the first-column expression (e.g. in ptr = .{ .a = x } => result location of x is &ptr.a)
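A minimal self-contained sketch of those two rows (the names val and s are invented for illustration):

test "result location sketch" {
    const x: u32 = 42;
    var val: u32 = undefined;
    val = x; // “-”: the result location of x is &val
    var s: struct { a: u32 } = undefined;
    s = .{ .a = x }; // “ptr”: the result location of x is &s.a, so the
    // struct literal is written in place rather than copied from a temporary
    try @import("std").testing.expectEqual(val, s.a);
}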
1 Like

I read it, and I have to admit that example went right past me until I looked at it more than once. It’s hard to express the idea being presented, though. I don’t have a better way of saying what it’s trying to communicate.

I am curious about how we take advantage of the info to infer the result type without explicitly passing the type as an argument.

Built-in functions can know the result type, so it can be omitted as the first argument, as with @intCast().
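For example, a small sketch of how that plays out:

test "@intCast gets its destination type from the result location" {
    const big: u32 = 200;
    const small: u8 = @intCast(big); // destination type u8 comes from the declaration
    try @import("std").testing.expectEqual(@as(u8, 200), small);
}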

1 Like

This has potential to make certain comptime functions smoother. Something like this, which doesn’t currently work:

const std = @import("std");

fn cast(a: anytype) T { // error: T is not declared anywhere; that's the problem
    return @as(T, a);
}

test "cast mechanism" {
    const result: u64 = cast(@as(u8, 5));
    try std.testing.expectEqual(5, result);
}

It couldn’t be T, because it’s just a convention to use T for generic types; there could be a real T struct definition or whatever. So it would need to be a keyword; for the sake of argument:

fn cast(a: anytype) anyresult {
    return @as(anyresult, a);
}

This is a possible construct because of result-location semantics. I don’t know if anyresult is the clearest possible keyword here, but it tracks with other constructs in the language.

There’s a common pattern in comptime fn signatures where a type is passed in, usually as the first argument. For the common case where that type specifies the result type of the function, something like this could be used instead, leading to cleaner code.
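For concreteness, a sketch of that pattern (zeroInit and Vec3 are made up for illustration; std.mem.zeroes is real):

const std = @import("std");

const Vec3 = struct { x: f32, y: f32, z: f32 };

// the type parameter exists only to name the result type
fn zeroInit(comptime T: type) T {
    return std.mem.zeroes(T);
}

test "type passed explicitly" {
    // the caller must spell out Vec3 even when the context already knows it
    const v = zeroInit(Vec3);
    try std.testing.expectEqual(@as(f32, 0), v.x);
}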

Small side tangent… I remember seeing a discussion about deduced return types. I can’t remember if it was a GitHub issue or if someone mentioned it on here, but that idea has been kicked around. I believe (I could be wrong) that they are still considering it.

I think that’d be a great addition if it can be done smoothly. In a meta programming context, we often have to write something like:

pub fn foo(...) blk: { // code here to deduce type
pub fn foo(...) FooReturn(...) // helper function
pub fn foo(...) switch(...)
pub fn foo(...) if (...) T else U
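As a concrete (made-up) instance of the second form, the helper-function workaround tends to look like this:

fn FooReturn(comptime T: type) type {
    return struct { value: T, ok: bool };
}

// the return type must be computed by a separate comptime function
fn foo(comptime T: type, value: T) FooReturn(T) {
    return .{ .value = value, .ok = true };
}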

Actually, I believe it was something @marler8997 commented on.

3 Likes

Anybody know if there are any plans to make it smarter and see through some computations, or is that not possible? Even the simplest computations force wildly lengthy casts, or yet another line with another made-up variable just to do the cast first:

pub noinline fn f(x: i32, y: u32) i32 {
    const z: i32 = x + @as(i32, @intCast(y)); // all this for x + y
    return z;
}

If RLS was supposed to make code clearer and casting easier, it is failing horribly. With RLS came the decision to remove all the target cast types from the cast functions; @intCast(i32, y) would be soooo much nicer. RLS seems like it was added halfway, not taken advantage of, and left in a worst-of-both-worlds state.

I feel like it’s a negative at this point, even though I liked the idea of it originally. I feel like half my code is casting sometimes (not too far off the mark for some functions).

For a systems language, it can be really hard to do system-y stuff sometimes. I don’t think it uniquely gives any optimizations that can’t be done in other ways (e.g., RVO), and it definitely makes the code less readable.

4 Likes

RLS could be more sophisticated for sure. I hit this recently:

    var buf: [256]u8 = undefined;
    var buf_alloc = FixedBufferAllocator.init(&buf);
    const alloc = buf_alloc.allocator();

The only reason buf_alloc exists is because of the const default, so I have to bind it to a var instead of just writing this:

    var buf: [256]u8 = undefined;
    const alloc = FixedBufferAllocator.init(&buf).allocator();

Analysis of the whole expression would reveal that the chaining is perfectly legal. The compiler just needs to be a bit lazier about assigning constness to intermediate expressions.

This, however, is a downcast: half of the legal values of a u32 aren’t representable in an i32. C’s loose rules around promotion can lead to absolutely teeth-grinding bugs, which are difficult to surface and correct.

That said, downcasting does happen a lot. A builtin @downCast(i32, y) would be a good compromise here.

Having to do a cast on lossy type changes is totally fine. It’s the sheer amount of syntax involved in doing it. God help you if you have to cast two variables. @intCast(T, elem) would be fine, and that’s the way it used to work. But when RLS came along, all those type arguments got removed, and RLS isn’t sophisticated enough to see through the computation, i.e., const z: i32 = x + @intCast(y) doesn’t work (where x is i32 and y is u32).

I don’t know if there are technical reasons for RLS not being able to see through even the most trivial computation and let @intCast work like you would expect it to, but now it seems like almost all my casts require both the @xFromY and the @as(T, …). Removing the type arguments was premature until RLS could be made smarter.

Zig functions aren’t variadic, so a one-argument @intCast makes more sense when there’s a result location in play. RLS could figure that out for you, but I reckon it probably shouldn’t; it’s too close to C’s casting rules. Explicit is good when evoking potentially undefined behavior.

Re-adding the two-argument @intCast as @downCast(T, v) would reduce the noise to a tolerable level, while keeping things explicit.
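In the meantime, a userland helper gets most of the way there (a sketch; downCast here is an invented function, not a builtin):

// the declared return type T becomes the result type RLS hands to @intCast
fn downCast(comptime T: type, v: anytype) T {
    return @intCast(v);
}

// usage: const z: i32 = x + downCast(i32, y);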

lol. nope. @TypeOf’s function definition is @TypeOf(…) because you can give it any number of arguments. There are a few others too, I think.

I’m not saying RLS should do the cast automatically. I’m saying the @intCast that uses RLS to determine the target type can’t figure out the target type, so you have this mess of @as(T, @someOtherCast(…)) all over the place. RLS needs to see through that computation to let @intCast do its job.

I still think @intCast should be required for lossy casts, but RLS needs to do a better job and figure out the target type.

The exceptions are @TypeOf and @compileLog. These are actually macros, by any reasonable definition. The language could add the only runtime function which takes either one or two arguments, just to have a variadic @intCast, but I don’t see why it should. @downCast(T, v) would do the job just fine, and the @as(T, @intCast(v)) pattern is common enough, and awkward enough, to justify the inclusion.

We’ll have to agree to disagree about whether RLS should see through arbitrarily complex arithmetic to infer the result type. I say that would lead to code which is easy to get wrong, and hard to understand when read. Zig doesn’t do implicit conversion, and having to know the type of x in x + @intCast(y) to deduce the conversion type is what I consider implicit.

1 Like

For lossless conversions, that shouldn’t be an issue.
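For instance, lossless widening already coerces with no builtin at all:

test "lossless widening needs no cast" {
    const a: u8 = 200;
    const b: u32 = a; // u8 -> u32 coerces implicitly
    try @import("std").testing.expectEqual(@as(u32, 200), b);
}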

You would need an int version and a ptr version, I think, of the two-argument form. I think those are my most common. (And a few of mine are @intFromEnum, since those can get really bad: @as(T, @intCast(@intFromEnum(e))).)

I don’t think RLS has proven itself in my mind yet. I definitely like it less than I thought I would.

1 Like

I find it funny how “peer type resolution” in, say, addition, can complain that the types aren’t the same, but I need to spell them out explicitly again.

If the compiler knows due to PTR that the type is wrong, shouldn’t that locally be the RLS for this position as well? (i.e., shouldn’t it be enough to say const z: i32 = x + @intCast(y) in pub noinline fn f() as written by nyc?).

I’m doing a lot of math involving f32 and [iu]32 (different sources, different sinks), and I find it unnecessarily verbose to keep writing the types. Then one refactors something and there’s a lot of churn…

Then again, a comptime_int will easily become a runtime float with PTR, so to some extent the logic seems to be already there.
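For example, a minimal illustration of that existing behavior:

test "comptime_int peer-resolves with a runtime float" {
    var f: f32 = 1.5;
    f += 2; // the comptime_int 2 coerces to f32 with no cast
    try @import("std").testing.expectEqual(@as(f32, 3.5), f);
}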

2 Likes