Thoughts on Anonymous Struct Literals

I’ve seen similar questions to this one more than once regarding the .{} assignment syntax.

I personally rarely write code like this, so my stance on this question fluctuates. Sometimes the current semantics seem quite natural once I accept RLS. Other times I can understand being bothered by the problem here, and I feel that encapsulation for .{} would be more intuitive.

RLS is a feature I quite like and I have always hoped that it would be more powerful, so when this proposal was blocked by the .{} problem, I was a little sad and hoped that this problem could be solved sooner.

@mlugg’s reply on this issue has me a bit worried. What does the plan of “removing result pointers from the language” mean? How much will RLS be affected? Is it just .{} that is affected?

Here are some personal thoughts on the current Struct Literal assignment. When we use Struct Literal assignment, what kind of behavior do we expect?

Consider a simple struct:

    const S: type = struct {
        a: u8,
        b: u8 = 0,
        c: struct {
            a: u8,
            b: u8,
        },
        d: u8,
    };

If we perform the following assignment:

    var s: S = .{ .a = 0, .b = 1, .c = .{ .a = 2, .b = 3 }, .d = 4 };
    s = .{
        .d = s.a + 1,
        .a = s.b + 1,
        .c = .{ .a = s.b + 1, .b = s.a + 1 },
    };
    try std.testing.expectEqual(@as(S, .{ .a = 2, .b = 0, .c = .{ .a = 2, .b = 3 }, .d = 1 }), s);

Semantically, this is equivalent to doing the following:

    s = undefined;
    s.d = s.a + 1;
    s.a = s.b + 1;
    s.c = undefined;
    s.c.a = s.b + 1;
    s.c.b = s.a + 1;
    s.b = 0;

It might not be so counterintuitive after all.
It’s just that I would probably never write code that relies on these semantics; I prefer separate statements like s.d = s.a + 1, which match the way I think about this kind of logic.
In addition, these semantics look too much like the control flow of a block, and the .b that gets assigned at the end always feels like implicit control flow to me.

The current semantics of T{} give a different result:

    var s: S = .{ .a = 0, .b = 1, .c = .{ .a = 2, .b = 3 }, .d = 4 };
    s = S{
        .d = s.a + 1,
        .a = s.b + 1,
        .c = .{ .a = s.b + 1, .b = s.a + 1 },
    };
    try std.testing.expectEqual(@as(S, .{ .a = 2, .b = 0, .c = .{ .a = 2, .b = 1 }, .d = 1 }), s);

Since it doesn’t use RLS, its semantics are as follows:

    var expiring_s: S = undefined;
    expiring_s.d = s.a + 1;
    expiring_s.a = s.b + 1;
    expiring_s.c = undefined;
    expiring_s.c.a = s.b + 1;
    expiring_s.c.b = s.a + 1;
    expiring_s.b = 0;
    s = expiring_s;

Because these semantics don’t use RLS, T{} behaves as if encapsulated, so the result matches many people’s intuition.
Although I am not used to writing code this way, it does save an intermediate variable and is indeed valuable. For users coming from Python, this usage is also more familiar.
However, without RLS these semantics ultimately require a copy of the struct. If the struct is large, that has a cost (although the copy may be optimized away, as with RVO).
My thought is: when we expect this kind of result, don’t we really want RLS?

When we expect encapsulation semantics for initializers, our expectations might be something like this:

    const expiring_d = s.a + 1;
    const expiring_a = s.b + 1;
    const expiring_c_a = s.b + 1;
    const expiring_c_b = s.a + 1;
    s = undefined;
    s.a = expiring_a;
    s.b = 0;
    s.c = undefined;
    s.c.a = expiring_c_a;
    s.c.b = expiring_c_b;
    s.d = expiring_d;

The “encapsulation” we want may just be an inline initialization function:

    const std = @import("std");

    test "inline function init" {
        const S: type = struct {
            a: u8,
            b: u8 = 0,
            c: struct {
                a: u8,
                b: u8,
            },
            d: u8,
            // All arguments are evaluated before the struct is written,
            // so every expression sees the old value of `s`.
            pub inline fn init(a: u8, maybe_b: union(enum) { manual: u8, auto: void }, c_a: u8, c_b: u8, d: u8) @This() {
                const b: u8 = switch (maybe_b) {
                    .manual => |value| value,
                    .auto => 0,
                };
                return .{ .a = a, .b = b, .c = .{ .a = c_a, .b = c_b }, .d = d };
            }
        };
        var s: S = .{ .a = 0, .b = 1, .c = .{ .a = 2, .b = 3 }, .d = 4 };
        s = .init(s.b + 1, .auto, s.b + 1, s.a + 1, s.a + 1);
        try std.testing.expectEqual(@as(S, .{ .a = 2, .b = 0, .c = .{ .a = 2, .b = 1 }, .d = 1 }), s);
    }

Evaluate the expressions in the Anonymous Struct Literal from left to right first, and then initialize at the result location. Are these the semantics we really want? What do you think about this?


If you need this behavior for optimization or similar reasons, that is totally fine by me.

The only problem I have is that it is really easy to assume that .{} is short for T{} if you have not been tripped up by it at least once (I don’t think it’s even in the docs right now).

I think this whole thing would be fixed if we just gave these semantics a weird syntax like .={}, because then nobody would expect it to behave like T{}, and people would just look up what it actually means.

Whether you then decide you need both .{} and .={}, just one, or neither, I don’t particularly care, because I can just fix it whenever I get a compiler error.


Currently, the relevant documentation appears in the table of expressions affected by Result Location Semantics. This table specifically mentions that T{} is not affected by Result Location Semantics, while .{} is affected.

I’m trying to propose a new semantics that would allow .{} and T{} to behave the same while preserving the RLS.
My idea is that .{} is different from a code block, so perhaps we could give it some kind of encapsulation, like function parameters have, by evaluating the input expressions up front while preserving its RLS semantics.

One problem is that {} without anything in front of it is considered a code block, which has the type void.
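The point about bare braces is easy to check. A minimal sketch (the test name is only for illustration):

```zig
const std = @import("std");

test "bare braces are an empty block of type void" {
    // `{}` with no type or dot in front of it is an empty block;
    // an empty block evaluates to the void value.
    const v = {};
    try std.testing.expect(@TypeOf(v) == void);
}
```

So any new spelling that ends in a bare {} would presumably have to be told apart from an empty block by the parser.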

I definitely assumed that, then learned that it assigned a field at a time, then forgot that and assumed again that it was short for T{}. To me this falls into the category of (very) surprising behavior.

Ok, that actually explains it quite well, but there was no way in hell I would have read that before running into this. (I know I probably should have just read the whole thing, but one of the cool parts of Zig is how simple the language is to learn and how few surprises it contains.)

Why would that be a problem?
Or was my suggestion too weird? I meant:

    return .={};

I see. It is indeed a feasible idea, although when I see this usage I might intuitively read it as the same thing as = .{}.

From the issue comments, it seemed to me that removing T{} from the language completely is one of the more likely directions for its development.
Before that could happen, it seems that a way to infer the length of array types is still needed. See “Proposal: allow inferred-size array in type annotations”, Issue #19674 in ziglang/zig on GitHub.
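To illustrate why array length inference matters here, a small sketch using only syntax that exists today (the variable names are mine):

```zig
const std = @import("std");

test "inferring an array length currently needs the typed literal form" {
    // `[_]u8{...}` lets the compiler count the elements.
    const inferred = [_]u8{ 1, 2, 3 };
    // With `.{...}` the length has to be written out in the type annotation.
    const spelled_out: [3]u8 = .{ 1, 2, 3 };

    try std.testing.expect(@TypeOf(inferred) == [3]u8);
    try std.testing.expectEqualSlices(u8, &inferred, &spelled_out);
}
```

As far as I understand it, the linked proposal is about getting the first behavior without having to write the T{} form.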

If there is only the .{} syntax, then you have to learn that it works like separate assignments, but it can no longer be mistaken for something else.


That’s good to know, thank you. I guess the simplest way to be sure you’re assigning the entire struct as a single value is to first initialize a separate variable for the new struct, then do the assignment.
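A minimal sketch of that approach, reusing the struct S from earlier in the thread (the name `next` is just a placeholder I picked):

```zig
const std = @import("std");

const S = struct {
    a: u8,
    b: u8 = 0,
    c: struct { a: u8, b: u8 },
    d: u8,
};

test "build the new value in a separate variable, then assign" {
    var s: S = .{ .a = 0, .b = 1, .c = .{ .a = 2, .b = 3 }, .d = 4 };

    // `next` is its own location, so every field expression below
    // reads the old, unmodified `s`.
    const next: S = .{
        .d = s.a + 1,
        .a = s.b + 1,
        .c = .{ .a = s.b + 1, .b = s.a + 1 },
    };
    s = next;

    try std.testing.expectEqual(@as(u8, 2), s.a);
    try std.testing.expectEqual(@as(u8, 0), s.b);
    try std.testing.expectEqual(@as(u8, 2), s.c.a);
    try std.testing.expectEqual(@as(u8, 1), s.c.b);
    try std.testing.expectEqual(@as(u8, 1), s.d);
}
```

This costs an explicit intermediate variable, but it does not depend on how .{} interacts with the result location.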

I will study the Result Location Semantics section and drill it into my head.