Can we have destructuring one day?

nexus_prime · July 24, 2024, 1:57pm

Zig syntax can look a bit like JavaScript syntax, and one really cool JavaScript feature is destructuring:

import { myImport } from 'path';

const { a, b, ...rest } = { a: 5 }
console.log(a) //5
console.log(b) // undefined
console.log(rest) // {}

const [c, d] = [10, 20]
console.log(c) // 10  
console.log(d) // 20

Can we have destructuring syntax in Zig? Look at the code of Buzz, a language implemented in Zig. Its imports look like this:

...
const _obj = @import("obj.zig");
...
const Obj = _obj.Obj;
const ObjString = _obj.ObjString;
const ObjPattern = _obj.ObjPattern;
const ObjMap = _obj.ObjMap;
const ObjUpValue = _obj.ObjUpValue;

-# found here: buzz/src/buzz_api.zig at main · buzz-language/buzz · GitHub

With destructuring, we could write:

const { ObjString, ObjPattern, ObjMap, ObjUpValue } = @import("obj.zig");

clear syntax!

nexus_prime · July 24, 2024, 2:01pm

We don’t need destructuring for arrays or lists, but for structs it could be useful.

n0s4 · July 24, 2024, 4:07pm

You can destructure tuples already:

const std = @import("std");

const S = struct { u8, i32 };

pub fn main() !void {
    const s = S{ 25, -500 };

    const a, var b = s;
    b += 1;
    std.debug.print("a: {d}\nb: {d}\n", .{ a, b });
}

See this post for more info: New Destructuring Syntax

purefns · July 24, 2024, 5:41pm

That’s really interesting, I didn’t know that! It’s not mentioned in the Standard Library; that could be worth putting there.

nexus_prime · July 24, 2024, 6:21pm

Bruh, so strange. But we can’t destructure structs yet… That’s the most important kind of destructuring.
We should do this:

const ObjString,
const ObjPattern,
const ObjMap,
const ObjUpValue
      = @import("obj.zig");

AndrewCodeDev · July 24, 2024, 6:34pm

Array destructuring landed a while ago - I’ve only used it for packing return values from functions but otherwise it’s not a regular part of my coding style.

The issue that you’ll face with what you’re proposing is that Zig structs do not have guaranteed well-defined in-memory layout. It would have to work by field declaration order to have consistent behavior. It’s possible to do that, but it’s not the same as the destructuring syntax for arrays - I believe it would need to be a whole separate feature.

mnemnion · July 24, 2024, 7:00pm

It would need to be a whole separate feature, but I don’t think the memory layout question is why.

The idea of destructuring is just syntax sugar that replaces this kind of thing:

const a_struct = fnReturningStruct();
const field_a = a_struct.a;
const field_b = a_struct.b;

With some mechanism to assign field_a and field_b directly. Since the compiler knows how to handle the above code, it wouldn’t have a problem handling syntax sugar for it.

I’m not convinced, however, that having syntax sugar for destructuring struct fields is a great fit for Zig. The tuple destructuring is a strong addition, it gives de facto multiple return values, which is an absolute annoyance in languages which don’t have it.
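To make the "de facto multiple return values" point concrete, here is a minimal sketch. The `divmod` helper is hypothetical and exists only for illustration; it assumes a Zig version that has destructuring (0.12 or later).

```zig
const std = @import("std");

// Hypothetical helper: returns quotient and remainder together as a tuple.
fn divmod(n: u32, d: u32) struct { u32, u32 } {
    return .{ n / d, n % d };
}

pub fn main() void {
    // Both results are bound in one statement, without naming an intermediate tuple.
    const q, const r = divmod(17, 5);
    std.debug.print("q: {d}, r: {d}\n", .{ q, r });
}
```

The caller never has to name the tuple or index into it, which is the part that feels like a real multiple-return feature.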

But that seems less compelling for structs with named fields. Yeah, sometimes you’ll want a subset of those fields, and also want that subset to have the names of the fields, but is that worth adding and supporting as a feature in the language? What’s it supposed to look like?

I came up with this:

.{ const field_a, const field_b } = fnReturningStruct();

Which is… not bad, not great. I’d say, at least, that it’s adequately clear what’s going on here.

One thing that the tuple destructuring syntax requires is that elements of the tuple which are ignored are ignored explicitly:

const a, _, var c = fnReturningTriple();

Requiring that for field destructuring is awkward, and breaks the syntax I was sketching out above.

But not requiring it means that data is getting silently dropped, and that makes me nervous. It’s like non-exhaustive switching: say a change to the struct means it’s now carrying a pointer to heap-allocated memory, so the code has to be changed to deallocate that struct. But the new field is being silently dropped, and it still works with the change, that’s easy to miss. If you add an element to a tuple type, destructurings won’t compile until that element is handled or ignored.
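A small sketch of that compile-time check, assuming Zig 0.12 or later. `fnReturningTriple` is a hypothetical stand-in that just returns three bytes.

```zig
const std = @import("std");

// Hypothetical function returning a three-element tuple.
fn fnReturningTriple() struct { u8, u8, u8 } {
    return .{ 1, 2, 3 };
}

pub fn main() void {
    // The middle element is discarded explicitly with `_`.
    const a, _, var c = fnReturningTriple();
    c += 1;
    std.debug.print("a: {d}, c: {d}\n", .{ a, c });

    // Binding only two names does not compile, because the tuple has three elements:
    // const x, const y = fnReturningTriple();
}
```

If a fourth element were added to the tuple type, the destructuring in `main` would stop compiling until the new element is bound or discarded, which is the safety property a field-based version would lose.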

I’m not strongly against it, either. But there are considerations for field destructuring which don’t really apply to tuple destructuring, and my opinion is that the field version is less manifestly useful, while also bringing more complexity, and some non-obvious design problems which the tuple version doesn’t have.

AndrewCodeDev · July 24, 2024, 7:10pm

Right, what I was trying to imply there is that we can’t expect to jump from array destructuring to struct destructuring directly because they are different animals. The implementation for one might not cross over to another. I immediately thought about using something like a field accessor builtin, because we already have reflection over fields. I agree that it’s not a big step from there (in speculation), but it’s just not the same as what we currently have.

I agree that destructuring isn’t as great as it’s advertised to be. In something like C++ where you can destructure into a for-loop capture, it kind of makes sense because there are other language features that it plugs into. That said, for loops in C++ are quite byzantine because of overloading. I don’t want to go in that direction either.

By and large, I still recommend just returning a struct declared in place instead of using arrays or tuples as returns to be destructured. Something like:

fn foo(...) struct { bar: u8, baz: u8 } {

AndrewCodeDev · July 24, 2024, 7:24pm

I will note however that it may not be as far off as I thought as stuff like this compiles:

export fn square() i32 {
    const x, const y = .{ 2, 2.0 };
    return x + y;
}

In this case, we are destructuring comptime literals of different types from a tuple.

I wonder if you can convert known struct field types to a return tuple. In fact, I bet it’s totally possible.

AndrewCodeDev · July 24, 2024, 7:31pm

@mnemnion, essentially, we’d have to automate the following process with comptime reflection over a type and build up a tuple to return:

const std = @import("std");

const Foo = struct {
    a: usize,
    b: usize,
};

fn bar(comptime foo: Foo) std.meta.Tuple(&.{ usize, usize }) {
    return .{ foo.a, foo.b };
}

export fn baz() usize {
    const x, const y = bar(.{ .a = 42, .b = 55 });
    return x + y;
}

chung-leong · July 24, 2024, 7:35pm

nexus_prime:

Bruh, so strange. But we can’t destructure structs yet… That’s the most important kind of destructuring.
We should do this:
const ObjString,
const ObjPattern,
const ObjMap,
const ObjUpValue
      = @import("obj.zig");

The big question here is whether we want to encourage this kind of coding style. There is value from a readability standpoint in having namespaces attached to types and function. Admittedly that often makes lines of code too wide. I wonder if there isn’t a way to effectively deal with that through the UI instead (render the namespace portion in a really small font?).

AndrewCodeDev · July 24, 2024, 7:48pm

@mnemnion, @chung-leong - looks like this is compiling. It assigns the values of a struct out to a tuple that can then be destructured.

const std = @import("std");

const Foo = struct {
    a: usize,
    b: usize,
};

fn ReturnTuple(comptime T: type) type {
    comptime {
        const fields = std.meta.fields(T);
        var types: [fields.len]type = undefined;

        for (fields, 0..) |field, i| {
            types[i] = field.type;
        }
        const freeze = types;
        return std.meta.Tuple(freeze[0..]);
    }
}

fn asTuple(x: anytype) ReturnTuple(@TypeOf(x)) {

    const T = @TypeOf(x);

    var result: ReturnTuple(T) = undefined;

    const fields = std.meta.fields(T);

    inline for (fields, 0..) |field, i| {
        result[i] = @field(x, field.name);
    }
    return result;
}

export fn baz() usize {
    const foo: Foo = .{ .a = 42, .b = 55 };
    const x, const y = asTuple(foo);
    return x + y;
}

One thing that I don’t know here is if this guarantees field order. Maybe someone can speak to that point.

For capturing function names, you may be able to use a similar technique: pass the names as strings and return a tuple, given that @field can reference functions as well. It would look like:

const A, const B = destructure(@import("blah.zig"), &.{ "A", "B" });
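Here is one way such a `destructure` helper might be sketched. Everything in it is an assumption for illustration: the `obj` namespace stands in for `@import("obj.zig")`, the helper names are made up, and it relies on `std.meta.Tuple` as it existed in the Zig releases current at the time of this thread. As far as I know, destructuring is only accepted inside function bodies, so this would not replace the file-scope `const X = _obj.X;` lines directly.

```zig
const std = @import("std");

// Stand-in for an imported file; in real code this would be @import("obj.zig").
const obj = struct {
    pub const Obj = struct { id: u32 };
    pub const ObjString = struct { bytes: []const u8 };
    pub fn answer() u32 {
        return 42;
    }
};

// Builds the tuple type that holds one entry per requested declaration.
fn DestructureResult(comptime Namespace: type, comptime names: []const []const u8) type {
    var types: [names.len]type = undefined;
    for (names, 0..) |name, i| {
        types[i] = @TypeOf(@field(Namespace, name));
    }
    const frozen = types;
    return std.meta.Tuple(&frozen);
}

// Looks up each named declaration and packs the results into a tuple.
fn destructure(
    comptime Namespace: type,
    comptime names: []const []const u8,
) DestructureResult(Namespace, names) {
    var result: DestructureResult(Namespace, names) = undefined;
    inline for (names, 0..) |name, i| {
        result[i] = @field(Namespace, name);
    }
    return result;
}

pub fn main() void {
    const Obj, const ObjString, const answer =
        destructure(obj, &.{ "Obj", "ObjString", "answer" });

    const o = Obj{ .id = 7 };
    const s = ObjString{ .bytes = "hi" };
    std.debug.print("{d} {s} {d}\n", .{ o.id, s.bytes, answer() });
}
```

The names still have to be written twice, once as bindings and once as strings, so this is more of a curiosity than a saving over plain `const X = _obj.X;` lines.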

mnemnion · July 24, 2024, 8:19pm

I thought about suggesting a conversion function from fields of interest to a tuple as well.

It’s close to the definition of syntax sugar that there’s always a workaround, so I wanted to focus on some of the unanswered questions around field destructuring in my post. But filtering the fields of interest down in a helper function can be nicer than manually destructuring with field assignment, for sure.

Generally, I would say that if code is frequently selecting some consistent subset of fields from a struct, that’s a good indication that the struct is doing too much. Then again, a struct isn’t always coming from user-written code, so there are times when it makes sense (and I would say that having a function which ‘tuple-izes’ the fields of interest is a good approach there).

The main thing for me is that field destructuring either silently drops fields, or it’s more verbose than just selecting the ones you want. The other part is that you get this magical connection between the field name and the variable name. I like a bit of tasteful magic in some languages, but I also like that Zig avoids that for the most part.

There shouldn’t be serious implementation challenges in providing field destructuring, and I don’t think the syntax I suggested is particularly attractive, but it conveys what’s happening well enough, and is broadly aligned with how destructuring works in other languages.

But not having it is no kind of pinch-point in my code, speaking as a party of one, and I’m not sold on the juice being worth the squeeze.

AndrewCodeDev · July 24, 2024, 8:23pm

Me neither, but it’s an excuse to abuse tuples lol - gotta have some fun while the lights are on.

But yeah, beyond the fun of it, I’d say I’m with you.

mnemnion · July 24, 2024, 8:25pm

I missed this part, sorry. I’m having trouble finding the smoking gun in the documentation, but fields are guaranteed to appear in declaration order in .fields. So that should work fine.
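A quick way to check the ordering yourself, as a sketch. The `Foo` struct here is hypothetical, with fields deliberately declared out of size order so that any reordering in the reflected list would be visible.

```zig
const std = @import("std");

// Hypothetical struct whose fields are deliberately not in size order.
const Foo = struct {
    b: u8,
    a: u64,
    c: u16,
};

pub fn main() void {
    // Prints each field name alongside its position in the reflected field list.
    inline for (std.meta.fields(Foo), 0..) |field, i| {
        std.debug.print("{d}: {s}\n", .{ i, field.name });
    }
}
```

The reflected order is independent of how the compiler chooses to lay the fields out in memory, which is why a helper like asTuple can rely on it.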

Tosti · July 26, 2024, 2:33pm

Sorry for the slightly tangential topic, but destructuring in C++ has nothing to do with for loops specifically. Essentially, this

auto [a, b] = my_tuple;

is syntactic sugar for

auto __compiler_generated = my_tuple;
auto&& a = __compiler_generated.get<0>();
auto&& b = __compiler_generated.get<1>();

or, if std::remove_reference_t<decltype(my_tuple)> doesn’t have a member function template get, then get<i>(__compiler_generated) is used instead. Here i goes from 0 up to, but not including, std::tuple_size_v<std::remove_reference_t<decltype(my_tuple)>>.

cv-qualifiers and references in a structured binding declaration apply to the generated variable, but the components are still bound via the auto&& rules. So, for example, this

const auto& [a, b] = my_tuple;

is equivalent to this

const auto& __compiler_generated = my_tuple;
auto&& a = __compiler_generated.get<0>();
auto&& b = __compiler_generated.get<1>();

If member variables of a struct/class are public, then there is no need to provide custom std::tuple_size_v and get implementations, because the compiler is able to bind components to the public member variables in the declaration order.

It works with range-based for loops, because those loops are desugared into a form where the loop variable is initialized as

<declaration> = *iterator;

So, it works for simple declarations like auto x, and it works with structured bindings like auto [key, value], if the type of *iterator supports it (e.g., iterator of std::unordered_map).

AndrewCodeDev · July 26, 2024, 2:48pm

I’m aware of all this. Structured binding support was introduced in C++17:

https://en.cppreference.com/w/cpp/language/structured_binding

// C++17:
for (const auto& [key,val] : myMap) {  
    // use key/value directly
}

And yes, you can destructure basic structs in C++ directly as well. I’m not saying that it’s designed around for-loops… I’m saying that:

Loops are just one feature that it plays into nicely.

Structured bindings in loops are one place where the syntax makes sense because it’s somewhat convenient to decompose objects into components when looping over them.

mnemnion · July 26, 2024, 3:21pm

Y’know, I’ve thought about that in a Zig context as well.

Would it be too much magic to have:

for (returns.tupleNext()) |a, b| {
    ...
}

It would cooperate ok with the for (thing.next(), 0..) syntax, I think.

I like this more than field destructuring. Although the identity questions when you start using a pointer to a field in a tuple where the contents may or may not have a non-transient result location could get kinda gnarly.

Sze · July 26, 2024, 3:37pm

In the ecs I am working on I currently have code like this:

fn drawRectanglesLines(self: *System, archetype: *Archetype) !void {
    const c = try self.input(archetype, struct {
        position: Vec2,
        bounds: Vec2,
        color: ray.Color,
    });

    for (c.position, c.bounds, c.color) |pos, b, color| {
        ray.drawRectangleLines(pos[0] + 1, pos[1] + 1, b[0] - 2, b[1] - 2, color);
    }
}

If there was a special for <tuple> |a, b, c| { ... } I could write this instead:

fn drawRectanglesLines(self: *System, archetype: *Archetype) !void {
    const c = try self.input(archetype, struct {
        position: Vec2,
        bounds: Vec2,
        color: ray.Color,
    });

    for c |pos, b, color| {
        ray.drawRectangleLines(pos[0] + 1, pos[1] + 1, b[0] - 2, b[1] - 2, color);
    }
}

Or even this (if I don’t care about verifying component order/names):

fn drawRectanglesLines(self: *System, archetype: *Archetype) !void {
    for try self.input(archetype, .{Vec2, Vec2, ray.Color}) |pos, b, color| {
        ray.drawRectangleLines(pos[0] + 1, pos[1] + 1, b[0] - 2, b[1] - 2, color);
    }
}

This would work because System has a slice of self.component_ids internally.

Maybe to avoid unreadable expression monstrosities within such a hypothetical tuple-for, it could instead accept a variable name, which would force you to separate that code out:

fn drawRectanglesLines(self: *System, archetype: *Archetype) !void {
    const components = try self.input(archetype, .{Vec2, Vec2, ray.Color}); // returns tuple
    for components |pos, b, color| {
        ray.drawRectangleLines(pos[0] + 1, pos[1] + 1, b[0] - 2, b[1] - 2, color);
    }
}

I would like that tuple-for possibility. What I suggest would be a new kind of syntax we haven’t had so far (dropping the parentheses from a syntactic form), but I like it because it seems very readable.

mnemnion · July 26, 2024, 4:31pm

2 posts were split to a new topic: Parentheses in Control Flow Statements