I want a const, but I need a var first in order to initialize it. I think a good way to do it would be as in this contrivance:
const c = init: {
    var c_ = [_]u8{ 1, 2, 3 };
    c_[1] = 5;
    break :init c_;
};
True?
I think this may result in an item-by-item copy from c_ to the final c; yes/no?
If yes, is there a different way… a way to simply tell the compiler, “I just want a symbol that references the modified array, and want it to be considered const from here on out”?
If no, is this construct susceptible to the “aliasing” problems that I remember half-internalizing when discussed a couple of months ago? I feel like it wouldn’t even matter, in this case, though, since I’ve got a const in the end.
Extra: is there a quick way to school me in discovering the answer to number 2, by myself, via godbolt or something? Keep in mind that I’m not very assembly-savvy, and the answer may therefore be: “yeah, if you were more assembly-savvy, but skip it for now.”
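(An aside, not part of the original question: when the initialization *can* run at compile time, forcing the block with `comptime` sidesteps the runtime-copy question entirely, since `c` then becomes a compile-time-known constant. A minimal sketch of that variant:)

```zig
// Sketch: same pattern, but the whole block is evaluated at comptime,
// so no runtime copy can occur. Only works if the init is comptime-capable.
const c = comptime init: {
    var c_ = [_]u8{ 1, 2, 3 };
    c_[1] = 5; // mutation happens at compile time
    break :init c_;
};
```

(As noted later in the thread, this doesn't apply when the middle step must happen at runtime.)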
Did you know that Zig is available on Compiler Explorer? You could try your sample code there and see what the compiler generates. I have done it here for you: Compiler Explorer
I did, actually! (Though I referred to it as godbolt in #5); but I’m afraid I need slightly more help than that, or to be told I just need to learn assembly to do good Zig. I’m not sure I’m interpreting that assembly correctly - is it saying that a full deep copy is happening? Can you indicate the assembly lines that show that? This, then, would answer #2, and may help indicate an answer to #1: ‘is this a “good” way to do this?’… but if a deep copy is happening, and especially if the answer to #1 is “no”, then I’m especially interested in an answer to #3. (And I’m curious about confirmation on #4, which might be discernible through the godbolt, but that would definitely be hard for me.)
The problem is that almost everyone will have to look at the assembly (and/or benchmark) to answer micro optimization questions, because there is no guarantee that the optimizer (LLVM) won’t change from release to release, and in fact it really does change a lot. So the same is true for Rust, for example, since it uses LLVM. Maybe someday a release-mode Zig backend will have documented optimizations that we can rely on (just my hope).
You can see the assembly for a given source line by color matching. See all the mov instructions that correspond to the break line? You can also see them highlighted if you hover over that source line.
It is highly likely all those moves are copying the array. And in fact you see the same thing if you do the following, except the moves are associated with the c = c_ line:
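(The snippet referenced here didn’t survive; judging from the description, it was presumably something along these lines, with the array built outside any block so the copy happens at the plain assignment rather than at the `break`:)

```zig
// Hypothetical reconstruction of the referenced variant: same init steps,
// but the element-by-element copy is now attributed to the `const c = c_;` line.
var c_ = [_]u8{ 1, 2, 3 };
c_[1] = 5;
const c = c_; // the mov instructions cluster on this line in the disassembly
```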
Ahhhhhh… now that’s especially helpful, in part because it helps me categorize this, too, as a “micro-optimization” question. On an earlier question, I was certainly looking for “efficiency” intel; this one, though, felt a bit like “it either incurs a copy or it doesn’t”, and that invoked my sense that “somebody will just know this”. Thank you, in your other reply, for the hints on interpreting the assembly, and, indeed, changing the code a little to highlight - I should have thought of that. I did know about the coloring help, researching a bit on godbolt last time ‘round, so… I’m getting it… slowly.
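(The suggestion the next reply responds to is missing here; it was presumably a pointer-returning variant along these lines, which avoids the element copy by breaking with the *address* of the block-local array:)

```zig
// Sketch of the (presumed) suggestion under discussion: break with &c_
// instead of c_. No per-element copy, but the pointer now outlives the
// block that declared c_ — which is exactly the concern raised below.
const c = init: {
    var c_ = [_]u8{ 1, 2, 3 };
    c_[1] = 5;
    break :init &c_; // c is a *[3]u8 pointing at the block-local array
};
```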
That does indeed work as of now because variables declared in blocks will stay intact for the entirety of the surrounding function scope, but you’re still technically producing a dangling pointer here. The current behavior is considered a bug and will (likely) break sometime in the future.
This isn’t tagged with either bug, or accepted, it’s just an issue. There’s substantial debate within that thread about what the problem actually is, I think your confidence is unjustified.
Andrew also tagged it ‘unplanned’ so, there’s that.
Yes, I think it would be foolish to tie lifetimes to block scope in addition to function scope. There was a long thread about this a while back; my position is that better liveness analysis solves the excessive stack use, and with that solved, all that block-scoped lifetimes would accomplish is breaking obviously-useful code like that shown above.
I may be wrong, but the way I’m interpreting it is that it’s not tagged ‘bug’ because it’s not actually causing anything that should work to not work, it’s tagged ‘enhancement’ because implementing it would reduce stack usage and not break any code that’s not taking advantage of the current state of things. Also it’s currently tagged ‘upcoming’.
I agree that the code you sent looks better than the equivalent alternative of declaring the array outside of the block scope, but I’m not sure if it’s a good decision language spec-wise. Currently there are two ‘rule sets’ around how local references escaping their scopes behave. Either the scope is comptime-only, then it’s fine and everything ‘just works’, or it’s (possibly) runtime and illegal behavior. Adding this special case for block scopes inside of function scopes where variables are tied to the surrounding function scope instead of the block scope would introduce a third ‘rule set’ that’s arguably harder to reason about than the ones we already have. I’m not sure if the benefits are worth the complication.
To me it’s a question of whether function scopes for lifetimes would be intuitive or not, for someone learning Zig with this rule. Since C, C++ and Go work this way, it may be intuitive for experienced programmers. The fact that it matches Zig comptime behavior is also in its favor – it’s not a third rule in that sense.
IMO it’s much more important for Zig to be consistent in and of itself than to match the behavior of arbitrary other languages (in this case C, and languages that presumably just adopted whatever C did). For me, being consistent actually makes a language easier to reason about and thus easier to learn.
My point is that it doesn’t match comptime behavior: in the established comptime case you get a transient reference that’s not tied to any other scope with respect to its maximum lifetime. For bare blocks, all references that escape them would still be tied to the enclosing function.
Ok, I understand. But this just means runtime and comptime lifetimes have two different rules. That’s true whether the runtime rule is per-scope or per-function. I guess the complexity is that visibility and lifetime don’t have the same runtime rule. I see that as a feature, but I can also see it as a little more complex.
If the issue resulted in a change which “broke” this… er, feature… then am I right in assuming that a compile-time error would start being issued (rather than a runtime bug due to a dangling reference)?
This actually makes perfect sense, and seems intentional and “right”, but perhaps that’s hindsight view, and the worry is that it wouldn’t feel natural until one encounters that explanation. Meh, so it’s a little hard learning something sometimes. I hope the “feature” stays.
You beat me to it. (Unimportant: I didn’t mention this, but in the real code, that middle line, that requires the var in the first place, cannot be comptime; it does happen only once, at program initialization, but is runtime. The “performance” question doesn’t concern me much here - only that I can have a const at the end of it, to protect from any (further) changes. The hope for no “copy” has less to do with time, and more to do with space, as the real data might be large.)
That’s what Andrew said in the github issue. However, this depends on implementing a fairly complex safety check. I don’t recommend relying on plans until they happen.
In cases where you cannot use a nested block to “hide” variables that you no longer want to access after the block ends, you can just use naming to make it very unlikely to make that mistake.
export fn square(num: u32) u32 {
    var c_init = [_]u32{ 1, 2, 3 };
    c_init[1] = 5;
    const c = &c_init;
    // do stuff with c
    return c[num];
}