I’ll admit to some magical thinking here. Even though what you wrote makes rational sense, at an … emotional level (for lack of a better word) … it feels like writing code like this would be inefficient. I guess this comes from poor conditioning from working with other languages where such constructs are “frowned upon”. It’s nice to read, and makes me want to investigate what other magical thinking I’ve got going on.
Decl literals and initializers are encouraged, while defaulting individual struct fields is discouraged in some cases. This is especially true in contexts where the struct’s fields aren’t independent: if two related fields are assigned default values, initializing one without the other can easily lead to errors. Decl literals ensure that the struct is initialized as a whole, rather than field by field, and this is the preferred behavior.
What then?
Admittedly, putting defaults into the struct declaration has the disadvantage that you can only have one set of default values, while in reality different init scenarios might want different defaults.
E.g.:
const MyStruct = struct {
    a: i32,
    b: i32,
    c: i32,

    const all_ones: MyStruct = .{ .a = 1, .b = 1, .c = 1 };
    const all_zeroes: MyStruct = .{ .a = 0, .b = 0, .c = 0 };
};
const ones: MyStruct = .all_ones;
const zeroes: MyStruct = .all_zeroes;
Same idea as the Default trait in Rust. Though after reading up on it, I’m not actually sure Rust’s Default trait supports different default sets — it seems to provide only a single default() per type.
What Zig is still missing though is a feature to mix such default-sets with user-provided overrides on an item by item basis, e.g. what Rust does via the spread-operator… and in Zig would probably look like this… (ok, this specifically doesn’t look acceptable, too many dots):
const bla: MyStruct = .{ .a = 23, .. .all_zeroes };
…or of course in the end everything could be fixed with comptime helpers, maybe something like this:
const bla: MyStruct = .mix(.{ .a = 23 }, .all_zeroes);
…but ah, the more I think about it, the more I would like syntax instead of comptime helpers, because everybody will use different conventions for such init helpers.
PS: what I find totally unacceptable is having to split the struct init into different steps, for instance:
var bla: MyStruct = .all_ones;
bla.a = 23;
First, this requires a var for what should be a const; second, it would disallow putting the struct init right into a function call; and with deeply nested structs that’s a lot of redundant noise.
…at least nobody suggested the builder pattern yet
PPS:
const bla: MyStruct = .{ .a = 23 } ++ .all_ones;
This can be rewritten as
const bla: MyStruct = blk: {
    var builder: MyStruct = .all_ones;
    builder.a = 23;
    break :blk builder;
};
I know, but why do it the simple way when complicated works too? (Does that saying even exist in English?)
My understanding is that the syntactic sugar here seems to work simply because it “just happens” to solve the current complex problem: starting with a default value and then overriding specific member values.
If the problem is slightly more complex, it’s beyond the scope of this syntactic sugar, such as if some members have dependencies, and a specific member is used to infer other members, etc.
Designing a separate syntax for initializers that “override default members” might not be a bad idea, as such needs are likely not uncommon.
However, I’ll never dislike struct initialization based on expression blocks or initializers: they’re universal and can solve every problem, without designing a new syntax for each unique initialization need.
What then?
I think this has already been covered by others, but to be completely clear: pub const empty: T = ... is typically fantastic style, and does not in any way depend on struct field default values. In fact, it’s what you should use instead of default values.
pub const MyStruct = struct {
    foo: u32,
    bar: ArrayList(u32),

    pub const empty: MyStruct = .{
        .foo = 123,
        .bar = .empty,
    };

    // ...
};
it feels like writing code like this would be inefficient
That’s an understandable concern! But optimizers tend to be fairly good at this kind of thing. Here’s an example in Godbolt, with the same initialization written in a “declarative” way with boilerplate in the initialization and an “imperative” way with a local variable—notice how the generated machine code is identical aside from a slight instruction reshuffle.
PS: what I find totally unacceptable is having to split the struct init into different steps, for instance:
Here’s my question: why do you find this unacceptable? Throw a local var into a labeled block and change what you need:
const bla: MyStruct = bla: {
    var bla: MyStruct = .all_zeroes;
    bla.a = 23;
    break :bla bla;
};
Simple, legible, and safe—what’s the actual problem here? To be clear, I don’t think this should be something you’re doing frequently—it’s relatively rare that I need it—but in the cases where you do, it works fine.
I too strongly dislike using var when things should be const, but the solution is simply to scope your vars tightly! Nowadays, almost every var I write is inside a labeled block whose scope ends when the variable’s value is finalized. (I generally deviate from this only when the var’s “mutation lifetime” is the majority of the function.)
it would disallow putting the struct init right into a function call
Behold, the humble block expression:
needsMyStruct(ms: {
    var tmp: MyStruct = .all_zeroes;
    tmp.a = 42;
    break :ms tmp;
});
(I’ll admit I had to double check that it works inside a function call)
Yeah but this is so exceedingly rare that one can always fall back to init-code in a labeled block…
FWIW, I also have such ‘dynamic default initializations’ in C code; zero-init actually works great for this.
E.g. in sokol_gfx.h I’m patching in non-zero defaults in the called functions, and this allows me to do dynamic defaults like this:
…but this selects the defaults based on private values which are not accessible from the outside, so none of the so far suggested solutions would help with this anyway
…the equivalent in Zig would be to make all struct values optional to decide whether a value should be default-initialized (e.g. when the optional is not provided it would be default initialized).
Let’s use a more deeply nested and complex example and it quickly becomes a lot of redundant noise:
const bla: MyStruct = bla: {
    var bla: MyStruct = .all_zeroes;
    bla.blub.blob.a = 23;
    bla.blub.blob.b = 24;
    bla.blub.blob.c = 25;
    bla.blub.c = 26;
    bla.blub.d = 27;
    bla.blub.e = 28;
    bla.f = 29;
    bla.g = 30;
    bla.h = 31;
    break :bla bla;
};
…it would also be nice to somehow get rid of the label since that really doesn’t have a purpose…
Maybe
const bla: MyStruct = :{
    var bla: MyStruct = .all_zeroes;
    bla.a = 23;
    break bla;
};
…or even just:
const bla: MyStruct = {
    var bla: MyStruct = .all_zeroes;
    bla.a = 23;
    break bla;
};
Let’s use a more deeply nested and complex example and it quickly becomes a lot of redundant noise:
That example feels extremely artificial to me. I can’t think of any realistic situation in which you’d need to do an initialization like this. Do you have one to hand?
I can offer some similar C++ code (without designated init) vs C code with designated init:
…versus C99 (I don’t have the exact same sample as a C version, but something similar):
…and…
…needless to say, I heavily prefer the C99 version
E.g. structs in my C libraries can easily have 3 or 4 nesting levels, because with designated init, a single deeply nested struct is actually very readable… I usually use nested structs to group related items which otherwise would have a common prefix, e.g. like here:
…this is a wonderfully ‘structured’ way to work with data - as long as the language plays ball
…and since my coding style is so ‘struct oriented’ I also don’t quite like mixing code and data too much. Setting up a struct shouldn’t need to require code for initialization in most common situations - only for special cases.
@mlugg I actually have a much better example here: This is ‘C/C++ code’ (e.g. C code that also needs to compile in C++, so I can’t use designated init):
…and here’s the same thing in plain C99 when designated init is available:
(this last example also uses some subtle C99 array initialization features which make it such a pleasure to use)
I’m fine with this approach, but I’d prefer to keep the explicit :. For non-nested, normal expression blocks, allowing anonymous blocks would be a good improvement. I currently use blk: as a generic name for intended anonymous code blocks. However, for special expression blocks like switch and while, anonymous blocks should not be allowed. Nested expression blocks should not be anonymous either, whether inside or outside.
As a side note, I think it might also be possible to allow names for void expression blocks—something like in-place inline functions. Often, I want to clearly express the intent of a code block, but I don’t want to separate it into a function just to give it a name. In general, I prefer identifier annotations to comments, because comments don’t affect the code logic, which makes them prone to becoming outdated.
I think you could use std.enums.EnumArray to achieve something quite similar in Zig.
But it would require a bit of meta programming or generation.
Rough sketch:
const EnumArray = std.enums.EnumArray;

const ShaderDesc = struct {
    const AttrIndex = enum { @"0", @"1", @"2" }; // could be named instead? (not sure if that would make sense)
    const Attrs = EnumArray(AttrIndex, Attribute);
    const UniformBlocks = EnumArray(ExhaustiveEnum(max_uniforms), UniformBlock);

    attrs: Attrs,
    vertex_func: ShaderFunc,
    fragment_func: ShaderFunc,
    uniform_blocks: UniformBlocks,
    ...
};

const Shader = struct {
    pub fn fromDesc(desc: ShaderDesc) Shader { ... }
};
const shader: Shader = .fromDesc(.{
    .attrs = .initDefault(.invalid, .{
        .@"0" = .attr("position", "TEXCOORD", 0),
        .@"1" = .attr("texcord0", "TEXCOORD", 1),
        .@"2" = .attr("color0", "TEXCOORD", 2),
    }),
    .vertex_func = .{ ... },
    .fragment_func = .{ ... },
    .uniform_blocks = .initDefault(undefined, .{
        .@"0" = ...,
    }),
    ...
});
Personally, I think that the labeled block approach is best unless you have really a lot of these initializations in your codebase and they can’t be represented uniformly that easily.
That said, comptime is so powerful… you can implement almost any interface you can think of, just make sure you provide it enough information and then follow your nose.
const std = @import("std");

test "concat" {
    const x = concat(u32, .{
        splat(10, 1),
        splat(2, 2),
        splat(5, 42),
        splat(1, std.testing.random_seed),
    });
    std.debug.print("{any}\n", .{x});
}

pub fn concat(T: type, pieces: anytype) [Concat(T, @TypeOf(pieces)).length]T {
    return Concat(T, @TypeOf(pieces)).construct(pieces);
}

fn Concat(T: type, Pieces: type) type {
    return struct {
        const length = blk: {
            var res: usize = 0;
            for (std.meta.fields(Pieces)) |field| {
                res += field.type.length;
            }
            break :blk res;
        };

        fn construct(pieces: Pieces) [length]T {
            var res: [length]T = undefined;
            var ptr: [*]T = &res;
            inline for (pieces, std.meta.fields(Pieces)) |piece, field| {
                const chunk: [field.type.length]T = @splat(piece.value); // can we do this better?
                @memcpy(ptr, &chunk);
                ptr = ptr[chunk.len..];
            }
            return res;
        }
    };
}

fn splat(comptime len: usize, value: anytype) Splat(len, @TypeOf(value)) {
    return Splat(len, @TypeOf(value)){ .value = value };
}

fn Splat(len: usize, V: type) type {
    return struct {
        value: V,

        const length = len;
    };
}
It’s a simple example, but it could be adapted easily enough to support writing into a pre-initialized array in non-overlapping patches; you would just have to change the splat function to take a start and a stop point as well as a value.
Yeah dropping down to comptime helper functions almost always works, and can be a good escape hatch when syntax features are missing, but when they become too common they are IMHO a canary that they should really be syntax.
I have the same feeling about builtins, btw. IMHO @splat shouldn’t exist, or if so, then only for SIMD vectors, not arrays. SIMD vectors and arrays are entirely different things that only superficially look similar because they share some of the same syntax: a @splat on a SIMD vector compiles down to a single SIMD instruction, while on a general array it can generate any amount of code. Such ‘unpredictable’ behaviour is not great for a builtin.
Many of the examples in this thread could also have been implemented with runtime helper functions in C89 to make up for the missing C99 syntax, but such helper functions always feel like stopgaps/hacks, and they actually make the code less readable: even if they are perfectly named, you still need to look them up and read their code. And since such helper functions are just a convention, people will use wildly different names for the same thing, or sneak special features into helper functions with standard names.
IMHO programming language syntax shouldn’t be too puristic / minimal, otherwise we could have stopped at assembly code.
The interesting thing is that Zig has been going in the direction of more syntax instead of (stdlib) helper functions in other places. For instance, Zig uses syntax for optionals and error handling, while (for instance) Rust handles those mostly through the stdlib. Looking at code which uses optionals and errors side by side in Rust and Zig, it’s pretty clear that going with syntax was the right decision.
Just my 2ct anyway
I think this is our biggest disagreement. I believe syntax should be minimalist, using the simplest rules to meet the most needs. Adding specialized syntax for specialized needs adds cognitive noise and invites the inconsistency of “it’s cool that this need is met, but why doesn’t this syntax work for more advanced needs?”
If a need can be met using the standard library instead of syntax, then the standard library should be used. Generally, the biggest objection to this choice is performance concerns about non-native syntax, but Zig’s comptime eliminates this concern.
Of course, I understand that my ideas aren’t a silver bullet, and different people may have different philosophies. It’s just that, for now, Zig prefers to minimize additions to its syntax.
I guess it’s good to focus on a minimal syntax while Zig is still heavily in development, but I really hope that at a later point there can be a little reversal in philosophy to get some balanced amount of syntax sugar into the language.
Because I think the philosophy of implementing things in the stdlib that should be language features is what ruined both modern C++ and Rust.
Both those languages also suffer from the problem that there’s no clear separation between the standard library and language features, e.g. syntax may depend on stdlib conventions (like for-loops relying on stdlib iterator interfaces etc…).
PS: also, maybe I’m old-fashioned or too ‘C-sided’, but I think a language should still be useful without using the stdlib.
I’m hesitant to discuss other languages here, but I’m not sure which standard libraries in these languages should be designed as language features.
For me, C++’s most serious problem is that it inherits all the shortcomings of C’s design while also having too many features rather than too few. This leads to a plethora of programming paradigms, so diverse that different C++ developers often program in completely different paradigms. Furthermore, features like operator overloading have introduced concepts I dislike. The concept of exceptions has also been a near-failure in C++.
I don’t dislike most of Rust’s concepts, nor do I dislike the move concept, which explicitly expresses ownership in function declarations. However, I have a particular issue with Rust’s core principle of ensuring pointer lifetime and ownership at compile time. I think it introduces significant complexity and is difficult to implement correctly. I would welcome the possibility of explicitly and semantically declaring ownership in data structures and APIs, while also shifting from compile-time checks to runtime safety checks for illegal behavior. Another purely aesthetic point is that I think Rust uses too many symbols rather than keywords, and the syntax for lifetime declarations in particular strikes me as ugly.
The coupling of syntax and implementation is indeed quite common in modern languages, as seen in Python and C#. Some C/C++ standard libraries are indeed quite “magical.” Zig employs a compromise between the standard library and syntax, namely built-in functions. Built-in functions acknowledge the use of magic, and in principle, belong to the language itself rather than the standard library, but their form is relatively uniform and doesn’t add noise to the syntax.
In my opinion, a significant magical and inconsistent aspect of built-in functions is their exclusive use of result types. If built-in functions and in-place block expressions can accept result types, why can’t a regular metaprogramming function? This adds to my cognitive load.