Preferred method to initialize GeneralPurposeAllocator

dimdin · September 3, 2024, 8:40pm

Since the default initialization method of GeneralPurposeAllocator is deprecated in master:

Default initialization of this struct is deprecated; use .init instead.

What is your preferred method of initialization and why?

Derive the type from the expression:

var gpa = std.heap.GeneralPurposeAllocator(.{}).init;

Specify the type and use a decl literal:

var gpa: std.heap.GeneralPurposeAllocator(.{}) = .init;

AndrewCodeDev · September 3, 2024, 11:06pm

I’m curious about the rationale for making this a container-level variable as opposed to a function. Most things I come across that have init present it as a function.

AndrewCodeDev · September 3, 2024, 11:16pm

Ah, I looked at the PR. There are some other decl literals going in now for other data structures too (such as ArrayListUnmanaged):

/// An ArrayList containing no elements.
pub const empty: Self = .{
    .items = &.{},
    .capacity = 0,
};

More here: std: deprecate some incorrect default initializations · ziglang/zig@0b9fccf · GitHub

To your question, I prefer the second one. I think I’m actually digging this pattern the more I think about it - especially for objects that have more than one valid state. I often find myself checking the source code to see if brace initialization has the default values that I want.

ktz_alias · September 4, 2024, 1:59am

Officially, the unit tests of GeneralPurposeAllocator still use the default initialization pattern.
Deprecating default initialization means these unit tests will eventually have to be modified.

Until then, I’ll wait for the official way…

squeek502 · September 4, 2024, 7:45am

mnemnion · September 4, 2024, 1:49pm

It’s growing on me as well, but I don’t like the use of init. We’ve come to think of init as short for initialize, which is a verb, and hence, a function.

Now it also means initial, an adjective, eliding initial values, a noun phrase. I’d rather see these decls called initial, it’s three more letters and it won’t make me wonder if the variable assigned to is a function. Since the ‘new thing’ is called a decl literal, I assume that it could be a function.

I wonder why default wasn’t chosen actually, it seems like the most obvious choice, to the point where I assume it was considered and rejected.

var gpa: std.heap.GeneralPurposeAllocator(.{}) = .default;

Other than the whiff of repetition between .{} and .default (which can’t be avoided), this reads cleanest to me. Someone familiar with the language will know that .{} means “a struct with all default values”, but in status quo we use SomeType.init(params...) to set non-default values, so yeah.

.initial also works fine, although I must point out that any values at all are initial ones, but only a specific set of them are the default, and that set is what’s being called .init.

If the plan is to also get away from the SomeType.init pattern, then this might be fine actually. Although I don’t see an obviously better thing to call a function like that.

dimdin · September 4, 2024, 2:41pm

default is a keyword.

There is an advantage when using:

const name: T = .empty;

instead of

const name = T{};

empty is just a set of values for the fields of T; the fields themselves need no default values.

mnemnion · September 4, 2024, 2:48pm

Is it? I can’t find default listed as a keyword anywhere in the documentation. The only keyword starting with d is defer.

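const std = @import("std");
const expectEqual = std.testing.expectEqual;

const ValueKind = enum {
    default,
    special,
};

test "can we use default" {
    // `.default` is accepted as an ordinary enum tag, so it is not reserved.
    const is_default: ValueKind = .default;
    try expectEqual(ValueKind.default, is_default);
}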

I am unconvinced!

Sze · September 4, 2024, 3:06pm

I think using the same name for both an init value and an init function makes sense.

When you use a type you are unfamiliar with, you can just write:

var name: T = .init;
defer name.deinit();

And if it turns out that it is a function, the compiler will tell you in its error message what it expects to be called with; otherwise it just works. This seems helpful to me from a developer workflow perspective, instead of having to first look up whether it is a thing that provides an init function or an initial state.
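As a minimal sketch of that workflow, assuming a compiler with decl literal support: `Counter` and `Limiter` below are hypothetical types made up for illustration, not anything from std. One exposes `init` as a constant, the other as a function, and both are reached through the same `.init` spelling.

```zig
const std = @import("std");

// Hypothetical types for illustration; neither is part of std.

/// `init` is a constant: the one valid starting state.
const Counter = struct {
    count: u32,

    pub const init: Counter = .{ .count = 0 };
};

/// `init` is a function: the starting state depends on an argument.
const Limiter = struct {
    limit: u32,

    pub fn init(limit: u32) Limiter {
        return .{ .limit = limit };
    }
};

test "same name, value or function" {
    const counter: Counter = .init; // value declaration
    const limiter: Limiter = .init(8); // function declaration, called through a decl literal
    // Writing `const limiter: Limiter = .init;` should be rejected at compile
    // time, since a function is not a `Limiter`.

    try std.testing.expectEqual(@as(u32, 0), counter.count);
    try std.testing.expectEqual(@as(u32, 8), limiter.limit);
}
```

The call site only differs by the argument list, which is the point being made here: you try `.init` first and add arguments if the compiler asks for them.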

dimdin · September 4, 2024, 3:18pm

Oh, you are right!
default is a keyword in C and other languages, and that is difficult for me to forget.
Of course it is not a zig keyword; zig uses else where other languages use default.

I also don’t like .init in this context, and yes I also find .default better for the general case. For arrays, hash tables and other containers the use of .empty is perfect.

Sze · September 4, 2024, 3:42pm

I like .empty where it makes sense like for collections and .zero for a vector/coordinate, .identity for a matrix etc.
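A rough sketch of what such type-specific names could look like, assuming a compiler with decl literal support. `Vec2` and `Mat2` are hypothetical types invented for this example, and the names `zero` and `identity` are only the ones suggested above, not an established std convention.

```zig
const std = @import("std");

// Hypothetical math types for illustration; not part of std.
const Vec2 = struct {
    x: f32,
    y: f32,

    /// The origin.
    pub const zero: Vec2 = .{ .x = 0, .y = 0 };
};

const Mat2 = struct {
    m: [2][2]f32,

    /// The multiplicative identity.
    pub const identity: Mat2 = .{ .m = .{ .{ 1, 0 }, .{ 0, 1 } } };
};

test "descriptive names for starting values" {
    const origin: Vec2 = .zero;
    const transform: Mat2 = .identity;

    try std.testing.expectEqual(@as(f32, 0), origin.x);
    try std.testing.expectEqual(@as(f32, 1), transform.m[0][0]);
    try std.testing.expectEqual(@as(f32, 0), transform.m[0][1]);
}
```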

I guess it improves readability and Zig favors readability so maybe it is good to pick different names that make sense based on the data-structure.

I don’t really like .default because it is one specific value, you don’t really have the option of providing your own custom value. I am fine with it being called .init, or maybe even .value or .val (because GPA.value/Type.value makes sense to me).

Default implies to me, that there are other values that make sense there, but often that isn’t the case with this “specific value as start value” pattern.

Also, this pattern is being used specifically to avoid combinations of field default values that aren’t valid. Instead of having 3 fields with defaults, you define one specific value and use that as a start value, where the invariants are guaranteed to be correct.
This makes it impossible to provide only 2 field values and let the 3rd field be filled in with a default value, which might then violate an invariant because it doesn’t match what was provided in the other 2 fields.
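To make the invariant argument concrete, here is a small sketch, again assuming decl literal support. `WithDefaults` and `Window` are hypothetical types written for this example, with the made-up invariant that `len` must never exceed `buf.len`.

```zig
const std = @import("std");

// Hypothetical types for illustration. Invariant for both: len <= buf.len.

/// Per-field defaults: any subset of fields can be overridden independently.
const WithDefaults = struct {
    buf: []const u8 = &.{},
    len: usize = 0,
};

/// No per-field defaults: callers start from one complete, known-good value.
const Window = struct {
    buf: []const u8,
    len: usize,

    pub const empty: Window = .{ .buf = &.{}, .len = 0 };
};

test "partial initialization with defaults can break the invariant" {
    // Only `len` is given; `buf` silently falls back to its empty default.
    const bad: WithDefaults = .{ .len = 4 };
    try std.testing.expect(bad.len > bad.buf.len);
}

test "the pristine start value satisfies the invariant" {
    const w: Window = .empty;
    // `const w: Window = .{ .len = 4 };` should not compile here,
    // because `buf` has no default to fall back on.
    try std.testing.expect(w.len <= w.buf.len);
}
```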

So I think these pristine start values are a slightly different concept than default values, thus I don’t want to use default as name.

mnemnion · September 4, 2024, 7:09pm

That might be, but the motivating example isn’t one of those; the GPA can still be configured.

Consider the statement from the docs:

Default initialization of this struct is deprecated; use .init instead.

It makes just as much sense to say instead:

Default initialization of this struct is deprecated; use .default instead.

I don’t think that .default is meaningfully better or worse than .initial, but both are better than just .init. I don’t like any ambiguity about whether something is a function or not. This is not only ambiguous, it cuts directly against an established convention. Zig has several rules and conventions to maintain that distinction, like snek_case and camelCase, function definition having a special syntax, and so on.

My second job had an ironclad rule against ever using value or val; the closest thing to an exception was the k,v pair for a fully-generic dictionary traversal function. That gives me a knee-jerk resistance to using it here, and for the same reason: one is supposed to ask “ok, but what value” and then use that instead, in this case, “initial value” or “default value” depending. It is closer to justifiable, I’ll grant you, because the word “value” appears in the answer to that question, which is uncommon.

It wouldn’t make sense to have two words like this, and while I don’t entirely share your feelings about using .default in cases where there are no other valid options, I do take your point. So perhaps .initial is the better choice. But if we’re going to keep init(), initCapacity(), zeroInit(), and so on, I’d prefer to add the rest of the letters to the not-a-function value we’re discussing. Readability has always been a priority for the language, and this choice undermines that a bit.

castholm · September 4, 2024, 7:58pm

I too think the name init is a bad name for a variable representing an initial default state, when in almost all other cases in the standard library the identifier init is a function, but I also wouldn’t put too much emotional investment into the shape and naming of things in the standard library right now since there is a long-standing goal of auditing the standard library for consistency and correctness before 1.0.

It’s also possible that init might have been chosen on a whim without much deliberate thought because there just so happened to already be several private constants in that source file that also ended in _init and were used in a similar manner:

github.com/ziglang/zig/blob/f87dd43c1285d38d7a0f3092f6487bf1e1f4faa6/lib/std/heap/general_purpose_allocator.zig#L171-L196

total_requested_bytes: @TypeOf(total_requested_bytes_init) = total_requested_bytes_init,
requested_memory_limit: @TypeOf(requested_memory_limit_init) = requested_memory_limit_init,

mutex: @TypeOf(mutex_init) = mutex_init,

const Self = @This();

/// The initial state of a `GeneralPurposeAllocator`, containing no allocations and backed by the system page allocator.
pub const init: Self = .{
    .backing_allocator = std.heap.page_allocator,
    .buckets = [1]Buckets{.{}} ** small_bucket_count,
    .cur_buckets = [1]?*BucketHeader{null} ** small_bucket_count,
    .large_allocations = .{},
    .empty_buckets = if (config.retain_metadata) .{} else {},
    .bucket_node_pool = .init(std.heap.page_allocator),
};

const total_requested_bytes_init = if (config.enable_memory_limit) @as(usize, 0) else {};
const requested_memory_limit_init = if (config.enable_memory_limit) @as(usize, math.maxInt(usize)) else {};


mnemnion · September 5, 2024, 4:39pm

Indeed, although I would bet on the specific convention of functions with init in them sticking around.

If the .init pattern were as deeply rooted in Zig’s history as init-containing functions, I would consider much of this thread to fall under ‘bikeshedding’. I might indulge anyway, because I do love a good bikeshed, but feedback like this would be more productively given during the audit you refer to.

But the pattern is brand new, mere days old, so it seems like a good time to speak up. It’s a pretty low-stakes decision, what to call it, but I do think .init can be improved upon, and the code using it outside of the Zig repo rounds off to zero right now.

A mere whim, that I doubt, but “this is called _init, let’s make it public and call it init” is a plausible thought process, even a likely one. To someone who spends a lot of time inside the Zig codebase, this might be an obvious choice, and the effect it has for someone who uses the public API might not be so obvious.

On a scale from 1 to 10, where 1 is a pure aesthetic bikeshed, and 10 is something I would never not be mad about if it doesn’t change, this is somewhere between a 2 and a 3. Worth speaking up, but not worth dwelling on whatever final decision is made.

wrapitup · September 5, 2024, 5:23pm

I believe the name init is too common within the patterns and conventions of the Zig code that I’ve seen (and written myself) so far. I’m more familiar with the .init() pattern as a function that takes backing data structures (such as backing allocators, buffers, etc.) for initializing a struct. As a result, when trying to understand decl literals, the use of the same name (.init) was a hurdle in comprehension, because I was expecting a function that takes a value. Therefore, my first impression is that whatever term is used (.default, .initial, etc.) should make it clear when a type/struct is initialized from a function or an enum.