How are Optional Slices Optimized?

amesaine · September 15, 2024, 1:45pm

optional type optimization

opened 03:25AM - 02 Feb 16 UTC

optimization proposal accepted

Track which values are used when data sizes are bigger than necessary and use th…ose states for maybe types. For example, `?bool` could use value `2` of the bool as the null value, and `??bool` can use value `3` for the null value, and so on. If we have spaces in structs for alignment purposes, that space could be used for the null value, or if any of the fields have null space available, such as a struct with a bool field in it somewhere.

I know that @sizeOf(*T) == @sizeOf(?*T) (although I don’t understand why). With slices, there’s no padding since ptr and len are both usize right? Where would the nullabity be stored instead?

comptime {
    assert(@sizeOf(Node) == 16); // ReleaseFast/Safe
}
const Foo = struct {
    slice: ?[]const u8,
};

tensorush · September 15, 2024, 1:50pm

Langref to the rescue: null value is guaranteed to be address 0.

amesaine · September 15, 2024, 1:55pm

Ah… How does that apply to slices though?

mnemnion · September 15, 2024, 2:12pm

.ptr is a pointer, so a value of 0 is null.

dimdin · September 15, 2024, 2:15pm

Slices are also optimized.

const std = @import("std");

test {
    const T = []const u8;
    try std.testing.expectEqual(@sizeOf(T), @sizeOf(?T));
}

amesaine · September 15, 2024, 4:21pm

I understand now, thank you. Slices can just check .ptr for 0. If .ptr is null, then the slice itself is null. Thinking of slices like normal structs tripped me up.

mnemnion · September 15, 2024, 5:20pm

They are and they aren’t. The usual term for this kind of thing is a “fat pointer”, there are a few kinds of those but a slice is definitely one.

But also, a slice is basically this:

fn Slice(T: type) type {
    return struct {
       .ptr: *T,
       .len: usize,
    };
}

fn MaybeSlice(T: type) type {
    return struct {
        .ptr: ?*T,
        .len: usize,
    };
}

With some essential syntax sugar to make it easy to use.

Currently Zig special-cases what Rust calls “niche optimization” for pointers only, but there are plans afoot to expand this. For instance, right now, a ?u21 has @sizeOf 8, but clearly it could be 4, because there are unused bits in the backing u32.

Another one would generalize what the MaybeSlice does, and say that, if you have a struct type with an unused niche in its value, you can use that field for an optional. So for something like this:

const SomeStruct = struct {
    fee: u64,
    fie: u32,
    foe: u21,
};

Since there’s room in the u21 for bits which aren’t legal for the type, those could be used to tag a ?SomeStruct null type. Making the special niche value std.math.maxInt(u32) would make testing for it as fast, or nearly so, as testing for 0.

Since it doesn’t affect the correctness of programs, there’s no special hurry to add things like this. But it would make a meaningful difference in data size, particularly when arrays of structs start to enter the picture, so it will be nice to have when we do.

amesaine · September 16, 2024, 6:46pm

Could it be guaranteed that if a type has padding, it will be optimized? Or is that to be determined?

mnemnion · September 16, 2024, 7:01pm

That’s a slightly stronger rule, which might have performance consequences. Padding can be any value, which means that initialization, copying, and so on, can ignore pad bytes completely. If a pad byte were used for a niche optimization, then there would be one value of that byte which is load-bearing, and any operation on that struct would have to set the byte to ensure that it didn’t take that value.

By contrast, if a field has an illegal value, like a *T can never be 0, then the type system is already taking care to ensure that said field doesn’t have that value. So using it to tag the struct as optional has no consequences on not-optional structs of that type: the niche value is only checked for when unwrapping an optional, which is where it pays off.

My best guess is that using a pad byte would actually be zero-cost in practice, because initialization and copying are efficiently done by moving word-sized pieces of the struct with each instruction, so it would just come along for the ride. But that’s a less obvious conclusion than using illegal values for niche optimization, because it does constrain the compiler’s options when generating code, and that could have performance implications, but neither of these is true for using an illegal value as the null niche.

amesaine · September 16, 2024, 7:09pm

I understood that completely, thank you!