If you had one wish for zig, what would it be?

Working stable no-fuss LSP

1 Like

That’s not what I am talking about, I am fine with byteSwap.

I am talking about a more subtle issue: the fragmentOffset field straddles two consecutive bytes in a way that can’t be modeled with packed structs.

See the diagram in the wiki and especially:

for the diagram and discussion, the most significant bits are considered to come first (MSB 0 bit numbering). The most significant bit is numbered 0, so the version field is actually found in the four most significant bits of the first byte, for example.

Contrast with Zig reference:

Each field of a packed struct is interpreted as a logical sequence of bits, arranged from least to most significant.
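To make the two orderings concrete, here is a minimal sketch (the type name `Nibbles` is made up for illustration) showing that the first field of a packed struct occupies the least significant bits, which is the opposite of the MSB 0 numbering used in the RFC diagrams:

```zig
const std = @import("std");

const Nibbles = packed struct(u8) {
    low: u4, // bits 0..3 (least significant)
    high: u4, // bits 4..7 (most significant)
};

test "first field lands in the least significant bits" {
    const n: Nibbles = @bitCast(@as(u8, 0x4F));
    try std.testing.expectEqual(@as(u4, 0xF), n.low);
    try std.testing.expectEqual(@as(u4, 0x4), n.high);
}
```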

If a field is multiple bytes, or is entirely inside a byte, this is modelable as a packed struct plus byte swaps.

But if a field straddles a byte boundary, then bit order becomes a problem.

EDIT: or maybe I am wrong, let me take one more look here!

1 Like

If it is not representable then the protocol is neither little-endian nor big-endian, and the protocol designers should be punished.

Here is the full test snippet, if you have that qemu binfmt auto-runner thingy installed:

$ zig test test9.zig -target powerpc64-linux-musl
[default] (err): yo im big endian
All 1 tests passed.
1 errors were logged.
error: the following test command failed with exit code 1:
.zig-cache/o/58f640ee2c2313cf06fca888f0bcb581/test --seed=0x56c1a473

$ zig test test9.zig -target x86_64-linux-musl
[default] (err): yo im little endian
All 1 tests passed.
1 errors were logged.
error: the following test command failed with exit code 1:
.zig-cache/o/7e97b8cbb2a898f739d574a525794205/test --seed=0x26f83000
const std = @import("std");
const native_endian = @import("builtin").target.cpu.arch.endian();

const IPv4Header = packed struct(u16) {
    fragment_offset: u13,
    more_fragments: bool,
    dont_fragment: bool,
    reserved: u1 = 0,

    fn deserialize(raw: [2]u8) IPv4Header {
        return switch (native_endian) {
            .big => blk: {
                std.log.err("yo im big endian", .{});
                break :blk @bitCast(raw);
            },
            .little => blk: {
                std.log.err("yo im little endian", .{});
                break :blk @bitCast(@byteSwap(@as(u16, @bitCast(raw))));
            },
        };
    }
};

test {
    const bytes_from_the_network: [2]u8 = .{

        // reserved = 0
        // dont_fragment = false
        // more_fragments = true
        // high 5 bits of fragment_offset (+256)
        0b0010_0001,
        // low 8 bits of fragment_offset (+255)
        0b1111_1111,
    };

    const expected: IPv4Header = .{
        .fragment_offset = 0b1_1111_1111,
        .more_fragments = true,
        .dont_fragment = false,
    };

    try std.testing.expectEqualDeep(
        expected,
        IPv4Header.deserialize(bytes_from_the_network),
    );
}
5 Likes

Yeah, I see it now, genius, thanks!

1 Like

damn dude, someone’s been busy :laughing:

1 Like

The planned hot patching will have the build system coordinate with the executable over a protocol, allowing the executable to be notified when a patch is available and to choose when to apply it.

Meaning your code could “simply” save the global state (which could just be in local variables), then patch, then apply the old state over the new state.

It is also possible Zig could save the old global state for you, saving you some juggling.

1 Like

Cool. Ideally Zig should take care of restoring the state for me. Doing it manually might be “fine”, but then it needs to be bulletproof. It wouldn’t be too dissimilar to some existing libraries that do this and put it on the user to reinitialize globals. The only problem was that last time I tried, there was a codegen bug in the self-hosted compiler that made it crash instead. I think that was fixed in the 0.16 dev cycle though, so possibly/probably this already works with a library right now.

I’m a bit skeptical about hot reloading. Wouldn’t it break with pointers to dynamically allocated storage whose layout no longer matches the new build?

2 Likes

Hard to say, but probably one of these.

Concepts

Probably something like C++’s concepts, for better generic programming.

I think you could implement that mostly as a library, with a LOT of pain and bad error messages, but essentially it would be something where you can say that certain expressions should compile for a given type. Here is an example syntax:

// `T` would always be a variable of type `type`
const Byteable = concept(T) {
    // @Some represents every possible value a `T` could be
    _ = @Some(T) + 5;
    const x: u8 = @Some(T);
};

For a concept to be fulfilled, every expression in the concept would need to compile (and can be decided for every expression individually).

You would then use a concept in a generic function like this:

fn pad(T: type, buffer: []u8, data: []const Byteable(T)) void {
    // TODO
}

pad(u7, buffer, data); // fine
const Point = struct{x: f32, y: f32, z: f32};
pad(Point, buffer, data); // compile error: Point does not fulfill the concept Byteable because of expressions "_ = @Some(T) + 5;" and "const x: u8 = @Some(T);"

Obviously this is just one way this could look, but it would make compile-time programming a lot more ergonomic.
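As a rough illustration, the library version of this is already expressible today with a comptime block, though the error points at the failing expression rather than naming the concept. The helper name `assertByteable` is made up for this sketch, and it assumes `T` is a numeric type so we can conjure a sample value (the real library would need something cleverer in place of `@Some`):

```zig
const std = @import("std");

// made-up helper: fails compilation if the "byteable" expressions
// don't compile for T
fn assertByteable(comptime T: type) void {
    comptime {
        const sample: T = 0; // stands in for @Some(T); assumes T has a zero value
        _ = sample + 5;
        const x: u8 = sample;
        _ = x;
    }
}

test "u7 fulfills the sketchy concept" {
    assertByteable(u7); // compiles
    // assertByteable(struct { x: f32 }) would fail at `const sample: T = 0`
}
```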

unsized SIMD type

Thanks to the existence of variable-length vector extensions on various architectures, making the size of a @Vector compile-time known doesn’t really scale.

No matter which size you choose, on some targets you will need a loop anyway (because technically the actual size could be just as big as a regular integer scalar; except if you know it will only run on one specific type of CPU), while on others you will leave performance on the table.

If you look at the “recommended” size for a @Vector on RISC-V, it is chosen depending on the target platform, which essentially means that the Zig team wants to have a big list of possibly every RISC-V CPU in there: Zig Documentation

A vector register can be as small as 32 bits and as big as 65536 bits (which is btw the maximum you can do under the RVV extension). While not necessarily done in practice, there are some implementations as small as 128 bits, but also some as big as 2048 bits (e.g. this one). And since you can combine multiple vector registers into one logical register, the effective width can be even bigger.

Zig’s @Vector just doesn’t map at all to RISC-V’s RVV extension, and ARM’s SVE extension is a similar deal, as I understand it.

I want to essentially write this:

fn add(T: type, a: []T, b: []const T) void {
    var vec_a: @Simd(T) = a;
    const vec_b: @Simd(T) = b;
    vec_a += vec_b;
    // the following means to load the values back into the slice at the right positions
    // this could also be achieved by saying that you need to make `vec_a` to be type `*@Simd`
    a = vec_a;
}
// same meaning but without SIMD
fn add_naive(T: type, a: []T, b: []const T) void {
    for (a, b) |*ea, eb| {
        ea.* += eb;
    }
}

.. and … to ..< and ..=

Instead of .. and ... being only usable in one place and not the other, change .. to ..< and ... to ..= (yes, the syntax is essentially stolen from Odin here) and make them both usable in all places where the other is usable too.

This would even help with the current for loop annoyance where 0..max always generates usize values. And it would help in case ranged integers end up being a good fit for Zig after trying them out (considering how low-level Zig is and where it wants to operate, this could happen).
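For reference, a minimal sketch of the annoyance mentioned above: the capture of a 0..max range is always a usize today, so narrowing to a smaller integer needs an explicit cast:

```zig
const std = @import("std");

test "range captures are usize" {
    var sum: u8 = 0;
    for (0..10) |i| {
        // `i` is a usize here, so adding it to a u8 needs @intCast
        sum += @intCast(i);
    }
    try std.testing.expectEqual(@as(u8, 45), sum);
}
```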

So, which one?

I guess I would probably ask the genie for the SIMD change. The other two are more of an annoyance or bad ergonomics, while this one is a bad mapping.

1 Like

You have control over when hot patching happens, it is on you to convert types.

If you make a change that is too much to make work with hot patching then you always have the option of restarting the program.

If you are concerned about Zig’s undefined layout for many types, that isn’t a problem, as Zig has deterministic and hermetic builds; so a type’s layout should only change if you change a field’s type, alignment, or another property that affects layout.

In C#, any change to structure, like changing a method signature or even just changing the initial value of a constant, is what’s known as a “rude edit”. The hot reloading service then asks you if you want to restart (and if you want to auto-restart every time it happens), since it cannot hot reload anymore. That’s probably harder to do in a compiled language though. I wonder how Live++ handles this situation.

2 Likes

I guess the question is if this mechanism would notice it for you, or if it would lead to a crash/UB instead?

That kind of thing certainly should be possible, assuming it’s an executable with no dynamic dependencies (self-contained, so the compiler has all the info).

That would be nice, and I want such a thing, but not strictly necessary.

The use case for hot patching, and it being an extension of the incremental server (--watch -fincremental) providing blazing fast™ compilation speed, means that changes will be small;
so you should have a clear idea of what you did. Not to mention I think it would be good practice to make changes work with patching as you go, if you are using hot patching anyway.

I am not sure if a convenience feature like this will exist at the start/early, but I would expect it eventually.

We might be mixing hot patching and hot reloading slightly here.

This, or some type of contract to ensure that a passed type must contain specific function signatures. Whether you call it an interface or something else doesn’t matter.

Wide usage in real systems

1 Like

You and I think about Zig vectors very differently. If I understand you correctly you see them as a way to define a type that matches the CPU vector register size. That way you can write low-level operations and be sure they can map to singular CPU instructions.

For me they are a way to define a data type with multiple elements. That might be 4 float elements for 3D geometry, 12 floats for spherical harmonics, or 4k bytes for the input of a small neural-net. I don’t want to write the loop for a 4k vector added with another one, adjusting my granularity depending on my target. I want to write v3 = v2 + v1 and have the compiler work out the optimal way. If the CPU target vector register width is visible in my Zig code, the abstraction is broken.
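For example, the “data type with multiple elements” view is just this in today’s Zig, with the compiler choosing how to lower the addition (the 4-element size here is arbitrary):

```zig
const std = @import("std");

test "element-wise add without writing a loop" {
    const V4 = @Vector(4, f32);
    const v1: V4 = .{ 1.0, 2.0, 3.0, 4.0 };
    const v2: V4 = .{ 10.0, 20.0, 30.0, 40.0 };
    // the compiler decides how to map this onto the target's registers
    const v3 = v2 + v1;
    try std.testing.expectEqual(@as(f32, 33.0), v3[2]);
}
```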

In fact, for RISC-V Vector you can have the compiler output “strip-mining” loops using the VL register and the vsetvli instruction. The code adjusts to the vector register width and is target independent. In this way I think Zig maps really nicely onto RISC-V Vector: register-size-independent Zig generating register-size-independent machine code. 32 bits or 64K bits, both will be used effectively.

Maybe I’ve misinterpreted how @Vector is intended to be used, but describing the higher-level operation surely gives the compiler more information to work with.

For me it would be improvements to casting with vectors. I often have to write code like

const posOffset = physics.velocity * @as(@Vector(3, f64), @splat(deltaT));

or

const relativeChunkPos: @Vector(3, f32) = @floatCast((@as(@Vector(3, f32), @floatFromInt(chunkpos.position)) * chunkSizeVec) - playerPos);

Which I think could be greatly simplified if @splat were able to infer the type its result is being operated with and coerce to it. For example, it could infer that its type needs to be @Vector(3, f64), since it is being multiplied by a vector of that type. This would let you write

const posOffset = physics.velocity * @splat(deltaT);

Even if that is too complex, I think inferring the type based on the type of the variable would be much better than what we currently have. You could write

const posOffset: @Vector(3, f64) = physics.velocity * @splat(deltaT);

And it would infer that @splat(deltaT) should be a @Vector(3, f64) and coerce it to that.
This would also let the second example become

const relativeChunkPos: @Vector(3, f32) = @floatCast(@floatFromInt(chunkpos.position) * chunkSizeVec - playerPos);

One more improvement would be that the following code should work:

const chunk_blockpos: @Vector(3, f64) = whatever;
const absolute_position: [3]f32 = @floatCast(chunk_blockpos);

Instead you currently have to write

const absolute_position: [3]f32 = @as(@Vector(3, f32), @floatCast(chunk_blockpos));

Since it does not infer the result type of the @floatCast, even though vectors coerce to arrays of the same element type.

5 Likes

I think it would be super cool to have an opt-in Zig-specific ABI/layout for Zig types and functions.