SIMD, @splat, @reduce?

ericlang · April 25, 2025, 7:05pm

There aren't many SIMD examples yet, I believe, so once more I come back here…
I am looking for a fast way to compress an array of bytes, [15]u8, into a u15, where each non-zero value becomes a set bit.
I do not know much about SIMD, but my gut says it must be possible with that.
for example:
array = 12, 3, 0, 0, 0, 15, 3, 0, 0, 0, 0, 24, 12, 14, 0
must become (written from LSB to MSB)
u15 = 110001100001110

IntegratedQuantum · April 25, 2025, 7:18pm

You can use @as(@Vector(15, u8), @splat(0)) to get the value to compare against.
Using != gives you a vector of bools.
You can then use @bitCast to cast directly to a u15.
Note that this will give you the opposite order, so you might need byteSwap first.
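
A minimal sketch of the steps above, in case it helps. The function name `nonZeroMask` is made up for illustration, and which element ends up in which bit after the `@bitCast` may depend on the target, so treat the bit order as an assumption to check on your machine:

```zig
const std = @import("std");

/// Hypothetical helper: one bit per non-zero byte.
fn nonZeroMask(bytes: [15]u8) u15 {
    // Arrays coerce to vectors of the same length and element type.
    const v: @Vector(15, u8) = bytes;
    const zero: @Vector(15, u8) = @splat(0);
    // Comparing two vectors yields a @Vector(15, bool).
    const is_set = v != zero;
    // Reinterpret the 15 bools as a 15-bit integer.
    return @bitCast(is_set);
}

test "seven non-zero bytes give seven set bits" {
    const mask = nonZeroMask(.{ 12, 3, 0, 0, 0, 15, 3, 0, 0, 0, 0, 24, 12, 14, 0 });
    try std.testing.expect(@popCount(mask) == 7);
}
```

If the bits come out in the opposite order from what you need, wrapping the result in `@bitReverse` flips them.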

rpkak · April 25, 2025, 7:18pm

@bitReverse not @byteSwap afaik

dimdin · April 25, 2025, 7:21pm

Use @select to choose between a vector of zeroes and a vector with descending powers of 2.
Then @reduce with operator | (bitwise or).
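
Roughly like this, as a sketch. The name `nonZeroMaskSelect` is hypothetical, and it assumes ascending powers of 2 so that element 0 maps to bit 0, which matches the LSB-first example in the question; descending powers would give the reversed order:

```zig
const std = @import("std");

/// Hypothetical helper: build the mask with @select and @reduce.
fn nonZeroMaskSelect(bytes: [15]u8) u15 {
    const v: @Vector(15, u8) = bytes;
    const zero8: @Vector(15, u8) = @splat(0);
    const pred = v != zero8; // @Vector(15, bool)

    // Per-lane bit weights 1, 2, 4, ..., computed at compile time.
    const weights: @Vector(15, u15) = comptime blk: {
        var w: [15]u15 = undefined;
        for (&w, 0..) |*p, i| p.* = @as(u15, 1) << @intCast(i);
        break :blk w;
    };
    const zero15: @Vector(15, u15) = @splat(0);

    // Keep the weight where the byte is non-zero, 0 elsewhere, then OR the lanes.
    return @reduce(.Or, @select(u15, pred, weights, zero15));
}

test "example from the question" {
    const mask = nonZeroMaskSelect(.{ 12, 3, 0, 0, 0, 15, 3, 0, 0, 0, 0, 24, 12, 14, 0 });
    // Bits 0, 1, 5, 6, 11, 12 and 13 are set.
    try std.testing.expect(mask == 0b011100001100011);
}
```

This version does not rely on how a vector of bools is laid out in memory, since each lane's bit position is stated explicitly.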

rpkak · April 25, 2025, 7:23pm

One thing about the solution suggested by @IntegratedQuantum and me: it doesn't seem to be really defined whether this should work. See this issue:

github.com/ziglang/zig

proposal: do not pack SIMD vectors

opened 06:40PM - 22 Jan 24 UTC

Snektron

optimization proposal

Currently, `@Vector(N, T)` is packed together. For example:

```zig
export fn square() u32 {
    return @bitSizeOf(@Vector(11, u3));
}
```

[returns](https://godbolt.org/z/ov8fvWM9c) 33. This seems counter intuitive to me for several reasons:

- `[N]T` is not packed either.
- I see no good reason for vectors to be packed.
- The performance impact is highly questionable. For instance, consider

```zig
export fn square(num: u32) u32 {
    return @reduce(.Add, @as(@Vector(11, u3), @bitCast(@as(u33, num))));
}
```

This [generates](https://godbolt.org/z/z3KqvGabP) an excessive amount of bit shifts

In my opinion, vectors should essentially be a bag of scalars that you want to perform the same operation on, and not provide any layout guarantees at all. This would enable compilers to lower `@Vector(11, u3)` to `@Vector(11, u8)`, and omit these expensive shifts (and a whole lot of headaches).

An important edge case here are `@Vector(N, bool)` and `@Vector(N, u1)`. I suspect the reason why the above are packed at all, is to provide the guarantee that those are backed by an integer (and that all operations on it are bitwise). This makes sense to me, and I don't think that we should remove that. I see three main paths forward:

- Make only `@Vector(N, bool)` be guaranteed to be backed by `uN`, and allow `@bitcast`ing between these.
- Make only `@Vector(N, bool)` and `@Vector(N, u1)` be backed by `uN`, and allow `@bitcast`ing between these.
- Add no such guarantee at all (leave it up to the backend), and provide utilities (functions or built-ins) to between `uN` and vectors.

In all cases, I think we only need to remove the capability to `bitcast` between vectors and integers.

ericlang · April 25, 2025, 7:27pm

@rpkak wow. impressed.

Edit: (my) problem with reading some code (@select or other builtins) is the high level of abstraction. At a certain moment all typing seems to be lost and you have to dig into it. Or trial and error. Or get smarter…

vulpesx · April 25, 2025, 10:04pm

can you elaborate on how and where typing is lost?

ericlang · April 25, 2025, 11:04pm

For example: @splat(scalar anytype) anytype.
By reading the comment you can see what is going on, but often it is too little info. And no examples.
But basically we see a function where you can put anything in and anything can come out.
When you go and look online, read documentation etc. you can find out what is going on.

vulpesx · April 25, 2025, 11:17pm

You always need to provide a type to @splat via result location semantics, that’s pretty visible, so I wouldn’t describe that as losing typing.

I also wouldn’t describe anytype as losing typing either, because it still has a type, which still gets type checked.
But I do agree, anytype provides zero information about type requirements to the caller.

Sze · April 26, 2025, 12:48am

When in doubt use @compileLog(<variable>) to see what it is.

ericlang · April 26, 2025, 9:01am

true that is

ericlang · April 26, 2025, 9:44am

So in zig 0.14 we can use splat for arrays, I read.

Should we now prefer

var available: [32]u8 = @splat(0);

over

var available: [32]u8 = std.mem.zeroes([32]u8);

?

If so, I like the short look of it.
Is there a difference in what is done under the hood?

BTW: is the code of @splat somewhere visible?

vulpesx · April 27, 2025, 6:05am

@splat will use SIMD/AVX to do the operation if available; otherwise it's element-wise. std.mem.zeroes uses @splat for arrays and vectors, but has logic for other types.
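
A quick way to convince yourself the two spellings give the same array (a sketch, assuming Zig 0.14 or later, where `@splat` accepts an array result type):

```zig
const std = @import("std");

test "@splat(0) and std.mem.zeroes produce the same array" {
    const a: [32]u8 = @splat(0);
    const b: [32]u8 = std.mem.zeroes([32]u8);
    try std.testing.expectEqualSlices(u8, &a, &b);

    // Unlike std.mem.zeroes, @splat is not limited to zero.
    const c: [4]u8 = @splat(0xff);
    try std.testing.expect(c[0] == 0xff and c[3] == 0xff);
}
```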

Luke · December 19, 2025, 7:13am

You always need to provide a type to @splat

I don’t understand why this is so. If I do this:

const foo: @Vector(2, i16) = .{0, 0};
const result = foo + @splat(1);

I get:

prog.zig:6:23: error: @splat must have a known result type
 const result = foo + @splat(1);
                      ^~~~~~~~~
prog.zig:6:23: note: use @as to provide explicit result type

It seems to me zig has all the info to deal with this. There should be no ambiguity there.

vulpesx · December 19, 2025, 7:23am

it is ambiguous, it could be a 2 length vector of either i16 or any smaller integer signed or not.

But there are certainly cases where it is not ambiguous yet zig doesn’t infer it.

The reason is usually one of:

zig is pre 1.0, it is still wip
devs deliberately chose not to do this, fairly common,
this area is up for change anyway, no point adding to it until they work out exactly what they want
it’s an edge case they didn’t think of.
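
In the meantime, the workaround the compiler note points at looks something like this (a sketch; `addOneToEach` is a made-up name):

```zig
const std = @import("std");

/// Hypothetical helper: add 1 to every lane.
fn addOneToEach(foo: @Vector(2, i16)) @Vector(2, i16) {
    // `foo + @splat(1)` is rejected because @splat has no result type here;
    // wrapping it in @as supplies one.
    return foo + @as(@TypeOf(foo), @splat(1));
}

test "explicit result type for @splat" {
    const result = addOneToEach(.{ 0, 0 });
    try std.testing.expect(result[0] == 1 and result[1] == 1);
}
```

Declaring the splatted vector as its own typed constant first would work as well.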

Luke · December 19, 2025, 9:15am

Thanks for your answer.

Can you clarify why, if you add 1 to an i16, 1 would be anything else but an i16? It’s not a rhetorical question; these numerical operations are full of gotchas and there is probably something I am missing here.

vulpesx · December 19, 2025, 9:31am

x + 1 does not make 1 the type of x, 1 is still a comptime_int in zig land.

But that still works because zig’s semantics explicitly allow you to do arithmetic with runtime int types and comptime_ints intermixed in most cases, in some you still have to cast/coerce it into a runtime type.
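
A small sketch of the scalar case, with a hypothetical `increment` function, checking the types at compile time:

```zig
const std = @import("std");

/// Hypothetical helper showing a runtime i16 mixed with a comptime_int.
fn increment(x: i16) i16 {
    // The literal on its own is a comptime_int, not an i16.
    comptime std.debug.assert(@TypeOf(1) == comptime_int);
    const y = x + 1;
    // The result of the mixed arithmetic takes the runtime type.
    comptime std.debug.assert(@TypeOf(y) == i16);
    return y;
}

test "i16 plus comptime_int" {
    try std.testing.expect(increment(41) == 42);
}
```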

I cannot tell you the reason it doesn’t work with vectors, only the likely reasons, which I have already listed previously.

Joen-UnLogick · December 19, 2025, 4:06pm

Actually, I find your code ambiguous. You want to fill out what length with ones? Just one of them, or perhaps 42 of them? It’s the result type, specifically its length, that cannot be inferred.

Luke · December 19, 2025, 8:20pm

Zig documentation states:

Vectors generally support the same builtin operators as their underlying base types.

All other operations are performed element-wise, and return a vector of the same length as the input vectors.

Moreover:

[@splat] Produces an array or vector where each element is the value scalar. The return type and thus the length of the vector is inferred.

I have a vector of 2 elements. I use the + operation on this vector with another vector, the result of @splat. I am not sure exactly why @splat would produce a vector of 42 elements.

Maybe I misunderstand your remark.

Joen-UnLogick · December 20, 2025, 11:31am

const foo: @Vector(2, i16) = .{0, 0};
const result = foo + @splat(1);

result is an array of i16 but you never provide a length for result, hence splat cannot know how many 1s it has to fill out.

const result : [16]i16 = foo + @splat(1);

In this case it’s clear splat has to deliver the missing 14 elements.