There are not many SIMD examples yet, I believe, so once more I come back here…
I am looking for a fast way to compress an array of bytes, [15]u8, to a u15, where each non-zero value becomes a set bit.
I do not know much about SIMD, but my gut says it must be possible with that.
for example:
array = 12, 3, 0, 0, 0, 15, 3, 0, 0, 0, 0, 24, 12, 14, 0
must become (written from LSB to MSB)
u15 = 110001100001110
You can use @as(@Vector(15, u8), @splat(0)) to get the value to compare against. Using != gives you a vector of bools. You can then use @bitCast to cast directly to a u15.
Note that this will give you the opposite order, so you might need byteSwap first.
@bitReverse, not @byteSwap, afaik.
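The compare-and-cast idea from the posts above can be sketched roughly like this. The function name packNonZero is made up for illustration, and the sketch assumes that @bitCast from a bool vector to an integer is accepted by your compiler and target. Which end element 0 lands on is exactly the open question in this thread, so the code makes no promise about it; if the order comes out reversed, @bitReverse on the result flips it.

```zig
const std = @import("std");

/// Hypothetical helper (name not from the thread): packs 15 bytes into a u15,
/// one bit per element, set when the element is non-zero.
fn packNonZero(bytes: [15]u8) u15 {
    // An array coerces to a vector of the same length and element type.
    const v: @Vector(15, u8) = bytes;
    const zero: @Vector(15, u8) = @splat(0);
    // Element-wise comparison yields a vector of bools.
    const mask: @Vector(15, bool) = v != zero;
    // Assumption: the bool vector can be reinterpreted as a 15-bit integer.
    // The resulting bit order depends on the vector's in-memory layout.
    return @bitCast(mask);
}

pub fn main() void {
    const input = [15]u8{ 12, 3, 0, 0, 0, 15, 3, 0, 0, 0, 0, 24, 12, 14, 0 };
    // Prints the result MSB first, padded to 15 binary digits.
    std.debug.print("{b:0>15}\n", .{packNonZero(input)});
}
```

Whatever the order, seven bits should be set for the example input, since seven of the fifteen bytes are non-zero.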
Use @select to choose between a vector of zeroes and a vector with descending powers of 2. Then apply @reduce with the .Or operator (bitwise or).
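Here is a minimal sketch of the @select plus @reduce variant. The names bitWeights and packWithSelect are hypothetical. Note one deliberate difference from the post above: the weights here are ascending (1 << i), so that element 0 ends up in the least significant bit as the original question asks; a descending table would give the mirrored result. Since the bit positions are chosen explicitly, this version does not depend on how a bool vector is laid out.

```zig
const std = @import("std");

/// Hypothetical helper: builds the table of bit weights, weights[i] == 1 << i.
fn bitWeights() [15]u15 {
    var w: [15]u15 = undefined;
    for (&w, 0..) |*x, i| x.* = @as(u15, 1) << @intCast(i);
    return w;
}

/// Hypothetical helper: bit i of the result is set when bytes[i] != 0.
fn packWithSelect(bytes: [15]u8) u15 {
    const v: @Vector(15, u8) = bytes;
    const zero: @Vector(15, u8) = @splat(0);
    const nonzero = v != zero; // @Vector(15, bool)

    // Computed once at compile time; the array coerces to a vector.
    const weights: @Vector(15, u15) = comptime bitWeights();
    const none: @Vector(15, u15) = @splat(0);

    // Keep the weight where the byte is non-zero, otherwise take 0.
    const picked = @select(u15, nonzero, weights, none);
    // OR all lanes together into a single integer.
    return @reduce(.Or, picked);
}

pub fn main() void {
    const input = [15]u8{ 12, 3, 0, 0, 0, 15, 3, 0, 0, 0, 0, 24, 12, 14, 0 };
    // The question's string "110001100001110" is written LSB first;
    // the same value written MSB first is 0b011100001100011.
    std.debug.assert(packWithSelect(input) == 0b011100001100011);
}
```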
One thing about the solution suggested by @IntegratedQuantum and me: it does not seem to be really defined whether this should work. See this issue-comment:
@rpkak wow. impressed.
Edit: (my) problem with reading some code (@select or other builtins) is the high level of abstraction. At a certain moment all typing seems to be lost and you have to dig into it. Or trial and error. Or get smarter…
can you elaborate on how and where typing is lost?
For example: @splat(scalar: anytype) anytype.
By reading the comment you can see what is going on, but often it is too little info. And no examples.
But basically we see a function where you can put anything in and anything can come out.
When you go and look online, read documentation etc. you can find out what is going on.
You always need to provide a type to @splat via result location semantics. That’s pretty visible, so I wouldn’t describe that as losing typing.
I wouldn’t describe anytype as losing typing either, because it still has a type, which still gets type checked.
But I do agree, anytype provides zero information about type requirements to the caller.
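To make the point about result locations concrete, here is a small sketch of three places where @splat can pick up its type. The names a, b and ones are just illustrative.

```zig
const std = @import("std");

// The declared type of the constant tells @splat what to produce.
const a: @Vector(4, u8) = @splat(7);

// @as supplies the result type when there is no annotation to infer from.
const b = @as(@Vector(4, f32), @splat(1.5));

// A function's return type acts as the result type too.
fn ones() @Vector(8, u16) {
    return @splat(1);
}

pub fn main() void {
    // @TypeOf and @typeName show what the compiler actually inferred.
    std.debug.print("{s}\n", .{@typeName(@TypeOf(a))});
    std.debug.print("{s}\n", .{@typeName(@TypeOf(b))});
    std.debug.print("{s}\n", .{@typeName(@TypeOf(ones()))});
}
```

Without any of these hints (for example a bare const x = @splat(0);) the compiler has nothing to infer the type from and rejects the code.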
When in doubt use @compileLog(<variable>) to see what it is.
true that is
So in Zig 0.14 we can use @splat for arrays, I read.
Should we now prefer
var available: [32]u8 = @splat(0);
over
var available: [32]u8 = std.mem.zeroes([32]u8);
?
If so, I like the short look of it.
Is there a difference in what is done under the hood?
BTW: is the code of @splat somewhere visible?
@splat will use SIMD/AVX to do the operation if available; otherwise it's elementwise. std.mem.zeroes uses @splat for arrays and vectors, but has logic for other types.
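A quick side-by-side sketch, assuming Zig 0.14 or later (where @splat accepts an array result type). The helper names viaSplat, viaZeroes and the struct Pair are made up for illustration.

```zig
const std = @import("std");

// Hypothetical struct, only here to show std.mem.zeroes on a non-array type.
const Pair = struct { x: i32, y: [4]u8 };

/// Requires Zig 0.14+: @splat with an array result type.
fn viaSplat() [32]u8 {
    return @splat(0);
}

/// Works for arrays and for other types as well.
fn viaZeroes() [32]u8 {
    return std.mem.zeroes([32]u8);
}

pub fn main() void {
    const a = viaSplat();
    const b = viaZeroes();
    // Both spellings give the same 32 zero bytes.
    std.debug.assert(std.mem.eql(u8, &a, &b));

    // std.mem.zeroes also handles types @splat does not, such as structs.
    const p = std.mem.zeroes(Pair);
    std.debug.assert(p.x == 0 and p.y[3] == 0);
}
```

For a plain array the two are interchangeable in effect, so the shorter @splat(0) form looks like a matter of taste; std.mem.zeroes stays useful when the type is not known to be an array or vector.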