I’m attempting to optimize the following function.
```zig
const std = @import("std");
const assert = std.debug.assert;

/// Swap each byte in data based on the passed lookup table.
pub fn translate(data: []u8, lut: [256]u8) void {
    assert(data.len > 0);
    assert(data.len % 256 == 0);
    for (data) |*px| {
        px.* = lut[px.*];
    }
}
```
However, I'm having trouble figuring out how to index into the `lut` array efficiently. `@shuffle` can only be used when the mask is known at compile time. If it accepted a runtime mask, I'd expect something like this to be a lot faster than the implementation above:
```zig
/// Swap each byte in data based on the passed lookup table.
pub fn translate(data: []u8, lut: [256]u8) void {
    assert(data.len > 0);
    assert(data.len % 256 == 0);
    const lut_vec: @Vector(256, u8) = lut;
    var cursor = data[0..];
    while (true) {
        const chunk: @Vector(256, u8) = cursor[0..256].*;
        // this will fail with: `note: shuffle mask must be comptime-known`
        const result: [256]u8 = @shuffle(u8, lut_vec, undefined, chunk);
        cursor[0..256].* = result;
        cursor = cursor[256..];
        if (cursor.len == 0) break;
    }
}
```
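
For contrast, here is a minimal sketch of `@shuffle` with a mask that is known at compile time, which is the case the builtin does accept. The helper name `reverse4` and the input values are made up for illustration. It does not solve the lookup problem above, it only shows the constraint.

```zig
const std = @import("std");

/// Hypothetical helper for illustration: reverses the four lanes of `v`.
/// The mask is a compile-time constant, which is what `@shuffle` requires.
pub fn reverse4(v: @Vector(4, u8)) @Vector(4, u8) {
    // Each mask element selects a lane index from the first vector.
    const mask = @Vector(4, i32){ 3, 2, 1, 0 };
    // Passing `undefined` as the second vector shuffles within `v` only.
    return @shuffle(u8, v, undefined, mask);
}

test "shuffle with a comptime-known mask" {
    const input = @Vector(4, u8){ 10, 20, 30, 40 };
    const out: [4]u8 = reverse4(input);
    try std.testing.expectEqualSlices(u8, &[_]u8{ 40, 30, 20, 10 }, &out);
}
```

In `translate`, the mask would have to come from `chunk`, which is only known at runtime, so the same call is rejected by the compiler.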