Beyond just performance, I wonder what the difference is between arrays and vectors, and how each of them relates to slices. When should I use arrays (and/or slices), and when vectors?
I ran some experiments. The following program compiles, but the commented-out passages do not (the resulting error messages are given in the comments):
const std = @import("std");

pub fn takesArray(array: [3]i32) void {
    for (array) |x| {
        std.debug.print("{}\n", .{x});
    }
}

pub fn takesSlice(slice: []const i32) void {
    for (slice) |x| {
        std.debug.print("{}\n", .{x});
    }
}

pub fn takesVector(vector: @Vector(3, i32)) void {
    // The following gives:
    // => error: type '@Vector(3, i32)' does not support field access
    // ---
    // for (vector) |x| {
    //     std.debug.print("{}\n", .{x});
    // }
    // ---
    // Instead I must use:
    for (0..3) |i| {
        std.debug.print("{}\n", .{vector[i]});
    }
}

pub fn main() void {
    const array: [3]i32 = .{ 1, 2, 3 };
    const slice: []const i32 = &array;
    const vector: @Vector(3, i32) = .{ 4, 5, 6 };

    takesArray(array);
    // takesArray(slice.*);
    // => error: index syntax required for slice type '[]const i32'
    takesArray(slice[0..].*);
    takesArray(vector); // surprise, this works

    takesSlice(&array);
    // takesSlice(&vector);
    // => error: expected type '[]const i32', found '*const @Vector(3, i32)'
    takesSlice(&@as([3]i32, vector)); // somewhat ugly?

    takesVector(array); // this works too
    takesVector(slice[0..].*); // this works like above for `takesArray`
    takesVector(vector);
}
Some takeaways for me (so far):
I cannot iterate over a vector using the for (some_vector) syntax; for vectors I must use indices. (Question: why?)
Arrays and vectors coerce into each other automatically, but
Converting a vector into a slice seems to be much more complicated (i.e. needing explicit coercion with @as and subsequent reference with &). Converting an array into a slice is easy in contrast, and just requires a single &.
Yet converting a slice into either an array or a vector is just as easy. (Question: why is the opposite direction, i.e. from vector to slice, more difficult?)
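One way to make the vector-to-slice direction a little less ugly is to go through a named array first instead of an inline @as (a minimal sketch, using only coercions already shown above):

```zig
const std = @import("std");

pub fn main() void {
    const vector: @Vector(3, i32) = .{ 4, 5, 6 };
    // A vector coerces to an array of the same length and element type;
    // the address of that array then coerces to a slice.
    const as_array: [3]i32 = vector;
    const slice: []const i32 = &as_array;
    for (slice) |x| std.debug.print("{}\n", .{x});
}
```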
Further questions:
When to use arrays, and when vectors?
Is the choice just a matter of efficiency, or should I also take other considerations into account? (especially given the different behavior of both type classes, as demonstrated above)
Vectors are mainly intended to be used for SIMD operations. So unless you want to use SIMD operations, such as element-wise addition, you should just use an array.
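For instance, element-wise addition on a @Vector just uses the ordinary operators (a minimal sketch):

```zig
const std = @import("std");

pub fn main() void {
    const a: @Vector(4, i32) = .{ 1, 2, 3, 4 };
    const b: @Vector(4, i32) = .{ 10, 20, 30, 40 };
    // Arithmetic operators work element-wise on vectors and can compile
    // down to SIMD instructions where the target supports them.
    const total = a + b;
    std.debug.print("{any}\n", .{total});
}
```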
Say (for example), I develop an abstract math library that allows calculation with (mathematical) vectors, matrices, etc. Should I then use (Zig) vectors?
For vectors, probably yes. But how would matrices fit into this? There are multidimensional arrays, but no multidimensional vectors. So I guess I should rather use arrays for matrices? And then also use arrays for (mathematical) vectors, to be consistent? Or not?
It depends on what you want to do, looking at the example you seem to want to use 3d vectors.
3d vectors can be a bit problematic: the extra padding from the 16-byte alignment (which is backend-dependent too, so you might possibly get even more than that) can be annoying.
But the big advantage of vector types in my opinion, is the fact that you can use operators on them. So if you want to be able to use operators then it is a must-have.
As for matrices, you can just store them as an array of vectors, then you can still take advantage of vector operations.
E.g. a matrix vector product could look like this
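A sketch of what that could look like, assuming a 3×3 matrix stored as an array of three column vectors (the name matVecMul and the column-major layout are illustrative choices, not a fixed convention):

```zig
const std = @import("std");

const Vec3 = @Vector(3, f32);
const Mat3 = [3]Vec3; // three column vectors

// Multiply a 3x3 matrix (stored as columns) by a vector:
// the result is the sum of each column scaled by the matching
// component of v, so every step is a vector operation.
fn matVecMul(m: Mat3, v: Vec3) Vec3 {
    var result: Vec3 = @splat(0);
    for (0..3) |j| {
        result += m[j] * @as(Vec3, @splat(v[j]));
    }
    return result;
}

pub fn main() void {
    const identity: Mat3 = .{
        .{ 1, 0, 0 },
        .{ 0, 1, 0 },
        .{ 0, 0, 1 },
    };
    const v: Vec3 = .{ 1, 2, 3 };
    std.debug.print("{any}\n", .{matVecMul(identity, v)});
}
```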
It depends on how large the (math) vectors and matrices are: for small sizes SIMD vectors will be slower, for large sizes they will be faster.
SIMD vectors have a higher alignment than most small types, so they will take up more space.
Most uses of SIMD vectors create them on demand for calculations, when the data size is large enough.
Another reason SIMD vectors are created on demand is that they are most efficient when they match the vector size the CPU can process in one operation; otherwise the work has to be split across multiple operations, likely with the last one using a partially filled vector, which may be slower than a plain loop. So code using vectors usually splits the data into chunks and handles the last stretch with scalar loops.
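A sketch of that chunk-plus-tail pattern for summing a runtime-length slice (assuming a recent Zig where the helper is called std.simd.suggestVectorLength; the function name sum is illustrative):

```zig
const std = @import("std");

// Sum a slice in SIMD-sized chunks, finishing the remainder
// with a plain scalar loop.
fn sum(data: []const f32) f32 {
    const lane_count = std.simd.suggestVectorLength(f32) orelse 4;
    const V = @Vector(lane_count, f32);
    var acc: V = @splat(0);
    var i: usize = 0;
    // Process full chunks as vector additions.
    while (i + lane_count <= data.len) : (i += lane_count) {
        acc += @as(V, data[i..][0..lane_count].*);
    }
    // Collapse the vector accumulator, then handle the tail scalarly.
    var total = @reduce(.Add, acc);
    while (i < data.len) : (i += 1) total += data[i];
    return total;
}

pub fn main() void {
    const data = [_]f32{ 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    std.debug.print("{d}\n", .{sum(&data)});
}
```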
Well, the example I was thinking of (for the sake of this discussion) was an abstract library that allows vector / matrix calculus. Edit: oops, sorry, I meant linear algebra with vectors and matrices in general, but I guess matrix calculus is another interesting field.
When you deal with vectors, parallel execution can be helpful.
A concrete (other) example that I’m playing with is a library that generates multivariate normal distributions and provides evolutionary algorithms for multivariate optimization.
Let’s look at a concrete example (again, this is just an example, I would like to understand more generally when to use arrays and when vectors):
const std = @import("std");
const assert = std.debug.assert;

pub fn triNum(x: anytype) error{Overflow}!@TypeOf(x) {
    const N: type = @TypeOf(x);
    const add = std.math.add;
    const mul = std.math.mul;
    return try mul(N, x, try add(N, x, 1)) / 2;
}

pub fn triNumUnchecked(x: anytype) @TypeOf(x) {
    return x * (x + 1) / 2;
}

pub fn triIndex(row: usize, col: usize) usize {
    assert(col <= row);
    return triNumUnchecked(row) + col;
}

pub fn MultivarNormDist(comptime T: type) type {
    return struct {
        averages: []const T,
        factors: []T,

        pub fn init(averages: []const T, covariances: []T) @This() {
            const t = triIndex;
            const dim = averages.len;
            assert(covariances.len == triNumUnchecked(dim));
            const factors = covariances;
            for (0..dim) |i| {
                for (0..i + 1) |j| {
                    var value: T = factors[t(i, j)];
                    for (0..j) |k| value -= factors[t(i, k)] * factors[t(j, k)];
                    if (i == j) value = std.math.sqrt(value) else value /= factors[t(j, j)];
                    if (!std.math.isFinite(value)) value = 0;
                    factors[t(i, j)] = value;
                }
            }
            return @This(){ .averages = averages, .factors = factors };
        }

        pub fn random(self: @This(), rng: std.Random, output: []T) void {
            const t = triIndex;
            const dim = self.averages.len;
            for (0..dim) |i| output[i] = rng.floatNorm(T);
            var i = dim;
            while (i > 0) {
                i -= 1;
                output[i] *= self.factors[t(i, i)];
                for (0..i) |j| output[i] += output[j] * self.factors[t(i, j)];
                output[i] += self.averages[i];
            }
        }
    };
}

pub fn multivarAverage(
    comptime T: type,
    comptime dim: usize,
    output: *[dim]T,
    samples: [][dim]T,
) void {
    const sample_count = samples.len;
    for (output) |*x| {
        x.* = 0;
    }
    for (samples) |sample| {
        for (0..dim) |i| {
            output[i] += sample[i];
        }
    }
    for (output) |*x| {
        x.* /= @floatFromInt(sample_count);
    }
}
Should MultivarNormDist(…).random(…) return a vector? (Or accept a reference to a vector?)
Should then MultivarNormDist(…).init(…) also take the averages as a vector? But I can’t really take the covariances as a vector, because it’s a slice referring to the packed values of a triangular matrix.
What about the multivarAverage function?
Should that return a vector? (Or accept a reference to a vector?)
And the samples argument then could be a slice of vectors?
For a library like this, I think the right thing to do would be to pass things as structs. Especially when you are dealing with triangular matrices like this it would help readability to have clearly named types.
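For example, the packed triangular covariance storage could get a small named wrapper; the type and method names here are hypothetical, not taken from the code above:

```zig
const std = @import("std");

// A named type for packed lower-triangular storage, so call sites
// document the layout instead of passing a bare slice.
pub fn LowerTriangular(comptime T: type) type {
    return struct {
        data: []T, // packed rows: row i contributes i + 1 elements

        pub fn at(self: @This(), row: usize, col: usize) T {
            std.debug.assert(col <= row);
            // Same indexing as triIndex above: triangular number of the
            // row plus the column offset.
            return self.data[row * (row + 1) / 2 + col];
        }
    };
}
```

init(averages, covariances) could then take a LowerTriangular(T) instead of a raw slice, which also gives the dimension assertion a natural home.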
As for using arrays vs vectors, I’d suggest to keep using arrays/slices for now, since with vectors there are more performance pitfalls for generic code like this.
And if you do start optimizing, then you shouldn't use one SIMD vector per math vector anyway; instead you should ideally split the math vector into pieces of std.simd.suggestVectorLength (called suggestVectorSize in older Zig versions) and operate on those.
I also just realized that both arrays and vectors require comptime-known dimensions. That means any code that has a runtime-dependent dimension will need to use slices anyway.
(Of course, the dimensions of an array or vector may still be comptime-variable in generic code.)
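A sketch of that distinction: a comptime dimension parameter gives each instantiation a fixed-size array, while runtime-sized data has to come in as a slice (the function names here are illustrative):

```zig
const std = @import("std");

// The dimension is a comptime parameter, so each instantiation works
// on a fixed-size array (and could use a @Vector internally).
fn sumFixed(comptime dim: usize, data: [dim]f32) f32 {
    var total: f32 = 0;
    for (data) |x| total += x;
    return total;
}

// When the length is only known at runtime, a slice is the only option.
fn sumRuntime(data: []const f32) f32 {
    var total: f32 = 0;
    for (data) |x| total += x;
    return total;
}

pub fn main() void {
    std.debug.print("{d}\n", .{sumFixed(3, .{ 1, 2, 3 })});
    std.debug.print("{d}\n", .{sumRuntime(&[_]f32{ 1, 2, 3 })});
}
```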