Packed arrays or arrays in packed structs

baltevl · January 15, 2026, 12:56pm

Hey, there
I’m rewriting the xv6 kernel in Zig just for fun. While implementing the interrupt controller (PLIC; see spec) I need to access bit flags, and a whole lot of them (at least if I want to implement the spec correctly; only 2 are really in use right now).
My first naive approach was to use an array in a packed struct…

pub const PLIC: *struct {
    const NUM_CONTEXTS = 15872;
    const NUM_SRCS = 1024;

    /// offset: 0x0
    /// size: 0x1000
    /// Notice: priorities[0] is reserved
    priorities: [NUM_SRCS]u32 = undefined,

    /// offset: 0x1000
    /// size: 0x80
    /// Notice: pending[0] is hardwired to false
    pending: [NUM_SRCS]bool = undefined,
    ...
} = 0x0c000000; // This is where qemu puts the PLIC

I soon figured out (aka. the compiler screamed at me…) that packed structs can’t contain arrays (see Working around "packed structs cannot contain arrays", Using arrays in packed structs) and I guess there are reasons for this (see make `packed struct` always use a single backing integer, inferring it if not explicitly provided · Issue #10113 · ziglang/zig · GitHub).
And even if arrays were allowed, the total size of the PLIC is too big for a packed struct (I think this is a limitation of the implementation, though).

My next idea was to do something like:

pub const PLIC = struct {
    const NUM_CONTEXTS = 15872;
    const NUM_SRCS = 1024;
    const BASE = memlayout.PLIC;

    /// offset: 0x0
    /// size: 0x1000
    /// Notice: priorities[0] is reserved
    const priorities: *[NUM_SRCS]u32 = @ptrFromInt(@intFromPtr(PLIC.BASE) + 0x0);

    /// offset: 0x1000
    /// size: 0x80
    /// Notice: pending[0] is hardwired to false
    const pending: *[NUM_SRCS]bool = @ptrFromInt(@intFromPtr(PLIC.BASE) + 0x1000);
    ...

I like this less than the first one, but sure, it’s not too bad, right? Yes, except there is a problem.
The following code doesn’t panic (x86_64, 0.15.2):

pub fn main() !void {
    const a: *[2]u8 = @ptrFromInt(@intFromPtr(&[2]bool{ false, true }));
    @import("std").debug.assert(a[0] == 0);
    @import("std").debug.assert(a[1] == 1);
}

What I am trying to say is that arrays are not bit-packed (each bool takes a whole byte), and I don’t know a way to get what I want. Any ideas?

LucasSantos91 · January 15, 2026, 1:35pm

What’s the problem and why should it panic?

pachde · January 15, 2026, 1:46pm

I’m not completely clear on the shape of the data you’re trying to access. The spec uses terminology that it doesn’t define, and doesn’t go into sufficient detail.

It seems like the concept of bit pointers could model this data, but alas, Zig doesn’t have bit pointers.

std.bit_set (a 32-bit-sized one, given how the PLIC memory map is described) is probably one component of the solution for accessing this data conveniently.

pzittlau · January 15, 2026, 1:51pm

Zig does have them, but it’s quite obscure. The only things I found are some small section in the language reference in the packed structs part and on zig.guide
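For readers who have not run into them, here is a minimal sketch of a bit pointer in today's syntax. The `Flags` layout and the `setB` helper are made up for illustration; the only language feature relied on is that taking the address of a non-byte-aligned packed struct field yields a pointer whose type records the bit offset.

```zig
const std = @import("std");

// Hypothetical 8-bit register layout, for illustration only.
const Flags = packed struct(u8) {
    a: u3, // bits 0..2
    b: u3, // bits 3..5
    c: u2, // bits 6..7
};

/// Stores `value` into the `b` field through a bit pointer and
/// returns the resulting raw byte.
fn setB(flags: *Flags, value: u3) u8 {
    // `&flags.b` is a bit pointer: its type carries the bit offset and
    // host size, so it is not interchangeable with a plain `*u3`.
    const b_ptr = &flags.b;
    b_ptr.* = value;
    return @bitCast(flags.*);
}
```

Note that the bit offset is part of the pointer type and therefore comptime-known, so this does not directly give runtime indexing over 1024 flags.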

pzittlau · January 15, 2026, 1:53pm

I also don’t know exactly what you want to do, but odd-sized integers can usually be used for padding.

const foo = packed struct {
    _padding: i12,
    useful_array: i100,
    _padding2: i3,
};

You could then maybe create a bit pointer to a specific bit inside the useful_array field or mask the relevant bits.

Hopefully that helps in some way.

chung-leong · January 15, 2026, 2:08pm

Vectors are allowed in packed structs. Maybe that’s what you need?

const std = @import("std");

pub fn main() !void {
    const Struct = packed struct { vector: @Vector(16, bool) };
    const bytes: [2]u8 = .{ 0b1001_0001, 0b1000_0010 };
    const s: Struct = @bitCast(bytes);
    std.debug.print("{}\n", .{s.vector});
}

{ true, false, false, false, true, false, false, true, false, true, false, false, false, false, false, true }

jumpnbrownweasel · January 15, 2026, 3:15pm

If you’re trying to control the format of the struct (padding, etc) to conform to a spec, then you probably want an extern struct rather than a packed struct. And an extern struct can contain arrays.

Justus2308 · January 15, 2026, 3:56pm

github.com/ziglang/zig

Proposal: better bit pointer syntax and semantics

opened 10:05PM - 02 Jun 25 UTC

mlugg

proposal accepted

This proposal arose from a discussion with @jacobly0 on the flaws of the current bit-pointer syntax in the context of #22915.

## Background

Zig has a thing called "bit pointers". They're very niche; many people probably don't realise that they exist. But they look like this:

```zig
*align(8:10:5) u9
```

This pointer type does not mean that there is a `u9` value at the address the pointer represents. Instead, at that address (which is 8-byte aligned) there is a 5-byte "backing integer", which, when interpreted as packed memory, contains a `u9` at bit offset 10.

Bit pointers are niche, but under our current semantics for evaluating code like `foo.bar = 123`, they need to exist, because that code is effectively doing `(&foo.bar).* = 123` under the hood.

However, bit pointers have some problems. One is in the syntax; the syntax shown above is extremely opaque, and the values seem to be shoved into the `align` qualifier for no clear reason. I don't think any new user would guess that random colon-separated numbers in the "align" field completely change how the pointer works.

A more significant problem, which is what first inspired this issue, is that this pointer does not necessarily contain sufficient information to correctly lower loads and stores. The correct lowering depends on the compiler's representation of the backing type, which the Zig specification will not define for "weird" integer types like `u35`. However, the given "host size" (5 in the above example) is insufficient to reverse-engineer the host integer, because it is given in byte units. This definitely limits the allowed representations for integers, but it does so in an *extremely* subtle way.

Lastly, there is the problem of vectors. If you take a pointer to a vector element (i.e. `&vec[1]`), then that can't really be a byte-level pointer, because vector elements may be bit-packed -- again, it is the implementation's choice what the representation here is. But representing the actual bit offset in the pointer isn't appropriate, because that offset depends on the implementation, so the *type* of something like `&vec[1]` would become entirely implementation-defined -- plus, it would tie vectors to Zig's "packed memory layout" concept, which may be confusing depending on how vectors are actually represented by the implementation. To solve this case, we currently have... a *fourth* field in the `align` qualifier!

```zig
*align(2:0:5:1) u2
// The address is '2'-byte aligned...
// ...and points to a vector of *length* '5' (note the distinct meaning) of `u2`...
// ...the '0' is, uh, unused...
// ...and we're referring to the element at index 1!
```

This type is pretty ridiculous, to be honest.

It would be nice to unify these concepts more effectively, under one umbrella which provides sufficient information for any representation, and also combine it with a more intuitive syntax. If only there was a proposal for a new feature that did just that...

## Proposal

Eliminate the current bit-pointer forms, and replace them with the following:

```zig
*packed(BackingType, offset) EmbeddedType
```

Before I explain this, here are some examples:

```zig
// Pointer to a u9 at bit offset 10 into a `packed struct(u35)`, which is naturally aligned (say 8 bytes).
*align(8:10:5) u9 // old
*packed(u35, 10) u9 // new

// Pointer to the element at index 1 of a `@Vector(3, u8)`, which is naturally aligned (say 2 bytes).
*align(2:0:3:1) u8 // old
*packed(@Vector(3, u8), 8) u8 // new

// Pointer to a u3 at bit offset 5 into a `packed struct(u10)`, which is aligned to an unnatural 16-byte boundary.
*align(16:5:2) u3 // old
*align(16) packed(u10, 5) u3 // new
```

Hopefully those examples gave you a bit of a feel for it, but here's the idea. The `packed(T, o)` qualifier means that the pointer's address does not actually refer to the pointer's element type at the byte level; instead, it refers to a `T`. Then, in the *bit-level representation* (in the `@bitCast` sense; see also #19755) of that `T`, the pointer element type is found starting at bit offset `o` (where 0 means LSB).

We use the keyword `packed` because this concept is highly related to that of "packed memory"; it's perhaps not exactly correct (under the accepted #19755 vectors are no longer packable types), but it's much closer than associating it with "alignment". Critically, this gives a user who hasn't encountered this niche feature before at least a vague idea of what it might be about, assuming that they know about the meaning of `packed` in Zig (which they probably do if they're encountering this concept!).

The problem of not having enough information to lower loads and stores is solved by this proposal, and we can prove that by implementing loads/stores of these pointers in userland:

```zig
//! These implementations are for demonstration purposes only; this code would not be in the standard
//! library or anything. In compilation implementation terms, it's possible that `Air.Legalize` could
//! perform a transformation akin to this.

fn load(ptr: *align(a) packed(B, o) E) E {
    const Bits = @Int(.unsigned, @bitSizeOf(B));
    const ElemBits = @Int(.unsigned, @bitSizeOf(E));
    const backing_ptr: *align(a) B = @ptrCast(ptr); // (this might not actually be allowed, just for demonstration purposes)
    const bits: Bits = @bitCast(backing_ptr.*);
    const elem_bits: ElemBits = @truncate(bits >> o);
    return @bitCast(elem_bits);
}

fn store(ptr: *align(a) packed(B, o) E, elem_val: E) void {
    const Bits = @Int(.unsigned, @bitSizeOf(B));
    const ElemBits = @Int(.unsigned, @bitSizeOf(E));
    const elem_mask: Bits = ~@as(ElemBits, 0) << o;
    const backing_ptr: *align(a) B = @ptrCast(ptr); // (this might not actually be allowed, just for demonstration purposes)
    const old_bits: Bits = @bitCast(backing_ptr.*);
    const elem_bits: ElemBits = @bitCast(elem_val);
    const new_bits: Bits = (old_bits & ~elem_mask) | (@as(Bits, elem_bits) << o);
    backing_ptr.* = @bitCast(new_bits);
}
```

And, we've unified pointers into packed structs and pointers into vectors under one roof. Lovely!

One thing I've not mentioned yet: the compiler canonicalizes the `BackingType` into either an integer type or a vector type. That's because things like packed structs all have the same layout as their backing integer, so including the actual struct type in the pointer type would be redundant information. By "the compiler canonicalizes it", I mean that it is *permitted* to write `*packed(packed struct(u32) { ... }, 4) u8`, but the compiler simplifies this type to the equivalent `*packed(u32, 4) u8`; these types will compare equal, and printing the type (`@compileLog`/`@typeName`) will print the latter.

Also, it would be possible to use `:` as the expression separator inside `packed(...)` instead of a comma:

```zig
*packed(u35:10) u9
*packed(@Vector(3, u8):8) u8
```

I *might* prefer this, because it visually distinguishes it a bit better in the vector case, but then it's kind of a random syntax. Feel free to bikeshed this in the comments. I'll assume a comma for now, since that seems like the obvious choice.

## Runtime Vector Stores

There's one capability we lose in this proposal. Right now, Zig lets you write code like this:

```zig
var runtime_idx: usize = 1; // this is runtime-known
test "assign to runtime vector index" {
    var vec: @Vector(4, u2) = @splat(0);
    vec[runtime_idx] = 1; // <---- this!
}
```

Well, given that the line in question works by taking a pointer `&vec[runtime_idx]`, how does that work? Bit pointers need a comptime-known bit offset! Well, there's yet another hacked-on extension to the bit pointer syntax: the vector index can be a special value representing "runtime-known", which the compiler represents with a question mark.

```zig
var runtime_idx: usize = 1; // this is runtime-known
test "runtime vector index pointer" {
    var vec: @Vector(4, u2) = @splat(0);
    @compileLog(@TypeOf(&vec[runtime_idx])); // @as(type, *align(1:0:4:?) u2)
}
```

In theory, you're only allowed to store to this pointer, and only in a way where the compiler can "track" the index which was used to create it; there is a compile error for loading from it, or from storing to it when the index has been "lost". In practice, I think there are some broken interactions, and the situations in which this "tracking" succeeds manage to expose subtle compiler implementation details. All in all, this feature is pretty broken.

**I am intentionally not proposing a replacement for this syntax here.** Storing values into arbitrary *runtime-known* elements of a vector seems like a pretty niche use case, and not really something worth doing at all. In fact, it seems like a potential antipattern; ideally, all operations on a vector should be SIMD operations. Accessing a single element *based on a runtime index* should be extremely rare. So, in the rare case that this is needed, you can implement it in terms of other language features:

```zig
// old
fn vecWithChangedElem(v: @Vector(5, u32), idx: usize, elem_val: u32) @Vector(5, u32) {
    var res = v;
    res[idx] = elem_val;
    return res;
}

// new
fn vecWithChangedElem(v: @Vector(5, u32), idx: usize, elem_val: u32) @Vector(5, u32) {
    const pred: @Vector(5, bool) = @bitCast(@as(u5, 1) << @intCast(idx));
    return @select(u32, pred, @splat(elem_val), v);
}
```

It's true that the new one is a bit trickier to understand, but it simplifies the language, this should be an extremely rare operation anyway, and LLVM actually seems to be a little better at optimizing it, at least sometimes: https://zig.godbolt.org/z/48haGorej

I don't think there's any justifiable reason for `vec[runtime_idx]` or `&vec[runtime_idx]` to not be a compile error. Currently, the former is, while the latter is not; this proposal brings the two forms into alignment, at the expense of making a rare and often-inefficient operation perhaps require a helper function. That seems like a completely reasonable tradeoff to me.

Includes a thorough explanation of how current bit pointer syntax works

baltevl · January 15, 2026, 5:14pm

Sorry, this isn’t really related to the question itself. It was to show that arrays aren’t bit-packed (which is probably the right choice most of the time in an access time vs. space trade-off).

baltevl · January 15, 2026, 5:35pm

Not really what I’m looking for. Padding is not the problem atm. Thx anyways

invlpg · January 15, 2026, 5:36pm

IIRC the bit_set types don’t have guaranteed layout. You’d need to copy/paste StaticBitSet into your own codebase and declare it extern.

pachde · January 15, 2026, 6:11pm

Looks like you’re really left with using u32 and manual bit-twiddling, which isn’t bad in Zig.

You’d do the same in C, unless you take a dependence on implementation details because C doesn’t guarantee the layout of bitfields.
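A rough sketch of what that could look like, combining an extern struct (which may contain arrays) with `u32` words for the pending bits. The names `PlicRegs` and `isPending` are hypothetical, only the first two regions from the original post are modelled, and the assumption that source `n` lives at bit `n % 32` of word `n / 32` should be checked against the spec.

```zig
const std = @import("std");

const NUM_SRCS = 1024;

/// Hypothetical view of the first two PLIC regions from the original post.
const PlicRegs = extern struct {
    /// offset 0x0, size 0x1000; priorities[0] is reserved
    priorities: [NUM_SRCS]u32,
    /// offset 0x1000, size 0x80; one bit per source, 32 sources per word
    pending: [NUM_SRCS / 32]u32,
};

// Check at comptime that the layout matches the offsets in the doc comments.
comptime {
    std.debug.assert(@offsetOf(PlicRegs, "pending") == 0x1000);
    std.debug.assert(@sizeOf(@FieldType(PlicRegs, "pending")) == 0x80);
}

/// Returns whether interrupt source `src` is pending.
/// Assumes source n is bit (n % 32) of word (n / 32).
fn isPending(regs: *const volatile PlicRegs, src: u10) bool {
    const word = regs.pending[src / 32];
    const bit: u5 = @intCast(src % 32);
    return (word >> bit) & 1 != 0;
}
```

On the real device the pointer would come from something like `@ptrFromInt(0x0c000000)`, the address the original post gives for QEMU; the `volatile` qualifier keeps the loads from being optimized away.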

tholmes · January 15, 2026, 6:35pm

You can use a type function to generate a set of bitflags with nearly endless length:

const std = @import("std");

/// Compose a packed struct with a given number of pseudo-anonymous fields.
pub fn AnonymousBitFlags(comptime field_num: u16) type {
	var fields: []const std.builtin.Type.StructField = &.{};
	@setEvalBranchQuota(@as(comptime_int, field_num) * 1024);
	for(0..field_num) |i| {
		fields = fields ++ &[1]std.builtin.Type.StructField{.{
			.name = std.fmt.comptimePrint("b{d}", .{i}),
			.type = bool,
			.is_comptime = false,
			.default_value_ptr = &false,
			.alignment = 0,
		}};
	}
	return @Type(.{.@"struct" = .{
		.layout = .@"packed",
		.backing_integer = @Type(.{.int = .{
			.bits = field_num,
			.signedness = .unsigned,
		}}),
		.fields = fields,
		.decls = &.{},
		.is_tuple = false,
	}});
}

pub const PLIC = struct{
	const NUM_CONTEXTS = 15872;
	const NUM_SRCS = 1024;
	
	const PendingBackingInteger = @Type(.{.int = .{
		.bits = NUM_SRCS,
		.signedness = .unsigned,
	}});
	
	priorities: [NUM_SRCS]u32 = undefined,
	
	/// Could leave this as undefined instead of default-initialising it.
	pending: AnonymousBitFlags(NUM_SRCS) = @bitCast(@as(PendingBackingInteger, 0)),
	
	// Comptime-assertions of field size
	comptime{
		if(@bitSizeOf(@FieldType(@This(), "pending")) != NUM_SRCS) unreachable;
		if(@sizeOf(@FieldType(@This(), "pending")) != @divExact(NUM_SRCS, 8)) unreachable;
	}
};

From the comptime-assertions, you can see that the size of the bitflags struct is exactly what we expect it to be.

I’ve omitted it for brevity’s sake, but if you’re intent on the PLIC being a packed struct, you could easily replicate this approach for the priorities as well, generating a packed struct of u32s instead of booleans. The two combined would result in the size of the PLIC’s backing integer being just over half of the maximum bit-width possible for an integer in Zig.

How you work with the bitflags is really up to you. You can wrap them in a struct type that provides methods to access them in an “array-like way”, or you can just @bitCast() them to the backing integer type and do bit operations on them, exactly like you would for C bitflags.

Here’s an example of a method that’d allow you to perform array-like indexing:

pub fn get_pending(self: PLIC, index: std.math.Log2Int(PendingBackingInteger)) bool {
	// Use a nasty trick to unroll all possibilities into separate functions at comptime.
	// This allows us to treat the index as comptime-known, even though it actually isn't.
	@setEvalBranchQuota(NUM_SRCS * 1024);
	switch(index){
		inline else => |i| {
			return @field(self.pending, std.meta.fields(@FieldType(PLIC, "pending"))[i].name);
		}
	}
}

And the same method implemented using bit operations:

pub fn get_pending(self: PLIC, index: std.math.Log2Int(PendingBackingInteger)) bool {
	return
		@as(PendingBackingInteger, @bitCast(self.pending)) &
		@as(PendingBackingInteger, 1) << index
		> 0
	;
}

Both seem to compile to roughly the same assembly instructions on ReleaseFast.

baltevl · January 15, 2026, 6:55pm

First off, thanks. This was really helpful. I remember considering vectors at some point but don’t know why I decided against them at that time. It works almost as I want it to.
There is a matrix with one enable bit per context and source. This would be ideal for a 2d array (or rather vector in this case). Sadly vectors in vectors are not allowed (which, at least to me, also kind of makes sense). Anyways one long vector + functions to access it by index will work fine as well…
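A sketch of the "one long flat storage plus index functions" idea, written here over `u32` words rather than a vector so that it does not depend on vector layout. `enableLoc` and `setEnabled` are made-up names, and the row-major layout (all words of context 0, then context 1, and so on, 32 sources per word) is an assumption to verify against the spec.

```zig
const std = @import("std");

const NUM_SRCS = 1024;
const WORDS_PER_CONTEXT = NUM_SRCS / 32;

/// Position of one enable bit inside the flat word array.
const EnableLoc = struct { word: usize, mask: u32 };

/// Maps (context, source) to a word index and a single-bit mask,
/// assuming a row-major matrix with 32 sources per u32 word.
fn enableLoc(context: usize, src: u10) EnableLoc {
    return .{
        .word = context * WORDS_PER_CONTEXT + src / 32,
        .mask = @as(u32, 1) << @as(u5, @intCast(src % 32)),
    };
}

/// Sets or clears the enable bit for `src` on `context`.
fn setEnabled(words: []volatile u32, context: usize, src: u10, on: bool) void {
    const loc = enableLoc(context, src);
    if (on) {
        words[loc.word] |= loc.mask;
    } else {
        words[loc.word] &= ~loc.mask;
    }
}
```

The read-modify-write in `setEnabled` is not atomic, so on a multi-hart system it would need whatever locking the rest of the kernel uses.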

baltevl · January 15, 2026, 7:01pm

Thought of that, tried that. Two problems:

1. Since extern structs are C-ABI compatible, the smallest type size is 1 byte. I need 1 bit.
2. Arrays are still not packed. Each element has its own byte.

Still, thanks for your help.

chung-leong · January 15, 2026, 8:06pm

Packed structs themselves are allowed inside packed struct if memory serves. Maybe that can be used to model a matrix? Just use numeric field names a la tuple.

vulpesx · January 16, 2026, 4:50am

this is subject to change, the backend has complete control over what a vector is, including whether it is packed or not.

It is an implementation detail that shouldn’t be relied upon, and it will become a compile error regardless of the backend in the future.

baltevl · January 22, 2026, 2:34pm

Hm, that’s interesting. Before posting here I had thought about the problem quite a bit, but since I found a satisfying answer I saw it as a bit of a language quirk and nothing further. If using vectors in packed structs is not guaranteed to work, maybe it’s worth thinking about arrays in packed structs again. If I understand correctly, the problem is the following:

topolarity on github (issue #10113):

Unfortunately, if standard array ordering is always used, a single large @byteSwap will not be a valid endianness conversion for packed structs containing arrays, and the correspondence of array elements to the struct layout might be unexpected. For example:
const Foo = packed struct {
    x: i32,
    y: [2]i32,
};
With standard array ordering and this proposal Foo.y[1], not Foo.y[0], is adjacent to x on big-endian systems.

Of the options listed in that issue, 1 is what is currently implemented. I might be wrong, but if we just always use reverse ordering, I think we then have the same problem on little-endian, so 2 is probably also not an option. 4 just sounds like a footgun. 3, on the other hand, doesn’t sound too bad in my opinion. Building on this, my idea was the following:
Arrays now have an order. Either ascending or descending (this naming is on purpose not something like default and reverse). If an order is specified we have a guarantee by the compiler to honor it. If no order is specified we have no guarantees about the in memory representation.
When using an array in a packed struct you have to specify an order.
Something like:

// Proposed syntax; `order(...)` is not valid Zig today.
const assert = @import("std").debug.assert;

const a: [2]i32 = .{ 0, -1 }; // No ordering specified
const b: order(.ascending) [2]i32 = .{ 0, -1 };
const c: order(.descending) [2]i32 = .{ 0, -1 };
assert(&b[1] == @ptrFromInt(@intFromPtr(&b) + 4));
assert(&c[1] == @ptrFromInt(@intFromPtr(&c) - 4));

For most use cases this doesn’t change anything except that you have no guaranteed memory layout. This shouldn’t matter though because if you want a guaranteed layout you can just use order().

Any opinions? Is this just some very specific use case and it’s okay that this is awkward or is this actually a problem worth solving?

vulpesx · January 22, 2026, 3:17pm

My thoughts are we should just have (explicitly) packed arrays, yes it will be bitpacked (petition to make that the keyword :p).

const Foo = packed struct {
    x: i32,
    y: packed [2]i32,
};

It’s also convenient to reuse a keyword, instead of adding a new one. packed makes sense semantically here.

I think it makes sense for packed arrays to follow the same semantics of packed struct/union, That being least significant → most significant bits, with the mapping of bits → type being native endian.

When using for protocols, you already have to deal with what endian the protocol uses.

If just internally used within your code, you have the control to do whatever you want.

Indexing in reverse order is trivial maths; if you need/want to be smart with larger memory operations, you are either restricted by a protocol, or you have the control to make it work how you want.
If you have a conflict between multiple fancy operations, you would have that regardless of if arrays had built-in ordering.
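To make the "trivial maths" concrete, here is a small sketch that reads 8-bit elements out of a `u32` in either order. The `getElem` helper and the choice of four `u8` elements are purely illustrative and not part of any proposal.

```zig
const std = @import("std");

/// Reads 8-bit element `i` of four elements packed into a u32.
/// Ascending order puts element 0 in the least significant byte;
/// `descending` mirrors the index (i becomes 3 - i).
fn getElem(backing: u32, i: u2, descending: bool) u8 {
    const slot: u5 = if (descending) 3 - @as(u5, i) else i;
    // slot is at most 3, so slot * 8 is at most 24 and fits in a u5.
    return @truncate(backing >> (slot * 8));
}
```

For example, with a backing value of `0x44332211`, element 0 is `0x11` in ascending order and `0x44` in descending order.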

So I don’t think there is much benefit in having ordering on arrays, even if it’s limited to packed arrays.

pachde · January 22, 2026, 5:06pm

AFAIK that is an approved proposal, and I just saw a pr go by on the bot channel watching issues someone made in that direction.