Coming to low-level programming just recently (Zig is the gateway for me), I still can’t wrap my head around alignment.
This has been bugging me for the last few days.
I have a tagged union whose tag has a few members, and I store one instance in the middle of a buffer:
const std = @import("std");
const Tag = std.meta.Tag;

const Thing = union(enum(u4)) {
    foo: void,
    bar: void,
    baz: u16,
    pyramid: struct {
        cc: u8,
        dd: u16,
        ee: u16,
        ff: u116,
    },
};

const MEM = 48 * 4;

pub fn main() !void {
    // some meta info
    std.log.err("@alignOf(u116)={}, @sizeOf(u116)={}", .{ @alignOf(u116), @sizeOf(u116) });
    std.log.err("@alignOf(u16)={}, @sizeOf(u16)={}", .{ @alignOf(u16), @sizeOf(u16) });
    std.log.err("@alignOf(u8)={}, @sizeOf(u8)={}", .{ @alignOf(u8), @sizeOf(u8) });
    std.log.err("@alignOf(Tag(Thing))={}, @sizeOf(Tag(Thing))={}", .{ @alignOf(Tag(Thing)), @sizeOf(Tag(Thing)) });

    // get memory
    var buff: [MEM]u8 = .{0xAA} ** MEM;
    var fba = std.heap.FixedBufferAllocator.init(&buff);
    var ts = try fba.allocator().alloc(Thing, 4);

    // write a thing
    ts[1] = Thing{ .pyramid = .{
        .cc = 0xCC,
        .dd = 0xDDDD,
        .ee = 0xEEEE,
        .ff = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFF,
    } };

    // dump
    const stdout = std.io.getStdOut();
    try stdout.writeAll(&buff);
}
When I dump the memory and run it through hexdump -vC, I get:
@nauron:~$ zig run utags.zig | hexdump -vC
error: @alignOf(u116)=16, @sizeOf(u116)=16
error: @alignOf(u16)=2, @sizeOf(u16)=2
error: @alignOf(u8)=1, @sizeOf(u8)=1
error: @alignOf(Tag(Thing))=1, @sizeOf(Tag(Thing))=1
00000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa |................|
00000010 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa |................|
00000020 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa |................|
00000030 ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00 00 |................|
00000040 dd dd ee ee cc 00 00 00 00 00 00 00 00 00 00 00 |................|
00000050 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000060 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa |................|
00000070 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa |................|
00000080 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa |................|
00000090 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa |................|
000000a0 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa |................|
000000b0 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa |................|
000000c0
I wonder: why does the tag (the 03 at address 00000050) take up a whole 16 bytes?
My current understanding of alignment is limited and perhaps even misguided. Please bear with me, and feel free to point me to other materials, etc…
As I understand it, the alignment problem exists because the CPU can only efficiently address chunks of memory of a certain size, starting at certain places in memory. A layout where a value crosses such a boundary is therefore guaranteed to be inefficient: the CPU would need to load both chunks and work on the combined result.
This does not mean that having several values within one chunk is necessarily just as inefficient; if I want to alter only one of the values, I can imagine the CPU loading the chunk and altering just the relevant part. Hence it’s OK for my .dd, .ee and .cc fields to be stored right next to each other.
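One way to check that reading against what the compiler actually does is to ask it for the field offsets. The sketch below is a minimal one under a few assumptions: `Pyramid` is a hypothetical named copy of the anonymous `pyramid` payload, and since Zig is free to reorder the fields of a plain struct, the code prints whatever offsets the compiler reports instead of assuming any particular order.

```zig
const std = @import("std");

// Hypothetical named copy of the anonymous `pyramid` payload from the post.
const Pyramid = struct {
    cc: u8,
    dd: u16,
    ee: u16,
    ff: u116,
};

pub fn main() void {
    // The compiler may reorder fields of a plain struct, so query each offset
    // rather than deriving it from declaration order.
    inline for (std.meta.fields(Pyramid)) |f| {
        std.debug.print("{s}: offset={}, size={}, align={}\n", .{
            f.name,
            @offsetOf(Pyramid, f.name),
            @sizeOf(f.type),
            f.alignment,
        });
    }
    std.debug.print("@sizeOf(Pyramid)={}, @alignOf(Pyramid)={}\n", .{
        @sizeOf(Pyramid),
        @alignOf(Pyramid),
    });
}
```

Each reported offset should be a multiple of that field's own alignment, which is all that alignment requires of an individual field; nothing forces each field into a chunk of its own.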
But what I can’t explain is why the tag 03 is stored separately, taking up a whole chunk for itself?
Why are we not seeing a layout like this instead:
...
00000020 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa |................|
00000030 ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00 00 |................|
00000040 dd dd ee ee cc 03 00 00 00 00 00 00 00 00 00 00 |................|
00000050 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa |................|
00000060 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa |................|
...
…which would have each Thing take only 32 bytes instead of 48, right?
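The arithmetic behind that 32-byte estimate can be checked with a short sketch. It assumes the same `Thing` as above, uses `std.meta.TagPayload` to get at the anonymous payload struct, and prints what the compiler reports rather than assuming a layout. The one rule it relies on is that a type's size is always a multiple of its alignment, so that every element of an array stays aligned. With the alignment of 16 printed above, the raw field bytes (16 + 2 + 2 + 1 = 21) round up to 32.

```zig
const std = @import("std");

// Same shape as the union in the post.
const Thing = union(enum(u4)) {
    foo: void,
    bar: void,
    baz: u16,
    pyramid: struct { cc: u8, dd: u16, ee: u16, ff: u116 },
};

pub fn main() void {
    const Payload = std.meta.TagPayload(Thing, .pyramid);

    // Bytes the payload fields need on their own, ignoring all padding.
    const raw: usize = @sizeOf(u116) + 2 * @sizeOf(u16) + @sizeOf(u8);
    // Round up to the payload's alignment, as any array element must be.
    const padded = std.mem.alignForward(usize, raw, @alignOf(Payload));

    std.debug.print("raw field bytes     = {}\n", .{raw});
    std.debug.print("rounded to align    = {}\n", .{padded});
    std.debug.print("@sizeOf(Payload)    = {}\n", .{@sizeOf(Payload)});
    std.debug.print("@alignOf(Thing)     = {}\n", .{@alignOf(Thing)});
    std.debug.print("@sizeOf(Thing)      = {}\n", .{@sizeOf(Thing)});
}
```

If `@sizeOf(Payload)` already comes out as 32, the payload carries its own trailing padding, and comparing it with `@sizeOf(Thing)` shows whether the tag was placed inside that padding or after it.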
Edit: I forgot to add: this is Zig 0.13.0 on Debian 12, x86_64.