Wrong sizes of extern/packed structs?

Copy-pasted from here.

I took only a part of the ELF header and observed something strange:

const std = @import("std");

const ElfHdr1 = extern struct {
    sign: u32, // 7F 45 4C 46, '[DEL]ELF'
    clas: u8,  // 2 for 64 bit
    data: u8,  // endianness, 1 for LE
    vers: u8   // == 1
};

const ElfHdr2 = packed struct {
    sign: u32, // 7F 45 4C 46, '[DEL]ELF'
    clas: u8,  // 2 for 64 bit
    data: u8,  // endianness, 1 for LE
    vers: u8   // == 1
};

pub fn main() !void {
    std.debug.print("sizeof(ElfHdr1) = {}\n", .{@sizeOf(ElfHdr1)});
    std.debug.print("sizeof(ElfHdr2) = {}\n", .{@sizeOf(ElfHdr2)});
}

This prints 8 for both variants:

$ ./elf 
sizeof(ElfHdr1) = 8
sizeof(ElfHdr2) = 8

But it should be 7, shouldn’t it?

Same in C:

#include <stdio.h>

struct elf_hdr {
    int sign;
    char clas;
    char data;
    char vers;
} __attribute__((packed));

int main(void) {
    printf("sizeof(elf_hdr) = %zu\n", sizeof(struct elf_hdr)); // %zu is the size_t specifier
}

Prints 7, as it must be:

$ ./a.out 
sizeof(elf_hdr) = 7

Is it a bug? Or do I misunderstand something completely?

@bitSizeOf returns the number of bits without padding.
@sizeOf returns the number of bytes including padding:

This size may contain padding bytes. If there were two consecutive T in memory, the padding would be the offset in bytes between element at index 0 and the element at index 1.
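
To make the difference concrete, here is a minimal sketch using the packed struct from the question. The printed values depend on the target; on x86_64 the @sizeOf line should match the 8 shown above.

```zig
const std = @import("std");

const ElfHdr2 = packed struct {
    sign: u32,
    clas: u8,
    data: u8,
    vers: u8,
};

pub fn main() void {
    // 32 + 8 + 8 + 8 = 56 bits of actual field data, no padding counted.
    std.debug.print("bitSizeOf(ElfHdr2) = {}\n", .{@bitSizeOf(ElfHdr2)});
    // Size in bytes including padding (target-dependent).
    std.debug.print("sizeOf(ElfHdr2)    = {}\n", .{@sizeOf(ElfHdr2)});
    // Alignment of the type (target-dependent).
    std.debug.print("alignOf(ElfHdr2)   = {}\n", .{@alignOf(ElfHdr2)});
    // @sizeOf is also the array stride: two elements take exactly twice the size.
    std.debug.print("sizeOf([2]ElfHdr2) = {}\n", .{@sizeOf([2]ElfHdr2)});
}
```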

The question is: why does a packed struct have padding at all?

To have the correct alignment in arrays.


Also:

packed struct {
    foo: u1
}

What is the @sizeOf of this packed struct?

1, of course (I checked :-)). Computers are (historically) octet-oriented; you cannot address a single bit of RAM.

Hm, ok. But packed structs are very frequently used as single entities, not in arrays.
Suppose you have a message (from a socket, for example, but the source does not matter).
This message consists of a header and a body.
The header is described by a packed struct, say, 5 bytes long.

When parsing such messages, it is very common to do something like

void func(... u8 *msg ...) {
    struct msg_hdr *hdr = (struct msg_hdr*)msg;
    u8 *body = msg + sizeof(struct msg_hdr);
    ...

Now we are trying to have the same struct in Zig, but its size appears to be 6.
How would you do this in Zig then?

I tried using reader().readStruct(). The struct itself was read correctly, but the reader position was then moved past the padding, so one byte of the following data was skipped.

const std = @import("std");

const ElfHdr2 = packed struct {
    sign: u32, // 7F 45 4C 46, '[DEL]ELF'
    clas: u8, // 2 for 64 bit
    data: u8, // endianness, 1 for LE
    vers: u8, // == 1
};

test ElfHdr2 {
    var bytes = [_]u8{ 0x7f, 0x45, 0x4c, 0x46, 0x02, 0x01, 0x01, 0xff, 0xff, 0xff, 0xff };
    //                                                           ^end of struct
    //                                                                 ^reader resumes here

    // using a reader
    var stream = std.io.fixedBufferStream(&bytes);
    const elf1: ElfHdr2 = try stream.reader().readStruct(ElfHdr2);
    std.debug.print("{}\n", .{elf1});
    var rest: [4]u8 = undefined;
    const bytes_read = try stream.read(&rest);
    try std.testing.expectEqual(4, bytes_read);
    std.debug.print("rest of bytes: {d}\n", .{rest});
}
// || 1/1 test2.decltest.ElfHdr2...test2.ElfHdr2{ .sign = 1179403647, .clas = 2, .data = 1, .vers = 1 }
// || expected 4, found 3
// || FAIL (TestExpectedEqual)

Packed structs are represented in memory by a backing integer, and field access is essentially a convenience wrapper around bit shifts. The backing integer is subject to the same size and alignment rules as a regular integer, so any padding ends up in the most significant bits (the highest memory addresses on little-endian systems, the lowest on big-endian systems).
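
A small sketch of what that means for the struct from the question. The explicit packed struct(u56) spelling and the manual shifts are only for illustration; the first field sits in the least significant bits of the backing integer.

```zig
const std = @import("std");

// Same fields as ElfHdr2, with the 56-bit backing integer spelled out.
const ElfHdr2 = packed struct(u56) {
    sign: u32,
    clas: u8,
    data: u8,
    vers: u8,
};

pub fn main() void {
    const hdr = ElfHdr2{ .sign = 0x464c457f, .clas = 2, .data = 1, .vers = 1 };

    // Reinterpret the whole struct as its backing integer.
    const raw: u56 = @bitCast(hdr);

    // Field access is equivalent to shifting and truncating that integer.
    const sign: u32 = @truncate(raw);
    const clas: u8 = @truncate(raw >> 32);
    const data: u8 = @truncate(raw >> 40);
    const vers: u8 = @truncate(raw >> 48);

    std.debug.print("raw = 0x{x}\n", .{raw});
    std.debug.print("sign = 0x{x}, clas = {}, data = {}, vers = {}\n", .{ sign, clas, data, vers });
}
```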

readStruct is confusing and should probably be removed from the std lib or changed from its current behavior.

I use a lot of packed structs in my library because I have a binary protocol that can be defined using the little endian layout of a packed struct. You can find some examples of conversion functions etc here.

I also created this issue:

And you may be interested in this issue as well

Packed structs currently don’t work well in general for binary protocol serialization / deserialization. Though they are very close to being excellent.

They currently favor protocols that are little endian. Big endian protocols (network stuff) are left in the dust.

Also, you should be aware that you currently cannot put arrays in a packed struct. And if you could, the serialization functions I linked would likely break, since I am relying on the big endian representation being just the byteswap of the little endian representation.

It’s just a note … when designing (application level) binary protocols I strongly prefer BE for everything (even if I know for sure there are LE machines on both ends), just because when you dump packets with hd or similar tool, it’s more convenient to read (BE is more natural for us, we write 2024, not 4202)

How would you write

#include <stdio.h>
#include <string.h>

struct s {
    char x;
    char y;
    char z;
} __attribute__((packed));

char *buf = "123456789";

int main(void) {
    printf("sizeof(struct s) = %zu\n", sizeof(struct s)); // %zu is the size_t specifier
    struct s *p = (struct s*)buf;
    int k = 0;
    for (; k < strlen(buf) / sizeof(*p); k++, p++)
        printf("s[%d] @ %p = {'%c','%c','%c'}\n", k, p, p->x, p->y, p->z);
}

in Zig?

My attempt is

const std = @import("std");
const S = packed struct {
    x: u8,
    y: u8,
    z: u8,
};

pub fn main() void {
    std.debug.print("sizeof(S) = {}\n", .{@sizeOf(S)});
    const buf = [9]u8{'1','2','3','4','5','6','7','8','9'};
    var k: usize = 0;
    while (k < 3) : (k += 1) {
        const p: *S = @constCast(@ptrCast(@alignCast(&buf[3 * k])));
        std.debug.print("s[{}] = {any}\n", .{k, p.*});
    }
}

Well, it panics with an "incorrect alignment" message in a Debug build, but works as intended in the other build modes (x86_64).
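
As far as I can tell, the panic comes from the packed struct's backing integer (u24 here) having an alignment larger than 1, so most offsets into the byte buffer are misaligned for *S. One possible translation, sketched under the assumption that an extern struct is acceptable here: with only u8 fields it has size 3 and alignment 1, so the 9 bytes can be viewed as three records without @alignCast or @constCast.

```zig
const std = @import("std");

// extern struct with only u8 fields: C layout, size 3, alignment 1.
const S = extern struct {
    x: u8,
    y: u8,
    z: u8,
};

pub fn main() void {
    std.debug.print("sizeof(S) = {}, alignof(S) = {}\n", .{ @sizeOf(S), @alignOf(S) });

    const buf = [9]u8{ '1', '2', '3', '4', '5', '6', '7', '8', '9' };

    // Reinterpret the 9 bytes as an array of 3 records; no alignment is lost
    // because both u8 and S have alignment 1.
    const items: *const [3]S = @ptrCast(&buf);

    for (items, 0..) |s, k| {
        std.debug.print("s[{}] = {{'{c}','{c}','{c}'}}\n", .{ k, s.x, s.y, s.z });
    }
}
```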

To get a byte-packed struct, you can manually override the alignment of every field whose natural alignment is larger than one:

const ElfHdr1 = extern struct {
    sign: u32 align(1),
    clas: u8,
    data: u8,
    vers: u8
};

This should give you a struct of size 7, alignment 1.
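
A minimal sketch of using such a struct for the header/body split asked about earlier, reusing the test bytes from the readStruct example. Note that sign is read in native byte order, so the value in the comment assumes a little-endian machine.

```zig
const std = @import("std");

const ElfHdr1 = extern struct {
    sign: u32 align(1),
    clas: u8,
    data: u8,
    vers: u8,
};

pub fn main() void {
    const bytes = [_]u8{ 0x7f, 0x45, 0x4c, 0x46, 0x02, 0x01, 0x01, 0xff, 0xff, 0xff, 0xff };

    // View the first @sizeOf(ElfHdr1) bytes as the header. The struct has
    // alignment 1, so any byte offset is a valid address for it.
    const hdr: *const ElfHdr1 = @ptrCast(bytes[0..@sizeOf(ElfHdr1)]);

    // The body starts right after the header, with no padding byte skipped.
    const body = bytes[@sizeOf(ElfHdr1)..];

    std.debug.print("sizeof(ElfHdr1) = {}\n", .{@sizeOf(ElfHdr1)});
    // On a little-endian machine sign is 0x464c457f.
    std.debug.print("sign = 0x{x}, clas = {}, data = {}, vers = {}\n", .{ hdr.sign, hdr.clas, hdr.data, hdr.vers });
    std.debug.print("body length = {}\n", .{body.len});
}
```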

Yes, thank you.
Though it only works with extern, not with packed.
But that does not matter.