Wrong sizes of extern/packed structs?

dee0xeed · September 19, 2024, 8:06am

Copy-pasted from here.

I took only part of the ELF header and observed something strange:

const std = @import("std");

const ElfHdr1 = extern struct {
    sign: u32, // 7F 45 4C 46, '[DEL]ELF'
    clas: u8,  // 2 for 64 bit
    data: u8,  // endianness, 1 for LE
    vers: u8   // == 1
};

const ElfHdr2 = packed struct {
    sign: u32, // 7F 45 4C 46, '[DEL]ELF'
    clas: u8,  // 2 for 64 bit
    data: u8,  // endianness, 1 for LE
    vers: u8   // == 1
};

pub fn main() !void {
    std.debug.print("sizeof(ElfHdr1) = {}\n", .{@sizeOf(ElfHdr1)});
    std.debug.print("sizeof(ElfHdr2) = {}\n", .{@sizeOf(ElfHdr2)});
}

This prints 8 for both variants:

$ ./elf 
sizeof(ElfHdr1) = 8
sizeof(ElfHdr2) = 8

But it should be 7, shouldn’t it?

Same in C:

#include <stdio.h>

struct elf_hdr {
    int sign;
    char clas;
    char data;
    char vers;
} __attribute__((packed));

int main(void) {
    printf("sizeof(elf_hdr) = %zu\n", sizeof(struct elf_hdr)); // %zu is the size_t format
}

Prints 7, as it must be:

$ ./a.out 
sizeof(elf_hdr) = 7

Is it a bug? Or do I misunderstand something completely?

dimdin · September 19, 2024, 8:29am

@bitSizeOf returns the number of bits without padding.
@sizeOf returns the number of bytes including padding:

This size may contain padding bytes. If there were two consecutive T in memory, the padding would be the offset in bytes between element at index 0 and the element at index 1.
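The same rule can be observed from C, which this thread already uses for comparison. Below is a minimal sketch, assuming C11, GCC or Clang (for `__attribute__((packed))`), and a typical ABI where `uint32_t` is 4-byte aligned. The type and function names are made up for illustration. It measures the distance between element 0 and element 1 of an array, which is exactly what `sizeof` reports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Natural layout: 7 bytes of fields, rounded up to the 4-byte alignment of `sign`. */
struct hdr_natural {
    uint32_t sign;
    uint8_t clas;
    uint8_t data;
    uint8_t vers;
};

/* Byte-packed layout (GCC/Clang extension): alignment 1, no tail padding. */
struct hdr_packed {
    uint32_t sign;
    uint8_t clas;
    uint8_t data;
    uint8_t vers;
} __attribute__((packed));

/* These hold on common ABIs; the sketch refuses to compile elsewhere. */
static_assert(sizeof(struct hdr_natural) == 8, "expected one byte of tail padding");
static_assert(sizeof(struct hdr_packed) == 7, "expected no padding");

/* Offset in bytes between element 0 and element 1 of an array. */
size_t stride_natural(void) {
    struct hdr_natural a[2];
    return (size_t)((char *)&a[1] - (char *)&a[0]);
}

size_t stride_packed(void) {
    struct hdr_packed a[2];
    return (size_t)((char *)&a[1] - (char *)&a[0]);
}
```

The tail padding in the natural layout is what keeps `a[1].sign` 4-byte aligned. The packed layout gives that guarantee up, so its stride can be 7.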

dee0xeed · September 19, 2024, 8:37am

The question is: why does a packed struct have padding at all?

dimdin · September 19, 2024, 8:55am

To have the correct alignment in arrays.

Also:

packed struct {
    foo: u1
}

What is the @sizeOf of this packed struct?

dee0xeed · September 19, 2024, 10:23am

1 of course (I checked :-)). Computers (historically) are kinda octet-oriented; you cannot address a single bit of RAM.

dee0xeed · September 19, 2024, 11:50am

Hm, ok. But packed structs are very frequently used as single entities, not in arrays.
Suppose you have a message (from a socket, for example, but it does not matter).
This message consists of a header and a body.
The header is described by a packed struct, say, 5 bytes long.

When parsing such messages, it is very common to do something like

void func(... u8 *msg ...) {
    struct msg_hdr *hdr = (struct msg_hdr*)msg;
    u8 *body = msg + sizeof(struct msg_hdr);
    ...

Now we are trying to have the same struct in Zig, but its size appears to be 6.
How would you do this in Zig then?

wrapitup · September 19, 2024, 1:36pm

I tried using reader().readStruct(). The struct is read correctly, but the reader head is then moved past the padding, so it skips a byte.

const std = @import("std");

const ElfHdr2 = packed struct {
    sign: u32, // 7F 45 4C 46, '[DEL]ELF'
    clas: u8, // 2 for 64 bit
    data: u8, // endianness, 1 for LE
    vers: u8, // == 1
};

test ElfHdr2 {
    var bytes = [_]u8{ 0x7f, 0x45, 0x4c, 0x46, 0x02, 0x01, 0x01, 0xff, 0xff, 0xff, 0xff };
    //                                                           ^end of struct
    //                                                                 ^reader resumes here

    // using a reader
    var stream = std.io.fixedBufferStream(&bytes);
    const elf1: ElfHdr2 = try stream.reader().readStruct(ElfHdr2);
    std.debug.print("{}\n", .{elf1});
    var rest: [4]u8 = undefined;
    const bytes_read = try stream.read(&rest);
    try std.testing.expectEqual(4, bytes_read);
    std.debug.print("rest of bytes: {d}\n", .{rest});
}
// || 1/1 test2.decltest.ElfHdr2...test2.ElfHdr2{ .sign = 1179403647, .clas = 2, .data = 1, .vers = 1 }
// || expected 4, found 3
// || FAIL (TestExpectedEqual)
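One way around this kind of cursor drift is to decode the header field by field and advance by the wire size rather than by the in-memory size of the struct. Here is a minimal sketch in C (used for comparison elsewhere in this thread). It assumes the little-endian field encoding that matches the output above, and `ELF_HDR_WIRE_SIZE` and `elf_hdr_decode` are hypothetical names, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes occupied on the wire: 4 + 1 + 1 + 1. Deliberately not sizeof(struct elf_hdr). */
enum { ELF_HDR_WIRE_SIZE = 7 };

/* In-memory form; the compiler may pad this however it likes. */
struct elf_hdr {
    uint32_t sign;
    uint8_t clas;
    uint8_t data;
    uint8_t vers;
};

/* Decodes one header from buf. Returns the number of bytes consumed,
 * or 0 if fewer than ELF_HDR_WIRE_SIZE bytes are available. */
size_t elf_hdr_decode(const uint8_t *buf, size_t len, struct elf_hdr *out) {
    assert(buf != NULL && out != NULL);
    if (len < ELF_HDR_WIRE_SIZE)
        return 0;
    /* Assumption: `sign` is stored least significant byte first. */
    out->sign = (uint32_t)buf[0]
              | ((uint32_t)buf[1] << 8)
              | ((uint32_t)buf[2] << 16)
              | ((uint32_t)buf[3] << 24);
    out->clas = buf[4];
    out->data = buf[5];
    out->vers = buf[6];
    return ELF_HDR_WIRE_SIZE;
}
```

With the 11-byte buffer from the test above, the call consumes 7 bytes and leaves all four 0xff bytes for the next read.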

kj4tmp · September 19, 2024, 3:49pm

Packed structs are represented in memory using a backing integer, and access to fields is essentially a convenience feature around bit shifts. The backing integer is therefore subject to the same alignment rules as regular integers, so you will get padding in the most significant bits (at the highest memory addresses on little endian systems, at the lowest on big endian systems).
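To make the backing-integer idea concrete, here is a rough model in C. It is only an illustration of the concept, not how the Zig compiler implements it. The 56 bits of ElfHdr2 live in one integer, with the first field in the least significant bits, and fields are read with shifts. C has no 56-bit integer type, so the model uses `uint64_t`, which leaves the top 8 bits as padding. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of `packed struct { sign: u32, clas: u8, data: u8, vers: u8 }`. */
typedef uint64_t elf_hdr_bits; /* 56 bits used, top 8 bits are padding */

static_assert(sizeof(elf_hdr_bits) == 8, "the backing integer occupies 8 bytes, not 7");

/* Pack the fields; the first field goes into the lowest bits. */
elf_hdr_bits hdr_make(uint32_t sign, uint8_t clas, uint8_t data, uint8_t vers) {
    return (uint64_t)sign
         | ((uint64_t)clas << 32)
         | ((uint64_t)data << 40)
         | ((uint64_t)vers << 48);
}

/* Field access is a shift followed by a truncating cast. */
uint32_t hdr_sign(elf_hdr_bits h) { return (uint32_t)(h & 0xFFFFFFFFu); }
uint8_t  hdr_clas(elf_hdr_bits h) { return (uint8_t)(h >> 32); }
uint8_t  hdr_data(elf_hdr_bits h) { return (uint8_t)(h >> 40); }
uint8_t  hdr_vers(elf_hdr_bits h) { return (uint8_t)(h >> 48); }
```

Seen this way, the size of 8 reported at the top of the thread is simply the size of the integer that holds the 56 bits.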

readStruct is confusing and should probably be removed from the std lib or modified from its current state.

I use a lot of packed structs in my library because I have a binary protocol that can be defined using the little endian layout of a packed struct. You can find some examples of conversion functions etc here.

github.com

kj4tmp/zecm/blob/main/src/wire.zig

//! Serialization and deserialization utilities specific to EtherCAT.

const std = @import("std");
const assert = std.debug.assert;
const native_endian = @import("builtin").target.cpu.arch.endian();

pub fn packedSize(comptime T: type) comptime_int {
    comptime assert(isECatPackable(T));
    return @divExact(@bitSizeOf(T), 8);
}

pub fn isECatPackable(comptime T: type) bool {
    if (@bitSizeOf(T) % 8 != 0) return false;
    return switch (@typeInfo(T)) {
        .Struct => |_struct| blk: {
            // must be a packed struct
            break :blk (_struct.layout == .@"packed");
        },
        .Int, .Float => true,
        .Union => |_union| blk: {

This file has been truncated.

I also created this issue:

github.com/ziglang/zig

Document Effects of Endianness on Packed Structs

opened 04:56AM - 16 Jul 24 UTC · kj4tmp · docs

### Zig Version
0.14.0-dev.66+1fdf13a14

### Steps to Reproduce and Observed Behavior
The docs say packed structs have defined in-memory layout but do not fully describe what that in-memory layout is, especially when considering host endianness, non-byte aligned, and not-byte-width fields.

### Expected Behavior
I expected the docs to describe the in-memory layout of structs under the effects of host-endianness, non-byte aligned, and not-byte-width fields.

And you may be interested in this issue as well

github.com/ziglang/zig

Disallow `reader.readStruct` for packed structs

opened 07:19AM - 25 Sep 22 UTC · zxubian · bug, contributor friendly, standard library

### Zig Version
0.9.1 (windows, chocolatey), 0.10.0-dev.4166+cae76d829

### Steps to Reproduce
1. create file repro.zig

``` zig
/// repro.zig
const std = @import("std");
const expect = std.testing.expect;

const PackedStruct = packed struct {
    a: u48,
    b: u48,
    c: u16,
};

test "reading a packed struct" {
    const file = try std.fs.cwd().openFile("repro.zig", .{});
    const reader = file.reader();
    _ = try reader.readStruct(PackedStruct);
    const pos_from_reading_struct = try reader.context.getPos();
    try reader.context.seekTo(0);
    _ = try reader.readBytesNoEof(@bitSizeOf(PackedStruct) / 8);
    const pos_from_reading_bytes = try reader.context.getPos();
    std.log.warn("{}, {}", .{ pos_from_reading_struct, pos_from_reading_bytes });
    try expect(pos_from_reading_struct == pos_from_reading_bytes);
}
```

2. ```zig test repro.zig```

### Expected Behavior
Test passes. Log output should be ```14, 14```

### Actual Behavior
Test fails. Log output is ```16, 14```. This is because ```@sizeOf(PackedStruct)``` is 16. However, using sizeOf in readStruct is undesirable for reading consecutive packed structs, because now the seeker position is wrong, affecting future reads downstream. To correct this manually, the developer has to check if packed struct requires padding, and manually wind the seeker back using ```seekBy```.

kj4tmp · September 19, 2024, 4:27pm

Packed structs currently don’t work well in general for binary protocol serialization / deserialization. Though they are very close to being excellent.

They currently favor protocols that are little endian. Big endian protocols (network stuff) are left in the dust.

Also, you should be aware that you currently cannot put arrays in a packed struct. And if you could, the serialization functions I linked would likely break, since I rely on the big endian representation being just the byteswap of the little endian representation.
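The byteswap relationship can be illustrated with a small C sketch. It is a model of the idea under my own assumptions, not code from the linked library. It serializes the low 7 bytes of a 56-bit backing integer least significant byte first and most significant byte first. `store_le` and `store_be` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write the low n bytes of value, least significant byte first. */
void store_le(uint64_t value, uint8_t *out, size_t n) {
    assert(out != NULL && n <= 8);
    for (size_t i = 0; i < n; i++)
        out[i] = (uint8_t)(value >> (8 * i));
}

/* Write the low n bytes of value, most significant byte first. */
void store_be(uint64_t value, uint8_t *out, size_t n) {
    assert(out != NULL && n <= 8);
    for (size_t i = 0; i < n; i++)
        out[n - 1 - i] = (uint8_t)(value >> (8 * i));
}
```

For the header value 0x00010102464C457F (the ElfHdr2 fields from earlier in the thread), `store_le` with n = 7 yields 7F 45 4C 46 02 01 01 and `store_be` yields 01 01 02 46 4C 45 7F. Swapping the whole integer also reverses the order of the fields, so it is not the same thing as encoding each field big endian in its original position.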

dee0xeed · September 19, 2024, 4:55pm

It’s just a note: when designing (application-level) binary protocols I strongly prefer BE for everything (even if I know for sure there are LE machines on both ends), just because when you dump packets with hd or a similar tool, it’s more convenient to read (BE is more natural for us: we write 2024, not 4202).

dee0xeed · September 19, 2024, 7:42pm

How would you write

#include <stdio.h>
#include <string.h>

struct s {
    char x;
    char y;
    char z;
} __attribute__((packed));

char *buf = "123456789";

int main(void) {
    printf("sizeof(struct s) = %zu\n", sizeof(struct s)); // %zu is the size_t format
    struct s *p = (struct s*)buf;
    int k = 0;
    for (; k < strlen(buf) / sizeof(*p); k++, p++)
        printf("s[%d] @ %p = {'%c','%c','%c'}\n", k, (void *)p, p->x, p->y, p->z); // %p expects void *
}

in Zig?

My attempt is

const std = @import("std");
const S = packed struct {
    x: u8,
    y: u8,
    z: u8,
};

pub fn main() void {
    std.debug.print("sizeof(S) = {}\n", .{@sizeOf(S)});
    const buf = [9]u8{'1','2','3','4','5','6','7','8','9'};
    var k: usize = 0;
    while (k < 3) : (k += 1) {
        const p: *S = @constCast(@ptrCast(@alignCast(&buf[3 * k])));
        std.debug.print("s[{}] = {any}\n", .{k, p.*});
    }
}

Well, it panics with an "incorrect alignment" message in a Debug build, but works as intended in other modes (x86_64).
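The panic comes from the pointer cast, which promises an alignment that an arbitrary offset into a byte buffer cannot guarantee. A common pattern that avoids this is to copy each record out of the buffer instead of reinterpreting a pointer into it. Below is a minimal sketch in C11, assuming a plain struct of three chars has size 3 (true on common ABIs and checked at compile time). `read_record` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct s {
    char x;
    char y;
    char z;
};

static_assert(sizeof(struct s) == 3, "sketch assumes no padding in struct s");

/* Copies record number k out of buf into *out.
 * Returns 0 on success, -1 if record k does not fit in len bytes. */
int read_record(const char *buf, size_t len, size_t k, struct s *out) {
    if (k >= len / sizeof *out)
        return -1;
    /* memcpy places no alignment requirement on the source address. */
    memcpy(out, buf + k * sizeof *out, sizeof *out);
    return 0;
}
```

For the buffer "123456789", record 1 comes back as {'4','5','6'} and record 3 is rejected as out of range.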

IntegratedQuantum · September 19, 2024, 8:00pm

To achieve byte-packed structs, you can manually override the alignment of every field whose alignment is larger than one:

const ElfHdr1 = extern struct {
    sign: u32 align(1),
    clas: u8,
    data: u8,
    vers: u8
};

This should give you a struct of size 7, alignment 1.

dee0xeed · September 19, 2024, 8:22pm

Yes, thank you.
Though it only works with extern, not with packed.
But that does not matter.