Struct Inheritance and Alignment

Alignment and struct inheritance

I’m new to Zig, and working through the “Crafting Interpreters” book ATM.

I was/am struggling with the “struct inheritance”, where the type Obj just holds an enum tag and each actual object type (the first of a handful so far) uses it as the type of the first field in its struct. The C ABI, as the author says, guarantees the memory layout of the struct, so casting pointers is easy.

In Zig, we have to use @fieldParentPtr to downcast a *Obj to a *ObjString.
OK, I understand this (though it took me some time).

However, I’m not sure if I understand the alignment.
Note: Until now, my Obj type does not yet contain the next attribute (I’m just before chapter 19.5 “Freeing Objects”).

Without an @alignCast, the Zig compiler complains about alignment:

pub const ObjType = enum(u4) {
    STRING,
};

pub const Obj = struct {
    typ: ObjType,
    // Some methods, but no other fields yet
};

pub const ObjString = struct {
    obj: Obj,
    length: u32,
    chars: [*:0]const u8, // This is a variable length 0-terminated C string.

    pub fn asObj(self: *ObjString) *Obj {
        return &self.obj;
    }

    pub fn fromObj(o: *Obj) *ObjString {
        std.debug.assert(o.typ == .STRING);
        const s: *ObjString = @fieldParentPtr("obj", o);
        return s;
    }
   ...
};

@fieldParentPtr increases pointer alignment
'*align(1) object.ObjString' has alignment '1'
'*object.ObjString' has alignment '8'
@fieldParentPtr("obj", o)

I can solve this in two ways (or are there better ideas?):

I could explicitly use u64 instead of u4 for the enum.
But that wastes 7 bytes of memory for each object, so that’s for sure not the right solution.

Or I could use an @alignCast together with @fieldParentPtr, as @Southporter did in the generic as function on line 59 of zlox/src/Object.zig (master branch of Southporter/zlox on GitHub).

What exactly are the consequences of an @alignCast here?

AFAIK an @alignCast is a promise I give to the compiler.
Does this impose a risk that the correctness depends on the actual memory layout of the ObjString struct?

When I continue working, the other object “subtypes”, e.g. for functions, will contain an obj: Obj field as the first field, too.

Is this construct safe?
What is considered best practice in cases like this?

Hi @hvbargen,

The Zig compiler does not guarantee that the fields of a regular struct will be laid out in memory in the same order as you write them in the source file. This allows for optimizations where possible. That’s why Zig has the @fieldParentPtr built-in to make this type of casting available.
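For comparison, if you do want the C layout guarantee, an extern struct provides it. This is only a sketch under my own assumptions: I widened the tag to u8 to stay on the safe side of what extern structs accept, and asString is a hypothetical helper name, not something from the book or the repo.

```zig
const std = @import("std");

// Assumption: u8 instead of u4 so the enum is a plain C-compatible tag.
const ObjType = enum(u8) { STRING };

const Obj = extern struct {
    typ: ObjType,
};

const ObjString = extern struct {
    obj: Obj,
    length: u32,
    chars: [*:0]const u8,
};

comptime {
    // extern structs keep declaration order, so obj sits at offset 0.
    std.debug.assert(@offsetOf(ObjString, "obj") == 0);
}

// Hypothetical helper: a C-style downcast that relies on obj being first.
fn asString(o: *Obj) *ObjString {
    std.debug.assert(o.typ == .STRING);
    return @ptrCast(@alignCast(o));
}
```

With a regular struct this cast would not be justified, which is why @fieldParentPtr is the usual tool.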

To your question about alignment. The two structs have different alignment. In C you can do the casting without the ceremony of fixing alignment, but Zig requires this to be explicit. In this case, the @alignCast is what you want because you are changing the pointer to a struct with a different alignment. In essence, you are telling the compiler, “don’t worry, I am expecting these to be different and I’m fine with that.”
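A minimal sketch of that fix, reusing the types from the question and assuming Zig 0.12 or later (where @fieldParentPtr takes the field name and the pointer and infers the parent type from the result type):

```zig
const std = @import("std");

const ObjType = enum(u4) { STRING };

const Obj = struct {
    typ: ObjType,
};

const ObjString = struct {
    obj: Obj,
    length: u32,
    chars: [*:0]const u8,

    fn fromObj(o: *Obj) *ObjString {
        std.debug.assert(o.typ == .STRING);
        // @fieldParentPtr alone yields an align(1) pointer, because *Obj is
        // only 1-byte aligned. @alignCast asserts that the address really is
        // aligned for ObjString; safe build modes check this at runtime.
        const s: *ObjString = @alignCast(@fieldParentPtr("obj", o));
        return s;
    }
};
```

The cast is sound as long as every *Obj passed in really points into an ObjString that was allocated with the alignment of ObjString.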


I’m also interested in the answer to this question. I’m going through the book myself and went through what you’re going through.

Here’s what helped me: the article “Zig's @constCast”, which says essentially what @Southporter said above, though I would appreciate any deeper explanation from the talented people here.

Here’s my repo, quangd42/zlox on GitHub. I would love an opinion/review from you guys!

Just going to point out zig has tagged unions, which is effectively what you are implementing, though there are some differences.
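A rough sketch of what I mean. The number variant and the describe function are made up for illustration; only the string kind appears in the thread so far.

```zig
const std = @import("std");

// The compiler stores the tag next to the payload and checks every access.
const Obj = union(enum) {
    string: []const u8,
    number: f64, // hypothetical second kind

    fn describe(self: Obj) []const u8 {
        // The switch must cover every variant, so adding a kind later
        // produces compile errors wherever it is not handled.
        return switch (self) {
            .string => "string",
            .number => "number",
        };
    }
};
```

One difference from the book's layout: every Obj value here is as large as the biggest payload plus the tag.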


@quangd42 I think @constCast is unrelated to this. In your code, you are using @alignCast.

First, after reasoning a bit more about it, to answer my own question about alignCast in this context:

Using @alignCast, I promise the compiler that the pointer actually has the alignment the result type expects. And I can promise this for sure because I know that the obj field is part of an ObjString struct which was allocated (sorry, not shown in the code) using allocator.create(ObjString).

The promise would be broken, for example, if the ObjString were instead part of a packed struct, unless very carefully constructed.

In fact, as @vulpesx points out, the whole construct is basically a tagged union in disguise.
But the point is that the content is always created on the heap, and only the length actually needed for the payload is allocated.

One could call it a variable length tagged union.

By searching the net for this, I stumbled over the GitHub issue “More control over the data layout of tagged unions” (ziglang/zig issue #1922) and the discussion “Can I allocate only the required amount of memory for tagged unions?”.

One idea in the comments for issue 1922 is to use NaN tagging (a similar idea is pointer tagging). I know this idea from Robert Nystrom’s “wren” language (see the “Performance” page of the Wren documentation).
I’ll keep this idea in mind for later, when I have completed the language. Then I’ll be able to compare the performance of the current basic approach, NaN tagging, pointer tagging and the data-oriented approach (from the other link) for different use cases. I reckon the results strongly depend on the language’s type system, on dynamic vs. static typing, and on the program under test, e.g. number crunching vs. typical business logic in DB applications. The latter usually involves records with many string fields and logic based on strings, so for a real language I’d consider a special case for short strings.

I already had a solution without @fieldParentPtr before, by using a packed struct with obj as the first field. But I had to add @alignCast in several places nevertheless (that could have been caused by the strange way I allocated memory in that version, though).

The two different approaches from the “Can I allocate…?” discussion could probably be used as well: using a union of different pointer types, or a data-oriented approach with different ArrayLists (one for each ObjType tag, in my case). However, @sze’s idea using @unionInit, std.meta.TagPayload and std.meta.Child requires a bit more Zig knowledge than I have ATM.
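As far as I understand it, the first of those two approaches (a union of pointer types) could look roughly like the sketch below. ObjRef, ObjFunction, newString and typeName are names I made up, and the payload structs are simplified. The tag lives in the reference, and each payload is allocated at its own size.

```zig
const std = @import("std");

const ObjString = struct { chars: []const u8 };
const ObjFunction = struct { arity: u8 }; // hypothetical second kind

// A tagged union of pointers: no Obj header and no @fieldParentPtr needed.
const ObjRef = union(enum) {
    string: *ObjString,
    function: *ObjFunction,

    fn typeName(self: ObjRef) []const u8 {
        return switch (self) {
            .string => "string",
            .function => "function",
        };
    }
};

// Allocates only @sizeOf(ObjString) bytes for a string object.
fn newString(allocator: std.mem.Allocator, chars: []const u8) !ObjRef {
    const s = try allocator.create(ObjString);
    s.* = .{ .chars = chars };
    return .{ .string = s };
}
```

The price is that every reference carries the tag, so an ObjRef is larger than a plain pointer.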


If you read the whole article it ends up talking about ptrCast as well as alignCast. Either way, thanks for the detailed response and good resources to follow up with!

At this stage, yes, tagged unions work. However, a few chapters later, the Obj starts to get extra fields that cannot be expressed with only a tagged union unless you duplicate the field on all the subtypes.

For the field parent ptr vs tagged union comments, tagged unions are great when your different “subclasses” are all about the same size, and you know how many you’re going to have. I usually find I know how many “subclasses” I’m going to have, but you could make a hypothetical library where you want users to “inherit” from some base “classes”.

Another reason to lean towards field parent ptr is memory pinning. Callbacks and C interop often lead to an anyopaque pointer, and I find that field parent ptr fits that pattern very well.

I want to critique @Southporter’s point about member composition. I see where you’re coming from, it’s good to want to avoid duplicating fields, but consider:

  • comptime checks wrapping the “base class” will give you errors when you don’t have the right types duplicated
  • Duplicated fields between tagged union members take up the same space
  • If it does hit some critical mass for complexity, you can store shared fields in a struct (assuming all “subclasses” have those fields), and then store unique fields for each “subclass” as a tagged union beside those shared fields.
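
A quick sketch of that last point. The field names is_marked, next and arity are placeholders I picked for illustration, not taken from anyone's repo.

```zig
const std = @import("std");

const Obj = struct {
    // Shared by every "subclass" (hypothetical fields).
    is_marked: bool = false,
    next: ?*Obj = null,

    // Fields unique to each kind live in a tagged union beside them.
    data: union(enum) {
        string: []const u8,
        function: struct { arity: u8 },
    },

    fn typeName(self: *const Obj) []const u8 {
        return switch (self.data) {
            .string => "string",
            .function => "function",
        };
    }
};
```

Code that only touches the shared fields never needs to switch on the tag.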

Note that there are other methods to achieve something similar, but I’m not super experienced with them, and afaict you probably want a serious project first for them to be worth the effort.
