Better understanding of Zig idioms

Hello Zig fellows:

So I’m working on a small module that I’m hoping can largely be a well-documented, first-class experience for Zig end users, but I have some questions about a few Zig-related concepts:

  1. Managed vs Unmanaged - Zig is built around manual memory management, so I feel like these words are a bit of a misnomer. But it’s my understanding that the only difference is whether the data structure holds a reference to a std.mem.Allocator internally vs. having its memory-related methods take an allocator? Is this a correct understanding, or is there more to it?

  2. For a first class data structure, would Zig end users usually expect a Managed and Unmanaged variant? What do people typically reach for in everyday coding?

  3. What about “intrusive” types? I see Zig community members also periodically using those types. I’m wondering if I should consider building an intrusive version of my data structure.

My goal is to model my Set collection as if it were part of the standard lib while still largely being a general purpose solution.

Hey @deckarep, first - thanks for your work on creating a general set object. I think it’s a great example of a community contribution to the ecosystem.

Let’s go through your points one at a time.


  • Managed vs Unmanaged

Yes, you are correct that there is a misnomer here, but that’s not because of the objects you’re referring to. Fundamentally, manual memory management is not about objects at all, so I disagree with the framing of “do I have an allocator reference or not”.

Manual memory management means that you think about memory differently. Let’s think about a stack allocator as an example.

As far as the stack is concerned, it doesn’t have a list of foo type objects. It has a block of bytes and those bytes are occupied or not. Now, I as the programmer can imagine that as a group of foo types, but I can also choose to “deallocate” them by just resetting the index of the stack to zero. That’s really what manual memory management is about - I choose to see bytes the way I want to see them.
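To make that concrete, here is a minimal sketch using std.heap.FixedBufferAllocator (the Foo type and buffer size are made up for illustration):

const std = @import("std");

const Foo = struct { a: u32, b: u32 };

pub fn main() !void {
    var buf: [1024]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buf);

    // I choose to see a chunk of those bytes as a group of Foo values...
    const foos = try fba.allocator().alloc(Foo, 8);
    _ = foos;

    // ...and I "deallocate" the whole group by resetting the index to zero.
    fba.reset();
}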

So how does this play into your original question: what’s the deal with managed vs unmanaged data structures? Let’s take another example of ye ol’ ArenaAllocator to answer this question.

I may have one arena allocator that manages the memory over some number of arbitrary data structures (let’s call it N for short). In this case, I do not want each individual object acting like it’s an island of its own concerns - it’s not. The arena determines the lifetime of the group of N objects. Unmanaged data structures make this very apparent. In that case, why do I want N references to an allocator lying around when they are already handled centrally by one allocator that inherently sees them as a group?
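Here is a minimal sketch of that idea, assuming the current std container names (std.ArrayListUnmanaged stands in for any unmanaged structure):

const std = @import("std");

pub fn main() !void {
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit(); // one place decides the lifetime of the whole group

    const alloc = arena.allocator();

    // N unmanaged structures, none of which carries its own allocator field.
    var a: std.ArrayListUnmanaged(u32) = .{};
    var b: std.ArrayListUnmanaged(u32) = .{};
    try a.append(alloc, 1);
    try b.append(alloc, 2);
    // no per-structure deinit needed; the arena frees them all as a group
}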

  • First class data structures

Both, honestly. Not sure what else to say here but you can see that pattern played out in the standard library.
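For reference, a few of the pairs as they are named in std at the time of writing (names may shift between Zig versions):

const std = @import("std");

const ManagedList = std.ArrayList(u8); // holds a std.mem.Allocator field
const UnmanagedList = std.ArrayListUnmanaged(u8); // allocating methods take an allocator

const ManagedMap = std.AutoHashMap(u32, u32);
const UnmanagedMap = std.AutoHashMapUnmanaged(u32, u32);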

  • Intrusive types

I don’t think there is anything particularly special about intrusion in regards to Zig. If you look at the ArenaAllocator again, you’ll see intrusion used there. This makes perfect sense: we don’t want a slice of bytes stored with each node, because then we’d have to make a separate jump to get the actual bytes after getting the node. That said, the size of each allocation is ambiguous, so we just store the bytes along with an indicator of how many bytes follow the intrusive node.
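Roughly - and this is a simplification, not the exact std source - the node has this shape:

const BufNode = struct {
    next: ?*BufNode,
    len: usize, // how many bytes follow this node
    // the usable bytes are simply the rest of the same allocation
};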

Ultimately, I’m saying the following - imagine a linked list with a data member of usize. What’s the value of making that “intrusive”? Not much, really.

5 Likes

@AndrewCodeDev - awesome answer, and this is exactly the kind of context I’m hoping to gain a deeper understanding of. I may have some follow-up questions but let me noodle on this.

Super helpful and thanks for the kind words. The Zig community has been very awesome and welcoming.

2 Likes

Even after @AndrewCodeDev’s answer, I still don’t understand what intrusive types are. Could you guys explain this to us non-type-system-wizard types? (see what I did there? :joy_cat: )

4 Likes

I’ve barely come to learn about them. Honestly before this week I had never heard of them.

My understanding is that for some data structures, instead of creating some outer, separate collection type (i.e. the container type), each element has a reference to the container type built right into it. Hence the “intrusive” name.

So think about it like this:

Either you have: a container object that contains each element.

or

You have: each element has a pointer to the “container”…where now you are able to just look up where your next item is.

Somewhere I also read: they can be great for cache locality, since all the elements can be placed right next to each other in memory. It’s not the most intuitive concept to grasp IMO.

Here’s an example from MitchellH: libxev/src/queue.zig at main · mitchellh/libxev · GitHub

3 Likes

Sure - you embed the “node” into the data structure you want to link-up and you usually cast back up to the thing you’re linking to. A good rough sketch is @fieldParentPtr - you get the “parent” back from the “child” type.

So in the “external linked list” (the typical one we think about) you get something like this:

+----NODE-----+             +----NODE-----+ 
|    data     | ----------> |    data     |---------> ...
+-------------+             +-------------+

But with an intrusive list, it’s the other way around. You embed the “links” inside of the data object, and you point to another node of the same kind in a different object.

This is powerful because you can add an arbitrary number of links in these things. They aren’t just “contained” by one list - you can basically make a web if you wanted to.
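Here is a rough sketch of the pattern (the Player and Node names are made up for illustration):

const std = @import("std");

const Node = struct { next: ?*Node = null };

const Player = struct {
    name: []const u8,
    node: Node = .{}, // the list "intrudes" into the data itself
};

fn playerOf(n: *Node) *Player {
    // recover the containing object from the embedded node
    // (two-argument form on recent Zig; older versions also take the parent type)
    return @fieldParentPtr("node", n);
}

test "walk an intrusive link" {
    var a = Player{ .name = "a" };
    var b = Player{ .name = "b" };
    a.node.next = &b.node;
    try std.testing.expectEqualStrings("b", playerOf(a.node.next.?).name);
}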

Good article: https://www.gamedeveloper.com/programming/in-depth-intrusive-lists#close-modal

8 Likes

Very interesting. Sounds like the things game developers might use for efficiency. Thanks @AndrewCodeDev and @deckarep !

2 Likes

From my cursory searching around apparently this pattern is used in the Linux kernel as well.

Yes, you’re welcome @dude_the_builder and also thanks @AndrewCodeDev for the additional context.

1 Like

My understanding is that this pattern is used when memory is constrained, your data structures are already contiguous, and all you need is a way to “arrange” them (an element might be removed, the order matters and might change, etc.). In that case, there’s no need to allocate separate structures to keep that order; you might as well embed it in the original data struct. But you still want to keep some separation of concerns, so the specialized code that manages the list stays in a separate module.

1 Like

“Intrusive type” just means you add the fields required to support a collection type (like a next pointer for a linked list, or parent, left, and right child pointers for a tree) into the data itself (which is often a complex struct):

const Coord = struct {
    x: u32,
    y: u32,
    next: ?*Coord,
};

You want that next pointer to be the first field in the struct for hardware reasons (there are fast and slow paths for pointer chasing you can take advantage of), but Zig doesn’t let you control that unless you make the struct extern, and now we’re getting off topic.
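If you do want to pin that ordering, the extern version looks like this (with the extern-struct restrictions that come up later in this thread):

const CoordExtern = extern struct {
    next: ?*CoordExtern, // guaranteed to sit at offset 0
    x: u32,
    y: u32,
};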

If your language has the correct compile-time semantics, you can invert that a little and use encapsulation to do:

const Coord = struct { x: u32, y: u32 };

pub fn LinkedNode(comptime Data: type) type {
    return struct {
        const This = @This();
        d: Data,
        next: ?*This,
    };
}

const LinkedCoord = LinkedNode(Coord);

The C++ version of that, which is maybe easiest to understand, would be:

template<typename Data>
struct LinkedNode {
    Data d;
    LinkedNode* next;
};
using LinkedCoord = LinkedNode<Coord>;

This lets you still treat the linked list somewhat as its own structure and makes it more reusable. The second way isn’t possible in many languages without some ugly hackery (it can be done in C with heavy macro use, but it’s just easier to do it the first way).

I don’t know if this second way is technically intrusive, but it shares most of the same characteristics, with some additional positives and negatives (e.g. having multiple intrusive structures – fairly common – can get really ugly, and the next pointer isn’t repurposed nearly as easily – think of an object whose next pointer is used to chain in a hash table and then reused for a free list when it’s not in the table).

The alternative, non-intrusive way would be keeping the data as an opaque pointer (a void pointer in C, a generic Object reference in Java):

const LinkedListNode = struct {
    data: *anyopaque, // void* in C
    next: ?*LinkedListNode,
};

You need two pointer hops to get to your data now. Linked lists are slow enough that doubling the pointer chases is really painful.

They are a good example of expedience over purity – usually a hallmark of systems languages.

I actually have something like this in some code: 3 intrusive structs on a single node, and some of those values (the two next pointers) carry double duty depending on the current state of the object (and I use a hack to make the pointers nullable). Doing this in non-intrusive code would be a monstrosity:

const TradeReq = struct {
    // ... a bunch of fields for reconciling some events
    const This = @This();
    fl_next: *This,
    tr_parent: *This,
    tr_children: *This,
    oob_next: *This,
};
1 Like

There’s a tradeoff here: using extern, one can apply old-school knowledge of struct packing and of where intrusive pointers should be placed. But letting the compiler take care of it offloads all of that to a machine.

I know that the current Zig compiler knows how to use space optimally, so struct packing just to get the smallest size isn’t necessary; Zig will just do that for you. I don’t know if it currently takes into account heuristics like “a pointer to the same struct type I’m building should be the first field”, and I’m pretty sure there’s no (or only limited) profile-guided optimization going on.

But Zig will be around for a long time, and compiler optimization tends to be a one-way ratchet which steadily improves. I’d say that in most cases this kind of decision should be left up to the compiler. While this is likely to be not applicable to the specific case you’re describing, there can be cases where different layouts are better for different architectures, or even different programs. Handling that sort of thing without compiler support is difficult and tedious.

zigc has a very unnuanced idea of performance and only tries to minimize space by moving the larger-aligned fields up front (probably 60% of the structs I write are extern so I can control the layout, but yes, you do give up the ability to use non-extern structs then, which kind of sucks).

The optimization I was talking about is that x64 (Intel and AMD) has fast paths to prefetch an address loaded from a struct, but it has to be on the same page. If it is, the latency is just pipelined and OOO’d away. So you want those pointers earlier in the struct to let that happen more often. If the struct crosses a page boundary, it will fetch the value from the wrong place, then notice the tag is wrong and have to go to the slow path and refetch from the correct location. It only happens on page boundaries, but if your struct is 256 bytes, that’s 1 in 16 of your accesses going to stall.

I have a date structure that is laid out in a very specific order to allow bitcasting to an int, which then compares properly with just < and >. There are all sorts of neat little things you can do when you take over the layout.
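As an illustration of that kind of trick (the fields and widths here are assumptions, not my actual struct), a packed date whose backing integer orders correctly:

const std = @import("std");

// packed structs fill from the least significant bit, so the most
// significant field (year) goes last for the comparison to work
const Date = packed struct(u32) {
    day: u8,
    month: u8,
    year: u16,
};

test "dates compare as integers" {
    const a: u32 = @bitCast(Date{ .year = 2023, .month = 12, .day = 31 });
    const b: u32 = @bitCast(Date{ .year = 2024, .month = 1, .day = 1 });
    try std.testing.expect(a < b);
}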

There is no PGO, and that only works well for a certain kind of program. In some of the code I work on, my fast path isn’t the most often taken (spinning on network reads, waiting for a specific event that might happen a few times an hour). LLVM supports it, and it would be nice to have for all the code it does work well with. Same with LTO - it would be very nice to have.

It doesn’t understand false and true sharing either - I wrote an SPSC ring buffer and had to take total control to keep the atomics and shared data either separate, or in cache lines with the other data each side of the channel would need.

It would be nice if there were a “fixed struct” or something that was just like a regular struct but didn’t mess with the field ordering.

I’m aware, I opened #19917 because the layout code can’t currently tuck a tag into the body of an anonymous struct in a tagged union. There’s a lot of low-hanging fruit.

But again, compilers tend to only improve over time. We don’t need to import the C++ superstition that the compiler can optimize things it clearly can’t, but placing a pointer clearly intended for a linked list in the right place is not a challenging or compile-time consuming optimization.

Isn’t this just extern? Extern structs can have decls, what’s missing?

1 Like

Extern structs can’t contain anything that is not itself extern-compatible.

Just last week I was trying to wrap an atomic or a mutex around some generic structs, but to keep the mutex in the right place I needed to add byte padding, and to get the fields in the right order (lock, padding, other) rather than (other, lock, padding) I needed to make the struct extern. But at that point some of the structs I wanted to wrap couldn’t be included, since they weren’t extern. You also can’t contain slices, or bit-sized integers (u1s), or anything else C doesn’t have.
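This is roughly the shape I mean (names made up; the point is that once the wrapper is extern, everything inside it has to be extern-compatible too):

const std = @import("std");

const Guarded = extern struct {
    lock: u32, // stand-in for a lock word; std.Thread.Mutex can't go in an extern struct
    _pad: [60]u8, // pad out the rest of a 64-byte cache line
    value: u64, // the "other" data, now guaranteed to start on the next line
};

comptime {
    std.debug.assert(@offsetOf(Guarded, "value") == 64);
}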

It’s struct coloring, basically. You can’t use most of std in an extern struct unless you copy and paste it out, make some changes, and use it that way. I’m not quite willing to stoop to that level yet, so it is still an open problem.

1 Like

Ok, that makes sense. It’s an open question how an intermediate fixed struct type could contain other struct types with no defined layout, but those types do have a width and an alignment, so it’s not an insane thing to want.

The Zig compiler is in an awkward growing stage, where people are starting to want it to be more sophisticated than it actually is at the moment. But a fixed struct type strikes me as a band-aid here. Compilers aren’t magic, but they can be, and mature ones are, good enough to be trusted to lay out memory except when an exact order is needed. Currently Zig can’t make a u8-tagged union with a struct containing a u32 field come out to 8 bytes, so clearly we aren’t anywhere near the destination.

But if you add fixed struct then the language is stuck with it forever, and I do expect your use case (getting memory layout to do some very obvious things which a mature compiler could certainly figure out) will fade over time.

I could be wrong here. Maybe we need something more like callconv for layout. Instead of extern, packed, and fixed, have struct layout(.C), struct layout(.packed), struct layout(.fixed), and so on. That would be extensible at least; the language could add endian-specified layouts for networking purposes, for example.

s/layout/__attribute__(())/

I don’t think the need for fixed will ever go away for high-performance code; no modern compiler that I know of can figure out false-sharing prevention along with true sharing. They can default to just throwing each atomic on its own cache line, but that’s just lazy and suboptimal.

My SPSC queue has a cache line for the reader, one for the writer, a shared cache line of read-only data, and two more cache lines that each hold an atomic plus one piece of data that is written periodically to publish to the other side. The shared RO line is kept in MOESI state Shared (each side is assumed to have its own core), the lines local to each side are in Exclusive, and the publishing cache lines are kept Owned in one cache. This cuts down on cache line conflicts and coherency traffic. I haven’t heard of anything that would figure that out.
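A loose sketch of just the separation part, using field alignment (the names and fields are made up; this doesn’t capture the ordering or the MOESI-state reasoning, only keeping each piece on its own cache line):

const std = @import("std");
const cl = std.atomic.cache_line;

const SpscShape = struct {
    reader: ReaderSide align(cl) = .{}, // reader-local working state
    writer: WriterSide align(cl) = .{}, // writer-local working state
    shared: Shared align(cl) = .{}, // read-only data both sides use
    head: std.atomic.Value(usize) align(cl) = std.atomic.Value(usize).init(0),
    tail: std.atomic.Value(usize) align(cl) = std.atomic.Value(usize).init(0),

    const ReaderSide = struct { cached_tail: usize = 0 };
    const WriterSide = struct { cached_head: usize = 0 };
    const Shared = struct { buf: ?[*]u8 = null, cap: usize = 0 };
};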

Even something as simple as putting the size at the front of a wire struct, so readers can jump over it if they don’t know or care about it, I can’t see working out very well.

I can’t think of a technical reason it couldn’t just be treated similarly to const - extern applies to that struct only (I know that isn’t a perfect analogy). If you have a non-extern struct inside an extern one, there are no guarantees on it. At some level you need to let developers be experienced developers and stop trying to protect them from everything.

Modern languages sometimes remind me of helicopter parents that kept every germ away from their kids, and now their kids have no immune system. Sometimes letting people mess up is the best thing you can do. The first few times some jr dev includes a non-fixed-layout struct inside a fixed one, they’d soon stop. Maybe put a lint on it that could be turned off with a comment.

I’ve seen the tagged union issues. I don’t use them, don’t return slices (they go on the stack) when I don’t have to, don’t use optional types that don’t have a niche carveout - a lot of things, because the compilation of them is still immature. They’ll get better, but something like this I think will still be useful.

1 Like

I have a deep aesthetic hatred of double-underscore bracing, no thank you please. I’d much prefer that the way callconv is handled now be extended to layout.

There are open issues around the intention of extern structs and the naming of packed structs, as well as #6478, which directly addresses your fixed proposal. I’d say that collapsing all this proliferation into a single layout keyword is a reasonable solution. It allows for more than one attribute to be applied, like layout(.fixed, .little_endian), and it leaves room for further extension, like a layout(.Swift).

These are mostly growing pains. You’ve convinced me that there’s a place for a layout which is fixed but not merely in the way that the C ABI allows, at least. But register-allocating both words of a fat pointer is not an undecidable problem, it’s just one of a myriad of things that the small core of Zig compiler engineers haven’t gotten around to yet.

Can you explain what you mean, I might be misunderstanding that example?
Do you mean something like this?

const std = @import("std");

const Example = struct {
    const Choices = union(enum) {
        a: u8,
        b: u8,
    };

    count: u32,
    choice: Choices,
};

pub fn main() !void {
    std.debug.print("sizeOf: {}\n", .{@sizeOf(Example)});
}

This prints 8 for me on zig 0.12

Indeed :beers:

I read that exactly the opposite - I thought @mnemnion meant a union where u8 is the tag.

2 Likes

Correct, but I oversimplified to the point of being wrong.

This is over a word in width:

const std = @import("std");
const expect = std.testing.expect;

const E = enum(u8) {
    float,
    bob,
    overlong,
};

const ATagUnion = union(E) {
    float: f32, // fine
    bob: struct {
        oneField: u32, // still fine
    },
    overlong: struct {
        byte: u8,
        oops: u32, // now we're over
    },
};

test "tag union size" {
    std.debug.print("\nSize of ATagUnion is {d}\n", .{@sizeOf(ATagUnion)});
    try expect(true);
}  // -> Size of ATagUnion is 12 

The alignment of .overlong is 4, so the .byte field takes up 4 bytes, making the payload 8 bytes and word-aligned. The enum tag is stored outside the struct body, so the union ends up being 12 bytes, even though there are 3 bytes of padding inside .overlong which could be used to store the enum.

It’s just how the compiler works right now. This can be remedied. I’ve even looked into doing it myself, there’s a lot to learn in the Zig repo and it’s slow going, but y’know. Be the change you want to see…

2 Likes