Is the trick used in the demo code good?

I want to implement a patricia tree alike structure.
In the implementation, there are 3 types: Value, Node and Tree.
Value depends on Tree, Node depends on Value, and Tree depends on Node.
Obvious self-dependency here, but it is not really, because a Tree will never use Node.value.
So I use a trick shown below to avoid the self-dependency. It looks it works, but I don’t know how safe is the trick. Does it works in all build modes?

const std = @import("std");

const Value = struct {
	x: u32,
	deepeTree: Tree,
};

const Node = struct {
	parent: *Node,
	left: *Node,
	right: *Node,
	value: Value,
};

const NodeWithoutValue = struct{
	x: struct {
		parent: *Node,
		left: *Node,
		right: *Node,
	} align(@alignOf(Node)),
};

const Tree = struct {
	root: *Node = undefined,
	
	// Here, Node is not used to avoid self-dependency.
	_zero: NodeWithoutValue = undefined,
	
	// nullNode.value will never be used.
	nullNode: *Node = undefined, 
	
	pub fn init(self: *Tree) void {
		self.nullNode = @ptrCast(&self._zero);
		self.root = self.nullNode;
	}
};

pub fn main() !void {
	var t: Tree = undefined;
	t.init();
}

This is not safe. Regular structs don’t have a defined memory layout. The compiler is allowed to reorder fields, or add hidden fields. And the compiler is actually doing this for example if you change the alignment of value the compiler will reorder the struct behind your back:

const Node = struct {
	parent: *Node,
	left: *Node,
	right: *Node,
	value: Value align(32),
};

is reordered to:

const Node = struct {
	value: Value align(32),
	parent: *Node,
	left: *Node,
	right: *Node,
};

Here, you can check it yourself if you want:

var x: Node = undefined;
std.log.err("Position in struct: {}", .{@intFromPtr(&x.value) - @intFromPtr(&x)});

The solution to this would be to use a packed or extern struct. However, I would recommend to take a step back and look for a different approach.

Do you even need _zero to be present in the Tree? Do you intend to change it at runtime? If not then you can make it a global variable and it will work fine:

var _zero: Node = undefined;

Also what are you even trying to do here? Couldn’t you just use a ?Node and regular null?

3 Likes

Many thanks for your explanations.

Although the _zero var might be modified, it is okay to make it be a global var now.
This is actually the current implementation. But the implementation doesn’t safe for
using multiple trees concurrently, so I’m seeking for a more elegant way.

It is a red-black tree implementation ported from https://www.eecs.umich.edu/courses/eecs380/ALG/niemann/s_rbt.txt,
I tried ?*Node as Node fields, it doesn’t work (it might work by modifying the ported code heavily).