I donât know what you are referring to, do you mean the children slice?
I made some rough calculations, I think the StencilVector implementation I have would work very well for your usecase and the custom allocator too.
You could use the StencilVector directly to implement your nodes, this would give you a compact memory layout and no need to manually sort children of nodes, the only part which would need some figuring out is how the building actually works.
I think your current code uses mutation of nodes to build the data structure, so it mixes the node identity and changing the size of the data associated with that identity.
Because the StencilVector is allocated with a specific count of nodes it canât change the size directly, but it would be relatively easy to work around, while building you would have an array of stencilvector references (either index or pointer).
When creating a new node you would add it to the array, the index is the logical identity of the node, the value is the index/pointer to the current stencilvector that represents the node data. Then when a node grows the value is replaced with the new stencilvector, while the identity stays the same.
When the building is done and there wonât be any size changes anymore, you could do a pass over all the logical ids and replace them with the direct ref (index or pointer). Thus removing that extra indirection.
I think this would achieve much better space utilization than a solution that tries to make the nodes change in size while keeping an identity directly, the existing node also has an indirection that isnât needed, there it is the children slice.
For this StencilVector solution the extra indirection would only be needed while the datastructure is being constructed and can be removed after building.
Here are the rough calculations:
Scribble node sizes:
---------------------------------
Orig Node
node size: 1 byte data + 3 bytes padding + 16 bytes slice
min node size: 20 (0 children)
max node size: 640 (31 children)
---------------------------------
multiarraylist Node
node size: 4 bytes start of children + 1 byte children count + 1 byte data
min node size: 6 (0 children)
max node size: 192 (31 children)
---------------------------------
PointerNode StencilVector
MaskInt: u31
MetaBits: 33
---------------------------------
min node size: 8 (0 children)
max node size: 256 (31 children)
---------------------------------
IndexNode StencilVector
MaskInt: u31
MetaBits: 1
---------------------------------
min node size: 8 (0 children)
max node size: 132 (31 children)
---------------------------------
IndexNode StencilVector (avoid bad flags layout)
MaskInt: u31
MetaBits: 1
---------------------------------
min node size: 4 (0 children)
max node size: 128 (31 children)
plus 1-12 bytes for flags (using DOD) for 1-32 nodes
---------------------------------
The only slightly annoying thing is the IndexNode variant being 8 bytes for empty children, but when you think about it you realize that it doesnât matter, because (If I am simulating things correctly in my head) we donât actually need to store node.data.code
when using these StencilVectors and that means that there are only 2^3 = 8 different possible empty nodes (based on the remaining 3 boolean flags) so we can just create these as static instances. Further we can check whether only some of them are actually used and then only create those.
Thus a node ref directly tells us whether it is one of the empty nodes, thus we donât need to store flags for them. Alternatively we can use Data Oriented Design and store the nodes and flags using SoA thus the flags just become either a single byte/nibble in a flags array or 3 bit in three separate bitarrays.
Or store 1 bit in the meta bit and 2 bit in two separate bitarrays.
(there might be a whole lot of ways to actually deal with these flags)
If that doesnât work for some reason, we also could use the remaining single meta bit, to indicate that the bytes need to be interpreted differently for some kind of small node optimization.
The reason why I say that node.data.code
isnât needed, is that when using StencilVectors, every node that refers to other nodes already stores the same information in its bitmask, so only external references or code that does calculations would need to hold on to that information until it is encoded in a bitmask again. External references would basically be StencilVector-index + u5.
There might be more other tricks possible, if I understood the algorithm and the connections between the nodes fully. I guess that could be found out by actually trying to implement it.
But this should be enough info so that you could calculate some rough numbers about how much memory each variant would require.