Something like ArrayList that preserves index positions?

nairou · October 1, 2023, 8:33pm

I’m converting some C code to Zig, and am curious whether the standard library provides something like an ArrayList, where items can be added/removed and are contiguous in memory, but where an item’s index never changes. So it would use something like a linked list to track unused slots in the list once they are removed, and hopefully tracking the generation of each index allocation to prevent blind reuse.

I’m not sure what this type of item management is called, so searching documentation hasn’t been fruitful.

Does Zig have something like this, or do I just need to port that code as well and manage it myself?

AndrewCodeDev · October 1, 2023, 8:47pm

It would be fairly easy to fashion something like this from an existing data structure.

added/removed and are contiguous in memory, but where an item’s index never changes

You could have an array list of optional types. So your declaration would be something like…

ArrayList(?T)

If you don’t care about how you store the order of free-indices, you could have a remove or vacate function that takes an index as a parameter. Then, in a second ArrayList (of some kind of indicial type like u64 or similar), just append the index to that list.

In that case, I would use ArrayLists’s append to push back the back the index of the list the item you want to remove (after doing proper clean-up on the index that you’re interested and setting it to null). To get a free index, just use ArrayLists’s pop function to grab one off the buffer.

You could easily implement other helper functions like vacant that can tell you if an index is free kind of like return self.items[index] == null.

Unless I’m missing something really particular here, I don’t see the advantage of a linked list. A linked list in this case would be useful if the memory address should not change. It sounds like you don’t want the index position to change. So why not just buffer those indices and pop them off as you need them?

nairou · October 1, 2023, 8:51pm

You are correct. In my C code, I was linking unused slots via index rather than using a pointer. They still chain one to the next like a linked list, each holding the index of the next free slot.

But your idea or using a separate ArrayList to handle this is interesting, and might reduce some of the complexity…

AndrewCodeDev · October 1, 2023, 8:58pm

It ultimately sounds like you’re trying to do something fairly simple. Have a buffer of nullable items that grows linearly where items are not removed, just nulled-out (or decommissioned).

You avoid all the complexity of having to walk a linked list and you get contiguous access. You can also preallocate your buffers so you don’t have to resize as much up front (ArrayList has a growth factor which gives it some headroom so you don’t have to resize it every time you want to append).

Plus, deallocation is a breeze - just iterate and deinit your data (if you even have to… if they are trivially destructible types that don’t manage other data, you can just deallocate the whole list and be done with it). And you can certainly deallocate the index buffer by just calling deinit on it once.

The idea of using a linked list is actually not unlike what happens in allocators where memory cannot move. I can give you some links to topics on this here on the forums if you’d like.

Also, forgot to welcome you to the forums @nairou

nairou · October 1, 2023, 9:07pm

The problem is that I’m going to be adding and removing a lot, and I’d like to be able to set an upper limit on the buffer size and avoid allocations.

Context: this is for a Delaunay triangulation algorithm, so I’m constantly deleting a triangle and then adding some number of new ones. So never reusing slots in the buffer could become a problem for a large enough set.

Using two ArrayLists would also work, if I pre-filled the second with every available index, and re-added indices when they are removed from the main ArrayList.

But either way, it’s sounding like I’ll have to implement the logic myself rather than using something existing from the standard library.

Thanks!

AndrewCodeDev · October 1, 2023, 9:10pm

Right, and that’s what I meant (I probably could have been more clear). Once an index in nullified, it becomes vacant again and it’s numerical position goes into the free-index buffer. By pre-allocating, you can just stick to that size up front if you know what you want.

Anyhow, sounds like you’ve got some ideas. Post some code when you get a bit further!

dude_the_builder · October 1, 2023, 9:58pm

Not sure if this is helpful in your case, but check out ArrayList swapRemove which removes an item at a specified index and fills the slot with the last item in the list. If order isn’t important, this operation is O(1) versus orderedRemove which preserves the order but is O(N). In one of my past projects I was hit hard when I used pointers to items in the list instead of indexes. While testing, the list was small and never re-allocated so all tests passed. Once I got to the real data, boom!, invalid pointers. I switched to using indexes and problem solved; no matter how many times the underlying memory is re-allocated, the indexes will remain the same.

andrewrk · October 1, 2023, 10:30pm

std.SegmentedList

This is a stack data structure where pointers to indexes have the same lifetime as the data structure
itself, unlike ArrayList where append() invalidates all existing element pointers.
The tradeoff is that elements are not guaranteed to be contiguous. For that, use ArrayList.
Note however that most elements are contiguous, making this data structure cache-friendly.

Because it never has to copy elements from an old location to a new location, it does not require
its elements to be copyable, and it avoids wasting memory when backed by an ArenaAllocator.
Note that the append() and pop() convenience methods perform a copy, but you can instead use
addOne(), at(), setCapacity(), and shrinkCapacity() to avoid copying items.

This data structure has O(1) append and O(1) pop.

It supports preallocated elements, making it especially well suited when the expected maximum
size is small. prealloc_item_count must be 0, or a power of 2.

andrewrk · October 1, 2023, 10:35pm

In this case I would recommend std.ArrayListUnmanaged with a call to ensureTotalCapacity() and then use the set of functions that include AssumeCapacity in the name. If you do this, you can be sure the pointers are never invalidated if you never pass an Allocator to the array list.

andy_mango · October 2, 2023, 4:11pm

I has some similar requirements and solved the problem by creating a direct allocation scheme on top of an array. The introduction in the program document gives the requirements and design strategy so you can determine if they meet your needs. The source code and an example program are also available.