In a post here I came across someone mentioning std.Treap
. I looked it up, did some research and found it very interesting. I am, however, wondering why the Treap specifically made it into the stdlib and what are its tradeoffs / advantages compared to something like a normal avl tree?
The treap is used in the GeneralPurposeAllocator to store buckets. Why a treap specifically? Don’t know. It might be that OS pages have a tendency to be ordered, and ordered inputs force maximum rotation on AVLs. So using a random priority would avoid that.
But that’s just a guess.
Thank you already though, thats useful information to my understanding of why the treap is in there
Treap
in standard library is used in three places.
-
GPA
From the commit comment that added the Treap:Before this commit, GeneralPurposeAllocator could run into incredibly degraded performance in scenarios where the bucket count for a particular size class grew to be large.
…
std.Treap is used instead of a doubly linked list for the lists of buckets. This takes the time complexity of searchBucket [used in resize and free] from O(n) to O(log n), but increases the time complexity of insert from O(1) to O(log n) -
Futex
It seems related to the fairness of wake, but I am not sure. -
resinator (windows resource compiler)
Keeps source mapping for#line
handling.
Why is a treap superior to other trees here though? Or why else was it used?
I think your question is more towards tree-heaps in general instead of std.Treap
. You can find more information about tree-heap data structures through some direct searches:
Yep, thats true. I didnt know where else to ask, though. Thank you for the link!
Note that std.Treap
technically predates all of these usages, but the motivation was for std.Thread.Futex
(and for a while Futex
was the only usage of Treap within the Zig codebase). The pull request that introduced std.Treap
has a nice writeup that contains the motivation:
The reason for adding a Treap in the standard library over more common self-balancing binary search trees like AVL and RedBlack is that the implementation is iterative, requires similar Node overhead, and (most importantly) is relatively much simpler.
I used it for the GeneralPurposeAllocator because it existed in the standard library and had the properties I was looking for:
Any data structure with
O(log n)
or better search/insert/delete would also work for this use-case.I initially used a skip list implementation that I wrote for this because I wasn’t aware of
std.Treap
, butstd.Treap
slightly outperformed it in my benchmarks and provides all the same benefits.
OH YES this was the answer I was hoping for, case solved.
The first quoted paragraph is it!
Even more context for those curious:
std
had a red-black tree implementation at one point but it was deleted and moved to the std-lib-orphanage:
And as somewhat of a proof of the
[Treap] is relatively much simpler.
claim, std.rb
had bugs. From this PR that wanted to reinstate it for use with an Allocator implementation:
Note that this was marked as a draft due to (quite major) bugs found in std.rb - a rewrite of it is underway.
(that rewrite never materialized)
Interesting, where is the stdlib orphanage though?