Assuming `get` and `free` have to be safely callable from any thread on any box, I have two approaches you could try.
Simple approach (doesn’t compute multiple box values in parallel):
Make the cache hold two mutexes (`std.Thread.Mutex`), which I will refer to as `get_mutex` and `memory_pool_mutex`.
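In other words, the cache struct gains something along these lines (a sketch; every field other than the two mutexes is elided):

```zig
const Cache = struct {
    get_mutex: std.Thread.Mutex = .{},
    memory_pool_mutex: std.Thread.Mutex = .{},
    // ...your existing fields (cached values, memory pool, counters, ...)
};
```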
Wrap the entire contents of `get` with `get_mutex` as follows:
```
fn get(cache, id) T {
    cache.get_mutex.lock();
    defer cache.get_mutex.unlock();
    // rest of function
}
```
This will handle data races on the data structure that checks for a box's cached value (calculating `already_computed`). It will also eliminate certain race conditions, such as when two threads both see that a value isn't in the cache at the same time and both try to compute it from scratch.
Then, wrap all calls to functions on the memory pool (or whatever data structure manages your free-list and memory allocations) with `memory_pool_mutex`, or just make a wrapper struct for your memory pool that adds these locks to every function.
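As a rough sketch of the wrapper-struct option (the `MemoryPool`, `alloc`, and `free` names here are placeholders for whatever your pool actually exposes):

```zig
const LockedPool = struct {
    mutex: std.Thread.Mutex = .{},
    pool: MemoryPool, // placeholder for your real pool type

    pub fn alloc(self: *LockedPool) *T {
        self.mutex.lock();
        defer self.mutex.unlock();
        return self.pool.alloc();
    }

    pub fn free(self: *LockedPool, ptr: *T) void {
        self.mutex.lock();
        defer self.mutex.unlock();
        self.pool.free(ptr);
    }
};
```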
Finally, you need to ensure there are no data races on the counters, which you can do by simply storing each one as a `std.atomic.Value`.
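For example, a plain `count: u32` field that you bump with `+= 1` would become something along these lines (`.monotonic` is enough if the counters aren't used to synchronize other data):

```zig
// Field on the cache: an atomic counter instead of a plain integer.
count: std.atomic.Value(u32) = .init(0),

// Updating/reading it from any thread:
_ = cache.count.fetchAdd(1, .monotonic);
const current = cache.count.load(.monotonic);
```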
Expanded approach (computes multiple box values in parallel):
The first approach locks the entire `get` function, meaning you can only calculate a single box's value at once across all threads. I'm assuming that the computation of a box's value is by far the most intensive part of `get`, so this isn't ideal.
To get around this while still keeping things relatively simple, you could make retrieving a cached value give not just a pointer, but a pointer plus a 'latch' that must be waited on until the pointed-to value is initialized. This way the `get` function could do the allocation of the box's value and the marking of it as 'computed' in the locked section, then actually compute the value in an unlocked section:
```
fn get(cache, id) T {
    *lock mutex*
    if already_computed                            (LOCKED)
        get latch and pointer to value             (LOCKED)
        *unlock mutex*
        wait on latch                              (UNLOCKED)
        return pointer to value                    (UNLOCKED)
    else
        initialize the counter                     (LOCKED)
        allocate value                             (LOCKED)
        make latch for value                       (LOCKED)
        mark value as computed                     (LOCKED)
        *unlock mutex*
        compute value, storing in allocated spot   (UNLOCKED)
        open latch                                 (UNLOCKED)
        return pointer to value                    (UNLOCKED)
}
```
Unfortunately the Zig stdlib doesn't have a premade latch that I know of, but a simple latch like this should work in your case:
```zig
const std = @import("std");

const Latch = struct {
    is_open: std.atomic.Value(u32),

    const init: Latch = .{
        .is_open = .init(0),
    };

    /// Marks the latch as open and wakes every thread blocked in `wait`.
    pub fn open(self: *Latch) void {
        self.is_open.store(1, .release);
        std.Thread.Futex.wake(&self.is_open, std.math.maxInt(u32));
    }

    /// Blocks until `open` has been called; returns immediately if it already was.
    pub fn wait(self: *Latch) void {
        while (self.is_open.load(.acquire) == 0) {
            std.Thread.Futex.wait(&self.is_open, 0);
        }
    }
};
```
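Putting it together, the locked/unlocked split from the pseudocode above could look roughly like this. This is only a sketch: `Cache`, `Id`, `T`, `entries`, `pool`, `allocator`, and `computeValue` are all placeholders for your real types and functions, and the error handling is deliberately crude:

```zig
const Entry = struct {
    value: *T, // allocated from the pool, filled in outside the lock
    latch: *Latch, // opened once `value` is fully written
};

fn get(cache: *Cache, id: Id) *T {
    cache.get_mutex.lock();

    if (cache.entries.get(id)) |entry| {
        // Some thread already claimed this box: wait until its value is ready.
        cache.get_mutex.unlock();
        entry.latch.wait();
        return entry.value;
    }

    // Not cached yet: allocate and mark it as 'computed' while still locked,
    // so no other thread starts the same computation.
    const value_ptr = cache.pool.alloc(); // placeholder pool API
    const latch = cache.allocator.create(Latch) catch @panic("oom");
    latch.* = .init;
    cache.entries.put(id, .{ .value = value_ptr, .latch = latch }) catch @panic("oom");
    cache.get_mutex.unlock();

    // The heavy work happens outside the lock, so different boxes can be
    // computed in parallel.
    value_ptr.* = computeValue(id);
    latch.open();
    return value_ptr;
}
```

Note that the latch needs a stable address (the futex waits on it), which is why it's heap-allocated here rather than stored directly inside a growable hash map.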
You might be able to reduce blocking further and allow some of the smaller things, like the `already_computed` checks, to be done in parallel, but I generally think it's good to keep the locking logic as simple as possible, focusing mostly on making the 'big chunks' of computation parallel/non-blocking, even if the heavy-handed/naive locking on smaller computations is 'uglier'.