This is a structure that stores data about chess positions during the search for the best move. I plan to use a multi-threaded search where I will run multiple search routines in separate threads, and all threads will store and load data in a global ttArray, which will act as shared memory. This approach is called Lazy SMP (Lazy SMP - Chessprogramming wiki).
From what I have read and studied, I need to use @atomicRmw for storing and @atomicLoad for loading data, but I haven’t found good examples and I am not sure whether this is the correct approach.
I think std.atomic provides a friendlier interface for atomic operations. For Zig master it’s here and for 0.11 here. You basically wrap your data in a Value (Atomic on 0.11) and then you have load and store methods.
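As a sketch of what that wrapping could look like for a transposition table (the `Entry` layout and the names `ttStore`/`ttLoad` are just illustrations, and this uses 0.11’s `std.atomic.Atomic` with capitalized orderings; on master the type is `std.atomic.Value` and orderings are lowercase):

```zig
const std = @import("std");

// Hypothetical TT entry packed into 64 bits so the whole thing fits in
// one register and can be loaded/stored atomically as a u64.
const Entry = packed struct {
    key: u32,
    score: i16,
    depth: u8,
    flag: u8,
};

// Global shared table: an array of atomic 64-bit slots.
var tt = [_]std.atomic.Atomic(u64){std.atomic.Atomic(u64).init(0)} ** 1024;

fn ttStore(index: usize, e: Entry) void {
    tt[index].store(@bitCast(e), .SeqCst);
}

fn ttLoad(index: usize) Entry {
    return @bitCast(tt[index].load(.SeqCst));
}
```

Each slot is independently atomic; nothing here synchronizes *between* slots, which is exactly the Lazy SMP trade-off.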
You may also want to test the non-atomic way using a std.Thread.Mutex. I’ve found that depending on the situation atomics may be faster and in other cases the mutex is faster. You always have to test and see with multithreading.
There is no such thing as “atomic access to the entire array”. Atomic is about operations, not types. You operate atomically on an element. There is no way to operate atomically on an array, unless you can pack the entire array into a register. std.atomic is just a very thin wrapper over the @atomic* builtins. In this case, the atomic approach doesn’t work because @atomic* doesn’t operate on structs, hence the casts I used above.
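To make the “casts” point concrete: since `@atomicLoad`/`@atomicStore` won’t accept a struct type, one workaround is to reinterpret a pointer to a register-sized packed struct as a pointer to its backing integer. This is only a sketch (the `Entry` layout is hypothetical, and it assumes the packed struct has the alignment of its u64 backing integer, which `@alignCast` will safety-check):

```zig
const std = @import("std");

const Entry = packed struct { key: u32, score: i16, depth: u8, flag: u8 };

var table: [1024]Entry = undefined;

fn loadEntry(i: usize) Entry {
    // Reinterpret *Entry as *u64 so the builtin accepts it.
    const p: *u64 = @ptrCast(@alignCast(&table[i]));
    return @bitCast(@atomicLoad(u64, p, .SeqCst));
}

fn storeEntry(i: usize, e: Entry) void {
    const p: *u64 = @ptrCast(@alignCast(&table[i]));
    const bits: u64 = @bitCast(e);
    @atomicStore(u64, p, bits, .SeqCst);
}
```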
Yeah, at that point you’re crossing into mutex territory. Now, if you wanted an array of atomic values, that’s another issue. Or, if you wanted to build a kind of ticket-mutex that uses an atomic counter to track whose turn it is… then we’re getting into atomics being useful.
@jabolcni If you just want to lock down a resource, you probably want a mutex. I’d start with that first and see where your bottlenecks are. At the very least, it’s super easy to implement an example to get some baselines for further experiments.
Pointer casts were trial and error; this seems to work, but using @atomicLoad is slower than just using load_simple. I am not sure which memory ordering I should use. I read about them all, but this is the first time I am using anything atomic, so I have no experience with them. I also need to test this on Linux.
I will also try to figure out mutexes, so that I can test the differences.
Another question you should ask yourself is whether you really need sequential consistency. There are many weaker (and therefore faster) orderings which might be enough here.
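For independent transposition-table slots, the weakest ordering that is still a data-race-free atomic operation is often enough. A sketch, assuming 0.11’s `std.atomic.Atomic` and the capitalized `.Monotonic` spelling (C11 “relaxed”); the `probe`/`save` names are made up:

```zig
const std = @import("std");

// .Monotonic guarantees the load/store itself is atomic (no torn
// reads/writes) but imposes no ordering on surrounding memory
// operations. If each slot is self-contained, that may be all you need.
var tt = [_]std.atomic.Atomic(u64){std.atomic.Atomic(u64).init(0)} ** 1024;

fn probe(index: usize) u64 {
    return tt[index].load(.Monotonic);
}

fn save(index: usize, bits: u64) void {
    tt[index].store(bits, .Monotonic);
}
```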
You can check out the LLVM guide for help; it’s a bit technical, but it describes what each of the atomic orderings does: https://llvm.org/docs/Atomics.html#id7
There is also .Unordered, which doesn’t guarantee any ordering of reads/writes.
It basically only guarantees that you don’t read garbage values: whatever you read must have been written somewhere before.
You should also test the performance of doing the load with .Acquire and the store with .Release. This should be faster than SeqCst and slower than Monotonic or Unordered, but the Acquire/Release pair is often recommended as the best practice for load/store sequences and offers sane guarantees on the ordering of operations.
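The classic use of that pair is publishing data from one thread to another. A minimal sketch (again using 0.11 names; `producer`/`consumer` are illustrative):

```zig
const std = @import("std");

var data: u64 = 0;
var ready = std.atomic.Atomic(bool).init(false);

fn producer() void {
    data = 42; // plain, non-atomic write
    // .Release: every write before this store becomes visible to any
    // thread that observes ready == true with an .Acquire load.
    ready.store(true, .Release);
}

fn consumer() u64 {
    while (!ready.load(.Acquire)) {
        std.atomic.spinLoopHint();
    }
    return data; // the .Acquire load guarantees this sees 42
}
```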
Once I figured out how to do pointer casting, the rest is quite easy and understandable.
Using atomics for storing/loading is relatively fast, with almost no performance hit. I am not so sure about the mutex, but I will have to wait for the test results.
Just a tip for common cases of using a mutex in Zig; you can take advantage of defer to ensure the mutex is unlocked no matter what happens. In your case, this isn’t really necessary because the code between lock and unlock (the critical section) is just one infallible operation. But for larger critical sections and ones containing code that can fail, you can do:
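Something along these lines (the shared list and `append` function are just an example of a fallible critical section):

```zig
const std = @import("std");

var mutex: std.Thread.Mutex = .{};
var shared_list = std.ArrayList(u64).init(std.heap.page_allocator);

fn append(value: u64) !void {
    mutex.lock();
    // defer runs on every exit path, including the error return below,
    // so the mutex can never be left locked.
    defer mutex.unlock();
    try shared_list.append(value);
}
```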