I’m not qualified to comment on writing a safe lock implementation, but the standard library does have a Mutex struct that might be worth looking at, specifically the SingleThreadedImpl:
Lock-free design is one of those topics that is tricky enough to deserve the formal study it gets in computer science. My recommendation is to find a MOOC or textbook that covers it, if you want to dive into that stuff.
I suggest Mark Batty’s thesis on the C/C++ concurrency model, chapter 3.
It’s not the easiest way to learn something if you are new to this topic, but it helped me to finally understand it. Chapter 3 gives a comprehensive definition of the memory model.
Although it is explained in the context of C/C++, mutexes and atomics are pretty much the same for all languages.
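To make the acquire/release vocabulary a bit more concrete, here is a rough sketch of a test-and-set spinlock in C++ (picked only because the thesis is about C/C++). The names `SpinLock` and `count_with_spinlock` are made up for illustration, and this is not how any standard library implements its mutex; a real one would park the thread instead of spinning.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical name: a minimal test-and-set spinlock, for illustration only.
class SpinLock {
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;

public:
    void lock() {
        // Acquire ordering: accesses inside the critical section cannot be
        // reordered before the point where the flag is taken.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();  // be polite while someone else holds it
        }
    }

    void unlock() {
        // Release ordering: writes made inside the critical section become
        // visible to the next thread that acquires the flag.
        flag_.clear(std::memory_order_release);
    }
};

// Hypothetical helper: several threads increment a plain int under the lock.
int count_with_spinlock(int threads, int increments_per_thread) {
    SpinLock lock;
    int counter = 0;  // deliberately not atomic; the lock protects it
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < increments_per_thread; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter;
}
```

Calling `count_with_spinlock(4, 1000)` should give 4000, because every increment happens while the lock is held. The same acquire-on-lock, release-on-unlock pairing is what you would write with the atomics of most other languages.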
A side note about their usefulness:
In my current studies I came across these lock-free data structures. They look really promising at first, but after some investigation they are not that great. Once you have to implement insert and remove operations, it becomes a huge pain.
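To give a feel for where the pain comes from, here is a minimal sketch of a lock-free stack in C++ (a Treiber-style stack, which is my choice of example and not necessarily one of the structures I was studying). The name `LeakyStack` is hypothetical. Insert is a short compare-and-swap loop, but remove is only safe here because popped nodes are never freed; reclaiming them properly needs something like hazard pointers or epochs, and that is the hard part.

```cpp
#include <atomic>
#include <optional>
#include <utility>

// Hypothetical name: a Treiber-style stack that intentionally leaks nodes.
template <typename T>
class LeakyStack {
    struct Node {
        T value;
        Node* next;
    };

    std::atomic<Node*> head_{nullptr};

public:
    void push(T value) {
        Node* node = new Node{std::move(value),
                              head_.load(std::memory_order_relaxed)};
        // On failure, compare_exchange_weak writes the current head into
        // node->next, so the loop simply retries with the fresh value.
        while (!head_.compare_exchange_weak(node->next, node,
                                            std::memory_order_release,
                                            std::memory_order_relaxed)) {
        }
    }

    std::optional<T> pop() {
        Node* old_head = head_.load(std::memory_order_acquire);
        // Reading old_head->next is only safe because no node is ever freed.
        // With real deallocation this line is a use-after-free and ABA hazard.
        while (old_head &&
               !head_.compare_exchange_weak(old_head, old_head->next,
                                            std::memory_order_acquire,
                                            std::memory_order_acquire)) {
        }
        if (!old_head) {
            return std::nullopt;  // stack was empty
        }
        return old_head->value;  // the node itself is deliberately leaked
    }
};
```

Pushing 1 and then 2 and popping twice returns 2 and then 1, and a third pop returns an empty optional. Leaking memory is obviously not acceptable in real code, which is exactly why the remove side gets complicated so quickly.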
And when you look at the performance benefits you get from using these lock-free data structures, they are not that impressive.
Here are some of the measurements reported in the paper I linked below.