For the life of me I cannot understand the comment in the std/Thread/Condition.zig
that the following scenario can deadlock. It looks fine to me. What am I missing? Will adding
thread-1: mutex.unlock()
and uncommenting lines 40-41 resolve the deadlock?
Alright so there are 2 deadlocks I can see here:
- thread2 calls signal
- thread1 locks the mutex
- thread1 waits (forever)
and
- thread1 locks the mutex
- thread2 calls signal
- thread1 waits (forever)
The proposed solution to lock the mutex only guards against the second deadlock, so I think the example is incomplete. I modified it a bit to use a predicate. Here is a full example that you can try on your own that reliably produces a deadlock:
const std = @import("std");
const Mutex = std.Thread.Mutex;
const Condition = std.Thread.Condition;
var condition = Condition{};
var mutex = Mutex{};
var predicate = false;
fn thread1() void {
mutex.lock();
defer mutex.unlock();
while(!predicate) {
std.log.err("thread1 read predicate", .{});
std.time.sleep(2000000);
std.log.err("thread1 waits", .{});
condition.wait(&mutex);
}
std.log.err("thread1 awoke", .{});
}
fn thread2() void {
predicate = true;
std.log.err("thread2 wrote predicate", .{});
//mutex.lock();
//mutex.unlock();
condition.signal();
std.log.err("thread2 signaled", .{});
}
pub fn main() !void {
const t1 = try std.Thread.spawn(.{}, thread1, .{});
std.time.sleep(1000000);
const t2 = try std.Thread.spawn(.{}, thread2, .{});
defer t1.join();
defer t2.join();
}
The output of the program is this:
error: thread1 read predicate
error: thread2 wrote predicate
error: thread2 signaled
error: thread1 waits
As you can see thread2 signals before thread1 started waiting, so thread1 starts forever.
And if thread2 locks the mutex (uncomment the code), then the signaling of thread 2 cannot happen between thread1 reading the predicate and successfully waiting on the condition. This solves the deadlock.
A great thing to remember is that threads can come along at any point in time. I can understand this example being very confusing if you imagine them both starting at the same time - but if you think about them being totally independent of each other and starting at different times, then deadlocks are easier to imagine.
I think the mental picture comes from the fact that we’re used to things being in lock-step from single threaded material. So when I read that the first time, that’s how my brain read it - one statement after another and very orderly. Took me a minute to think of a scenario, too.
This deadlock should never happen if thread-2
sets condition=true
before calling signal. In this case while(!condition) c.wait(&m)
in thread-1
never calls wait
.
The same std/Thread/Condion.zig
has a working example that is not supposed to deadlock:
But the code from doc comment didn’t show any such condition.
And even if it did, there is no guarantee that there is a global order between memory reads and writes. We could still have the following case:
- thread2 sets
condition=true
, and sends it to main memory - thread2 calls signal
- thread1 locks the mutex
- thread1 reads
condition=false
from main memory (or from one of its caches) condition=true
reaches main memory- thread1 waits (forever)
I think the mutex.lock();
should come before predicate=true;
so that thread1
while-loop sees consistent predicate
value.
Yeah, either that or it could use Unordered
atomic reads/writes for the predicate.
Though I think for bools it isn’t actually necessary.