Is there a way to kill/cancel a thread in Zig?

pedropark99 · September 9, 2024, 1:42pm

Hey! My main question is: “Is there a way to cancel or to kill a thread in Zig?”

I’m currently reading the source of the Thread struct in the Zig Standard Library to understand how threads are implemented and used in Zig. I was looking for a method that could kill or cancel a thread, like kill(), or cancel(), or even a deinit() method. And I did not find such methods.

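For example, in the pthreads library in C, we can send a signal such as SIGTERM to a specific thread through the pthread_kill() function (although the default action of SIGTERM terminates the whole process, not just that thread), or request cancellation with pthread_cancel().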

Anyway, I was just wondering if there is a similar way to kill or cancel a thread in Zig.

dimdin · September 9, 2024, 1:58pm

Hi @pedropark99
Welcome to ziggit

No, there is no generic way to terminate a thread in Zig.
Of course you can use platform-specific ways, e.g. calling ExitThread from Windows' kernel32.dll.
But the best way to do it is to simply return from the thread function.

dude_the_builder · September 9, 2024, 2:03pm

A pattern I’ve found useful is to define an atomic bool global:

var running = std.atomic.Value(bool).init(true);

and then your thread code would be in a loop:

while (running.load(.monotonic)) {
    // do one unit of work, then check the flag again
}

and then you can make them all stop by just setting to false:

running.store(false, .release);
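
A minimal sketch of how those three pieces might fit together, assuming a recent Zig (0.12 or later, where the orderings are spelled in lowercase) and using std.atomic.Value for the flag. The worker function and the iterations counter are made up for illustration; they are not from any real program.

```zig
const std = @import("std");

// Shared stop flag. std.atomic.Value is the standard library's atomic wrapper.
var running = std.atomic.Value(bool).init(true);

// Hypothetical worker: counts loop iterations until it is asked to stop.
fn worker(iterations: *u64) void {
    while (running.load(.monotonic)) {
        iterations.* += 1; // stand-in for one unit of real work
    }
}

pub fn main() !void {
    var iterations: u64 = 0;
    const thread = try std.Thread.spawn(.{}, worker, .{&iterations});

    // Request the stop; the worker notices it on its next loop check.
    // Same ordering as in the snippet above.
    running.store(false, .release);

    // join() blocks until the worker has returned from its function.
    thread.join();
    std.debug.print("worker ran {d} iterations\n", .{iterations});
}
```

The thread is never killed from the outside here: it leaves the loop on its own and returns, and join() is what makes reading iterations afterwards safe.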

pedropark99 · September 9, 2024, 2:44pm

Uhmm I see what you mean. To use a while loop inside the thread to control the execution. It is an interesting way to solve the problem. Thank you!

pedropark99 · September 9, 2024, 2:45pm

Thank you @dimdin !

dude_the_builder · September 9, 2024, 2:58pm

I think the while loop with an atomic bool could cause high CPU usage, depending on the amount of computation done inside the loop. Maybe the thread could be signaled using a std.Thread.Condition instead, but I’m not sure, as I haven’t tried it. Maybe some concurrency expert in the community could chime in with a recommendation.

gonzo · September 9, 2024, 3:24pm

In general, it is not a good idea to asynchronously kill a thread; the risk of dangling resources is too high. It is much better to create some form of notification (such as the bool suggested by @dude_the_builder, or a pipe where you write something when you want to terminate).

pedropark99 · September 9, 2024, 4:22pm

I agree with you @gonzo. I’m currently writing an open and introductory book about Zig: GitHub - pedropark99/zig-book: An open and introductory book for the Zig programming language (🚧 in construction 🚧)

So I wanted to know if there is a generic method to kill/cancel a thread in Zig, so that I could recommend that readers not use it and describe the better/safer alternatives.

Thanks everyone for the help!

chung-leong · September 9, 2024, 5:03pm

Is the use of atomic operations really necessary here? We’re not really doing any sort of synchronization. As long as the value shows up at the other CPU at some point, it should be sufficient. A check on a simple boolean variable should essentially be free.

LucasSantos91 · September 9, 2024, 5:33pm

It is, and the minimum ordering should be monotonic. Without this, the compiler is free to assume that the value won’t be modified somewhere else, and therefore it is free to, amongst other things, store the value in a register, or even precompute the value at compile time. In doing so, it is possible for the value to never be updated.
I’ve had a couple of infinite loop bugs caused by this. They are incredibly hard to track down.

chung-leong · September 9, 2024, 5:59pm

Even if we’re dereferencing a boolean pointer received as an argument? That’s how I imagine it would work in practice. The thread would receive a pointer for the purpose of cancelation. Marking the pointer as volatile should be sufficient, right?

dimdin · September 9, 2024, 6:20pm

No it is not sufficient.
When you have two or more threads accessing the same memory location and one thread writes you have a race condition.
What happens when you have a race condition is undefined behavior.
What can help you prevent race conditions are locks (mutexes, condition variables) and atomics (memory ordering).

LucasSantos91 · September 9, 2024, 6:25pm

From the Zig documentation:

Note that volatile is unrelated to concurrency and Atomics. If you see code that is using volatile for something other than Memory Mapped Input/Output, it is probably a bug.

Sze · September 9, 2024, 8:58pm

While I agree with this for cases where you would want to use another value that either hasn’t been sent yet, or has been written to where it will be read from, this case seems different to me.

Here the code just wants the thread to eventually read the updated value (in whatever iteration of the while loop), so I am wondering a bit if this advice is just people wanting to make extra sure that it is well defined for (other) common cases.

This answer looks to me more like this answer: Why is volatile not considered useful in multithreaded C or C++ programming? - Stack Overflow
In this particular case we don’t actually care about the second point, which makes the statement that:

The problem with volatile in a multithreaded context is that it doesn’t provide all the guarantees we need. It does have a few properties we need, but not all of them, so we can’t rely on volatile alone.

untrue for this case, because we don’t read any data, we just want the thread to quit eventually.

But playing devil’s advocate: Isn’t this case essentially receiving a message from memory (eventually) that it is ok for that thread to shut down now? Thus it is as if this were a received Memory Mapped Input.

Only when you also want to read some value that was created by that thread do you need something to synchronize/order before accessing whatever value is being shared.

I wouldn’t be surprised if this would cause the thread to stay alive a little bit longer than if it synchronized on the value with .monotonic, but if the code doesn’t care if the shutdown happens a bit later, then you could argue that using more synchronization than necessary for the entire run time is worse than the thread staying around for a bit longer (caused by reordering we don’t have to care about as long as there aren’t any reads of other data being done).

I think (and I might be wrong, because I just have a few fragments of knowledge assembled from various things I read or listened to) that a more interesting question might be: Could using volatile (which isn’t a locking mechanism) instead of other synchronization primitives cause a bigger performance bottleneck?
Or, said another way: Could using a monotonic atomic actually be faster than forcing the thread to go all the way out to main memory all the time?

I don’t really know, but my suspicion is that clever architectures might be really good at avoiding contention and optimizing locks, and that going to main memory is such a big slowdown, which can’t be avoided once you have forced it, that the better reason for not using volatile might simply be that it is so slow. You would then be better off betting on another horse, which would be just using a synchronization mechanism (whatever you can get away with).

Just to repeat, I am not an expert on this, if I am wrong I want to know why.

chung-leong · September 9, 2024, 9:01pm

As I said, we’re not dealing with synchronization here. Unless you’re working on a multi-threaded nuclear launch system, an immediate halt to execution is probably not imperative. The request to stop is basically advisory in nature. We know that whatever the thread is working on will be discarded anyway, so there’s no point in continuing. If it continues for a while longer, it’s not a problem.

LucasSantos91 · September 9, 2024, 9:25pm

From the point of view of the language, concurrent read/write is undefined behavior. No language designer, nor LLVM, is going to promise you that the program will do what you’re expecting.
With that said, I can’t think of any code transformation that would break the assumption you’re making. Maybe volatile would suffice.

chung-leong · September 9, 2024, 11:21pm

We’re making an assumption about cache coherency here. Are there commercial computers these days that don’t implement that? As long as you’re not programming some esoteric machine employing an atypical NUMA architecture, the approach should work.

Tosti · September 10, 2024, 6:33am

Your load is monotonic

But store is release. What’s the reasoning behind that? IIUC, this can’t provide more synchronization guarantees than monotonic-monotonic ordering.

From my understanding, it should be

Monotonic store, monotonic load if you are not synchronizing other data by these atomic operations.
Release store, acquire load if you are synchronizing other data by these atomic operations.
Seq_cst store, seq_cst load if other atomics are involved and you need a total ordering for all of them.

Release store, monotonic load gives you a release-acquire ordering only if there is an acquire fence after this monotonic load. Otherwise it’s like a monotonic ordering, but slower.
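
To illustrate the release/acquire case from the list above, here is a small sketch in which a flag is used to publish a plain, non-atomic variable. It assumes a recent Zig (0.12 or later); payload, ready, producer and consumer are hypothetical names chosen for the example.

```zig
const std = @import("std");

var payload: u32 = 0; // plain (non-atomic) data to publish
var ready = std.atomic.Value(bool).init(false);

fn producer() void {
    payload = 42; // ordinary write
    // Release: the write to `payload` above becomes visible to any thread
    // whose acquire load observes `true`.
    ready.store(true, .release);
}

fn consumer(out: *u32) void {
    // Acquire pairs with the release store in `producer`.
    while (!ready.load(.acquire)) {
        std.atomic.spinLoopHint();
    }
    out.* = payload; // ordered after the producer's write
}

pub fn main() !void {
    var seen: u32 = 0;
    const c = try std.Thread.spawn(.{}, consumer, .{&seen});
    const p = try std.Thread.spawn(.{}, producer, .{});
    p.join();
    c.join();
    std.debug.print("consumer saw {d}\n", .{seen});
}
```

With a monotonic load in consumer and no acquire fence, reading payload would no longer be ordered after the producer's write. For a pure stop flag, where no other data is handed over, monotonic on both sides is enough.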

mocompute · September 10, 2024, 9:55am

You are correct, this is the typical way to manage a long-running thread, say which accepts messages to do some work. You loop on an atomic bool for cancellation/quit, and you sleep on a condition variable inside the loop. Have to be careful about spurious wakeups, but otherwise it’s very straightforward.
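
A rough sketch of that shape, assuming Zig 0.12 to 0.14 and std.Thread.Mutex / std.Thread.Condition. Note one deliberate difference: the stop flag here is a plain bool guarded by the mutex rather than an atomic, so that setting it and waking the worker cannot race with the worker going to sleep. The Shared struct and its fields are invented for the example.

```zig
const std = @import("std");

// Hypothetical shared state for one worker; every field is guarded by `mutex`.
const Shared = struct {
    mutex: std.Thread.Mutex = .{},
    cond: std.Thread.Condition = .{},
    pending: u32 = 0, // jobs waiting to be processed
    stop: bool = false, // set to request shutdown
    processed: u32 = 0,
};

fn worker(s: *Shared) void {
    s.mutex.lock();
    defer s.mutex.unlock();
    while (true) {
        // Re-check the predicate in a loop: wait() may wake up spuriously.
        while (s.pending == 0 and !s.stop) {
            s.cond.wait(&s.mutex); // sleeps without burning CPU
        }
        if (s.pending == 0) return; // stop requested and nothing left to do
        s.pending -= 1;
        s.processed += 1; // stand-in for handling one job
    }
}

pub fn main() !void {
    var shared = Shared{};
    const thread = try std.Thread.spawn(.{}, worker, .{&shared});

    // Submit three jobs.
    shared.mutex.lock();
    shared.pending += 3;
    shared.mutex.unlock();
    shared.cond.signal();

    // Request shutdown; the worker drains pending jobs first.
    shared.mutex.lock();
    shared.stop = true;
    shared.mutex.unlock();
    shared.cond.signal();

    thread.join();
    std.debug.print("processed {d} jobs\n", .{shared.processed});
}
```

Because the worker only returns once pending is zero, all three jobs are handled before it exits, regardless of when the stop request arrives.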

dude_the_builder · September 10, 2024, 11:15am

I believe that in previous Zig versions, if you tried a store with monotonic, you got an error stating that it had to be a type of release ordering. But I just tried it with monotonic and it works in 0.14 dev, so good to know. Thanks.