`groupAsync` vs `Group.async`

The resources associated with each task

Is the important part; the group may have other resources, depending on the implementation.

The guarantee is that no resources are wasted on tasks that have already completed. That matters because a group can (and often does) spawn an arbitrary number of tasks. Without the guarantee it could, and did, use an excessive amount of memory, which could easily lead to an OOM on some systems.

I know, but it is very easy to misinterpret, so I clarified.

2 Likes

Sorry. I had an injury a few weeks ago that’s left me a bit sleep deprived and irritated. My comments have been more acerbic than usual, and I’ve tended to read the worst into things.

1 Like

Andrew updated the docs to reinforce that group async tasks are not required to run until await or cancel:

https://codeberg.org/ziglang/zig/commit/56265d6f9934f7321d2001c9d07ce82b0c9be126

I don’t understand it, as that would imply Select.async followed by Select.await is allowed to deadlock.

4 Likes

I think it’s not that they would deadlock, but more that the tasks are only then worked on, similarly to how I’ve tried to explain above.

1 Like

Just want to note that that commit was indeed made after reading this thread. Let’s discuss further… hopefully we can all build consensus on this topic.

3 Likes

If you want to keep the definition flexible for possible future stackless coroutines that need someone driving them, then you would need to define a yield point, because the task would run whenever the user performs any blocking operation via the io instance. I don’t see any (real) possible implementation that would need to wait until group.await, as that would make the concept of group.async significantly less useful. Additionally, Select.async then doesn’t make much sense, unless the vtable includes select primitives again.

2 Likes

Io.Select.async should not be used when you’re going to spawn an unbounded number of tasks. There’s a high probability of a deadlock due to “eager” execution of your async function call: the call gets stuck when it tries to put its result into the queue and the queue’s buffer is already full.
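To illustrate the failure mode, here is a minimal sketch in Python rather than Zig. It is only a model: `queue.Queue` stands in for the select's bounded result buffer, and `spawn_eagerly` is a hypothetical helper that runs each task inline at spawn time, which is what an "eager" implementation is allowed to do. It uses `put_nowait` so the example reports where it would block instead of actually hanging.

```python
import queue


def spawn_eagerly(results, n_tasks):
    """Run n_tasks inline at spawn time, each pushing its result into `results`.

    Returns the index of the first task that would have blocked on a full
    buffer, or None if every result fit.
    """
    for i in range(n_tasks):
        try:
            # A real blocking put would wait here forever: the caller is
            # still spawning, so nobody is draining the queue yet.
            results.put_nowait(i * i)
        except queue.Full:
            return i
    return None


results = queue.Queue(maxsize=2)  # bounded result buffer (model)
stuck_at = spawn_eagerly(results, 5)
print("first task that would block:", stuck_at)
```

With a buffer of two slots, the third spawn is the one that cannot make progress, because the code that would drain the buffer only runs after all the spawns have returned.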

2 Likes

I’m not completely sure if I fully understand your requirements, but I suspect the real need might be like this:
There are multiple tasks within a group, and they have an ‘async’ relationship, but they are not required to run concurrently.
However, between the main thread and this transaction itself, there is a ‘concurrent’ relationship, and you want this transaction to run in the background. Here, it may be necessary to layer the tasks.

Yes, close but not quite.

This is correct.

This isn’t. But this entire thread is basically a testament to my misunderstanding of where to put the async boundary in this API I’m creating.

But just for completeness’ sake, imagine you have something like this that always sets a key to some specific value (very much pseudocode). lookup and update would internally do async calls; abort and commit would do cancel and await respectively:

tsn = db.beginTransaction(io);
defer db.abort(io, tsn);

value = db.lookup(io, tsn, "some key");
if (value == 42) {
    return true;
} else {
    db.update(io, tsn, "some key", 42);
    try db.commit(io, tsn);
}

Because you want to decide based on the value, you have to get the result back and can’t wait for the abort or commit to “trigger/start” the (async) execution of the lookup, because then it would just hang forever. So I need the guarantee that the operation will run before await/cancel is called.
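To make the problem concrete, here is a toy model in Python (not Zig, and not the real std.Io API). `LazyGroup`, `spawn`, and `wait` are hypothetical names for a group that is allowed to defer all work until it is awaited. It only shows why branching on a result before the await cannot work under that rule.

```python
class LazyGroup:
    """Toy group that runs nothing until wait() is called."""

    def __init__(self):
        self._pending = []

    def spawn(self, fn, *args):
        # Only record the call; the slot is filled in later by wait().
        slot = {"done": False, "value": None}
        self._pending.append((slot, fn, args))
        return slot

    def wait(self):
        # All deferred work happens here, in spawn order.
        for slot, fn, args in self._pending:
            slot["value"] = fn(*args)
            slot["done"] = True
        self._pending.clear()


def lookup(store, key):
    return store.get(key)


store = {"some key": 41}
group = LazyGroup()
slot = group.spawn(lookup, store, "some key")

# Before wait(), the lookup has not run, so there is nothing to branch on.
ready_before_wait = slot["done"]

group.wait()  # plays the role of commit in the pseudocode above
ready_after_wait = slot["done"]
value = slot["value"]
```

In this model, code that spins or blocks on `slot["done"]` before calling `wait()` would never finish, which is the hang described above.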

And if you had something like this:

tsn = db.beginTransaction(io);
defer db.abort(io, tsn);

for (0..100, 100..200) |key, value| {
    db.insert(io, tsn, key, value);
}
try db.commit(io, tsn);

You would like the inserts to be done in parallel, but it also isn’t wrong to do them serially. The async function of the Io VTable fulfills this guarantee, but the async one of Group doesn’t. That is inconsistent and can lead to misunderstandings and subtle bugs.

Based on the discussion we had here in this thread, I noticed that I had the wrong idea about how to structure this. I will now let the user decide when to use nothing, async or concurrent, and just support all of these cases internally. So the code snippets above would then conceptually look like this:

tsn = db.beginTransaction(io);
defer db.abort(io, tsn);

value = db.lookup(io, tsn, "some key"); // sync operation
if (value == 42) {
    return true;
} else {
    io.async(db.update, .{tsn, "some key", 42});
    try db.commit(io, tsn);
}

and

tsn = db.beginTransaction(io);
defer db.abort(io, tsn);

for (0..100, 100..200) |key, value| {
    io.concurrent(db.insert, .{tsn, key, value});
}
try db.commit(io, tsn);

Again, pseudocode. The operations still take an Io parameter because they of course need to do I/O to read/write/whatever.

1 Like

Sorry, this confuses me greatly. If you need to get the lookup result immediately, lookup should be executed synchronously rather than asynchronously. Or, even if you run it asynchronously, you must await it before getting its result.

1 Like

Yes, I’ve also realized that, and that is what I’m trying to say in the second part of my answer.

1 Like

I agree that whether operations within a transaction run synchronously, asynchronously or concurrently should not be determined by the transaction itself. However, note that when users control the asynchronicity of operations within a transaction, they still need to keep track of the resulting futures and explicitly await or cancel them before the transaction commits. When executing multiple asynchronous tasks, it is best for users to construct a group themselves.

I think the related issues may be concentrated in task models that loop forever (I believe such models should be concurrent rather than async); in the transaction model this concern should not arise.

And you should generally not use an unbounded number of tasks (which includes connections), since that’s pretty much asking to get DoS’d.

If I were to make a proper server, I would put a semaphore somewhere.
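A minimal sketch of that idea, in Python's asyncio rather than Zig's Io, so treat it as an analogy only. `handle` and `serve` are hypothetical names, and the limit of 4 is arbitrary. The semaphore caps how many handlers are in flight, no matter how many requests arrive.

```python
import asyncio


async def handle(sem, stats):
    # Each handler must take a permit before doing any work.
    async with sem:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0)  # stand-in for real I/O
        stats["active"] -= 1


async def serve(n_requests, limit):
    sem = asyncio.Semaphore(limit)
    stats = {"active": 0, "peak": 0}
    await asyncio.gather(*(handle(sem, stats) for _ in range(n_requests)))
    return stats["peak"]


peak = asyncio.run(serve(n_requests=20, limit=4))
print("peak concurrent handlers:", peak)
```

The handlers beyond the limit still exist as waiting tasks here; a real server would probably also bound the accept loop itself so that waiting work cannot pile up either.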

2 Likes