Let’s say I want to apply a function to every item of a slice. The function is pure; there are no data dependencies.
In my use case the function needs a buffer to operate, so I also pass one in:
for (items) |*item| {
    foo(item, buffer);
}
My first approach was to use a thread pool with round-robin scheduling. (The items are a few KB in size, so false sharing is not a problem.)
I used std.Thread.Pool from std:
var wg = std.Thread.WaitGroup{};

const Job = struct {
    start: usize,
    end: usize,
    stride: usize,

    // `T` is the element type of the slice.
    fn run(job: Job, comptime func: anytype, items: []T, common_args: anytype) void {
        var i: usize = job.start;
        while (i < job.end) : (i += job.stride) {
            @call(.auto, func, .{&items[i]} ++ common_args);
        }
    }
};

for (0..n_jobs) |i| {
    const job: Job = .{
        .start = i,
        .end = items.len,
        .stride = n_jobs,
    };
    pool.spawnWg(&wg, Job.run, .{ job, foo, items, .{buffers[i]} });
}
pool.waitAndWork(&wg);
However, this form of static scheduling doesn’t use the CPU cores effectively when the load is unbalanced: a worker that gets all the cheap items finishes early and then sits idle.
The solution seems to be a dynamic scheduler, but I’m not experienced in building one.
Are there implementations in Zig? Is there something in std that I’ve missed?
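For reference, the simplest dynamic scheme I can think of is to replace the fixed stride with a shared atomic counter that workers pull the next index from, so fast workers just take more items. This is only an untested sketch (the names `dynamicRun`, `next`, and the `T` element type are mine, not from any library), and it still reuses `pool`, `wg`, `foo`, and `buffers` from the code above:

```zig
// Sketch of dynamic scheduling via an atomic work counter.
// Each worker repeatedly claims the next unprocessed index;
// load balancing happens automatically.
fn dynamicRun(
    next: *std.atomic.Value(usize),
    comptime func: anytype,
    items: []T,
    common_args: anytype,
) void {
    while (true) {
        const i = next.fetchAdd(1, .monotonic);
        if (i >= items.len) return;
        @call(.auto, func, .{&items[i]} ++ common_args);
    }
}

// Spawning looks the same as before, but all workers share `next`:
// var next = std.atomic.Value(usize).init(0);
// for (0..n_jobs) |j| {
//     pool.spawnWg(&wg, dynamicRun, .{ &next, foo, items, .{buffers[j]} });
// }
// pool.waitAndWork(&wg);
```

One fetchAdd per item has some contention cost; if `foo` is very cheap per item, claiming a chunk of indices at a time (`fetchAdd(chunk, .monotonic)`) would amortize it, but I don’t know what chunk size is reasonable.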