Pointer to another thread stack variable

roign · April 9, 2024, 5:39pm

I know that returning pointers to stack variables from the same thread is a bad idea, but I was wondering if pointers to variables on a different thread’s stack are safe to use as long as the thread is still alive (i.e. before calling join or detach).
While testing a channel implementation, I found that the result depends on how I run the tests (zig build test vs zig test file.zig): using pointers into the stacks of different, still running, threads works when run with zig test, but the pointers refer to invalid memory when run with zig build test.
I would expect thread stacks to be stable in memory until the thread is freed. Is that not the case?
Trying to figure out if there is something wrong with my build.zig or with the whole idea of sharing stack pointers between threads. Any help is appreciated. Thank you!

AndrewCodeDev · April 9, 2024, 5:47pm

Can you post your build.zig here? I think it would help if we can see your code.

roign · April 9, 2024, 6:17pm

Thank you for the prompt reply.

This would be my build.zig:

const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    const zul = b.dependency("zul", .{
        .target = target,
        .optimize = optimize,
    });
    const zul_module = zul.module("zul");

    const lib = b.addModule("lib", .{
        .root_source_file = .{ .path = "src/root.zig" },
        .target = target,
        .optimize = optimize,
    });

    lib.addImport("zul", zul_module);
    lib.linkSystemLibrary("c", .{});
    lib.linkSystemLibrary("sqlite3", .{});

    const exe = b.addExecutable(.{
        .name = "exe",
        .root_source_file = .{ .path = "src/main.zig" },
        .target = target,
        .optimize = optimize,
    });

    exe.root_module.addImport("lib", lib);
    exe.root_module.addImport("zul", zul_module);

    b.installArtifact(exe);

    const run_cmd = b.addRunArtifact(exe);

    run_cmd.step.dependOn(b.getInstallStep());

    if (b.args) |args| {
        run_cmd.addArgs(args);
    }

    const run_step = b.step("run", "Run the app");
    run_step.dependOn(&run_cmd.step);

    const lib_unit_tests = b.addTest(.{
        .root_source_file = .{ .path = "src/root.zig" },
        .target = target,
        .optimize = optimize,
    });

    lib_unit_tests.root_module.addImport("zul", zul_module);
    lib_unit_tests.root_module.linkSystemLibrary("c", .{});
    lib_unit_tests.root_module.linkSystemLibrary("sqlite3", .{});

    const run_lib_unit_tests = b.addRunArtifact(lib_unit_tests);
    run_lib_unit_tests.step.dependOn(b.getInstallStep());

    const exe_unit_tests = b.addTest(.{
        .root_source_file = .{ .path = "src/main.zig" },
        .target = target,
        .optimize = optimize,
    });

    exe_unit_tests.root_module.addImport("lib", lib);
    exe_unit_tests.root_module.addImport("zul", zul_module);

    const run_exe_unit_tests = b.addRunArtifact(exe_unit_tests);
    run_exe_unit_tests.step.dependOn(b.getInstallStep());

    const test_step = b.step("test", "Run unit tests");
    test_step.dependOn(&run_lib_unit_tests.step);
    test_step.dependOn(&run_exe_unit_tests.step);
}

dimdin · April 9, 2024, 7:29pm

It is not always a bad idea.
The lifetime of the variable is the same as that of the function that declares it. Using a pointer to a stack variable works only if every access through the pointer happens before the function that declares the variable exits.

It is safe as long as the function, that declares the variables, has not returned.

The thread stack is stable, but the contents of the stack can be changed if the function returns.

Stack is for local variables and function activation records. For sharing memory it is best to use the heap with some synchronization mechanism.
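
As a rough illustration of the heap-plus-synchronization suggestion, here is a minimal sketch. It assumes a Zig release around 0.12 to 0.14, where std.Thread.Mutex can be default-initialized with .{}; SharedCounter, worker and countOnHeap are hypothetical names made up for this example, not anything from the original poster's code.

```zig
const std = @import("std");

// Hypothetical shared state: a value guarded by a mutex.
const SharedCounter = struct {
    mutex: std.Thread.Mutex = .{},
    value: u32 = 0,
};

fn worker(shared: *SharedCounter) void {
    for (0..1000) |_| {
        shared.mutex.lock();
        defer shared.mutex.unlock(); // runs at the end of each iteration
        shared.value += 1;
    }
}

fn countOnHeap(allocator: std.mem.Allocator) !u32 {
    // The shared object lives on the heap, so its lifetime is tied to
    // create/destroy rather than to any one stack frame.
    const shared = try allocator.create(SharedCounter);
    defer allocator.destroy(shared);
    shared.* = .{};

    const a = try std.Thread.spawn(.{}, worker, .{shared});
    const b = std.Thread.spawn(.{}, worker, .{shared}) catch |err| {
        a.join(); // do not free memory the first thread still uses
        return err;
    };
    a.join();
    b.join();
    // Both threads are done, so reading without the lock is safe here.
    return shared.value;
}

pub fn main() !void {
    const total = try countOnHeap(std.heap.page_allocator);
    std.debug.print("total = {d}\n", .{total});
}
```

The heap does not remove the lifetime question: the destroy still has to happen after both joins, which is the same ordering rule that applies to a stack variable.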

LucasSantos91 · April 9, 2024, 8:01pm

There’s nothing intrinsically bad about pointers to the same thread or to some other thread. Nor is heap safer than stack. It’s possible to have a pointer to heap memory and have that pointer become invalid, by freeing the memory before you are done with it. It all depends on the lifetime of the object being pointed at.

Consider this example:

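const std = @import("std");

fn worker(data: *u8) void {
    // Operates on a variable that lives on main's stack.
    data.* += 1;
}

pub fn main() !void {
    var data: u8 = 1;
    const thread = try std.Thread.spawn(.{}, worker, .{&data});
    // Join before `data` goes out of scope.
    thread.join();
}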

This is perfectly valid. Since we’re joining before data goes out of scope, the launched thread can operate on it safely. It’s possible to do this even if the thread is long-lived, but you’re going to need some synchronization to replace the join.

const std = @import("std");

const Counter = std.atomic.Value(u8);
const Payload = struct {
    data: *u8,
    counter: *Counter,
};

fn threadMain(payload: Payload) void {
    // Do work.
    payload.data.* += 1;
    // When done, decrement the counter to signal it.
    _ = payload.counter.fetchSub(1, .release);
}

pub fn main() !void {
    var data: u8 = 1;
    var counter = Counter.init(1);
    const thread = try std.Thread.spawn(.{}, threadMain, .{
        Payload{ .data = &data, .counter = &counter },
    });
    // No join here: the counter is the only synchronization.
    thread.detach();

    // Spin until the counter reaches 0.
    while (counter.load(.acquire) != 0) {
        std.atomic.spinLoopHint();
    }
}

AndrewCodeDev · April 9, 2024, 8:56pm

I’m not seeing something obvious in your build that would cause this issue and everyone has already covered what I would say about threads. At this point, @roign I think we need to see your threading code.

The standard library tests threads safely - you can see some tests here in this file: zig/lib/std/Thread.zig at master · ziglang/zig · GitHub

roign · April 10, 2024, 7:35am

Thank you everyone for the insightful answers.
Indeed, it looks like the problem was that the thread’s main function returned even if the thread itself was still alive. Having the thread function wait for a close signal solves the issue.
This pointed me in the right direction:

It is safe as long as the function, that declares the variables, has not returned.

By “bad idea” I meant returning a pointer to a variable on a stack frame that is about to be freed.
My intention was to avoid the heap in hopes of reducing latency.
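
The fix described above (the thread function waits for a close signal before returning) could look roughly like the sketch below. It is not the actual channel code, which was not posted. It assumes a Zig release around 0.12 to 0.14, where std.Thread.ResetEvent can be default-initialized with .{}; Shared, worker and readFromWorkerStack are hypothetical names.

```zig
const std = @import("std");

// Hypothetical handshake state, owned by the spawning thread.
const Shared = struct {
    ready: std.Thread.ResetEvent = .{},
    close: std.Thread.ResetEvent = .{},
    value_ptr: ?*u32 = null,
};

fn worker(shared: *Shared) void {
    // `value` lives in this function's stack frame on the worker thread.
    var value: u32 = 41;
    value += 1;

    shared.value_ptr = &value;
    shared.ready.set();

    // Keep this frame alive until the reader says it is done.
    // Returning earlier would leave `value_ptr` dangling even though
    // the thread has not been joined yet.
    shared.close.wait();
}

fn readFromWorkerStack() !u32 {
    var shared = Shared{};
    const thread = try std.Thread.spawn(.{}, worker, .{&shared});

    shared.ready.wait();
    const seen = shared.value_ptr.?.*; // the worker's frame is still live
    shared.close.set();

    thread.join();
    return seen;
}

pub fn main() !void {
    const seen = try readFromWorkerStack();
    std.debug.print("read {d} from the worker's stack\n", .{seen});
}
```

The point of the sketch is that joining alone is not the relevant boundary: the pointer is valid only while worker itself has not returned, so the close event has to be set after the last read.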

nyc · April 11, 2024, 8:59pm

I’ve done this before to avoid copying a large struct back to a main thread and I didn’t want to dynamically allocate it (there was a bunch of them and I didn’t know until the end which one would be returned).

Declare on the stack. When you want to return it, set a global or pass it to the other thread somehow (a queue, channel, whatever). Then sleep or wait on something before you return from the function that contains what you want to send back. If you are just going to exit the program and don’t care about joining the threads back together, ignore them, do what you have to do, then exit and they will all get torn down anyway. If you want to join them and exit them cleanly, signal the semaphore they are waiting on. It’s super hacky, but sometimes it’s just the easiest thing to do.