Are mutable global variables actually required?

Is there a concrete use-case where a Zig mutable global is strictly required, one that cannot be accomplished by a stack allocation and “threading it through” the rest of your application?

I am wondering if global variables (the Zig ones) can just be deleted altogether.

They are part of the C FFI, so extern vars likely have to stay, but what about Zig global vars?

Could removing them simplify things for the compiler (and for humans) by making it easier to determine whether a function is pure?

I tried searching the Zig codebase for usages of pub var, and there are only 26, none of which are very convincing to me. Some examples:

  • std.testing.allocator, could be a stack allocation at the beginning of your test?
  • same for the rest of std.testing
  • std.os.argv, this could be a function that returns a pointer? it comes from the OS anyway. Same for std.os.environ.

All Zig files can be represented as structs, so “global variables” can essentially be thought of as the equivalent of static variables in a Java class, just inside a Zig struct. This is useful when you’re implementing thread-local singletons, like the SMP allocator in the standard library.
In theory, I guess you could pre-allocate on the stack at runtime, but that would needlessly force you to pass memory through a function to initialize structs that could otherwise be statically initialized.
Also, when doing embedded development, I have found globals useful when I needed a reserved section of volatile memory for memory-mapped IO, since you can have the linker place that memory at the desired address. (Though, admittedly, you can also just create the pointer from an integer within the code.)
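
A minimal sketch of the "pointer from an integer" variant mentioned above. The register address and the names GPIO_OUT_ADDR, gpioOut and setBit are made up for illustration and do not correspond to any real chip; gpioOut is only meaningful on hardware that actually maps a register at that address.

```zig
const std = @import("std");

// Hypothetical register address, chosen only for illustration.
const GPIO_OUT_ADDR: usize = 0x4000_0000;

/// Returns a pointer to the (hypothetical) memory-mapped register.
/// Dereferencing it only makes sense on real hardware.
fn gpioOut() *volatile u32 {
    return @ptrFromInt(GPIO_OUT_ADDR);
}

/// Sets one bit through a volatile pointer, so the compiler may not
/// drop or reorder the access.
fn setBit(reg: *volatile u32, bit: u5) void {
    reg.* |= @as(u32, 1) << bit;
}

test "setBit works on any volatile u32" {
    // A plain local stands in for the register so this runs on a desktop.
    var fake_register: u32 = 0;
    setBit(&fake_register, 3);
    try std.testing.expectEqual(@as(u32, 8), fake_register);
}
```

On a real target you would call something like setBit(gpioOut(), 3). The linker-placed alternative needs a global variable plus a linker script, which is exactly the case described above.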


Oh wait, you wrote pub var. I thought you meant all top-level vars.

I found a use-case of my own: a run-time log level. I want the user to pass the log level as a command line option:

const std = @import("std");

var log_level: std.log.Level = .warn;
pub const std_options: std.Options = .{
    .log_level = .debug, // affects the comptime log level
    .logFn = logFn,
};
fn logFn(
    comptime message_level: std.log.Level,
    comptime scope: @TypeOf(.enum_literal),
    comptime format: []const u8,
    args: anytype,
) void {
    if (@intFromEnum(message_level) <= @intFromEnum(log_level)) {
        std.log.defaultLog(message_level, scope, format, args);
    }
}

pub fn main() !void {
    log_level = ... // parse log level from command line arguments
}

But nothing is technically stopping me from threading a log implementation through my application. This is just a tautology: “there is a global log implementation, therefore it requires a global variable to configure”.
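
For comparison, here is a rough sketch of what "threading a log implementation through" could look like. Logger, enabled, warn and doWork are hypothetical names invented for this example, not part of std.log; the level comparison is the same one used in logFn above.

```zig
const std = @import("std");

/// Hypothetical logger value that is passed around instead of read from a global.
const Logger = struct {
    level: std.log.Level,

    /// Same comparison as in logFn: lower enum values are more severe.
    fn enabled(self: Logger, message_level: std.log.Level) bool {
        return @intFromEnum(message_level) <= @intFromEnum(self.level);
    }

    fn warn(self: Logger, comptime format: []const u8, args: anytype) void {
        if (self.enabled(.warn)) {
            std.debug.print("warning: " ++ format ++ "\n", args);
        }
    }
};

/// Every function that wants to log has to accept the logger explicitly.
fn doWork(log: Logger) void {
    log.warn("disk usage at {d} percent", .{93});
}

pub fn main() void {
    // In a real program the level would come from the command line.
    const log = Logger{ .level = .warn };
    doWork(log);
}
```

The cost is visible in doWork: the logger becomes a parameter of every function in the call chain, which is the trade-off this thread is about.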

Eh, once you go down this road, you start getting into functional programming territory…

I want to point out that pub var is not really a good enough heuristic for finding mutable global state because it only covers specifically variables that are completely exposed to the world without any protection. Usually, you don’t want to expose global state as regular variables, you want to hide it behind some function to prevent it from being misused in a way that violates assumed invariants, or not even expose it in any meaningfully user-visible way at all.

For example, consider the mutex used when std prints to stderr (e.g. via std.debug.print()).
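
The general pattern can be sketched as follows. This is not the standard library's code, only an illustration of global state that has no pub and is reachable solely through functions. The names output_mutex, lines_written, printLine and linesWritten are hypothetical, and the sketch assumes a Zig version in which std.Thread.Mutex is available.

```zig
const std = @import("std");

// Private to this file: there is no `pub`, so importers cannot lock,
// unlock, or overwrite this state themselves.
var output_mutex: std.Thread.Mutex = .{};
var lines_written: usize = 0;

/// Hypothetical helper: prints one line while holding the lock.
pub fn printLine(comptime format: []const u8, args: anytype) void {
    output_mutex.lock();
    defer output_mutex.unlock();
    std.debug.print(format ++ "\n", args);
    lines_written += 1;
}

/// Read access also goes through the lock, so the invariant
/// "lines_written is only touched under output_mutex" always holds.
pub fn linesWritten() usize {
    output_mutex.lock();
    defer output_mutex.unlock();
    return lines_written;
}
```

A search for pub var would never find output_mutex or lines_written, even though both are mutable globals.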

On some platforms (e.g. macOS, Android or the web via Emscripten) the main function hands off control to an external event loop which then calls into user-provided callbacks (like a frame callback or event handlers).

That means you can’t keep application state on the stack, since the main function stack isn’t accessible by the callbacks. You can at best put the state into a heap allocation, but then you somehow need to communicate the pointer to that state on the heap to your callbacks, and not all platforms have the concept of a userdata pointer, so you’ll need to place the pointer to your state in a mutable global variable. And at that point, why not simply keep the whole state in a global?
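
A minimal sketch of that situation. runEventLoop is a hypothetical stand-in for the platform's event loop (no real platform API is used here), and AppState, app_state and onFrame are likewise invented for the example. The important detail is that the callback type takes no arguments, so there is no userdata pointer to carry the state.

```zig
const std = @import("std");

const AppState = struct {
    frame_count: u32 = 0,
};

// Mutable global: the only place the callback below can find the state.
var app_state: AppState = .{};

/// Frame callback with no parameters, mimicking an API without userdata.
fn onFrame() void {
    app_state.frame_count += 1;
}

/// Hypothetical stand-in for an external event loop that owns control flow.
/// A real one would run until the application quits.
fn runEventLoop(frame_callback: *const fn () void, frames: u32) void {
    var i: u32 = 0;
    while (i < frames) : (i += 1) {
        frame_callback();
    }
}

pub fn main() void {
    runEventLoop(&onFrame, 3);
    std.debug.print("frames handled: {d}\n", .{app_state.frame_count});
}
```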

Rust doesn’t have mutable global variables (without unsafe at least AFAIK), and that really sucks on such callback-driven platforms.

The usability problems this causes on Rust in general would be my argument for allowing global variables in Zig, as it does currently. It isn’t just the special cases where globals are really needed. It’s also just convenience in cases where you know it is safe (as opposed to being able to prove it is safe to the compiler, as is necessary with Rust).

Doesn’t this fit with Zig’s philosophy of giving the developer full control?


Yeah on Rust it’s kinda understandable because the borrow checker needs a single entry point and ‘origin’ for each memory location, but with such event handler callbacks you have many independent entry points (from the point of view of the borrow checker), each accessing the same global state and that messes up the whole idea of enforcing ‘single writer - multiple readers’.

But Zig doesn’t have a borrow checker or ownership/lifetime tracking, so there’s no point in not allowing mutable globals IMHO.

Incidentally, threadlocal var is also a mutable global variable. Of course, there’s some debate about its usage, but I still quite like using it for crash reporting.
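
A small sketch of that idea, assuming a multi-threaded build. current_task and worker are hypothetical names; a real crash reporter would read current_task from a panic or signal handler, which is left out here.

```zig
const std = @import("std");

// Each thread gets its own copy, initialized to "idle".
threadlocal var current_task: []const u8 = "idle";

/// Records what this thread saw on entry, then labels its own work.
fn worker(seen_on_entry: *[]const u8) void {
    seen_on_entry.* = current_task;
    current_task = "worker: parsing input";
}

pub fn main() !void {
    current_task = "main: startup";

    var seen: []const u8 = "";
    const thread = try std.Thread.spawn(.{}, worker, .{&seen});
    thread.join();

    // The worker started from its own fresh copy, and its write
    // did not touch the main thread's copy.
    std.debug.print("worker saw: {s}\n", .{seen});
    std.debug.print("main still has: {s}\n", .{current_task});
}
```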

This is a great question to ask - it shows that you’re examining fundamental assumptions critically, which is the hallmark of transitioning into being an advanced programmer.

I think that your implication here is spot on - when you can use stack allocation and “threading it through”, global variables are strictly worse.

But there are cases where global variables are a low-level primitive, without which it is impossible to implement certain things. For example, if you have a look at lib/std/heap/SmpAllocator.zig:

var global: SmpAllocator = .{
    .threads = @splat(.{}),
    .cpu_count = 0,
};
threadlocal var thread_index: u32 = 0;

Ask yourself the question: is it possible to implement this allocator without global variables (and threadlocal variables)?


Perhaps when you don’t even have a stack or that stack is super-small?

You could allocate all function scope variables in static memory as long as they are not called recursively. Lots of embedded C compilers did this for a long time.
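
In Zig you can opt into this per variable by declaring it inside a local struct, which gives it static storage. A minimal sketch, with nextId as a hypothetical example function:

```zig
const std = @import("std");

/// Returns 1, 2, 3, ... on successive calls.
fn nextId() u32 {
    // `last` lives in static memory, not on the stack, so it keeps
    // its value between calls. It is only nameable inside nextId.
    const S = struct {
        var last: u32 = 0;
    };
    S.last += 1;
    return S.last;
}

pub fn main() void {
    const a = nextId();
    const b = nextId();
    std.debug.print("{d} then {d}\n", .{ a, b });
}
```

This is still a mutable global as far as storage and purity are concerned; only its name is scoped to the function.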

Now, with any processor with a cache, that’s not a great idea from a performance standpoint. Stack memory is always going to be hot in cache while static memory is always going to be cold in cache.

You can even go further and thread the stack through your application like “continuation passing style” (CPS) does.

A reactive signal library would be best built using a global variable, or at least a thread-local global. It might not be practical to build one due to the lack of lambda functions in Zig. But if someone takes on the task, a global variable will make life much easier.

This does seem like a bad idea. Adding those extra 3 lines to every single test block in the standard library (and all user code) is a lot of needless work and adds extra noise to the code. Plus it is also a possible cause for bugs to sneak in (e.g. maybe you forget to do the leak check).

And all that for what gain? How many people have you seen on this forum who had problems because they edited std.testing.allocator? I’ve not seen a single one, but on the other hand I’m sure there’d be many complaints if it were removed.
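
For reference, the extra lines in question might look roughly like this. This is a sketch that assumes Zig 0.14 or later, where std.heap.DebugAllocator exists; older versions would use a different allocator type, and the exact spelling is an assumption rather than something taken from the thread.

```zig
const std = @import("std");

test "per-test allocator instead of std.testing.allocator" {
    // The three lines every test would have to repeat:
    var debug_allocator: std.heap.DebugAllocator(.{}) = .init;
    // Easy to forget: without this line, leaks go unnoticed.
    defer std.testing.expect(debug_allocator.deinit() == .ok) catch @panic("memory leak");
    const allocator = debug_allocator.allocator();

    // The actual test body.
    const buf = try allocator.alloc(u8, 16);
    defer allocator.free(buf);
    try std.testing.expectEqual(@as(usize, 16), buf.len);
}
```

With std.testing.allocator the test body shrinks to its last three lines, and the leak check is handled by the test runner.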

As I mentioned in an earlier post, pointers to global variables can be passed as comptime arguments. This allows you to synthesize new functions through comptime binding. Can’t do that with variables in main, even though their positions are invariant through the program’s lifetime too.
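
A minimal sketch of that comptime binding. The names hits, misses, makeIncrementer, recordHit and recordMiss are hypothetical. The address of a global is known at compile time, so it can be baked into a freshly generated function.

```zig
const std = @import("std");

var hits: u32 = 0;
var misses: u32 = 0;

/// Builds a parameterless function bound to one specific global counter.
/// Must be called at comptime because it returns a function.
fn makeIncrementer(comptime counter: *u32) fn () void {
    return struct {
        fn increment() void {
            counter.* += 1;
        }
    }.increment;
}

// Two distinct functions synthesized from the same template.
const recordHit = makeIncrementer(&hits);
const recordMiss = makeIncrementer(&misses);

pub fn main() void {
    recordHit();
    recordHit();
    recordMiss();
    std.debug.print("hits={d} misses={d}\n", .{ hits, misses });
}
```

The resulting functions take no arguments, so they can be handed to APIs that expect a bare callback.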
