Un-modified mutable function argument should be a compile error

kj4tmp · November 9, 2024, 7:58am

Why is the following not a compile error?

pub const MyStruct = struct {
    a: u8,

    // this function accepts a pointer to mutable data,
    // but it does not mutate the child data of the pointer.
    // why is this not a compile error?
    pub fn getPlus1(self: *MyStruct) u8 {
        return self.a + 1;
    }
};

I think this would help me minimize my API surface if this was a compile error.
Taking a mutable pointer in an API can impact users: it forces them to mark structs as var unnecessarily, wasting memory that could be reused if they were const. This was a common mistake for me when first learning Zig, and it led to me refactoring code (which was fairly easy but could have been avoided).
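To make the impact on callers concrete, here is a minimal sketch. The names getPlus1Mut, fixed and changeable are made up for illustration and are not part of the original example:

```zig
const std = @import("std");

const MyStruct = struct {
    a: u8,

    // Receiver points to mutable data even though nothing is written.
    pub fn getPlus1Mut(self: *MyStruct) u8 {
        return self.a + 1;
    }

    // Same body, but the receiver only asks for read access.
    pub fn getPlus1(self: *const MyStruct) u8 {
        return self.a + 1;
    }
};

test "const receiver accepts both const and var instances" {
    const fixed = MyStruct{ .a = 1 };
    var changeable = MyStruct{ .a = 2 };

    try std.testing.expectEqual(@as(u8, 2), fixed.getPlus1());
    try std.testing.expectEqual(@as(u8, 3), changeable.getPlus1());
    try std.testing.expectEqual(@as(u8, 3), changeable.getPlus1Mut());

    // fixed.getPlus1Mut() would be rejected: `fixed` is const, so its
    // address is a *const MyStruct and cannot be passed as *MyStruct.
}
```

With the *const receiver the caller is free to choose const or var; with the *MyStruct receiver the choice is made for them.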

If I needed to reserve the right to change the data, I could use some sort of _ syntax:

pub const MyStruct = struct {
    a: u8,

    pub fn getPlus1(self: *MyStruct) u8 {
        _ = &self; // not sure what this should look like
        return self.a + 1;
    }
};

IntegratedQuantum · November 9, 2024, 9:38am

I think there are some edge-cases that make this hard or impossible to do reliably.
Consider the following comptime function:

fn callMemberFunction(comptime T: type, t: *T) void {
    t.fun(); // Bad: we get an error depending on the parameters of T.fun
}

Or consider the following struct:

const builtin = @import("builtin");

const MyStruct = struct {
    member: if (builtin.os.tag == .windows) [1024]u8 else []u8,
    fn modifyMember(self: *MyStruct, index: usize, val: u8) void {
        self.member[index] = val; // Bad: Compiles on windows, but errors on other platforms.
    }
};

And even worse:

const MyStruct = struct {
    member: @import("module").MemberType, // The build system could switch out the underlying files.
    ...
}

So, in conclusion: The compiler cannot reliably produce this error if the type, or any of its members, depend on comptime code, or were taken from another module.
Effectively, you could only have such an error if your struct does not contain members from the standard library or any other module and has a straightforward definition.

I think this would be better suited for a linter, which can catch all the common cases, without having to worry about breaking compilation in some unlikely edge cases.

msw · November 9, 2024, 9:49am

cf. Zig and Liveness of Code - #15 by mnemnion

dee0xeed · November 9, 2024, 10:58am

What if we temporarily add const to an argument? func(self: *T) becomes func(self: *const T) and then…

If the code modified in this way does not compile, then the unmodified code is correct: the argument must be a pointer to mutable data, and no const is needed.
Otherwise (the modified code does compile), the original code is “incorrect” and we should add const.

Though I am not sure if this dumb method will work for the examples given by @IntegratedQuantum.

mnemnion · November 9, 2024, 10:28pm

In a way it’s worse than that, the compiler probably could produce fugitive and hard to diagnose errors along only some code pathways, which could only be resolved with a fugitive _ = &param; to ‘mutate’ the inconsistently-constant parameter.

I agree that it would be a good thing to lint for, though. I’d like to have something I can use to search member functions for receivers which could be *const, it helps me keep track of what’s going on.

But I don’t think we want the compiler making this an error on a best-effort basis. If it can’t cover all the bases, it’s a job for some other tool.

kj4tmp · November 10, 2024, 5:10am

Zig already promises to do this reliably. For example, the following does not produce a compile error:

pub fn foo(bar: u8) u8 {
    if (false) {
        return bar + 1;
    }
    return 1;
}

test foo {
    _ = foo(3);
}

What makes this inherently harder than checking if a parameter is not used?

IntegratedQuantum · November 10, 2024, 9:13am

To know if a parameter is used, the compiler needs to check if the value is accessed somewhere. Basically, the compiler can just go through the function and check if bar is used as an identifier anywhere inside it.
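As a rough sketch of what that identifier check accepts and rejects (the function names here are hypothetical):

```zig
const std = @import("std");

// Rejected as an unused parameter if uncommented: `bar` never
// appears in the body.
// fn ignoresBar(bar: u8) u8 {
//     return 1;
// }

// Accepted: the explicit discard mentions `bar`.
fn discardsBar(bar: u8) u8 {
    _ = bar;
    return 1;
}

// Accepted: `bar` appears as an identifier, even though the branch
// that mentions it can never run.
fn deadUse(bar: u8) u8 {
    if (false) {
        return bar + 1;
    }
    return 1;
}

test "both accepted variants return 1" {
    try std.testing.expectEqual(@as(u8, 1), discardsBar(3));
    try std.testing.expectEqual(@as(u8, 1), deadUse(3));
}
```

No knowledge of types or of which branches are taken is needed for this, which is what makes it cheap to do reliably.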

To know if a pointer should be const or not, we need to know if all the function calls involving that pointer have a const pointer parameter or not. Furthermore we need to know if all function calls that could be generated have a const pointer parameter, and this is where things get difficult, because we need to analyze (not just check the syntax) all possible comptime paths the function call can take. Which like I said is impossible, since the build script could also swap it for a different implementation.
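A small sketch of the problem, building on callMemberFunction from my earlier post. Reader and Counter are hypothetical types, and I let the function return u8 here only so the result can be checked:

```zig
const std = @import("std");

const Reader = struct {
    value: u8 = 7,

    // Only reads, so a const receiver is enough.
    fn fun(self: *const Reader) u8 {
        return self.value;
    }
};

const Counter = struct {
    value: u8 = 0,

    // Writes through the receiver, so it needs *Counter.
    fn fun(self: *Counter) u8 {
        self.value += 1;
        return self.value;
    }
};

// Whether `t` could be *const T depends entirely on how T.fun
// declares its receiver, which is only known per instantiation.
fn callMemberFunction(comptime T: type, t: *T) u8 {
    return t.fun();
}

test "same generic function, different mutability needs" {
    var r = Reader{};
    var c = Counter{};
    try std.testing.expectEqual(@as(u8, 7), callMemberFunction(Reader, &r));
    try std.testing.expectEqual(@as(u8, 1), callMemberFunction(Counter, &c));
    try std.testing.expectEqual(@as(u8, 1), c.value);
}
```

For Reader the parameter could be *const T, for Counter it could not, so a check on the generic function itself has no single right answer.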

Now you could argue that we could just say that all function calls, and all other edge cases, are just assumed to always take a mutable pointer. However, I think that this would just make it rather useless.

mnemnion · November 11, 2024, 6:39pm

A lot of ‘eager’ things in Zig are impossible or impractical, because the language uses lazy compilation.

I think it would be lazy-possible, but that we don’t want that. Because as you point out, a pointer might be const in one flavor of compilation, and variable in another, and so you’re happily chugging along using the non-const variant, then compile the const variant and it won’t compile for you.

So at that point we need to insert a _ = &ptr; for no other reason than to handle something which was fine without the additional liveness rule.

It’s a kind of thing where a linter can catch the easy cases, and be valuable as such, but we want the compiler to handle things correctly or not at all, and correctness for all cases is out of reach.

chung-leong · November 12, 2024, 1:52am

One of the trickiest problems with pointers passed to functions is that a function call can cause side effects in the future. Consider the following code:

var a_ptr: *StructA = undefined;

fn a(ptr: *StructA) void {
    a_ptr = ptr;
}

fn b() void {
    a_ptr.world = 123;
}

Function a() doesn’t modify what ptr points to, but calling it enables b() to modify that memory subsequently.
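Here is a self-contained version that can be run with zig test. The definition of StructA (a single u32 field named world) is an assumption, since the snippet above leaves it out:

```zig
const std = @import("std");

// Assumed definition; only the field name `world` comes from the snippet.
const StructA = struct { world: u32 = 0 };

var a_ptr: *StructA = undefined;

// Stores the pointer but never writes through it.
fn a(ptr: *StructA) void {
    a_ptr = ptr;
}

// Writes through whatever pointer a() stored earlier.
fn b() void {
    a_ptr.world = 123;
}

test "a() leaves the data alone, b() changes it later" {
    var s = StructA{};
    a(&s);
    try std.testing.expectEqual(@as(u32, 0), s.world);
    b();
    try std.testing.expectEqual(@as(u32, 123), s.world);
}
```

Note that a() still needs its *StructA parameter: a *const StructA could not be assigned to a_ptr.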

amesaine · November 16, 2024, 3:01pm

IntegratedQuantum:

const MyStruct = struct {
    member: if(builtin.os.tag == .windows) [1024]u8 else []u8,
    fn modifyMember(self: *MyStruct, index: usize, val: u8) void {
        self.member[index] = val; // Bad: Compiles on windows, but errors on other platforms.
    }
};

I don’t understand how this produces a compilation error. I can only reproduce runtime panics.

IntegratedQuantum · November 16, 2024, 3:10pm

My wording was a bit imprecise here: it would cause a compilation error if the proposed compile error were implemented. But it isn’t implemented, and probably won’t be, exactly because of cases like this.