Invalid use of Optional C-Pointer

AndrewCodeDev · July 3, 2024, 4:02pm

Null and Pointer Values

Zig pointers are safety checked to ensure that the address 0 is not allowed. Classically, the zero address is called the null pointer and is used to denote a pointer that does not currently point to a valid address. To denote such a pointer in Zig, we use optional types:

// optional pointer with value
const ptr: ?*u8 = &byte; // take address of byte

// optional pointer that's null
const ptr: ?*u8 = null; // set to null

Note that the zero address can be allowed by using the allowzero pointer attribute: Documentation - The Zig Programming Language
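As a small sketch of the two points above (not part of the original post), the tests below unwrap an optional pointer before using it and then build an allowzero pointer that holds address 0. The variable names are made up for illustration.

```zig
const std = @import("std");

test "optional pointer is unwrapped before use" {
    var byte: u8 = 41;
    const some: ?*u8 = &byte;
    const none: ?*u8 = null;

    // Unwrap with an if-capture; the payload is a plain *u8.
    if (some) |p| {
        p.* += 1;
    }
    try std.testing.expectEqual(@as(u8, 42), byte);
    try std.testing.expect(none == null);
}

test "allowzero permits address 0" {
    var zero: usize = 0; // var so that the address is runtime-known
    _ = &zero; // the variable is intentionally never mutated
    const p: *allowzero u8 = @ptrFromInt(zero);
    try std.testing.expectEqual(@as(usize, 0), @intFromPtr(p));
}
```

The allowzero pointer is never dereferenced here; the test only inspects its address.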

C-pointers and the Zero Address

Note:

C-pointers are discouraged. Instead, prefer to use the correct Zig-pointer for your task.

C-style pointers do not have this restriction and can in fact hold the value 0 without safety checks:

// classic null-ptr
const ptr: [*c]u8 = @ptrFromInt(0);

These pointer types can still be used as if they were optionals:

const ptr: [*c]u8 = @ptrFromInt(0);

// check if pointer has value of 0
if (ptr == null) ... 

// also works with orelse syntax
_ = ptr orelse return null;
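A minimal sketch of both checks in runnable form. The helpers isZero and firstByteOr are hypothetical names introduced only for this illustration.

```zig
const std = @import("std");

/// Hypothetical helper: compares the C pointer directly against null.
fn isZero(ptr: [*c]const u8) bool {
    return ptr == null;
}

/// Hypothetical helper: returns the first byte, or `fallback` when the
/// C pointer holds address 0.
fn firstByteOr(ptr: [*c]const u8, fallback: u8) u8 {
    const non_null = ptr orelse return fallback;
    return non_null[0];
}

test "a C pointer can be checked like an optional" {
    const text: [*c]const u8 = "hi"; // string literals coerce to [*c]const u8
    const nothing: [*c]const u8 = null; // null coerces to the zero address

    try std.testing.expect(!isZero(text));
    try std.testing.expect(isZero(nothing));
    try std.testing.expectEqual(@as(u8, 'h'), firstByteOr(text, '?'));
    try std.testing.expectEqual(@as(u8, '?'), firstByteOr(nothing, '?'));
}
```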

Footgun - Optional C-pointer

Consider the following example:

const p1: [*c]u8 = @ptrFromInt(0);
const p2: [*c]u8 = p1;
return p2 == null;

Obviously, we can see that this will in fact return true because p2 is just a copy of p1 and p1 has the zero-address.

Now consider adding an optional qualification:

const p1: [*c]u8 = @ptrFromInt(0);
const p2: ?[*c]u8 = p1; // is optional now
return p2 == null;

The user may anticipate that p2 should now be null because it is assigned from a pointer with the zero-address. This is not true though.

In the last example, p2 is now an optional pointer - it either holds a pointer value or the value of null. In this case, the optional is not null but the pointer itself has the value of a null-ptr.

Dereferencing the value contained by the optional will create a segfault:

p2.?.* // segfault

This can cause sneaky bugs where the user believed they checked for a null value, but they checked the optional itself, not the value of the pointer that was contained by the optional.
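The difference can be sketched as two small functions. naiveIsNull wraps the snippet from above in a function body, and carefulIsNull unwraps the optional before looking at the address. Both names are hypothetical.

```zig
const std = @import("std");

/// The snippet from the text: only the optional itself is checked.
fn naiveIsNull(p1: [*c]u8) bool {
    const p2: ?[*c]u8 = p1; // wraps p1; the optional now holds a value
    return p2 == null;
}

/// Unwraps the optional first, then checks the contained address.
fn carefulIsNull(p1: [*c]u8) bool {
    const p2: ?[*c]u8 = p1;
    const inner = p2 orelse return true; // the optional itself is null
    return inner == null; // the payload is the zero address
}

test "checking the optional misses the zero address" {
    const zero: [*c]u8 = @ptrFromInt(0);

    try std.testing.expect(!naiveIsNull(zero)); // the footgun
    try std.testing.expect(carefulIsNull(zero));
}
```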

Recommendation

Be careful about adding optional qualifiers to [*c] type pointers. If you want to see if the [*c] pointer has the value of 0, just check by directly comparing it to null as in ptr == null or use an orelse.

In short, don’t confuse the optional-pointer’s null value for the address of the pointer itself when dealing with C-pointers.

Only use optional C-pointers ?[*c] when you intend to convey that there may not be a pointer assigned to that value. Ideally, use a concrete optional Zig pointer instead (e.g. ?[*:0]u8 if the data is a zero-terminated string and the pointer itself can be null).

C-pointers can be coerced to normal Zig pointer types and to optional pointer types. Using the more concrete Zig types helps avoid invalid uses. This requires you to work out what the more concrete Zig type is (it differs from case to case) and to declare that type in your Zig code so that Zig can enforce correct usage. Also read the Language Reference: C-Pointers for more details.
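A rough sketch of such a coercion, assuming the data behind the pointer really is NUL-terminated (Zig cannot verify that for you). The function name asOptionalString is hypothetical.

```zig
const std = @import("std");

/// Coerces a C pointer to an optional sentinel-terminated Zig pointer.
/// The zero address becomes the optional's null value.
fn asOptionalString(raw: [*c]const u8) ?[*:0]const u8 {
    return raw;
}

test "coercion maps the zero address onto null" {
    try std.testing.expect(asOptionalString(null) == null);

    const s = asOptionalString("zig") orelse return error.UnexpectedNull;
    try std.testing.expectEqualStrings("zig", std.mem.span(s));
}
```

After the coercion there is only one empty state left, so a plain null check or orelse covers it.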

Here is an example where a complex pointer type is described exactly on the Zig side: Null terminated array of strings from C function - #12 by castholm

Sze · July 3, 2024, 4:33pm

Language Reference: C-Pointers

It seems like we should add some of this or references to the Language Reference:

This type is to be avoided whenever possible. The only valid reason for using a C pointer is in auto-generated code from translating C code.

Coerces to other pointer types, as well as Optional Pointers. When a C pointer is coerced to a non-optional pointer, safety-checked Undefined Behavior occurs if the address is 0.

Note that creating an optional C pointer is unnecessary as one can use normal Optional Pointers.

Does not support Zig-only pointer attributes such as alignment. Use normal Pointers please!

Basically adding a bit of a disclaimer that we shouldn’t use C pointers so much to begin with and should instead use proper Zig types in Zig code.

AndrewCodeDev · July 3, 2024, 4:38pm

I actually disagree with that sentiment because there are several cases (especially with iterators) when moving a pointer backwards is far easier to do and reason about than using unsigned values that have to rely on saturating arithmetic (or additional logic) around the value 0. In those cases, additional work has to be done to check if we’re past the beginning of a slice. A big part of safety is how easy something is to reason about, so I disagree with the language reference here.

However, for a non-arithmetic pointer, I can agree. You should get away from the C pointers as fast as you can in that case. I think a good practice is “use standard Zig pointers first” - I can get behind that. If you want to add something like that to the doc, then I’m in support.

Also, note that in the case which spawned this doc that needed to inspect a c pointer, that advice would not have changed the outcome. The footgun was still present.

Sze · July 3, 2024, 4:53pm

I think I agree if the pointer arithmetic needs to go below the pointer/beginning, if it only needs the positive direction then multi-item pointers already handle that.

I disagree here.
I used pub extern "c" fn readline([*c]const u8) ?[*:0]u8; to declare the pointer type as a Zig type at the function definition side and that fixes the bug, the same thing could be done manually at the call site with const maybe_temp: ?[*:0]u8 = rline.readline(@ptrCast(prompt[0..].ptr));
This works because of the 2nd quoted part about coercing to optional pointers.

The footgun was present because the c pointer was used, with the zig pointer it disappears.

AndrewCodeDev · July 3, 2024, 5:01pm

Okay, I can see your point about also doing a pointer conversion with a null terminator. I haven’t tried that myself but if you say that works then I’ll buy it. We just have to be clear that the pointer has to be taxonomically correct for what’s being attempted.

In that case, I think we should definitely add a section about pointer conversions in the “Recommendation” segment and link to the language reference.

Sze · July 3, 2024, 5:45pm

I am still not sure about that sentence because [*c] can already be a null pointer, so I don’t think ?[*c] is useful/helpful. It makes it more confusing to me, I think it is better to just have [*c] or a concrete ?*T.

castholm · July 3, 2024, 5:50pm

Relevant: make the C pointer type always a compile error unless the file declares that it is generated code · Issue #2984 · ziglang/zig · GitHub

I think the best general advice is that you should never ever even write a line of code that contains the [*c] token in the first place. The sole exception is if you’re writing tooling that automatically translates C code.

A correctly annotated pointer is always the better solution and if you’re unsure what you should coerce or cast the result to, you should either consult the relevant header files or docs for more information or assume a conservative option like ?*T

AndrewCodeDev · July 3, 2024, 5:56pm

@Sze, @castholm, I agree on principle that we should be encouraging Zig pointers first. There’s a few important caveats to that though.

First, it’s entirely possible to write ?[*c]. Because of that, I think it’s valuable to let people know what the consequences are because that actually helps people understand why they are problematic to begin with. While I agree that we should discourage their use in general, I don’t think that should be a reason to be silent about them because they are a valid type that exists.

We should probably add a line at the top that says “you should avoid writing code like this to begin with and use the correct Zig-pointer instead”.

Yeah, it’s probably a mistake to begin with. That said, again, it’s a valid type so I see no reason why we shouldn’t document what they are and how they’re used.

AndrewCodeDev · July 3, 2024, 6:03pm

I’ve added a note that discourages the use of C-pointers and instead encourages the use of the correct Zig pointer.

dude_the_builder · July 3, 2024, 6:23pm

Just for completeness, maybe add:

(Unless allowzero is used.)

AndrewCodeDev · July 3, 2024, 6:38pm

Yeah, we can put a note in for is_allowzero for posterity.

pierrelgol · July 4, 2024, 5:08pm

Thanks to everyone's help I now understand this better, but I don’t think that it’s obvious that [*c] can be null when it doesn’t share the ? syntax. I know this is a skill issue on my end, but I’m still not sure why [*c] is not expressed as ?[*c]. It feels like an inconsistency. I know we aren’t supposed to use those types, but at the same time, as a beginner, I didn’t even think of declaring the prototype like you suggested. Maybe it’s because I’ve been severely abused by C in general, where the slightest change to a function prototype makes the function suddenly disappear, but I don’t think that it’s very intuitive.

Yesterday before asking on ziggit I used this part from the doc as a reference [*c] T : Coerces to other pointer types, as well as Optional Pointers. When a C pointer is coerced to a non-optional..

which is why I tried to coerce it to ?[*c], because it said that it could coerce to an optional pointer, which in my little brain, meant that it wasn’t an optional yet, but that it could be if helped a little.

Sze · July 4, 2024, 6:02pm

I think in that case you didn’t actually coerce it to an optional, but instead assigned the zero-valued pointer to an optional. I agree that this case is confusing. I also find it a bit unsatisfactory that the rule isn’t simply “types with ? can be zero and everything else can’t”.

But I think the reason for that is, that sometimes certain syntax constructs can be used with different concrete types which can make it difficult to realize what exactly is going on sometimes.

Personally I think the best way to untangle these hard to grasp situations is to make generous use of @compileLog(@typeInfo(@TypeOf(some_variable))) and variations of it and then carefully read, whether the exact types are really what you expected, including looking at details like is_allowzero.

const a: [*c]u32 = @ptrFromInt(0);
const b: ?[*c]u32 = a;
@compileLog(b); // @as(?[*c]u32, @as([*c]u32, @ptrFromInt(0)).*)

This gives a fairly good description, but I think as soon as things are more runtime dependent you quickly get less informative results. So having some specialized utilities for printing type information and the data of variables at run-time could also help.

std.debug.print("{?*}\n", .{b}); // u32@0

Doesn’t seem like a great output until you realize that it is actually telling you that the value of the optional is a u32 zero pointer.
Because a null optional just looks like this:

const c: ?[*c]u32 = null;
std.debug.print("{?*}\n", .{c}); // null
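One possible shape for such a run-time utility, as a sketch only. PtrState and classify are hypothetical names and not part of the standard library.

```zig
const std = @import("std");

const PtrState = enum { null_optional, zero_address, has_address };

/// Hypothetical debugging helper: reports which of the three states an
/// optional C pointer is in.
fn classify(p: ?[*c]u32) PtrState {
    const inner = p orelse return .null_optional;
    return if (inner == null) .zero_address else .has_address;
}

test "tell the three states apart at run time" {
    var value: u32 = 1;
    const a: [*c]u32 = @ptrFromInt(0);
    const b: ?[*c]u32 = a;
    const valid: [*c]u32 = &value;

    try std.testing.expectEqual(PtrState.zero_address, classify(b));
    try std.testing.expectEqual(PtrState.null_optional, classify(null));
    try std.testing.expectEqual(PtrState.has_address, classify(valid));
    // The type name also makes the extra optional layer visible.
    try std.testing.expectEqualStrings("?[*c]u32", @typeName(@TypeOf(b)));
}
```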

Overall I am unsure what can be done to make this less foot-gunny. I guess the issue mentioned by @castholm, about making the use of C pointers more inconvenient, could help.

Until then you can try to avoid c pointers as much as possible, or be very careful with them and inspect the types and values carefully.

I also think that defining more concrete function signatures is very valuable to avoid problems, that said I have only used it a little bit, but it is similar to the tradeoff between dynamic typing and static typing, the former can be faster until you get problems and then you wish you had done the latter to begin with.

It also depends a lot on how understandable and predictable the C APIs are and how their types cross over into the Zig side. When you know what to expect you can use some of the looser C types directly; when you aren’t completely sure, it may be better to type things more concretely.

pierrelgol · July 4, 2024, 10:00pm

Yes I agree with you, and in the future, I think I’ll try harder to define C function prototype using Zig’s type system instead of taking the translation as is, definitely a rookie mistake on my end. In any case as always thanks everyone.

squeek502 · July 4, 2024, 11:49pm

I don’t think this should necessarily be the takeaway. The function prototype containing [*c] is fine, but using or retaining the [*c] type in your Zig code when calling that function is a mistake.

In other words, this would be nicer with Zig pointer types in the binding, but this is still fine:

// autogenerated
extern fn foo(ptr: [*c]const u8) [*c]const u8;

// your code
fn fooCaller(
    /// Needs a NUL-terminated sequence
    something: [:0]const u8,
) void {
    // Returns a NUL-terminated sequence
    const result: ?[:0]const u8 = std.mem.sliceTo(foo(something), 0);
    _ = result;
}

but something or result having a type involving [*c] is definitely worth avoiding.
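For a variant that also handles a zero address coming back from the C side, here is a self-contained sketch. foo is a stand-in written in Zig so that the example runs without a C library, and fooChecked is a hypothetical wrapper name.

```zig
const std = @import("std");

/// Stand-in for the autogenerated binding (an assumption for this sketch):
/// returns the zero address for an empty string, else its argument.
fn foo(ptr: [*c]const u8) [*c]const u8 {
    if (ptr == null or ptr[0] == 0) return null;
    return ptr;
}

/// Wrapper that keeps [*c] out of the caller's code.
fn fooChecked(something: [:0]const u8) ?[:0]const u8 {
    const raw = foo(something.ptr);
    if (raw == null) return null;
    // Coercing to a non-optional pointer is safe after the check above.
    const ptr: [*:0]const u8 = raw;
    return std.mem.span(ptr);
}

test "the wrapper exposes only Zig pointer types" {
    const hit = fooChecked("abc") orelse return error.UnexpectedNull;
    try std.testing.expectEqualStrings("abc", hit);
    try std.testing.expect(fooChecked("") == null);
}
```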

pierrelgol · July 5, 2024, 12:49pm

Yes, this was the original intent indeed. If you look at my readline function, it was a thin wrapper around the original readline, and don’t worry, I’ve been bullied enough by C that I don’t want to use its type system if I can avoid it, haha. But now I know better about the true meaning of [*c]T, and everyone shared some really good tips, including you.

chung-leong · July 5, 2024, 4:02pm

To add to @AndrewCodeDev’s point, it’s worth pointing out that an optional C pointer is larger than a C pointer, unlike optionals of the normal pointer types (allowzero pointers behave like C pointers here):

const std = @import("std");

pub fn main() void {
    std.debug.print("@sizeOf(*i32) == @sizeOf(?*i32) -> {any}\n", .{@sizeOf(*i32) == @sizeOf(?*i32)});
    std.debug.print("@sizeOf([]i32) == @sizeOf(?[]i32) -> {any}\n", .{@sizeOf([]i32) == @sizeOf(?[]i32)});
    std.debug.print("@sizeOf([*]i32) == @sizeOf(?[*]i32) -> {any}\n", .{@sizeOf([*]i32) == @sizeOf(?[*]i32)});
    std.debug.print("@sizeOf([*c]i32) == @sizeOf(?[*c]i32) -> {any}\n", .{@sizeOf([*c]i32) == @sizeOf(?[*c]i32)});
    std.debug.print("@sizeOf(*allowzero i32) == @sizeOf(?*allowzero i32) -> {any}\n", .{@sizeOf(*allowzero i32) == @sizeOf(?*allowzero i32)});
}

@sizeOf(*i32) == @sizeOf(?*i32) -> true
@sizeOf([]i32) == @sizeOf(?[]i32) -> true
@sizeOf([*]i32) == @sizeOf(?[*]i32) -> true
@sizeOf([*c]i32) == @sizeOf(?[*c]i32) -> false
@sizeOf(*allowzero i32) == @sizeOf(?*allowzero i32) -> false