How to express the fact that an imported C function can indeed return null, when the translated prototype doesn't reflect the nullability?

Hi everyone. I’m playing around with a shell implementation of mine and trying to translate it to Zig. So far so good. In this project I have to use readline to read the input, and the problem is that the translated prototype of the readline function doesn’t express the fact that it can return null. I’ve tried to express the nullability myself, but evidently failed, and I was hoping I had just missed something. Otherwise everything works as expected: I can call readline from Zig with no problem :slight_smile:

Here’s the code sample:

const std = @import("std");
const rline = @cImport({
    @cInclude("stdio.h");
    @cInclude("unistd.h");
    @cInclude("fcntl.h");
    @cInclude("stdlib.h");
    @cInclude("readline/readline.h");
    @cInclude("readline/history.h");
});

pub fn readline(allocator: std.mem.Allocator, prompt: []const u8) !?[]const u8 {
    const maybe_temp: ?[*c]u8 = rline.readline(@ptrCast(prompt[0..].ptr));
    const temp = maybe_temp orelse return null;
    defer std.c.free(temp);
    const slice = std.mem.span(temp);
    const result = try allocator.dupe(u8, slice);
    return (result);
}

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer {
        if (gpa.detectLeaks())
            @panic("Leaks detected!");
        _ = gpa.deinit();
    }
    const allocator = gpa.allocator();
    const stdout_handle = std.io.getStdOut();
    const stdout_writer = stdout_handle.writer();
    while (try readline(allocator, "shell$>\n")) |line| {
        try stdout_writer.print("{s}", .{line});
        defer allocator.free(line);
    }
}

Voilà. If anyone is able to help, I’d be very happy to hear how to do this properly. I’m not even sure I’m doing everything else right.

Completely forgot to share the error :sweat_smile:

thread 8632 panic: reached unreachable code
/home/pollivie/zig/0.14.0-dev.130+cb308ba3a/files/lib/std/debug.zig:412:14: 0x10375fc in assert (zshell)
    if (!ok) unreachable; // assertion failure
             ^
/home/pollivie/zig/0.14.0-dev.130+cb308ba3a/files/lib/std/mem.zig:1018:23: 0x103a44e in len__anon_7339 (zshell)
                assert(value != null);
                      ^
/home/pollivie/zig/0.14.0-dev.130+cb308ba3a/files/lib/std/mem.zig:792:18: 0x1034f51 in span__anon_6885 (zshell)
    const l = len(ptr);
                 ^
/home/pollivie/minishell/zshell/src/main.zig:15:31: 0x1034d00 in readline (zshell)
    const slice = std.mem.span(temp);
                              ^
/home/pollivie/minishell/zshell/src/main.zig:30:24: 0x10353d6 in main (zshell)
    while (try readline(allocator, "shell$>\n")) |line| {
                       ^
/home/pollivie/zig/0.14.0-dev.130+cb308ba3a/files/lib/std/start.zig:515:37: 0x10362ae in main (zshell)
            const result = root.main() catch |err| {
                                    ^
../sysdeps/nptl/libc_start_call_main.h:58:16: 0x7f6ead6451c9 in __libc_start_call_main (../sysdeps/x86/libc-start.c)
../csu/libc-start.c:360:3: 0x7f6ead64528a in __libc_start_main_impl (../sysdeps/x86/libc-start.c)
???:?:?: 0x1034bd4 in ??? (???)
???:?:?: 0x0 in ??? (???)
run
└─ run zshell failure
error: the following command terminated unexpectedly:
/home/pollivie/minishell/zshell/zig-out/bin/zshell
Build Summary: 5/7 steps succeeded; 1 failed (disable with --summary none)
run transitive failure
└─ run zshell failure
error: the following build command failed with exit code 1:
/home/pollivie/minishell/zshell/.zig-cache/o/b64b00f884f8551b6d2983a1fbe0fc2f/build /home/pollivie/zig/0.14.0-dev.130+cb308ba3a/files/zig /home/pollivie/minishell/zshell /home/pollivie/minishell/zshell/.zig-cache /home/pollivie/.cache/zig --seed 0xb107c79f -Z1be2ef5f0feb5298 run

As you can see, my orelse return null never triggered. I think the reason is that the translated readline returns a [*c]u8 rather than a ?[*c]u8.

Hey Pierre, been a while :slight_smile:

It looks like your problem is coming from len in this branch here:

.C => {
    assert(value != null);
    return indexOfSentinel(info.child, 0, value);
},

And that gets called from span… which gets called from your readline.

Have you tried inspecting that pointer first? As in actually checking the numeric value of it before handing it to span? Essentially, std.mem.len is documented to assume non-null. I’m curious what you’ll get if you print the address from that pointer in the problematic case.

Thanks, it’s good to see you too. Good catch: in the case where I send EOF, I do get the address 0 if I do:

    const maybe_temp: [*c]u8 = rline.readline(@ptrCast(prompt[0..].ptr));
    const value : usize = @intFromPtr(maybe_temp);
    std.debug.print("{d}", .{value});

So how should I implement this? It seems weird to have to cast the pointer to an integer to check whether it’s zero, but maybe that’s the only solution. I just thought there might be a way to tell Zig that zero is indeed null.

Can you show me the signature of the rline.readline function? I have a suspicion about that function that I’d like to settle.

So the readline translation looks like this:

fn readline([*c]const u8) [*c]u8

My fix is this:

pub fn check(ret : [*c]u8) ?[*c]u8 {
    const value : usize = @intFromPtr(ret);
    return if (value == 0) null else ret;
}

pub fn readline(allocator: std.mem.Allocator, prompt: []const u8) !?[]const u8 {
    const maybe_temp: ?[*c]u8 = check(rline.readline(@ptrCast(prompt[0..].ptr)));
    const temp = maybe_temp orelse return null;
    defer std.c.free(temp);
    const slice = std.mem.span(temp);
    const result = try allocator.dupe(u8, slice);
    return (result);
}

Oh, and the readline man page also explains that the return value is NULL if readline receives EOF and nothing has been typed yet.

I’d have to build a basic example, but try this and tell me what you get (I’m not at my main computer to run zig code):

const ptr: *u8 = @ptrFromInt(0); // I believe this throws an error

const ptr: [*c]u8 = @ptrFromInt(0); // This may not and could be your problem

If the second one works without a problem then I think that’s your issue and you’ll need to inspect the pointer.

Yep, that’s correct: the second one compiles just fine, while the first one doesn’t.

By that logic, [*c] can legitimately carry the value 0. You’ll have to inspect that value first then because it doesn’t get expressed as null. Makes sense though - they have to be compatible with C-pointers and… well… that’s a thing so…

Yes, I think I’m just going to wrap it in the little check function I made and call it a day. Thanks for your help, though. I’m still curious why translated C pointers aren’t simply treated as nullable right off the bat. To me every C pointer should be considered nullable by default, because it’s C and you never know, but maybe there’s a good reason why that’s not the case.

This is the problem: ?[*c]u8 doesn’t act like a proper optional where 0 is equivalent to null. I don’t completely understand the details, but it seems that ?[*c]u8 is basically like [*c]u8, except that it tricks Zig into allowing optional syntax.

In practical terms, I think you just have to avoid using [*c] within Zig as much as possible. One way to do that here is to declare the signature of the readline function manually:

const fixup = struct {
    pub extern "c" fn readline([*c]const u8) ?[*:0]u8;
};

pub fn readline(allocator: std.mem.Allocator, prompt: []const u8) !?[]const u8 {
    const maybe_temp = fixup.readline(@ptrCast(prompt[0..].ptr));
    const temp = maybe_temp orelse return null;
    defer std.c.free(temp);
    const slice = std.mem.span(temp);
    const result = try allocator.dupe(u8, slice);
    return (result);
}

Notice if you re-add the old type the program is broken again:

const maybe_temp: ?[*c]u8 = fixup.readline(@ptrCast(prompt[0..].ptr));

I don’t know why ?[*c]u8 acts like an optional with broken invariants. Maybe it is needed for some tricky C compatibility edge case, but it would be good to know whether there is a reason for it behaving that way.

Also, I just realized that you can skip the integer check entirely and do the following:

const ptr: [*c]u8 = ...

if (ptr == null) // check for zero implicitly

So we don’t really need a wrapper or helper here. It’s a non-obvious point of syntax because of the lack of ? but it does what you’d expect.
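
A minimal sketch of that check, runnable with zig test. The function fakeCCall is a made-up stand-in for a translated C function (it is not part of readline or any real library); it only exists to produce a runtime [*c]u8 that may hold address 0:

```zig
const std = @import("std");

// Hypothetical stand-in for a translated C function: it returns whatever
// address it is given, so passing 0 mimics a C function returning NULL.
fn fakeCCall(address: usize) [*c]u8 {
    return @ptrFromInt(address);
}

// Returns true when the C pointer holds address 0.
fn isNull(ptr: [*c]u8) bool {
    // There is no `?` in the type, but comparing against null is allowed.
    return ptr == null;
}

test "a C pointer can be compared against null directly" {
    var byte: u8 = 'x';
    try std.testing.expect(isNull(fakeCCall(0)));
    try std.testing.expect(!isNull(&byte));
}
```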

That’s why this works:

.C => {
    assert(value != null);
    return indexOfSentinel(info.child, 0, value);
},

Interestingly, it works with x == null but not orelse which is why this is confusing. Edited: see next post.

Okay, so no… there’s more to this story than meets the eye here.

@Sze, @pierrelgol, here’s what’s actually happening… look at the following example:

  const p1: [*c]u8 = @ptrFromInt(0);
  const p2: [*c]u8 = p1;
  return p2 == null;

Obviously, this will in fact return true, because p2 is just a copy of p1 and p1 has the zero address. That works fine.

This however is what was happening in @pierrelgol’s case:

  const p1: [*c]u8 = @ptrFromInt(0);
  const p2: ?[*c]u8 = p1; // is optional now
  return p2 == null;

Now, p2 is optional. The optional has a value of p1, so the optional itself is not null. The pointer that it’s holding onto IS null though so it sneaks by this check. This is a great footgun example. I’m going to mark this as the solution because this is actually what is happening here.
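
Here is a small sketch of the same footgun as a test you can run with zig test. The function fakeEof is a hypothetical stand-in for readline hitting EOF (it is not a real API); it just hands back a runtime C pointer with address 0. The last case assumes the C pointer is coerced to an ordinary optional pointer, as in the manual extern declaration earlier in the thread:

```zig
const std = @import("std");

// Hypothetical stand-in for readline at EOF: a C pointer with address 0.
fn fakeEof() [*c]u8 {
    return @ptrFromInt(0);
}

test "?[*c]u8 tracks null separately from the address" {
    const raw: [*c]u8 = fakeEof();
    // The C pointer itself compares equal to null.
    try std.testing.expect(raw == null);

    // Wrapping it adds an outer optional layer around the pointer.
    const wrapped: ?[*c]u8 = raw;
    // The outer optional holds a value, so it is not null...
    try std.testing.expect(wrapped != null);
    // ...but the pointer stored inside it is still address 0.
    try std.testing.expect(wrapped.? == null);

    // An ordinary optional pointer has no extra layer: address 0 becomes null.
    const plain: ?[*:0]u8 = raw;
    try std.testing.expect(plain == null);
}
```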

So @pierrelgol, what you really need to do here is just treat the [*c] like you would any other optional pointer. It does in fact work with orelse. You don’t need to hand it to another pointer with a qualified optional - you can go directly against the pointer itself.
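
As a rough sketch of what that looks like, here is a wrapper that applies orelse to the C pointer itself. Everything named here is an assumption for illustration: fakeReadline and sample_line are hypothetical stand-ins with the same return type as the translated readline, so the example runs without linking libreadline. With the real function, the returned buffer would also need to be released with std.c.free as in the original code.

```zig
const std = @import("std");

// Hypothetical NUL-terminated buffer standing in for the line readline returns.
var sample_line = "ls -la".*;

// Hypothetical stand-in returning the same type as the translated
// `readline`; address 0 mimics the NULL that readline returns on EOF.
fn fakeReadline(at_eof: bool) [*c]u8 {
    if (at_eof) return @ptrFromInt(0);
    return &sample_line;
}

// Copies the line into allocator-owned memory, or returns null at EOF.
fn readLine(allocator: std.mem.Allocator, at_eof: bool) !?[]u8 {
    // `orelse` is applied to the [*c]u8 directly, not to a ?[*c]u8 wrapper.
    const line = fakeReadline(at_eof) orelse return null;
    return try allocator.dupe(u8, std.mem.span(line));
}

test "orelse directly on the C pointer" {
    const allocator = std.testing.allocator;

    // EOF case: the wrapper reports null instead of reaching std.mem.span.
    try std.testing.expect((try readLine(allocator, true)) == null);

    // Normal case: the caller owns the copied line.
    const line = (try readLine(allocator, false)).?;
    defer allocator.free(line);
    try std.testing.expectEqualStrings("ls -la", line);
}
```

The key difference from the original code is only the missing type annotation: without ?[*c]u8 on the left-hand side, no extra optional layer is created.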

So if I understand this correctly, [*c]u8 already behaves like an optional type; it just doesn’t look like one because there is no ? anywhere.

And then ?[*c]u8 is an optional of an optional?
Where the outer optional has the inner optional as value and the inner one is the zero pointer.


const p1: [*c]u8 = @ptrFromInt(0);
const p2: ?[*c]u8 = p1; // is optional now
@compileLog(p1, @typeInfo(@TypeOf(p1)));
@compileLog(p2, @typeInfo(@TypeOf(p2)));

This logs:

@as([*c]u8, @as([*c]u8, @ptrFromInt(0)).*), @as(builtin.Type, .{ .Pointer = .{ .size = .C, .is_const = false, .is_volatile = false, .alignment = 1, .address_space = .generic, .child = u8, .is_allowzero = true, .sentinel = null } })
@as(?[*c]u8, @as([*c]u8, @ptrFromInt(0)).*), @as(builtin.Type, .{ .Optional = .{ .child = [*c]u8 } })

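I think what allows [*c]u8 to be used with optional syntax is that it is a C pointer (.size = .C in the type info), which also comes with .is_allowzero = true.

So I think C pointers, which allow address zero, behave like optionals. As far as I can tell this is specific to [*c]; an ordinary pointer marked allowzero does not get the optional syntax.

So I really think this problem came from accidentally creating double “optionals”, where the inner one is a C pointer that allows zero.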
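
One way to see the extra layer is to compare sizes. This is only a sketch; it relies on the language reference’s statement that optional C pointers introduce another bit to keep track of null, while ordinary optional pointers reuse address 0:

```zig
const std = @import("std");

test "an optional C pointer carries an extra null flag" {
    // Ordinary optional pointers use address 0 as null, so they need no
    // extra storage beyond the pointer itself.
    try std.testing.expect(@sizeOf(?[*:0]u8) == @sizeOf([*:0]u8));

    // A C pointer already allows address 0 as a value, so wrapping it in
    // `?` has to store the null state separately.
    try std.testing.expect(@sizeOf(?[*c]u8) > @sizeOf([*c]u8));
}
```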

Also this section of the language reference is relevant: C-Pointers

OK, thanks everyone for the details. Now that you’ve uncovered this, I’d really like an answer, lol: is this a bug or is it intended? If it is intended, I think it is a serious footgun indeed, and I’d really like to hear why it works that way.

EDIT: To be fair, it does make sense, because I also expect every C pointer to be considered possibly null, but the syntax is weird.

I just made a doc for it.

I believe this is in fact intended behaviour, but it’s really sneaky.

An optional’s null state only tells you whether it has been given a pointer value; it says nothing about the address of the pointer it is holding. That technically makes perfect sense, but we’re so used to ? being associated with null that it doesn’t come to mind when using [*c] pointers, which have their own semantics surrounding null and orelse.
