What are the reasons for exposing errno in std.os.linux syscalls?

Shemi · October 29, 2023, 6:23am

I tried writing some code that used Linux-specific syscalls and was surprised that the implementation of these syscalls returned actual usize integers as return codes and not zig error-sets as with most other functions in std. I will try and present my objections, and hope someone will refute them.
I looked into it and found this issue, which reorganized std.os. It states:

In Zig-flavored POSIX [what is now std.os], errno is not exposed; instead actual zig error unions and error sets are used. When not linking libc on Linux there will never be an errno variable, because the syscall return code contains the error code in it. However for OS-specific APIs, where that OS requires the use of libc, errno may be exposed, for example as std.os.darwin.errno().

I personally do not understand why the std.os.linux syscalls return a number instead of an error set, and would like someone to explain why it makes sense. It reduces syscall error handling to C-style return-code checking. So for system-specific syscalls, Zig shares the faults of C error handling (at least as far as silently ignoring return values goes):

// C raw syscalls
#include <fcntl.h>
#include <unistd.h>

int main() {
    int fd = open("does_not_exist/foo.txt", O_CREAT);
    write(fd, "hi", 2);
    close(fd);
    return 0;
}

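// Zig raw syscalls
const linux = @import("std").os.linux;

pub fn main() !void {
    // linux.open takes (path, flags, perm), so O.CREAT goes in the flags slot.
    var fd: isize = @bitCast(linux.open("does_not_exist/foo.txt", linux.O.CREAT, 0));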
    _ = linux.write(@truncate(fd), "hi", 2);
    _ = linux.close(@truncate(fd));
}

(code example from road to zig 1.0)
Of course, this style half-implicitly ignores any errors these calls return, since there is no try keyword to force handling. As a result, people who write code that relies heavily on OS-specific interfaces end up writing their own file of error-set wrappers, like this one, containing code that honestly belongs in std.
Another problem, visible in the code above, is that the current interface forces a lot of redundant casting. It would make more sense for std.os.linux.open to return an i32 with an error set instead of a usize.
A supposedly correct program that at least panics on errors from these functions (without even handling them gracefully) looks like this:

const std = @import("std");
const linux = @import("std").os.linux;

inline fn panic_on_err(value: usize) void {
    if (linux.getErrno(value) != .SUCCESS) {
        std.debug.panic("There was _some_ error, but to know which enum values to check we must consult manpages! {}", .{linux.getErrno(value)});
    }
}

pub fn main() !void {
    // linux.open takes (path, flags, perm), so O.CREAT goes in the flags slot.
    var ret = linux.open("does_not_exist/foo.txt", linux.O.CREAT, 0);
    panic_on_err(ret);

    const fd: i32 = @truncate(@as(isize, @bitCast(ret)));
    ret = linux.write(fd, "hi", 2);
    panic_on_err(ret);

    ret = linux.close(fd);
    panic_on_err(ret);
}

To me, this feels too unziggy. I would like to understand the reasons for this design, which (for me at least) makes it harder to write code that uses syscalls.

IntegratedQuantum · October 29, 2023, 8:51am

I think the interface in std.os.linux is supposed to represent the raw interface without any ziggification.

If you want a better interface, take a look at std.os, which for example has std.os.open(), returning an OpenError!fd_t where fd_t is an alias for i32.

Shemi · October 29, 2023, 9:49am

I agree that is what it is, but a better interface does not exist for OS-specific syscalls, such as mount. As I pointed out, that means each programmer needs to write their own ziggified wrappers for those APIs (refer to this).
Maybe the example I chose didn’t reflect my intentions, but that is my issue.

squeek502 · October 29, 2023, 11:05am

This is largely a guess, but I have a feeling that this is a part of it:

ziglang/zig/blob/37295696ec5b1350333b88173187a3db8b199925/lib/std/os.zig#L60-L73

/// Applications can override the `system` API layer in their root source file.
/// Otherwise, when linking libc, this is the C API.
/// When not linking libc, it is the OS-specific system interface.
pub const system = if (@hasDecl(root, "os") and root.os != @This())
    root.os.system
else if (builtin.link_libc or is_windows)
    std.c
else switch (builtin.os.tag) {
    .linux => linux,
    .plan9 => plan9,
    .wasi => wasi,
    .uefi => uefi,
    else => struct {},
};

That is, std.os.system can vary based on target OS and/or if libc is being linked, and it’s used like this:

ziglang/zig/blob/37295696ec5b1350333b88173187a3db8b199925/lib/std/os.zig#L644-L652

pub fn kill(pid: pid_t, sig: u8) KillError!void {
    switch (errno(system.kill(pid, sig))) {
        .SUCCESS => return,
        .INVAL => unreachable, // invalid signal
        .PERM => return error.PermissionDenied,
        .SRCH => return error.ProcessNotFound,
        else => |err| return unexpectedErrno(err),
    }
}

So this code could be translating the linux syscall return or it could be translating the libc function return (and getting the actual error using _errno() in the libc case). This setup allows them to share the same implementation.

Relevant getErrno implementations (note that std.os.errno = system.getErrno)

Linux:

ziglang/zig/blob/37295696ec5b1350333b88173187a3db8b199925/lib/std/os/linux.zig#L221-L225

pub fn getErrno(r: usize) E {
    const signed_r = @as(isize, @bitCast(r));
    const int = if (signed_r > -4096 and signed_r < 0) -signed_r else 0;
    return @as(E, @enumFromInt(int));
}

C:

ziglang/zig/blob/37295696ec5b1350333b88173187a3db8b199925/lib/std/c.zig#L114-L120

pub fn getErrno(rc: anytype) c.E {
    if (rc == -1) {
        return @as(c.E, @enumFromInt(c._errno().*));
    } else {
        return .SUCCESS;
    }
}
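
To make the difference between the two conventions concrete, here is a minimal C sketch of the same two decoders. It is C rather than Zig only so that the libc errno side can be shown directly. The names errno_from_raw and errno_from_libc are hypothetical helpers made up for illustration, not part of any library.

```c
#include <errno.h>

/* Raw Linux syscall convention: a return value in [-4095, -1] encodes
 * -errno, and anything else is a successful result. This mirrors the
 * Linux getErrno quoted above. The unsigned-to-signed conversion is
 * implementation-defined in ISO C, but wraps on Linux toolchains. */
static int errno_from_raw(unsigned long r) {
    long s = (long)r;
    return (s > -4096 && s < 0) ? (int)-s : 0;
}

/* libc convention: the wrapper returns -1 and stores the code in the
 * thread-local errno. This mirrors the std.c getErrno quoted above. */
static int errno_from_libc(long rc) {
    return rc == -1 ? errno : 0;
}
```

Both helpers produce the same kind of value (0 for success, otherwise an errno code), which is what lets one switch statement serve both backends.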

Shemi · October 29, 2023, 11:52am

That’s a neat observation actually! But even so, wouldn’t it make sense to use this system errno interface only internally within std? Or even better, abolish it entirely and move the switch statements that std.os uses into the std.os.foo APIs, so that std.os.foo.func returns the same error set as std.os.func. Either way, this would also imply that std.os.foo.os_specific_func should return error sets and not errnos. Even in the current implementation, OS-specific functions aren’t related to std.os in any way and could be replaced on their own, although the resulting heterogeneity would be bad.

IntegratedQuantum · October 29, 2023, 11:53am

From what I gathered, the goal of std.os is not to contain a ziggified version of every syscall for every supported operating system out there. Instead, its main purpose is to serve the standard library and the compiler (and maybe also common use cases outside of that). And I guess mount is just not on that list.

And I think that’s a good thing since it means less maintenance work for the people working on the standard library and compiler.

But it doesn’t mean that each programmer needs to write their own ziggified API.
You can for example make a library for that, which can be reused across different projects. Now that the package manager exists that should be relatively easy.

mcadamy · October 30, 2023, 2:01pm

Why does r keep getting cast back and forth between isize and usize? It would be much nicer to map the errno value directly to an error instead of each syscall wrapper repeating similar and partially overlapping (and, you hope, identical) transformations in the switch. Have one place where that mapping is done and one error set for all errno values. Then you can expose that for code outside stdlib.
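
As a rough illustration of the "one place" idea, a single shared mapping might look like the C sketch below. The os_error enum and os_error_from_errno are hypothetical names invented for this example, and only the codes that appear in the kill wrapper quoted earlier are mapped; a real table would cover many more.

```c
#include <errno.h>

/* Hypothetical shared error type used by every wrapper. */
typedef enum {
    OS_OK,
    OS_PERMISSION_DENIED,
    OS_PROCESS_NOT_FOUND,
    OS_UNEXPECTED
} os_error;

/* The single place where errno codes are translated. */
static os_error os_error_from_errno(int e) {
    switch (e) {
    case 0:     return OS_OK;
    case EPERM: return OS_PERMISSION_DENIED;
    case ESRCH: return OS_PROCESS_NOT_FOUND;
    default:    return OS_UNEXPECTED; /* everything not mapped yet */
    }
}
```

One trade-off of this design is that a shared mapping cannot express per-call knowledge, such as the kill wrapper treating INVAL as unreachable.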

Shemi · November 5, 2023, 9:02am

I agree with this, I am not suggesting that we should put mount inside std.os. Instead I am suggesting we make std.os.foo return actual error codes instead of usize.

It doesn’t make sense to me to require language users to install a third-party library just to use basic syscalls in a normal Zig fashion. The place for these API wrappers is definitely std, wouldn’t you agree?

IntegratedQuantum · November 5, 2023, 9:19am

I agree that zig should contain API wrappers for basic/common syscalls.
And it does already have that for a lot of syscalls, like for example the posix socket API.
If you feel that some important functions are missing, then I’d suggest opening an issue or pull request on GitHub.

Shemi · November 6, 2023, 7:22am

To conclude this thread, here’s the issue.
Thanks everyone for your feedback!

chrboesch · August 21, 2024, 11:40am

In the current Zig version (> 0.11), you can and perhaps should use:

const std = @import("std");
const print = std.debug.print;
const posix = std.posix;

pub fn main() !void {
    const rights = 0o755;

    try posix.mkdir("does_not_exist", rights);
    const fd = try posix.open("does_not_exist/foo.txt", .{ .ACCMODE = .WRONLY, .CREAT = true }, rights);
    defer posix.close(fd);

    const ret = try posix.write(fd, "hi");
    print("{d} bytes written\n", .{ret});
}

I think it’s very ziggy.