C/C++ macro challenge #2: repetitive error handling

chung-leong · November 28, 2023, 5:58pm

This week’s challenge is an example in the Node-API documentation. The API is used to create native Node.js modules. Every function returns a napi_status, an enum indicating whether the operation has succeeded or not. This includes basic operations like getting the variable of a JS variable. Dealing with potential failure point on a case-by-case basis would be extremely tedious. The API example therefore provides us with a helpful macro:

// addon.c
#include "addon.h"

#define NODE_API_CALL(env, call)                                  \
  do {                                                            \
    napi_status status = (call);                                  \
    if (status != napi_ok) {                                      \
      const napi_extended_error_info* error_info = NULL;          \
      napi_get_last_error_info((env), &error_info);               \
      const char* err_message = error_info->error_message;        \
      bool is_pending;                                            \
      napi_is_exception_pending((env), &is_pending);              \
      /* If an exception is already pending, don't rethrow it */  \
      if (!is_pending) {                                          \
        const char* message = (err_message == NULL)               \
            ? "empty error message"                               \
            : err_message;                                        \
        napi_throw_error((env), NULL, message);                   \
      }                                                           \
      return NULL;                                                \
    }                                                             \
  } while(0)

static napi_value
DoSomethingUseful(napi_env env, napi_callback_info info) {
  // Do something useful.
  return NULL;
}

napi_value create_addon(napi_env env) {
  napi_value result;
  NODE_API_CALL(env, napi_create_object(env, &result));

  napi_value exported_function;
  NODE_API_CALL(env, napi_create_function(env,
                                          "doSomethingUseful",
                                          NAPI_AUTO_LENGTH,
                                          DoSomethingUseful,
                                          NULL,
                                          &exported_function));

  NODE_API_CALL(env, napi_set_named_property(env,
                                             result,
                                             "doSomethingUseful",
                                             exported_function));

  return result;
}

What is the best way of doing this in Zig?

Here’s a file mocking the relevant Node-API functions:
napi.zig (3.1 KB)

jmc · November 29, 2023, 10:46pm

try

(Here’s more text because Discourse prevents me from posting a reply which contains less than 10 characters.)

LucasSantos91 · November 30, 2023, 2:40am

To elaborate a bit, put all this logic into a function, make it return an error instead of null and, at the call site, just use try. The reason why they did it this way in C++ is because they don’t have a convenient way to check for an error and end the function, which is what try does. However, even in C++ they could have done a much better job. When writing macros for custom control flow like they did there, you want to put as much as possible into normal functions, and write the smallest macro that gets the job done. In this case, the macro should be if(errorCheck(env, call) == NULL) return NULL. And all the logic would go into errorCheck. The way they are doing there, the same code is getting pasted all over the code base, which is like force inlining a function. This code isn’t so small and it’s probably called a bunch of times, so the compiler would probably not inline it, making the code smaller and faster due to less icache pressure. In Zig, you get these benefits without even realizing, by just doing it the convenient way.

neurocyte · November 30, 2023, 11:44am

What exactly is the challenge?

All that macro does is convert a C error return code to a C++ exception, which zig doesn’t use. It doesn’t actually handle the error. A function to convert napi_status to some Zig error is a one liner.

chung-leong · November 30, 2023, 9:20pm

The challenge here is that macros actually accept expressions as arguments. In the given example, it doesn’t make a difference since the expression is evaluated immediately in the macro. There are cases where evaluation needs to happen in the middle of the macro. For example, a C++ macro might need to do it inside a try/catch block.

Something like this would be more generalized:

const std = @import("std");
const napi = @import("./napi.zig");

const NapiError = error{NotOK};

fn call_api(env: napi.napi_env, comptime f: anytype, args: std.meta.ArgsTuple(@TypeOf(f))) NapiError!void {
    const status = @call(.auto, f, args);
    if (status != napi.napi_status.napi_ok) {
        var error_info: *const napi.napi_extended_error_info = undefined;
        _ = napi.napi_get_last_error_info(env, &error_info);
        _ = napi.napi_throw_error(env, null, error_info.error_message);
    }
}

pub fn main() !void {
    napi.memory_exhausted = true;
    const env: napi.napi_env = @ptrFromInt(1234);
    var result: napi.napi_value = undefined;
    try call_api(env, napi.napi_create_object, .{ env, &result });
}

From an ergonomic standpoint the syntax is not very good.

AndrewCodeDev · November 30, 2023, 10:05pm

First, to be clear, we’re investigating why the voting ability isn’t enabled for this thread - @dude_the_builder is looking into it… the voting plugin may be a bit more touchy the we realized… anyhow, that’s under investigation.

To the issue at hand though, you’ll notice that the macro also has a return statement in it.

This is an important thing to point out because this is why macros do things that create unexpected results. We aren’t returning from the macro itself, the macro is injecting the return statement into the surrounding code.

To replicate the behaviour of this macro @chung-leong, you’d also need to have a return statement that pushes an error through (you have it in the function return type, but I don’t see where that’s apparent in the call_api function that was provided). It may be more accurate to return an optional here because the original macro returns NULL (which is roughly our equivalent mapping). That said, this forces us to deduce the child type of the optional, so I think a singular error code would probably be better.

It’s also important to note that including expressions in the way you mentioned is actually another form of code injection. Those expressions will not evaluate first and then leave their results as the arguments of the macro (I’m sure you’re aware of this, @chung-leong, but I’m describing this for other participants), but instead they will be injected into the code directly.

To be clear, this is not behaviour that I believe Zig wants to replicate. Code injection with macros is one of the main reasons they are a more like a foot-cannon than a foot-gun.

So I propose we should be clear about the requirements here - is it ergonomics we’re after or better behaved error handling? Since we’re probably not in the business of code injection, I think we need to sacrifice one for the other.

I think your answer is the closest one so far with the addendum of anonymous structs that can define calling functions to be inlined at the call-site (and since we’re talking about code injection, changing the .auto to .always_inline is probably more accurate, too).

Sze · November 30, 2023, 10:47pm

I think my preferred way of doing this would be to create some static array of some zig data type, that describes all the functions their names, arguments and how their error handling needs to be managed.

That data is basically a description of the api, then write a custom build step that uses it to generate source code that implements exactly these functions. So this would basically be a binding that is generated at build time.

If that api already has some other description that specifies the api, maybe that build step could instead read/parse that description, instead of creating it’s own description.

Alternatively if you just need like 5 different functions of that api, I would just write those functions by hand, then looking at those it would be easier to see if there are places where repeating code can be pulled out into helper functions.
I find it is often times easier to start with non generic code and then make it more generic, than the other way around. When you start with all the concrete examples, you quickly notice when your idea of finding a generic solution doesn’t apply to one of those examples (or makes them awkward to use).

When you don’t have those examples written out, it is easy to think “this is easy we always need this” and then forget about some case, where you need something else, for example to make using those functions more friendly for the user.

This also is a bit of a taste question whether you prefer using very low level direct apis/bindings, or want something that is more adapted to the language.

chung-leong · November 30, 2023, 10:53pm

The goal here is not to replicate the macro’s behavior. The goal here is to handle potential errors with fewer lines.

The ideal solution would magically “zigify” the whole API so that you can do things the normal way. Something like this where you want to create an object:

   const object = try env.createObject();

I don’t think we can do this kind of transformation at comptime presently. Probably not even a good idea. I suspect it’d be super confusing when things are this magical. People would want to see actual code. Automated code generation (probably involving a scripting language) is probably a better solution here.

AndrewCodeDev · December 1, 2023, 1:19am

That, in my opinion, is one of the most important lessons I’ve repeatedly learned in the last 5 years. It seems so simple, but the road to complexity is paved with clever intentions. Currently, the most interesting form of coupling that I’m aware of is when all of the thinking surrounding a problem gets shaped by what generic functions are already written. The iterator model is a success story in many programming languages because it is a fleshed out perspective on a genuinely generic problem: what does it mean to traverse containers with different underlying data structures?

Yes, possible… but like @Sze said, what really may be the ultimate question here is what’s the intent and does that lend itself to a certain design? I’m going to keep thinking about this one because my impression of what this would require keeps changing as I think about it more. Honestly, I’ve liked your macro challenges so far… makes me think

Anyhow, sorry for the long replies - we’re bending the rules of this category a bit here, but once we figure out what’s up with the voting system and response style, we’ll go back to business as usual.