Function Pointers

dude_the_builder · June 7, 2024, 1:45pm

If I can do field: fn(u8) bool, and also field: *const fn(u8) bool, then why should I choose one over the other? What are the use cases for each?

nyc · June 7, 2024, 2:42pm

I asked this same question a few months ago, and the answer I got was that
the function body will be known at compile time and can be inlined, but the pointer may only be known at runtime and so cannot be.

That just prompted the question: how about fn(u8) bool vs comptime *const f...), which I still don’t have an answer to.

And there seem to be cases where, if the pointer is taken locally and the surrounding code gets inlined, LLVM can devirtualize the call through the pointer and inline it in a limited scope.

So I'm still very unsure.

Syntactically, I’m sure you can do an array of function bodies though.

Sze · June 7, 2024, 5:26pm

~~I think it has to be a comptime tuple of function bodies, because they are treated as different types (at least I think so).~~

If the function signature is the same you can put them into an array, otherwise you could use a tuple.

haydenridd · June 7, 2024, 5:40pm

Yeah… I think they would presumably be different types if the function body signatures were different. I believe you could do an inline for if the function signature was uniform and you just wanted to iterate through different functions (of the same signature).

Sze · June 7, 2024, 5:47pm

You are right, I was wrong.

const std = @import("std");

fn foo() void {
    std.debug.print("foo\n", .{});
}
fn bar() void {
    std.debug.print("bar\n", .{});
}

pub fn main() !void {
    const BodyType = fn () void;

    const funcs = [_]BodyType{ foo, bar };
    inline for (funcs) |func| {
        func();
    }
}

I guess I was too used to using functions with different signatures.

nyc · June 7, 2024, 6:51pm

So you can do that. When I asked this last time, I thought I was told it wasn’t possible, but I never tested it.

Still not sure what the difference between a function body and a comptime function pointer is, though. Seems redundant.

dimdin · June 7, 2024, 7:48pm

Both are types. The difference is:

Zig function bodies are compile-time symbols that are resolved by the loader to an address.
Zig function pointers are runtime pointers to the starting address of the function.

C function pointers map to Zig function pointers; it is the same concept.
Zig function bodies are a comptime concept; the address of the function is not known at comptime because it is decided by the loader at run time.

Zig Language Reference / Functions

There is a difference between a function body and a function pointer. Function bodies are comptime-only types while function Pointers may be runtime-known.

EDIT: Note the phrase “function pointers may be runtime-known”. This is the most confusing part: it means that sometimes Zig treats function pointers as comptime function bodies,
e.g. this inline for works

    const funcs = [_]*const fn () void{ foo, bar };
    inline for (funcs) |func| {
        func();
    }
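A small sketch that may help separate what inline contributes here from what the element type contributes. The names sumAll, ptrs and bodies are made up for illustration. The assumption being illustrated is that an array of function pointers is an ordinary runtime-indexable array, so a plain for is enough, whereas an array of function bodies is comptime-only and needs inline for (I would expect a plain for over bodies to be rejected by the compiler).

```zig
const std = @import("std");

fn foo() u8 {
    return 1;
}

fn bar() u8 {
    return 2;
}

// Hypothetical helper: calls every function once and adds up the results.
fn sumAll() u8 {
    var sum: u8 = 0;

    // Function pointers are regular values, so a plain `for` works.
    const ptrs = [_]*const fn () u8{ foo, bar };
    for (ptrs) |p| sum += p();

    // Function bodies are comptime-only, so the loop is unrolled at
    // compile time with `inline for`.
    const bodies = [_]fn () u8{ foo, bar };
    inline for (bodies) |f| sum += f();

    return sum;
}

pub fn main() void {
    std.debug.print("{d}\n", .{sumAll()});
}
```

So inline for working on a pointer array does not by itself show that the pointers are being treated as bodies; the unrolling simply makes each element comptime-known inside the loop.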

dude_the_builder · June 7, 2024, 8:10pm

Your post reminded me that somewhere I had read about this distinction when it appeared as a change in the language. After some digging, I found it: 0.10.0 Release Notes - Function Pointers

In Summary

If you need to store a function in a variable or field and you will only know which function it will be at runtime, you must use a function pointer type (i.e. *const fn() void). Otherwise the function to be stored is known at compile time, so you can use a function body type (i.e. fn() void).
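To make the summary concrete, here is a minimal sketch using struct fields, as in the original question. The types Static and Dynamic and the functions isEven and isOdd are invented for this example. It assumes a struct with a function body field becomes comptime-only, while a struct with a function pointer field can be reassigned at runtime.

```zig
const std = @import("std");

fn isEven(x: u8) bool {
    return x % 2 == 0;
}

fn isOdd(x: u8) bool {
    return x % 2 == 1;
}

// Function body field: the value must be known at compile time.
const Static = struct { check: fn (u8) bool };

// Function pointer field: the value may change while the program runs.
const Dynamic = struct { check: *const fn (u8) bool };

pub fn main() void {
    // Fine: the initializer is comptime-known.
    const s = Static{ .check = isEven };

    // Fine: a pointer field can be swapped for another function later.
    var d = Dynamic{ .check = isEven };
    d.check = isOdd;

    std.debug.print("{} {}\n", .{ s.check(4), d.check(4) });
}
```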

nyc · June 7, 2024, 8:53pm

This post didn’t clear up anything for me. Can you rephrase it maybe?

How is a comptime concept reliant on a runtime value?
“the address of the function is not known at comptime” is the same for both function pointers and function bodies then, so that seems to say they are the same.

it means that some times zig treats function pointers as comptime function bodies

that to me says that “sometimes” they are the same thing – when is that sometimes?

this inline for works

What is inline doing there? Since the array size is compile-time known, I would think it is just force-unrolling the loop and has nothing to do with its contents.

Here is my current best understanding (very vague):

function body (always comptime) can be inlined but doesn’t have to be, but will always be a static call (even if it needs to be patched by the loader).

pointer to function can be inlined in some situations (not sure what those are still), but if it can't, it will use the pointer at runtime to dynamically call the function (through a dynamic address, not the PLT).

comptime pointer to function- no fucking clue lol.

dude_the_builder · June 7, 2024, 9:20pm

Consider this code:

const std = @import("std");

fn foo() bool {
    return true;
}

fn bar() bool {
    return false;
}

fn call(
    a: fn () bool,
    b: fn () bool,
    coin: bool,
) bool {
    const c: *const fn () bool = if (coin) a else b;
    return c();
}

pub fn main() !void {
    const r = comptime call(foo, bar, true);
    std.debug.print("{}\n", .{r});
}

Several scenarios:

The one shown: In call, c is a function pointer type. In main you call call with the comptime keyword forcing a comptime context. It works. So a function pointer type can point to a function known at comptime.
Remove the comptime keyword in the call in main. It works. So the function being called is determined at runtime via the if and the function pointer can point to either foo or bar.
Change *const fn to just fn in call and keep the comptime keyword in main. It works. The function body type is known at comptime even if there’s an if conditional. Everything is comptime known.
Change *const fn to just fn in call and remove the comptime keyword in main. It doesn’t compile. You get a “comptime type depends on runtime control flow” error. So the function to be called is not comptime known and can only be stored as a function pointer type.

At least this is how I understand it. Hope it helps.
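The same four scenarios can also be read as two separate functions, which might make the rule easier to see. This is only a sketch: callStatic and callDynamic are hypothetical names, and the coin parameter is marked comptime in the first one so that the if is resolved at compile time.

```zig
const std = @import("std");

fn foo() bool {
    return true;
}

fn bar() bool {
    return false;
}

// coin is comptime-known, so the result can stay a function body.
fn callStatic(comptime coin: bool) bool {
    const c: fn () bool = if (coin) foo else bar;
    return c();
}

// coin is only known at runtime, so the choice must be stored as a pointer.
fn callDynamic(coin: bool) bool {
    const c: *const fn () bool = if (coin) &foo else &bar;
    return c();
}

pub fn main() void {
    var coin = true;
    _ = &coin; // keep coin a runtime value for the purposes of the sketch
    std.debug.print("{} {}\n", .{ callStatic(false), callDynamic(coin) });
}
```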

dimdin · June 7, 2024, 9:20pm

If you run an executable under a debugger, the main function starts at a concrete address. The OS loader decided where to load the program; before that runtime loading of the program, no one knew the value of the address of main.

When you compile the program, the address of main is not known. At compile time, main is just a symbol for the body of the main function. The compiler and linker generate an executable that marks where main starts and all its references (this is how the OS loader actually fixes up the addresses so that function calls work).

nyc · June 7, 2024, 9:47pm

I generally know how the linker and loader work, and those details are the crux of the confusion about which one you should choose, and whether it even makes a difference sometimes.

There are essentially 3 ways to call a function at a call site (mostly everything else is a derivative of these).

1. It can be inlined at compile time. There’s no jump, no call instruction. The optimizer gets to it and can do its magic through the lexical function boundary. This is generally the fastest case, what you want for the hot path, but also the most restrictive (the function body needs to be known at the same time for this to happen).
2. It can generate a static symbol, and when the program is loaded that static symbol gets patched right before it starts to execute. Here the function body doesn’t need to be known, but the name needs to be known so the compiler can produce a symbol for it. This is great for functions not in the hot path (doesn’t pollute the L1i cache, keeps the hot path straight when not called). When run, the CPU sees a fixed address, as if the compiler knew it itself.
3. It can use a dynamic call where just the address is known, and the knowledge of the actual function has been lost. This is how virtual calls and callbacks usually work. Sometimes the compiler can devirtualize, or has enough information to know what is behind the pointer and do #1 or #2, but only in fairly trivial cases where the pointer provenance is traceable.
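For reference, here is roughly how those three kinds of call site could be spelled in Zig source. This is a sketch only: square and demo are made-up names, the @call modifier syntax shown assumes Zig 0.11 or later, and what the optimizer finally emits (especially for the pointer call, which it may devirtualize) is not guaranteed by any of this.

```zig
const std = @import("std");

fn square(x: u32) u32 {
    return x * x;
}

// Hypothetical helper showing one call of each kind.
fn demo(n: u32) [3]u32 {
    // Case 1: ask for the body to be inlined at this call site.
    const inlined = @call(.always_inline, square, .{n});

    // Case 2: ask for a real call to the statically known function.
    const direct = @call(.never_inline, square, .{n});

    // Case 3: call through a pointer stored in a mutable variable.
    var p: *const fn (u32) u32 = &square;
    _ = &p;
    const indirect = p(n);

    return [3]u32{ inlined, direct, indirect };
}

pub fn main() void {
    const r = demo(7);
    std.debug.print("{d} {d} {d}\n", .{ r[0], r[1], r[2] });
}
```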

So the question is what produces what at an instruction level.

Will a function body ever produce #3, or do you always need to turn it into a pointer?
What does a comptime *const fn produce, and how is that any different from either a *const fn or a fn body? Does it basically cover both cases? Why would I ever not use comptime *const fn, because it seems like it does both.
In what cases is *const fn inlineable, and are those cases any different from comptime *const fn (i.e., are there any cases where, all else being equal, one differs from the other; can the code ever change just by adding or removing comptime if both signatures compile)?

I wish there were just some doc or description of these somewhere.

AndrewCodeDev · June 7, 2024, 10:00pm

I’m curious about something similar here. I’d like to know what is the difference between these two:

comptime bar: *const fn (usize) bool

// vs...

comptime bar: fn (usize) bool

Both have ultimately generated identical assembly for me. I get that there may be something different about when and how they get resolved (if that’s meaningful, please explain) but from a practical standpoint, I don’t see there being a difference. By that, I mean I wouldn’t recommend one over another if asked which to use at my current state of understanding.

Maybe I just haven’t tried an example that’s convoluted enough to cause the compiler to change output.
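For anyone who wants to experiment with this comparison, a minimal test bed might look like the following. The names isZero, viaBody, viaPtr and viaRuntimePtr are invented here; the third variant is included only as a baseline with a plain runtime pointer. The sketch makes no claim about the generated assembly, it just shows that all three forms compile and agree on the result.

```zig
const std = @import("std");

fn isZero(x: usize) bool {
    return x == 0;
}

// Comptime function body parameter.
fn viaBody(comptime bar: fn (usize) bool, x: usize) bool {
    return bar(x);
}

// Comptime function pointer parameter.
fn viaPtr(comptime bar: *const fn (usize) bool, x: usize) bool {
    return bar(x);
}

// Runtime function pointer parameter, for comparison.
fn viaRuntimePtr(bar: *const fn (usize) bool, x: usize) bool {
    return bar(x);
}

pub fn main() void {
    std.debug.print("{} {} {}\n", .{
        viaBody(isZero, 0),
        viaPtr(&isZero, 0),
        viaRuntimePtr(&isZero, 0),
    });
}
```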