Is it safe to cast function pointers like this?

IntegratedQuantum · August 8, 2023, 6:57am

I have a gui system that uses callbacks to handle events. A Callback consists of a function pointer and some user data:

pub const Callback = struct {
	callback: *const fn(usize) void,
	arg: usize = 0,

	pub fn run(self: Callback) void {
		self.callback(self.arg);
	}
};

Now I wonder if it is safe to do something like this:

fn callbackFunction(ptr: *SomeStruct) void {...}

	const callback = Callback {
		.callback = @ptrCast(&callbackFunction), // casts fn(*SomeStruct) to fn(usize)
		.arg = @intFromPtr(...) // also casts the argument to usize
	};
	... = Button.init(..., callback);

	callback.run(); // Somewhere inside the Button struct

Intuitively I think this should work, given that usize and pointer types have the same size and alignment. But is this safe? Does this work everywhere or just on my machine? Am I triggering some undefined behavior?

neurocyte · August 8, 2023, 7:36am

I don’t think it’s UB, although I’m not entirely certain. It’s a big cannon sized foot gun though and I recommend you avoid it. You are basically type erasing a function pointer, which can break things in ways that are horribly difficult to debug if either the source or the target function type ever changes in an incompatible way.

I suggest you avoid casting the function pointer and instead cast just the argument. If needed, you can generate a wrapper function with the correct type that does the cast.

Also, I suggest you use *anyopaque for the argument type if you expect them to commonly be type erased pointers.
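A minimal sketch of such a generated wrapper, assuming the argument is always a mutable single-item pointer. The `init` helper and the `Counter` and `onClick` names are hypothetical, made up for illustration; they do not come from the original code. Only the argument is type erased, and the function pointer keeps its real type:

```zig
const std = @import("std");

pub const Callback = struct {
    callback: *const fn (*anyopaque) void,
    arg: *anyopaque,

    /// Hypothetical helper: wraps a typed function so that only the
    /// argument is type erased, never the function pointer itself.
    pub fn init(comptime T: type, comptime func: fn (*T) void, ptr: *T) Callback {
        const Wrapper = struct {
            fn call(erased: *anyopaque) void {
                // The only cast lives here, generated once per (T, func) pair.
                const typed: *T = @ptrCast(@alignCast(erased));
                func(typed);
            }
        };
        return .{ .callback = &Wrapper.call, .arg = ptr };
    }

    pub fn run(self: Callback) void {
        self.callback(self.arg);
    }
};

// Hypothetical user code, for illustration only.
const Counter = struct { clicks: u32 = 0 };

fn onClick(counter: *Counter) void {
    counter.clicks += 1;
}

pub fn main() void {
    var counter = Counter{};
    const cb = Callback.init(Counter, onClick, &counter);
    cb.run();
    std.debug.assert(counter.clicks == 1);
}
```

Because `func` is declared as `fn (*T) void` and `ptr` as `*T`, passing a mismatched function and pointer should be rejected at compile time, so no manual type check is needed in this sketch.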

IntegratedQuantum · August 8, 2023, 7:53am

> It’s a big cannon sized foot gun though and I recommend you avoid it

I plan to wrap this into a generic function that would check if the argument types of the function are valid. So unless it’s undefined behavior I think I should be safe with that.

> I suggest you avoid casting the function pointer and instead cast just the argument.

Yeah, that’s what I’m doing right now. It’s not really ergonomic though.

> I suggest you use *anyopaque for the argument type if you expect them to commonly be type erased pointers.

Yeah, I was considering it. But the thing is that I commonly just use a number. And honestly it just feels weirder to cast an anyopaque pointer to a number than to cast a number to a pointer.

AndrewCodeDev · August 8, 2023, 10:57am

Can you explain what you mean when you say it’s not ergonomic? What problem are you trying to solve here in terms of ergonomics? Because interoperability with other existing code could be hindered by this quite a bit, especially given that the validity of this approach is implementation defined for C code.

I personally see it the other way around - pointers dereference numbers all the time. Here’s why I see it differently:

Let’s take foo(x) as an example. Now, x is a usize argument… usually, I would expect this to calculate something… like a factorial operation or the like. I would be very surprised to find out that x is being cast to a pointer and then used as a handle to a struct.

Now, let’s say x is an *anyopaque (or for the C folks, a good ol’ fashioned void pointer). I would fully anticipate that foo is going to cast that argument at some point. Like… almost no doubt.

So can you help me understand what the ergonomic issue here is?

IntegratedQuantum · August 8, 2023, 11:53am

> Can you explain what you mean when you say it’s not ergonomic?

What I mean is that it is annoying having to @ptrFromInt or @ptrCast the argument at the beginning of all my callback functions. It also feels unsafe.

Maybe it’s best if I give you an example of what I currently have and what I want to have:

// Just a random example:
// When I click the button it should open the world with the given name
fn openWorld(namePtr: usize) void {
	const nullTerminatedName: [*:0]const u8 = @ptrFromInt(namePtr);
	...

// When creating the button I give it a callback with function ptr and name:
Button.initText(..., .{.callback = &openWorld, .arg = @intFromPtr(name.ptr)})

Now instead I would like to do something like this:

fn openWorld(nullTerminatedName: [*:0]const u8) void {
	...

// When creating the button I give it a callback with function ptr and name:
Button.initText(..., Callback.init(&openWorld, name.ptr))

Here Callback.init does the function pointer cast and would also do some safety checks, making sure that @TypeOf(name.ptr) matches the first parameter type of openWorld (for example via std.meta.ArgsTuple(@TypeOf(openWorld))).

To me this would be more ergonomic because I don’t need to do any casting (only once inside the init function), and I’m probably even safer because I cannot mess up by, for example, accidentally passing a non-terminated pointer.

> this approach is implementation defined for C code.

I only want to use this inside of my zig project. Is this undefined behavior for Zig as well?

> Now, let’s say x is an *anyopaque (or for the C folks, a good ol’ fashioned void pointer). I would fully anticipate that foo is going to cast that argument at some point. Like… almost no doubt.

I agree, but currently the majority of use-cases just store a number (like an index into some list) in the callback. To me it just feels wrong needing to cast it to a pointer when creating the callback. And honestly I’m a bit afraid that some future null safety check would screw me over if I did.

dee0xeed · August 8, 2023, 12:02pm

Oh, yes, the good old “universal” pointer (void*), i.e. a pointer to anything, it’s our everything

    fn workD1(sm: *StageMachine, src: ?*StageMachine, dptr: ?*anyopaque) void {
        _ = src;
        var me = @fieldParentPtr(Worker, "sm", sm);
        var io = util.opaqPtrTo(dptr, *EventSource);
        // interpret this as you want in this particular place

kristoff · August 8, 2023, 4:03pm

I think you can prevent certain optimizations by conjuring a pointer out of a number as the compiler has to fall back to maximally conservative assumptions when that happens.

AndrewCodeDev · August 8, 2023, 6:39pm

I’m having trouble locating documentation relevant to your use case - I’ve found several lateral bugs that have been fixed. The documents are rather thin regarding this issue you’re facing and I’ve spent a good chunk of time looking at this point.

I’m going out on a limb and saying that the actual pointer casts are not UB - you’re covering the most common pain point already by using a usize (much of the “implementation defined” stuff concerns the actual integer size itself).

Calling the function pointer itself may actually be UB though - again, I can find direct C documentation that basically tries to prohibit what you’re doing but there’s not a lot of obvious Zig documentation on it that I can find.

In general, I would avoid doing this. First, as @kristoff mentioned, you could lose out on some optimizations (however, you’re in type-erasure territory so performance is already taking a big hit). I’m going to take this even further though and say that you’ll almost always get better language support (including bug fixes) if you do things in a more recognizable way.

IntegratedQuantum · August 8, 2023, 9:06pm

Thanks for taking the time to research this @AndrewCodeDev
I really appreciate it!

I think I’ll try to go with wrapper functions then, like neurocyte suggested.
To avoid needing to cast between usize and pointer types I think I can just use a union of usize and *anyopaque.
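A rough sketch of that union idea, under a few assumptions: the union is tagged here (a bare union would also fit the description), and the `CallbackArg`, `selected_slot` and `selectSlot` names are hypothetical, chosen only for illustration. A callback that just needs a number can read it directly, without any integer-to-pointer cast:

```zig
const std = @import("std");

// Assumption: tagged union, so the active field can be checked.
pub const CallbackArg = union(enum) {
    index: usize,
    ptr: *anyopaque,
};

pub const Callback = struct {
    callback: *const fn (CallbackArg) void,
    arg: CallbackArg = .{ .index = 0 },

    pub fn run(self: Callback) void {
        self.callback(self.arg);
    }
};

// Hypothetical callback that only needs a number, e.g. an index into a list.
var selected_slot: usize = 0;

fn selectSlot(arg: CallbackArg) void {
    // Reads the numeric field; no @ptrFromInt or @intFromPtr involved.
    selected_slot = arg.index;
}

pub fn main() void {
    const cb = Callback{ .callback = &selectSlot, .arg = .{ .index = 3 } };
    cb.run();
    std.debug.assert(selected_slot == 3);
}
```

Callbacks that need a pointer would store it in the `ptr` field instead and cast it back inside a wrapper function.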

neurocyte · August 9, 2023, 9:10am

There is a good article on how to generate the sort of wrapper functions we are talking about, and write interfaces in general. I’ll just leave it here for future reference.

Zig Interfaces for the Uninitiated

See especially the implementation of Iterator.init().