Switch based polymorphism vs Pointer based. Which is more efficient?

In my learning about Zig, I have learnt about two ways to approach polymorphism. (More definitely may exist that I don’t know about yet)

Now I am wondering what are the pros/cons of these two approach.

The first approach is based on making use of switch. For example:


const Dog = struct {
    fn bark(self: Dog) void {
        _ = &self;
        std.debug.print("{s}\n", .{"WooF"});
    }
};

const Cat = struct {
    fn meow(self: Cat) void {
        _ = &self;
        std.debug.print("{s}\n", .{"Meeaow"});
    }
};

const Pet = union(enum) {
    dog: Dog,
    cat: Cat,
};

pub fn speak(self: Pet) void {
    switch (self) {
        .dog => |dog| dog.bark(),
        .cat => |cat| cat.meow()
    }
}

pub fn main() !void {
    const pet1 = Pet {.dog = Dog{}};
    speak(pet1);

    const pet2 = Pet {.cat = Cat{}};
    speak(pet2);
}

The seond method uses pointers to emulate interfaces. Most recent pattern I know of is from here Generic Programming and anytype

const PetInterface = struct {
    obj_ptr: *anyopaque,
    func_ptr: *const fn (ptr: *anyopaque) void,
    pub fn speak(self: PetInterface) void {
        self.func_ptr(self.obj_ptr);
    }
};

const DogIntance = struct {
    fn bark(ptr: *anyopaque) void {
        const self: *DogIntance = @ptrCast(@alignCast(ptr));
        self.do_bark();
    }

    pub fn speak(self: *DogIntance) PetInterface {
        return PetInterface{
            .obj_ptr = self,
            .func_ptr = bark,
        };
    }

    fn do_bark(self: DogIntance) void {
        _ = &self;
        std.debug.print("{s}\n", .{"WooF"});
    }
};

const CatIntance = struct {
    fn meow(ptr: *anyopaque) void {
        const self: *CatIntance = @ptrCast(@alignCast(ptr));
        self.do_meow();
    }

    pub fn speak(self: *CatIntance) PetInterface {
        return PetInterface{
            .obj_ptr = self,
            .func_ptr = meow,
        };
    }

    fn do_meow(self: CatIntance) void {
        _ = &self;
        std.debug.print("{s}\n", .{"Meow"});
    }
};

pub fn speak2(self: PetInterface) void {
    self.speak();
}

pub fn main() !void {
    var dog = DogIntance {};
    speak2(dog.speak());

    var cat = CatIntance {};
    speak2(cat.speak());
}

My question now is, without running benchmark but soley based on analysis, which of these approach is better in terms of performance and resource usage.

In terms of readability I slightly prefer the first approach, but I also doubt if it would scale?

Thoughts?

2 Likes

This video is a bit old but still offers some relevant insight into this question. It also shows some benchmark comparisons of the different interface implementations.

The way I see it, the tagged union approach should be the default, go-to, interface implementation method. It’s easy to read, write, and should perform very well since it’s a more static type of dispatch where all the implementations are known at compile time. The downside is that it’s a closed model, where you have to have control over the interface’s source code and modify it directly to add new implementations. Like you said, it doesn’t scale well to general purpose, library-type interfaces.

The *anyopaque / vtable method does scale because it’s an open model, where client implementations don’t need to touch the interface source code. But IMO it’s harder to read, write, and could incur a performance penalty since it’s basically runtime only dispatch.

As always in software dev, tradeoffs, tradeoffs.

8 Likes

Also, don’t forget that Zig has a third option. Reflection based static dispatch. Maybe you could also call it static duck typing dispatch. This is likely the fastest of the three and also an open model, but entirely static.

pub fn speak(self: anytype) void {
    const Self = @TypeOf(self);
    return if (@hasDecl(Self, "bark"))
        self.bark()
    else if (@hasDecl(Self, "meow"))
        self.meow()
    else
        @compileError(@typeName(Self) ++ " does not bark or meow");
}
7 Likes

Still yet to truly grok comptime, but the fact that the above can be done at compile time makes me pause…and think to myself: maybe there is indeed something amazing in comptime.

Switch is only good for ahead-of-time-known variants. If you need to program against interface (real polymorphism, and there are many real use-cases like io readers/writers, db drivers, auth modules, build system jobs, canvas backends, http clients for mocking, …) then pointers/vtables are your only choice.

2 Likes

but we can inject implementations at comptime. the module can have a single setup point that will accept an array of a type and implementations of polymorphic functions and generate a union for internal use. it can have the added benefit that we can observe all types participating in an interface in one place, making the system easier to grok.

2 Likes

Very interesting. It would be really nice to see some sample coda that does that, or maybe some existing code that’s doing something similar.

2 Likes

Performance should be more stable with switch than pointer. Indirect branch prediction requires more silicon space than regular branch prediction since we need to store the full address. Weaker CPU cores might employ the simple strategy of returning the last address seen. That’s adequate a lot of the times. In a situation where the code actually does alternate between different jump targets performance would be quite poor.

I vaguely recall reading a paper some time ago about a profile-driven optimization technique whereby virtual function calls would get replaced with simple branches.

Welcome to Ziggit @cztomsik!

There is another choice, which is anytype. The tradeoff there is a) each specialization of the function has to be compiled for the provided type, if there are a lot of types that can bloat the program and b) you can’t dispatch with runtime genericity, as in the case where neither the calling site or the function know the type. For that you do need something like a vtable, the Allocator interface is the prime example of that approach. I can hand, say, an *ArrayList(u8) to a function, that function can call array_list.deinit(), and the ArrayList doesn’t know, or have to know, the specific type of allocator which the Allocator vtable was derived from.

1 Like

I actually do exactly this in my CLI library to allow for parsing and collection of user arguments into nearly any Zig Type. The Union itself can be seen here and there’s an explanation on the library’s wiki page here.

2 Likes

for event processing, one guidline is to not place fn in structs. this help us to think about the fn separate from the data.

for awareness, a hard plugin-style pattern:

  • define external fn
  • runtime event 1 - link to a shared object
  • runtime event 2 - attach SO to extern functions
  • runtime event 3 - invoke fn

a soft plugin-style vtable pattern - structure of fn pointers, a hashmap of int or string to fn structure

  • ‘comptime’ event 1 - build hashmap (construct struct of fn, add to hashmap w/key)
  • ‘comptime’ event 2 - derive hashmap key - invoke needed fn

part of the use case is - our-land vs user-land-system (data export to runtime format) or user-land-configuration (think button mapping on a game controller) will lead us to one paradigm; however, if we are not in user-land, then always ask ourselves …

why do we need virtual fn?

This comment may be a bit off-topic, but if your use-case allows using union-switch approach, and if there are many objects you want to call speak with, you may also consider data-oriented approach where you keep separate arrays for each type. So, if with union-switch approach you have something like that

const pets: []Pet = ... // a lot of pets
for (pets) |pet| {
    pet.speak();
}

you may consider this instead

const dogs: []Dog = ... // all dogs are here
const cats: []Cat = ... // all cats are here
for (dogs) |dog| {
    dog.bark();
}
for (cats) |cat| {
    cat.meow();
}

In theory it should be better, because CPU doesn’t have to guess the correct branch each iteration, because there is no branching.

The con is that your use-case has to allow iterating all objects of one type before all objects of another type. If you need interleaving order, then const pets: []Pet may be an appropriate solution.

If you have a single Pet object (or at least not many of them), then union-switch approach is fine.

For selecting the approach, I would go like this:

  1. Try data-oriented design.
  2. If it’s not appropriate (for example, you need an interleaving order), try something like union-switch.
  3. If it’s not appropriate (for example, you need to load implementation of some interface from a shared library at runtime), use pointers.

The ideal is trying different approaches and benchmarking them in action on your particular use-case.

3 Likes

It looks to me with this data-oriented approach the whole idea of “polymorphism” is thrown out of the window, and, the gain would be performance, because the data locality ensures the CPU will be able to process things faster?

I have not read up on data oriented design yet so pardon if I misinterpreted the suggestion.

1 Like

This is correct. But the observation is: if you are able to list all your types in the union (i.e., all types are known at compile time), then you are able to list all arrays (cats and dogs) as well. You may define a struct that holds the lists. You may also define a function that iterates over all these lists (the analogue of speak that takes these lists). So, on the user side it may look something like

var pets: Pets = .{} // holds arrays of cats and dogs
pets.addDogs(...); // you may add functions like this for convenience
pets.addCats(...);
pets.speak(); // all dogs bark, then all cats meow.
4 Likes