Polymorphism, the Zig way

Hello dear Zig community,

I’m having fun on a Game Boy emulator toy project and looking for a convenient way to map an Op-Code to an instruction to execute.

Here is the best I can come with, playing around with polymorphism:

const std = @import("std");

pub const Abstract = struct {
    attr: i32,
    execute: *const fn () void,
};

const Instance1 = Abstract{
    .attr = 1,
    .execute = struct {
        fn hello() void {
            std.debug.print("Works but...\n", .{});
        }
    }.hello,
};

const Instance2 = Abstract{
    .attr = 1,
    .execute = struct {
        fn hi() void {
            std.debug.print("... not feeling confident\n", .{});
        }
    }.hi,
};

pub fn main() void {
    const byte: u8 = 0x00; // Do *not* assume comp-time known
    const instance: Abstract = switch (byte) {
        inline 0x00 => Instance1,
        inline 0x01 => Instance2,
        // ...
        else => {
            return;
        },
    };

    instance.execute();
}


but I’m still facing some issues:

  • I’m still not 100% sure of why it’s working (Zig beginner here).
    I cannot fully wrap my mind around the .execute in the instance.
    (sounds stupid but I cannot find the name of the struct {}.identifier construction; I think knowing it would help)
  • In the instance the execute pointer function is quite hard to read for me / unintuitive
    (if it’s a skill issue, that’s okay but if I’m just miss-using the language, I’d like to know)
  • I find this construction a little bit too verbose (which is basically fine)
    one thing triggering me is the need to name the function in the struct (maybe it can be anonymized)
  • My fear is missing something more fundamental here:
    Why would be a function pointer that different to handle than another field?
  • Is some hard to see overhead hidden in this implementation?

Some context may be helpful:
I know plain functions can be used and simply have the execute field take its address,
which I find a bit cumbersome but the real question is: Is it cumbersome or am I biased using for too long OOP languages?

Of course, if you think what I’m trying to do is an anti-pattern / not the Zig way, pointers to alternatives are welcomed.

Thank you and sorry because I’m pretty sure this was already asked / documented somewhere I could not find.

Tbh, I think that approach is a bit overengineered for a Z80 instruction decoder even if it compiles to optimal code, straight and simple switch-case where the opcode actions are directly in the case branch is usually better.

You could also consider an ‘algorithmic decoder’ where you directly look at the bit patterns of an opcode, I used that approach here: kc85.zig/src/emu/CPU.zig at 49e657479a7baeaed928abf73070c87818597b48 · floooh/kc85.zig · GitHub

(also see here for how that works: Decoding Z80 Opcodes)

I also have a ‘cycle-stepped’ Z80 emulator here, but that relies on code-generation and is probably overkill for a Gameboy emulator: chipz/src/chips/z80.zig at main · floooh/chipz · GitHub

(for how that works, see here: A new cycle-stepped Z80 emulator)

2 Likes

Thank you very much for the valuable pointers about the emulator
(the bit pattern switch-cases are indeed very interesting)!
I’m not flagging your answer as the solution as I’d still like to investigate the polymorphism implementation, even if it does not make it in the emulator, to have a better grasp on Zig.

Take a look at this exercise, it should be pretty close to what you are looking for: Ziglings #92

3 Likes

Indeed, this exercise is really close.
Bonus point: it’s a recurring pattern browsing around.

I’m just waiting a bit to pin a solution (would help if someone steps by to have a small opinion and / or breakdown of the weird construction I posted
especially the .execute = struct { fn self() void { } }.self,

BTW, thank you so much for zigling!

1 Like

This is a pretty standard pattern in Zig, nothing weird about it. Sometimes people opt to do it inline like this, others may opt for a private function definition within Instance1 and assign it by name without the encapsulating anonymous struct, but that is mostly a stylistic choice.

1 Like

I would put the instruction handlers in individual structs inside a namespace:

const instructions = struct {
    pub const LOAD = struct {
        pub const op_code = 0x01;
        pub const frequent = true;

        pub fn execute() void {
            // ...
        }
    }
    pub const STORE = struct {
        pub const op_code = 0x02;
        pub const frequent = true;

        pub fn execute() void {
            // ...
        }
    }
    // ...etc
}

Then I would use comptime to loop through the instructions. The frequent instructions would be handled using an inline loop:

    const decls = @typeInfo(instructions).@"struct".decls;
    inline for (decls) |decl| {
        const instr = @field(instructions, decl.name);
        if (instr.frequent and instr.op_code == byte) {
             instr.execute();
             break;
        }
    } else {
        const handler = handler_table[byte];
        handler();
    }

While the infrequent instructions would be handled through a function pointer lookup table, which is generated at comptime using information from the namespace.

The arrangement allows code that execute frequently to be inlined into the loop and take advantage of the CPU’s branch prediction unit.

Comptime allows you to do things you could never do in other languages. I would suggest that you don’t get too fixated on an approach you might have learned elsewhere.

3 Likes

I’m afraid this (functions inside instances, not types) is impossible:

const Instance1 = Abstract {
    .attr = 1,
    .execute = &hello,

    fn hello() void {
        std.debug.print("Works but...\n", .{});
    }
};
2.zig:12:5: error: expected field initializer
    fn hello() void {

And this is a bit unusual (and even somewhat interesting) - you define virtual method (execute) at instance level, not at “class” (type, structure level). And if I understand it right, the only way to do so is to have an anonymous struct with function inside and then accessing this only field immediately.

the function can come from anywhere, an anonymous struct is often used to define the function close to/inline where you use it.

/pedantic but useful for a later reader that is new to zig

I meant that function body can not be placed directly into an instance of a struct, only inside anonymous struct.

i know

i was trying to clarify this part, to prevent someone from misunderstanding and thinking you had to make an anonymous struct to get a function pointer, when in fact you dont.

it might seem obvious but I’ve seen experienced developers make similar mistakes before :3

1 Like

No doubt:

const VirtualMethod: type = *const fn() void;
const Abstract: type = struct {
    attr: i32,
    execute: VirtualMethod,
};

fn hello() void {
    std.debug.print("Works but...\n", .{});
}

const Instance1: Abstract = Abstract {
    .attr = 1,
    .execute = &hello,
};

O

Ok, I think we understood each other.

you define virtual method (execute) at instance level

Oh thank you, that was exactly the missing puzzle piece!

Just to make sure I got it: am I right to assume that in this case there is no hidden magic like VTable (if we forget about the optimizer)?
And therefore, every single instance has its own implementation of the method instead of a pointer right?

What an elegant solution! Thank you.
The introspection plus the beauty of mixing inline for and run-time dispatching!

Likely over-kill for what I want to implement but I don’t care and will probably go for it :slight_smile:
Anyway one of the main goal of this toy project is to dig in Zig.

I would suggest that you don’t get too fixated on an approach you might have learned elsewhere.

Valuable piece of advice; the main difficulty is know about being oriented because of some background.

I think that

  • vtable is not a magic at all
  • your execute function pointer is a sort of vtable (with one element)
  • but you assign this pointer for each struct instance, not for struct type, which (imho) is a bit unusual

Compare with @chung-leong example - concrete implementations are in structs (types). And in this example there is no vtable (a set of function pointers), instead meta-programming/comptime-type-reflection magic ( :slight_smile: ) is used.

Yes, they have, but you call them via execute function pointer anyway.

In this code

the part from struct up to .hi is coerced by compiler to function pointer.

1 Like

Also you can explicitly use & (address of) operator, like this:

    .execute = & struct {

In both cases .execute field will contain address of hello function, which is stored… mmm… somewhere.

1 Like

Why only inline the frequent instructions?

Seems to me the code this would generate is equivalent to something like a switch on the byte value, but instead of manually typing out each op_code, you comptime-generate the switch cases.

If you’re confident that the compiler would recognize that it’s switch structure and generate a jump table for it, then yeah, you can use a inline for to deal with the infrequent instructions too.

Arguably this talk could have been titled “polymorphism, the zig way”: Andrew Kelley - Practical DOD

For your exact use case though, I’m not convinced you need polymorphism. How about direct imperative code instead, completely avoiding virtual function calls? That’s generally going to generate better machine code as well as be more malleable source code.

5 Likes