Extracting function logic

ericlang · December 9, 2024, 11:43pm

I have this struct which produces and stores ‘moves’ to a std.ArrayList.
But now I want to ‘extract’ this storage functionality to the outside.
Get rid of the internal arraylist and provide a store function on the outside.

But I just cannot get my syntax right. A little sketch down here.

const Producer = struct
{
   fn init () {...}
   fn producemoves(self)
   {
        // produce moves; for each move: self.store_move(move)
   }
   fn store_move(self, move)
   {
      // call outside function store_move(context, move)
   }
}

const Receiver = struct
{
    fn execute(self: *Receiver)
    {
        // create producer(Receiver??, store_move??)
        // producer.producemoves();
    }
    fn store_move(self: *Receiver, move: Move)
    {
        // store here / or count / or do whatever we want with the move.
    }
}

It must be some magic combination of comptime / anytype / generic.
I would be very thankful if someone could explain the secrets / syntax.
How to create a generic Producer with a ‘context’ which is in this case ‘Receiver’
and a pointer to its ‘store-move’ function.

pachde · December 9, 2024, 11:57pm

Why do you need this to be generic/dynamic at all?

Write it first with concrete types. You may find that for a generic/polymorphic version YAGNI.

ericlang · December 10, 2024, 8:16am

I am surely gonna need it.
In one case it is a randomgame player.
In another case an engine which only has to count the top 10 moves or store all the moves.
It is an abstract ‘event’: onStore.

Sze · December 10, 2024, 9:09am

IMHO it is still better to write the two concrete implementations and then figure out the generic interface from there.

Personally I would just start with:

const Producer = struct
{
   receiver:Receiver,
   fn init () {...}
   fn producemoves(self) 
   { 
        produce and foreach move do self.receiver.storeMove(move) 
   }
}

Then once you have a second one you can switch to:

const Producer = struct
{
   fn init () {...}
   fn producemoves(self, receiver:anytype) 
   { 
        produce and foreach move do receiver.storeMove(move) 
   }
}

Where you pass the pointer to the receiver as second parameter.
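
A minimal runnable sketch of this anytype variant follows. The Move type and the two receivers (Counter, BestKeeper) are made up for illustration and are not from the original code. The only requirement on the receiver is that it has a storeMove method, which the compiler checks per call site.

```zig
const std = @import("std");

// Hypothetical move type, for illustration only.
const Move = struct { score: u32 };

const Producer = struct {
    // `receiver` may be a pointer to any type with a
    // `storeMove(self, Move) void` method (checked at compile time).
    fn produceMoves(self: *const Producer, receiver: anytype) void {
        _ = self;
        var i: u32 = 0;
        while (i < 3) : (i += 1) {
            receiver.storeMove(Move{ .score = i });
        }
    }
};

// Receiver that only counts moves.
const Counter = struct {
    count: usize = 0,

    fn storeMove(self: *Counter, move: Move) void {
        _ = move;
        self.count += 1;
    }
};

// Receiver that keeps only the highest-scoring move.
const BestKeeper = struct {
    best: ?Move = null,

    fn storeMove(self: *BestKeeper, move: Move) void {
        if (self.best == null or move.score > self.best.?.score) {
            self.best = move;
        }
    }
};

pub fn main() void {
    const producer = Producer{};
    var counter = Counter{};
    var keeper = BestKeeper{};
    producer.produceMoves(&counter);
    producer.produceMoves(&keeper);
    std.debug.print("count={d} best={d}\n", .{ counter.count, keeper.best.?.score });
}
```

The producer never names a receiver type, so it knows nothing about what happens to the moves. The cost is one compiled copy of produceMoves per receiver type.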

andrewrk · December 10, 2024, 9:12am

It’s still too early to implement it, since you believe that you are surely going to need it, but you don’t need it yet. The point of YAGNI is that you benefit from never implementing something until the very moment that you actually truly really do need it right now.

ericlang · December 10, 2024, 10:18am

Some truth in that, but I mainly cannot get my head around it (generics / anytype / interfaces).
I am building a scrabble engine (comparable to a chess engine).
This Engine uses a MoveGenerator which has to pass an “I have produced a new move” notification to the Engine.
Sometimes we just count the moves, sometimes we need them all, sometimes we need only one, sometimes we need a filter. That is why I need the flexibility of a custom function inside the Engine struct.

edit: I never implement something I do not need

ericlang · December 10, 2024, 10:22am

Yes I think I will start with that for now.
Although I do not really like that the Receiver ‘uses’ the Producer and the Producer ‘uses’ the Receiver.

Sze · December 10, 2024, 1:41pm

Why would the receiver create the producer?

ericlang · December 10, 2024, 2:09pm

I don’t know yet if the structure is perfect, but you have to see it in an “Engine Iterative Deepening AlphaBeta Search Pruning” context.
It currently lacks some interface flexibility but it works.

Today (experimenting) I divided things into:

MoveGenerator
Just produces moves and calls a parameterized function from MoveHandler.
Contains a pointer to the MoveHandler and a pointer to one of MoveHandler’s functions, like store.

MoveHandler
Contains a movelist and has functions like:
fn store(self, move) // store in movelist
fn count(self, move) // just count, do not store
fn any(self, move) // please dear MoveGenerator quit if found one move
fn filter(self, move) // just take filtered moves

Engine
Performs a tree search. The engine creates a MoveHandler and a MoveGenerator.

var handler = MoveHandler.init();
var gen = MoveGenerator.init(board, &handler, MoveHandler.count);

or
var gen = MoveGenerator.init(board, &handler, MoveHandler.filter);

(and still looking for more comptime stuff, but it is kind of working now with same speed as before)
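
The handler-plus-function-pointer arrangement described above might look roughly like this. It is a sketch under assumptions: the Move type, the HandleFn alias, and the countMove / keepBest names are invented here, and the board parameter is left out to keep it self-contained.

```zig
const std = @import("std");

// Hypothetical move type, for illustration only.
const Move = struct { score: u32 };

const MoveHandler = struct {
    count: usize = 0,
    best: ?Move = null,

    // Just count, do not store.
    fn countMove(self: *MoveHandler, move: Move) void {
        _ = move;
        self.count += 1;
    }

    // Keep only the highest-scoring move.
    fn keepBest(self: *MoveHandler, move: Move) void {
        if (self.best == null or move.score > self.best.?.score) {
            self.best = move;
        }
    }
};

// Pointer to one of the MoveHandler functions, chosen at runtime.
const HandleFn = *const fn (*MoveHandler, Move) void;

const MoveGenerator = struct {
    handler: *MoveHandler,
    on_move: HandleFn,

    fn init(handler: *MoveHandler, on_move: HandleFn) MoveGenerator {
        return MoveGenerator{ .handler = handler, .on_move = on_move };
    }

    fn generate(self: *const MoveGenerator) void {
        var i: u32 = 0;
        while (i < 4) : (i += 1) {
            // Indirect call through the stored function pointer.
            self.on_move(self.handler, Move{ .score = i });
        }
    }
};

pub fn main() void {
    var handler = MoveHandler{};
    const counting = MoveGenerator.init(&handler, &MoveHandler.countMove);
    counting.generate();
    const filtering = MoveGenerator.init(&handler, &MoveHandler.keepBest);
    filtering.generate();
    std.debug.print("count={d} best={d}\n", .{ handler.count, handler.best.?.score });
}
```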

mnemnion · December 10, 2024, 3:25pm

My instinct is that you’re applying patterns from other languages, which won’t be a good fit for Zig. For instance, you can just have a pointer to a MoveHandler and call store with that pointer. It seems like you need some flexibility in which function of the MoveHandler is called: consider using an enum switch to decide that in a dispatch function.

As a general rule, you don’t need dispatch based on function pointers unless you have an open-world problem. That is, there’s code outside your control which should be able to implement a type-compatible interface, and that interface has to be stored, created, and passed around as a single concrete type rather than specialized with anytype. If the interface doesn’t need to be a single type, an anytype specialized at comptime can be a better choice. See Allocator for the classic example of the open-world pattern.

From what I can tell, your code is closed-world, all the variations you need will appear in code which you control. It looks like you’re translating code which used an interface in another language, you did some reading on how Zig does interfaces, and you’re trying to apply that knowledge. My additional tip is that Zig does interfaces (in the sense you’re implementing) rarely, because it has other ways of structuring code which are more efficient and less fragile.

ericlang · December 10, 2024, 5:08pm

Very true! Still I need the storage handling on the outside of the movegenerator, that is: the generator does not know anything about what is being done with the moves.

mnemnion · December 10, 2024, 5:21pm

If the MoveHandler needs to be doing only one thing at a time, then it can hold a state enum, and have a generic dispatch function which uses that information.

If you need to create the move conditions in a way which doesn’t affect the source, then you can have a MoveGenerator struct which holds a pointer to the MoveHandler and the action-specific state enum, then calls a similar dispatch function which passes in that state.

A nice thing about doing it this way is that the state enum is explicit data about what the generator is doing: it’s easy to log, and easy to use later if you discover that, wait, it’s actually helpful for the generator to be able to check what’s happening. You can just look at it when debugging and know what’s going on; it’s plain old data.

Function pointers make all of that hard, and limit optimization opportunities in the process.
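
A rough sketch of the runtime state-enum dispatch described above. The Mode enum, the Move type, and the fixed 16-slot buffer are assumptions made for illustration; a real engine would size and store its movelist differently.

```zig
const std = @import("std");

// Hypothetical move type, for illustration only.
const Move = struct { score: u32 };

// Plain data describing what the handler is currently doing.
const Mode = enum { store, count, any };

const MoveHandler = struct {
    mode: Mode,
    moves: [16]Move = undefined, // small fixed buffer, just for the sketch
    len: usize = 0,
    count: usize = 0,

    // Single dispatch function. Returns false when the generator may stop.
    fn handle(self: *MoveHandler, move: Move) bool {
        switch (self.mode) {
            .store => {
                if (self.len < self.moves.len) {
                    self.moves[self.len] = move;
                    self.len += 1;
                }
            },
            .count => {
                self.count += 1;
            },
            .any => {
                self.moves[0] = move;
                self.len = 1;
            },
        }
        return self.mode != .any;
    }
};

const MoveGenerator = struct {
    handler: *MoveHandler,

    fn generate(self: *const MoveGenerator) void {
        var i: u32 = 0;
        while (i < 5) : (i += 1) {
            if (!self.handler.handle(Move{ .score = i })) return;
        }
    }
};

pub fn main() void {
    var handler = MoveHandler{ .mode = .count };
    const gen = MoveGenerator{ .handler = &handler };
    gen.generate();
    std.debug.print("mode={s} count={d}\n", .{ @tagName(handler.mode), handler.count });
}
```

The generator still knows nothing about what is done with a move, and the mode can be printed or inspected in a debugger like any other field.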

ericlang · December 10, 2024, 8:25pm

Also very true, and about what I am looking for.

You mean the MoveHandler would be a struct with a comptime state?

fn MoveHandler(comptime state: SomeEnum) type
{
    return struct
    {
        const Self = @This();

        fn generic_do_something_with_move(self: *Self, move: *const Move) void
        {
            switch (state)
            {
                // do this or that depending on state?
            }
        }
    };
}

and pass this comptime thing to the MoveGenerator?

mnemnion · December 10, 2024, 8:47pm

Possibly. That will produce a unique MoveHandler type for each state. That might or might not work, the consequence will be viral: everything which accepts a MoveHandler will need to accept it as an anytype, and that will generate a version of the function which takes it for each sort of MoveHandler.

That may be good, actually, in terms of generating optimal code for each pathway. The binary will be larger but depending on how much time is spent inside each of those function calls, you might come out on top. It gets challenging though if you need to store the MoveHandler in another struct, because you can’t do that if you have say, five distinct MoveHandlers.

In that case you might want the switch to happen at runtime. It’s still likely to be better to have a conditional leading to known function addresses, rather than an implicit conditional (as far as the branch predictor is concerned) based on a runtime function address. No guarantees on that front, unfortunately the performance of modern CPUs can be pretty hard to reason about in advance when it comes to this type of tradeoff. A lot of it depends on how much time it spends in the switch vs. in the switched-to function, and how predictable the switch (or the destination address) is to the branch predictor.

I got the impression that you want the MoveHandler to be passed in and used, not stored by the struct which receives it. In that case, using a comptime enum to eliminate the switch, at the cost of several specializations, is likely to be your best-performing option.
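
One possible shape for that is sketched below, assuming a hypothetical Move type and Mode enum. Here the mode is a comptime parameter of the functions rather than of the handler type, which is a variation on the idea: MoveHandler stays a single type, and only the functions taking the mode are specialized.

```zig
const std = @import("std");

// Hypothetical move type, for illustration only.
const Move = struct { score: u32 };

const Mode = enum { count, best };

const MoveHandler = struct {
    count: usize = 0,
    best: ?Move = null,

    // `mode` is known at compile time, so the switch is resolved during
    // compilation and each specialization keeps only its own prong.
    fn handle(self: *MoveHandler, comptime mode: Mode, move: Move) void {
        switch (mode) {
            .count => {
                self.count += 1;
            },
            .best => {
                if (self.best == null or move.score > self.best.?.score) {
                    self.best = move;
                }
            },
        }
    }
};

// One specialization of `generate` is compiled per mode that is used.
fn generate(comptime mode: Mode, handler: *MoveHandler) void {
    var i: u32 = 0;
    while (i < 4) : (i += 1) {
        handler.handle(mode, Move{ .score = i });
    }
}

pub fn main() void {
    var handler = MoveHandler{};
    generate(.count, &handler);
    generate(.best, &handler);
    std.debug.print("count={d} best={d}\n", .{ handler.count, handler.best.?.score });
}
```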

ericlang · December 10, 2024, 8:59pm

Yep. I once wrote an insanely fast movegenerator in Rust for chess.
In there the entire optional movegeneration was monomorphized (generate captures, checks, evasions etc.) inside the MovGen. But there the movelist was a relatively small fixed array (easily fitting on the stack) in which I incremented a pointer when storing moves.

In scrabble there are sometimes more than 100,000 possibilities. And a move is 16 bytes instead of 2.

I think I more or less get the idea.
Will upload the stuff and come back here…

ericlang · December 10, 2024, 9:07pm

Something like this, but not yet perfect:

pub fn test_stuff() void
{
    var h = TestHandler(.Count).init();
    var g = MoveGenerator(.Count).init(&h);
    g.execute();
}

const TestState = enum
{
    Count,
    Any,
};

fn TestHandler(comptime state: TestState) type
{

    return struct
    {
        const Self = @This();
        count: u64 = 0,
        bestmove: EngineMove = EngineMove.EMPTY,

        fn init() Self
        {
            return Self {};
        }

        fn handle_move(self: *Self, move: *const EngineMove) void
        {
            switch (state)
            {
                .Any => { self.bestmove = move.*; std.debug.print("(ANY)", .{}); },
                .Count => { self.count += 1; self.bestmove = move.*; std.debug.print("(COUNT)", .{}); },
            }
        }
    };
}

fn MoveGenerator(comptime state: TestState) type
{
    return struct
    {
        const Self = @This();
        handler: *TestHandler(state),

        fn init(handler: *TestHandler(state)) Self
        {
            return Self { .handler = handler };
        }

        fn execute(self: *Self) void
        {
            for (0..3) |_|
            {
                self.store(&EngineMove.EMPTY);
            }
        }

        fn store(self: *Self, move: *const EngineMove) void
        {
            self.handler.handle_move(move);
        }

    };
}