Recursive calling strategy

I have this recursive move generator function. The max depth is between 40 and 45.

fn gen
(
    self: *MoveGenerator, // mutable self
    board: *const Board, // immutable board
    square: Square, // 1 byte  -> new square will be passed on
    node: Node, // 10 bytes -> new node will be passed on
    rack: Rack, // 8 bytes -> new rack will be passed on
    move: Move, // 24 bytes -> new move will be passed on
    path: Path, // 32 bytes -> new path will be passed on
    comptime inputflags: GenFlags, // u2
    comptime ori: Orientation, // 2 distinct values
    comptime dir: Direction // 2 distinct values
) void

and was wondering what would be the best strategy.

  1. keep as is
  2. put the mutable vars in a struct
  3. put the mutable vars in a struct and maintain my own list (max 42 deep) on the stack. and pass a depth int + pointer.

The thing is: not always will the “mutables” to pass on be changed, but just passed on unchanged.
I have the suspicion that the compiler will outsmart me and tend to choose option 1.

What you intelligent people think of that?

1 Like

Solutiions 1. and 2. should be basically the same, maybe with 2. the data might be a bit smaller, because zig is allowed to reorder struct members, but maybe it does the same thing for function arguments too?
3. seems worse to me. It needs more memory on the stack, since you end up passing an additional 16 bytes to the function. And you are making it harder to read, both for yourself and the compiler.

If possible, the best strategy for recursion is generally to not use recursion at all. Recursion has inherent overheads, since it needs to duplicate constant function arguments for every call.
But it is rather difficult to go from recursive to iterative functions, since you basically end up simulating your own stack and return addresses. Furthermore the gains get smaller the more expensive your function body is. So in practice it is often just not worth the effort. Judging by your function signature it likely falls into that category.

I think you should spend your energy instead on optimizing the function body itself, instead of focusing on the recursive function call here.

1 Like

Thank you.

Yes the recursiveness is as good as a must I believe. In reality there are 2 recursive functions calling eachother. I will not touch that for now.

  1. seems worse to me. It needs more memory on the stack

It actually does because I need to allocate the max depth in an array on the stack, before going recursive.

I will focus on the correctness and speed first and show it later on…

Chess programs maintain a stack I know…

1 Like

This seems overly proscriptive to me. It’s true that you don’t want to borrow a page from functional programming and write “iteration is recursion” code in Zig.

But compilers can be pretty smart about recursion, small constants (some value of ‘small’) can be kept in registers, larger struct parameters can be passed by *const and I see that as best practice now given the uncertain status of Parameter Reference Optimization.

The additional headache of recursion is ensuring that recursive calls will be reasonably bounded. That’s stricter than making sure that while loops have a termination condition in all cases, because one might have limited control over how deep in the stack a function call is triggered. It’s a hard maximum of 8MiB of headroom (or whatever the platform number is), the actual amount available can be much smaller.

The closer to the tail of the function that the recursion happens, the more room the compiler has to minimize stack use, since anything which is never referenced after the recursive call can be recycled.

Zig also has tail call elimination, which can be forced with @call and .always_tail, so you can actually do the ‘iteration is recursion’ pattern if it makes sense (and ensure that you actually get a tail call).

I agree with all of your specific advice to the OP actually, I’m more speaking to the help topic itself.

2 Likes

That sounds advanced…

I noticed a bit of speedup after making parameters *const for structs > 8.
I think the famous chess engine Stockfish minimizes recursion and keeps its own stack with a abundant pointer festival.

it would be interesting to see what would happen if i rewrite the 3 recursive functions
in here
into one monolithic big non-recursive function as an experiment.

1 Like

It’s not the first step you’d want to take, no. But it can make a major difference for certain kinds of code.

If you can afford to optimistically mark the opp as processed, then this call can be moved to the tail position, and you can at least try to force the issue with @call and .always_tail.

For LLVM / Zig, whether a call can be a tail call is more complex than it is in Lua or Scheme, so that may not actually work. Specifically it can have problems with mutually-recursive cycles, and fair warning, the error you’ll get if .always_tail isn’t possible is fairly cryptic, it comes straight from LLVM and has the term musttail in it.

But that’s the kind of thing you can look for.

I can’t claim to understand the code in detail, but my intuition is that a Scrabble board has a finite size, meaning that the recursion here will be bounded, with an upper limit which isn’t dynamic by the size of the data. So I wouldn’t worry too much about using recursion. That said, recursion does prevent some optimizations, notably inlining, so it’s worth looking for opportunities to reduce it, and possibly getting rid of it entirely.

You definitely want to pass structs by *const in a recursive web like this. If it were C, I would advise doing what you can to make the prefix of function calls identical, and that’s probably a good idea here as well. The optimizer will still look for opportunities to reorganize things, to minimize the movement of data needed to set up the prelude to a function call, but this is a permutative operation, so it’s very expensive and optimizers give up fairly fast on it. Since recursion prevents inlining, whatever call convention the optimizer comes up with for a function has to work at every call site, it doesn’t want to generate several versions of the function which differ only by the placement of data. So any help you can give it improves your odds.

The recursiveness is a blessing I must say. But I will certainly try it as an exercise. And limited to a depth of 30 on a 15x15 board.
I will experiment with the @call.
In theory the algo is very easy: follow the squares using either a boardletter or try-letter, keeping track of the letter-node in the “gaddag” character-tree.
it still needs 200 lines of code in my case.
Which is 6 or 8 times less than my old delphi code by the way.

Currently it produces around 3.000.000 moves per second, which is quite good I think. Not sure yet if there are much speedups which can be achieved without deep diving into SIMD or higher math.

1 Like

I think that this scrabble algorithm can be seen as a special case of what is called WaveFunctionCollapse which is basically a set of rules for how things can be connected (in the scrabble case with complex constraints) and then generating new tiles / placing them, it is likely that the special casing is easier to optimize, but the more general algorithm could give additional ideas for techniques that could be explored.

1 Like

O my… Luckily scrabble is more constraint.

1 Like

BUT, it will still use my comptimes I hope?
My whole speed expectations are based on this.

Recursion should have no impact on comptime, no.

I just noticed I can’t. I have to change it.

1 Like

I just succeeded to fix that and even made algorithm simpler. As a bonus I managed to sucessfully survive the unicode hell so my Slovenian works now.
It is very difficult to get good lists of words from the internet BTW. A lot of them are garbage.

1 Like

The recursion cannot get deeper than the longest word length, right? The overhead of recursion should not be a big problem in Scrabble Word Generation from a static rack on a static board (as opposed to expanding and searching a game tree in chess). Learning to make the recursive call as late as possible in the code is a good habit.

The main downside of recursion is it confounds real-time “symbolic debugging”, but you should not need to do that here.