Method chaining

dee0xeed · October 27, 2023, 11:20am

If some (or all) functions within a struct return a pointer to self,
then we can easily chain method calls, akin to using a pipe in a shell:

const std = @import("std");

const Object = struct {

    i : i32 = 0,

    pub fn inc(self: *Object) *Object {
        self.i += 1;
        return self;
    }

    pub fn add(self: *Object, n: i32) *Object {
        self.i += n;
        return self;
    }

    pub fn sqr(self: *Object) *Object {
        self.i = self.i * self.i;
        return self;
    }

    pub fn sub(self: *Object, n: i32) *Object {
        self.i -= n;
        return self;
    }

};

test "a pipe" {
    var o = Object{};
    try std.testing.expect(8 == o.inc().add(2).sqr().sub(1).i);
}

is anybody using this “technique”?
are there (not so trivial) examples in Zig standard library?
what are (potential) pros and cons of this?

AndrewCodeDev · October 27, 2023, 11:43am

This technique is referred to as the “fluent interface”. It has a long history in many languages.

A potential con of method chaining is subtle performance differences when self is passed by some form of reference. For a really detailed discussion of this, look up the C++ conversations regarding “deducing this”, which cover passing by value vs. passing by reference, since member functions have traditionally been fixed to self-references, with the this pointer implicitly passed to every class method.

Honestly though, your biggest concern is regarding the order of method calls. It may visibly look like things are being called in the order you’d like, but looks can be deceiving. I personally don’t do chained arithmetic operations like you’re presenting here (nothing wrong with it, just not my style) so I haven’t spent a lot of time testing this (and remember, you need to test it under different optimization levels). You wouldn’t want something like the following to get re-ordered by the optimizer:

x.sub(1).sqr();

…because (x - 1)^2 is different than x^2 - 1.
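For what it's worth, a chained call carries a built-in data dependency: each method receives the pointer returned by the previous one. A minimal sketch, assuming the `Object` type from the first post (trimmed here to two methods), showing that method syntax is shorthand for nested function calls:

```zig
const std = @import("std");

// Trimmed copy of the Object type from the first post.
const Object = struct {
    i: i32 = 0,

    pub fn sub(self: *Object, n: i32) *Object {
        self.i -= n;
        return self;
    }

    pub fn sqr(self: *Object) *Object {
        self.i = self.i * self.i;
        return self;
    }
};

test "a chain is a nest of calls" {
    var a = Object{ .i = 4 };
    var b = Object{ .i = 4 };

    // Method syntax...
    _ = a.sub(1).sqr();
    // ...passes each result as the first argument of the next call.
    _ = Object.sqr(Object.sub(&b, 1));

    try std.testing.expectEqual(@as(i32, 9), a.i); // (4 - 1)^2
    try std.testing.expectEqual(a.i, b.i);
}
```

Since `sqr` cannot start before `sub` has produced its argument, an optimizer that preserves the program's behavior should not be able to swap the two in any observable way. Testing under each optimization mode is still cheap insurance.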

Someone else a bit more savvy to using this pattern will have to comment on the order of struct method calls.

This is just good practice in any language that you are working with - test the properties of your expressions (are they associative, are they commutative… etc).

You’ll see a lot of similar things in the Zig standard library but not exactly like this - it’s quite common to see an import that uses a method call to create a struct that is then used to access another method that then creates another struct… so on…

I can’t think of an example off the top of my head, but if I do, I’ll post one

dee0xeed · October 27, 2023, 12:09pm

Well, at least that particular example works as expected both with -O ReleaseFast and with -O ReleaseSmall.

IntegratedQuantum · October 27, 2023, 12:13pm

I think it might be worth mentioning that we don’t need pointers to do this, we can also return by value.
I do this for matrix multiplication for example:

const Mat4f = struct {
    columns: [4]Vec4f,
    
    pub fn mul(self: Mat4f, other: Mat4f) Mat4f { ... }
};
...
const modelMatrix = (
	Mat4f.identity() // TODO: .scale(scale);
	.mul(Mat4f.rotationZ(-ent.rot[2]))
	.mul(Mat4f.rotationY(-ent.rot[1]))
	.mul(Mat4f.rotationX(-ent.rot[0]))
	.mul(Mat4f.translation(Vec3f{
		@floatCast(pos[0]),
		@floatCast(pos[1]),
		@floatCast(pos[2]),
	}))
);

Not using pointers also allows us to add parentheses in our expression, giving more freedom over the order of operations:

a.mul(b.add(c))
a.mul(b).add(c)

And I think not using pointers is less error-prone because it has no side effects on the original value.
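A small sketch of the by-value style, using a hypothetical two-component `Vec2` in place of `Mat4f` (the type and its component-wise `add` and `mul` are assumptions made up for illustration). It shows both groupings from above and that the receiver is left untouched:

```zig
const std = @import("std");

// Hypothetical stand-in for a math type; every method takes and returns a value.
const Vec2 = struct {
    x: i32,
    y: i32,

    pub fn add(self: Vec2, other: Vec2) Vec2 {
        return Vec2{ .x = self.x + other.x, .y = self.y + other.y };
    }

    // Component-wise product, chosen only to keep the arithmetic easy to follow.
    pub fn mul(self: Vec2, other: Vec2) Vec2 {
        return Vec2{ .x = self.x * other.x, .y = self.y * other.y };
    }
};

test "by-value chaining leaves the operands alone" {
    const a = Vec2{ .x = 2, .y = 3 };
    const b = Vec2{ .x = 1, .y = 1 };
    const c = Vec2{ .x = 4, .y = 5 };

    const p = a.mul(b.add(c)); // a * (b + c)
    const q = a.mul(b).add(c); // (a * b) + c

    try std.testing.expectEqual(@as(i32, 10), p.x);
    try std.testing.expectEqual(@as(i32, 18), p.y);
    try std.testing.expectEqual(@as(i32, 6), q.x);
    try std.testing.expectEqual(@as(i32, 8), q.y);

    // a is const and was never modified by either chain.
    try std.testing.expectEqual(@as(i32, 2), a.x);
}
```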

dee0xeed · October 27, 2023, 12:20pm

And then it is closer to the functional paradigm: all our methods are “pure”, they return a new (most likely modified, unless we just want a copy) instance of an entity. Did I get the idea right?

IntegratedQuantum · October 27, 2023, 12:21pm

Yes, exactly.

AndrewCodeDev · October 27, 2023, 12:22pm

In general, I think this is good advice - prefer passing by value.

It’s worth noting here that the Zig compiler can choose to pass by reference if it sees a benefit (under certain optimization levels). This has unintended side effects. This can cause really nasty aliasing issues.

For instance, the following code can do very, very weird things because of this:

m = rotate(m, 90); // apply a 90 degree rotation to the same object

This can actually give you the wrong answer. Same thing can happen in Jai.

I urge everyone to watch this to understand why:

bmacho · October 27, 2023, 12:24pm

I wonder if the compiler can notice if your original Mat4f struct is not available anymore, and instead of creating a new Mat4f, chooses to reuse an old one.

dee0xeed · October 27, 2023, 12:29pm

If so, then it’s not functional style (every function is pure) anymore

dee0xeed · October 27, 2023, 12:39pm

If my ‘object’ is allocated on the heap and we allocate a new one on every method call, then we have a lot of garbage which should be removed on every method call. It is a big overhead, isn’t it?

AndrewCodeDev · October 27, 2023, 12:41pm

Depends on the kind of allocator you’re using

If you have a stack allocator that can be reset between calls, it’s as cheap as moving an index back. So that really depends.
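As a rough sketch of that idea, assuming `std.heap.FixedBufferAllocator` as the stack-style allocator (the `Object` type and the loop are made up for illustration): each round allocates two temporaries and then throws both away with a single `reset()`.

```zig
const std = @import("std");

// Hypothetical heap-allocated value; stands in for a larger object.
const Object = struct { i: i32 };

test "resetting a fixed buffer between rounds" {
    var buffer: [256]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buffer);
    const allocator = fba.allocator();

    var round: i32 = 0;
    while (round < 100) : (round += 1) {
        // Every "method call" produces a fresh object.
        const a = try allocator.create(Object);
        a.* = Object{ .i = round };
        const b = try allocator.create(Object);
        b.* = Object{ .i = a.i + 1 };

        try std.testing.expectEqual(round + 1, b.i);

        // Discard everything allocated in this round at once.
        fba.reset();
    }
}
```

Without the `reset()` the 256-byte buffer would run out long before the hundredth round, so the loop only completes because the space is reused.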

I’m not sure I’m answering your question though - maybe a code example would help clear that up.

dee0xeed · October 27, 2023, 3:19pm

I made an example with I/O:

const std = @import("std");

const Converter = struct {

    srcf: std.fs.File = undefined,
    dstf: std.fs.File = undefined,
    buff: [1]u8 = .{0},

    pub fn init() Converter {
        return Converter {
            .srcf = std.io.getStdIn(),
            .dstf = std.io.getStdOut(),
        };
    }

    pub fn readOneByte(self: *Converter) ?*Converter {
        const cnt = self.srcf.read(self.buff[0..]) catch 0;
        return if (0 == cnt) null else self;
    }

    pub fn xorWith(self: ?*Converter, byte: u8) ?*Converter {
        var conv = self orelse return null;
        conv.buff[0] ^= byte;
        return self;
    }

    pub fn writeOneByte(self: ?*Converter) ?*Converter {
        var conv = self orelse return null;
        _ = conv.dstf.write(conv.buff[0..]) catch unreachable;
        return self;
    }

};

pub fn main() void {
    var conv = Converter.init();
    while (true) {
        _ = conv
            .readOneByte()
            .?.xorWith(0x01)
            .?.writeOneByte()
        orelse break;
    }
}

It kinda works, but crashes at the end of the input (after pressing ^D):

$ ./xor 
bcde
cbed
    thread 80620 panic: attempt to use null value
    xor.zig:41:13: 0x21e80d in main (xor)
            .?.xorWith(0x01)

I understand why it crashes, but do not know how to fix it
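One possible way out, sketched under assumptions: `.?` asserts that the value is non-null, so the chain panics the moment `readOneByte` returns null. Since `xorWith` and `writeOneByte` already accept an optional pointer, the optional can be handed over unopened by nesting the calls (method syntax is not available on an optional value). The hypothetical `SliceConverter` below reads from a slice and writes to a buffer instead of stdin/stdout, purely so that it can run as a test:

```zig
const std = @import("std");

// Hypothetical variant of Converter that works on memory instead of files.
const SliceConverter = struct {
    src: []const u8,
    pos: usize = 0,
    byte: u8 = 0,
    out: [8]u8 = undefined,
    len: usize = 0,

    pub fn readOneByte(self: *SliceConverter) ?*SliceConverter {
        if (self.pos == self.src.len) return null; // end of input
        self.byte = self.src[self.pos];
        self.pos += 1;
        return self;
    }

    pub fn xorWith(self: ?*SliceConverter, mask: u8) ?*SliceConverter {
        const conv = self orelse return null; // pass the null along
        conv.byte ^= mask;
        return conv;
    }

    pub fn writeOneByte(self: ?*SliceConverter) ?*SliceConverter {
        const conv = self orelse return null; // pass the null along
        conv.out[conv.len] = conv.byte;
        conv.len += 1;
        return conv;
    }
};

test "null travels through the nested calls" {
    var conv = SliceConverter{ .src = "bcde" };
    // Nested calls instead of `.?`: nothing is unwrapped, so nothing panics.
    while (SliceConverter.writeOneByte(
        SliceConverter.xorWith(conv.readOneByte(), 0x01),
    ) != null) {}
    try std.testing.expectEqualStrings("cbed", conv.out[0..conv.len]);
}
```

The price is that the calls now read inside-out rather than left-to-right, which gives up most of the appeal of the chain.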

AndrewCodeDev · October 27, 2023, 3:21pm

Looks like that’s returning a null value. I’ll have to look at it a bit more and figure out what you’re doing with the optional self values.

dee0xeed · October 27, 2023, 3:25pm

Yes, exactly, null from readOneByte was supposed to be propagated through the entire sequence… but it’s not working.

dee0xeed · October 27, 2023, 3:41pm

Probably, my question was not quite clear.
Ok, I’ll try to reword it.
Are there any examples of entities in stdlib that have an entirely fluent interface?

dee0xeed · October 27, 2023, 4:08pm

fixed:

const std = @import("std");

const Converter = struct {

    srcf: std.fs.File = undefined,
    dstf: std.fs.File = undefined,
    buff: [1]u8 = .{0},
    stop: bool = false,

    pub fn init() Converter {
        return Converter {
            .srcf = std.io.getStdIn(),
            .dstf = std.io.getStdOut(),
        };
    }

    pub fn readOneByte(self: *Converter) *Converter {
        const cnt = self.srcf.read(self.buff[0..]) catch 0;
        if (0 == cnt) self.stop = true;
        return self;
    }

    pub fn xorWith(self: *Converter, byte: u8) *Converter {
        if (false == self.stop)
            self.buff[0] ^= byte;
        return self;
    }

    pub fn writeOneByte(self: *Converter) *Converter {
        if (false == self.stop)
            _ = self.dstf.write(self.buff[0..]) catch unreachable;
        return self;
    }

};

pub fn main() void {
    var conv = Converter.init();
    while (false == conv.stop) {
        _ = conv
            .readOneByte()
            .xorWith(0x01)
            .writeOneByte();
    }
}

edit: @AndrewCodeDev in this example I can not see any specific reason to pass by value and to return a value, can you?

AndrewCodeDev · October 28, 2023, 1:29pm

Not in that instance, no.

In the case you’ve provided here, you’ve created an object that is maintaining state about the current process.

Just so everyone is following along at home, what @dee0xeed has created is a converter that reads a byte from standard in, XORs it, and then writes it to standard out. The converter object is just a convenience wrapper to make this happen.

Here’s why I don’t think this example is a pass-by-value issue (and, additionally, one of my problems with the fluent interface).

In the example that @IntegratedQuantum was providing (applying an affine transformation), there is a defined outcome of that step that creates a unique and well-formed object. For instance: x * y + z where all variables are f64.

x * y → f64 that is the product of x and y… we’ll call him u

u + z → f64 that is the addition of the former product with z.

At each step here, there is a defined, well-formed outcome that we can use independently of the next operation.

In your case, what’s the use of reading the bytes if we aren’t going to XOR them? And then, what’s the use of XOR’ing them if we aren’t going to write them somewhere?

Each individual state represents an incomplete part of a total process. You need to preserve that state between calls.

NOW… the fun part

The example you provided is one of the reasons I personally do not like the fluent interface. It strongly couples an interface to an operation and each operation assumes something about the previous state. I would much rather see a function like readWriteXOR where everything you’re doing is the product of one function.
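A minimal sketch of what such a function could look like. The name `readWriteXor`, its signature, and the use of slices in place of stdin/stdout are all assumptions made to keep the example self-contained (it uses the multi-object `for` syntax from Zig 0.11 onward):

```zig
const std = @import("std");

/// XORs every byte of `src` with `mask`, stores the results in `dst`,
/// and returns the part of `dst` that was written.
/// Asserts that `dst` is at least as long as `src`.
fn readWriteXor(src: []const u8, dst: []u8, mask: u8) []u8 {
    std.debug.assert(dst.len >= src.len);
    for (src, 0..) |byte, idx| {
        dst[idx] = byte ^ mask;
    }
    return dst[0..src.len];
}

test "one function instead of a chain" {
    var out: [8]u8 = undefined;
    try std.testing.expectEqualStrings("cbed", readWriteXor("bcde", &out, 0x01));
}
```

There is only one order in which the three steps can happen here, so there is nothing for a caller to get wrong.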

In essence, the fluent interface encourages us to plug-and-play; it’s flexible and allows for easily making your own sequence of events. Sounds good, right?

Well, in most cases I’ve seen, it turns out that people do a lot less plug-and-play than you would expect. They actually do a few things that need to be done in order and depend on the last step. In other words, we have a lot of flexibility when in reality, there is a very simple process that needs to exist and should not be modified.

For instance, let’s take your example:

        _ = conv
            .readOneByte()
            .xorWith(0x01)
            .writeOneByte();

And let’s just change one thing (we’ll swap the order of two functions):

        _ = conv
            .readOneByte()
            .writeOneByte()
            .xorWith(0x01);

Well, that doesn’t make sense now, does it? I’m XOR’ing a byte after I wrote something? That has no visible effect.

In fact, there are many orderings you can come up with that do not make any sense. So now, we need to add controls to make sure validity is maintained. Great, more state lol.

Let me play devil’s advocate here for a moment. Where could this maybe be a good thing?

If you are creating an interface that genuinely needs to maintain state between operations AND the operations can be reordered to give many valid combinations AND you have guard rails for operations that must be only called at a specific point… okay then, this could work.
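One way such guard rails could look, as a sketch only: give each stage its own type, so that a step is simply not available until the previous one has run. The `Loaded` and `Xored` types and their methods below are hypothetical and not taken from the Converter above.

```zig
const std = @import("std");

// Stage 1: a byte has been read; the only thing it offers is xorWith.
const Loaded = struct {
    byte: u8,

    pub fn xorWith(self: Loaded, mask: u8) Xored {
        return Xored{ .byte = self.byte ^ mask };
    }
};

// Stage 2: the byte has been XOR'ed; the only thing it offers is writeTo.
const Xored = struct {
    byte: u8,

    pub fn writeTo(self: Xored, dst: *u8) void {
        dst.* = self.byte;
    }
};

test "only the valid order is expressible" {
    var out: u8 = 0;
    const loaded = Loaded{ .byte = 'b' };

    loaded.xorWith(0x01).writeTo(&out);
    try std.testing.expectEqual(@as(u8, 'c'), out);

    // loaded.writeTo(&out); // would not compile: Loaded has no writeTo
}
```

This moves the ordering rule from runtime state into the type system, at the cost of one small type per stage.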

Otherwise, just write a function lol. Thanks for the great example, @dee0xeed, and thanks for reading my rant.

dee0xeed · October 28, 2023, 2:17pm

external world is such an external world
you never know when a user will press ^D
so that program uses a flag in xorWith() and in writeOneByte(),
which was set at EOT event in readOneByte()

dee0xeed · October 28, 2023, 2:30pm

Neither do I.
They (fluent interfaces) are for doing some number crunching, not for i/o (which is driven by external world)

dee0xeed · October 28, 2023, 2:34pm

I got the joke.
There is no such thing as what is commonly called “AI”