Simple function chaining

I tried to thin this to the bare bones; please help me with what I’m missing. Given (the somewhat ludicrous, now):

pub const Foo = struct {
   const Self = @This();
   pub fn bar(self: *Self) Foo { _ = self; return .{}; }
   pub fn bam(self: *Self) Foo { _ = self; return .{}; }
};

I can do this:

   var foo = Foo {};
   var f1 = foo.bar();
   var f2 = f1.bam();
   var f3 = f2.bar();
   _ = &f3;

But I can’t do this:

   var foo = Foo {};
   _ = foo.bar().bam().bar();

I get:

error: expected type '*main.Foo', found '*const main.Foo'
note: cast discards const qualifier

(0.16 master, if it matters)


Hi! This is completely intended!

In your case, bar and bam take a *Self, that is, a pointer to a mutable value. However, when you use a function's return value immediately, that value is an immutable temporary, so only a *const Self can point to it.

For example, the following would not work:

const foo: Foo = .{};
var f1 = foo.bar();

// equivalent to
var f1 = (Foo {}).bar();

In your case, you would need to change your functions to take a const pointer or a value:

pub const Foo = struct {
   const Self = @This();
   pub fn bar(self: Self) Foo { _ = self; return .{}; }
   pub fn bam(self: Self) Foo { _ = self; return .{}; }
};

Parameters are immutable by default in Zig.

However, if you still need to mutate data and chain function calls, you would need, for example, to hold a pointer to mutable data internally, but that is not the Zig way.
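To make the by-value version a bit less empty, here is a minimal sketch in which each step returns a modified copy. The count field and the chained helper are hypothetical additions for illustration only; they are not in the original post.

```zig
// Hypothetical value-style chain: every step takes `self` by value and
// returns a modified copy, so it can be called on a temporary.
pub const Foo = struct {
    count: u32 = 0, // assumed field, only here to make the effect visible

    pub fn bar(self: Foo) Foo {
        var next = self; // parameters are immutable, so copy before changing
        next.count += 1;
        return next;
    }

    pub fn bam(self: Foo) Foo {
        var next = self;
        next.count += 10;
        return next;
    }
};

// Chains on a const value; nothing here needs a mutable pointer.
pub fn chained() u32 {
    const foo: Foo = .{};
    return foo.bar().bam().bar().count;
}
```

Since no step mutates its receiver, the chain compiles whether foo is declared const or var.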


To clarify with an example: for something like a Java-style factory, the “Zig way” would be:

pub const Options = struct {
  allow_a: bool = true,
  allow_b: bool = false,
};

pub const Foo = struct {
  has_a: bool,
  pub fn init(options: Options) @This() {
    // init code
    return .{
      .has_a = options.allow_a,
    };
  }
};
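A usage sketch for the options-struct style, assuming a hypothetical Widget type with an extra has_b field so both options are visible. The caller names only the options that differ from the defaults, which covers most of what a builder chain would do.

```zig
pub const Options = struct {
    allow_a: bool = true,
    allow_b: bool = false,
};

// Hypothetical type for illustration; not from the original post.
pub const Widget = struct {
    has_a: bool,
    has_b: bool,

    pub fn init(options: Options) Widget {
        return .{
            .has_a = options.allow_a,
            .has_b = options.allow_b,
        };
    }
};

// Only allow_b is overridden; allow_a keeps its default of true.
pub fn example() Widget {
    return Widget.init(.{ .allow_b = true });
}
```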

If you still need function chaining, do:

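const std = @import("std");

pub const Data = struct {
  allow_a: bool,
  allow_b: bool,
};

pub const Foo = struct {
  data: *Data,

  // Allocation can fail, so init returns an error union.
  pub fn init(allocator: std.mem.Allocator) !@This() {
    const data = try allocator.create(Data);
    data.* = .{ .allow_a = false, .allow_b = false };
    return .{ .data = data };
  }

  pub fn deinit(self: @This(), allocator: std.mem.Allocator) void {
    allocator.destroy(self.data);
  }

  // Mutates through the pointer and returns self so calls can be chained.
  pub fn allowA(self: @This()) @This() {
    self.data.allow_a = true;
    return self;
  }
};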

Ah, right, of course. My mistake - the * (mutability) derived from the “real” code, but the purpose is actually avoidable (mutability is not necessary).

For completeness, I guess it’s worth pointing out that another “solution” exists. IF mutability is not the aim, but, rather, avoiding expensive copy* is, then

 pub fn bar(self: *const Self) Foo { _ = self; return .{}; }

*- I appreciate that it’s the compiler’s job to decide whether to copy or just (const) reference, and it’s supposed to do that well, so I shouldn’t have to feel the burden of using *const to avoid an expensive copy. At least, I think that’s the right attitude to take.
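A small sketch of the *const receiver in a chain, assuming a hypothetical Big struct with a 256-byte payload. A temporary can be passed as *const Big, so the chain works without any var:

```zig
// Hypothetical large-ish struct; the name and size are only for illustration.
pub const Big = struct {
    payload: [256]u8 = [_]u8{0} ** 256,

    // Reads through a const pointer and returns a modified copy.
    pub fn withFirst(self: *const Big, byte: u8) Big {
        var next = self.*;
        next.payload[0] = byte;
        return next;
    }

    pub fn first(self: *const Big) u8 {
        return self.payload[0];
    }
};

// Each call receives the address of the previous temporary as *const Big.
pub fn chainedFirst() u8 {
    return (Big{}).withFirst(7).withFirst(9).first();
}
```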


Yeah, definitely! The compiler decides on its own whether the value should be copied, moved or referenced (it depends on register size, value size, calling convention, etc.).

You should only use *const Self if you specifically want to do pointer arithmetic on the reference, for example with @fieldParentPtr.

This also exposes another danger in return .{}; - I think this “works” here because(?) this function will be inlined; if this was potentially a “return of a local, which goes out of scope”, that would be bad. The original code (from which I derived this) was explicitly inline and ONLY consisted of return .{stuff}, but I see that my simplified example code looks more suspect. Oops.


Returning a local is never a problem if:

  • it doesn’t contain a pointer to a local
  • it is not a pointer to something local

i.e.

const local: usize = 42;

return local; // ok
return .{.value = local}; // ok
return .{.value = try allocator.dupe(usize, &.{local}) }; // ok (heap copy; the caller must free it)
return &local; // UB, in master, returns undefined
return .{.value = &local}; // UB
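The safe cases above, written out as complete functions. This is only a sketch: Boxed and the function names are made up for illustration, and the two UB variants are left as comments because they must not be used.

```zig
const std = @import("std");

// Hypothetical wrapper type for the "struct containing a value" case.
const Boxed = struct { value: usize };

// ok: the value itself is copied out to the caller.
fn byValue() usize {
    const local: usize = 42;
    return local;
}

// ok: the struct holds a copy of the value, not a pointer to it.
fn wrapped() Boxed {
    const local: usize = 42;
    return .{ .value = local };
}

// ok: the pointer refers to heap memory, which outlives this call.
// The caller owns it and must call allocator.destroy.
fn onHeap(allocator: std.mem.Allocator) !*usize {
    const ptr = try allocator.create(usize);
    ptr.* = 42;
    return ptr;
}

// NOT ok (kept as comments on purpose):
//   return &local;                 // pointer to a dead stack slot
//   return .{ .value = &local };   // same problem, hidden in a struct
```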

Only partially related, but as an aside: I don’t think you need @This in the OP’s example since you don’t have an anonymous struct. You could just use

pub const Foo = struct {
   pub fn bar(self: *Foo) Foo { ...

This doesn’t affect the mutability problem of the OP.


Or you can return the struct as a mutable pointer, *Foo in this case.

That works, but it doesn’t mesh with the classic “builder pattern” because you cannot simply create something and start chaining it, you need a var to take a pointer of first.
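A minimal sketch of that pointer-returning style, assuming a hypothetical Counter type. Note the var that has to exist before the chain can start:

```zig
// Hypothetical type; each step mutates in place and returns the same pointer.
pub const Counter = struct {
    count: u32 = 0,

    pub fn bar(self: *Counter) *Counter {
        self.count += 1;
        return self;
    }

    pub fn bam(self: *Counter) *Counter {
        self.count += 10;
        return self;
    }
};

pub fn build() Counter {
    var c: Counter = .{}; // must be a var: the chain needs a *Counter
    _ = c.bar().bam().bar(); // the returned pointer is discarded
    return c; // returned by value, so no pointer to the local escapes
}
```

Starting the chain from a temporary, as in (Counter{}).bar(), would hit the same "cast discards const qualifier" error as the original post.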

If it’s a heap pointer, you can get away with this (awkward) approach:

const my_foo: *Foo = (try allocator.create(Foo)).build1(fee, fie).build2(foe, fum);

Which I do not endorse or recommend, but it works.

Zig’s conventions just don’t favor “chain as much as possible”. Which doesn’t suit everyone, but suits me just fine.

Oops, I was conflating my trimmed posted code with my original code again, and, even then, I was confusing unrelated things; sorry for the noise, but your post is a good reminder, of course.

Ah, great! Actually, my “real” code is @This-less, but I hadn’t worked out the real value of @This, yet (for anonymous structs), and I thought I’d make my posted code look more like I’ve seen other code with @This - must’ve been all anon structs but I missed the pattern. So you answered a question I should have asked. I see the standard doc does say “This can be useful for an anonymous struct that needs to refer to itself“ right near the top; just missed it. Lots of good things learned in this thread, thank you all.

This is the case that I was thinking about in my follow-up about returning references/pointers to locals that will go out of scope. I know you can do what you say, here, but remind me again where the gotcha is? When do you have to be careful about NOT doing this, lest it turn into a UB case? Or am I remembering that wrong?

Your (try foo).build1(… illustration brings to the surface another niggle in this thing I’m trying to do – I have a case wherein those chained function calls would like to return !Object instead of Object, but chaining trys is ugly. I noticed that long ago (2020), trychain was proposed - it looks like it was booed, even by the OP eventually, though I didn’t see lots of detail; the issue was closed (as completed?!) but I don’t think anything was ever attempted, as “trychain” doesn’t exist. I can think of some downvote reasons, too, but found it curious as I considered my interest. It may just be that, if my interest really depends on chaining, especially if the chained functions need to do allocations and (or otherwise) might fail, then it’s just not a good fit for Zig, or I’ll need to find another way to express that structure building; something other than function calls.
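To show what "chaining trys" looks like in practice, here is a sketch with a hypothetical Obj whose step function can fail. The nested form needs one pair of parentheses per step; the sequential form does the same work one line at a time.

```zig
// Hypothetical fallible step; the error name and the limit are made up.
pub const Obj = struct {
    n: u32 = 0,

    pub fn step(self: Obj) error{TooBig}!Obj {
        if (self.n >= 3) return error.TooBig;
        return .{ .n = self.n + 1 };
    }
};

// Chained: every intermediate result has to be unwrapped with (try ...).
pub fn nested() !u32 {
    const o = try (try (try (Obj{}).step()).step()).step();
    return o.n;
}

// Sequential: same steps, same error propagation, one statement per step.
pub fn sequential() !u32 {
    var o: Obj = .{};
    o = try o.step();
    o = try o.step();
    o = try o.step();
    return o.n;
}
```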

You must not return a pointer to stack memory from a function, because that pointer is no longer valid once the function returns. So if an initializer returns a *Foo, it had better take an Allocator so that the pointer lives on the heap. By convention we call those create, btw.

It’s fine to return a Foo, though, because that ends up in the result location: conceptually, the struct is built inside the function call and copied back to the variable (or ‘place’ at least) where the result goes. The optimizer might elide the copy but it’s better to assume it won’t.

A Foo all by itself can only coerce to a *const Foo, not a mutable *Foo:

const my_foo: Foo = .init(stuff).doThing(more_stuff);

This will only work if doThing has the receiver types Foo or *const Foo. Any parameter type which is not a pointer is immutable, and this is consistent with that.

So you’d have to do this:

var my_foo: Foo = .init(stuff);
_ = my_foo.doThing(more_stuff);

Not so convenient! Zig does not encourage or reward chain-heavy code. I think of that as more of a consequence than a design goal.

It’s always best to work with the language. If you want to write something as chained code, but the process is cumbersome, ugly, hard to understand: maybe don’t do it that way.

var foo: Foo = .init;
try foo.preheat(allocator, param1);
try foo.doTheNextThing(allocator, more, stuff);
if (foo.readyForAction()) {
   // ...
}

If you absolutely, positively must write this kind of code as a series of method chains, you’ll need to use another language. That’s just how it is. But I guarantee you: the CPU? Doesn’t care. Users? Don’t care. Do you care?

I’m sorry, this was too obvious. I misinterpreted; of course returning a *Foo will need an allocated Foo, rather than returning a reference to a local-stack Foo. But it’s good to spell that out for me anyway, perhaps. However, what I was trying to get ahold of was an “inline” case… I can’t put my finger on the message that highlighted this for me a week ago… but something like: you can(?) return a reference to a local-scope stack var IF the function is inline, because(?) then the var isn’t actually scope-bound? … but it’s not a good idea anyway and may be UB and may eventually be compile-time (or run-time?) detected and disallowed….?

But MAYBE it was what you mentioned here about the result location, and how a struct copy might be elided if the compiler can do so, “but it’s better to assume it won’t” – is it possible that inlining the function guarantees or increases the chance that that copy is elided?

This all seems like pretty basic stuff, in a way, so thank you all for your patience; coming back from Python-only-land after many years requires reflection on old basics.

In my case, it’s not that chaining is absolutely required, but that it models the higher-level domain better; this would be a case of: (lib) users DO care, because they’d find the ergonomics familiar. But there’s no need for me to force, ultimately; the exercise is good for the brain, and the “not a fit” result is a fine conclusion, or I might be inspired to come up with a better creative way to appease the ergonomic cause without over-bending zig to a bad fit.