Whatever happened to 'functions as expressions'?

I was reading through this article on polymorphism in zig when I stumbled across this:

Below, you will see me use a pattern like this:

   const SomeVar = struct {
       fn SomeFunction(SomeArgs) void {
           DoSomething();
       }
   }.SomeFunction;

This is a workaround for Zig's lack of inline function declarations. You should think of the above as

   const SomeVar = fn(SomeArgs) void {
       DoSomething();
   };

Proposal #1717 will address this eventually, but in the meantime this is required.

So, why not just use the latter implementation?

After sifting through the proposal on GitHub, it seems that the community is very fond of functions as expressions. The proposal was accepted and all seemed to be well, until it was rejected.
This was Andrew’s reasoning, which seems to have been met with disapproval.

Allow me to address the two goals of this proposal:

Provide syntactic consistency among all statements which bind something to an identifier

This proposal does an excellent job at accomplishing this goal, and is the reason I originally accepted it. However, I will now make an argument that there is a good reason for there to not be syntax consistency between function declarations and other declarations.

Ultimately, Zig code will output to an object file format, an executable binary, or an intermediate format such as LLVM IR or C code that is ultimately destined for such a place. In those places, functions have symbol names. These symbol names show up in stack traces, performance measurements, debugging tools, and various other things. In other words, functions are not unnamed. This is different from constants and types, which may exist ephemerally and be unnamed.

So, I think the syntax inconsistency appropriately models reality, making Zig a better abstraction over the artifacts that it produces.

Provide syntactic foundation for a few features: functions-in-functions (#229), passing anonymous functions as arguments (#1048)

I have rejected both of these proposals. In Zig, using functions as lambdas is generally discouraged. It interferes with shadowing of locals, and introduces more function pointer chasing into the Function Call Graph of the compiler. Avoiding function pointers in the FCG is good for all Ahead Of Time compiled programming languages, but it is particularly important to zig for async functions and for computing stack upper bound usage for avoiding stack overflow. In particular, on embedded devices, it can be valuable to have no function pointer chasing whatsoever, allowing the stack upper bound to be statically computed by the compiler. Since one of the main goals of Zig is code reusability, it is important to encourage zig programmers to generally avoid virtual function calls. Not having anonymous function body expressions is one way to sprinkle a little bit of friction in an important place.

Finally, I personally despise the functional programming style that uses lambdas everywhere. I find it very difficult to read and maintain code that makes heavy use of inversion of control flow. By not accepting this proposal, Zig will continue to encourage programmers to stick to an imperative programming style, using for loops and iterators.

So I reiterate: why aren’t functions expressions? As far as I can tell from the GitHub comments, there’s no reason not to do so, is there?

Andrew rejected it. That’s why it’s not a thing. The community does not decide how zig works.

1 Like

We can of course keep nagging him about it :grin:
Anyway welcome to the community :partying_face:

1 Like

That’s fair; but it’s odd for Andrew to initially accept it, then turn on it a few years later. I’ve heard Andrew was initially opposed to functions in structs, but the community pushed for it. The community’s pushed for this, too, and for quite a while.

1 Like

You can’t get rid of callbacks/lambdas entirely - even zig stdlib is using callbacks, so the whole argument is sort of weak/invalid.

The current state is that we have an ugly hack, which is ugly to write, ugly to read, and it has even uglier stack traces…

4 Likes

IMO Andrew’s objections are solid. While there are cases where the problems he identifies aren’t a big deal, it means that programmers have to know about the nuances of when functions-as-expressions will cause issues, and when they won’t.

I’m a great believer in making it harder to write bad code, even if that comes at the expense of ergonomics. I think fewer surprise pitfalls, and less syntax, makes it easier for less experienced devs to contribute to making high quality software.

Having said all that, I love using functional programming style map/reduce/filter patterns unreasonably much when I’m using JIT compiled/interpreted languages. I think that’s one of the main drivers behind people asking for functions-as-expressions.

But there are already ways of emulating them stylistically (in some cases) by using comptime functions, eg. for a reducer:
(Example adapted from What is Zig's Comptime? | Loris Cro's Blog)

const std = @import("std");

const Op = enum {
    Sum,
    Mul,
    Sub,
};

fn apply_ops(comptime operations: []const Op, num: i64) i64 {
    var acc: i64 = 0;
    inline for (operations) |op| {
        switch (op) {
            .Sum => acc +%= num,
            .Mul => acc *%= num,
            .Sub => acc -%= num,
        }
    }
    return acc;
}

pub fn main() !void {
    const ops = [4]Op{.Sum, .Mul, .Sub, .Sub};
    const x = apply_ops(ops[0..], 3);
    // std.debug.warn was removed from the standard library; print takes an args tuple.
    std.debug.print("Result: {d}\n", .{x});
}

Callbacks are trickier to make less ugly though. Maybe someone more familiar with comptime stuff has ideas?

I thought the ideology of Zig hinged on the principles of simplicity, performance, and modernity. Zig fulfills the latter two quite well, but it’s inherently going against simplicity if it’s forcing developers to create hacks for what should be an industry standard for programming languages: functions as expressions. In my opinion, there is no downside to allowing developers to use a more functional style.

2 Likes

Andrew and other core team members change their mind on language design all the time; if they didn’t, Zig would still look like it did in 0.1.0! We’re all constantly getting more experience and insight which crafts Zig’s design.

Regarding the community “pushing” for this: that isn’t how Zig’s development works. The opinions of Zig users, and even of core team members, have pretty much no effect on Zig’s design. What matters is justification, use cases, and honest analysis of features. After weighing up all of the arguments, Andrew eventually concluded that #1717 wasn’t a good fit for Zig, so the language didn’t switch to that syntax. That is really all there is to it, and of all the decisions made over Zig’s lifetime, I think this is one of the least likely to be reconsidered (no matter how much the community “pushes for it”).


By the way, the most common usage for function literals of this form is for providing callbacks, but this use case isn’t appropriate in Zig due to the lack of closures. Instead of taking a callback function as a parameter, an API should take a context type, which contains a callback function, whose first parameter is that context type. This avoids needing to use globals to pass data to the callback, providing thread-safety and reentrancy guarantees, like a type-safe and slightly more flexible version of C’s void *user idiom.
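
A minimal sketch of that shape, assuming a made-up `forEach` API and a made-up `SumAbove` context (neither is from the standard library). The context carries the state a closure would otherwise capture, and the callback takes the context as its first parameter:

```zig
const std = @import("std");

// Hypothetical API: visits every item and hands it to the context's callback.
// `context` may be any type with a method `onItem(self, item: i32) void`.
fn forEach(items: []const i32, context: anytype) void {
    for (items) |item| {
        context.onItem(item);
    }
}

// The context holds the state; no globals are needed.
const SumAbove = struct {
    threshold: i32,
    total: i64 = 0,

    fn onItem(self: *SumAbove, item: i32) void {
        if (item > self.threshold) self.total += item;
    }
};

pub fn main() void {
    var ctx = SumAbove{ .threshold = 2 };
    forEach(&[_]i32{ 1, 2, 3, 4 }, &ctx);
    std.debug.print("total = {d}\n", .{ctx.total}); // total = 7
}
```

Because every call gets its own context value, two threads (or two nested calls) can run `forEach` at once without sharing state.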

12 Likes

Is there somewhere in the stdlib that’s a good example of this?

It is commonly used to implement a vtable:

And in sorting:

3 Likes

Context types with callbacks are also used in all of Zig’s hash maps (I believe), such as https://github.com/ziglang/zig/blob/0bf44c30934bced6fc8f6451cf418ae40db665e6/lib/std/array_hash_map.zig#L87

Context there allows users to provide hash() and eql().

1 Like

The keywords to search for are:

ctx: *anyopaque
context: *anyopaque
*const fn (context
*const fn (ctx

Another example is readers:

In this example, it is useful to have the context pointer because it allows the callback to have side effects. The callback decrements a counter on the number of bytes left. The counter is stored in the memory pointed to by the context pointer. The limited reader is therefore able to track how many bytes have been read without a global variable.
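
The idea can be sketched roughly as follows. This is not the standard library's reader; `Reader`, `Limited`, and `readImpl` are hypothetical names, and the cast syntax assumes Zig 0.11 or later:

```zig
const std = @import("std");

// Hypothetical type-erased reader: a context pointer plus a callback
// whose first parameter is that context pointer.
const Reader = struct {
    context: *anyopaque,
    readFn: *const fn (context: *anyopaque, dest: []u8) usize,

    fn read(self: Reader, dest: []u8) usize {
        return self.readFn(self.context, dest);
    }
};

// Hands out 'x' bytes until its byte budget is used up.
const Limited = struct {
    remaining: usize,

    fn readImpl(context: *anyopaque, dest: []u8) usize {
        // Recover the typed pointer from the erased context.
        const self: *Limited = @ptrCast(@alignCast(context));
        const n = @min(dest.len, self.remaining);
        @memset(dest[0..n], 'x');
        self.remaining -= n; // side effect stored behind the context pointer
        return n;
    }

    fn reader(self: *Limited) Reader {
        return .{ .context = self, .readFn = readImpl };
    }
};

pub fn main() void {
    var limited = Limited{ .remaining = 5 };
    const r = limited.reader();
    var buf: [4]u8 = undefined;
    const first = r.read(&buf);
    const second = r.read(&buf);
    const third = r.read(&buf);
    std.debug.print("{d} {d} {d}\n", .{ first, second, third }); // 4 1 0
}
```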

2 Likes

matters is justification, use cases, and honest analysis of features

std.xxx.sortBy(&arr, |it| -it.foo.bar) is better than having to define a one-shot comparator function which I’m never going to use anywhere else.

BTW: This is a lambda, so no need for ctx, but there are many use-cases for callbacks, not just FP-style, which would be problematic without GC anyway.

4 Likes

You just defined a one-shot comparator function which you’re never going to use anywhere else: |it| -it.foo.bar. How is that different from:

const cmp = struct {
    // Signed return type: a negated value cannot be returned as u32.
    pub fn f(it: anytype) i32 {
        return -it.foo.bar;
    }
};
std.xxx.sortBy(&arr, cmp{});

Slightly more verbose, I agree, but that’s a minor annoyance. On the flip side, it’s easier to extend. You can easily add fields to make your comparator stateful or add helper functions.
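
To make the "stateful comparator" point concrete, here is a minimal sketch. It uses the real `std.mem.sort` (assuming Zig 0.11 or later) rather than the placeholder `std.xxx.sortBy`, so the callback is a less-than function rather than a key function. `Item`, `ByBar`, and the `descending` field are names made up for illustration.

```zig
const std = @import("std");

// Hypothetical element type mirroring the `it.foo.bar` access used above.
const Item = struct { foo: struct { bar: i32 } };

// Comparator context: the field plays the role of a captured variable.
const ByBar = struct {
    descending: bool,

    fn lessThan(self: ByBar, a: Item, b: Item) bool {
        return if (self.descending) a.foo.bar > b.foo.bar else a.foo.bar < b.foo.bar;
    }
};

pub fn main() void {
    var arr = [_]Item{
        .{ .foo = .{ .bar = 2 } },
        .{ .foo = .{ .bar = 3 } },
        .{ .foo = .{ .bar = 1 } },
    };
    // The context value is passed alongside the comparison function.
    std.mem.sort(Item, &arr, ByBar{ .descending = true }, ByBar.lessThan);
    for (arr) |it| std.debug.print("{d} ", .{it.foo.bar});
    std.debug.print("\n", .{});
}
```

Flipping `descending` changes the order without touching the comparison function, which is the kind of extension a bare lambda would need a capture for.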

3 Likes

I am unsure what others’ opinions might be about the style of it, but when I want to use a lambda-esque syntax, I will write it like this:

std.xxx.sortBy(&arr, struct {
    pub fn cmp(it: anytype) i32 {
        return -it.foo.bar;
    }
}.cmp);

It somehow “feels” a bit more like a one-shot, inline function over a named struct definition in the lines above it. It is a “good enough” happy-medium solution for me.

6 Likes

I wonder if it would be practical to use an iterator for cases like sorting:

var iterator = std.sort.sortByIterator(&arr);
while(iterator.needsComparison()) |it| {
    it.result = it.a.value < it.b.value;
}

While it might make the sorting algorithm a bit more difficult to implement, on the surface this is basically like passing a lambda, you can even “capture” variables from the surrounding scope.

2 Likes

I find Andrew’s rejection unconvincing, possibly due to not understanding the consequences of the compiler implementation. I’ll write my feelings on this below and if my understanding is wrong please correct me.

This comment is essentially the entire post I would make, so I’ll try to keep it short.

Currently, functions both are and are not first-class objects: you can pass them around like first-class objects, but they cannot be created and assigned like any other object. This is by design. While I disagree and find the arguments for keeping it this way unconvincing, I can understand why you’d want it: functions are “special” in how they get stored and referenced in an object file.

But the fact that you can simply ignore this by creating a function within a struct and then assigning it means that the current design feels inconsistent, because functions are treated differently depending on the context they’re in. I think that if functions are to be treated specially, then doing

const addOne = struct {
    fn call(a: i32) i32 {
        return a + 1;
    }
}.call;

should also be forbidden.

2 Likes

This is a great point, but I’m not sure how it could work for IO, unless Zig gets async/await again.

EDIT: for async IO (it will obviously work for sync IO but that’s no-go for many apps)

Not having functions as expressions never bothered me, but not having nested functions felt like a missed opportunity. I feel like the latter was thrown out along with the former, without being given much consideration. Seems like a better solution than the (ugly) status quo. What would be problematic with this code?

fn foo() void {
    fn bar() void {
        ...
    }
    fnWithCallback(bar);
}
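
For comparison, the status-quo spelling of the same thing looks roughly like this. `fnWithCallback` here is a hypothetical stand-in for the one in the snippet above, and the global counter exists only so the sketch has something observable:

```zig
const std = @import("std");

// Hypothetical stand-in for fnWithCallback: calls the function it is given.
fn fnWithCallback(callback: *const fn () void) void {
    callback();
}

var calls: u32 = 0;

fn foo() void {
    // A function declared inside an anonymous struct, then pulled out by name.
    // It cannot capture foo's locals; it only sees container-level declarations.
    const bar = struct {
        fn call() void {
            calls += 1;
        }
    }.call;
    fnWithCallback(bar);
}

pub fn main() void {
    foo();
    std.debug.print("calls = {d}\n", .{calls}); // calls = 1
}
```
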
2 Likes

I don’t see a compelling requirement for nested functions. Even in languages that support them, I usually end up lifting them out of the function they’re nested in, anyway.

3 Likes