This and the answer provided by @affine-root-system (I don’t know how to quote multiple users) are in the same class: creating temporary variables, which I consider a barely better sidegrade to the nested parenthesis mess. I’d have to give it another chance, but I remember resorting to this when implementing an algorithm that checked whether a point is inside a triangle, and I didn’t like it at the time.
I wonder, going back to this example:
const z: X = ._(x, .@"+", y);
Imagine if the language could support custom syntax highlighting defined per function.
const X = struct {
    …
    fn @"_" (…) {
        @highlighting(myConfig);
        …
    }
};
This would help with the squinting/brain-training.
In my example I wanted to show you that chaining is possible if you use the ‘object-like’ notation; you can even keep chained operations on separate lines:
const n = a
    .add(b)
    .add(c);
I thought chaining operations was your concern, but this is possible. Also, you don’t need to create temporary variables, you can pass the operands as arguments directly.
This example given by proposal 8204 is closer to the type of problems I deal with:
sigma.mul(sqrt(2 * pi)).recip().mul(x.sub(mu).div(sigma).square().div(2).neg().exp());
Format it across multiple lines and it already becomes more readable; you can also add comments.
I think that works, but I’m iffy on that idea. Rust has a common complaint of type/trait complexity, but the often touted solution is “Are you not using an IDE with syntax highlighting, LSP and deep integration with clippy and rust-analyzer?”. That rebuttal falls apart when you want to read diffs on code forges, which will definitely not have any of those IDE niceties.
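To make the multi-line suggestion concrete, here is a minimal sketch of the same expression spread over several lines with trailing comments. `Num` and its methods are hypothetical stand-ins for whatever type the one-liner assumes (they are not from proposal 8204), and every method here takes a `Num`, so the literal `2` is wrapped with `Num.of`.

```zig
const std = @import("std");

/// Hypothetical scalar wrapper; method names mirror the one-liner above.
const Num = struct {
    v: f64,

    fn of(v: f64) Num {
        return .{ .v = v };
    }
    fn mul(a: Num, b: Num) Num {
        return of(a.v * b.v);
    }
    fn sub(a: Num, b: Num) Num {
        return of(a.v - b.v);
    }
    fn div(a: Num, b: Num) Num {
        return of(a.v / b.v);
    }
    fn square(a: Num) Num {
        return of(a.v * a.v);
    }
    fn neg(a: Num) Num {
        return of(-a.v);
    }
    fn exp(a: Num) Num {
        return of(@exp(a.v));
    }
    fn recip(a: Num) Num {
        return of(1.0 / a.v);
    }
};

/// Normal distribution density, written as one chained expression.
fn gaussian(x: Num, mu: Num, sigma: Num) Num {
    return sigma
        .mul(Num.of(@sqrt(2.0 * std.math.pi))) // sigma * sqrt(2 * pi)
        .recip() // 1 / (sigma * sqrt(2 * pi))
        .mul(x.sub(mu) // x - mu
            .div(sigma) // (x - mu) / sigma
            .square() // ((x - mu) / sigma)^2
            .div(Num.of(2)) // ... / 2
            .neg() // -(...)
            .exp()); // exp(-(...))
}

pub fn main() void {
    const y = gaussian(Num.of(0), Num.of(0), Num.of(1));
    std.debug.print("pdf(0) = {d:.4}\n", .{y.v});
}
```

The comments carry the infix form next to each step, which is about as close as method chaining gets to the mathematical notation.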
Addition: I don’t want operator overloading, I think it hurts readability. A + should not be something else “masqueradable” as an addition. Separating user/lib infix functions from language keywords is important imo, both for readability and compiler implementation.
For my purposes (more binary expression trees like SQL where clauses, less math) approaches from any or all of the comath link or @mg979 or @affine-root-system will all work fine - especially formatted across multiple lines - but I wanted to say thanks for a good discussion here. I think the (valid) pushback here is really refining your arguments for the better. I definitely have a clearer idea what you are trying to do here. If for nothing other than collecting these options in one place, I think it’s a good discussion.
I think if you want a real language change you will probably need to dig deeper into why it was rejected though. I do wonder if removing “usingnamespace” has addressed some of their concerns - cited in the closed issue - and if there’s something to gain from the core team revisiting the question.
But if there’s a solution available without changing the core language, that’d be even better.
As for
sigma.mul(sqrt(2 * pi)).recip().mul(x.sub(mu).div(sigma).square().div(2).neg().exp());
vs
1/(sigma * sqrt(2 * pi)) * exp(-((x - mu)/sigma)^2/2)
I think I would want to be able to write it something like this
1/($sigma * sqrt(2 * $pi)) * exp(-(($x - $mu)/$sigma)^2/2)
and hand that to a comptime type function and get back a struct with these fields (sigma, pi, x, mu) and a single function that returned a new value. Maybe with some options to choose between @panic or return error on overflow. Almost like comath but as a type factory, using it would just be
const value = try Calculation1.calc(.{ .sigma = …, .pi = …, .x = …, .mu = … })
For bonus points, you could then mix that generated function back in to build new expressions where you call Calculation1() as part of another calculation.
Assuming we could finish drawing the rest of the owl, would something like that work in your cases?
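As a rough picture of what such a type factory might hand back, here is the struct written by hand instead of generated from the string. Everything in it is an assumption for illustration: the `Args` layout, the default for `pi`, and the choice to report a zero `sigma` as `error.DivisionByZero` are mine, and no comptime parsing is shown.

```zig
const std = @import("std");

/// Hand-written stand-in for the type a comptime factory could generate from
/// "1/($sigma * sqrt(2 * $pi)) * exp(-(($x - $mu)/$sigma)^2/2)".
const Calculation1 = struct {
    pub const Args = struct {
        sigma: f64,
        pi: f64 = std.math.pi, // assumed default; the caller may override it
        x: f64,
        mu: f64,
    };

    pub fn calc(args: Args) error{DivisionByZero}!f64 {
        // Assumed policy: return an error instead of panicking.
        if (args.sigma == 0) return error.DivisionByZero;
        const z = (args.x - args.mu) / args.sigma;
        return 1.0 / (args.sigma * @sqrt(2.0 * args.pi)) * @exp(-(z * z) / 2.0);
    }
};

pub fn main() !void {
    const value = try Calculation1.calc(.{ .sigma = 1.0, .x = 0.0, .mu = 0.0 });
    std.debug.print("value: {d:.4}\n", .{value});
}
```

The call site matches the shape imagined above; the open question is only whether the struct body can be produced from the string at comptime.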
If you want to do math with both vectors and complex numbers in one function, you need some way to generate a single function that will do overload resolution in userland. This is what the issue’s linked gist demonstrates. The feature feels incomplete without this support library, but also this support library would need to be pretty complicated. The one provided only does overload resolution based on the left hand side, which is likely not enough in practice. It also generates error messages that would be very difficult to diagnose and fix. We think that putting comptime execution and usingnamespace on the path between an operator and the function it calls makes the code more difficult to read and understand, and obscures the code enough that it erases the benefits of looking more like the math.
If I read correctly, the main blocker was related to overloading base operators, which is orthogonal to the usingnamespace problem which was specific to the implementation of that proposal. But the idea I’m pushing forward sidesteps that problem (“you need some way to generate a single function that will do overload resolution in userland”) by denying base operator overloading, instead enforcing userland identifiers for custom infix operators.
I think the rest of that particular owl is already done with comath, here’s the example from its readme:
const std = @import("std");
const comath = @import("comath");
test {
    const ctx = comath.ctx.simple({});
    const value = comath.eval("a * 2", ctx, .{ .a = 4 }) catch |err| switch (err) {};
    try std.testing.expect(value == 8);
}
and the referenced repo zilliam shows how you’d use it to implement infix functions. My gripes are mostly the stringification of identifiers preventing LSP renames, having to implement your own infix operator processing as done in zilliam, and equations being structs rather than functions. Because of that last point, you lose the possibility of tweaking the equation with control flow, which has an analog in SQL: query building from user search parameters, with different tables joined and filters appended depending on params the user provides at runtime.
Now that I’m thinking more about it, I should give comath an honest try on a real project to see if my own problems could be alleviated. But the fact that the core devs still accepted the @Matrix proposal indicates that they’re aware of this issue and consider a comath-type solution unsatisfactory. I just think that it’s a heavy patch/bandaid solution that doesn’t extend well.
You can go further than this:
const Matrix = struct {
    fn @"+"(lhs: Matrix, rhs: Matrix) Matrix {}
};
m3 = m1.@"+"(m2);
With that said, I really like operator overloading. Why not make the @"+" function just be invoked when both sides are the same type, and the type has the @"+" function? This has no confusion regarding which function gets called, which is the argument against function overloading.
If Zig wanted to go to the extreme of only having one way of doing things, then we should add primitives the same way we call functions: const a = 3.@"+"(4);. It’s clear that some amount of compromise needs to be made in this regard, and I think having + invoke one very specific function is part of this nice compromise.
Just wanted to note that as of a few weeks ago, a @Matrix builtin has been accepted and is coming to the language. According to the linked reply by Andrew, the SPIR-V backend necessitated it. It won’t solve all the issues of ergonomics that you outlined, but may offer a few solutions for basic multiplication, addition, etc, similar to @Vector.
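The "same type on both sides, one specific function" rule can be approximated in userland today, which may help when judging it. This is only a sketch: `plus` and `Vec2` are hypothetical names, and the `.@"struct"` spelling of the `@typeInfo` tag assumes a recent Zig release (0.14 or later).

```zig
const std = @import("std");

/// Hypothetical 2D vector that opts in by declaring a function named "+".
const Vec2 = struct {
    x: f32,
    y: f32,

    pub fn @"+"(lhs: Vec2, rhs: Vec2) Vec2 {
        return .{ .x = lhs.x + rhs.x, .y = lhs.y + rhs.y };
    }
};

/// Userland stand-in for the proposed rule. `rhs` must have the same type as
/// `lhs`; structs dispatch to their own @"+", everything else uses builtin +.
fn plus(lhs: anytype, rhs: @TypeOf(lhs)) @TypeOf(lhs) {
    const T = @TypeOf(lhs);
    return switch (@typeInfo(T)) {
        .@"struct" => T.@"+"(lhs, rhs),
        else => lhs + rhs,
    };
}

pub fn main() void {
    const v = plus(Vec2{ .x = 1, .y = 2 }, Vec2{ .x = 3, .y = 4 });
    const n = plus(@as(i32, 3), 4);
    std.debug.print("v = ({d}, {d}), n = {d}\n", .{ v.x, v.y, n });
}
```

Because the right-hand type is pinned to the left-hand type, there is exactly one candidate function per call, which is the property argued for above. A struct without an @"+" declaration simply fails to compile.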
I think that’s the key difference between the comath example and what I’m imagining.
would become
All 100% untested and unfinished and unclear how I’d write more clear argument passing to those internal functions, but since you mentioned
I did want to clarify the thought.
On the other hand, lsp renames sound like a totally unsolvable problem with this approach, and I recognize that I’m totally out of my depth on the actual mathematics side of this, so make of it what you will.
Already you can do
// first parentheses aren't necessary
const m4 = (m1) . @"+" (m2) . @"+" (m3);
// or
const m4 = (m1) . add (m2) . add (m3);
Besides the need for parentheses and dots, syntax is quite easy to look at with some spacing.
Far from the main point of this thread, but @offset you have mentioned several times that comptime for is used for loop unrolling, and that is not true. comptime evaluates the given expression entirely at compile time. It does not unroll loops. The inline keyword is used to unroll loops. Consider this snippet:
comptime var x: comptime_int = 0;
comptime for (0..10) |i| {
    x = i + x * 2;
};
inline for (0..5) |_| {
    std.debug.print("x is {}\n", .{x});
}
For both kinds of loops, the number of iterations must be known at compile time. Both loops could loop over an array with a comptime known value, but an inline for can also loop over an array (note: comptime known length) with a runtime known value. The body of a comptime loop can only operate on comptime data, like I am demonstrating by mutating a comptime var. The body of an inline for loop can do runtime operations, which I am demonstrating by doing IO.
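A small self-contained sketch of that distinction, with names (`sum3`, `table_sum`) made up for the example: the `inline for` walks an array whose length is comptime known but whose values arrive at runtime, while the container-level constant is computed entirely by the compiler.

```zig
const std = @import("std");

/// Sums a fixed-length array whose element values are only known at runtime.
fn sum3(values: [3]u32) u32 {
    var total: u32 = 0;
    // The length (3) is comptime known, so the loop can be unrolled;
    // the body still performs runtime additions.
    inline for (values) |v| {
        total += v;
    }
    return total;
}

/// A container-level initializer is evaluated at compile time, so this loop
/// only ever runs inside the compiler.
const table_sum: u32 = blk: {
    var acc: u32 = 0;
    for ([_]u32{ 1, 2, 3 }) |v| acc += v;
    break :blk acc;
};

pub fn main() void {
    var input = [3]u32{ 10, 20, 30 };
    input[0] += 1; // mutate so `input` is an ordinary runtime variable
    std.debug.print("runtime sum = {d}, comptime sum = {d}\n", .{ sum3(input), table_sum });
}
```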
Postfix, concatenative notation is another alternative to infix, comath-type solutions. A simple example:
const std = @import("std");

// Length in bytes of the UTF-8 sequence that starts with `c` (0 if invalid).
pub fn numberOfBytes(c: u8) u8 {
    if (c & 0x80 == 0x00) {
        return 1;
    }
    if (c & 0xe0 == 0xc0) {
        return 2;
    }
    if (c & 0xf0 == 0xe0) {
        return 3;
    }
    if (c & 0xf8 == 0xf0) {
        return 4;
    }
    return 0;
}

const TokenKind = enum {
    eof,
    number,
    word,
};

const Token = struct {
    kind: TokenKind,
    start: usize,
};

// Splits a space-separated expression into number and word tokens.
const Parser = struct {
    str: []const u8,
    pos: usize = 0,
    message: []const u8 = "",

    pub fn parse(self: *Parser) !Token {
        var t: Token = undefined;
        try self.start(&t);
        return t;
    }

    // Skips spaces, then classifies the next token by its first byte.
    fn start(self: *Parser, t: *Token) !void {
        while (true) {
            if (self.pos >= self.str.len) {
                t.kind = .eof;
                t.start = self.pos;
                return;
            }
            const c = self.str[self.pos];
            const num = numberOfBytes(c);
            if (num == 1 and std.ascii.isDigit(c)) {
                t.kind = .number;
                t.start = self.pos;
                self.pos += 1;
                return try self.number(t);
            } else if (num == 1 and c == '-') {
                t.kind = .number;
                t.start = self.pos;
                self.pos += 1;
                return try self.number(t);
            } else if (num == 1 and std.ascii.isAlphabetic(c)) {
                t.kind = .word;
                t.start = self.pos;
                self.pos += 1;
                return try self.word(t);
            } else if (num == 1 and c == ' ') {
                self.pos += 1;
                continue;
            } else {
                self.message = "expected number or word";
                return error.Parse;
            }
        }
    }

    // Consumes the remaining digits of a number token.
    fn number(self: *Parser, t: *Token) !void {
        _ = t;
        while (true) {
            if (self.pos >= self.str.len) {
                return;
            }
            const c = self.str[self.pos];
            const num = numberOfBytes(c);
            if (num == 1 and std.ascii.isDigit(c)) {
                self.pos += 1;
            } else if (num == 1 and c == ' ') {
                return;
            } else {
                self.message = "expected digit";
                return error.Parse;
            }
        }
    }

    // Consumes the remaining letters of a word token.
    fn word(self: *Parser, t: *Token) !void {
        _ = t;
        while (true) {
            if (self.pos >= self.str.len) {
                return;
            }
            const c = self.str[self.pos];
            const num = numberOfBytes(c);
            if (num == 1 and std.ascii.isAlphabetic(c)) {
                self.pos += 1;
            } else if (num == 1 and c == ' ') {
                return;
            } else {
                self.message = "expected letter";
                return error.Parse;
            }
        }
    }
};

// Evaluates a postfix expression: numbers and `ctx` fields are pushed,
// `add` pops two values and pushes their sum.
fn cat(comptime T: type, comptime expr: []const u8, ctx: anytype) !T {
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();
    const allocator = arena.allocator();
    var stack = std.ArrayList(T){};
    var parser = Parser{ .str = expr };
    while (true) {
        const t = try parser.parse();
        switch (t.kind) {
            .eof => {
                break;
            },
            .number => {
                const num = try std.fmt.parseInt(T, expr[t.start..parser.pos], 10);
                try stack.append(allocator, num);
            },
            .word => {
                const name = expr[t.start..parser.pos];
                if (std.mem.eql(u8, name, "add")) {
                    const b = stack.pop().?;
                    const a = stack.pop().?;
                    const c = add(a, b);
                    try stack.append(allocator, c);
                } else {
                    // Look up `name` as a field in `ctx`
                    var pushed = false;
                    inline for (std.meta.fields(@TypeOf(ctx))) |field| {
                        if (std.mem.eql(u8, field.name, name)) {
                            const val = @field(ctx, field.name);
                            // Coerce to T if needed
                            const as_t: T = @as(T, val);
                            try stack.append(allocator, as_t);
                            pushed = true;
                            break;
                        }
                    }
                    if (!pushed) {
                        return error.InvalidWord;
                    }
                }
            },
        }
    }
    return stack.pop().?;
}

fn add(a: i64, b: i64) i64 {
    return a + b;
}

pub fn main() !void {
    const value = try cat(i64, "1 2 add 3 add x add", .{ .x = 4 });
    std.debug.print("value: {}\n", .{value});
    // value should be 10
}
I can do a matrix example later, which will also be ergonomic. Complex numbers and tensors work similarly.
Thank you, I knew I messed up somewhere.
@vulpesx Turns out I caused the whole misunderstanding, and was proven wrong. I had inline for in mind when I was talking about comptime for and you were correct saying that comptime for can only be run in comptime context.
Oh, that’s an interesting problem. It has almost nothing to do with infix functions, but I can see the “composing” aspect of it.
Rather than passing arguments into internal functions (sounds awfully close to prop drilling), perhaps instead expand the internal functions? So that
const f3 = GeneratedFinal.f3.eval;
looks like this under the hood:
"(a * 2) + (b + 45)"
In your design, you still run into the issue of what happens if you define variables with the same name in different functions.
const GeneratedLayer1 = CreateOwlHere(.{
    .f1 = "a * 2",
    .f2 = "b + 45",
    .f4 = "8 - a",
},
...);
const GeneratedFinal = CreateOwlHere(.{
    .f3 = "f1 + f2 + f4",
    .{
        .f1 = GeneratedLayer1.f1,
        .f2 = GeneratedLayer1.f2,
        .f4 = GeneratedLayer1.f4,
    },
}, ...);
now at the call site, when you do
const value2 = try f3(.{ .a = 4, .b = 5 });
How do you differentiate between the a that goes to f1 and the a that goes to f4? You end up having to leak implementation details in any scenario, which limits how far the composability scales.
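One way to picture the problem is to write the generated layers by hand and namespace each sub-expression’s arguments under its own name. `F1`, `F2`, `F4`, `F3` and their `Args` types below are hypothetical stand-ins for whatever CreateOwlHere would produce; the sketch only shows what the call site ends up looking like.

```zig
const std = @import("std");

// Hand-written stand-ins for the generated layer: "a * 2", "b + 45", "8 - a".
const F1 = struct {
    pub const Args = struct { a: i64 };
    pub fn eval(args: Args) i64 {
        return args.a * 2;
    }
};
const F2 = struct {
    pub const Args = struct { b: i64 };
    pub fn eval(args: Args) i64 {
        return args.b + 45;
    }
};
const F4 = struct {
    pub const Args = struct { a: i64 };
    pub fn eval(args: Args) i64 {
        return 8 - args.a;
    }
};

// "f1 + f2 + f4": each inner function gets its own argument namespace,
// so the two `a`s no longer collide.
const F3 = struct {
    pub const Args = struct { f1: F1.Args, f2: F2.Args, f4: F4.Args };
    pub fn eval(args: Args) i64 {
        return F1.eval(args.f1) + F2.eval(args.f2) + F4.eval(args.f4);
    }
};

pub fn main() void {
    const value = F3.eval(.{
        .f1 = .{ .a = 4 },
        .f2 = .{ .b = 5 },
        .f4 = .{ .a = 7 },
    });
    std.debug.print("value: {d}\n", .{value});
}
```

The collision is gone, but the caller now has to know that f3 is built from f1, f2 and f4, which is exactly the implementation leak described here.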
Yep, that’s what I meant by “unfinished”. I can imagine you might intend to share one “pi” across the whole expression but you might want two separate “a”s and you would have to give it a name. That extra info would either have to go into the comptime text being parsed or a separate mapping passed in. Plus any other code-gen options. I honestly don’t know if the composable version is worth it, I just suspect it could be done. And that’s what I was thinking when I referenced creating “the rest of the f*** owl”.
I think this could at least get far enough to turn the string “1/(sigma * sqrt(2 * pi)) * exp(-((x - mu)/sigma)^2/2)” into a plain function by building on comath, they did the hard part, I’m just considering the api a little differently.
I think a primary consideration here is that once you have operator overloading or infix functions, you tend to write functions at the level of a single vector, matrix or tensor. When you get into any serious engineering you’re probably looking at large systems of these types, and there you want to be writing code that operates on large collections in efficient ways.
This was how I took @tholmes’s comment:
To put it another way… The desire to operate on the low-level structure is at odds with data-oriented design, which is a key philosophy behind the design of Zig. Zig is trying to nudge you down paths which produce code the CPU will work on as efficiently as possible.
Now infix functions are a much better idea than operator overloading, but it’s just syntactic sugar. My problem is that unless you allow infix functions to be named symbolically (à la Haskell), you end up with very unwieldy expressions. So now you need new rules around new identifier forms. You also have the precedence problem to solve. How do you define the precedence of custom infix functions with respect to each other and the existing +, -, *, /, etc…? Edit: Also associativity
Precedence among custom infix functions could be handled by just putting them all on the same precedence level. How they would relate to all the built-in operators would be a bit harder.
But details like this are something you would discuss AFTER agreeing that the feature should be added, so as not to derail the discussion.
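To see what “all on the same precedence level” implies, here is a small sketch of an evaluator for space-separated named operators that applies them strictly left to right. The operator names `plus` and `times` and the function `evalFlat` are made up for the example; nothing here is proposed syntax.

```zig
const std = @import("std");

const Op = *const fn (i64, i64) i64;

fn plus(a: i64, b: i64) i64 {
    return a + b;
}
fn times(a: i64, b: i64) i64 {
    return a * b;
}

/// Maps an operator name to its function; null if the name is unknown.
fn lookup(name: []const u8) ?Op {
    if (std.mem.eql(u8, name, "plus")) return &plus;
    if (std.mem.eql(u8, name, "times")) return &times;
    return null;
}

/// Evaluates "operand op operand op operand ..." with every operator on one
/// precedence level and left associativity.
fn evalFlat(expr: []const u8) !i64 {
    var it = std.mem.tokenizeScalar(u8, expr, ' ');
    const first = it.next() orelse return error.Empty;
    var acc = try std.fmt.parseInt(i64, first, 10);
    while (it.next()) |name| {
        const op = lookup(name) orelse return error.UnknownOperator;
        const rhs_text = it.next() orelse return error.MissingOperand;
        const rhs = try std.fmt.parseInt(i64, rhs_text, 10);
        acc = op(acc, rhs);
    }
    return acc;
}

pub fn main() !void {
    // Evaluated as (1 plus 2) times 3 = 9, not 1 + (2 * 3) = 7.
    const value = try evalFlat("1 plus 2 times 3");
    std.debug.print("{d}\n", .{value});
}
```

A single level is easy to specify, but as the example shows it gives results that differ from conventional arithmetic precedence, so readers would need parentheses wherever the two disagree.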
I’ve been watching this discussion from afar; as a math major who uses Zig a lot, I’m interested in the topic.
I feel like I have to intervene on the following:
No! Things like that should be discussed while deciding whether or not to accept a solution! In fact, I agree with the person you’re replying to that the proposed solution, in its current form, would not allow operator precedence and/or associativity (or at least it would not be clear).
Personally, writing a.mul(b) has gone a long way for me in adding code clarity for non-associative operations. Adding operator overloading would add a layer of misdirection, whatever its form.