Fluent: Algorithmic Chaining Library for Slice Manipulation

AndrewCodeDev · April 16, 2024, 4:04pm

I’m very happy to share version 1.0 of a library that @pierrelgol and I collaborated on. Our efforts resulted in a library called Fluent:

As the name suggests, it is an implementation of the fluent interface, allowing you to chain algorithms in succession and accomplish tasks succinctly. Fluent has many algorithms built in and can be used on any numeric-type slice and has special functionality for string manipulation.

Fluent interfaces are lightweight (the size of a slice), comptime deduced, and never allocate. The entire library is one file that can be imported into your project much like std.ArrayList.

Fluent Interface:

Here’s an example of using the fluent interface to concatenate two strings, trim spaces from both sides, and then capitalize the first letter of each word:

const str_a: []const u8 = "    hello,";
const str_b: []const u8 = " world!   ";
var buf: [64]u8 = undefined;

const result = Fluent.init(str_a)       // initialize our interface on str_a
        .concat(str_b, buf[0..])        // concatenate str_b into buffer
        .trim(.periphery, .scalar, ' ') // trim spaces on both sides
        .title();                       // python title function

// based on std.math.Order, could also be result.order(...) == .eq
try std.expect(result.equal("Hello, World!"));

Here we take a sum of squares and then take the square root:

const magnitude = std.math.sqrt(
    Fluent.init(floats[0..]).mapReduce(f32, sqr, add, 0.0)
);

I also like fluent for one-off functions - sorting is easy:

_ = Fluent.init(slice).sort(.ascending);

Ccompute sigmoid function:

const x = Fluent.init(buf[0..])
    .copy(&[_]f32{ -2, -1, 0, 1, 2 })
    .map(.{
        Fluent.negate,
        std.math.exp,
        Fluent.bind(.{ 1.0 }, Fluent.add),
        Fluent.bind(.{ 1.0 }, Fluent.div),
    });

Iterators:

Fluent also comes equipped with a composable iterator backend. Iterators can be used in conjunction with fluent interfaces or independently as a stand-alone utility. Here’s a few examples…

Copy the reverse of a list using a reverse iterator:

const count = Fluent.iterator(.reverse, items_a[0..]).write(items_b[0..]);

Sum up the square of all elements that are even:

const rdx = Fluent
    .iterator(.forward, items[0..])
    .filter(isEven)
    .map(square)
    .reduce(i32, Fluent.add, 0);

Set stride to increment by two 2 and produce windows of size 4:

var itr = Fluent
    .iterator(.forward, items[0..])
    .strided(2)

while (itr.window(4)) |window| { // ...

Fuse multiple unary functions and filters:

var itr = Fluent
    .iterator(.forward, items[0..])
    .map(.{
        Fluent.negate,
        std.math.exp,
    }).filter(.{
        skipNans,
        skipInfs,
    });

while (itr.next()) |value| { //...

The backends of Fluent are split into Mutable and Immutable variants. The mutable variants pick up all of the functions that the immutable ones have (thus mutable strings can do everything that const strings can such as search, compare, etc).

If a function is marked as immutable, the function cannot modify its current slice. It can, however, return a modified slice that contains the result of the operation. Trim, for instance, doesn’t remove anything, but instead returns a new fluent interface that holds a slice with adjusted boundaries. Concat and join are also considered immutable as they do not modify the original slice, but instead write their results to a provided buffer and return a fluent interface attached to that buffer.

Here’s a list of the algorithms we currently have:

Immutable:

all          - check if all elements of the acquired slice are true by given predicate
concat       - appends the aquired slice to a given slice into a given buffer
contains     - check if contains a given scalar, sequence, or any
containsFrom - check if contains a given scalar, sequence, or any after a given index
count        - counts all, leading, trailing, until, inside, inverse of scalar, sequence, any
endsWith     - checks if the acquired slice ends with a scalar, sequence, or any
equal        - returns true if lexicogrpahical order is equal to a given slice
find         - returns first index of scalar, slice, or any
findFrom     - returns first index after a given position of scalar, slice, or any
getAt        - returns an element for given positive or negative index
join         - appends the aquired slice to a given range of slices into a given buffer
mapReduce    - applies unary function and reduces on intial value and binary function
max          - returns an optional maximum value from the acquired slice
min          - returns an optional minimum value from the acquired slice
none         - check if no elements of the acquired slice are true by given predicate
product      - returns the product of all elements or zero if slice is empty
print        - prints the acquired slice based on a given format string
order        - returns the lexicographical order compared to a given slice
reduce       - returns a reduction based on an intial value and binary function
slice        - chainable slicing operation for acquired slice
startsWith   - checks if the acquired slice starts with a scalar, sequence, or any
sample       - randomly samples a range from the acquired slice given a size
sum          - returns the sum of all elements or zero if slice is empty
trim         - trims left, right, periphery of scalar, sequence, any
write        - writes the acquired slice to a given buffer

Mutable:

copy         - copy a given slice into the acquired slice
fill         - fills the acquired slice with a scalar value
map          - transforms every elment in the acquired slice with a given unary function
partition    - partiions the acquired slice based on predicate in stable or unstable manner
replace      - replaces slice, sequence, or any at left, right, periphery or all
reverse      - reverses the acquired slice
rotate       - rotates the array by both negative and positive amounts
setAt        - sets a given position with a provided value using index wrapping
shuffle      - randomly shuffles the acquired slice
sort         - sorts the range in ascending or descending order

And for strings:

Immutable:

digit          - returns optional integer parsed from string
differenceWith - returns set diference between acquired slice and given slice
float          - returns optional floating-point number parsed from string
getToken       - extract a token given a set of delimiters
intersectWith  - returns set intersection between acquired slice and given slice
isDigit        - check if string only contains digits
isAlpha        - check if string only contains alphabetic characters
isSpaces       - check if string only contains whitespace
isLower        - check if string only contains alphabetic lower case
isUpper        - check if string only contains alphabetic upper case
isHex          - check if string only contains hexidecimal characters
isASCII        - check if string only contains ASCII characters
isPrintable    - check if string only contains printable characters
isAlnum        - check if string only contains alpha numeric characters
unionWith      - returns set union between acquired slice and given slice

Mutable:

lower       - transform all alphabetic characters to lower case
upper       - transform all alphabetic characters to upper case
capitalize  - transform first character to upper case and rest to lower case
title       - capitalize each sequence separated by spaces

We also support the following iterators (from the standard library):

split       - splits a sequence on a given delimiter
tokenize    - tokenizes a sequence on a given delimiter

The current plan for Fluent is to gather feedback, optimize and upgrade algorithms, integrate more SIMD over time and add functionality.

All in all, I had a great time putting this together with @pierrelgol and I think we made a neat little swiss-army knife that you might enjoy as well. Thanks everyone

pierrelgol · April 16, 2024, 4:15pm

Thanks for everything @AndrewCodeDev , I had a blast too, and I can’t wait to maintain that with you. I’m happy with how it turned out, and I have a lot of ideas that I’d like to get some feedback on, but that’s maybe for another time. Anyway just like you said I’d be happy to hear some feedback about it so anyone, feel free

gonzo · April 16, 2024, 4:38pm

This looks amazing, great work @AndrewCodeDev and @pierrelgol!

pierrelgol · April 16, 2024, 8:39pm

thanks a lot

cancername · April 16, 2024, 9:15pm

Extremely pog! Will definitely consider using your awesome library.

mntnmntn · April 17, 2024, 1:04am

Looks nice! Reminds me of underscore.js.

gonzo · April 17, 2024, 8:38am

[Tangentially related question]

I have been waiting for the package manager to “come into its own” – is it there yet? For example, say I wanted to add Fluent to a couple of projects of mine – could I do this by adding a small configuration file per project, without a specific build.zig, and have Fluent magically be downloaded (and maybe kept up to date) without me having to copy Fluent’s single file everywhere?

AndrewCodeDev · April 17, 2024, 8:43am

That’s a great question actually. I was planning on looking into this more this week to figure out what’s the best option to release a package for this kind of thing. Essentially, you’re right - it’s not like an app that you download and install.

I’d really like to hear more people’s thoughts on this, so thanks for bringing up the question. As soon as we know, I’ll get something together if I need to do something on my end.

Sze · April 17, 2024, 10:41am

The short answer is no (to magic).
The longer answer is, it works well, but you need to do some things yourself.

The build.zig is the configuration file, together with the build.zig.zon.

No magic allowed, only things working in reasonable and understandable ways.
It gets downloaded automatically, but you are responsible for updating.
(Somebody could build tooling to make updating easier)

That isn’t necessary, but you need to use the build.zig and build.zig.zon.

For more explanation and an example repo take a look at:

I really think you shouldn’t avoid the build.zig and build.zig.zon and just get used to them and how they work, better then reinventing half-baked solutions via config files, that break down as soon as you want something a little bit more complex. Or when your config file tool, no longer is actively maintained.

tgirod · April 17, 2024, 12:34pm

okay I’ve just started looking at the code and its really interesting ! love the way methods are injected at comptime with usingnamespace.

gonzo · April 17, 2024, 12:34pm

Hey @Sze thanks for the reply, that is informative.

When I said “with a config file, without a build.zig”, I was thinking that it would be ideal to only have to add build.zig.zon; this way, if all you are doing is adding external dependencies to your project, they would be “magically” read from the .zon file, without having to craft a build.zig just for this purpose.

msw · April 17, 2024, 12:48pm

are you aware of how clojure runs let’s call it “algorithmic concatenation” with “reducers”? Clojure - Reducers ?
The most interesting part, IMO, is preventing production of intermediary result collections, so instead of doing f(g(x)) as a mapping through the sequences, you map (f o g)(x) through it, for the values f “produces”. In terms of zig, a comptime sequence of functions could be an argument to a fluent operator operating on a slice, couldn’t it, as it includes all the necessary step-and-collect logic and structure?

AndrewCodeDev · April 17, 2024, 12:54pm

Yes, I’ve thought about that quite a bit.

Currently, we have some support for that using the mapReduce so you don’t have to write to intermediate results. You can also use map to pass any unary transformation to it and that can be as complex as you want. So you can directly pass in f after g to map.

That said, certain operations won’t allow that - such as sort:

// get top 10 results
const x = Fluent.init(items)
    .sort(.descending)
    .slice(0, 10);

In that case, we need the actual sorted range - the intermediate results are a necessary part of the process here.

That said, I am interested in doing more of what you’re suggesting because I absolutely see the value in that. We have a few points of entry for that at the moment, but more work on that could be done in the future

AndrewCodeDev · April 17, 2024, 1:02pm

I’d also like to point out that part of what we’re doing in the next release is a function adapter section that will allow you to do comptime binds, compositions, and argument adapters. I have a very simple one called adapt in Fluent currently, but the plan is to expand that quite a bit. We need those because many operations in the standard library are sort of “psuedo-binary”.

std.math.add for instance actually takes 3 arguments because the first is the type. One idea here is to detect that pattern and automatically handle it, but function adapters are definitely on the horizon.

Edit -

It’s also important to think about SIMD too - we’re going to add more of that as we go along and provide special operations for when SIMD is applicable. So sum, product, max, and min, are all SIMD right now. You could do a reduce to get those results, but by my measurements it’s slower. We should maintain a generic pipeline, certainly, but we also want to provide a special pipeline for operations that are SIMD friendly.

Also, iterators. Transform iterators, reverse iterators for things like windowing, etc. Currently, this allows us to do things like:

var itr = Fluent.init(items).window(4, 1); // advance in groups of 4
while (itr.next()) |w| {
    const x = Fluent.init(w)
        //.something...
        //.something...
}

So now we can operate over chunks of the slice in a more lazy, piecewise way.

AndrewCodeDev · April 17, 2024, 1:26pm

Thanks! I personally try to avoid exposing functions that don’t work for a type - that’s why we decided to go in that direction.

AndrewCodeDev · April 18, 2024, 10:54am

@msw, update here on the fusing functions - we just pushed unary function composition for transforms:

// create iterator that filters on alpha-numeric hex
var itr = Fluent.init("Fx01_a_88_qq_9_BC").filter(.{
        std.ascii.isHex,
        std.ascii.isAlphabetic,
    });

// compute sigmoid function
const x = Fluent.init(buf[0..])
    .copy(&[_]f32{ -2, -1, 0, 1, 2 })
    .map(.{
        Fluent.negate,
        std.math.exp,
        Fluent.bind(.{ 1.0 }, Fluent.add),
        Fluent.bind(.{ 1.0 }, Fluent.div),
    });

var itr = Fluent.init(&[_]f32{ 
    1, 2, 3, 
    4, 5, 6, 
    7, 8, 9, 
}).window(3, 3);

// apply foo after bar for each row element and reduce row
while (itr.next()) |w| {
    const baz = Fluent.init(w).mapReduce(f32, .{ foo, bar }, add, 0.0);
}

Basically, we’re unrolling all the unary functions with inline calls with no intermediate storage between them. bind currently only does front binding, but I’m working on doing back binding too depending on the position of the arguments.

Also added a transform iterator that applies a function or chain of functions to single elements.

AndrewCodeDev · April 20, 2024, 3:43pm

One more update here - we’ve implemented our iterator branch as an extension to Fluent’s interface. Iterators can be used from a fluent interface or by themselves using the iterator function.

Copy the reverse of a list using a reverse iterator:

const count = Fluent.iterator(.reverse, items_a[0..]).write(items_b[0..]);

Sum up the square of all elements that are even:

const rdx = Fluent
    .iterator(.forward, items[0..])
    .filter(isEven)
    .map(square)
    .reduce(i32, Fluent.add, 0);

Set stride to increment by two 2 produce windows of size 4:

var itr = Fluent
    .iterator(.forward, items[0..])
    .strided(2)

while (itr.window(4)) |window| { // ...

Fuse multiple unary functions and filters:

var itr = Fluent
    .iterator(.forward, items[0..])
    .map(.{
        Fluent.negate,
        std.math.exp,
    }).filter(.{
        skipNans,
        skipInfs,
    });

while (itr.next()) |value| { //...

Also, Fluent now provides convenience types so you can easily equip your slice-based data structures with iterators:

// inside user defined type
pub fn iterator(
    self: Self, 
    mode: Fluent.IteratorMode
) Fluent.BaseIterator(T, mode) {
    return Fluent.iterator(mode, self.items);
}

And with that, we’re taking a little break for a while to let the dust settle and hopefully people can give it a fair chance and give us constructive feedback. Thanks again

natalie-o-perret · April 21, 2024, 11:37am

Amazing work!

AndrewCodeDev · April 21, 2024, 11:37am

Thank you, and welcome to Ziggit

pierrelgol · April 21, 2024, 11:55am

thanks for the kind words