Basic Question on Allocation

This is perhaps too vague a question, but it’s on how to approach memory and such in Zig.

My background/understanding is that we have the stack, where we usually store transient state. However, some state must be dynamically allocated: either when the allocation amount is unknown or when the lifetime of a resource isn't bound to a function's scope. Hence why there is malloc in C. Malloc is a general-purpose allocator that handles requesting and tracking memory from the OS, so the idea with custom allocators is to have an allocator that suits your use case.
However, I’ve been unable to wrap my mind on how to properly do that.

In my mind, I'm either going to be exact about memory or I'm going to at least have an upper bound. E.g., if my problem has a fixed memory size requirement, since CPU cycles are cheap, calculate the maximum size required and just allocate that amount at once.

Alternatively, if computing the exact amount would be prohibitively expensive but some cheaper calculation yields an upper bound, I could see getting the upper bound and allocating that.

Is that the normal approach most people take when writing Zig, with all its low-level concerns, or is it fine to play more fast and loose, since (sometimes?) the memory isn't actually grabbed immediately? E.g., you have the general purpose allocator (GPA) and people just alloc away.

What prompted my question is that the most memory-efficient option that comes to mind, calculating it all out beforehand, "feels" very ugly. But if that's how it has to be, I'd accept that.

2 Likes

You seem to understand exactly the way Zig wants you to think about memory.

I'm confused about what your question is. Are you confused about the point of custom allocators? About passing around the interface? Or about what specific allocators do?

Also, I think this makes more sense in the Explain category; Help is more for practical help with code, so I changed it for you. The category descriptions should probably be easier to differentiate.

3 Likes

Do you have some specific code examples in mind that you think are “ugly”?

I think of it as a trade-off between local and global complexity.

Allocating as you go is "locally nice", in that you only have to think about memory when you actually need it, but it becomes "globally ugly" very fast: every allocation needs to be tracked, you've got a lot of call sites to check for memory handling errors, etc. It's easy for subtle bugs to sneak in, and those bugs can be hard to catch. As your codebase grows, the global effects start to outgrow the local ones.

I think “just thinking about it” puts you in Zig’s good graces here. The fact that you’re aware that there are different paradigms and that they have trade-offs is exactly what it wants. Picking one is your responsibility as the designer of the code.

Short-running CLI tools can just leak everything and panic on allocation failure, for instance. That is a choice Zig allows you to make; it's no less valid than managing memory "properly" in the right codebase, but it's obviously not the right choice for every codebase.

4 Likes

I think there are three cases here:

  1. There's a static upper bound, and I know it.
  2. There's a static upper bound, but I can't be bothered to know it.
  3. The bound is dynamic and depends on the runtime input.

1&2 are similar; 3 is different. A good separating example here is a key-value store (a server that exposes PUT key value and GET key endpoints). If keys and values are ultimately stored on disk, you have an upper bound on the memory required to process a single request, and, factoring in concurrency, you can get an upper bound on memory overall. However, you can also imagine an in-memory KV store, where the memory requirement is the cumulative result of all PUTs and you can't really predict an upper bound (you can impose one artificially, but there isn't a natural one).

For 1&2, what I've learned from TigerBeetle is that though computing upper bounds feels like it should be complicated, it's actually surprisingly easy in practice.

The key insight is that the upper bound is compositional: you don't actually need to know the upper bound, in bytes, for the entire system. You only need to make sure that each subsystem has some upper bound, and then, by composition, the entire system will have some upper bound. And establishing the existence of an upper bound for a subsystem is usually easy: everything you use directly is straightforward, and for indirect dependencies you only need the "existence" of the upper bound, not its numeric value.
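A minimal sketch of that composition, with made-up subsystem names and sizes; each piece declares only its own worst case, and the system-wide bound falls out at comptime:

```zig
const std = @import("std");

// Hypothetical subsystems; the names and numbers are invented.
// Each declares only its own worst case.
const Request = struct {
    const max_key_len = 256;
    const max_value_len = 4096;
    const max_bytes = max_key_len + max_value_len;
};

const Server = struct {
    const max_concurrent_requests = 64;
    // The system-wide bound exists by composition; nobody had to
    // hand-compute this number.
    const max_bytes = Request.max_bytes * max_concurrent_requests;
};

pub fn main() !void {
    // Allocate the worst case once, up front.
    const buffer = try std.heap.page_allocator.alloc(u8, Server.max_bytes);
    defer std.heap.page_allocator.free(buffer);
    var fba = std.heap.FixedBufferAllocator.init(buffer);
    const allocator = fba.allocator();
    _ = allocator; // everything after startup allocates from here
}
```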

7 Likes

@vulpesx It's more that I'm questioning how to use allocators correctly, since I always run into something that makes me go "wait, is this as efficient as this gets?"

@tsdtas Not code per se, but the concept of what happens.
Consider a string that you are parsing. In my mind, I can pre-process this string once over to calculate the number of Alpha-elements I’ll need to store so I only request memory once. Then, I can just allocate that array of Alpha-elements and reparse the string, storing into my pre-allocated array as I go.

Now, up to that point, I have no problems except:

  1. A shame I can’t store this on the stack. For some reason my mind jumps to “variable length array”?
  2. I'm making the trade-off of CPU calculation cycles vs. memory allocation.
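For concreteness, the count-then-fill pattern described above might look like this (a hedged sketch; extracting digits stands in for the "Alpha-elements"):

```zig
const std = @import("std");

// Pass 1 counts, a single alloc reserves the exact amount,
// pass 2 fills. Digits stand in for the "Alpha-elements".
fn collectDigits(allocator: std.mem.Allocator, input: []const u8) ![]u8 {
    var count: usize = 0;
    for (input) |c| {
        if (std.ascii.isDigit(c)) count += 1;
    }
    const out = try allocator.alloc(u8, count); // the only allocation
    var i: usize = 0;
    for (input) |c| {
        if (std.ascii.isDigit(c)) {
            out[i] = c;
            i += 1;
        }
    }
    return out;
}

test collectDigits {
    const digits = try collectDigits(std.testing.allocator, "a1b22c3");
    defer std.testing.allocator.free(digits);
    try std.testing.expectEqualStrings("1223", digits);
}
```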

The issue arises at the next part.
To continue this pattern, assume we then need to utilize the data from these Alpha-elements (not their quantity), to know how many Beta-elements there are.

This is when I feel stuck between a rock and a hard place conceptually.

  • I can't pre-compute the number of Beta-elements before allocating and storing the Alpha-elements (unless one wants to calculate every single possible combination of potential Alpha-elements to get an upper bound for Beta-elements, which falls under being prohibitively expensive).
  • I can pre-compute the number of Beta-elements after having stored the Alpha-elements but before allocating space for the Beta-elements, such that I know the exact number of Beta-elements to request memory for. However, although that grants me contiguous memory for the Beta-elements, I "feel" like I messed up, since now I have two separate strips of memory when I'd hoped for just one.
  • Thus, the only way to avoid two syscalls/two separate strips would be to allocate an absurd amount at the start, so that allocator.alloc will never fail or need to allocate more in most reasonable cases. But that runs counter to the idea of minimizing memory usage.

As for local vs. global, yeah. The benefit of it being an application and not a library is that I can determine the fixed allocator to use depending on how this should go, and I don't have to account for people's different requirements.

@matklad Referencing the aforementioned example, I think what I have falls under 3. But, on the subject of your comment, you mention just establishing the existence of an upper bound (not necessarily having to compute it). Does this line of reasoning have anything to do with the idea of memory requested vs. memory committed? I know I've read somewhere about "request a ton of memory, but you don't have to worry, as it won't be used up until you start committing."

1 Like

This depends on how the Alpha data affects the Beta-element count. But if that statement is true, then there is no way around having separate, delayed allocations.

You should practically never need to calculate every possible combination to get an upper bound; there is almost certainly a worst case at most points of contention.

Allocation does not always mean a syscall. The OS provides memory in somewhat large chunks called pages (usually 4 KiB) at a time, and allocators will try to reuse them as much as practical. There are exceptions: since Zig supports custom allocators, there are specialty ones such as std.heap.page_allocator, which does make a syscall every time; other allocators typically build on top of it.


I would like to point out that while an upper bound is ideal, a reasonable bound should also be considered. Defaulting to the upper bound is not optimal if hitting it is rare and it is large.

For example, Zig allocates 80% (I think) of the source file size for tokens or the AST (can't remember which), as that catches the majority of code, but it does allow growth for the rare case that needs more than that.
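A sketch of that "reasonable bound, growth allowed" pattern (the 80% ratio here is just the figure mentioned above, not a measured value):

```zig
const std = @import("std");

pub fn main() !void {
    var gpa_state = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa_state.deinit();
    const gpa = gpa_state.allocator();

    const source_len: usize = 10_000; // e.g. bytes of source text

    var tokens: std.ArrayListUnmanaged(u32) = .{};
    defer tokens.deinit(gpa);

    // Reserve the typical case up front; appending past the reserved
    // capacity still works, it just triggers the rare growth path.
    try tokens.ensureTotalCapacity(gpa, source_len * 8 / 10);
    try tokens.append(gpa, 42);
}
```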

2 Likes

I think there’s a case you’re missing here:

  • The allocation is too large for the stack. This normally coincides with it having a long lifetime, but not always. The normal process limit on Linux for stack is just 8MB, so stack usage is really just for small things.

I’m wondering if a future enhancement for Zig might be the use of a separate (and potentially larger) “data stack” which is only for scope variables, and leave the system stack for return addresses only (…a bit like having a shadow stack but without the redundancy). That way, data and control flow structures are separated in memory, making it impossible to corrupt return addresses by buffer overflows and similar attacks. Whenever you call C you’d have to return to C calling conventions of course.

2 Likes

There’s a fourth one:
4. There’s no static upper bound, but I will enforce one for convenience.

The standard library likes to do this with file names, for example. In most places, allocating to meet the actual runtime-known upper bound for the system is just inconvenient (this is no longer an issue with `std.Io` but was before it), and the stdlib decides to not support that use case.

1 Like

I think you hit the nail on the head here. Contrary to what other languages lead you to believe, in the vast majority of cases heap allocation is not necessary.
Everything has an upper bound. In the worst case, the upper bound is std.math.maxInt(usize). But when the average case requires far less memory than the maximum, that's when heap allocation is useful.

Some C libraries are an allocation galore for this reason. In the vast majority of cases where heap allocation would be used for this, the best thing to do is to put the variable on the stack somewhere higher up in the call stack and pass a pointer to it. Zig does it right.

Ada has a secondary stack (in practice; the standard does not require it). It doesn't quite work the way you describe, but it's similar. Zig is unlikely to add this, but it's interesting to think about: an approach to handling RAII of variable-sized objects, perhaps worth reviving in some other language that uses RAII.

I think it would make sense if the arena had an easy way to provide an initial capacity. Then one possibility would be to use an arena for allocation and call:

arena.reset(.{ .init_capacity = estimated_upper_bound }) // NOTE this mode isn't implemented

On it to initialize it to the estimated size. Then, if it turns out that your allocations are bigger than your estimate, the arena will simply grow by adding a new region of memory; in cases where your estimate is big enough, you end up with a single memory region.

I thought retain_with_limit would work, but @Allocate_this correctly pointed out that it only uses the minimum of what was previously allocated and the passed limit; what I was suggesting was a way to set the capacity explicitly.

retain_with_limit: usize

This is the same as retain_capacity, but the memory will be shrunk to this value if it exceeds the limit.

retain_capacity
This will pre-heat the arena for future allocations by allocating a large enough buffer for all previously done allocations. Preheating will speed up the allocation process by invoking the backing allocator less often than before. …


There are cases where you can't predict the final size and can only estimate it for typical inputs. I think in those cases it is fine to end up with a few separate allocations: the arena already allocates in batches (instead of every few bytes on their own), while keeping the allocator interface that lets you request individual bytes of those batches; it keeps those batches at stable memory addresses, and automatically allocates new, bigger batches when the existing ones aren't big enough.

Regarding fragmentation and compaction:
I think you have to decide whether that is a big issue or not, I think if you use an arena and it ends up allocating a few batches, that this won’t be a big issue in a lot of cases.

If it is an issue, you usually know it, because your code specifically only works with a single compact memory region. In that case you could also have a secondary/later pass that transforms the arena-allocated memory into a single region without any fragmentation, once you have read everything in (but I would only do that if it is worth it and you have a good reason for it, so it would make sense to measure/benchmark your use case, and also different allocators).
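That later compaction pass might look something like this (a hedged sketch; `pieces` stands for slices previously handed out by the arena, copied once into one dense region owned by the parent allocator):

```zig
const std = @import("std");

// Copy arena-allocated pieces into a single contiguous region owned
// by the parent allocator; the arena (and its fragmentation) can then
// be freed wholesale.
fn compact(parent: std.mem.Allocator, pieces: []const []const u8) ![]u8 {
    var total: usize = 0;
    for (pieces) |p| total += p.len;
    const dense = try parent.alloc(u8, total);
    var offset: usize = 0;
    for (pieces) |p| {
        @memcpy(dense[offset..][0..p.len], p);
        offset += p.len;
    }
    return dense;
}

test compact {
    var arena = std.heap.ArenaAllocator.init(std.testing.allocator);
    defer arena.deinit();
    const a = try arena.allocator().dupe(u8, "hello ");
    const b = try arena.allocator().dupe(u8, "world");
    const dense = try compact(std.testing.allocator, &.{ a, b });
    defer std.testing.allocator.free(dense);
    try std.testing.expectEqualStrings("hello world", dense);
}
```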

In cases where you read, process and then write the data: you could probably just live with some level of fragmentation and then save the data in a dense format, thus getting rid of the fragmentation while saving / serializing.


Overall I would think that using an arena could save you some memory over something like malloc (which has hidden fragmentation and metadata), because you get automatic batching instead of individual allocations (which in the worst case can cause lots of fragmentation). With other allocators it is difficult to make general claims without knowing the specific scenario, especially in dynamically sized cases.

3 Likes

FWIW the arena already offers stack-like allocation and deallocation. It's very close to that dedicated look-aside allocator idea, but explicit like any other allocator, and it lives in user space, so no special language support is needed.

1 Like

Very interesting read from everyone in this thread. I too was going to bring up the case in which the amount you need to allocate is too great to reasonably place on the stack. While 8 MB is a somewhat arbitrary stack size and you can change it, expecting consumers to change their stack size is a terrible idea. In my projects I try not to use heap allocation as much as possible, so for places where I need data to persist beyond its scope I just use a global buffer and pass that to a FixedBufferAllocator. I'm not saying this is the best way, just the one I have settled on; maybe someone can provide a better answer (for my case and) for your question.
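A minimal version of that global-buffer setup (the 64 KiB size is an arbitrary choice for illustration):

```zig
const std = @import("std");

// A global buffer backing a FixedBufferAllocator: no heap allocation,
// and the data outlives the function that produced it.
var global_buffer: [64 * 1024]u8 = undefined;
var fba = std.heap.FixedBufferAllocator.init(&global_buffer);

fn makeGreeting(allocator: std.mem.Allocator) ![]u8 {
    return allocator.dupe(u8, "hello");
}

pub fn main() !void {
    const greeting = try makeGreeting(fba.allocator());
    // Still valid here: the bytes live in global_buffer, not on
    // makeGreeting's stack frame.
    std.debug.print("{s}\n", .{greeting});
}
```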

You might like std.heap.StackFallbackAllocator. For cases where the demand does not exceed the estimated limit, it uses a FixedBufferAllocator, which is equivalent to allocating on the stack. Only when the actual demand exceeds what the stack should handle does it start allocating on the heap.

In practice, I modified it to StackFallbackArena because I know that my fallback allocator is an Arena, so it can conveniently be reset and reused.
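For reference, the stdlib exposes this via std.heap.stackFallback; a small sketch:

```zig
const std = @import("std");

pub fn main() !void {
    var gpa_state = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa_state.deinit();

    // 4 KiB lives in this stack frame; only demand beyond that hits
    // the fallback (heap) allocator.
    var sfa = std.heap.stackFallback(4096, gpa_state.allocator());
    const allocator = sfa.get();

    const small = try allocator.alloc(u8, 100); // served from the stack buffer
    defer allocator.free(small);

    const big = try allocator.alloc(u8, 100_000); // exceeds 4 KiB: heap fallback
    defer allocator.free(big);
}
```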

1 Like

SZE, the high-level premise ("if you can't predict size, you need growth or a second phase") is true, but the specific Zig call is mis-explained: reset(.{ .retain_with_limit = X }) doesn't preallocate to X; it only retains/shrinks what you already had.

Or are you saying something different?

I was under the impression there was an easy way to set it to an initial size, but apparently not; I think I misremembered. I still think it could make sense to have a way to set an initial size for an arena, either via an initCapacity function or maybe via a .set_capacity/.init_capacity mode for reset.

Well:

var arena: std.heap.ArenaAllocator = .init(gpa);
const allocator = arena.allocator();
_ = try allocator.alloc(u8, 1024 * 1024);
_ = arena.reset(.retain_capacity); // reset returns a bool; discard it

It’s only two lines more than would be needed anyway.

MemoryPool has a preheat function, but doing that manually would be a for loop.

1 Like

Just as a light venting: man, implementing this Cartesian product while keeping the memory as tight as I can is torture. It’s when a data structure is good for minimal memory but terrible for the type of access I need for the algorithm. In the end, once I figure something out, I’ll post the code in here to share the ugliness and if anyone has any insights or criticism.

3 Likes

We do these things,
Not because they are easy,
But because,
We thought they would be easy.

8 Likes

What I was trying to do here was fairly basic.
You have a string, and in the string you can replace values enclosed in {..} according to some definition.
E.g., some can be replaced by values from an enum, others by a counting numeric value, others by a fixed string. Now, since I wanted to support more than one variable-length string, I had to introduce the concept of scopes. That lightens the restriction: instead of at most one variable-length string per string, it's at most one variable-length string per scope, since we can determine the variable-length string's maximum length from the other fields in the scope.

Selecting the values for {..} produces a filled-out string.
The torture starts with the counting values and the enums. For those, you get multiple instances of the string. E.g., if your enum is 'A', 'B', 'C' and someone selects A & B, you want to generate two instances of your string, one with 'A' and one with 'B'. Say you have two enums in the string, this one plus another with 'X' and 'Y'; then you'll want all combinations: ('A', 'X'), ('A', 'Y'), ('B', 'X'), ('B', 'Y'), ('C', 'X'), ('C', 'Y').

So, from this we can see the problem is "simply" forming the Cartesian product of all these sets, then, for each n-tuple in the product, filling out the format string.

Now, actually doing that is a pain in the ass. So, for selecting from the enums, I thought to use an array of u1 (0 = unset, 1 = set; an array of u1 instead of a u32 bitset for ease of use). This minimizes the memory used for selecting each enum.

Per the topic's original premise, I wanted to minimize the number of allocs. So, that array of u1 is a single call to alloc where the length is the sum of all enum lengths, e.g. alloc(u1, 2 + 3) in the aforementioned example.

Filling it out is a non-issue as the filling out is iterative. E.g. you handle ‘A’, ‘B’, ‘C’ first, so, once that’s done, you know you’re working on array[3..], and once you handle ‘X’, ‘Y’ you know you’re at array[5..].

It’s the Cartesian product aspect that dooms me.
My only thought is to also alloc a counter to know which value in each enum I’m currently writing out during this iteration of writing out the filled out string.

The following is from a new day: I thought to try incrementing in order. E.g., is there a rule for the modulus of each field such that I can increment each value every iteration and have it work out? Ultimately, I couldn't get that to work, so I "flipped" how I viewed the problem and just did it the traditional way of fixing counters [0..n] and updating counter n + 1 (which, if it rolls over, means updating counter n, etc.).

So, I finally have this working (save for variable strings).

const std = @import("std");

const Capture = struct {
    section: usize,
    start: usize,
    end: usize,
};


const Section = struct {
    parent_idx: usize,
    exclusive_length: u8,
};

fn parseSections(format: []const u8, sections: []Section, captures: []Capture) !void {
    var section_idx: usize = 0;
    var furthest_idx: usize = 0;
    var capture: bool = false;

    sections[0].parent_idx = 0;
    sections[0].exclusive_length = 0;

    var num_captures: usize = 0;

    for (format, 0..) |c, idx| {
        switch (c) {
            '(' => {
                furthest_idx += 1;
                sections[furthest_idx].parent_idx = section_idx;
                sections[furthest_idx].exclusive_length = 0;
                section_idx = furthest_idx;
            },
            ')' => {
                section_idx = sections[section_idx].parent_idx;
            },
            '{' => {
                if (capture) {
                    return error.InvalidExpression;
                }
                else {
                    captures[num_captures].section = section_idx;
                    captures[num_captures].start = idx + 1;
                    num_captures += 1;
                    capture = true;
                }
            },
            '}' => {
                if (!capture) {
                    return error.InvalidExpression;
                } else if (captures[num_captures - 1].section != section_idx) {
                    return error.InvalidExpression;
                } else {
                    captures[num_captures - 1].end = idx;
                    capture = false;
                }
            },
            else => {
                if (capture) {
                } else {
                    sections[section_idx].exclusive_length += 1;
                }
            },
        }
    }
}

const DefinitionType = enum {
    enumerable,
    countable,
    var_string,
};
const FullDefinition = struct {
    name: []const u8,
    definition: Definition,
};
const Definition = union (DefinitionType) {
    enumerable: EnumDefinition,
    countable: CountDefinition,
    var_string: VarStringDefinition,
};

const EnumDefinition = struct {
    values: []const []const u8,
};

const CountDefinition = struct {
};

const VarStringDefinition = struct {
};

const DefBinding = struct {
    capture_idx: usize,
    definition_idx: usize,
};


fn processCountable(value: *usize, reader: *std.Io.Reader, writer: *std.Io.Writer) !void {
    try writer.print("Set max value:\n", .{});
    try writer.flush();

    const line = try reader.takeDelimiterExclusive('\n');
    reader.toss(1);
    value.* = try std.fmt.parseUnsigned(u32, line, 10);
    std.debug.print("Selection: {d}\n", .{value.*});
}


fn processEnum(selected: []u1, definition: EnumDefinition, reader: *std.Io.Reader, writer: *std.Io.Writer) !void {
    try writer.print("Select from:\n", .{});
    for (definition.values) |value| {
        try writer.print("{s}\n", .{value});
    }
    try writer.flush();

    const line = try reader.takeDelimiterInclusive('\n');
    var last_start: usize = 0;
    for (line, 0..) |c, idx| {
        if (c == ',' or c == '\n') {
            const selection = line[last_start..idx];
            std.debug.print("Selection: {s}\n", .{selection});
            const found_idx: ?usize = for (definition.values, 0..) |value, value_idx| {
                if (std.mem.eql(u8, value, selection)) {
                    break value_idx;
                }
            } else null;

            if (found_idx) |value| {
                selected[value] = 1;
            } else {
                return error.InvalidSelection;
            }
            last_start = idx + 1;
        }
    }
}


fn updateCounters(counters: []usize, max_values: []usize) void {
    var i: usize = counters.len;
    while (i > 0): (i -= 1) {
        const idx = i - 1;

        counters[idx] += 1;
        if (counters[idx] == max_values[idx]) {
            counters[idx] %= max_values[idx];
        } else {
            return;
        }
    }
}

pub fn main() !void {
    const site_values = [_][]const u8{
            "us",
            "jp",
    };
    const site_enum_def = EnumDefinition{
        .values = &site_values,
    };
    const environment_values = [_][]const u8{
            "dev",
            "prod",
    };
    const environment_enum_def = EnumDefinition{
        .values = &environment_values,
    };
    const server_count_def = CountDefinition{};

    const definitions: [3]FullDefinition = .{
        .{ 
            .name = "site",
            .definition = .{ .enumerable = site_enum_def },
        },
        .{
            .name = "environment",
            .definition = .{ .enumerable = environment_enum_def },
        },
        .{
            .name = "count",
            .definition = .{ .countable = server_count_def },
        },
    };
    const format = "{site}|{environment}|{count}";
    var section_count: usize = 1;  // implicit outermost section
    var capture_count: usize = 0;
    for (format) |c| {
        switch (c) {
            '{' => {
                capture_count += 1;
            },
            '(' => {
                section_count += 1;
            },
            else => {
            }
        }
    }

    const GPA = std.heap.GeneralPurposeAllocator(.{});
    var gpa: GPA = .init;
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const sections = try allocator.alloc(Section, section_count);
    defer allocator.free(sections);
    const captures = try allocator.alloc(Capture, capture_count);
    defer allocator.free(captures);
    const bindings = try allocator.alloc(DefBinding, capture_count);
    defer allocator.free(bindings);

    try parseSections(format, sections, captures);

    outer: for (captures, 0..) |capture, capture_idx| {
        for (definitions, 0..) |full_definition, def_idx| {
            if (std.mem.eql(u8, format[capture.start..capture.end], full_definition.name)) {
                bindings[capture_idx].capture_idx = capture_idx;
                bindings[capture_idx].definition_idx = def_idx;
                continue :outer;
            }
        }

        return error.DefinitionNotFound;
    }

    var num_enums: usize = 0;
    var num_countables: usize = 0;
    var enum_size: usize = 0;
    {
        var i: usize = 0;
        const len: usize = bindings.len - 1;
        while (i <= len): (i += 1) {
            const forward_idx = len - i;
            const binding = bindings[forward_idx];
            const capture = captures[binding.capture_idx];  //  technically same as forward idx.
            const full_definition = definitions[binding.definition_idx];
            std.debug.print("{s}\n", .{format[capture.start..capture.end]});
            switch (full_definition.definition) {
                .enumerable => |enum_def| {
                    num_enums += 1;
                    enum_size += enum_def.values.len;
                },
                .countable => |count_def| {
                    _ = count_def;
                    num_countables += 1;
                },
                .var_string => {
                },
            }
        }
    }

    const enum_flag_array = try allocator.alloc(u1, enum_size);
    defer allocator.free(enum_flag_array);
    @memset(enum_flag_array, 0);

    const countable_value_array = try allocator.alloc(usize, num_countables);
    defer allocator.free(countable_value_array);

    const max_capture_values = try allocator.alloc(usize, capture_count);
    defer allocator.free(max_capture_values);

    var stdin_buffer: [1024]u8 = undefined;
    var stdin = std.fs.File.stdin().reader(&stdin_buffer);

    var stdout_buffer: [1024]u8 = undefined;
    var stdout = std.fs.File.stdout().writer(&stdout_buffer);

    var max_combinations: usize = 1;
    {
        var enum_start: usize = 0;
        var countable_start: usize = 0;

        var i: usize = 0;
        const len: usize = bindings.len - 1;
        while (i <= len): (i += 1) {
            const forward_idx = len - i;
            const binding = bindings[forward_idx];
            const capture = captures[binding.capture_idx];  //  technically same as forward idx.
            const full_definition = definitions[binding.definition_idx];
            sw: switch (full_definition.definition) {
                .enumerable => |enum_def| {
                    const selected = enum_flag_array[enum_start..(enum_start + enum_def.values.len)];
                    processEnum(
                        selected,
                        enum_def,
                        &stdin.interface,
                        &stdout.interface
                    ) catch |err| switch (err) {
                        error.InvalidSelection => {
                            continue :sw full_definition.definition;
                        },
                        else => {
                            continue :sw full_definition.definition;
                        },
                    };
                    var num_selected:usize = 0;
                    for (selected) |value| {
                        if (value == 1) {
                            num_selected += 1;
                        }
                    }
                    max_capture_values[binding.capture_idx] = num_selected;
                    max_combinations *= num_selected;

                    enum_start += enum_def.values.len;
                    std.debug.print("Enum values: ", .{});
                    for (enum_def.values) |value| {
                        std.debug.print("{s}\n", .{value});
                    }
                },
                .countable => |count_def| {
                    _ = count_def;
                    try processCountable(
                        &countable_value_array[countable_start],
                        &stdin.interface,
                        &stdout.interface,
                    );
                    max_combinations *= countable_value_array[countable_start];
                    max_capture_values[binding.capture_idx] = countable_value_array[countable_start];
                    countable_start += 1;
                    
                },
                .var_string => {
                    max_capture_values[binding.capture_idx] = 1;
                },
            }
        }
    }

    const capture_counters = try allocator.alloc(usize, capture_count);
    defer allocator.free(capture_counters);
    @memset(capture_counters, 0);

    var current_combination: usize = 0;
    while (current_combination < max_combinations): (current_combination += 1) {
        var current_capture: usize = 0;
        var enum_start: usize = 0;
        var i: usize = 0;
        while (i < format.len): (i += 1) {
            const c = format[i];
            switch (c) {
                '{' => {
                    const binding = bindings[current_capture];
                    const full_definition = definitions[binding.definition_idx];

                    switch (full_definition.definition) {
                        .enumerable => |enum_def| {
                            const enum_set = enum_flag_array[enum_flag_array.len - (enum_start + enum_def.values.len)..(enum_flag_array.len - enum_start)];
                            enum_start += enum_def.values.len;
                            const active_selected = capture_counters[current_capture];

                            var current_position: usize = 0;
                            for (enum_set, 0..) |value, idx| {
                                if (value == 1) {
                                    if (current_position == active_selected) {
                                        std.debug.print("{s}", .{enum_def.values[idx]});
                                        break;
                                    }
                                    current_position += 1;
                                }
                            } else {
                                unreachable;  // at least one is always selected.
                            }
                        },
                        .countable => {
                            const value = capture_counters[current_capture];
                            std.debug.print("{d}", .{value});
                        },
                        else => {},
                    }
                    current_capture += 1;
                    i = captures[binding.capture_idx].end;
                },
                '}' => {
                },
                '(' => {},
                ')' => {},
                else => {
                    std.debug.print("{c}", .{c});
                },
            }
        }
        updateCounters(capture_counters, max_capture_values);
    }
}