How to create an immutable map at compile time

finlaydotb · November 13, 2024, 5:18am

I would like to create an map of some values. The map needs to be immutable because it is just a mapping of some key to some constants.

I have not been able to find a way to do this.

For example, let’s say I want to have a hashmap of country code to country name, for example

MAP
30 → “Greece”,
31 → “Netherlands”,
32 → “Belgium”,
…
etc

This map is a static list of the phone code to country name and it does not need to be mutable.

In my search I found a mention of ComptimeStringMap in Type of ComptimeStringMap and also from this reddit post https://www.reddit.com/r/Zig/comments/krnuac/is_it_possible_to_create_hash_maps_at_compile_time/

But it seems this is no longer available in the v0.3.0 standard library.

So question now is, how can one create such a immutable hashmap at comptime in Zig?

permutationlock · November 13, 2024, 5:43am

It seems that ComptimeStringMap was renamed to StaticStringMap.

tgirod · November 13, 2024, 6:34am

If the keys are consecutive numbers, maybe an array would suffice ?

finlaydotb · November 13, 2024, 6:42am

Unfortunately no. Although the actual use case is not like the country code, but just like the country code there are numbers that don’t map to anything.

Plus the numbers, that will serve as keys could be very large.

So I think with these two constraint, an array won’t suffice.

finlaydotb · November 13, 2024, 7:39am

Do you by chance know if there is an alternative to this where the key can be a u8 or []u8 or generic ?

swenninger · November 13, 2024, 10:06am

I think it is currently only possible with the StaticHashMap, so no generic types.

There are open issues for making allocators available at compile time.

github.com/ziglang/zig

use case: comptime allocator

opened 05:23PM - 25 Jul 18 UTC

andrewrk

use case

Here's an attempt at implementing the `Allocator` interface at compile-time: …```zig const std = @import("std"); const Allocator = std.mem.Allocator; comptime { var map = std.AutoArrayHashMap(i32, i32).init(comptime_allocator); map.put(12, 34) catch unreachable; } pub const comptime_allocator: Allocator = .{ .ptr = undefined, .vtable = &comptime_allocator_vtable, }; const comptime_allocator_vtable: Allocator.VTable = .{ .alloc = comptime_alloc, .resize = comptime_resize, .free = comptime_free, }; inline fn comptime_alloc( context: *anyopaque, comptime len: usize, comptime log2_align: u8, return_address: usize, ) ?[*]u8 { _ = context; _ = return_address; const S = struct { var bytes: [len]u8 align(1 << log2_align) = undefined; }; return &S.bytes; } fn comptime_resize( context: *anyopaque, buf: []u8, log2_buf_align: u8, new_len: usize, return_address: usize, ) bool { _ = context; _ = log2_buf_align; _ = return_address; // Always returning false here ensures that callsites make new allocations that fit // better, avoiding wasted .cdata and .data memory. return false; } fn comptime_free( context: *anyopaque, buf: []u8, log2_buf_align: u8, return_address: usize, ) void { _ = context; _ = buf; _ = log2_buf_align; _ = return_address; } ``` After #4298 is solved: ```diff --- a/lib/std/mem/Allocator.zig +++ b/lib/std/mem/Allocator.zig @@ -223,8 +223,6 @@ fn allocBytesWithAlignment(self: Allocator, comptime alignment: u29, byte_count: } const byte_ptr = self.rawAlloc(byte_count, log2a(alignment), return_address) orelse return Error.OutOfMemory; - // TODO: https://github.com/ziglang/zig/issues/4298 - @memset(byte_ptr[0..byte_count], undefined); return @as([*]align(alignment) u8, @alignCast(byte_ptr)); } @@ -293,8 +291,6 @@ pub fn reallocAdvanced( return error.OutOfMemory; const copy_len = @min(byte_count, old_byte_slice.len); @memcpy(new_mem[0..copy_len], old_byte_slice[0..copy_len]); - // TODO https://github.com/ziglang/zig/issues/4298 - @memset(old_byte_slice, undefined); self.rawFree(old_byte_slice, log2a(Slice.alignment), return_address); const new_bytes: []align(Slice.alignment) u8 = @alignCast(new_mem[0..byte_count]); @@ -309,8 +305,6 @@ pub fn free(self: Allocator, memory: anytype) void { const bytes_len = bytes.len + if (Slice.sentinel != null) @sizeOf(Slice.child) else 0; if (bytes_len == 0) return; const non_const_ptr = @constCast(bytes.ptr); - // TODO: https://github.com/ziglang/zig/issues/4298 - @memset(non_const_ptr[0..bytes_len], undefined); self.rawFree(non_const_ptr[0..bytes_len], log2a(Slice.alignment), @returnAddress()); } ``` I believe there is only one problem left: ``` test.zig:15:6: error: expected type '*const fn (*anyopaque, usize, u8, usize) ?[*]u8', found '*const fn (*anyopaque, comptime usize, comptime u8, usize) callconv(.Inline) ?[*]u8' .alloc = comptime_alloc, ~^~~~~~~~~~~~~~~~~~~~~~ test.zig:15:6: note: pointer type child 'fn (*anyopaque, comptime usize, comptime u8, usize) callconv(.Inline) ?[*]u8' cannot cast into pointer type child 'fn (*anyopaque, usize, u8, usize) ?[*]u8' test.zig:15:6: note: generic function cannot cast into a non-generic function referenced by: comptime_allocator: test.zig:11:16 comptime_0: test.zig:5:51 ``` I think this would have already been working if we hadn't changed the Allocator interface from good ol' `@fieldParentPtr` to vtables. But anyway, I think the language rules can be relaxed. This coercion can be allowed as long as it makes the value a comptime-only value. In other words, function pointers are allowed to point to inline functions as long as the function pointer value is comptime-known. Attempt to coerce the function pointer to a runtime-known value would yield a compile error. And then everything should work fine.

github.com/ziglang/zig

Making (FixedBuffer)Allocator available at comptime

opened 02:07AM - 16 Mar 23 UTC

rsepassi

standard library

### Zig Version 0.11.0-dev.1975+e17998b39 ### Steps to Reproduce and Observed …Behavior ```zig test "comptime alloc" { comptime { var buf: [1]u8 = undefined; var fba = std.heap.FixedBufferAllocator.init(&buf); var allocator = fba.allocator(); var x = allocator.create(i8) catch unreachable; _ = x; } } ``` ``` .../std/mem/Allocator.zig:105:65: error: unable to evaluate comptime expression const slice = try self.allocAdvancedWithRetAddr(T, null, 1, @returnAddress()); ^~~~~~~~~~~~~~~~ ...:11:29: note: called from here var x = allocator.create(i8) catch unreachable; ~~~~~~~~~~~~~~~~^~~~ ``` ### Expected Behavior Expected FixedBufferAllocator to work at comptime, which was suggested in this old issue https://github.com/ziglang/zig/issues/5873#issuecomment-760620505. Goal is to be able to call code at comptime that expects an Allocator. Seemed sensible to use a FBA, but breaks because of the use of `@returnAddress()`. The impl of FBA does not use the return address passed down from Allocator ([code](https://github.com/ziglang/zig/blob/a2c6ecd6dc0bdbe2396be9b055852324f16d34c9/lib/std/heap.zig#L417)).

If this is solved, then you could do something like:

    const map = comptime blk:{
        var gpa = std.heap.GeneralPurposeAllocator(.{}){};
        const allocator = gpa.allocator();

        var result = std.array_hash_map.AutoArrayHashMap(u8, []const u8).init(allocator);
        try result.put(30, "CoolCountry");
        try result.put(31, "SecondCoollCountry");
        // ....
        break :blk result;
    };

tgirod · November 13, 2024, 6:51pm

Maybe this could be an inspiration for you ?

FObersteiner · November 14, 2024, 6:06am

Another example, I use a StaticStringMap in zdt, zdt/lib/tzdata.zig at master · FObersteiner/zdt · GitHub

Travis · November 15, 2024, 12:01am

A couple more options, second one is mine

Unlike std.StaticStringMap these both support non-string keys.

offset · November 15, 2024, 2:52pm

Another solution is to create lookup tables with comptime. If the keys are unique, you can sort them to use binary search at runtime. This is the most memory compact way I think this could be solved, but it only support key->value lookups, and the keys have to be a of a known fixed size. If the values are also a known fixed size, you can skip the second table entirely.

const std = @import("std");
const assert = std.debug.assert;
const eql = std.ascii.endsWithIgnoreCase;

const data = [_]struct{u8, []const u8}{
    .{2, "hello"},
    .{3, "world"},
    .{9, "greece"},
    .{12, "usa"},
    .{13, "antartica"},
};

const NumType = u8;
const Key = packed struct {
    val: NumType = 0,
    off: usize = 0,
    len: usize = 0,
};

const ALL_STR_SIZE = get_str_size(data);

const both = build_lookups(data);
const keys = both[0];
const strs = both[1];

fn get_str_size(comptime a: [data.len]struct{NumType, []const u8}) usize {
    comptime var c: usize = 0;
    inline for (a) |x| c += x[1].len;
    return c;
}

fn build_lookups(comptime ar: [data.len]struct{NumType, []const u8}) struct{[data.len]Key, [ALL_STR_SIZE]u8} {
    comptime var k: usize = 0;
    comptime var z: usize = 0;
    var a = [_]Key{.{}} ** ar.len;
    var b = [_]u8{0} ** ALL_STR_SIZE;
    inline for (ar, 0..) |x, i| {
        k = x[1].len;
        inline for (0..k) |j| b[z+j] = x[1][j];
        a[i] = .{.val=x[0], .off=z, .len=k};
        z += k;
    }
    return .{a, b};
}

const LNS_JUNC = 64;
fn lookup(val: NumType) ?[]const u8 {
    var bot: usize = 0;
    var top: usize = keys.len - 1;
    var mid: usize = top >> 1;
    while (true) {
        const k = keys[mid];
        if (val == k.val) return strs[k.off..k.off+k.len];
        if (bot >= top) return null;
        if (top - bot <= LNS_JUNC) {
            return for (keys[bot..top+1]) |i| {
                if (i.val == val) break strs[i.off..i.off+i.len];
            } else null;
        }
        if (val > k.val) {
            bot = mid + 1;
        } else {
            top = mid - 1;
        }
        mid = ((top - bot) >> 1) + bot;
    }
}

test "lookups" {
    assert(eql(lookup(2).?, "hello"));
    assert(eql(lookup(3).?, "world"));
    assert(eql(lookup(9).?, "greece"));
    assert(lookup(4) == null);
    assert(lookup(5) == null);
    assert(lookup(6) == null);
    assert(eql(lookup(12).?, "usa"));
    assert(eql(lookup(13).?, "antartica"));
}

Validark · November 15, 2024, 4:57pm

Why do you have to find a way to do this? Is it not a valid option to just write whatever code you want?

Use a switch statement. Use a perfect hash function. Some people enjoy writing trie implementations, that’s an option too. You’re a programmer, aren’t you? You can solve this in 100 different ways.

squeek502 · November 15, 2024, 5:54pm

A few years ago, I didn’t like/understand this answer when it was used to justify not merging this ComptimeHashMap PR, but have since come around to it.

A catch-all ComptimeHashMap that can be used for anything is convenient, but bespoke solutions tailored to your specific use-case are likely to be better. This is even true for StaticStringMap (previously named ComptimeStringMap), since it only actually has a benefit when the data has keys of varying string lengths (the optimization is just to sort the keys by length and then only compare equal-length strings in get).

As an example, here’s some ISO3166 ‘alpha2 → country’ lookup code I wrote but never ended up using (e.g. "be" → .belgium). It uses an array of type [26 * 26]Country and then @as(u16, a) * 26 + b to calculate the lookup index for an alpha2 code (where a and b are lowercase_char - 'a'). If I use the same alpha2_codes data with a StaticStringMap like so:

const alpha2_static_string_map = std.StaticStringMapWithEql(Country, std.static_string_map.eqlAsciiIgnoreCase).initComptime(alpha2_codes);

it works fine, but performs much worse, since the StaticStringMap is just doing a linear lookup because all the keys are of length 2:

Benchmark 1: array.exe
  Time (mean ± σ):      29.6 ms ±   0.6 ms    [User: 30.2 ms, System: 4.8 ms]
  Range (min … max):    28.4 ms …  32.2 ms    61 runs

Benchmark 2: map.exe
  Time (mean ± σ):     180.0 ms ±   1.1 ms    [User: 175.8 ms, System: 2.6 ms]
  Range (min … max):   178.9 ms … 183.5 ms    15 runs

Summary
  'array.exe' ran
    6.09 ± 0.13 times faster than 'map.exe'

Benchmark code

const std = @import("std");
const iso3166 = @import("iso3166.zig");

pub fn main() !void {
    const iterations = 1_000_000;
    var rand = std.Random.DefaultPrng.init(0);
    const r = rand.random();

    const max_len = 256;
    var buf: [max_len]u8 = undefined;

    var num_invalid: u64 = 0;
    var i: u64 = 0;
    while (i < iterations) : (i += 1) {
        const len = if (r.float(f32) >= 0.01) 2 else r.uintAtMost(usize, max_len);
        const code = buf[0..len];
        const alpha_only = r.float(f32) < 0.95;
        if (alpha_only) {
            for (code) |*c| {
                c.* = r.uintAtMost(u8, 26);
                c.* += if (r.boolean()) 'a' else 'A';
            }
        } else {
            r.bytes(code);
        }
        _ = iso3166.Country.fromAlpha2StaticStringMap(code) catch {
            num_invalid += 1;
        };
    }

    std.debug.print("num_invalid: {}\n", .{num_invalid});
}

I’m sure there are ways to make this lookup even faster, but I hope it shows what I’m trying to get at.

mnemnion · November 15, 2024, 7:08pm

There are also some fine data structures in the std.enum namespace, which many reading may not be aware of. None of them allocate, making them appropriate for comptime. I’ve found them quite useful a number of times.

Travis · November 16, 2024, 7:19am

I was curious to see how my static-map compared to your hand rolled lookup so I added a benchmark here. Looks like the hand-rolled lookup is about 25% faster in this benchmark which does 100_000 iterations searching for a random 2 letters.

$ zig build -Doptimize=ReleaseFast -Dbench-file=src/bench2.zig
$ poop 'zig-out/bin/bench std_static_string_map' 'zig-out/bin/bench this_static_string_map2' 'zig-out/bin/bench squeek502_hand_rolled'
Benchmark 1 (707 runs): zig-out/bin/bench std_static_string_map
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          7.02ms ±  212us    6.74ms … 8.45ms         89 (13%)        0%
  peak_rss            881KB ±    0       881KB …  881KB          0 ( 0%)        0%
  cpu_cycles         30.3M  ±  684K     30.0M  … 35.0M          76 (11%)        0%
  instructions       95.9M  ± 0.94      95.9M  … 95.9M          10 ( 1%)        0%
  cache_references   5.38K  ± 39.6K     1.52K  …  550K          27 ( 4%)        0%
  cache_misses        475   ± 42.9       410   … 1.02K          25 ( 4%)        0%
  branch_misses       307K  ± 1.16K      305K  …  312K          36 ( 5%)        0%
Benchmark 2 (1896 runs): zig-out/bin/bench this_static_string_map2
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          2.60ms ± 87.4us    2.40ms … 3.47ms         25 ( 1%)        ⚡- 62.9% ±  0.2%
  peak_rss            909KB ±    0       909KB …  909KB          0 ( 0%)        💩+  3.3% ±  0.0%
  cpu_cycles         9.98M  ± 67.7K     9.90M  … 11.6M          28 ( 1%)        ⚡- 67.0% ±  0.1%
  instructions       14.2M  ± 0.90      14.2M  … 14.2M          43 ( 2%)        ⚡- 85.2% ±  0.0%
  cache_references   2.07K  ± 5.63K     1.47K  …  213K          86 ( 5%)        ⚡- 61.4% ± 34.0%
  cache_misses        498   ± 52.4       434   … 1.62K         104 ( 5%)        💩+  4.8% ±  0.9%
  branch_misses       140K  ± 1.58K      137K  …  146K           1 ( 0%)        ⚡- 54.3% ±  0.0%
Benchmark 3 (2503 runs): zig-out/bin/bench squeek502_hand_rolled
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          1.96ms ± 51.6us    1.73ms … 2.57ms        246 (10%)        ⚡- 72.0% ±  0.1%
  peak_rss            909KB ±    0       909KB …  909KB          0 ( 0%)        💩+  3.3% ±  0.0%
  cpu_cycles         6.86M  ± 47.4K     6.83M  … 8.01M           5 ( 0%)        ⚡- 77.4% ±  0.1%
  instructions       9.11M  ± 1.03      9.11M  … 9.11M          73 ( 3%)        ⚡- 90.5% ±  0.0%
  cache_references   1.54K  ± 2.12K     1.26K  … 58.8K          45 ( 2%)        ⚡- 71.3% ± 29.0%
  cache_misses        399   ± 38.6       335   …  730           88 ( 4%)        ⚡- 15.9% ±  0.7%
  branch_misses      18.8K  ±  358      18.5K  … 20.8K         234 ( 9%)        ⚡- 93.9% ±  0.0%

This also shows how std.StaticStringMap() fares with this adversarial input.

squeek502 · November 16, 2024, 12:50pm

Note that your benchmark is not entirely 1:1, since my hand-rolled version does a case-insensitive lookup.

IntegratedQuantum · November 16, 2024, 1:36pm

Has anyone thought about just using an enum?

const Map = enum(u32) {
    @"Greece" = 30,
    @"Netherlands" = 31,
    @"Belgium" = 32,
    ...
};

fn mapToString(val: u32) []const u8 {
    return @tagName(@as(Map, @enumFromInt(val)));
}

Not sure about the performance of this, but I think this is by far the simplest solution.

squeek502 · November 16, 2024, 2:14pm

This has the heavy caveat that it will panic/cause UB if val doesn’t have an associated enum field (and the @tagName will panic/cause UB if Map is non-exhaustive and there’s no associated enum field with that value).

My assumption would be that a function with a switch would be fairly performant for enum -> T conversion, but haven’t checked to see how it performs vs std.enums.EnumMap.

Travis · November 17, 2024, 1:16am

Ah thats true. I added a specialized, case insensitive map and modified the benchmark alphabet to include upper case letters and doubled that map’s capacity. Now i’m able to slightly edge out the hand-rolled version.

I wonder if the hand-rolled code would benefit from doing the lower casing as i’ve done (by or-ing with 32 as in key[0] | 32). not sure as i didn’t check. just speculating as to why mine might be slightly faster.

$ zig build -Doptimize=ReleaseFast -Dbench-file=src/bench2.zig
$ poop -d 1000 'zig-out/bin/bench std_static_string_map' 'zig-out/bin/bench static_map' 'zig-out/bin/bench squeek502_hand_rolled' 'zig-out/bin/bench static_map_case_insensitive'
# ...
Benchmark 3 (502 runs): zig-out/bin/bench squeek502_hand_rolled
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          1.96ms ± 63.1us    1.73ms … 2.62ms         76 (15%)        ⚡- 73.0% ±  0.4%
  peak_rss            909KB ±    0       909KB …  909KB          0 ( 0%)        💩+  3.3% ±  0.0%
  cpu_cycles         6.81M  ± 49.7K     6.79M  … 7.88M          20 ( 4%)        ⚡- 78.4% ±  0.3%
  instructions       9.62M  ± 0.90      9.62M  … 9.62M           5 ( 1%)        ⚡- 91.3% ±  0.0%
  cache_references   1.61K  ± 2.57K     1.28K  … 58.9K           8 ( 2%)          - 19.2% ± 22.3%
  cache_misses        413   ± 29.7       360   …  683           22 ( 4%)        ⚡- 23.9% ±  9.4%
  branch_misses       147K  ± 1.59K      146K  …  153K          77 (15%)        ⚡- 38.1% ±  0.1%
Benchmark 4 (584 runs): zig-out/bin/bench static_map_case_insensitive
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          1.68ms ± 71.7us    1.50ms … 2.16ms          2 ( 0%)        ⚡- 76.9% ±  0.3%
  peak_rss            909KB ±    0       909KB …  909KB          0 ( 0%)        💩+  3.3% ±  0.0%
  cpu_cycles         5.72M  ± 94.1K     5.68M  … 7.83M          62 (11%)        ⚡- 81.8% ±  0.2%
  instructions       8.35M  ± 1.06      8.35M  … 8.35M           4 ( 1%)        ⚡- 92.4% ±  0.0%
  cache_references   2.44K  ± 9.18K     1.58K  …  219K          48 ( 8%)          + 22.5% ± 78.2%
  cache_misses        552   ± 67.4       472   … 1.46K          48 ( 8%)          +  1.8% ±  8.9%
  branch_misses       124K  ± 1.11K      123K  …  130K          58 (10%)        ⚡- 48.0% ±  0.1%

I would have expected to see more peak rss due to increased capacity. I guess that might be shown in the cache misses.

Anyway just wanted to make a fairer benchmark and see if I could get similar results with a custom context and some tuning.

Travis · November 17, 2024, 1:31am

I wonder if the hand-rolled code would benefit from doing the lower casing as i’ve done (by or-ing with 32 as in key[0] | 32).

@squeek502 Not 100% sure of the correctness, but I was able to bring your hand-rolled time down from 1.97ms to 969us with this change:

pub fn alpha2_to_lookup_id(alpha2_code: []const u8) error{InvalidAlpha2Code}!usize {
    if (alpha2_code.len != 2) return error.InvalidAlpha2Code;
    if (alpha2_code[0] < 'A' or alpha2_code[0] > 'z' or
        alpha2_code[1] < 'A' or alpha2_code[1] > 'z') return error.InvalidAlpha2Code;
    const a = (alpha2_code[0] | 32) - 'a';
    const b = (alpha2_code[1] | 32) - 'a';
    return @as(u16, a) * 26 + b;
}

gonzo · November 17, 2024, 9:09am

That range comparison will not return early for some characters outside [a-z][A-Z] – just a caveat, I did not read the details to find out if this might be a problem.