How to create an immutable map at compile time

I would like to create a map of some values. The map needs to be immutable because it is just a mapping from some keys to some constants.

I have not been able to find a way to do this.

For example, let’s say I want to have a hashmap of country code to country name:

MAP
30 → “Greece”,
31 → “Netherlands”,
32 → “Belgium”,

etc

This map is a static mapping of phone code to country name and it does not need to be mutable.

In my search I found a mention of ComptimeStringMap in the thread Type of ComptimeStringMap and also in this reddit post: https://www.reddit.com/r/Zig/comments/krnuac/is_it_possible_to_create_hash_maps_at_compile_time/

But it seems this is no longer available in the 0.13.0 standard library.

So the question now is: how can one create such an immutable hashmap at comptime in Zig?

It seems that ComptimeStringMap was renamed to StaticStringMap.
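If string keys are acceptable, a minimal sketch with the renamed API could look like this (StaticStringMap only works with []const u8 keys, so the codes are written as strings here):

const std = @import("std");

const countries = std.StaticStringMap([]const u8).initComptime(.{
    .{ "30", "Greece" },
    .{ "31", "Netherlands" },
    .{ "32", "Belgium" },
});

test "lookup by code string" {
    try std.testing.expectEqualStrings("Greece", countries.get("30").?);
    try std.testing.expect(countries.get("99") == null);
}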

5 Likes

If the keys are consecutive numbers, maybe an array would suffice?
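Roughly something like this sketch, assuming the codes stay small (say, below 256):

const names = blk: {
    // built at comptime; entries without a country stay null
    var table = [_]?[]const u8{null} ** 256;
    table[30] = "Greece";
    table[31] = "Netherlands";
    table[32] = "Belgium";
    break :blk table;
};

fn lookup(code: u8) ?[]const u8 {
    return names[code];
}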

Unfortunately no. The actual use case is not the country-code one, but just like with country codes there are numbers that don’t map to anything.

Plus, the numbers that will serve as keys could be very large.

So I think with these two constraints, an array won’t suffice.

Do you by chance know if there is an alternative to this where the key can be a u8, []u8, or generic?

I think it is currently only possible with StaticStringMap, so no generic key types.

There are open issues for making allocators available at compile time.

If this is solved, then you could do something like:

    const map = comptime blk: {
        var gpa = std.heap.GeneralPurposeAllocator(.{}){};
        const allocator = gpa.allocator();

        var result = std.array_hash_map.AutoArrayHashMap(u8, []const u8).init(allocator);
        try result.put(30, "CoolCountry");
        try result.put(31, "SecondCoolCountry");
        // ....
        break :blk result;
    };
1 Like

Maybe this could be an inspiration for you?

1 Like

Another example: I use a StaticStringMap in zdt (zdt/lib/tzdata.zig at master · FObersteiner/zdt · GitHub).

1 Like

A couple more options; the second one is mine.

Unlike std.StaticStringMap, these both support non-string keys.

2 Likes

Another solution is to create lookup tables with comptime. If the keys are unique, you can sort them and use binary search at runtime. This is the most memory-compact way this could be solved, I think, but it only supports key → value lookups, and the keys have to be of a known fixed size. If the values are also of a known fixed size, you can skip the second table entirely.

const std = @import("std");
const assert = std.debug.assert;
const eql = std.ascii.eqlIgnoreCase; // case-insensitive string equality, used in the tests

const data = [_]struct{u8, []const u8}{
    .{2, "hello"},
    .{3, "world"},
    .{9, "greece"},
    .{12, "usa"},
    .{13, "antartica"},
};

const NumType = u8;
const Key = packed struct {
    val: NumType = 0,
    off: usize = 0,
    len: usize = 0,
};

const ALL_STR_SIZE = get_str_size(data);

const both = build_lookups(data);
const keys = both[0];
const strs = both[1];

fn get_str_size(comptime a: [data.len]struct{NumType, []const u8}) usize {
    comptime var c: usize = 0;
    inline for (a) |x| c += x[1].len;
    return c;
}

fn build_lookups(comptime ar: [data.len]struct{NumType, []const u8}) struct{[data.len]Key, [ALL_STR_SIZE]u8} {
    comptime var k: usize = 0;
    comptime var z: usize = 0;
    var a = [_]Key{.{}} ** ar.len;
    var b = [_]u8{0} ** ALL_STR_SIZE;
    inline for (ar, 0..) |x, i| {
        k = x[1].len;
        inline for (0..k) |j| b[z+j] = x[1][j];
        a[i] = .{.val=x[0], .off=z, .len=k};
        z += k;
    }
    return .{a, b};
}

const LNS_JUNC = 64; // below this window size, fall back to a linear scan
fn lookup(val: NumType) ?[]const u8 {
    var bot: usize = 0;
    var top: usize = keys.len - 1;
    var mid: usize = top >> 1;
    while (true) {
        const k = keys[mid];
        if (val == k.val) return strs[k.off..k.off+k.len];
        if (bot >= top) return null;
        if (top - bot <= LNS_JUNC) {
            return for (keys[bot..top+1]) |i| {
                if (i.val == val) break strs[i.off..i.off+i.len];
            } else null;
        }
        if (val > k.val) {
            bot = mid + 1;
        } else {
            top = mid - 1;
        }
        mid = ((top - bot) >> 1) + bot;
    }
}

test "lookups" {
    assert(eql(lookup(2).?, "hello"));
    assert(eql(lookup(3).?, "world"));
    assert(eql(lookup(9).?, "greece"));
    assert(lookup(4) == null);
    assert(lookup(5) == null);
    assert(lookup(6) == null);
    assert(eql(lookup(12).?, "usa"));
    assert(eql(lookup(13).?, "antartica"));
}
2 Likes

Why do you have to find a way to do this? Is it not a valid option to just write whatever code you want?

Use a switch statement. Use a perfect hash function. Some people enjoy writing trie implementations, that’s an option too. You’re a programmer, aren’t you? You can solve this in 100 different ways.
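For instance, the switch option for the country-code example could be as simple as this sketch:

fn countryName(code: u32) ?[]const u8 {
    return switch (code) {
        30 => "Greece",
        31 => "Netherlands",
        32 => "Belgium",
        else => null,
    };
}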

1 Like

A few years ago, I didn’t like/understand this answer when it was used to justify not merging this ComptimeHashMap PR, but have since come around to it.

A catch-all ComptimeHashMap that can be used for anything is convenient, but bespoke solutions tailored to your specific use-case are likely to be better. This is even true for StaticStringMap (previously named ComptimeStringMap), since it only actually has a benefit when the data has keys of varying string lengths (the optimization is just to sort the keys by length and then only compare equal-length strings in get).

As an example, here’s some ISO3166 ‘alpha2 → country’ lookup code I wrote but never ended up using (e.g. "be" → .belgium). It uses an array of type [26 * 26]Country and then @as(u16, a) * 26 + b to calculate the lookup index for an alpha2 code (where a and b are lowercase_char - 'a'). If I use the same alpha2_codes data with a StaticStringMap like so:

const alpha2_static_string_map = std.StaticStringMapWithEql(Country, std.static_string_map.eqlAsciiIgnoreCase).initComptime(alpha2_codes);

it works fine, but performs much worse, since the StaticStringMap is just doing a linear lookup because all the keys are of length 2:

Benchmark 1: array.exe
  Time (mean ± σ):      29.6 ms ±   0.6 ms    [User: 30.2 ms, System: 4.8 ms]
  Range (min … max):    28.4 ms …  32.2 ms    61 runs

Benchmark 2: map.exe
  Time (mean ± σ):     180.0 ms ±   1.1 ms    [User: 175.8 ms, System: 2.6 ms]
  Range (min … max):   178.9 ms … 183.5 ms    15 runs

Summary
  'array.exe' ran
    6.09 ± 0.13 times faster than 'map.exe'
Benchmark code
const std = @import("std");
const iso3166 = @import("iso3166.zig");

pub fn main() !void {
    const iterations = 1_000_000;
    var rand = std.Random.DefaultPrng.init(0);
    const r = rand.random();

    const max_len = 256;
    var buf: [max_len]u8 = undefined;

    var num_invalid: u64 = 0;
    var i: u64 = 0;
    while (i < iterations) : (i += 1) {
        const len = if (r.float(f32) >= 0.01) 2 else r.uintAtMost(usize, max_len);
        const code = buf[0..len];
        const alpha_only = r.float(f32) < 0.95;
        if (alpha_only) {
            for (code) |*c| {
                c.* = r.uintAtMost(u8, 26);
                c.* += if (r.boolean()) 'a' else 'A';
            }
        } else {
            r.bytes(code);
        }
        _ = iso3166.Country.fromAlpha2StaticStringMap(code) catch {
            num_invalid += 1;
        };
    }

    std.debug.print("num_invalid: {}\n", .{num_invalid});
}

I’m sure there are ways to make this lookup even faster, but I hope it shows what I’m trying to get at.
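For reference, the index computation described above looks roughly like this (a sketch with illustrative names, not the actual iso3166.zig code):

const std = @import("std");

fn alpha2ToIndex(alpha2: []const u8) error{InvalidAlpha2Code}!usize {
    if (alpha2.len != 2) return error.InvalidAlpha2Code;
    // lowercase both characters, then map 'a'..'z' to 0..25
    const a = std.ascii.toLower(alpha2[0]);
    const b = std.ascii.toLower(alpha2[1]);
    if (a < 'a' or a > 'z' or b < 'a' or b > 'z') return error.InvalidAlpha2Code;
    // index into the [26 * 26]Country table
    return @as(u16, a - 'a') * 26 + (b - 'a');
}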

2 Likes

There are also some fine data structures in the std.enums namespace, which many readers may not be aware of. None of them allocate, making them appropriate for comptime. I’ve found them quite useful a number of times.
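For example, std.enums.EnumMap gives you a fixed-size, non-allocating map over an enum key type. A small sketch, assuming the keys can be modeled as an enum:

const std = @import("std");

const Country = enum { greece, netherlands, belgium };

const names = std.enums.EnumMap(Country, []const u8).init(.{
    .greece = "Greece",
    .netherlands = "Netherlands",
    .belgium = "Belgium",
});

test "enum map lookup" {
    try std.testing.expectEqualStrings("Netherlands", names.get(.netherlands).?);
}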

2 Likes

I was curious to see how my static-map compared to your hand-rolled lookup, so I added a benchmark here. Looks like the hand-rolled lookup is about 25% faster in this benchmark, which does 100_000 iterations, each searching for a random 2-letter key.

$ zig build -Doptimize=ReleaseFast -Dbench-file=src/bench2.zig
$ poop 'zig-out/bin/bench std_static_string_map' 'zig-out/bin/bench this_static_string_map2' 'zig-out/bin/bench squeek502_hand_rolled'
Benchmark 1 (707 runs): zig-out/bin/bench std_static_string_map
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          7.02ms ±  212us    6.74ms … 8.45ms         89 (13%)        0%
  peak_rss            881KB ±    0       881KB …  881KB          0 ( 0%)        0%
  cpu_cycles         30.3M  ±  684K     30.0M  … 35.0M          76 (11%)        0%
  instructions       95.9M  ± 0.94      95.9M  … 95.9M          10 ( 1%)        0%
  cache_references   5.38K  ± 39.6K     1.52K  …  550K          27 ( 4%)        0%
  cache_misses        475   ± 42.9       410   … 1.02K          25 ( 4%)        0%
  branch_misses       307K  ± 1.16K      305K  …  312K          36 ( 5%)        0%
Benchmark 2 (1896 runs): zig-out/bin/bench this_static_string_map2
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          2.60ms ± 87.4us    2.40ms … 3.47ms         25 ( 1%)        ⚡- 62.9% ±  0.2%
  peak_rss            909KB ±    0       909KB …  909KB          0 ( 0%)        💩+  3.3% ±  0.0%
  cpu_cycles         9.98M  ± 67.7K     9.90M  … 11.6M          28 ( 1%)        ⚡- 67.0% ±  0.1%
  instructions       14.2M  ± 0.90      14.2M  … 14.2M          43 ( 2%)        ⚡- 85.2% ±  0.0%
  cache_references   2.07K  ± 5.63K     1.47K  …  213K          86 ( 5%)        ⚡- 61.4% ± 34.0%
  cache_misses        498   ± 52.4       434   … 1.62K         104 ( 5%)        💩+  4.8% ±  0.9%
  branch_misses       140K  ± 1.58K      137K  …  146K           1 ( 0%)        ⚡- 54.3% ±  0.0%
Benchmark 3 (2503 runs): zig-out/bin/bench squeek502_hand_rolled
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          1.96ms ± 51.6us    1.73ms … 2.57ms        246 (10%)        ⚡- 72.0% ±  0.1%
  peak_rss            909KB ±    0       909KB …  909KB          0 ( 0%)        💩+  3.3% ±  0.0%
  cpu_cycles         6.86M  ± 47.4K     6.83M  … 8.01M           5 ( 0%)        ⚡- 77.4% ±  0.1%
  instructions       9.11M  ± 1.03      9.11M  … 9.11M          73 ( 3%)        ⚡- 90.5% ±  0.0%
  cache_references   1.54K  ± 2.12K     1.26K  … 58.8K          45 ( 2%)        ⚡- 71.3% ± 29.0%
  cache_misses        399   ± 38.6       335   …  730           88 ( 4%)        ⚡- 15.9% ±  0.7%
  branch_misses      18.8K  ±  358      18.5K  … 20.8K         234 ( 9%)        ⚡- 93.9% ±  0.0%

This also shows how std.StaticStringMap() fares with this adversarial input.

1 Like

Note that your benchmark is not entirely 1:1, since my hand-rolled version does a case-insensitive lookup.

Has anyone thought about just using an enum?

const Map = enum(u32) {
    @"Greece" = 30,
    @"Netherlands" = 31,
    @"Belgium" = 32,
    ...
};

fn mapToString(val: u32) []const u8 {
    return @tagName(@as(Map, @enumFromInt(val)));
}

Not sure about the performance of this, but I think this is by far the simplest solution.

This has the heavy caveat that it will panic/cause UB if val doesn’t have an associated enum field (and the @tagName will panic/cause UB if Map is non-exhaustive and there’s no associated enum field with that value).

My assumption would be that a function with a switch would be fairly performant for enum -> T conversion, but haven’t checked to see how it performs vs std.enums.EnumMap.
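If you want the enum approach without the UB footgun, one option is std.meta.intToEnum, which returns an error instead of invoking UB. A sketch, assuming the Map enum from the snippet above and const std = @import("std"):

fn mapToStringSafe(val: u32) ?[]const u8 {
    // returns error.InvalidEnumTag for values with no matching field
    const tag = std.meta.intToEnum(Map, val) catch return null;
    return @tagName(tag);
}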

1 Like

Ah, that’s true. I added a specialized, case-insensitive map, modified the benchmark alphabet to include uppercase letters, and doubled that map’s capacity. Now I’m able to slightly edge out the hand-rolled version.

I wonder if the hand-rolled code would benefit from doing the lowercasing as I’ve done (by or-ing with 32, as in key[0] | 32). Not sure, as I didn’t check; just speculating as to why mine might be slightly faster.

$ zig build -Doptimize=ReleaseFast -Dbench-file=src/bench2.zig
$ poop -d 1000 'zig-out/bin/bench std_static_string_map' 'zig-out/bin/bench static_map' 'zig-out/bin/bench squeek502_hand_rolled' 'zig-out/bin/bench static_map_case_insensitive'
# ...
Benchmark 3 (502 runs): zig-out/bin/bench squeek502_hand_rolled
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          1.96ms ± 63.1us    1.73ms … 2.62ms         76 (15%)        ⚡- 73.0% ±  0.4%
  peak_rss            909KB ±    0       909KB …  909KB          0 ( 0%)        💩+  3.3% ±  0.0%
  cpu_cycles         6.81M  ± 49.7K     6.79M  … 7.88M          20 ( 4%)        ⚡- 78.4% ±  0.3%
  instructions       9.62M  ± 0.90      9.62M  … 9.62M           5 ( 1%)        ⚡- 91.3% ±  0.0%
  cache_references   1.61K  ± 2.57K     1.28K  … 58.9K           8 ( 2%)          - 19.2% ± 22.3%
  cache_misses        413   ± 29.7       360   …  683           22 ( 4%)        ⚡- 23.9% ±  9.4%
  branch_misses       147K  ± 1.59K      146K  …  153K          77 (15%)        ⚡- 38.1% ±  0.1%
Benchmark 4 (584 runs): zig-out/bin/bench static_map_case_insensitive
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          1.68ms ± 71.7us    1.50ms … 2.16ms          2 ( 0%)        ⚡- 76.9% ±  0.3%
  peak_rss            909KB ±    0       909KB …  909KB          0 ( 0%)        💩+  3.3% ±  0.0%
  cpu_cycles         5.72M  ± 94.1K     5.68M  … 7.83M          62 (11%)        ⚡- 81.8% ±  0.2%
  instructions       8.35M  ± 1.06      8.35M  … 8.35M           4 ( 1%)        ⚡- 92.4% ±  0.0%
  cache_references   2.44K  ± 9.18K     1.58K  …  219K          48 ( 8%)          + 22.5% ± 78.2%
  cache_misses        552   ± 67.4       472   … 1.46K          48 ( 8%)          +  1.8% ±  8.9%
  branch_misses       124K  ± 1.11K      123K  …  130K          58 (10%)        ⚡- 48.0% ±  0.1%

I would have expected to see more peak RSS due to the increased capacity. I guess that might show up in the cache misses.

Anyway, just wanted to make a fairer benchmark and see if I could get similar results with a custom context and some tuning.

1 Like

I wonder if the hand-rolled code would benefit from doing the lowercasing as I’ve done (by or-ing with 32, as in key[0] | 32).

@squeek502 Not 100% sure of the correctness, but I was able to bring your hand-rolled time down from 1.97ms to 969us with this change:

pub fn alpha2_to_lookup_id(alpha2_code: []const u8) error{InvalidAlpha2Code}!usize {
    if (alpha2_code.len != 2) return error.InvalidAlpha2Code;
    if (alpha2_code[0] < 'A' or alpha2_code[0] > 'z' or
        alpha2_code[1] < 'A' or alpha2_code[1] > 'z') return error.InvalidAlpha2Code;
    const a = (alpha2_code[0] | 32) - 'a';
    const b = (alpha2_code[1] | 32) - 'a';
    return @as(u16, a) * 26 + b;
}
1 Like

That range comparison will not return early for some characters outside [a-z][A-Z] (the six ASCII characters between 'Z' and 'a', namely '[', '\', ']', '^', '_' and '`', pass the check). Just a caveat; I did not read the details to find out if this might be a problem.

2 Likes