Replace mem.eql in stringToEnum() fallback case with hash-based check

A direct string comparison is slow, and I believe a hash-based check would speed up stringToEnum in the cases where it falls back on the slower approach. This is my proposed implementation. Also, please tell me if this would be better suited to an issue/PR; I’m very new to the whole “being helpful” thing.

pub fn stringToEnum(comptime T: type, str: []const u8) ?T {
    // Using StaticStringMap here is more performant, but it will start to take too
    // long to compile if the enum is large enough, due to the current limits of comptime
    // performance when doing things like constructing lookup maps at comptime.
    // TODO The '100' here is arbitrary and should be increased when possible:
    // - https://github.com/ziglang/zig/issues/4055
    // - https://github.com/ziglang/zig/issues/3863
    if (@typeInfo(T).@"enum".fields.len <= 100) {
        const kvs = comptime build_kvs: {
            const EnumKV = struct { []const u8, T };
            var kvs_array: [@typeInfo(T).@"enum".fields.len]EnumKV = undefined;
            for (@typeInfo(T).@"enum".fields, 0..) |enumField, i| {
                kvs_array[i] = .{ enumField.name, @field(T, enumField.name) };
            }
            break :build_kvs kvs_array[0..];
        };
        const map = std.StaticStringMap(T).initComptime(kvs);
        return map.get(str);
    } else {
        const hash = std.hash.Fnv1a_64.hash(str);
        inline for (@typeInfo(T).@"enum".fields) |enum_field| {
            const field_hash = comptime std.hash.Fnv1a_64.hash(enum_field.name);
            if (hash == field_hash) {
                return @field(T, enum_field.name);
            }
        }
        return null;
    }
}

Unless you provide benchmarks showing when and where it is faster, I will remain highly doubtful that this is faster.

Why? Because mem.eql can easily exit early in many cases, whereas with hashing you always have to compute the full hash of the input.
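To illustrate the early-exit point, here is a minimal sketch of what slice equality can do (this mirrors the behavior, not the actual std source, which also compares wider chunks when it can):

```zig
const std = @import("std");

// Sketch of why slice equality can exit early (not the actual std implementation):
fn eqlSketch(a: []const u8, b: []const u8) bool {
    if (a.len != b.len) return false; // different lengths: no bytes touched at all
    for (a, b) |x, y| {
        if (x != y) return false; // first mismatching byte ends the scan
    }
    return true;
}

test "early exit cases" {
    // Length mismatch rejects without reading any characters.
    try std.testing.expect(!eqlSketch("foo", "foobar"));
    // First-character mismatch rejects after one comparison.
    try std.testing.expect(!eqlSketch("apple", "zpple"));
    try std.testing.expect(eqlSketch("abc", "abc"));
}
```

A hash of the search string, by contrast, must consume every byte of it before any comparison can happen.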


I am not entirely sure, but I think part of the idea is that the optimizer should recognize patterns across the ifs and generate better code based on whatever similarities exist between those branches (for example, grouping strings of the same length, or maybe even strings with shared prefixes).

I think LLVM just gives up trying to find a better solution after some fixed size, but I still think part of the idea of that fallback is to feed the code generation backend something fairly straightforward and let it optimize.


It’s definitely an improvement, but you can do even better. Instead of a linear search for which enum matches the hash, you can sort the hashes and do a binary search. I’m doing something similar for enums.fromInt.
Also, this is not sufficient:

if (hash == field_hash) 
  return @field(T, enum_field.name);

You need to do a std.mem.eql with enum_field.name to make sure it’s not a hash collision.
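Concretely, the fallback branch with that collision check added might look like this (a sketch, assuming the same Fnv1a_64 hash as the proposal; the `Suit` enum below is just a hypothetical example for the test):

```zig
const std = @import("std");

fn stringToEnumFallback(comptime T: type, str: []const u8) ?T {
    const hash = std.hash.Fnv1a_64.hash(str);
    inline for (@typeInfo(T).@"enum".fields) |enum_field| {
        // The field name is comptime-known, so its hash can be precomputed.
        const field_hash = comptime std.hash.Fnv1a_64.hash(enum_field.name);
        // The hash match is only a filter; confirm with a byte comparison
        // to rule out collisions.
        if (hash == field_hash and std.mem.eql(u8, str, enum_field.name)) {
            return @field(T, enum_field.name);
        }
    }
    return null;
}

test "fallback with collision check" {
    const Suit = enum { hearts, spades, clubs, diamonds };
    try std.testing.expectEqual(@as(?Suit, .spades), stringToEnumFallback(Suit, "spades"));
    try std.testing.expectEqual(@as(?Suit, null), stringToEnumFallback(Suit, "joker"));
}
```

Note that the mem.eql only runs for names whose hash already matched, so in the common case it executes at most once.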

Sorry if I’m wrong, but consider the best-case scenario for hashing: an enum with 101 fields where every name starts with a different character, so mem.eql can return false on the first character of every field. Wouldn’t the current approach end up doing a comparison for every field, while the hashing approach only does one step per character of the name? If I didn’t just make a mistake there, the current approach would only be faster if you were looking for the nth field or earlier, where n is the number of characters in the name.

Would it make more sense to add a mem.eql, or to increase the hash size to 128 bits so that the chance of a collision would be 2^-64?

In your case, std.mem.eql would do 101 × 1-character comparisons. Agreed. Note: when checking for equality, you don’t have to go byte by byte; you can compare 64/128/256 bits at a time if you have wide enough registers, and so check multiple characters at once.

A hashing version has to build the hash of the search string, which is linear in the length of the string, and then (assuming a 4-byte hash) do 101 × 4-byte comparisons.

It’s possible the hash version can be faster if all the possible values share the same start to their strings.

Let’s say they all match for 7 out of 8 bytes, and only differ on the last byte. That would be 101 × 8-byte comparisons (808 bytes with std.mem.eql) versus a hash generation over 20 bytes plus 101 × 4-byte comparisons (404 bytes compared, plus the cost of the hash). If the hashes are in a sorted container, then maybe you could do a binary search and make it faster.

No matter how low you push the collision probability, some values would still be incorrectly converted into an enum. In a general-purpose library, the only acceptable chance of that happening is 0.
For specific applications, a small error chance might be acceptable, but those users would have to write their own function.


My sense of the thing is that the intended solution is making comptime itself faster. That benefits everything.

How many cases are there of an enum with more than 100 values? How many of them need stringToEnum? Could those cases just use StaticStringMap directly? Personally I keep forgetting about std.meta.stringToEnum and writing a StaticStringMap when I need this.

Are we sure that hashing that many strings at comptime isn’t also slow? It might be, right?

You’re replacing one kind of linear probe with another. Improving the performance of that would require a binary search, which would mean sorting the hashes, and that’s guaranteed to be slower than constructing a StaticStringMap, because the sort is the slow part of that process.

I get that the hash version does slightly fewer comparisons (maybe), but basically you’re only winning when there are a lot of prefix matches of the same length. How common is that? Because if there are two enum fields .foo and .foobar, mem.eql will see that the lengths differ and won’t even follow the pointer. That’s pretty fast.

Basically this is unlikely to improve anything.

I think this is admirable! I hope I’m not discouraging that sentiment when I say that this would most likely not help. Sorry.
