Do HashMaps make a copy of the key and value?

I have an ArrayHashMap that takes a MessageKey struct as the key and a Message as the value.

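const std = @import("std");
const Allocator = std.mem.Allocator;
const ArrayHashMap = std.ArrayHashMap;
const Wyhash = std.hash.Wyhash;

const MessageKeyMap = @This();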

map: ArrayHashMap(MessageKey, Message, MessageKeyContext, true),
mutex: std.Thread.Mutex = .{},

pub const MessageKey = struct {
    id: u16,
    address: []const u8,
};

pub const MessageKeyContext = struct {
    pub fn hash(ctx: MessageKeyContext, key: MessageKey) u32 {
        _ = ctx;
        var hasher = Wyhash.init(0);
        hasher.update(&.{
            @truncate(key.id >> 8),
            @truncate(key.id & 0xFF),
        });
        hasher.update(key.address);
        return @as(u32, @truncate(hasher.final()));
    }

    pub fn eql(ctx: MessageKeyContext, a: MessageKey, b: MessageKey, b_index: usize) bool {
        _ = ctx;
        _ = b_index;

        if (a.id != b.id) {
            return false;
        }

        if (!std.mem.eql(u8, a.address, b.address)) {
            return false;
        }

        return true;
    }
};

pub fn init(allocator: Allocator) MessageKeyMap {
    return .{
        .mutex = .{},
        .map = ArrayHashMap(
            MessageKey,
            Message,
            MessageKeyContext,
            true,
        ).init(allocator),
    };
}

put and get are just the default put and get, wrapped in Mutex locks.

pub fn put(self: *MessageKeyMap, key: MessageKey, value: Message) !void {
    self.mutex.lock();
    defer self.mutex.unlock();
    try self.map.put(key, value);
}

pub fn get(self: *MessageKeyMap, key: MessageKey) ?Message {
    self.mutex.lock();
    defer self.mutex.unlock();
    return self.map.get(key);
}

I use it in an incoming-event handler to store messages that are pending an answer while the answers are being processed.
When the answer from the next stage has been processed, a new, similar handler is triggered, and somehow MessageKey.address has already changed by then. Since slices are just a pointer plus a length, I think it is possible that the ArrayHashMap sees the new contents through the same pointer (Message also contains slices, and they change too, if I’m correct).
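A minimal sketch of the suspected effect, assuming the managed containers from the Zig 0.11 to 0.14 standard library. It uses `std.StringArrayHashMap(u32)` instead of the real `MessageKey`/`Message` types, and `recv_buf` is a hypothetical stand-in for whatever buffer the handler reuses between messages:

```zig
const std = @import("std");

test "the map stores the slice header, not the bytes" {
    var map = std.StringArrayHashMap(u32).init(std.testing.allocator);
    defer map.deinit();

    // Hypothetical receive buffer that is reused for every incoming message.
    var recv_buf: [4]u8 = "dev1".*;

    // The key is a slice into recv_buf: only pointer + length are copied.
    try map.put(&recv_buf, 1);
    try std.testing.expect(map.get("dev1") != null);

    // The next message overwrites the buffer in place.
    @memcpy(&recv_buf, "dev2");

    // The stored key now reads as the new bytes...
    try std.testing.expectEqualStrings("dev2", map.keys()[0]);
    // ...so the original key no longer compares equal to anything in the map.
    try std.testing.expect(map.get("dev1") == null);
}
```

If this matches what the handlers do, the pending entry is still in the map, but its key no longer matches the bytes it was inserted with.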

This is the handler for requests:

pub fn middleware(device: *Device, msg: Message) void {
    const message_key: MessageKey = .{
        .id = msg.id,
        .address = msg.recipient,
    };

    ... some checks and logs ...

    device.message_pending_map.put(
        message_key,
        msg,
    ) catch |err| {
        log.err(
            "{s} failed to store pending message: {any}",
            .{ device.address, err },
        );
        return;
    };

    ...
}

This is the handler for answers:

pub fn middleware(device: *Device, msg: Message) void {
    const new_address = ... retrieval of requested address from answer ...
    if (device.message_pending_map.get(.{
        .id = msg.id,
        .address = new_address,
    })) |pending_msg| {
        ... This part never triggers ...

I think I need to copy the slice with an Allocator and then store the copy. Is that the case?


You’re right. The map copies the key and value structs themselves, but a slice is only a pointer and a length, so the bytes it points to are not copied. The HashMap doesn’t manage the pointers you give it and knows nothing about the lifetime of a string you pass in. If you want that memory to be owned alongside the map, you need to make an object that holds both the HashMap and a buffer for the string data. Sometimes I’ll just use an ArrayList(u8) to hold the string data. In that case you can’t store slices or pointers into it, because the ArrayList might move to a different address when it resizes, so you typically store indices into the buffer instead. This can be nice, though: with u32s for the index and length, the handle is half the size of a slice on a 64-bit target. A similar strategy would be using an arena allocator.
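The buffer-plus-indices idea could look roughly like this. It is only a sketch, assuming the managed `std.ArrayList` from Zig 0.11 to 0.14. `StringPool`, `Ref`, `add` and `get` are made-up names for illustration, not anything from the standard library:

```zig
const std = @import("std");

/// Hypothetical helper: owns all string bytes in one growable buffer and
/// hands out (offset, length) handles instead of slices.
const StringPool = struct {
    bytes: std.ArrayList(u8),

    /// Index-based handle: two u32s instead of a pointer plus a usize.
    const Ref = struct { off: u32, len: u32 };

    fn init(allocator: std.mem.Allocator) StringPool {
        return .{ .bytes = std.ArrayList(u8).init(allocator) };
    }

    fn deinit(self: *StringPool) void {
        self.bytes.deinit();
    }

    /// Copies `s` into the pool and returns a handle to the copy.
    fn add(self: *StringPool, s: []const u8) !Ref {
        const off: u32 = @intCast(self.bytes.items.len);
        const len: u32 = @intCast(s.len);
        try self.bytes.appendSlice(s);
        return Ref{ .off = off, .len = len };
    }

    /// Resolves a handle. The returned slice is only valid until the next
    /// `add`, because the buffer may be reallocated.
    fn get(self: *const StringPool, ref: Ref) []const u8 {
        return self.bytes.items[ref.off..][0..ref.len];
    }
};

test "handles stay valid when the buffer grows" {
    var pool = StringPool.init(std.testing.allocator);
    defer pool.deinit();

    const a = try pool.add("10.0.0.1");
    const b = try pool.add("10.0.0.2"); // may move the underlying buffer

    try std.testing.expectEqualStrings("10.0.0.1", pool.get(a));
    try std.testing.expectEqualStrings("10.0.0.2", pool.get(b));
}
```

A key would then carry a `Ref` in place of the `[]const u8`, and the hash and eql functions would need access to the pool (for example through the context value) to resolve it back to bytes.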

Alternatively, you could allocate the memory wherever you like and just make sure you free each individual allocation when you’re done.
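A sketch of the allocate-and-free-each approach, again assuming Zig 0.11 to 0.14 and a plain string key rather than the full `MessageKey`. `OwnedKeyMap` and its methods are hypothetical names. The point is only that every key is duplicated on insert and freed on removal or deinit:

```zig
const std = @import("std");

/// Hypothetical wrapper: the map owns a private copy of every key string.
const OwnedKeyMap = struct {
    allocator: std.mem.Allocator,
    map: std.StringArrayHashMap(u32),

    fn init(allocator: std.mem.Allocator) OwnedKeyMap {
        return .{
            .allocator = allocator,
            .map = std.StringArrayHashMap(u32).init(allocator),
        };
    }

    fn deinit(self: *OwnedKeyMap) void {
        // Free every key copy before releasing the map's own storage.
        for (self.map.keys()) |key| self.allocator.free(key);
        self.map.deinit();
    }

    fn put(self: *OwnedKeyMap, key: []const u8, value: u32) !void {
        // Existing entry: keep the copy we already own, just update the value.
        if (self.map.getPtr(key)) |value_ptr| {
            value_ptr.* = value;
            return;
        }
        // New entry: store a duplicate so the caller may reuse its buffer.
        const owned = try self.allocator.dupe(u8, key);
        errdefer self.allocator.free(owned);
        try self.map.put(owned, value);
    }

    fn remove(self: *OwnedKeyMap, key: []const u8) bool {
        const kv = self.map.fetchSwapRemove(key) orelse return false;
        self.allocator.free(kv.key);
        return true;
    }
};

test "owned keys survive changes to the caller's buffer" {
    var m = OwnedKeyMap.init(std.testing.allocator);
    defer m.deinit();

    var buf: [4]u8 = "dev1".*;
    try m.put(&buf, 1);
    @memcpy(&buf, "dev2"); // the caller reuses its buffer

    try std.testing.expectEqual(@as(?u32, 1), m.map.get("dev1"));
    try std.testing.expect(m.remove("dev1"));
}
```

Any slices inside the stored value would need the same treatment if their backing memory is also reused.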

Another alternative is to allocate strings inside a SegmentedList. That would give you pointer stability, but you’d have to make sure your individual strings are placed in contiguous memory.


Thank you! Very clear explanation!