Allocator/manager for shared objects?

AndrewCodeDev · January 6, 2024, 11:36pm

I’m reading through some C++ implementations for database objects and there’s a common theme - shared objects/pointers.

It’s a common question we get around pointers and people used to RAII - how to do shared memory where calling deinit doesn’t invalidate other objects.

So this got me to thinking about trying to create an allocator or manager that handles this idea.

I’m only at the conceptual step here, so I’m sure I’m wrong and have a lot to learn, but that’s why I’m here (after all).

Here’s my first approximation for the allocator:

First, we’d have to track the memory that’s been allocated - that could be stored in some data-structure that’s favourable for quick lookup.

For any memory that’s allocated, we save a pointer to it along with a counter. For different pointer types, that info could be saved either as the numeric value of the pointer (like @intFromPtr) and stored as a usize or even as an *anyopaque. That would have to happen with the alloc function.

When the user calls free, the item would be located and the associated counter is decremented. When counter reaches zero, the item can then be actually freed.

For resize, that would require the counter to be equal to one - otherwise, it just returns false. That way, we don’t invalidate other memory currently pointed to that same address.

Cons

The first one being that objects that require reallocation frequently would be more expensive. Same thing goes for resizing - unless you’re positive that you only have one count, it would fail most of the time. If that’s the case, then you probably aren’t actually sharing objects, making the entire mechanism awkward.

The other alternative is to skip the allocator idea entirely and just build a struct that wraps an allocator and provides an even more limited interface. One benefit here is we could omit resize entirely and communicate the intent more clearly.

Any thoughts?

AndrewCodeDev · January 7, 2024, 12:35am

Another hurdle with this idea is that it doesn’t support typical copying. So here’s an example…

Say I have an object with shared state and I want to copy it.

var u = ObjectWithSharedState.init(...); // initial object... counter 1

var v = u; // copies object, but doesn't increment counter, still 1

v.deinit(); // counter goes to 0, frees internal state.

// u is now invalidated

So with the allocator/manager approach, this seems very awkward for Zig. It seems like the object itself would need to implement a copy function itself.

Seems like it’s not very ergonomic unless all of the handling code of shared objects is entirely on the backend.

One alternative is to do everything through pointers - the same problem still applies, but a manager could be more useful here.

So like…

const u = try manager.create(...); // counter is 1

const v = manager.share(u); // counter is 2

manager.destroy(v); // counter is still 1

// u is still valid

Still has the following problem:

const u = try manager.create(...); // counter 1

const v = u; // counter still is 1

manager.destroy(v); // counter is 0

// u is now invalid - same with a normal pointer, really.

However, people are more privy to pointer semantics and usage, so maybe that’s not an issue here?

Cloudef · January 7, 2024, 2:12am

I think there’s no good solution for preventing copy until Proposal: Pinned Structs · Issue #7769 · ziglang/zig · GitHub or something similar is in the language.

The closest I can do right now is

const std = @import("std");

fn Shared(comptime T: type, comptime Src: std.builtin.SourceLocation) type {
    return struct {
        comptime uniq: std.builtin.SourceLocation = Src,
        ref_count: usize = 0,
        value: *T,

        fn init(value: *T) @This() {
            return .{ .value = value };
        }

        fn deinit(self: *@This()) void {
            if (self.ref_count >= 1) {
                self.ref_count -= 1;
                return;
            }
            self.value.deinit();
            self.* = undefined;
        }

        fn ref(self: *@This()) @This() {
            self.ref_count += 1;
            return self.*;
        }
    };
}

pub fn main() !void {
    var a: u32 = 1;
    var b: u32 = 1;

    var shared = Shared(u32, @src()).init(&a);
    const shared2 = Shared(u32, @src()).init(&b);
    const cpy = shared;
    shared = shared2;

    std.debug.print("{}, {}, {}", .{ a, shared, cpy });
}

Which at least prevents another shared object declaration being copied to another one, but still allows the same declarations copied over each other.

AndrewCodeDev · January 7, 2024, 3:37am

@Cloudef Cool trick

I went ahead and started a draft implementation of the manager object I was thinking of. It’s completely non-optimal (like array list and linear search) and currently I’m just locking the mutex without much thought, but I wanted to start with something:

const std = @import("std");
const builtin = @import("builtin");
const Allocator = std.mem.Allocator;
const Mutex = std.Thread.Mutex;
const ArrayList = std.ArrayList;

const debug = (builtin.mode == std.builtin.OptimizeMode.Debug);

fn debugAssert(comptime message: []const u8, check: bool) void {
    if (comptime debug) {
        if (!check) @panic(message);
    }
}

const ManagerConfig = struct {
    thread_safe: bool,  
    // other stuff maybe...
};

const DummyMutex = struct {
    fn lock(_: *DummyMutex) void {}
    fn unlock(_: *DummyMutex) void {}
};

const addr_err_msg = "Address not found in managed counters.";
const refs_err_msg = "References must be greater than zero.";

const SharedCounter = struct {
    address: usize,
    counter: usize,
};

pub fn Manager(comptime config: ManagerConfig) type {
    
    return struct {

        const Self = @This();

        allocator: Allocator,

        mutex: if (config.thread_safe) Mutex else DummyMutex,

        // TODO: replace ArrayList in the future
        shared_counters: ArrayList(SharedCounter),

        pub fn init(allocator: Allocator) Self {
            return .{
                .allocator = allocator,
                .mutex = .{},
                .shared_counters = ArrayList(SharedCounter).init(allocator),
            };
        }

        pub fn deinit(self: *Self) void {
            self.shared_counters.deinit();
        }

        pub fn create(self: *Self, comptime T: type) !*T {

            self.mutex.lock();

            defer self.mutex.unlock();
            
            const ptr = try self.allocator.create(T);

            try self.shared_counters.append(SharedCounter{ 
                .address = @intFromPtr(ptr),
                .counter = 1
            });

            return ptr;
        }

        pub fn destroy(self: *Self, ptr: anytype) void {

            self.mutex.lock();

            defer self.mutex.unlock();

            const index = self.lookupAddress(@intFromPtr(ptr));

            if (index) |i| {
                const sc = &self.shared_counters.items[i];

                debugAssert("Manager.destroy: " ++ refs_err_msg, sc.counter != 0);

                if (sc.counter < 2) {

                    _ = self.shared_counters.swapRemove(i);

                    self.allocator.destroy(ptr);

                } else {
                    sc.counter -= 1;
                }
            }

            debugAssert("Manager.destroy: " ++ addr_err_msg, index != null);
        }


        pub fn share(self: *Self, ptr: anytype) @TypeOf(ptr) {

            self.mutex.lock();

            defer self.mutex.unlock();

            const index = self.lookupAddress(@intFromPtr(ptr));

            if (index) |i| {
                const sc = &self.shared_counters.items[i];

                debugAssert("Manager.share: " ++ refs_err_msg, sc.counter != 0);

                sc.counter += 1;
            }

            debugAssert("Manager.share: " ++ addr_err_msg, index != null);

            return ptr;
        }
        
        // TODO: replace this... linear search
        fn lookupAddress(self: *const Self, address: usize) ?usize {

            const N: usize = self.shared_counters.items.len;

            for (0..N) |i| {
                if (address == self.shared_counters.items[i].address) {
                    return i;
                } 
            }

            return null;
        }        

    };
}

pub fn main() !void {

    var gpa = std.heap.GeneralPurposeAllocator(.{}){ };

    var manager = Manager(.{ .thread_safe = false }).init(gpa.allocator());

    defer {
        manager.deinit();

        if (gpa.deinit() == .leak) @panic("LEAK DETECTED.");
    }

    // example...

    const u = try manager.create(usize);        

    std.debug.print("\ncounter: {}", .{ manager.shared_counters.items[0].counter });

    const v = manager.share(u);

    std.debug.print("\ncounter: {}", .{ manager.shared_counters.items[0].counter });

    manager.destroy(v);

    std.debug.print("\ncounter: {}", .{ manager.shared_counters.items[0].counter });

    manager.destroy(u);
    
    std.debug.print("\nFinal Shared Objects: {}\n", .{ manager.shared_counters.items.len });
}

It follows this example…

const u = try manager.create(...); // counter is 1

const v = manager.share(u); // counter is 2

manager.destroy(v); // counter is still 1

// u is still valid