You Never Give Me Your Money ... sorry Allocator

The well-known Zig allocator idiom is a design pattern where functions requiring memory take an std.mem.Allocator instance as an argument, delegating memory allocation decisions to the caller.
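For readers new to the idiom, a minimal sketch (joinWords is a hypothetical function, not from any real library):

const std = @import("std");

// The usual direction: the caller decides which allocator backs the
// work and owns the result.
fn joinWords(gpa: std.mem.Allocator, a: []const u8, b: []const u8) ![]u8 {
    return std.mem.concat(gpa, u8, &.{ a, " ", b });
}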

I am asking about the opposite direction: the library gives the allocator.

In my current project I’d like all clients of my code to use the same allocator.

A possible solution: the client gets the allocator from my library and uses it.

Are there any pitfalls in this decision?


I assume the reason you want them to use the same allocator is because you do some memory management in your library.

Just document what your library does with allocations; allocations are inherently tied to their allocator, and this is not unique to your library.

It is a rule everyone should be following, and it’s easy to follow, as it’s uncommon to have more than 2 allocators available to any piece of code.

So, there are no pitfalls, and it is the norm. Unless I am misunderstanding.

To be clear, I am talking about using the same allocator. On the other hand, forcing the use of an allocator provided by your library is limiting, especially when it only enforces a practice that clients already follow.

1 Like

Just for clarification: my library gets the allocator the regular way, i.e. the client provides it. Because the library starts threads and long-lived “objects”, I ask the client to provide a gpa-compatible allocator.

This is the allocator the library will return to the client code.

I use the “runtime_safety only fields” pattern extensively in my personal code.

In this pattern, in Debug or ReleaseSafe mode, the first client’s allocator is lazily recorded in a global context, and allocators passed by subsequent clients are checked against the recorded one using assertions. In ReleaseFast mode, these checks do not exist.

const std = @import("std");

pub const runtime_safety = switch (@import("builtin").mode) {
    .Debug, .ReleaseSafe => true,
    .ReleaseFast, .ReleaseSmall => false,
};

pub const SafetyCheck = struct {
    alloc: ?std.mem.Allocator,
};

pub const MyLib = struct {
    // ... other fields ...
    safety_check: if (runtime_safety) SafetyCheck else void = if (runtime_safety) .{ .alloc = null } else {},
};
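A sketch of what the check itself might look like (ensureSameAllocator is a hypothetical helper; two std.mem.Allocator values are “the same” when they share the state pointer and vtable):

fn ensureSameAllocator(lib: *MyLib, gpa: std.mem.Allocator) void {
    if (runtime_safety) {
        if (lib.safety_check.alloc) |first| {
            // Every later client must hand in the allocator recorded first.
            std.debug.assert(first.ptr == gpa.ptr and first.vtable == gpa.vtable);
        } else {
            // Lazily record the first client's allocator.
            lib.safety_check.alloc = gpa;
        }
    }
}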
1 Like

Am I missing something, or… are you just talking about making a custom allocator? Like, you can always make a var lib_gpa_state: mylib.AwesomeAllocator = .init; and then people can use lib_gpa_state.allocator() whenever an std.mem.Allocator is required.

Like… how the allocators in std.heap are implemented.
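A minimal sketch of that suggestion, assuming the library just wraps one of the std.heap allocators (lib_gpa_state and allocator() are hypothetical names):

const std = @import("std");

// One allocator state owned by the library; every client shares it.
var lib_gpa_state: std.heap.DebugAllocator(.{}) = .init;

pub fn allocator() std.mem.Allocator {
    return lib_gpa_state.allocator();
}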

const std = @import("std");
const Allocator = std.mem.Allocator;

pub const Library = struct {
    gpa: Allocator = undefined,

    pub fn Create(gpa: Allocator) !*Library {
        const ret = try gpa.create(Library);
        ret.gpa = gpa;
        return ret;
    }

    pub fn giveAllocator(lib: *Library) Allocator {
        return lib.gpa;
    }

    pub fn asyncSend(lib: *Library, buf: []const u8) void {
        // ...
    }
};

pub const Client = struct {
    lib: *Library = undefined,

    pub fn Create(lib: *Library) !*Client {
        const ret = try lib.giveAllocator().create(Client);
        ret.lib = lib;
        return ret;
    }

    pub fn process(cln: *Client) !void {
        // ...
        const blob: []const u8 = try cln.lib.giveAllocator().alloc(u8, 1024);
        cln.lib.asyncSend(blob);
        // ...
    }
};

    // ...
    const lib: *Library = try Library.Create(std.testing.allocator);

    // ...

    const client: *Client = try Client.Create(lib);

    // ...

    try client.process();

    // ...

We are talking about using the same allocator.

After the send finishes on another thread, the lib will use its own allocator to destroy the buffer.

I see, so the buf passed to asyncSend must be allocated by the same allocator passed in .Create in your case… I would just let lib.gpa be a field to be accessed, and encode that as a requirement:

pub const Library = struct {
    gpa: Allocator = undefined,

    pub fn init(gpa: Allocator) Library {
        return .{ .gpa = gpa };
    }

    // buf must be allocated by `lib.gpa`
    pub fn asyncSend(lib: *Library, buf: []const u8) void {
        // ...
    }
};

If this sounds mistake-prone, you could also keep an easier-to-use, less-footgun API for common usage, and drop down to the more granular one when you know you need it:

    // buf must be allocated by `lib.gpa`
    pub fn asyncSendAssumeAllocated(lib: *Library, buf: []const u8) void {
        // ...
    }

    // duplicate buffer with `lib.gpa`
    pub fn asyncSend(lib: *Library, buf: []const u8) !void {
        return lib.asyncSendAssumeAllocated(try lib.gpa.dupe(u8, buf));
    }

etc. On the client side:

const gpa: std.mem.Allocator = std.testing.allocator;
var lib: Library = .init(gpa);
var blob: std.ArrayList(u8) = try .initCapacity(lib.gpa, 1024);
defer blob.deinit();
try lib.asyncSend(blob.items);

EDIT: I would make the more footgun-prone API more verbose; edited accordingly.

EDIT2: Ideally you would also want some std.debug.assert to check the requirements, though I am not sure how to check whether a given buffer was allocated by a particular allocator.
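In general you can’t verify a buffer’s origin from the slice alone, but a variant that takes the allocator alongside the buffer can at least assert it matches the one the library stores (a sketch; asyncSendOwned is a hypothetical name):

// Can't prove `buf` came from `gpa`, but can assert the caller is at
// least claiming the same allocator the library was initialized with.
pub fn asyncSendOwned(lib: *Library, gpa: Allocator, buf: []const u8) void {
    std.debug.assert(gpa.ptr == lib.gpa.ptr and gpa.vtable == lib.gpa.vtable);
    lib.asyncSendAssumeAllocated(buf);
}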

1 Like

We had a long allocator-related discussion; see Attributes.

For my project,

mallocReplacement == true

will be good enough.

Is using the wrong allocator that much of a footgun that it requires so much gymnastics to solve and enforce in code? Surely your software does more things than allocating memory… right?

For the record, I have never had problems with allocators. If there are performance issues, the answer is almost always not better allocators, but better allocations. Maybe your concurrent or lock-free APIs/algorithms should just take pre-allocated memory? Maybe batch-allocate everything at the start of a parallel loop? Maybe don’t call allocator.alloc on every loop iteration / on every thread? Maybe you need a memory pool (sketched below)?

etc etc
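To make the memory-pool suggestion concrete, a minimal sketch using std.heap.MemoryPool (Message is a made-up type for illustration):

const std = @import("std");

const Message = struct { payload: [256]u8 = undefined };

test "reuse fixed-size slots instead of general-purpose allocation" {
    var pool = std.heap.MemoryPool(Message).init(std.testing.allocator);
    defer pool.deinit();

    const msg = try pool.create(); // pops a free slot when one is available
    defer pool.destroy(msg); // returns the slot to the pool for reuse
}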

(Same as some C++/Rust code I have seen in the wild. Maybe your programs have better things to do than initializing objects or wrapping things in newtypes and traits…?)

1 Like

It is always a footgun, but you’re right that it is unnecessary to enforce: it’s not that big of an issue, and it will be caught if you use a DebugAllocator, which you should in debug modes (unless it is way too slow; then it’s understandable to use something else).

Either way, when memory bugs happen, the first thing you check is whether you used the wrong allocator, if that’s even possible in your code.

Regardless, the main reason they are storing the allocator is to free the allocated memory from another thread, which is a valid solution.
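That scenario in miniature (a sketch; sendWorker is a hypothetical function): the buffer is handed to another thread, and that thread must free it with the allocator that created it.

const std = @import("std");

fn sendWorker(gpa: std.mem.Allocator, buf: []const u8) void {
    // ... write `buf` to the socket ...
    gpa.free(buf); // must be the allocator that allocated `buf`
}

test "free on another thread" {
    const gpa = std.testing.allocator;
    const buf = try gpa.dupe(u8, "hello");
    const worker = try std.Thread.spawn(.{}, sendWorker, .{ gpa, buf });
    worker.join();
}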

1 Like

I think you need to identify what is being passed and whether or not it needs to be passed in. In this case, it seems like when you initialize the library you could add a field that is a buffer, or have the client provide the buffer if you want them to control its size.
Then, instead of passing a buffer in, you could have clients use the internal library buffer and pass slices of that buffer around.
This is from observing the code you provided, but ultimately I don’t think your problem is really that you need the allocator to be passed from the library to the client, but rather that there is certain data that should be owned by the library that you are instead trying to pass around. You may want to rethink your ownership and data lifetimes here.

2 Likes

Since the library only accepts buffers allocated in certain ways, it should allocate the buffers itself and hand them back to the clients to use. These buffers are library-owned. See the sample below:

pub const Library = struct {
    gpa: Allocator,

    pub fn getBuf(lib: *const Library, size: usize) ![]u8 {
        return try lib.gpa.alloc(u8, size);
    }

    /// Client must pass in a buf allocated with getBuf().
    pub fn asyncSend(lib: *Library, buf: []const u8) void {
        // ...
        // much later, after the async operation completes:
        lib.gpa.free(buf);
    }
};

pub const Client = struct {
    lib: *Library,

    pub fn process(cln: *Client) !void {
        const blob = try cln.lib.getBuf(1024);
        // ... fill blob ...
        cln.lib.asyncSend(blob);
    }
};
1 Like

My fault: the library is actually long-lived multithreaded code (a service). Its main functionality is asynchronous message passing via TCP sockets and Unix Domain Sockets.

I do not force the client to use the same allocator, but I help them use the same allocator if they want.

The interface of this service is done in allocator-like style: client code saves it within its own structs. So the client always has access to the allocator used by all participants…

Rough example of the main interface, Ampe (async message passing engine):

pub const Ampe = struct {
    // ...

    /// Gets a message from the internal pool based on the allocation strategy.
    pub fn get(ampe: Ampe, strategy: AllocationStrategy) status.AmpeError!?*message.Message {
        // ...
    }

    /// Returns a message to the internal pool.
    pub fn put(ampe: Ampe, msg: *?*message.Message) void {
        // ...
    }

    /// Creates new Channels.
    pub fn create(ampe: Ampe) status.AmpeError!Channels {
        // ...
    }

    /// Destroys Channels, stops communication, and frees memory.
    pub fn destroy(ampe: Ampe, chnls: Channels) status.AmpeError!void {
        // ...
    }

    /// Gets the allocator used by the engine for all memory management.
    pub fn getAllocator(ampe: Ampe) Allocator {
        // ...
    }
};

Ampe acts as an allocator, but instead of dealing with memory, it allocates/deallocates the main engine resources:

  • Messages
  • Channels

Rough example of the caller code:

const TofuServer = struct {
    const Self = @This();
    ampe: Ampe = undefined,
    chnls: ?tofu.Channels = undefined,
    cfg: Configurator = undefined,
    helloBh: message.BinaryHeader = undefined,
    connected: bool = undefined,

    pub fn create(engine: Ampe, cfg: *Configurator) status.AmpeError!*Self {
        const allocator = engine.getAllocator();
        const result: *Self = allocator.create(Self) catch {
            return status.AmpeError.AllocationFailed;
        };
        errdefer allocator.destroy(result);

        result.* = try Self.init(engine, cfg);

        try result.createListener();

        return result;
    }

    fn createListener(server: *Self) status.AmpeError!void {
        var welcomeRequest: ?*Message = server.ampe.get(tofu.AllocationStrategy.always) catch unreachable;
        defer server.ampe.put(&welcomeRequest);

        server.cfg.prepareRequest(welcomeRequest.?) catch unreachable;

        // returns the initial BinaryHeader (unused in this fragment)
        _ = server.chnls.?.sendToPeer(&welcomeRequest) catch unreachable;

        var welcomeResponse: ?*Message = try server.chnls.?.waitReceive(tofu.waitReceive_INFINITE_TIMEOUT);
        defer server.ampe.put(&welcomeResponse);

        welcomeResponse.?.bhdr.dumpProto("server recv ");

        if (welcomeResponse.?.bhdr.status == 0) {
            return;
        }
        try status.raw_to_error(welcomeResponse.?.bhdr.status);
    }
};

You can see that:

  • instead of the allocator interface, the first argument of create is the ampe interface, which is in effect the allocator of all resources for the participants;
  • like an allocator, ampe is stored by the client and used throughout the client’s life cycle.

Example of ampe creation:

    const eng: *Engine = try Engine.Create(gpa, options);
    defer eng.Destroy();
    const ampe: Ampe = try eng.ampe();

So the engine does not return its own allocator; it returns the allocator provided by the caller during creation.
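A sketch of how that can look inside the engine (assumed field and method layout; only the stored, caller-provided gpa is ever handed out):

pub const Engine = struct {
    gpa: Allocator, // saved from Engine.Create(gpa, options)

    pub fn getAllocator(eng: *Engine) Allocator {
        return eng.gpa; // hands back the caller-provided allocator, not a new one
    }
};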