I think it might be possible by constructing complicated struct literals at comptime instead, but I haven’t tried because `@unionInit` is the easier (and maybe the only) way to do it.
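As a minimal sketch of what `@unionInit` does with a comptime-known tag: the `Value` union and the `make` helper here are made up for illustration and are not part of the program further down.

```zig
const std = @import("std");

// Hypothetical union, used only for this illustration.
const Value = union(enum) {
    int: i64,
    text: []const u8,
};

// Builds a Value whose active field is chosen by a comptime-known tag.
// The payload type is looked up from the tag, so passing the wrong
// payload type for a tag is a compile error.
fn make(comptime tag: std.meta.Tag(Value), payload: std.meta.TagPayload(Value, tag)) Value {
    return @unionInit(Value, @tagName(tag), payload);
}

pub fn main() void {
    const a = make(.int, 42);
    const b = make(.text, "hello");
    std.debug.print("{d} {s}\n", .{ a.int, b.text });
}
```

The field name has to be comptime-known because `@unionInit` takes it as a string, which is why the tag parameter is marked `comptime`.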
If you only want to allow `ArenaAllocator`, you could accept a pointer to that directly. But there are cases where you want the allocator to behave in a way that isn’t directly enforced by the `std.mem.Allocator` interface, and you still don’t want to restrict it more than necessary, so that the user can choose.
So basically I meant that you could communicate to the user in some way that your data structure is going to leak its allocations. The user then has the choice to pass an `ArenaAllocator`, a big enough `FixedBufferAllocator`, or something similar that can be reset in some way. That way the user is still able to use the data structure without leaking memory globally until the program exits.
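Here is a rough sketch of that idea. `NameList` is a hypothetical structure invented for this example; it only takes a `std.mem.Allocator` and never frees anything itself, so the caller decides how the memory gets released.

```zig
const std = @import("std");

// Hypothetical structure for illustration: it allocates but has no deinit,
// so the caller should pass an allocator that can release everything at once.
const NameList = struct {
    alloc: std.mem.Allocator,
    count: usize = 0,

    // Allocates a copy of `name` and never frees it individually.
    fn add(self: *NameList, name: []const u8) ![]u8 {
        const copy = try self.alloc.dupe(u8, name);
        self.count += 1;
        return copy;
    }
};

pub fn main() !void {
    // Option 1: an arena; deinit frees every allocation NameList made.
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();
    var a = NameList{ .alloc = arena.allocator() };
    _ = try a.add("foo");
    _ = try a.add("bar");

    // Option 2: a fixed buffer; reset makes the whole buffer reusable.
    var buffer: [64]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buffer);
    var b = NameList{ .alloc = fba.allocator() };
    _ = try b.add("baz");
    fba.reset();

    std.debug.print("{d} {d}\n", .{ a.count, b.count });
}
```

Nothing in the type system says that `NameList` leaks, so this only works if it is documented or signalled by naming, which is the point about communicating it to the user.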
When I see an allocator called `arena` I just assume that it is going to leak memory, but maybe we should have some community-chosen names somewhere for different kinds of allocators. Maybe something like that already exists.
I don’t understand how this is a problem: the moment you have the tag you can switch on it, and within the different prongs of that switch the tag is compile-time known. Take a look at `random_node` or `default_node` below.
I think you have a misconception in thinking it can’t be comptime. The switch can choose between different code paths at run time, and each of these paths can have a different compile-time-known tag value, either by using `inline else => |comptime_tag|` or by you knowing that, for example, in the `.Foo` branch the tag is `.Foo`.
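Reduced to the bare minimum, the runtime-to-comptime step could look like the sketch below. `Kind`, `nameLen`, `nameLenRuntime` and `describe` are hypothetical names for this illustration only.

```zig
const std = @import("std");

// Hypothetical enum for illustration.
const Kind = enum { Foo, Bar };

// Needs the tag at comptime (here only to show that it is accepted as such).
fn nameLen(comptime kind: Kind) usize {
    return @tagName(kind).len;
}

// Takes a runtime tag; `inline else` generates one prong per tag, and inside
// each generated prong the captured value is comptime-known.
fn nameLenRuntime(kind: Kind) usize {
    return switch (kind) {
        inline else => |comptime_kind| nameLen(comptime_kind),
    };
}

// Same idea with explicit prongs: inside `.Foo =>` the tag can only be .Foo,
// so it can simply be written out as a comptime value.
fn describe(kind: Kind) []const u8 {
    return switch (kind) {
        .Foo => "tag is " ++ @tagName(Kind.Foo),
        .Bar => "tag is " ++ @tagName(Kind.Bar),
    };
}

pub fn main() void {
    std.debug.print("{d} {s}\n", .{ nameLenRuntime(.Bar), describe(.Foo) });
}
```

The branch itself is still taken at run time; only the code inside each prong is specialized for its tag.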
If you have the tag but not the data for the value, you can construct the default value, or even just leave it undefined and set the field values later (`undefined_node`), for example when you need the pointer to the node before you are done parsing the full value for it.
All the `inline else` switches can be put into their own helper functions like `defaultNode` if it is something that is needed multiple times. However, I don’t think it makes sense to put these switches, which are used to construct the right value, inside the data structure. Maybe there are cases where it makes sense to put the helper function `defaultNode` into the `Node` namespace, but I still think of it as a helper function that is just located there for some reason, while I think of the other init functions as the “real constructor functions”, because they are more basic and `defaultNode` just branches from runtime tag to comptime tag.
If I wrote some of my own code and I knew what I needed, then I might end up writing functions like `defaultNode` directly in `Node`, if I only need default nodes. The way I write it below is more for the case where the data structure may be used in different ways and you want to keep the options open. I think it makes sense to think about the general cases and put the different helper things in separate functions that just boil down to the same basic functions. Writing specific things is good, but only if it doesn’t unnecessarily prescribe or restrict how it can be used at the call site.
Here are a bunch more variations. The program uses a random number generator to pick between different tags, so run it multiple times:
```zig
const std = @import("std");

const Node = union(enum) {
    const Tag = std.meta.Tag(Node);

    Foo: *struct { shared: u8 = 0, text: []const u8 = "foo" },
    Bar: *struct { shared: u8 = 1, num: f64 = 42.45656 },

    // The struct type that the payload pointer of `tag` points to.
    pub fn Value(comptime tag: Tag) type {
        return std.meta.Child(std.meta.TagPayload(Node, tag));
    }

    pub fn initPtr(comptime tag: Tag, payload_ptr: std.meta.TagPayload(Node, tag)) Node {
        return @unionInit(Node, @tagName(tag), payload_ptr);
    }

    pub fn init(alloc: std.mem.Allocator, comptime tag: Tag, payload_value: Value(tag)) !Node {
        const ptr = try alloc.create(@TypeOf(payload_value));
        ptr.* = payload_value;
        return initPtr(tag, ptr);
    }

    pub fn initPtrRuntime(tag: Tag, type_erased_payload_ptr: *anyopaque) Node {
        return switch (tag) {
            inline else => |comptime_tag| initPtr(comptime_tag, @ptrCast(@alignCast(type_erased_payload_ptr))),
        };
    }

    pub fn deinit(self: Node, alloc: std.mem.Allocator) void {
        switch (self) {
            inline else => |n| alloc.destroy(n),
        }
    }

    pub fn copy(self: Node, alloc: std.mem.Allocator) !Node {
        return switch (self) {
            inline else => |value_ptr, tag| try init(alloc, tag, value_ptr.*),
        };
    }

    pub fn print(self: *const Node, heading: []const u8) void {
        std.debug.print("{s}\n ", .{heading});
        switch (self.*) {
            .Foo => |f| std.debug.print("Foo shared: {d} text: {s}\n", .{ f.shared, f.text }),
            .Bar => |f| std.debug.print("Bar shared: {d} num: {e}\n", .{ f.shared, f.num }),
        }
        std.debug.print("\n", .{});
    }
};

// helper that branches from a runtime tag to a comptime tag
pub fn defaultNode(alloc: std.mem.Allocator, tag: Node.Tag) !Node {
    return switch (tag) {
        inline else => |comptime_tag| Node.init(alloc, comptime_tag, .{}),
    };
}

const RndGen = std.rand.DefaultPrng;

pub fn pickRandomEnum(comptime T: type, random: std.rand.Random) T {
    const enum_fields = std.meta.fields(T);
    return @enumFromInt(random.uintLessThan(u8, enum_fields.len));
}

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const seed: u64 = @intCast(std.time.nanoTimestamp());
    var rng = RndGen.init(seed);
    const random = rng.random();

    const tag = pickRandomEnum(Node.Tag, random);

    // use a switch at the call site where you know the runtime value of tag
    // to switch over the different possibilities within the prongs of the switch
    // you know the tag type at comptime because you know that that prong will only
    // be selected for that type
    // so there is no need for the payload_value to be of an unknown type, we already know its type,
    // just by using a switch
    const random_node: Node = switch (tag) {
        .Foo => try Node.init(allocator, .Foo, .{ .shared = 4, .text = "fooooo" }),
        .Bar => try Node.init(allocator, .Bar, .{ .shared = 9, .num = 25.030303 }),
    };
    defer random_node.deinit(allocator);
    random_node.print("random");

    // if multiple prongs can be constructed in a similar way we can use an inline else switch
    // to generate the different cases
    const shared_random_node: Node = switch (pickRandomEnum(Node.Tag, random)) {
        inline else => |comptime_tag| try Node.init(allocator, comptime_tag, .{ .shared = @intFromEnum(comptime_tag) }),
    };
    defer shared_random_node.deinit(allocator);
    shared_random_node.print("shared_random");

    const default_node: Node = try defaultNode(allocator, pickRandomEnum(Node.Tag, random));
    defer default_node.deinit(allocator);
    default_node.print("default_node");

    const Foo = Node.Value(.Foo);
    const foo = try allocator.create(Foo);
    foo.* = .{ .shared = 1, .text = "hello" };
    const foo_node = Node.initPtr(.Foo, foo);
    defer foo_node.deinit(allocator);
    foo_node.print("foo_node");

    const Bar = Node.Value(.Bar);
    const api_internal_value = try allocator.create(Bar);
    api_internal_value.* = .{ .num = 11111.22222 };

    // let's pretend some api has given us a runtime tag value and an opaque (type erased) pointer
    const api_tag: Node.Tag = .Bar;
    const api_value: *anyopaque = @ptrCast(api_internal_value);
    const api_node: Node = Node.initPtrRuntime(api_tag, api_value);
    defer api_node.deinit(allocator);
    api_node.print("api_node");

    const undefined_node: Node = switch (pickRandomEnum(Node.Tag, random)) {
        inline else => |comptime_tag| try Node.init(allocator, comptime_tag, undefined),
    };
    defer undefined_node.deinit(allocator);
    // disabled because this could print a lot of garbage if we pick the Foo node and we are unlucky
    // (or segfault if we try reading memory we aren't allowed to access)
    // undefined_node.print("undefined_node garbage");

    switch (undefined_node) {
        inline else => |ptr, comptime_tag| {
            ptr.* = .{};
            const shared: u8 = @intFromEnum(comptime_tag);
            const special = 42;
            const special_shared = special + shared;
            ptr.shared = special_shared;
        },
    }
    undefined_node.print("undefined_node fixed");

    // these are const: writes go through the payload pointer, the union value itself never changes
    const bar_node = try Node.init(allocator, .Bar, .{ .shared = 1, .num = 4.2 });
    defer bar_node.deinit(allocator);
    bar_node.print("bar_node");

    const copy_node = try bar_node.copy(allocator);
    defer copy_node.deinit(allocator);
    copy_node.Bar.num = -260.3333;
    copy_node.print("copy_node");
}
```
I can’t think of another variation at the moment. Hopefully what you need is one of them; if it isn’t, I am really curious about what it could be. Maybe you could try adding one that is more like what you are doing.