Making opaque a real type

Currently, opaque is not something we can create an instance of. We can only create pointers to opaque. To hide implementation details we might do something like this:


const Context = struct {
    field1: usize,
    field2: usize,
};
const ContextOpaquePtr = *align(@alignOf(Context)) opaque {};

pub fn startContext() !ContextOpaquePtr {
    const allocator = gpa.allocator();
    const ptr = try allocator.create(Context);
    ptr.* = .{ .field1 = 123, .field2 = 456 };
    return @ptrCast(ptr);
}

The shapeless nature of opaque basically forces us to use the heap, increasing code size and slowing things down. Most of the time we don’t really need that level of secrecy. It’s okay for the caller to know the dimensions and memory alignments of our structures. Opaqueness alone is sufficient.

So here’s my idea: allow the assignment of a backing type to an opaque.
Such an opaque would inherit a size and alignment from the backing type. This info allows the caller to provide memory storage on the stack for what is otherwise an unknown structure. The example above can then be implemented like so:

const Context = struct {
    field1: usize,
    field2: usize,
};
const ContextOpaque = opaque(Context) {};

pub fn startContext() ContextOpaque {
    const cxt: Context = .{ .field1 = 123, .field2 = 456 };
    return @bitCast(cxt);
}

In addition, there would be a built-in function @expose(), which would let us selectively expose decls from the backing type:

const Context = struct {
    field1: usize,
    field2: usize,

    fn getX(self: Context) usize {
        return self.field1;
    }
};
const ContextOpaque = opaque(Context) {
    pub const getX = @expose("getX");
};

The built-in would automatically cast the function so that self is of the opaque type.

Interesting notion. What do you see as the advantage of this approach?

I think of anyopaque as Zig’s answer to void * in C: a type-generic way to include a reference to some memory, where a consumer is expected to know what that reference means. opaque itself is a way to provide more type safety than you get with void *, because that lets you have distinct categories of otherwise-unknown value.

This gets used for applications like a classic context object, which is defined as a function pointer of a given signature, and a *anyopaque pointer which is an argument to that function pointer. Then the function casts the pointer to what it knows it to be. That means that the code consuming the context object doesn’t have to be comptime-generic, which can be inconvenient or impossible.
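
A minimal sketch of that context-object pattern, for readers who have not seen it. The names Callback and Counter are made up for illustration and do not come from any library; the only point is that the consumer stores a function pointer plus a *anyopaque, and only the callee knows how to cast it back.

```zig
const std = @import("std");

// Hypothetical type-erased callback: a function pointer plus the
// *anyopaque argument that function expects.
const Callback = struct {
    ctx: *anyopaque,
    call: *const fn (ctx: *anyopaque, value: u32) void,

    fn invoke(self: Callback, value: u32) void {
        self.call(self.ctx, value);
    }
};

// Hypothetical concrete state hidden behind the erased pointer.
const Counter = struct {
    total: u32 = 0,

    // The callee knows what the erased pointer really is and casts it back.
    fn add(ctx: *anyopaque, value: u32) void {
        const self: *Counter = @ptrCast(@alignCast(ctx));
        self.total += value;
    }

    fn callback(self: *Counter) Callback {
        return .{ .ctx = self, .call = add };
    }
};

test "type-erased context object" {
    var counter: Counter = .{};
    const cb = counter.callback();
    cb.invoke(2);
    cb.invoke(40);
    try std.testing.expectEqual(@as(u32, 42), counter.total);
}
```

Code that holds a Callback never needs to be generic over Counter, which is the convenience described above.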

What sort of tasks do you see a size-bearing opaque type as achieving? I don’t personally think hiding fields from consumers of some struct is worthwhile, and that seems to be the consensus: it wouldn’t be difficult to add private fields, but that idea has been rejected when proposed.

So I’m not seeing what a structure-erased blob with a size and alignment would really get us. Could you describe an application to go with the mechanism? You can make a pointer to some stack-allocated struct into a *anyopaque now, as long as you’re passing it down-stack so that the memory remains valid:

const std = @import("std");

const StackStruct = struct {
    []const u8,
    []const u8,
    []const u8,
};

fn printStackStruct(erased: *const anyopaque) void {
    const unerased: *const StackStruct = @ptrCast(@alignCast(erased));
    std.debug.print("{s}\n", .{unerased[1]});
}

test "opaque pointer" {
    const on_stack: StackStruct = .{ "fee", "fie", "foe" };
    const eraser: *const anyopaque = @ptrCast(&on_stack);
    printStackStruct(eraser);
}

A good example is this struct in zig-sqlite:

pub const Db = struct {
    const Self = @This();

    db: *c.sqlite3,
    // ...
};

Basically, all it contains is a C pointer. The content of the struct has the same size as a pointer to the struct. Right now, I have to allocate 4/8 bytes off the heap in order to return it as an opaque pointer. It would be more efficient if I could just return 4/8 opaque bytes. One less indirection and no need to link in code for an allocator.

I’m working on an example for my project Zigar. Here’s some actual code:

var gpa = std.heap.GeneralPurposeAllocator(.{}){};

const SqliteOpaquePtr = *align(@alignOf(sqlite.Db)) opaque {};

pub fn openDb(path: [:0]const u8) !SqliteOpaquePtr {
    const allocator = gpa.allocator();
    const db_ptr = try allocator.create(sqlite.Db);
    errdefer allocator.destroy(db_ptr);
    db_ptr.* = try sqlite.Db.init(.{
        .mode = .{ .File = path },
        .open_flags = .{},
        .threading_mode = .MultiThread,
    });
    return @ptrCast(db_ptr);
}

pub fn closeDb(db_op: SqliteOpaquePtr) void {
    const db_ptr: *sqlite.Db = @ptrCast(db_op);
    db_ptr.deinit();
    const allocator = gpa.allocator();
    allocator.destroy(db_ptr);
}

While I could simply return sqlite.Db, that would lead to many of the struct’s functions getting exported to JavaScript, which are more or less useless because the most critical ones require comptime arguments and I can’t marshal calls to them. The struct is only useful when it comes back into Zig. Like this:

pub const Album = struct {
    AlbumId: u32,
    Title: []const u8,
    ArtistId: u32,
    Artist: []const u8,
};

const FindAlbumsIterator = Iterator(Album,
    \\SELECT a.AlbumId, a.Title, b.ArtistId, b.Name AS Artist
    \\FROM albums a
    \\INNER JOIN artists b ON a.ArtistId = b.ArtistId
    \\WHERE a.Title LIKE '%' || ? || '%'
);

pub fn findAlbums(db_op: SqliteOpaquePtr, title: []const u8) !FindAlbumsIterator {
    return try FindAlbumsIterator.init(db_op, .{title});
}

Conceptually, what I’m suggesting is not hard to understand. In the real world after all, opaque objects do have dimensions that we can observe. We are able to allocate space for opaque objects. The opposite is the counter-intuitive notion.

No conceptual difficulty here, certainly. Figuring out what it was meant to be good for is a different matter, so thanks for providing an example.

Although I’m not seeing why it would be necessary in this case. The sqlite.Db struct either exists on the heap (in which case, allocation is fine, and a *anyopaque is also fine), or it exists on the stack, in which case you can make a pointer to the stack copy and hand that down to the code which uses it.

Not seeing any middle ground. Maybe the sqlite.Db lives in another struct? But that’s the same thing with another level of indirection.

It’s not obvious to me why that would be, but I’ll take your word for it. Is that just how your library works? Because if the goal is “pass a struct but don’t provide the decls to JavaScript”, perhaps what you’re using to marshal the struct into JavaScript-land could be configured to not automatically provide all decls as functions?

But then again, since it’s an extern struct, you can @bitCast it into a big integer and pass that around. If you want.

I was also thinking a buffer of u8 that you can cast.

What about something like this:

const std = @import("std");

// faking sqlite
const c = struct {
    const sqlite3 = struct {
        here: u64 = 0,
        are: u64 = 232321,
        some: u64 = 434,
        fields: u64 = 971,
    };
};

const Ptr = *c.sqlite3;

pub const Db = struct {
    const Self = @This();

    db: Ptr,

    fn fromOpaque(o: OpaqueDb) Db {
        return .{ .db = @ptrFromInt(@as(usize, @bitCast(o.db))) };
    }

    // putting both methods here because opaque is supposed to be minimal
    fn toOpaque(self: Db) OpaqueDb {
        return .{ .db = @bitCast(@intFromPtr(self.db)) };
    }
};

pub const OpaqueDb = struct {
    db: [@sizeOf(Ptr)]u8 align(@alignOf(Ptr)), // using field with alignment
};

pub fn main() !void {
    var sqlite3 = c.sqlite3{};
    const sqlite3_ptr = &sqlite3;
    var internal_handle: Db = .{ .db = sqlite3_ptr };
    std.debug.print("internal_handle: {}\n", .{internal_handle});

    const external_handle = internal_handle.toOpaque();
    std.debug.print("external_handle: {}\n", .{external_handle});

    const reinternal_handle = Db.fromOpaque(external_handle);
    std.debug.print("reinternal_handle: {}\n", .{reinternal_handle});

    std.debug.print("\n--------------------------\n", .{});
    if (internal_handle.db == reinternal_handle.db) {
        std.debug.print("seems to work?\n", .{});
    } else {
        std.debug.print("fail\n", .{});
    }

    std.debug.print("OpaqueDb:\n", .{});
    std.debug.print("sizeOf: {d}\n", .{@sizeOf(OpaqueDb)});
    std.debug.print("alignOf: {d}\n", .{@alignOf(OpaqueDb)});
}

Output:

internal_handle: opaquewithfieldalign.Db{ .db = opaquewithfieldalign.c.sqlite3{ .here = 0, .are = 232321, .some = 434, .fields = 971 } }
external_handle: opaquewithfieldalign.OpaqueDb{ .db = { 104, 196, 241, 25, 252, 127, 0, 0 } }
reinternal_handle: opaquewithfieldalign.Db{ .db = opaquewithfieldalign.c.sqlite3{ .here = 0, .are = 232321, .some = 434, .fields = 971 } }

--------------------------
seems to work?
OpaqueDb:
sizeOf: 8
alignOf: 8

The struct is being returned, not being handed down. And of course we can’t return a pointer to a stack variable. That was my point about being forced into allocating from the heap.

With my code, what actually happens when a struct is returned by value is that the struct will first land in the stack frame of a comptime-generated function that I call a “thunk”. The thunk copies the bytes into a JavaScript ArrayBuffer. The struct will then be floating around in garbage-collected memory until it’s passed to the thunk of another function, which does the reverse.

A lot of what I do is very similar to how you would marshal a function call across a process boundary or even across the network. Thanks to the magic of Zig comptime programming, everything can be done seamlessly and automatically.

Imagine, if you will, your computer is running an app that performs RPC to a server sitting in, say, Szczebrzeszyn, Poland. The first call would establish some sort of a context, stored in a struct that only the server in Szczebrzeszyn can understand. If this struct is returned by pointer, then the server in Szczebrzeszyn must keep a copy of it in memory so that when it receives the address again from your computer, it can find those bytes again. Just sending these opaque bytes down the wire is far simpler. And there’s no drawback unless the struct is larger than a network packet.

Yes, it’s a design choice made for the purpose of ease of use. If it’s public, it’s available. There’s really no other way to implement this. Even if tags get implemented eventually, the issue of third-party code would remain (programmers can’t stick tags into other people’s code). Converting other people’s structs into blobs of anonymous bytes seems the easiest solution. If I don’t fully understand what these things are, then it’s entirely reasonable for me to not expose them to programmers using my code.

It wouldn’t work since it’s not an extern struct. I can always make it work by being evil, but we don’t want that now, do we :wink:

Ok, that makes sense. I can think of other ways to handle it but I agree that just turning it into a blob and back would be the way to go.

You can do that with std.mem.toBytes, or, since you’re copying it anyway, asBytes. Then on the other end it’s just std.mem.bytesToValue, also with a bytesAsValue no-copy variant.
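
As a rough illustration of that round trip, here is a small sketch. Handle is a made-up stand-in for whatever struct is being hidden, and the call shapes assume a reasonably recent Zig standard library in which bytesToValue takes a pointer to the bytes.

```zig
const std = @import("std");

// Hypothetical stand-in for a struct whose layout callers should not rely on.
const Handle = struct {
    id: u32,
    flags: u16,
};

test "round-trip a struct through bytes" {
    const original: Handle = .{ .id = 7, .flags = 3 };

    // toBytes copies the value into a [@sizeOf(Handle)]u8 array.
    const bytes = std.mem.toBytes(original);
    try std.testing.expectEqual(@as(usize, @sizeOf(Handle)), bytes.len);

    // bytesToValue copies the bytes back out as a Handle.
    const restored = std.mem.bytesToValue(Handle, &bytes);
    try std.testing.expectEqual(original, restored);
}
```

The byte array carries no field information, so the receiving side can store or copy it but has to hand it back to code that knows the type.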

You could even make it type safe:

const SqlBlob = struct {
   blob: [@sizeOf(SqliteDb)]u8,
};

Why make programmers jump through hoops? Why make them do evil stuff like transforming a struct into a byte array when we can make the language semantically sounder? There’s no reason why an opaque should behave like anyopaque. “Something” is not “anything”. Anything can have any size. Something has some size.

Right now, pointers to opaques are like C pointers in that they point to memory regions of unknown extent. That creates problems when you try to duplicate or relocate data structures. You just don’t know how many bytes you need to copy.

I’m not sure I understand the point of this discussion.

Usually opaque types are used when a system wants to give you ABI stability, so for example a database might have a plugin system and give you an opaque pointer so that when a new database version comes out, plugins don’t need to be recompiled even if some types in the database have changed shape and/or size.

As I understand this is pretty much the only reason why one would use opaque types, so if you start undoing the ABI stability guarantee, you lose the only advantage this system gives you.

I don’t think the ABI stability question is all that relevant to Zig. Zig modules generally will make some use of comptime so you have to compile from source.

The main motivation of using opaque types is to prevent consumers of your code from creating dependencies on implementation details. Today you might be using library X. Some time later you might switch to library Y. That’d break people’s code if they had been accessing X’s API directly.

What hoops?

pub fn Blob(comptime T: type) type {
    return struct {
        // Array types take no `const` qualifier; bind the Blob itself as `const`.
        blob: [@sizeOf(T)]u8,

        const BlobType = @This();

        pub fn init(obj: T) BlobType {
            return BlobType{ .blob = std.mem.toBytes(obj) };
        }

        pub fn restore(blob: BlobType) T {
            // bytesToValue expects a pointer to the bytes.
            return std.mem.bytesToValue(T, &blob.blob);
        }
    };
}

How is this hoops? It’s the type you want, as a tiny userspace library. This niche use case doesn’t justify changes to the type system, because the type system can accommodate the application, as-is.

My definition of evil doesn’t include doing something which is provided by the standard library, so that we can have, for instance, and in particular, allocators.

What I’ve sketched out here cannot stop working, because it’s fundamental to how Zig operates. You’re not “allowed” to interpret the byte sequence, because Zig types don’t have a specified memory layout. But they do have a size, which is why @sizeOf always works, and they have a realized memory layout. It might change between compiler releases, it’s internal API, and it’s a mistake to read a subset of those bytes and expect to find a particular field. But round-tripping like this is a-ok. You are guaranteed by the semantics of the language to get your original instance struct back.

Read the code for one of the allocators. You’ll see what I mean.

There’s nothing type unsound about this construction, either. It’s a type-specific opaque container, where the data is const, so userspace can’t mess it up without trying really hard. To make the idea clear, I wrote restore as a method on the blob, but for your purposes you might want a layer of indirection so that restoration isn’t a temptation in user code. Obviously if they can get at std.mem then they can just do it themselves, but look: the proposal is to hand off data in opaque form, which means they have the data, so any sufficiently determined user is going to be able to rehydrate it.

If that’s a non-option, fine, then you pretty much have to give them a pointer and retain the data yourself.

That part isn’t even really true! You could literally encrypt the bytes and give them that, and then decrypt it on the far side of the JavaScript boundary. It’s all about how paranoid your code needs to be.

It actually makes perfect sense to me and @chung-leong did a good job of explaining why. A JavaScript runtime is an abstraction boundary, and this lets him marshal Zig-specific data across the abstraction boundary without having to maintain it in the Zig heap. That would mean having to keep track of it, and potentially garbage-collect it, because the data is heading into a dynamic language, so as soon as the last reference to that data goes out of scope, it’s gone.

Philosophically, I would not personally erase the identity of a handle like this, but then again, the reasons given are good: the methods pertaining to the struct require significant comptime configuration to be used correctly, so that’s a source of error and frustration in user programs which might try to use those methods. So maybe I would, but primarily, it’s not my library, so those aren’t my decisions to make.

The ABI stability question is extremely relevant for Zig because it’s a mechanism that is both used in the wild (meaning that Zig will need to interoperate with such systems) and also desirable in specific use cases by pure Zig projects as well. Not having to recompile plugins when the host application changes is a truly desirable feature in some situations.

I would be extremely skeptical of a library that tries to use opaque types to enforce a public interface. Semver and properly documenting what is considered public interface is how you prevent consumers from creating dependencies on implementation details (or at least making them aware of the fact that you don’t promise to not break their code if they do).

Since that’s your assertion, please give me one example of an existing Zig package, one that you can fetch through the Zig package manager, which behaves in the manner you described, namely that a change in version would not trigger recompilation.

It seems that I’m talking about a mechanism that you’re not aware of, so let me provide you with some links:

General concept:
https://hackaday.com/2018/07/12/its-all-in-the-libs-building-a-plugin-system-using-dynamic-loading/

Some examples:
https://www.sqlite.org/loadext.html
https://redis.io/docs/latest/develop/reference/modules/
https://big-elephants.com/2015-10/writing-postgres-extensions-part-i/
https://nginx.org/en/docs/dev/development_guide.html#Modules
https://tree-sitter.github.io/tree-sitter/creating-parsers

And here’s one concrete Zig example, a Redis module I wrote in Zig a while ago:
https://github.com/kristoff-it/redis-cuckoofilter

As the docs put it:

[opaque] is typically used for type safety when interacting with C code that does not expose struct details.

In C, this is reasonably common, for a few reasons, already adequately covered.

If you want a Zig dynamic library for whatever reason, it’s going to be using the C ABI, so the opaque pointer approach is justifiable.
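
For context, a sketch of what that opaque-pointer approach tends to look like at a C ABI boundary. Everything here (Counter, CounterHandle, the counter_* functions, and the use of std.heap.page_allocator) is an illustrative assumption, not code from any project mentioned in this thread.

```zig
const std = @import("std");

// Private to the library; free to change shape between versions.
const Counter = struct { value: u32 };

// What callers see: a distinct type with no known size and no fields.
pub const CounterHandle = opaque {};

export fn counter_create() ?*CounterHandle {
    // Allocation is unavoidable here because the caller cannot size the object.
    const counter = std.heap.page_allocator.create(Counter) catch return null;
    counter.* = .{ .value = 0 };
    return @ptrCast(counter);
}

export fn counter_increment(handle: *CounterHandle) u32 {
    const counter: *Counter = @ptrCast(@alignCast(handle));
    counter.value += 1;
    return counter.value;
}

export fn counter_destroy(handle: *CounterHandle) void {
    const counter: *Counter = @ptrCast(@alignCast(handle));
    std.heap.page_allocator.destroy(counter);
}
```

Because callers only ever hold a pointer, Counter can gain or lose fields without the callers being recompiled, which is the ABI-stability argument made earlier in the thread.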

If someone wanted to make module code which blobs up a struct, there’s nothing stopping that. You could really drive the point in by casting the byte array with @bitCast into a big ol’ unsigned integer, if you wanted to.
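
A hedged sketch of that integer variant, assuming std.meta.Int is available in the Zig version at hand. Handle, HandleInt, toInt and fromInt are hypothetical names used only for this illustration.

```zig
const std = @import("std");

// Hypothetical handle type to be hidden behind an integer.
const Handle = struct { ptr: usize };

// An unsigned integer exactly as wide as Handle's in-memory size.
const HandleInt = std.meta.Int(.unsigned, @sizeOf(Handle) * 8);

fn toInt(h: Handle) HandleInt {
    // Struct -> byte array -> integer of the same bit width.
    return @bitCast(std.mem.toBytes(h));
}

fn fromInt(i: HandleInt) Handle {
    // Integer -> byte array -> struct, reversing toInt.
    const bytes: [@sizeOf(Handle)]u8 = @bitCast(i);
    return std.mem.bytesToValue(Handle, &bytes);
}

test "struct survives a trip through an integer" {
    const h: Handle = .{ .ptr = 0xdeadbeef };
    try std.testing.expectEqual(h, fromInt(toInt(h)));
}
```

The integer's numeric value is not meaningful on its own; only the round trip is relied on here.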

I don’t think that would be a popular choice, however. Feel free to experiment, and if it catches on, maybe it’s worth adding more language support for doing that.

But you’ve already identified a situation in your own code where it does make sense: a host/guest program, where you want to be able to pass Zig assets into a hosted runtime, and have guest code reach back into Zig with that data. I could see this being useful in Wasm, or JavaScript like you’re doing now. Lua, maybe. Any situation where the nature of the data is such that it has to be handed back to the Zig host for anything useful to happen to it.

What I’m not seeing is what promoting a byte blob to a special sort of opaque type is adding to the picture. It’s been proposed already, and it didn’t catch on.

No proposal based on access control has been accepted, and it will probably stay that way. I consider that a good thing, because Zig is a low-level language: structs are literal regions of memory carrying data. It’s possible to obscure what that data means, but not what it is, so sufficiently determined user code can always figure the rest of it out. All that private fields or sized opaque types (which are just structs where all the fields are private) can do is make life frustrating and annoying for whoever has to make some internal use of an asset for which some of the type metadata has been excluded.

It’s not even difficult to do this ‘by hand’; it is in fact very easy. There’s no advantage in making it easier than it already is.