Possible language proposal - Symbols

In a PR comment, it was mentioned that jsonParse (and friends) have some design issues that are difficult to solve. Namely, the presence or absence of an arbitrary method silently changes the behavior of an API.

For instance, if I just happened to add a jsonParse(...) decl to a struct, std.json would try to use it whether I intended it to or not. Additionally, if I accidentally typo’d it as jsonPears(...), it would silently fail: the code would still compile, but std.json wouldn’t use my jsonPears(...) function.
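To make both failure modes concrete, here is a minimal TypeScript sketch of name-based hook lookup. It is only an analogy for what std.json does with decl names; parseWithHook, Typo, and Accidental are hypothetical names invented for this illustration.

```typescript
// Hypothetical stand-in for a library that finds its hook by plain name.
type Hook = (raw: string) => unknown;

function parseWithHook(type: Record<string, unknown>, raw: string): unknown {
  const hook = type["jsonParse"];
  if (typeof hook === "function") {
    return (hook as Hook)(raw); // a member with the magic name wins
  }
  return JSON.parse(raw); // otherwise fall back to the default behavior
}

// Intended as a hook but misspelled: silently ignored, default parsing runs.
const Typo = { jsonPears: (_raw: string) => ({ foo: 5 }) };

// Unrelated helper that happens to share the name: silently picked up.
const Accidental = { jsonParse: (raw: string) => raw.length };

const typoResult = parseWithHook(Typo, '{"foo": 1}');
const accidentalResult = parseWithHook(Accidental, '{"foo": 1}');
```

Neither case produces an error: the typo falls through to the default parser, and the unrelated helper replaces it.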

I was thinking about how other languages solve this problem. Looking at high-level languages like Java and JavaScript, both have great ways of tackling it. Java allows classes to implement interfaces (thereby tightly coupling the class to the gson package, for example). JavaScript uses a Symbol datatype: a symbol can only be used through the variable on which it’s defined (or a copy of it). For example, to make a value iterable, you’d add obj[Symbol.iterator] = myIterFunction to the object. There is no way to gain access to Symbol.iterator other than through Symbol.iterator itself, or some copy of it.
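For readers less familiar with the JavaScript side, a small TypeScript sketch of the iterable protocol follows. The countdown and lookalike objects are made up for illustration; Symbol.iterator itself is the standard well-known symbol.

```typescript
// An object becomes iterable only by using the Symbol.iterator key itself.
const countdown = {
  from: 3,
  *[Symbol.iterator](): Generator<number> {
    for (let n = this.from; n > 0; n--) {
      yield n;
    }
  },
};

// Spread syntax looks up the hook through the well-known symbol.
const viaSymbol = [...countdown];

// A string key that merely looks similar is just an ordinary property.
const lookalike: Record<string, unknown> = {
  "Symbol.iterator": function* () {
    yield 1;
  },
};
const isIterable = Symbol.iterator in lookalike;
```

The string-keyed lookalike is never consulted by the iteration protocol, which is the property the proposal wants for std.json hooks.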

I actually really like JavaScript’s solution. We can mix the solutions that both Java and JavaScript use, allowing us to tightly couple a jsonParse function to the std.json package, without complex language features like nominally-typed interfaces. The way I’d do this in Zig (as mentioned in this comment) is something like the following:

// std/json.zig

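// Proposed builtin: each call to @symbol() yields a new, unique symbol.
pub const parseFn = @symbol();

pub fn innerParse(...) {
    // ...
    // hasSymbolDecl is hypothetical: true if T has a decl named by this symbol.
    if (std.meta.hasSymbolDecl(T, parseFn)) {
        const parseDecl = @field(T, parseFn);
        return try parseDecl(...);
    }
    // ...
}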


// main.zig
pub const Foo = struct {
    foo: i32,

    pub fn [std.json.parseFn](...) !Foo {
        return .{ .foo = 5 };
    }
};

This would force the Foo struct to be tightly coupled to std.json, which has a few great benefits. There is no more duck-typing on std.json’s side: std.json can be confident that Foo is meant to be parsed. Readers of Foo also know that the function isn’t just some random jsonParse; it’s a special function which is used in some way by std.json. Finally, since there’s an explicit identifier being exported, there’s a great spot for parseFn documentation to go, rather than people having to somehow guess that the documentation for jsonParse is located at std.json.parse.
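Since the Zig syntax above is only proposed, here is a rough TypeScript analog of the same shape, using a JavaScript symbol as the hook key. This is a sketch under assumptions, not how std.json works: parseFn, hasParseFn, innerParse, Foo, and Bar are all hypothetical names mirroring the Zig sketch.

```typescript
// Library side (hypothetical): the symbol is the only handle to the hook.
const parseFn: unique symbol = Symbol("parseFn");

interface Parseable<T> {
  [parseFn](raw: string): T;
}

// Only checks that the key is present, which is enough for this sketch.
function hasParseFn(type: object): type is Parseable<unknown> {
  return parseFn in type;
}

function innerParse(type: object, raw: string): unknown {
  if (hasParseFn(type)) {
    return type[parseFn](raw);
  }
  return JSON.parse(raw); // default behavior
}

// User side: opting in requires naming the library's symbol explicitly.
const Foo = {
  [parseFn](_raw: string) {
    return { foo: 5 };
  },
};

// A string-named member called "parseFn" is not mistaken for the hook.
const Bar = {
  parseFn(_raw: string) {
    return { foo: 5 };
  },
};
```

Here innerParse(Foo, ...) goes through the hook, while innerParse(Bar, ...) falls back to the default. A misspelled symbol name such as parsFn would be rejected as an unknown identifier instead of being silently ignored.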

And while it is a language feature, it’s not an overly difficult one to understand, given some basic rules:

  • No two instantiations of @symbol() are == to each other (i.e., assert(@symbol() != @symbol());).
  • A copy of a symbol is == to its original (i.e., const a = @symbol(); const b = a; assert(a == b);).
  • Symbols must be able to be used as decls (likely the most complex part of the feature).
    • Syntax can be anything, but here’s just some brainstorming syntax for now.
    • Declaring a function decl: pub fn [someSymbol](...) ReturnType {}
    • Declaring a const decl: pub const [someSymbol] = value;
    • Getting the value of a decl: I think reusing @field(T, someSymbol) is fine for this
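JavaScript symbols already follow the same three rules, so they can serve as a quick sanity check of the semantics. The sketch below is TypeScript, not the proposed Zig feature; Widget and someSymbol are hypothetical names.

```typescript
// Annotated as plain `symbol` so the compiler lets us compare them freely.
const first: symbol = Symbol("x");
const second: symbol = Symbol("x"); // same description, still a new symbol
const alias: symbol = first;

const freshAreDistinct = first !== second; // rule 1: no two creations are equal
const copyIsSame = alias === first; // rule 2: a copy equals its original

// Rule 3: a symbol used as a member name, then looked up again by value.
const someSymbol: unique symbol = Symbol("someSymbol");

class Widget {
  static [someSymbol](): string {
    return "called via symbol";
  }
}

const viaLookup = Widget[someSymbol]();
```

The lookup Widget[someSymbol] plays the role that @field(T, someSymbol) would play in the proposal.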

Feedback is welcome! This proposal was originally an offshoot of a stdlib proposal I was making, so it’s still pretty rough and I’m sure there are some drawbacks (other than “language change”) that I’m not thinking of.


I don’t think this is actually needed anymore, especially if parse contexts are substituted for parse/parseFromValue/stringify. It would still be cool, though!


A similar feature is called a ‘name’ in the book Concepts, Techniques, and Models of Computer Programming (pdf page 247 / book page 204). The book uses names in a similar way, but its goal is to implement secure abstract data types (for example, a stack whose implementation you can’t mess with). I think something like this could be useful for implementing code that has security at a more fine-grained level than wasm sandboxes or OS processes.

That part also talks about an additional concept that allows for read-only views of data. I think having those two things would be pretty great for API developers, because together they would make it possible to create APIs which can’t be broken by accessing a field of the implementation that wasn’t meant to be accessible to, or used by, the user.

I know Zig currently isn’t big on disallowing access but I think it could make sense for some things and it seems like a nice way to get these features without huge complexity.

Having read-only views would be a pretty great feature on its own (allowing introspection without breaking things accidentally, or having clear read-only access).

And also having names, which can be used as secure keys to gain access to something, would allow people to write whole libraries for code that can securely maintain certain invariants, without having to manage all security solely through careful application design and code review (and always having to review the entire code base, over and over again, or build complex tooling and verification).

An additional benefit of using those features is that you could avoid leaky abstractions, where somebody starts using implementation details, which leads to the thing no longer being replaceable with a different implementation.

I don’t think all code would use those features, because sometimes it is better to face all the complexity and not hide any implementation details.

However, I think it would be a valuable addition to the kinds of code you can write with Zig, because it would allow Zig users to basically construct a safe-mode library from within Zig. It could even be useful in implementing the build system and the compiler communication protocol.


The book uses a ‘chunk’ in combination with names to implement this. A chunk only allows access via a key, so if it is used with a name, you can only access the data if you have access to the name. You also wouldn’t be able to tell what keys the chunk has, or even how many.
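As a rough illustration of the idea (not the book’s actual construction), a chunk can be approximated in TypeScript with a closure over a map keyed by symbols. makeChunk and secretKey are hypothetical names. A closure is used rather than a plain object because JavaScript can enumerate an object’s symbol keys via Object.getOwnPropertySymbols, which would defeat the point.

```typescript
// Hypothetical helper: returns a lookup function and nothing else, so callers
// cannot list the keys or count the entries.
function makeChunk<V>(entries: Array<[symbol, V]>): (key: symbol) => V | undefined {
  const store = new Map<symbol, V>(entries);
  return (key) => store.get(key);
}

const secretKey: symbol = Symbol("stack-internals");
const chunk = makeChunk<number[]>([[secretKey, [1, 2, 3]]]);

// Whoever holds the name can read the data.
const withKey = chunk(secretKey);

// A new symbol with the same description is a different name, so it fails.
const withGuess = chunk(Symbol("stack-internals"));
```

Note that this sketch hands out the stored array itself, so it does not cover the read-only view half of the idea.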
