How to parse/stringify a JSON value with custom tags, like in Go

For example, given a struct in Go:

type Cat struct {
  Color   string  `json:"c"`
  Weight  float64 `json:"w"`
  CatType string  `json:"ct"`
}

const std = @import("std");

const Cat = struct {
    color: []const u8,
    weight: f64,
    cat_type: []const u8,

    // std.json calls this instead of its default struct parsing.
    pub fn jsonParse(arena: std.mem.Allocator, source: anytype, options: std.json.ParseOptions) !Cat {
        // Helper struct whose field names match the JSON keys.
        const Inner = struct { c: []const u8, w: f64, ct: []const u8 };
        const inner = try std.json.innerParse(Inner, arena, source, options);
        return .{
            .color = inner.c,
            .weight = inner.w,
            .cat_type = inner.ct,
        };
    }
};

If a type has a jsonParse function, it will be used to parse the type instead of the default behaviour.

Since it’s just a difference in field names, we can just call the default parsing logic with a type with the correct field names.
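
The snippet above only covers parsing. For the stringify half of the question, here is a minimal sketch of the same trick in both directions. It assumes the std.json API as found in roughly Zig 0.13/0.14 (parseFromSlice, innerParse, stringifyAlloc); those names may differ in other releases. The Wire helper struct and the sample values are made up for illustration.

```zig
const std = @import("std");

const Cat = struct {
    color: []const u8,
    weight: f64,
    cat_type: []const u8,

    // Illustrative helper: same data, but with the short key names used in the JSON.
    const Wire = struct { c: []const u8, w: f64, ct: []const u8 };

    // Used by std.json instead of the default struct parsing.
    pub fn jsonParse(arena: std.mem.Allocator, source: anytype, options: std.json.ParseOptions) !Cat {
        const wire = try std.json.innerParse(Wire, arena, source, options);
        return .{ .color = wire.c, .weight = wire.w, .cat_type = wire.ct };
    }

    // Used by std.json instead of the default struct output.
    pub fn jsonStringify(self: Cat, jws: anytype) !void {
        try jws.write(Wire{ .c = self.color, .w = self.weight, .ct = self.cat_type });
    }
};

test "parse and stringify with short keys" {
    const allocator = std.testing.allocator;
    const text =
        \\{"c": "red", "w": 3.5, "ct": "tabby"}
    ;

    const parsed = try std.json.parseFromSlice(Cat, allocator, text, .{});
    defer parsed.deinit();
    try std.testing.expectEqualStrings("red", parsed.value.color);
    try std.testing.expectEqualStrings("tabby", parsed.value.cat_type);

    const out = try std.json.stringifyAlloc(allocator, parsed.value, .{});
    defer allocator.free(out);
    // The output uses the short keys, not the Zig field names.
    try std.testing.expect(std.mem.indexOf(u8, out, "\"ct\":\"tabby\"") != null);
    try std.testing.expect(std.mem.indexOf(u8, out, "cat_type") == null);
}
```

Sharing one Wire struct between the two functions keeps the key names in a single place, which is about as close as this approach gets to a tag.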

You probably want to turn color and cat_type into more useful types, like a CatType enum and an RGB struct.

Thanks.

However, this approach is much more cumbersome than in Go or Rust.

Yes, but I will point out this has nothing to do with the language.

The json api just doesn’t support anything in-between a custom parse function and the very basic parse options.

With Zig’s comptime reflection, you don’t need some other language mechanism like Go’s tags or Rust’s attribute macros. You can just do this (this is a hypothetical):

const Cat = struct {
    //...
    pub const json = .{
        .color = .{ .rename = "c" },
        .weight = .{ .rename = "w" },
        .cat_type = .{ .rename = "ct" },
    };
};
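
To make the hypothetical a bit more concrete, here is a rough sketch of how a serializer could read such a declaration with comptime reflection. std.json does not look at a json declaration; wireName and writeRenamed are names invented for this sketch, and the write-stream calls assume the std.json API of roughly Zig 0.13/0.14.

```zig
const std = @import("std");

const Cat = struct {
    color: []const u8,
    weight: f64,
    cat_type: []const u8,

    // Hypothetical convention from the post above; std.json itself ignores it.
    pub const json = .{
        .color = .{ .rename = "c" },
        .weight = .{ .rename = "w" },
        .cat_type = .{ .rename = "ct" },
    };

    // Opt in by forwarding to the generic helper below.
    pub fn jsonStringify(self: Cat, jws: anytype) !void {
        try writeRenamed(self, jws);
    }
};

/// Returns the JSON key for `field`: the `.rename` entry if T declares one,
/// otherwise the Zig field name unchanged.
fn wireName(comptime T: type, comptime field: []const u8) []const u8 {
    if (@hasDecl(T, "json")) {
        if (@hasField(@TypeOf(T.json), field)) {
            const entry = @field(T.json, field);
            if (@hasField(@TypeOf(entry), "rename")) return entry.rename;
        }
    }
    return field;
}

/// Writes any struct as a JSON object, using wireName for the keys.
fn writeRenamed(value: anytype, jws: anytype) !void {
    const T = @TypeOf(value);
    try jws.beginObject();
    inline for (std.meta.fields(T)) |f| {
        try jws.objectField(comptime wireName(T, f.name));
        try jws.write(@field(value, f.name));
    }
    try jws.endObject();
}

test "keys come from the json declaration" {
    try std.testing.expectEqualStrings("ct", wireName(Cat, "cat_type"));

    const cat = Cat{ .color = "red", .weight = 3.5, .cat_type = "tabby" };
    const out = try std.json.stringifyAlloc(std.testing.allocator, cat, .{});
    defer std.testing.allocator.free(out);
    try std.testing.expect(std.mem.indexOf(u8, out, "\"c\":\"red\"") != null);
}
```

A real implementation would also need the parsing direction and more options than rename; this only shows that the lookup itself needs no new language feature.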

There’s a problem with this approach.
JSON keys can contain hyphens (-), but Zig variable names cannot.

For example, given a JSON string like:

{"c": "red", "c-t": "some type"}

how should this be handled?

You can have arbitrary zig identifiers with the @"name" syntax.
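
A small sketch of what that looks like for the JSON in the question, with no custom parse function at all. It assumes std.json.parseFromSlice as found in recent Zig releases; the struct name HyphenCat is made up for illustration. In the hypothetical rename scheme the key would be an ordinary string such as "c-t", so hyphens would not be a problem there either.

```zig
const std = @import("std");

const HyphenCat = struct {
    c: []const u8,
    // Quoted identifier: the field name is exactly the JSON key, hyphen included.
    @"c-t": []const u8,
};

test "field names with hyphens" {
    const text =
        \\{"c": "red", "c-t": "some type"}
    ;
    const parsed = try std.json.parseFromSlice(HyphenCat, std.testing.allocator, text, .{});
    defer parsed.deinit();

    try std.testing.expectEqualStrings("red", parsed.value.c);
    // Access uses the same quoted syntax.
    try std.testing.expectEqualStrings("some type", parsed.value.@"c-t");
}
```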

Unfortunately, my hypothetical is still hypothetical.

Thanks for this interesting use of comptime magic. That would indeed be an elegant solution. It would probably be even better to call it serialize rather than json. That way it would be a contract that can be used for json, yaml and whatever else comes along.

But if you use it for more than one format, you lose the ability to arbitrarily match whatever that output format needs, and if two formats need a different name you would need awkward workarounds.

I guess you could create a description that combines both ideas and allows for both:

const Cat = struct {
    //...
    pub const serialize = .{
        .all = .{
            .color = .{ .rename = "c" },
            .weight = .{ .rename = "w" },
        },
        .format = .{
            .json = .{ .cat_type = .{ .rename = "ct" } },
            .something_else = .{ .cat_type = .{ .rename = "CatType" } },
        },
    };
};

But you could play that game all day, trading more general descriptions vs simpler ones that are more straightforward to understand/implement.

I think the potentially more important part would be to also provide a way to associate these kinds of “tags” with the type via external modification, for example the serialize api should have something like
api.registerTypes(&.{ .{type, serialize-description}, ... })

So that a user could provide it for types they can’t directly modify.
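
As a rough illustration of the external-registration idea, the lookup could work on a comptime list of type/description pairs. Everything here is hypothetical: no such api exists in std, and registry and wireName are names invented for this sketch.

```zig
const std = @import("std");

// Pretend this type lives in a library we cannot edit.
const Cat = struct { color: []const u8, weight: f64, cat_type: []const u8 };

// Hypothetical registry: a tuple of .{ Type, description } pairs supplied by the user.
const registry = .{
    .{ Cat, .{ .color = .{ .rename = "c" }, .cat_type = .{ .rename = "ct" } } },
};

/// Looks up the key for `field` of `T` in `reg`; unregistered types and
/// fields keep their Zig name.
fn wireName(comptime reg: anytype, comptime T: type, comptime field: []const u8) []const u8 {
    inline for (reg) |entry| {
        if (entry[0] == T) {
            const desc = entry[1];
            if (@hasField(@TypeOf(desc), field)) return @field(desc, field).rename;
        }
    }
    return field;
}

test "descriptions attached from outside the type" {
    try std.testing.expectEqualStrings("c", wireName(registry, Cat, "color"));
    try std.testing.expectEqualStrings("ct", wireName(registry, Cat, "cat_type"));
    // Not mentioned in the registry, so the field name is used as-is.
    try std.testing.expectEqualStrings("weight", wireName(registry, Cat, "weight"));
}
```

A serializer built this way would take the registry as a comptime parameter and fall back to a declaration on the type itself when the type is not listed.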

I have the same idea, it’s just more concise for the hypothetical.

My idea was separate decls for each, and an ‘other’, as well as the distinction between serialise, deserialise and both.

I also thought about generating a type for this, for completion and type checking.
But it would limit the ability to have implementation-specific options, or smarter use of parsing functions, e.g. .{ .cat_type = .{ .parseFn = catTypeParse } }, where catTypeParse is a fn([]const u8) CatType instead of a more general function. I think that would just overcomplicate implementations though.

Not to mention that different parsers/serialisers have different requirements for custom parsing functions; json, for example, has multiple source types.

All to say, I think implementations should just support something like this, without being too restrictive on what exactly it is.

I never really had competing contracts on how to serialize any particular type. I’ve had one api with rules and then the need to store locally in another format where I was the only consumer. You would have to completely decouple the data from the serialization to cover all bases anyway, as you might just as well have json-to-json conflicts, especially with versioning of your data format.

In those cases I suppose serialize could be the default with the possibility of using other structs of a similar style. That way they’re more like versions rather than being forced into a particular format.

The advantage of tags or these meta structs over a method is that they could handle SoA directly.

Yeah, this was what I meant with serialize being the default “serialize-description”, but with the ability to split on version or format. I don’t like putting the split logic into the same serialize struct, as it complicates the contract and forces every implementation to be more complex.

The custom parse methods are hard to get fully around, but I think we should make an effort to minimize their need. It’s hard to write a parse method that isn’t locked to a particular format, and it’s very hard to write a method that will work for both AoS and SoA…

Me neither, it was just an example of how you can come up with many different descriptions that have different tradeoffs.
