ZON field value extractor (for lack of a better name) - zon_get_fields

A while ago I was looking for a way to use ZON as a configuration file format, and found none. The parsing itself was there (std.zig.Ast.parse()), but getting the values from the Ast required a considerable amount of work.
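For anyone who hasn't used the parser directly, here is a minimal sketch of the parsing step using only the standard library. The helper name countNodes and the error name ParseFailed are made up for illustration; only std.zig.Ast.parse, its .zon mode and ast.deinit come from std.

```zig
const std = @import("std");

/// Hypothetical helper: parses ZON text and returns the number of AST nodes.
/// It only shows that parsing is one call; extracting values is the hard part.
fn countNodes(gpa: std.mem.Allocator, source: [:0]const u8) !usize {
    // The source must be null-terminated; `.zon` selects ZON mode.
    var ast = try std.zig.Ast.parse(gpa, source, .zon);
    defer ast.deinit(gpa);

    // Syntax errors are collected in `ast.errors` rather than returned.
    if (ast.errors.len != 0) return error.ParseFailed;

    return ast.nodes.len;
}
```

Everything after this point (finding a field by name, converting its tokens to a value) is what the library is meant to take care of.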

Well, since I did that work, why not share it with the world? So, here it is -
https://github.com/Durobot/zon_get_fields.

The thing is not optimal, since each call of a getFieldVal*() function has to split the provided path and walk the AST, but at least it works.

Also, there’s this strange thing with negative field values represented by two tokens in AST, the first being “-”, and the next containing the digits. I hope the way I’m reading them (getting “-”, and then getting the next token, here) is correct in the long run, but honestly, I don’t know, and it doesn’t sit well with me.
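To illustrate the two-token behavior, here is a small sketch that uses std.zig.Tokenizer on its own. It is not the code from the library; parseSignedDecimal and error.NotANumber are hypothetical names, and it only handles decimal integers.

```zig
const std = @import("std");

/// Hypothetical helper: reads an optionally negative decimal integer
/// from the start of `source` by looking at one or two tokens.
fn parseSignedDecimal(source: [:0]const u8) !i64 {
    var tokenizer = std.zig.Tokenizer.init(source);
    const first = tokenizer.next();

    // Positive case: a single number_literal token.
    if (first.tag == .number_literal) {
        return std.fmt.parseInt(i64, source[first.loc.start..first.loc.end], 10);
    }

    // Negative case: a `.minus` token followed by a number_literal token.
    if (first.tag != .minus) return error.NotANumber;
    const second = tokenizer.next();
    if (second.tag != .number_literal) return error.NotANumber;

    // Parse only the digits, then negate. This cannot represent the
    // minimum i64 value, which is fine for a sketch.
    const digits = source[second.loc.start..second.loc.end];
    const value = try std.fmt.parseInt(i64, digits, 10);
    return -value;
}
```

The tokenizer never produces a signed number literal, so any reader of negative values has to combine the two tokens in some way.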

Update 2023-12-31:

  1. Switched from several getFieldVal* functions to one public function - getFieldVal;
  2. Added support for proper parsing of character literals (although not Unicode literals at this time);
  3. Updated example code in README.

Happy New Year everyone!

Update 2024-01-25: Added a new function (template), which fills your arbitrary struct with values from the provided AST - pub fn zonToStruct.

Update 2024-01-28: Unicode character literals are now supported.

Update 2024-02-22: pub fn zonToStruct now returns an auto-generated struct with fields that indicate whether their corresponding fields in the target struct were filled or not. Previously there was no way to tell.

Update 2024-03-05: Now with Zig package manager support!

Update 2024-06-20: dynamically allocated slices in the report struct (returned by pub fn zonToStruct) for target struct fields that are slices of non-primitive type elements.

This is a big update, kinda, so I thought I’d give the thread a bump. Hope it’s not inappropriate.

Anyway, I’ve added a completely new way of getting values from the ZON text - you can now throw (a pointer to) your struct and an AST at this new function - pub fn zonToStruct, and thanks to Zig comptime magic (mutually recursive comptime calls… my head nearly exploded), it comes out all filled with values!

Well, at least if the AST contains the values for your struct fields. When there’s no match, the field is not modified, and I couldn’t figure out a way to report this back to the caller, which is a bummer :frowning_face:

You could have an optional comptime out parameter in the zonToStruct function that fills in the missing field information. At some point I think this PR might get merged into std as well: "WIP: zon" by MasonRemaley (ziglang/zig pull request #17731).

You mean, like an optional second struct, with the same field names, but boolean (true - target struct field modified, false - not modified)?

I guess that is one way of doing it, but I think it would be a pain to implement, and most importantly, to use.

I’m considering providing field mandatory / optional information to zonToStruct, in some form, and making it return an error if a mandatory field was not found. In essence, a schema (see JSON schema, XML schema, etc.), in a simplified form.

I was also thinking about returning the number of fields that were not modified as a simpler (temporary?) solution, maybe.

I was thinking more of an out parameter like missing_fields that contains an array of enum literals, which are the names of the missing fields.

You may be interested in seeing how getty does in-band and out-band metadata for your second idea: Customization - Getty

So, once again, a major update - pub fn zonToStruct now returns an auto-generated struct with fields that indicate whether their corresponding fields in the target struct were filled or not. Previously there was no way to tell if the field was assigned to or not.

At first I was thinking about a lightweight (no conditionals, etc.) schema, but then I thought it’s too much hassle for the user, and for me, and that they can simply check the result and decide whether it’s OK or not.

Fields of this “report” struct are enums of ZonFieldResult type - unless they are sub-structs, arrays, arrays of sub-structs, arrays of arrays, that ultimately end up being composed of the same ZonFieldResult enums.

They can indicate either that the field was not_filled or filled, and in the case of arrays, whether there were not enough values in ZON / AST (partially_filled) or too many values (overflow_filled).
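As a rough sketch of how a caller might inspect such a report: the enum below is a local stand-in that only reuses the four value names mentioned above, and the Report struct is written by hand for illustration (the library generates the real one). The helper allFilled is hypothetical and only looks at top-level fields.

```zig
const std = @import("std");

// Local stand-in, NOT the library's definition; only the names come from the post.
const ZonFieldResult = enum { not_filled, filled, partially_filled, overflow_filled };

// Hand-written example of what a report for a two-field target struct could look like.
const Report = struct {
    port: ZonFieldResult,
    name: ZonFieldResult,
};

/// Hypothetical helper: true if every top-level ZonFieldResult field is `.filled`.
/// Nested structs, arrays and slices in the report are ignored in this sketch.
fn allFilled(report: anytype) bool {
    inline for (std.meta.fields(@TypeOf(report))) |f| {
        if (f.type == ZonFieldResult) {
            if (@field(report, f.name) != .filled) return false;
        }
    }
    return true;
}
```

A caller that treats some fields as optional would compare individual fields instead, for example checking only report.port.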

Slices are either not_filled or filled, simple as that: since fn zonToStruct has to allocate them dynamically anyway, there’s no such thing as not enough values or too many. A slice is represented by one ZonFieldResult field in the report struct.
Sadly, this means that if it’s a slice of structs, for example, you don’t get information on the individual fields of these structs. Oh well, maybe in the future I’ll come up with something to remedy this.

In the meantime, I think I’ll take a break from this little project.

just tried zgf using version 0.12.0-dev.3489+c808e546a, and received a slew of errors:

C:\Users\biosb\AppData\Local\zig\p\122077d7101a413d4ea1f02b1155469217d8783feb1ad5df90427aa567bf14163e94\src\zon_get_fields.zig:712:33: error: no field named 'Auto' in enum 'builtin.Type.ContainerLayout'
                        .layout = .Auto,
                        ~~~~~~~~^~~~~~~
C:\tools\zig-dev\lib\std\builtin.zig:336:33: note: enum declared here
    pub const ContainerLayout = enum(u2) {
                                ^~~~
C:\Users\biosb\AppData\Local\zig\p\122077d7101a413d4ea1f02b1155469217d8783feb1ad5df90427aa567bf14163e94\src\zon_get_fields.zig:639:26: note: called from here
    !MakeReportStructType(if (@typeInfo(@TypeOf(tgt_struct_ptr)) != .Pointer or
     ~~~~~~~~~~~~~~~~~~~~^

This is because Zig has started lowercasing enum field names in std.builtin (.Auto is now .auto), but the library still uses the old capitalized name, as your error shows. I’ll probably open a pull request to fix that.

i was able to make the trivial change in the source, but then i received a slew of “runtime” debug messages that i wasn’t sure what to do with…

i’m currently working with some {json,yaml,properties} parsers, and would really like to move towards zon…

thanks for the effort… what you have here more than meets my needs… :smiley:

i noticed that your example has fixed-sized arrays…

is there a way to essentially have arrays that are sized by the { ... } elements in the .zon file itself…

for example, the .dependencies field in my build.zig.zon can (seemingly) be of arbitrary length…

in my case, i have an array of strings []const []const u8 and would prefer not to set an upper bound on the length…

can this work???

i think i can answer my own question, having looked at various parsers for json, yaml, etc written in zig…

correct me if i’m wrong, but std.json would have a similar limitation…

specifically, if i want to populate some (struct) object with the results of parsing, then any arrays would need to be known size… whether i can parse into a tagged union might also be a stretch…

but all of these parsers (like zon_get_fields) do have lower-level functions for essentially inspecting the parsed object… by dynamically reflecting on the field-names present as well as the length of any arrays, the client could either manually populate some data structure and/or simply process these members on the fly…

You could try (but I don’t know whether that is implemented) parsing with passing an allocator. Runtime-known things like []const []const u8 need allocators for expansion.

Hi, sorry for the late reply.
Things were happening in my life, and I got seriously distracted from Zig…

I’m not sure I understand correctly what your needs are, but if you’re talking about arrays in your target struct (the struct that you pass to pub fn zonToStruct() to have it populated), you can declare this field in the target struct as a slice, and the buffer will be allocated dynamically.

Of course, you have to provide an allocator and deallocate the buffer when you’re done using it.

See test "zonToStruct big test" in zon_get_fields.zig (which is a much more comprehensive example than what you can find in the README file), more specifically fields slice_u16, slice_nested_2, and s_slice_of_u8. Notice that I had to provide an allocator ( std.testing.allocator, since this is test code) to zonToStruct():

const res = try zonToStruct(&tgt_struct, ast, std.testing.allocator);

zonToStruct() allocates buffers of necessary capacity for these fields, so I then had to deallocate them (the caller owns this memory and must take care of it):

    defer std.testing.allocator.free(tgt_struct.nested_1.s_slice_of_u8); // Must free `s_slice_of_u8`
    defer std.testing.allocator.free(tgt_struct.slice_u16);              // Must free `slice_u16`
    defer std.testing.allocator.free(tgt_struct.slice_nested_2);         // Must free `slice_nested_2`

Also note that fields of the target struct that are slices of const u8 (strings) are a special case. For them, zonToStruct() performs no allocation; your slice will point to a location inside the const u8 buffer of the provided AST (ast.source).

In this case you don’t need to provide an allocator (pass null) to fn zonToStruct(), unless you have other slice fields in your target struct, of course.

You also must not try deallocating such slices, obviously, since they point to a location in a buffer which belongs to your AST struct.
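To make the borrowing concrete, here is a standalone sketch that does not use the library at all. It pulls the first string literal out of ZON text with std.zig.Tokenizer and returns a slice that points into the source buffer. The function name firstStringLiteral is hypothetical, and escape sequences are left unprocessed.

```zig
const std = @import("std");

/// Hypothetical helper: returns the contents of the first string literal
/// in `source`, without the surrounding quotes, or null if there is none.
/// The result is a view into `source`; nothing is allocated or copied.
fn firstStringLiteral(source: [:0]const u8) ?[]const u8 {
    var tokenizer = std.zig.Tokenizer.init(source);
    while (true) {
        const tok = tokenizer.next();
        switch (tok.tag) {
            .eof => return null,
            // Strip the opening and closing quote characters.
            .string_literal => return source[tok.loc.start + 1 .. tok.loc.end - 1],
            else => {},
        }
    }
}
```

Because the returned slice is only a window into the source text, it must not be freed, and it stops being valid once the buffer it points into is gone.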

Hope this helps.

I think this update is quite significant and deserves a short description -

slices whose elements are not primitives (integers, floats, booleans), that is, slices of structs, arrays, or slices, are now represented differently in the report struct returned by pub fn zonToStruct.

They are now reflected with slices in the report struct, which are allocated dynamically, and the caller is responsible for deallocation.

The types of elements of these report slices depend on the type of the elements of the target struct slice counterparts.

If your target struct contains a slice of structs, expect a slice of matching structs in the report struct.

If your target struct contains a slice of arrays, you’re getting a slice of arrays in the report struct. The array element type depends on the target array element type - if it’s a primitive type, you’ll get ZonFieldResult elements. If it’s something more complex, then you’re getting a mirrored image of that.

If your target struct contains a slice of slices, you’re getting a slice of… something in the report struct. The type of that something once again depends on the type of elements of these nested slices in the target struct - if these are made of primitive elements, then “something” is the ZonFieldResult enum. If it’s more complex… you get the idea.

See the examples in README.md and in the examples folder (they are the same). Also check out the unit tests - see test "zonToStruct: Slice of structs", test "zonToStruct: Slice of arrays" and test "zonToStruct: Slice of slices".