Why does @field work for declarations?

Oh, that’s a different story.

So @tagName does not automatically embed expensive string conversion in your program. That only happens if the strings survive until runtime. At comptime, it’s no problem.
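For example (a minimal sketch; Color is a made-up enum), a @tagName that is only consumed at comptime folds away entirely:

```zig
const std = @import("std");

const Color = enum { red, green, blue };

test "comptime @tagName leaves no runtime trace" {
    // The string exists only during compilation; nothing lands in the
    // binary because the string never escapes to runtime.
    comptime std.debug.assert(std.mem.eql(u8, @tagName(Color.green), "green"));
}
```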

2 Likes

Right. I didn’t suppose there’d be any “expense” at runtime, except maybe if I used StaticStringMap; I think what @hachanuy was saying is that the compilation itself would bloat the binary with the inline-for @tagName()s all over. So, bigger binary, maybe “who cares”, or maybe it’s enough to dislike. I haven’t tested. But I’d guess that all of the options on the table aren’t expensive in terms of runtime speed or memory usage (again, StaticStringMap would at least require more stack for the umpteen strings… unless that’s not exactly true since even the init()s of EnumMap and EnumSet create temporary strings for every enum via @tagName).

I’ve still to play with @tholmes’ e2_valid_for_e1(), but it has @tagName (twice) in it too, and he already identified a shortcoming of that approach… I can’t visualize the real gain his idea has over the EnumMap/EnumSet idea, so I’d have to just try, and profile, if I felt it was likely to offer a real gain.

I also haven’t yet tried @Sze’s comptime-function-to-generate-subset-enums with identical backing ints (that’s the key difference); I can see the advantage this offers for “converting” from one enum to another, though I actually don’t foresee needing/doing that - my vision involves a pretty straight path, where, e.g., database values (people-names, email addresses, the usual stuff) will result in a “build up” of the data structure for the purpose, and it’ll get serialized, sent, and realized. No real “manipulation” in the middle. So one advantage of @Sze’s clever scheme may not really benefit me this time. But I’ve still got to unpack it better, and try it, to see if I see other advantages over the simple EnumMap / EnumSet version.

Nuance may still invite comment, but what I’d be hopeful for is somebody saying, “No, here’s a major problem with your EnumMap / EnumSet idea: …”, if, indeed, I’ve missed something. And, though I think that implementation is superior to my OP poor-man scheme without any std structs, it’d be nice to get a thumbs-up from somebody in the know, too, that the EnumMap / EnumSet is indeed a right-way improvement. Thanks all.

@tholmes, I’d like to understand this more - where do you get this intel? By “turn the EnumMap access into a true comptime-known boolean”, I thought, at first, you meant, by “access”, the call to contains() (meaning, make it a comptime contains()), but I don’t immediately see how that works. I’ll trust, if that’s what you mean, but I’d like to know if I’m interpreting you incorrectly.

Regarding 2, since that was your “edit”, perhaps it’s intended to negate some of what you said above… but I’d like to know what you mean by “still perform a memory access” - it seems impossible that it’s doing any allocations, so I’m guessing you just mean a stack lookup that could be avoided if it weren’t for the optional unwrap. I feel like that’s probably a pretty marginal cost? But perhaps you’ve got reason to believe it could be more significant? Did you profile it against your table struct type?

Thanks for the help; I’d like to benefit from your discoveries rather than rehashing it all, if possible.

Hi,
I mainly got this info by using https://godbolt.org/, specifically this session.
You can see how I “comptime-ified” the map access: if you delete the comptime keyword, it reads from an “example.map” location and emits a test instruction, whereas if you keep it, it generates no code at all for the main function.

You can also see how the Table struct is declared and how e2_valid_for_e1() is also “true comptime”, but I highly recommend you don’t take too much inspiration from it, since your map declaration is much shorter and this session literally shows that their performance is identical.

1 Like

Thank you for the lead. I’m trying to get my head wrapped around it. You’re referencing my earlier post, it seems, prior to my EnumMap-EnumSet proposal, but you conclude with

which sounds attractive, if I’m hearing correctly. In other words(?), instead of a .contains() call in, e.g., an EnumSet (to check validity), an attempt to simply make an assignment (that is, in the HTML example there, an attempt to set the action attribute on, say, a div rather than a form) would be set up to simply Error-return right then and there. No .contains() call/check performed at all. Do I have this right? If so, I just have to tie the pieces together. I think you gave me enough hints to do that, but I’ll have to scratch my head a little more to land it.

I think you’re saying: don’t actually assign the subset enums to the Tags-enum value (in which case the enum values would be the same by definition); rather, in this comptime function, generate a brand-new enum, but make the names the same, and the enum values the same, explicitly. So, instead of that

const ValidAttributes = struct {
   const accept: []const Tags = &.{ .form, .input };
   // ...
};
I’d have

const ValidAttributes = struct {
   const accept = generate(&.{ .form, .input });
   // ...
};

fn generate(comptime tags: []const Tags) type {
   // gather field names via @tagName and values via @intFromEnum ...
   return @Type(.{ .@"enum" = .{ ... } }); // ?? something like this?
}

… ish? Perhaps I’m way off?

1 Like

Ok, wow, this is very enlightening, thank you. I had not yet ventured into godbolt. I see the huge difference between comptime map.get(… and map.get(… (without comptime), and the lack of any difference from removing the comptime in e2_valid_for_e1’s @hasField… the important takeaway is that I should comptime those map.get() calls, if I can, if I go that route. I haven’t yet worked out where all the get() checking would be - some might only be possible at runtime… so this is valuable insight indeed. Thank you.

This still baffles me, though (the runtime, or non-comptime-explicit, variant) – is this what you’d expect? Or is this a bug? And do you still think it’s rooted in the optional unwrap? Would you guess that if-capturing the optional, rather than .?-unwrapping it, would be any different? Well, I tried a little modification in godbolt (pardon the stupid naming):

    if (comptime map.get(.a)) |aa| {
        if (!comptime aa.contains(.y)) return error.Unexpected;
    }

and, indeed, it blows up if I remove the second comptime (inner if). I can’t remove the first comptime independently, of course, as that would make the inner scope not comptime.

And what a crazy blowup indeed! It feels “wrong”… but I don’t understand that giant splat of assembly.

And what a crazy blowup indeed! It feels “wrong”… but I don’t understand that giant splat of assembly.

Yeah, my bad. Most of that assembly is actually related to error handling.
You can see things more clearly if you change main() to no longer be able to return an error, and change the if statements to instead execute unreachable if they fail.
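Something along these lines (a sketch with stand-in enums; the map just mirrors your example):

```zig
const std = @import("std");

const E1 = enum { a, b };
const E2 = enum { x, y };

const map = std.enums.EnumMap(E1, std.enums.EnumSet(E2)).init(.{
    .a = .initMany(&.{.y}),
    .b = .initMany(&.{.x}),
});

// No error return type, and `unreachable` on failure: the emitted
// assembly then shows only the lookup itself, without the
// error-handling scaffolding.
pub fn main() void {
    if (map.get(.a)) |set| {
        if (!set.contains(.y)) unreachable;
    }
}
```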

1 Like

Godbolt or it didn’t happen. :slight_smile:

Most of the time, @field(thing, @tagName(tag)) will turn into a reference to something, either the value itself if it’s small enough to inline into an instruction, or a load of its location in static memory.

An inline for matching all of the enum’s tags to something should get optimized down to a jump table. Does it?

If you care, you check. If you don’t check, you don’t actually care.

Probably what you want is actually a switch on the enum, with an inline else. Should make minimal to no difference in terms of the compiler output, but it expresses the intention more clearly.
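A sketch of that shape (Thing and Tag are stand-ins):

```zig
const std = @import("std");

const Tag = enum { a, b, c };
const Thing = struct { a: u8 = 1, b: u8 = 2, c: u8 = 3 };

// Equivalent to an inline for over all tags, but the intent (exactly
// one arm runs) is explicit. In each arm `t` is comptime-known, so
// @field works.
fn get(thing: Thing, tag: Tag) u8 {
    return switch (tag) {
        inline else => |t| @field(thing, @tagName(t)),
    };
}

test "switch with inline else" {
    try std.testing.expectEqual(@as(u8, 2), get(.{}, .b));
}
```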

1 Like

Yes, I’m slowly learning that my training wheels are taking me into territory I have to strap up for. This was my first adventure into godbolt land, and my sense that “some smart person” will just know the answer (though possibly still true in many cases) has to yield to the sense that I have to venture a bit deeper and not just depend on others… if “I care” - and that does seem to be significant. I could certainly just say, “well, it works… that’s fine for me”, and move on. But if I care enough to press for optimization, I can’t just expect good people like @tholmes to do all the godbolting for me in the name of discovering something beneficial for all. I’ll try to trust myself to venture beyond my comfort zone more, and hope for wizard-of-oz answers less.

And yet I find all this input so helpful; if I was slightly more self-sufficient, I might not have gleaned the nugget:

… for this I’m grateful, and have plenty of investigation cut out for me to fill those snippets of evenings and lunchbreaks that I have.

1 Like

No “bad” at all - I’m really grateful for the lead. Someday I’ll be able to quickly conclude for myself that “most of that assembly is…” one thing rather than another. I’m going to play with it some more, isolate the error handling, and see what more I can learn. Thanks again. Techniques like “change the if statements to execute unreachable” make perfect sense, but I don’t know how long it would have taken me to come up with that on my own.

It may be interesting to read the justification for this in the proposal which introduced it. For example, Andrew here agrees that @field should be named @member instead (since it was never split into two builtins).

1 Like

What this illustrates to me is that we humans don’t have the kind of issue with irregular syntax that, in the abstract, it always looks like we might.

There are separate @hasField and @hasDecl because these are used on types exclusively, and if you want an answer to “does this type have a field .foo” the answer “why yes! it does have a declaration .foo!” is wrong, basically always. The distinction matters.
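For instance (S is a stand-in struct):

```zig
const std = @import("std");

const S = struct {
    foo: u32, // a field
    const bar = 7; // a declaration
};

test "field vs decl are distinct questions" {
    // @hasField and @hasDecl answer different questions about the
    // same type, and conflating them would give wrong answers.
    try std.testing.expect(@hasField(S, "foo"));
    try std.testing.expect(!@hasField(S, "bar"));
    try std.testing.expect(@hasDecl(S, "bar"));
    try std.testing.expect(!@hasDecl(S, "foo"));
}
```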

For @field, the purpose / effect of the builtin is “do what I would do with dot access to this identifier, but with this comptime-known string instead”. Zig doesn’t have both . and :: (which I love for us, btw - :: is my nemesis), so there’s not a lot of point in introducing a distinction at the meta level which we don’t have at the object level.

“Dot access” is, among other things, field access, so @field is an adequate name for the operation. The fact that it isn’t rigorously descriptive is kind of minor, it’s just, well, irregular. Human languages tend to be, so we’re very good at learning “it’s just like that” in practice.

3 Likes

Lots of strong agreement to all this from me (especially re: :: !) Agree that @field is more than adequate. I think one valuable measure is “hindsight”, though, by its nature, that makes it a bit useless, too. It’s nice, months or years down the line, to be able to say something like “see, @field works just fine - nobody is confused about it”, but saddening to have to admit “dang, foo was a really bad name choice, after all; wish we had spent more time considering that”. It’s hard to be perfect, but it seems zig peeps are trying, and allowing time its due place, rather than rushing to market with a mere couple of good ideas and rush-jobs thrown in for fun.

With your leads, and some time to try to “learn up” godbolt in a rudimentary way, I affirmed a couple of things for myself and zeroed in on one bit that has become important to me. Many of my .contains() calls can NOT be comptime; you mentioned that your original session “literally proves how its performance is equal” - this is clear with the comptime prefix as you proposed, to make the map.get()...contains() comptime, but I will need runtime eval there. Not to say that apples should be compared to oranges, but I was curious about the impact, in terms of unforeseen loops or … well, there can’t be any unforeseen allocations now, can there? :slight_smile: Anyway, the critical diff when executing that pivotal contains() line without the comptime prefix (the crux of this “validation”), is this asm:

        test    byte ptr [rip + example.map], 2
        je      .LBB3_2
        push    rbp
        mov     rbp, rsp
        movzx   eax, byte ptr [rip + example.map+2]
        mov     byte ptr [rbp - 2], al
        mov     byte ptr [rbp - 1], 1
        pop     rbp

This entire bit of code doesn’t exist in the comptime variant. This is not my wheelhouse, but I think it suggests I can expect something like O(1), regardless of the number of items in the set or the size of the map overall. I’m going to duplicate my effort with the giant enums, which represent the whole HTML (tag attributes) spec, and see what that reveals. I could potentially say, “great, moving on”, or I could decide to profile with timing, perf, valgrind, or something, and venture even further into scary territory.

But my sense is that, with the EnumSet and EnumMap, I’ve landed on a solution that is trim, legible, and not mal-performant. Other ideas, like @Sze’s, offer some advantages if certain use cases were valuable (at least). My original idea had the embedded inline-for (which could conceptually be replaced with a switch on the enum with an inline else, as suggested, but only in core zig code, which I can’t change) - it might have actually optimized to a jump table anyway, as suggested; if I bothered to godbolt that, I’d find out. But, alas, I like the EnumSet/EnumMap better, and it’s likely to win the day after a little more profiling.
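For concreteness, the runtime shape I’m converging on looks roughly like this (stand-in enums; the map mirrors my proposal):

```zig
const std = @import("std");

const E1 = enum { a, b, c };
const E2 = enum { w, x, y, z };

const map = std.enums.EnumMap(E1, std.enums.EnumSet(E2)).init(.{
    .a = .initMany(&.{ .x, .y }),
    .b = .initMany(&.{ .w, .z }),
    .c = .initMany(&.{ .w, .x, .y }),
});

// Runtime validity check: one table lookup plus one bit test, so O(1)
// regardless of how large the enums get. The bitsets themselves are
// fully established at comptime.
fn validFor(e1: E1, e2: E2) bool {
    return if (map.get(e1)) |set| set.contains(e2) else false;
}

test "runtime validity check" {
    try std.testing.expect(validFor(.a, .y));
    try std.testing.expect(!validFor(.b, .y));
}
```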

1 Like

Conclusion:

@vulpesx ‘s response and a bit of followup later explained this one.

std.enums.EnumFieldStruct, StaticStringMap, and other ideas were suggested, and some concerns with @tagName use were argued through. In the end, after some analysis, I settled on:

   const E1 = enum {
      a, b, c,
   };
   const E2 = enum {
      w, x, y, z,
   };

   const ES = std.enums.EnumSet(E2);
   const map = std.enums.EnumMap(E1, ES).init(.{
      .a = .initMany(&.{ .x, .y }),
      .b = .initMany(&.{ .w, .z }),
      .c = .initMany(&.{ .w, .x, .y }),
   });
   try std.testing.expect( map.get(.a).?.contains(.y));
   try std.testing.expect(!map.get(.b).?.contains(.y));
   try std.testing.expect( map.get(.c).?.contains(.y));

In a “real world” implementation of this involving HTML element tags (.div, .table, etc.) and attributes which are legal for those tags (.label, .name, .id, etc.), and valid composition (which elements are legal within which others), the only additional discovery was the need for @setEvalBranchQuota(2000), as comptime “evaluation exceeded 1000 backwards branches”, halting compilation. Inspection of the results indicates perfect scalability otherwise, with the huge HTML “dataset” resulting in essentially the same assembly. All contains() are O(1), regardless of whether the map.get(.a).?.contains(.y) is prefixed with comptime or not (they can’t be comptime in my real-world applications), since the bitmaps are all fully established at comptime. Special thanks to @tholmes, among others.

1 Like

@setEvalBranchQuota should not take a hardcoded number, but a computed number that accurately represents your code. If the quota is too big it could hide faulty/inefficient loop/recursion logic.
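A sketch of that idea (E1/E2 and the scale factor are made up; on older Zig versions the type-info field is spelled .Enum rather than .@"enum"):

```zig
const std = @import("std");

const E1 = enum { a, b, c };
const E2 = enum { w, x, y, z };

// Derive the quota from the sizes actually involved, with a safety
// factor, instead of a magic constant. If the enums grow, the quota
// grows with them; if the loop logic regresses, it still trips.
const quota =
    @typeInfo(E1).@"enum".fields.len *
    @typeInfo(E2).@"enum".fields.len * 100;

comptime {
    @setEvalBranchQuota(quota);
    // ... expensive comptime work here ...
}
```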

Ah, makes sense, thanks!