What does the value (123) in "Semantic Analysis [123]" mean?

Hello all,

As most of us know, running “zig build run” prints a progress line such as “steps [A/B] zig build-exe NAME TARGET MCPU... Semantic Analysis [???] ...”. Recently I asked myself what the value after Semantic Analysis (??? in my example) means.
I looked in the Zig repository and found std.Progress, but sema_prog_node (which is, I guess, responsible for Semantic Analysis [???]) does not appear in many places, so I was not able to get the desired information.
Now, what is this value for? Is it a count of functions, statements, or something else?

Looking forward to your answers,
Samuel Fiedler

It is a counter. Whenever end() is called on a node, its parent's counter is incremented.
There are two places where the “Semantic Analysis” string is used as a node name: in Module.zig for testing and in Compile.zig for normal compilation. Following the code, child nodes are added for sub-compilations. I gave up there, but I believe it creates further grandchild nodes for declarations. So my guess is: the total number of declarations.
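
To make the counting mechanism concrete, here is a minimal toy model in Python. It is not the real std.Progress API; the class `Node` and its methods `start`, `end`, and `label` are hypothetical names chosen for illustration. It only shows the idea that ending a child node bumps the parent's counter.

```python
class Node:
    """Toy progress node (hypothetical; not the std.Progress API)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.completed = 0   # number of child nodes that have ended
        self.ended = False

    def start(self, name):
        # Create a child node that reports back to this node.
        return Node(name, parent=self)

    def end(self):
        # Ending a node increments the parent's counter exactly once.
        if self.ended:
            return
        self.ended = True
        if self.parent is not None:
            self.parent.completed += 1

    def label(self):
        return f"{self.name} [{self.completed}]"


sema = Node("Semantic Analysis")
for decl_name in ["main", "helper", "Config"]:  # made-up declaration names
    child = sema.start(decl_name)
    child.end()

print(sema.label())  # Semantic Analysis [3]
```

In this model the number in brackets is simply how many child nodes have finished so far, which is why it only ever grows during a build.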

Sounds logical. Perhaps I will take a look at mitchellh’s zig compiler internals and get a bit into the Zig source code to find out more interesting things about the compiler and to learn more things that are used frequently in Zig.

Hi! This is basically just a confirmation of what was said above, but yes, the number you see is the number of declarations which have been semantically analyzed. The function ensureDeclAnalyzed in Module.zig creates decl_prog_node to represent the analysis of this declaration, and the end call is what increments the number.

One small complexity: technically, it is the number of Decls analyzed. A Decl is an internal compiler data structure that is not in one-to-one correspondence with source declarations. In fact, it can be very far off! There’s one Decl for every source declaration (after instantiation, so a declaration in a generic type will have a separate Decl for every instantiation), but there’s also one for every generic function instance, and an extra one for every container type. The reasons are… subtle, and weird, and not worth going into right now :stuck_out_tongue:
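
As a rough illustration of how far the number can drift from the source declaration count, here is a back-of-the-envelope tally in Python. It simply adds up the three rules described above; the function name `estimate_decl_count`, its parameters, and the example numbers are all made up, and the compiler's real accounting may differ.

```python
def estimate_decl_count(plain_decls, decls_per_generic_type,
                        generic_type_instantiations,
                        generic_fn_instances, container_types):
    """Toy tally of Decls following the rules in the post (not compiler code)."""
    # One Decl per ordinary source declaration.
    total = plain_decls
    # Declarations inside a generic type are counted once per instantiation.
    total += decls_per_generic_type * generic_type_instantiations
    # One Decl per generic function instance.
    total += generic_fn_instances
    # One extra Decl per container type.
    total += container_types
    return total


# Hypothetical project: 10 ordinary declarations, a generic type with
# 3 declarations instantiated twice, 4 generic function instances,
# and 5 container types.
print(estimate_decl_count(10, 3, 2, 4, 5))  # 25
```

Even in this small made-up case the count is well above the number of declarations one would count by reading the source.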

Sounds logical to me, thanks!

It would be nice to have a denominator for that value, wouldn’t it? The problem is that the compiler does not know how many declarations it will need to analyze until it finds them.

If you are recompiling after a successful compilation, the previous count is a good estimate for the denominator. Wouldn’t it be nice if the compiler tracked this value? Well, the bigger plan for that is incremental compilation, in which case it will only analyze declarations that have changed, directly or indirectly, and again it does not know how many until it finds them. However, this system could output some interesting stats which I think would be handy:

  • how many new declarations were introduced due to the edit
  • how many existing declarations were affected by the edit
  • how many declarations were garbage collected due to being orphaned by the edit

At first such information might be esoteric, but I would imagine as one gets comfortable with Zig programming, one would get a feel for those stats and could potentially learn about when an edit made different changes than expected, or had more wide-ranging consequences than expected. It could help programmers get a more intuitive sense for the dependency structure of their application and what components are causing a high amount of compilation cost.
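
The three stats listed above could be computed along these lines. This is only a sketch under assumptions, not how the Zig compiler works: it assumes each declaration is a string name, that the dependency graph is a dict mapping a declaration to the declarations it uses, and that "affected" means an existing declaration that was edited or transitively depends on an edited one. The helper names `reachable` and `edit_stats` are hypothetical.

```python
def reachable(edges, roots):
    """Return every node reachable from roots by following edges."""
    seen = set()
    stack = list(roots)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(edges.get(node, ()))
    return seen


def edit_stats(old_deps, new_deps, roots, edited):
    """Compare two dependency graphs before and after an edit."""
    old_live = reachable(old_deps, roots)
    new_live = reachable(new_deps, roots)

    # Reverse the new graph: declaration -> declarations that use it.
    users = {}
    for decl, deps in new_deps.items():
        for dep in deps:
            users.setdefault(dep, set()).add(decl)

    # Existing declarations that were edited, or depend on an edited one.
    affected = reachable(users, edited) & old_live & new_live

    return {
        "new": new_live - old_live,
        "affected": affected,
        "orphaned": old_live - new_live,
    }


# Made-up example: parse() stops calling tokenize() and calls lex() instead.
old_deps = {"main": {"parse", "render"}, "parse": {"tokenize"}}
new_deps = {"main": {"parse", "render"}, "parse": {"lex"}}

stats = edit_stats(old_deps, new_deps, roots={"main"}, edited={"parse"})
print(stats)
```

Here `lex` is reported as new, `tokenize` as orphaned, and `parse` together with `main` as affected, while `render` is untouched. A surprising entry in the affected set is the kind of signal the post describes.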

Maybe even (optionally) cache those statistics over multiple runs, and then allow extracting the data or logging it somewhere?
Users could then come up with third-party packages to plot it, etc.

Ooh, interesting ideas for extra info. When I get off my current tangents back onto incremental, I might add a debug flag which dumps the following after compilation:

  • How many Decls have been analyzed in total in this compilation
  • How many this update made [un]referenced
  • How many are currently referenced in total

Shouldn’t be too hard (once I actually write the “detect things becoming unreferenced” code).
