i’ve added some small tweaks to a fork of zls, for the purpose of adding a new (semantic) token type which my IDE (VS Code) will highlight distinctly… [ all of the changes i’ve made are in fact confined to semantic_tokens.zig ]
armed with some new-found knowledge of the zig Ast (plus familiarity with the LSP), everything is fine on this front…
what i want to do, however, is generate HTML (or extended Markdown) that effectively renders an individual .zig file as it would appear in vscode… i’ve done this in the past and am quite familiar with the details of rendering…
implementing this as a standalone command-line tool, is this as “simple” as tokenizing/parsing a single .zig file and then walking the AST???
presumably i will have no less information than i currently have inside of semantic_tokens.zig within zls… i can also assume that the input .zig file is “error-free” and in fact is already formatted…
and is there enough lexical information retained about the input .zig file to ensure that all tokens are rendered at corresponding line/col locations??? and most important of all, are comments retained somewhere???
any small examples of AST tree walkers (maybe zig fmt) would be helpful to increase my understanding…
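to make the question concrete, here’s a minimal sketch of the tokenize/parse step as i understand it, written against the ~0.13-era std.zig API (names may have shifted in your version)… it suggests that line/col info is fully recoverable via Ast.tokenLocation, and that while doc comments (///) are real tokens, plain // comments are skipped by the tokenizer and would have to be recovered from the gaps between tokens:

```zig
const std = @import("std");

pub fn main() !void {
    var gpa_state = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa_state.deinit();
    const gpa = gpa_state.allocator();

    // Ast.parse wants a sentinel-terminated source buffer
    const bytes = try std.fs.cwd().readFileAlloc(gpa, "example.zig", 1 << 24);
    defer gpa.free(bytes);
    const source = try gpa.dupeZ(u8, bytes);
    defer gpa.free(source);

    var tree = try std.zig.Ast.parse(gpa, source, .zig);
    defer tree.deinit(gpa);

    const tags = tree.tokens.items(.tag);
    const starts = tree.tokens.items(.start);

    var prev_end: usize = 0;
    for (tags, starts, 0..) |tag, start, i| {
        const tok: std.zig.Ast.TokenIndex = @intCast(i);
        const slice = tree.tokenSlice(tok);

        // plain `//` comments never become tokens; they survive only in
        // the source text, in the gap between consecutive tokens
        const gap = source[prev_end..start];
        if (std.mem.indexOf(u8, gap, "//") != null)
            std.debug.print("  (comment in gap: {s})\n", .{std.mem.trim(u8, gap, " \r\n\t")});

        // tokenLocation gives 0-based line/column relative to a start offset
        const loc = tree.tokenLocation(0, tok);
        std.debug.print("{d}:{d} {s} '{s}'\n", .{ loc.line + 1, loc.column + 1, @tagName(tag), slice });

        prev_end = start + slice.len;
    }
}
```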
I think it would make sense to investigate what the autodocs do. If you take a look at, for example, this page: Zig Documentation
And inspect the source listing at the bottom, it already has highlighting generated, including the comments. I haven’t investigated the details of how the autodocs work, but they are written in Zig and reuse the zig parser, so the answers should be in the implementation of the autodocs.
i found this function renderTree, which seems to be at the heart of zig fmt… it accepts a Fixups parameter that apparently can apply all sorts of transformations to the Ast as it is rendered…
correct me if i’m wrong, but applying certain “fixups” can produce an output file that is no longer legal zig input??? if so, then this can really simplify an implementation…
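answering my own question after a peek at the render source: the string-injection fixups appear to be emitted verbatim, so yes, the output need not be legal zig… here are the relevant fields, paraphrased from lib/std/zig/render.zig (~0.13; exact names and fields may differ in your zig version):

```zig
const std = @import("std");
const Ast = std.zig.Ast;

// paraphrased from lib/std/zig/render.zig (~0.13); verify against your version
pub const Fixups = struct {
    /// render these expressions as the given raw string instead
    replace_nodes_with_string: std.AutoHashMapUnmanaged(Ast.Node.Index, []const u8) = .{},
    /// emit the raw string immediately after the node
    append_string_after_node: std.AutoHashMapUnmanaged(Ast.Node.Index, []const u8) = .{},
    /// rename all identifiers matching the key to the value
    rename_identifiers: std.StringArrayHashMapUnmanaged([]const u8) = .{},
    // ...plus omit_nodes, gut_functions, unused_var_decls, etc., which
    // (i gather) serve the compiler's test-case reduction tooling
};
```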
more detail… i’m currently using mkdocs material for writing docs – which has support for syntax-highlighted blocks of code in dozens of languages… here’s an example of what i’ve done with Zig•EM…
so yes, i had to write my own python plugin that basically did some regex processing of a “chunk” of source code; fairly straightforward to implement…
the problem, of course, is that i don’t have any semantic information… i can’t distinguish identifiers that are types vs functions, etc… so, with a small example like the one shown above, i have manually “annotated” tokens with a suffix like "#t" or "#f" as a hint to my regex processing…
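for concreteness, here’s a (hypothetical) before/after of that annotation convention – the #t/#f suffixes are purely my own markers, not zig:

```zig
// before -- plain zig
pub fn fire(led: Led) void {
    led.wink(10);
}

// after -- '#t' marks a type identifier, '#f' a function identifier;
// no longer legal zig, but fine as input to the regex highlighter
pub fn fire#f(led: Led#t) void {
    led.wink#f(10);
}
```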
while this is fine for small snips of code, i need to automate this process when i wish to render dozens and dozens of source files in this manner…
so correct me if i’m wrong, but it appears that the transformation basically boils down to the following (sketched in code after the list):

1. generate an AST for a given source file
2. walk the AST, looking for nodes requiring a "#" suffix
3. create a Fixups value that contains all of these transformations
4. call renderTree, which will produce the annotated output
5. pass this annotated output as input to my mkdocs syntax highlighter
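here’s a sketch of steps 1 through 4, again assuming the ~0.13-era std.zig API (Fixups lives in lib/std/zig/render.zig; whether it’s reachable as Ast.Fixups may depend on your zig version)… since i don’t have semantic info yet, a naming-convention check (UpperCamelCase means type) stands in for the real classification, which is exactly the open question in the posts below:

```zig
const std = @import("std");
const Ast = std.zig.Ast;

/// steps 1-4 of the pipeline; the naming-convention check is a placeholder
/// for real semantic analysis
pub fn annotate(gpa: std.mem.Allocator, source: [:0]const u8) ![]u8 {
    // step 1: generate the AST
    var tree = try Ast.parse(gpa, source, .zig);
    defer tree.deinit(gpa);
    std.debug.assert(tree.errors.len == 0); // input is assumed error-free

    // step 3's container, filled during step 2's walk
    var fixups: Ast.Fixups = .{};
    defer fixups.append_string_after_node.deinit(gpa);

    // step 2: walk every node; tag identifier nodes with a suffix
    const node_tags = tree.nodes.items(.tag);
    const main_tokens = tree.nodes.items(.main_token);
    for (node_tags, main_tokens, 0..) |tag, main_token, i| {
        if (tag != .identifier) continue;
        const name = tree.tokenSlice(main_token);
        if (name.len > 0 and std.ascii.isUpper(name[0])) {
            const node: Ast.Node.Index = @intCast(i);
            try fixups.append_string_after_node.put(gpa, node, "#t");
        }
    }

    // step 4: render with the fixups applied
    var out = std.ArrayList(u8).init(gpa);
    errdefer out.deinit();
    try tree.renderToArrayList(&out, fixups);
    return out.toOwnedSlice();
}
```

note that append_string_after_node emits after the whole node, which for a bare identifier node is immediately after the identifier – exactly where the suffix belongs…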
You might want to look at neurocyte/zat, which is a zig CLI tool that can produce html from zig source (and many other languages). It does not use std.zig.Ast, though; it uses tree-sitter.
looking a little closer at the zls implementation, there is ALSO the analysis.zig file which appears to do some critical semantic analysis… it basically “knows” whether an identifier is actually a type (and should therefore be rendered differently from a function name)…
from what i can tell, however, analysis.zig is more or less tied to zls (though i suspect the code has nothing to do with the LSP per se)…
so now that i have my Ast (which was trivial for me to generate), are there any “reusable” semantic analysis modules??? something that presumably is creating a symbol table that makes the sort of distinctions i need…
all i can find are functions that take me from an AST to ZIR code – which is much further downstream than i need to go…
an entirely different approach: i know zls “does what i want”, in that it’s able to send some json that represents a complete parsing of a particular source file; i believe this data not only contains line/col info, but also the semantic token types ascribed by zls…
basically, i could write a command-line renderer that starts a zls session, in which i ask the server to parse some file… i would then have to effectively walk the JSON returned by zls (which is more or less an AST itself) and generate the annotated markdown i can consume within my doc platform…
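the plumbing for that is just the LSP base protocol – every message to and from zls is a Content-Length header plus a JSON-RPC body, and the session has to be initialize’d (and the file didOpen’d) before the semantic-token request… a sketch, ~0.13-era std, with the uri obviously a placeholder:

```zig
const std = @import("std");

/// LSP base protocol framing: each message is "Content-Length: N\r\n\r\n" + JSON
fn writeMessage(writer: anytype, json: []const u8) !void {
    try writer.print("Content-Length: {d}\r\n\r\n{s}", .{ json.len, json });
}

pub fn main() !void {
    const stdout = std.io.getStdOut().writer();
    // NOTE: a real session must first exchange "initialize"/"initialized",
    // then send textDocument/didOpen for the file, before this request
    const request =
        \\{"jsonrpc":"2.0","id":1,"method":"textDocument/semanticTokens/full","params":{"textDocument":{"uri":"file:///tmp/example.zig"}}}
    ;
    try writeMessage(stdout, request);
}
```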
it might be easier, in fact… there are tests in zls that basically send a "textDocument/semanticTokens/full" request to the server and then comb through the response…
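and the response’s data field is easy to walk… per the LSP 3.16 spec it’s a flat array of u32s in groups of five (deltaLine, deltaStart, length, tokenType, tokenModifiers), delta-encoded against the previous token; tokenType is an index into the legend zls advertises during initialize:

```zig
const std = @import("std");

const DecodedToken = struct { line: u32, col: u32, len: u32, type_idx: u32, mods: u32 };

/// decode the "data" array of a semanticTokens response into absolute
/// line/col tokens (per the LSP 3.16 spec's delta encoding)
fn decode(gpa: std.mem.Allocator, data: []const u32) ![]DecodedToken {
    var out = std.ArrayList(DecodedToken).init(gpa);
    errdefer out.deinit();
    var line: u32 = 0;
    var col: u32 = 0;
    var i: usize = 0;
    while (i + 5 <= data.len) : (i += 5) {
        line += data[i];
        if (data[i] != 0) col = 0; // the column delta restarts on each new line
        col += data[i + 1];
        try out.append(.{
            .line = line,
            .col = col,
            .len = data[i + 2],
            .type_idx = data[i + 3], // index into the server's token-type legend
            .mods = data[i + 4], // bitset over the modifier legend
        });
    }
    return out.toOwnedSlice();
}
```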