We all should know top-level doc comments (`//!`) and doc comments (`///`).
Now, in many programming languages, there is some kind of standard pattern for documenting what the parameters mean and what the meaning of the returned value (if the function has one) is.
In Python, we have `:param something:`, and in JavaScript we have `@param something`. What are the standard patterns you use in your Zig code?
I don’t think one has solidified. The standard library just speaks to the parameters as needed. (See `openFileAbsolute` as an example in `std.fs`.)
In languages like Python and JS, where typing is optional or non-existent, I find those conventions very helpful for describing the type of data expected. Less necessary in statically typed declarations.
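To illustrate that point, here is a minimal sketch of a declaration where the signature already carries the type information and the doc comment only adds what the types cannot say. `indexOfByte` is a made-up name for this example, not a standard library function.

```zig
const std = @import("std");

/// Returns the index of the first occurrence of `needle` in `haystack`,
/// or `null` if it does not occur.
///
/// The signature already says "byte slice in, optional index out", so
/// the comment only covers behavior (first match wins, null on a miss).
pub fn indexOfByte(haystack: []const u8, needle: u8) ?usize {
    for (haystack, 0..) |byte, i| {
        if (byte == needle) return i;
    }
    return null;
}

test "indexOfByte usage" {
    try std.testing.expect(indexOfByte("zig", 'i').? == 1);
    try std.testing.expect(indexOfByte("zig", 'x') == null);
}
```

A Python-style `:param haystack: bytes` line would mostly repeat what `[]const u8` already states here.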
The ones that need the most explanation are `anytype` parameters. I’ve run into that problem on something I’m working on right now actually.
@Calder-Ty is right that the type system documents a lot on its own.
I tend to find that libraries and projects are what need the most clarification and individual functions are then easier to understand in context. Maybe that’s just me.
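As a rough sketch of the `anytype` case, a doc comment on the parameter can spell out the requirement that the signature leaves open. The function `sum` and its wording are assumptions made up for illustration, not an established convention.

```zig
const std = @import("std");

/// Sums every element of `items` into an `i64`.
pub fn sum(
    /// Anything a `for` loop can iterate over whose elements coerce to
    /// `i64`: an array, a slice, or a pointer to an array. `anytype`
    /// cannot state this requirement, so the doc comment does.
    items: anytype,
) i64 {
    var total: i64 = 0;
    for (items) |item| total += item;
    return total;
}

test "sum usage" {
    const array = [_]i32{ 1, 2, 3 };
    const slice: []const i32 = &array;
    try std.testing.expect(sum(array) == 6);
    try std.testing.expect(sum(&array) == 6);
    try std.testing.expect(sum(slice) == 6);
}
```

Passing something that does not fit (a struct, say) still fails at compile time inside the function body; the comment just tells the caller what is expected before they hit that error.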
Not an answer, but tacking onto the question, I think a useful case for this style of doc would be resolving ambiguity over whether a struct is meant to be initialized directly, or through a `new()`/`init()` function namespaced to the struct. Also resolving ambiguity over which fields of a struct are intended to be accessed directly or not. Private fields aren’t getting added to the language per Andrew:
GitHub issue labeled `proposal` (opened 09:24PM - 06 Oct 21 UTC, closed 07:21PM - 13 Oct 21 UTC):
## Introduction
Currently, function and variable declarations are private by default and are made externally visible with the `pub` modifier. However, data fields are always public and there is no way to restrict their visibility. Apparently, field access control was never really working in Zig (#569) and was officially removed in #2059. Which is somewhat surprising, since no other modern language I can think of (Java, C++, C#, D, Rust, Go, ...) fails to provide this feature.
The lack of private data makes it impossible to do proper encapsulation / implementation hiding, which is widely considered to be a basic software engineering technique. In particular:
* Types can hide some of their methods and constants, but always expose their internal structure. This leaves them open to accidental (or bad-practical) corruption. The compiler will not complain if the `.capacity` of an `ArrayList` is overwritten.
* Documentation value is lost. Whether or not a field is intended for direct access can only be communicated through code comments.
* Auditing code becomes more difficult, because there are more opportunities to violate constraints and invariants.
* Implementing opaque handles is nearly impossible without additional builtins: #9859, [#1595 (comment)](https://github.com/ziglang/zig/issues/1595#issuecomment-425590705).
## Possible objections
No official reason was given in #2059, but two possible objections to private fields come to mind:
1. Visibility control is more useful for decls than for fields, because invisible decls become completely inaccessible, while fields can always be messed with through pointers.
2. This will lead to over-encapsulation and getter-setter boilerplate ([#2974 (comment)](https://github.com/ziglang/zig/issues/2974#issuecomment-515948745)). Sometimes it's just better to reach in and do things directly.
Concerning 1, I'd say that protection does not have to be perfect to be useful. Preventing accidental and semi-deliberate messing is valuable by itself, not to mention the documentation value.
2 may be a real concern, or maybe not. The same argument can be made about private decls, but a proposal to override the visibility of methods at the call site (#8779) was recently met with a resounding rejection -- even though there are certainly cases where it is both reasonable and safe to call private methods. In addition, Zig is increasingly making legal but potentially buggy patterns into hard errors, sometimes controversially (e.g. unused variables). The lack of basic implementation hiding is not consistent with this safety-first attitude, IMHO.
## Syntax
The simplest option is to adopt the same default as with decls: file-level private by default and externally visible with `pub`. If this clashes with data-oriented style, the opposite default can be chosen, but that would require the introduction of a `private` keyword. Yet another possibility is to make field visibility struct-level, e.g. with `opaque struct {...}`.
I don't really have a strong opinion here.
## Open questions
* Is there a case for field-level access control in unions?
---
Update 1: As a side benefit, it may be possible to remove the `opaque {}` type from the language, since it is rarely used and can be simulated with `struct { private ptr: usize }`.
Update 2: I wouldn't mind adding an escape hatch like `@privateField(object, "name")` for cases where you need to use a particular library, but find the API too locked-down. Some discussion of this is [here](https://github.com/ziglang/zig/issues/9909#issuecomment-937723462).
And I think some sort of doc string might be a nice alternative approach to his method of indicating intent via struct field names.
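Something like the following is what I have in mind, as a sketch only. The `Counter` type and the wording of the comments are made up for illustration; nothing in the language enforces them.

```zig
const std = @import("std");

/// A counter that stops at a fixed limit.
///
/// Construct with `init`. Do not write a `Counter{ ... }` literal by
/// hand, because `init` establishes the `count <= limit` invariant.
pub const Counter = struct {
    /// Safe to read directly. Modify only through `increment`.
    count: u32,
    /// Internal. Not intended to be read or written by callers.
    limit: u32,

    /// Returns a counter starting at zero that stops at `limit`.
    pub fn init(limit: u32) Counter {
        return .{ .count = 0, .limit = limit };
    }

    /// Adds one unless the limit is reached. Returns whether it changed.
    pub fn increment(self: *Counter) bool {
        if (self.count >= self.limit) return false;
        self.count += 1;
        return true;
    }
};

test "Counter usage" {
    var c = Counter.init(2);
    try std.testing.expect(c.increment());
    try std.testing.expect(c.increment());
    try std.testing.expect(!c.increment());
    try std.testing.expect(c.count == 2);
}
```

The compiler will still let anyone assign to `limit`, so this documents intent and nothing more.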
Note that you can add doc comments to parameters directly.
Example from the standard library:
/// Windows-only. Same as `symLink` except the pathname parameters
/// are WTF16 LE encoded.
pub fn symLinkW(
    self: Dir,
    /// WTF-16, does not need to be NT-prefixed. The NT-prefixing
    /// of this path is handled by CreateSymbolicLink.
    /// Any path separators must be `\`, not `/`.
    target_path_w: [:0]const u16,
    /// WTF-16, must be NT-prefixed or relative
    sym_link_path_w: []const u16,
    flags: SymLinkFlags,
) !void {
    return windows.CreateSymbolicLink(self.fd, sym_link_path_w, target_path_w, flags.is_directory);
}
How it looks in autodoc:
https://ziglang.org/documentation/master/std/#std.fs.Dir.symLinkW
That looks great on auto-doc, actually. Very nice.
I thought this was broken for a while. Glad to see I was either wrong or it’s fixed!
I believe it got fixed in the autodoc redesign:
GitHub pull request ziglang:master ← ziglang:rework-autodoc (opened 08:14AM - 07 Mar 24 UTC):
This branch deletes the Autodoc implementation and replaces it with a new one.
…
## High Level Strategy
The old implementation looked like this:
```
5987 src/Autodoc.zig
435 src/autodoc/render_source.zig
10270 lib/docs/commonmark.js
1245 lib/docs/index.html
5242 lib/docs/main.js
2146 lib/docs/ziglexer.js
25325 total
```
After compilation (sizes are for standard library documentation):
```
272K commonmark.js
3.8M data-astNodes.js
360K data-calls.js
767K data-comptimeExprs.js
2.2M data-decls.js
896K data-exprs.js
13K data-files.js
45 data-guideSections.js
129 data-modules.js
15 data-rootMod.js
294 data-typeKinds.js
3.2M data-types.js
38K index.html
158K main.js
36M src/ (470 .zig.html files)
78K ziglexer.js
```
Total output size: 47M (5.7M gzipped)
`src/Autodoc.zig` processed ZIR code, outputting JSON data for a web application to consume. This resulted in a lot of code ineffectively trying to reconstruct the AST from no-longer-available data.
`lib/docs/commonmark.js` was a third-party markdown implementation that supported *too many features*; for example I don't want it to be possible to have HTML tags in doc comments, because that would make source code uglier. Only markdown that looks good both as source and rendered should be allowed.
`lib/docs/ziglexer.js` was an implementation of Zig language tokenization in JavaScript, despite Zig already exposing its own tokenizer in the standard library. When I saw this [added to the zig project](https://github.com/ziglang/zig/pull/16306), [a little part of me died inside](https://github.com/ziglang/zig/issues/16490).
`src/autodoc/render_source.zig` was a tool that converted .zig files to syntax-highlighted but non-interactive .zig.html files.
The new implementation looks like this:
```
942 lib/docs/main.js
403 lib/docs/index.html
933 lib/docs/wasm/markdown.zig
226 lib/docs/wasm/Decl.zig
1500 lib/docs/wasm/markdown/Parser.zig
254 lib/docs/wasm/markdown/renderer.zig
192 lib/docs/wasm/markdown/Document.zig
941 lib/docs/wasm/main.zig
1038 lib/docs/wasm/Walk.zig
6630 total
```
After compilation (sizes are for standard library documentation):
```
12K index.html
32K main.js
192K main.wasm
12M sources.tar
```
Total output size: 12M (2.3M gzipped)
As you can see, it is both dramatically simpler in terms of implementation as well as build artifacts. Now there are exactly 4 files instead of approximately one gajillion, with a 4x reduction in total file size of the generated web app.
However, not only is it simpler, it's actually more powerful than the old system, because instead of processing ZIR, this system processes the source files directly, meaning it has 100% of the information and never needs to piece anything together backwards.
This strategy uses a WebAssembly module written in Zig. This allows it to reuse components from the compiler, such as the tokenizer, parser, and other utilities for operating on Zig code.
The sources.tar file, after being decompressed by the HTTP layer, is fed directly into the wasm module's memory. The tar file is parsed using std.tar and source files are parsed in place, with some additional computations added to hash tables on the side.
There is room for introducing worker threads to speed up the parsing, although single-threaded it's already so fast that it doesn't really seem necessary.
## Zig Installation
Before this branch, a Zig installation comes with a `docs/std/` directory that contains those 47M of output artifacts mentioned above.
This branch removes those artifacts from Zig installations, instead offering the `zig std` command, which hosts std lib autodocs and spawns a browser window to view them. When this command is activated, `lib/compiler/std-docs.zig` is compiled from source to perform this operation (#19063).
The HTTP server creates the requested files on the fly, including rebuilding main.wasm if any of its source files changed, and constructing sources.tar, meaning that any source changes to the documented files, *or to the autodoc system itself* are immediately reflected when viewing docs. Prefixing the URL with `/debug` results in a debug build of the WebAssembly module.
This means contributors can test changes to Zig standard library documentation, as well as autodocs functionality, by pressing refresh in their browser window.
In total, the Zig installation size is reduced from 317M to 268M (-15%).
## Time to Build the Compiler
Since many lines were deleted from the compiler, we might hope for it to compile faster.
```
Benchmark 1 (3 runs): before/zig build-exe ...
  measurement         mean ± σ         min … max     outliers  delta
  wall_time          86.8s ± 3.28s    84.1s … 90.5s    0 ( 0%)  0%
  peak_rss          4.58GB ± 492KB   4.58GB … 4.58GB   0 ( 0%)  0%
  cpu_cycles          350G ± 1.99G     348G … 352G     0 ( 0%)  0%
  instructions        505G ± 205M      505G … 506G     0 ( 0%)  0%
  cache_references   21.4G ± 128M     21.3G … 21.5G    0 ( 0%)  0%
  cache_misses       1.76G ± 15.3M    1.75G … 1.78G    0 ( 0%)  0%
  branch_misses      2.43G ± 2.19M    2.43G … 2.43G    0 ( 0%)  0%
Benchmark 2 (3 runs): after/zig build-exe ...
  measurement         mean ± σ         min … max     outliers  delta
  wall_time          85.9s ± 3.63s    82.8s … 89.9s    0 ( 0%)    - 1.1% ± 9.0%
  peak_rss          4.51GB ± 259KB   4.51GB … 4.51GB   0 ( 0%)  ⚡- 1.5% ± 0.0%
  cpu_cycles          346G ± 2.29G     343G … 347G     0 ( 0%)    - 1.2% ± 1.4%
  instructions        499G ± 185M      498G … 499G     0 ( 0%)  ⚡- 1.3% ± 0.1%
  cache_references   21.0G ± 209M     20.8G … 21.2G    0 ( 0%)    - 1.9% ± 1.8%
  cache_misses       1.73G ± 16.9M    1.71G … 1.75G    0 ( 0%)    - 1.9% ± 2.1%
  branch_misses      2.41G ± 2.16M    2.41G … 2.41G    0 ( 0%)    - 0.7% ± 0.2%
```
Not much difference here.
A ReleaseSmall build of the compiler shrinks from 10M to 9.8M (-1%).
## Time to Build Autodocs
Autodocs generation is now done properly as part of the pipeline of the compiler rather than tacked on at the end. It also no longer has any dependencies on other parts of the pipeline.
This is how long it now takes to generate standard library documentation:
```
Benchmark 1 (3 runs): old/zig test /home/andy/dev/zig/lib/std/std.zig -fno-emit-bin -femit-docs=docs
  measurement         mean ± σ         min … max     outliers  delta
  wall_time          13.3s ± 405ms    12.8s … 13.6s    0 ( 0%)  0%
  peak_rss          1.08GB ± 463KB   1.08GB … 1.08GB   0 ( 0%)  0%
  cpu_cycles         54.8G ± 878M     54.3G … 55.8G    0 ( 0%)  0%
  instructions        106G ± 313K      106G … 106G     0 ( 0%)  0%
  cache_references   2.11G ± 35.4M    2.07G … 2.14G    0 ( 0%)  0%
  cache_misses       41.3M ± 455K     40.8M … 41.7M    0 ( 0%)  0%
  branch_misses       116M ± 67.8K     116M … 116M     0 ( 0%)  0%
Benchmark 2 (197 runs): new/zig build-obj -fno-emit-bin -femit-docs=docs ../lib/std/std.zig
  measurement         mean ± σ         min … max     outliers  delta
  wall_time         24.6ms ± 1.03ms  22.8ms … 28.3ms   4 ( 2%)  ⚡- 99.8% ± 0.3%
  peak_rss          87.3MB ± 60.6KB  87.2MB … 87.4MB   0 ( 0%)  ⚡- 91.9% ± 0.0%
  cpu_cycles         38.4M ± 903K     37.4M … 46.1M   13 ( 7%)  ⚡- 99.9% ± 0.2%
  instructions       39.7M ± 12.4K    39.7M … 39.8M    0 ( 0%)  ⚡-100.0% ± 0.0%
  cache_references   2.65M ± 89.1K    2.54M … 3.43M    3 ( 2%)  ⚡- 99.9% ± 0.2%
  cache_misses        197K ± 5.71K     186K … 209K     0 ( 0%)  ⚡- 99.5% ± 0.1%
  branch_misses       184K ± 1.97K     178K … 190K     6 ( 3%)  ⚡- 99.8% ± 0.0%
```
## Regressed Features
* Guides
- I don't want to port the langref to a guide. I think that should remain a separate document.
- I think there is room for guides to be added back to this system - likely they will actually work better since there is now support for parsing and linkifying arbitrary code.
## New Features
### Reliable Linkification
This stems from the fact that with full source files we have all the information, and can write more robust code to look up identifiers from the context they occur in.
### Interactive Source Listings
Press `u` to go to source code for any declaration:
![image](https://github.com/ziglang/zig/assets/106511/d4af2a3a-6efc-4f6f-9cad-91a7a62c909e)
The links take you to the API page for that specific link by changing the location hash.
### Embedded Source Listings
![image](https://github.com/ziglang/zig/assets/106511/5ec77654-46ec-4cf5-aa9d-e5acec19ca3f)
### Search Includes Doc Comments
Pretty straightforward. The current autodoc seems to not support this for some reason.
![image](https://github.com/ziglang/zig/assets/106511/4e8fb955-f50c-4182-9a03-f5f530e4720a)
Planning to also add struct field names, struct field docs, parameter names, and parameter docs to this.
### Error Set View
Merged error sets are detected:
![image](https://github.com/ziglang/zig/assets/106511/e77b57ed-2212-4fe4-b516-210608683d0b)
Errors that come from other declarations are linked:
![image](https://github.com/ziglang/zig/assets/106511/d200483d-0f75-47e3-93f4-741b52047761)
Errors are also shown on function view:
![image](https://github.com/ziglang/zig/assets/106511/f240ed5d-ee3a-4c9f-8923-012353548894)
### Correct Type Detection
![image](https://github.com/ziglang/zig/assets/106511/8358b23c-ada1-4322-8175-590f93a153d7)
Previous implementation guesses wrong on the type of `options` as well as `DynLib`.
### Correct Implementation of Scroll History
See https://github.com/andrewrk/autodoc/commit/6d96a63430b39c8a08158410b25dd0ecafea28db
## Follow-Up Work
I do not consider these to be merge blockers.
* make the panic handler reflect the failure in the user interface
* when navigating back to search results, up+down arrow should keep working
* redundant search results (search "format")
* in query_exec_fallible, sorting should also check the local namespace inside the file
* walk assign_destructure not implemented yet
* escape URLs when rendering html (look for `missing_feature_url_escape`)
* implement renderHome for multiple modules
* struct fields: render each component separate rather than via source rendering
* infer comptime_int constants (example: members of `#std.time`)
* when global const has a type of `type`, categorize it as a type despite its value
* show abbreviated doc comments in types and namespaces listings
* show type function names as e.g. `ArrayList(T)`
* enum fields should not be linkified (example: `std.log.Level`)
* shrink Ast to fit the slices
* linkification of methods (example: `std.array_hash_map.ArrayHashMap.count`)
* navigating to source from a decl should scroll to the decl
* in source view, make `@imports` into links, but keep same syntax highlighting
* include struct field names and doc comments in search query matching
* include function parameter names and doc comments in search query matching
* instead of logging "can't index foo because it has syntax errors" put it in the UI
* in Walk.expr() it is missing support for asm_input/asm_output nodes
* in renderNamespace, handle an aliasing loop
* add a history item when clicking a search result (it already works when keyboard triggered)
* instead of "declaration not found", show the decl that can't be penetrated (example: `#std.os.system.fd_t`)
* when rendering source code, better handle indentation (example: `#std.array_hash_map.ArrayHashMapUnmanaged.count`)
-----
closes #3403
closes #13512
closes #15865
closes #16490
closes #16728
closes #16741
closes #16763
closes #16898
closes #17061