Why are allocator interfaces and implementations split across `std.mem` / `std.heap`, while IO is colocated under `std.Io`?

Does the namespace need to be optimized for search or for perusing the library? Just because the autodoc organizes by name by default does not mean that newcomers (or anyone) must be expected to learn the standard library by clicking through the current autodocs like I do now. Here’s an analogy from my experience.

Just because two books don’t have the same title (for the sake of argument) doesn’t mean that the library shelves are sorted by title. The materials are tagged, numbered, weighed, scanned, labeled, covered, etc., and divided by media type, category, author, audience (children, teen, adult, large print), etc. My point is that this is an example of a curated experience for getting information into the brains of the patrons who visit, in a pleasant, human-focused manner.

That kind of service has value. Does anyone have ideas for how to curate a good search or learning experience for zig std? Here are two off the top of my head:

  1. Tags in doc comments, which would let people find things by category if they want, without increasing the depth of the namespace. This requires lots of tagging, which may be as fun as knitting for the right mind.
  2. A popularity measure for functions. Grep a codebase to see how often a function is used, and put the most-used ones at the top of the docs, because it is probably important that you know how to use them.
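
A rough sketch of idea 2, assuming plain text matching is good enough for a first ranking. `countUses` and the sample source are made up for illustration; only `std.mem.count` comes from the standard library. Text matching will miss aliased names (`const eql = std.mem.eql;`) and will also count longer names that share the same prefix, so treat the numbers as a hint rather than a measurement.

```zig
const std = @import("std");

/// Hypothetical helper: counts non-overlapping textual occurrences
/// of `name` in `source`. No parsing, no alias resolution.
fn countUses(source: []const u8, name: []const u8) usize {
    return std.mem.count(u8, source, name);
}

test "count how often a std function appears in some source text" {
    // Stand-in for the contents of a real codebase.
    const source =
        \\const a = std.mem.eql(u8, x, y);
        \\const b = std.mem.eql(u8, y, z);
        \\std.debug.print("done", .{});
    ;
    try std.testing.expectEqual(@as(usize, 2), countUses(source, "std.mem.eql"));
    try std.testing.expectEqual(@as(usize, 1), countUses(source, "std.debug.print"));
}
```

Sorting the names by these counts would give the ordering for the docs page.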

To summarize: good curation, however it is done, would make using the namespace for categorization redundant or unnecessary.

No, they are sorted by author, but scaling out, they are categorized.

A book can have only one “canonical” location, because it’s an object. While “stuff in std” is not subject to that restriction, we should apply it anyway. Therefore the canonical location should be chosen on whatever basis makes it easy to peruse the standard.. what do we call it? Oh right, library.

Better search is well worth pursuing, but it has no bearing on how std should be organized, because it’s still going to have a nested form, and that form should be as useful as possible. Exactly like the physical placement of books in a standard (physical) library!

I firmly disagree with this. Intermediate and expert users have the first level of the std namespace effectively memorized. I do use the search bar, but not nearly so often as I find the top-level name I’m looking for, click on that, and scan. It’s faster, and I don’t get a cache miss if I remembered the exact name incorrectly.

Better search would reduce cache misses, but it’s not going to make search faster than browsing. Browsing I can also do in my head, while coding, and that too becomes more frequent with experience.

I maintain that the best possible arrangement of std, and let’s remember, it must be arranged, shoganai, is the arrangement which makes it easiest to learn to use.

Naming is for reading and understanding. That it gains a lot of importance in search and discovery is more of a shortfall in the tooling, and not about the structure of the namespaces.

The “tagging” idea is a nice one, but the reality is that it’s hard enough to find someone to write decent documentation without also requiring them to be a good editor and taxonomist/indexer – which are professional specialties that sadly have become dinosaurs.

The library analogy was meant to convey that a real effort to curate for perusing is more work than a simple rename, but incredibly valuable and worth striving for. Libraries have the constraint that books are physical items, and rearranging them with a snap of a finger to help search isn’t possible. One important difference is that the canonical location of a book in a library is not how you refer to the book when talking with friends. If every time you mentioned the algorithms book you were reading you said 001.6/42 Knuth, you might struggle at the book club. It’s just an analogy.

References to std need to be read by humans regularly, so short but effective names are valuable. All the computer needs is to be able to disambiguate between terms. The ability to peruse, search, and understand std is a separate function. It is worth imagining how to implement that function differently.

I’m going through this right now, and I have found it a pretty rough landing, to be honest. Which is very understandable for a new language pre-1.0, but I thought my “beginner’s mind” experience might be insightful/useful.

The first thing is that when I go to the standard library documentation, I land on a list of types and namespaces without explanation. Without some sort of preamble explaining the organizing principles behind the standard library, I have to come up with my own hypotheses by inference. My first, and mostly unconscious, assumption is that these two things (types and namespaces) will be mirrors of each other, especially when I see ArrayHashMap and array_hash_map. It then requires cognitive effort to disprove that hypothesis (taking a look at both and keeping something in my so-called “working” memory :sweat_smile:), and until disproven the hypothesis is actually a drag on my rate of learning.

I’m the type of nerd that likes to read the documentation from start to finish (and also the kind of nerd that is easily distracted :upside_down_face:). When you are trying to take in an entire language/framework/runtime/etc to build a nebulous mental schema of its possibilities, it really helps when the documentation contains a lot of examples AND each big idea/category has its own introductory preamble explaining the concepts in each section (documentation is a tree after all :sweat_smile:). Longer term, I suspect those preambles would also help new contributors land their changes in the right areas and reduce some of the discussion-induced maintenance burden.

So while a description of each big section/namespace helps newcomers build strong branch nodes in their mental schema to which to attach new learnings, having lots of examples in the documentation helps immensely with discoverability.

This was something I actually thought Deno did quite well recently – their documentation is basically structured in three parts:

  • Introduction + Fundamentals, which can be read relatively quickly (maybe 1.5hrs?)
  • Examples, which are immensely helpful for discoverability
  • Reference, which is pretty analogous to the current zig std library documentation. However, I will say that comprehensive explanations are a bit hit and miss throughout their reference documentation, e.g. this one provides a great explanation of why you might use it and the principles behind it, while this one simply describes what the name means but not what it does. I’ll go through a reference document like this and read every instance of the former example – then when I run into a problem I don’t yet know how to solve, I get a spidey tingle that there’s something useful in the documentation.

I found I couldn’t read the standard library documentation front to back like that – I tried, and I will probably try again sometime.

The other thing I personally find really helpful with documentation is when it’s written in active voice, describing how (and why) to use it. For example, this is the current description of Io.Queue:

Many producer, many consumer, thread-safe, runtime configurable buffer size. When buffer is empty, consumers suspend and are resumed by producers. When buffer is full, producers suspend and are resumed by consumers.

Whereas, in my opinion, I would find it easier to understand if it were something like

Use an Io.Queue when you need one-or-many threads each sequentially consuming items exactly-once from shared memory, and/or one-or-many threads producing items for consumption. For example, distributing jobs to a worker pool.

Io.Queue suspends your application’s consumer threads when the queue is empty, and automatically resumes them when a producer re-populates the queue by adding an item.

Likewise, once the queue is full it suspends producers by blocking/not-resolving requests to add items to the queue. Keep in mind that this blocks producer threads! Blocking producer threads is by design, allowing you to design your application to conserve resources upstream or downstream of the queue.

I made up a lot of stuff because I’ve never actually used Io.Queue before :sweat_smile: but I hope it articulates what I mean. I would still have questions after reading that, tbh. Like, does it guarantee serial execution? I kind of assume that it doesn’t because otherwise there wouldn’t be any reason to have multiple consumers, but maybe that’s configurable? There are use cases I can imagine where I might want a queue to be locked by a consumer until it reaches a particular step in its process, at which point it can free the queue for other consumers to take items out of but it will go about doing some further processing before picking up another job :person_shrugging:

I guess what I’m getting at is that, yes, arranging the code in the standard library such that it’s neither unnecessarily broad (over-categorization) nor unnecessarily narrow (under-categorization) is important, but it’s also subjective and therefore impossible to perfect, while suffering from diminishing returns on investment. Looking at documentation as an ecosystem and using complementary techniques can help mitigate some of the issues that arise from inevitable tradeoffs.

(:grimacing: Sorry for writing a wall of text, I’m really into this sort of thing :joy:)

Welcome to Ziggit @Antman, and welcome to Zig!

This is useful feedback to have, thanks for that.

The solution Zig has developed here is unusual. Several concepts work together: a “container type” creates a namespace; a struct is one kind of container; a struct may or may not have fields; a file is itself a struct, which likewise may or may not have fields; and std is just one of those (no fields on std).

All of this is explained in the core documentation, but so is everything else. The “files are structs” part is right at the bottom, and the struct section does not mention it at all.

It’s a great design IMHO, but I myself remember it taking a while to line up all the facts and get a unified picture of what was going on.
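
A minimal sketch of how those pieces line up, in case it helps. `Point` and `shapes` are made-up names for illustration; the builtins (`@import`, `@TypeOf`, `@hasDecl`) are the real ones. It also hints at why the docs show both TitleCase and snake_case entries: by Zig’s naming convention, a struct you instantiate is TitleCase, while a field-less struct used only as a namespace is snake_case.

```zig
const std = @import("std");

// A container type with fields: an ordinary struct you can instantiate.
const Point = struct {
    x: i32,
    y: i32,

    // Declarations share the struct's namespace with its fields.
    pub fn sum(self: Point) i32 {
        return self.x + self.y;
    }
};

// A container type with no fields: in practice, just a namespace.
const shapes = struct {
    pub const origin = Point{ .x = 0, .y = 0 };
};

test "structs, namespaces, and files are the same kind of thing" {
    // Both are values of type `type`.
    try std.testing.expect(@TypeOf(Point) == type);
    try std.testing.expect(@TypeOf(shapes) == type);

    // `@import` returns the struct a file defines, so `std` is one too,
    // and `std.mem` is simply a declaration inside it.
    try std.testing.expect(@TypeOf(std) == type);
    try std.testing.expect(@hasDecl(std, "mem"));
    try std.testing.expect(@hasDecl(shapes, "origin"));

    try std.testing.expectEqual(@as(i32, 0), shapes.origin.sum());
}
```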

There’s a concept of four kinds of documentation, which isn’t holy writ or anything, but I’ve personally found it helpful as an organizing principle. The documentation you’ll find under the ziglang.org URL is reference material; you’re looking for explanation, and quite right to.

For tutorials, folks seem to get a lot of mileage out of ziglings, there’s the zig.guide for how-to, but less explanatory material than I’d like. There’s Ziggit though. :slight_smile:

Things are still changing fast, especially in terms of the standard library. That tends to mean less thorough documentation, especially in the three quadrants which aren’t reference. It’s a lot to take on: it shouldn’t be the core team’s focus, and for volunteers who aren’t part of core the maintenance burden after finishing something like that is substantial.

There should be more comments in the doc comments which produce the std library reference, I’d say that’s broadly agreed upon. It’s been improving over time. I don’t think the kind of explanatory material you want should be interspersed with the reference. It would be great to have a big expository website which follows the std library reference structure and fills in all those blanks, but well. Someone would have to do it, is the thing.

Like many things, this is a function of available talent and enthusiasm, and maturity over time. Hopefully both of these things will grow with each other.

I hope someone adds a top doc comment at std.zig. Something for the first-timers to read.

//! Containers:
//! * `ArrayList`
//! * ...
//!
//! Input and output:
//! * `Io.File.stdout`
//! * ...
//!
//! Implementations of `mem.Allocator`:
//! * `heap.ArenaAllocator`
//! * `heap.page_allocator`
//! * ...
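
To make that last group concrete, here is a small sketch of the split the thread started with: code is written against the interface type `std.mem.Allocator`, and the caller picks an implementation from `std.heap`. `makeGreeting` is a hypothetical function for illustration; the allocator names are the ones listed above.

```zig
const std = @import("std");

// Written against the interface in std.mem; it neither knows nor
// cares which implementation from std.heap is behind it.
fn makeGreeting(allocator: std.mem.Allocator, name: []const u8) ![]u8 {
    return std.fmt.allocPrint(allocator, "hello, {s}", .{name});
}

test "interface from std.mem, implementation from std.heap" {
    // An arena backed by the page allocator; deinit frees everything at once.
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();

    const greeting = try makeGreeting(arena.allocator(), "zig");
    try std.testing.expectEqualStrings("hello, zig", greeting);
}
```

Swapping the arena for another implementation changes only the caller, not `makeGreeting`.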

I’m sure you couldn’t have really known this, but Io.Queue is a funny example to pick on because it’s so new; like, it doesn’t exist in Zig 0.15.2