Move data-structures to a namespace

I feel like keeping the data-structures in the same level as other types makes the standard library documentation a bit cumbersome to read. Has this already been considered before?

The docs literally have separate sections for types, functions, namespaces, and other declarations?

I know. I meant moving the data-structures to a namespace of their own, similar to the allocators in heap, instead of putting them in the top level together with Build, Io, Thread, etc.

It’s not really a docs thing, it’s more like the way the lib itself is organized. Maybe the way I phrased it wasn’t the clearest.

Just so others understand: is your proposal to move things such as std.dataStructName to something like std.dataStructures.name?

There has been lots of discussion about this but it’s not always easy to find. Here’s one example, starting with this post in a larger thread:

Something like that, yes. My proposal is to move things like std.ArrayList, std.DoublyLinkedList, and std.HashMap to a namespace like std.data_structure or std.ds.
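
For illustration, here is roughly what that grouping would look like from the user's side. This is only a sketch: `ds` is a hypothetical local namespace built out of aliases, not something that exists in std.

```zig
const std = @import("std");

// Hypothetical grouping namespace; `ds` is not part of std.
// A struct with no fields acts as a namespace, and each `pub const`
// is just an alias for the existing top-level declaration.
pub const ds = struct {
    pub const ArrayList = std.ArrayList;
    pub const DoublyLinkedList = std.DoublyLinkedList;
    pub const HashMap = std.HashMap;
};

test "alias and original produce the same type" {
    try std.testing.expect(ds.ArrayList(u8) == std.ArrayList(u8));
}
```

Since the aliases resolve to the same declarations, the only thing that changes is the fully qualified name a reader sees.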

Okay, but every type is a data structure. Clearly you have some criteria.

What is it, and why is it better?

Various things already have their own namespace, but common things are re-exported.

For example: std.ArrayList is a small wrapper for std.array_list.Aligned, that uses the default alignment for the item type since that’s very, very common.
There is std.ArrayListAligned which is just an alias for std.array_list.Aligned.
There is also the deprecated std.array_list.Managed which is not re-exported to discourage its use as it will be removed.

The array_list namespace mostly serves to separate all the ArrayList-specific tests.

Similar thing with std.hash_map.
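
To make the re-export pattern concrete, here is a minimal sketch. The names `fixed_list`, `WithCapacity`, `FixedList`, and `FixedListWithCapacity` are hypothetical stand-ins chosen for this example; this is not the actual std source, just the same shape: a namespace holding the general form, a small wrapper that fills in a common default, and a plain alias.

```zig
const std = @import("std");

// Hypothetical stand-in for a namespace such as std.array_list.
// The general form lives here, along with anything specific to it.
pub const fixed_list = struct {
    pub fn WithCapacity(comptime T: type, comptime capacity: usize) type {
        return struct {
            buffer: [capacity]T = undefined,
            len: usize = 0,
        };
    }
};

// Re-exported one level up as a small wrapper that fills in a
// common default (16 is an arbitrary choice for this sketch).
pub fn FixedList(comptime T: type) type {
    return fixed_list.WithCapacity(T, 16);
}

// Re-exported as a plain alias for callers who want the general form.
pub const FixedListWithCapacity = fixed_list.WithCapacity;

test "wrapper and alias resolve to the namespaced type" {
    try std.testing.expect(FixedList(u8) == fixed_list.WithCapacity(u8, 16));
    try std.testing.expect(FixedListWithCapacity(u8, 4) == fixed_list.WithCapacity(u8, 4));

    const list: FixedList(u8) = .{};
    try std.testing.expect(list.len == 0);
    try std.testing.expect(list.buffer.len == 16);
}
```

Leaving a declaration out of the re-exports, as with the deprecated one mentioned above, keeps it reachable through the namespace without advertising it at the top level.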

If we take the reasonable gloss of “data structure” meaning “you can put data in it”, then no, this isn’t true.

std is a type, it is not a data structure. ArrayList is a data structure type. I call them “instantiable types” because it’s as clear as I can say it, but data structure ain’t bad.

I’ve said plenty already about how I’d prefer to see things organized, and “move all the data structures to std.data_structure” isn’t quite it.

But I do agree with OP that having a bunch of data structure types just hanging out directly off of std isn’t ideal, especially for new users.

std is a type because Zig uses types as namespaces and doesn’t have a non-type alternative.

I think it’s reasonable to ignore such types in this discussion. I should have been clearer and said “instantiable types”.
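
For anyone following along, the distinction can be checked directly. A minimal sketch, where `shapes` and `Point` are hypothetical names made up for this example:

```zig
const std = @import("std");

// Hypothetical namespace: a struct with no fields, used only to
// group declarations.
const shapes = struct {
    // An instantiable type: it has fields, so values of it hold data.
    pub const Point = struct { x: i32, y: i32 };

    pub fn origin() Point {
        return .{ .x = 0, .y = 0 };
    }
};

test "namespaces and instantiable types are both of type `type`" {
    try std.testing.expect(@TypeOf(std) == type);
    try std.testing.expect(@TypeOf(shapes) == type);
    try std.testing.expect(@TypeOf(shapes.Point) == type);

    // The namespace has no fields and so carries no data;
    // the instantiable type does.
    try std.testing.expect(@sizeOf(shapes) == 0);
    try std.testing.expect(@sizeOf(shapes.Point) != 0);

    const p = shapes.origin();
    try std.testing.expect(p.x == 0 and p.y == 0);
}
```

So the language itself does not separate the two; the difference is only in whether the struct has fields you would put data in.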

I am not arguing against reorganising std, I just don’t understand OP’s criteria for what would go in there.

Per the Zig guidelines, which the Zig std follows:

Everything is a value, all types are data, everything is context, all logic manages state. Nothing is communicated by using a word that applies to all types.

Every declaration is assigned a fully qualified namespace by the compiler, creating a tree structure. Choose names based on the fully-qualified namespace, and avoid redundant name segments.

Consider std.datastructures.ArrayList: ArrayList is already a data structure, so the extra segment is redundant. It reduces to std.ArrayList.

I think this rule is so solid that even with the 1.0 std overhauls, I expect those things to remain the same.

Data structures, what we’re both calling “instantiable types”.

I too think the simpler ones, like ArrayList, the ones which don’t do double-duty as namespaces, should not be directly off std. I think that’s what OP is saying, except more specifically to move all of them into a single namespace container, which I don’t think is ideal.

For one thing, std is not the only namespace with data-structure / instantiable types in it. Which is good, I certainly wouldn’t want, idk, std.json.ObjectMap moved to std.data_structure, that doesn’t sound helpful either.

Yeah. I meant specifically the ones directly under std, other data-structures that already reside in a namespace or “fat-type” should stay where they are.

At the same time, std.os exists despite being essentially just a category. That’s why I thought having a std.ds might not be infeasible. std.heap and std.fs seem like they could also fall into this bucket, maybe.

std.windows is not a great fully qualified name. Are we talking about doors, or sliding-window algorithms?

std.os.windows sounds like the most correct fully qualified name. And any OS names can be anything, so namespacing them in os is appropriate.

std.heap, however, is a relic of its time and will probably be overhauled before 1.0. As for std.fs, I am not sure; it will probably be promoted to std.path. But are we considering PATH a separate thing in std? Then maybe std.fs.path is still needed.

This is why you should always take the name into account, rather than following a random, OOP-ish “yes, this thing is X, so let’s subclass / inherit the X namespace” line of thinking.

This made a lot of sense. But still, having the simple data-structure types together with the “fat-types” (the ones that double as namespaces) seems a bit weird to me.

Define “weird”. I would disagree that the way it currently is hurts discoverability.

This issue has been bikeshedded enough, but I feel like a lot of the feedback on this topic comes from some vague feeling of comfort people get from organizing things into categories.

IMHO, there is nothing more discoverable than the first page of the docs showing the core data structures, like ArrayList. A data_structure namespace would only serve to obfuscate the arguably most-used types in the standard library.

I think this is more of a documentation discovery problem. Traversing the std as a tree, top down, is one way to learn the std (and it is currently the only way, encouraged by the official generated std docs themselves!). I do agree that it sucks though, in whatever documentation does this (Rust’s is another example).

However, bending the fully-qualified names to make learning the std through this traversal easier is not justified enough IMO. “Easier to learn” is too vague a criterion. By contrast, the rules in the official guidelines are definitive and actionable, so we can argue less about names when authoring the std.

Human learning problems should be solved by humans writing better docs. The current autogenerated std docs are nice if one already knows what to look up, but for discovery/onboarding new folks… they are very ass, even for an experienced Zig user.

(On the other hand, the Zig std is mutating day by day. I blinked once and std.posix was mostly gone, so I imagine this can only be done when we are very close to 1.0 and someone can sit down and write something. Otherwise it will be abandoned; that has already happened to two unofficial Zig guides.)

I understand the problem with over-categorization, but the thing is that if it’s not deliberately categorized, such as putting it in a namespace (or some other way), then it’ll be automatically categorized, it’ll fall into the “Types” section along with unrelated objects.

I would say it’s weird in the sense that you have so many of a type of thing, and that this type of thing is the only one that suffers from this, along with the other objects seeming like a natural fit for this automatic/generic/misc category.

This is, admittedly, a very documentation-centric view, and I have not considered how it affects other workflows. Maybe I’m really being too fixated on the “aesthetics” of this and not taking into account the other aspects involved, I don’t know.

I don’t have much to offer since I think you have gotten to the crux of it. I agree that it is probably an aesthetic thing on your end. It is impossible to measure what is the best structure of the docs or namespaces, as at the end of the day it is a matter of taste or opinion.

Practically speaking, I personally quite like the minimalism of the Zig docs. My experience is that it heavily encourages stepping into the code to get your answers, versus using the docs pages as your main source of truth.

I think I prefer this approach, in contrast to something like Rust docs. I found Rust docs to always be very doc-comment heavy; but, when I wanted to actually explore the code to understand certain details, I had to jump through so many different files or pages, and sift through a sea of comments, to get to the ground truth, simply because of the way Rust developers “organize” their code. It may be a symptom of their build parallelism being tied to crates, but I digress.

Personally, I think of the Zig docs more as a fuzzy search tool, than the ultimate source of truth. Reading code is good, so let’s optimize for that.

Quite frankly, with how many types double as a surrounding namespace by now (like std.Io), I am getting mildly annoyed that the documentation generator separates them from normal “namespaces”. It makes browsing the docs annoying.

I don’t agree with this.

It could be measured: time users with various levels of experience as they use the front page of the documentation, A/B testing various arrangements.

Impractical? Of course. But unlike The Rules, this criterion is objective and manifestly useful.

I’ll take the other side of that bet.

That’s categorizing! The Church Fathers say it’s sin!

Nuh uh! You’re categorizing!
