Documentation, Dependencies, Higher abstraction interfaces

removewingman · December 28, 2024, 8:30pm

Hello,
I have been experimenting with Zig (really enjoyable) and have some ideas. I was wondering what you think of them.

Split Up Build, Fmt, ZigC

The Build System, Zig Compiler and Formatter are different things and can be independent programs. (similar to rustc, cargo and rustfmt)
This makes it simpler to learn, maintain and document. For documentation I would really appreciate man pages.
This could be a long term goal, because now, in early development, the goal is probably to just push out features.

Dependency Management without Versions

This is what a dependency file could look like:

[git]
notstd https://codeberg.org/removewingman/notzigstd.git
zcom https://github.com/ziglibs/zCOM.git
zelda https://github.com/haze/zelda.git

[local]
routez # version 0.6
zig-network # branch stable

This makes it really simple to just use the latest dependency every time your project is built.
You always have the latest updates, which is a big benefit for security, and why would you not want to use the latest dependency? It is still possible, e.g., to always use the stable branch or to use versioning, but it is opt-in. This also makes writing and maintaining libraries easier, since there is only one version: main/latest.

Higher Abstraction Level Interfaces

The Zig std library is low level, which is completely fine. I am suggesting providing higher abstraction interfaces on top of it, to increase reusability and simplicity. This also follows from the Zig zen:

Communicate intent precisely.
Reduce the amount one must remember.

I started to experiment with this a little bit here.
Some examples include:

file.contentWith1024Buffer(allocator: Allocator, file: File) !ArrayList(u8)
stdio.readStdinWith1024Buffer(allocator: Allocator) !ArrayList(u8)
split.splittedScalar(comptime T: type, allocator: Allocator, toSplit: []const T, delimiter: T) !ArrayList([]const T)
// these could be made more declarative

The idea is to do things once, test them and then reuse them. This makes the programs that use them a lot simpler, readable and safer.
Is this a philosophy/direction that Zig wants to go in? Is there already something that provides these levels of abstraction?

Feel free to point out anything. Would love to hear what you think.

Since these ideas are kind of abstract, I put them in one post. If the discussion goes into more detail, I should probably make separate ones.

squeek502 · December 28, 2024, 8:58pm

Probably the opposite, if anything. See this PR where the plan is for the Unmanaged version of ArrayHashMap to become the default and no ‘managed’ version will be provided (and there are similar plans for ArrayList, etc if I understand correctly).

castholm · December 28, 2024, 9:38pm

Always automatically fetching and using whatever the latest version of a dependency is would be irresponsible and an enormous security vulnerability. All it would take is for one of your dependencies’ repos to get hacked and injected with malicious code for you to be compromised.

Zig already has a pretty well defined idea of how it wants the package manager to work: packages are identified by the hash of their contents, and dependencies are declared by specifying the hash of the package you want to depend on. The URL is only a hint to the package manager for how and from where to fetch the package if not already fetched. When fetching a dependency, the hash of the package contents is verified against the declared hash.

This means that there is a bit more grunt work required by the user to add dependencies and keep them updated, but it comes with the upside that you can always trust that the code from dependencies that gets compiled into your program is what you have already vetted to be safe.
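
For readers unfamiliar with the format, here is a rough sketch of how a hash-pinned dependency is declared in `build.zig.zon`. It assumes the manifest layout generated by `zig init` around Zig 0.13 (later releases changed some top-level fields). The package name, URL, and hash below are placeholders for illustration, not real values.

```zig
// build.zig.zon (sketch, Zig 0.13-era layout; all values are placeholders)
.{
    .name = "myproject",
    .version = "0.0.0",
    .dependencies = .{
        // Hypothetical dependency name used only for this example.
        .somepackage = .{
            // Only a hint for where to fetch the package from.
            .url = "https://example.com/some-package.tar.gz",
            // Content hash that the fetched package is verified against.
            .hash = "<hash reported by the package manager>",
        },
    },
    .paths = .{ "build.zig", "build.zig.zon", "src" },
}
```

Running `zig fetch --save <url>` fetches the package and writes the entry, including the hash, so the hash does not have to be computed by hand.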

removewingman · December 28, 2024, 9:57pm

Unfortunate, thanks

removewingman · December 28, 2024, 10:06pm

I agree. A vulnerability would spread way faster. This is why you could have two branches, main and stable, or something like this. Or if you always want to have the same state/version, just put it into your local repo. I just think this makes upgrading way easier, because you never have to do it. But if Zig already has an established way of doing this, I am probably going to stick to that. Thanks

IntegratedQuantum · December 28, 2024, 10:10pm

The build system requires the zig compiler though. And also I think it’s good to have them all bundled together in one executable to simplify the installation process.

To be honest, I don’t think the abstractions you listed here are really that useful, or they already have a standard library function that’s close enough, and I’d prefer not to bloat the standard library too much.

What is the point of splitting into a list? I personally have never seen a case where I’d prefer a list over the iterator split currently returns. It would even make my code more complicated because I’d have to free the list.
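
As a small illustration of the point about the iterator, here is a sketch using `std.mem.splitScalar`. The helper `countFields` is a hypothetical name made up for this example; the std call is the one mentioned in the thread, as it existed in Zig 0.13.

```zig
const std = @import("std");

/// Hypothetical helper: counts the fields in `line` without allocating.
/// The iterator only refers to the input slice, so there is nothing to free.
fn countFields(line: []const u8) usize {
    var it = std.mem.splitScalar(u8, line, ',');
    var count: usize = 0;
    while (it.next()) |_| count += 1;
    return count;
}

test "split iterator needs no allocator and no free" {
    var it = std.mem.splitScalar(u8, "a,b,,c", ',');
    try std.testing.expectEqualStrings("a", it.next().?);
    try std.testing.expectEqualStrings("b", it.next().?);
    try std.testing.expectEqualStrings("", it.next().?); // empty fields are kept
    try std.testing.expectEqualStrings("c", it.next().?);
    try std.testing.expect(it.next() == null);
    try std.testing.expectEqual(@as(usize, 4), countFields("a,b,,c"));
}
```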

Instead of your file.contentWith1024Buffer or stdio.readStdinWith1024Buffer, why not just use file.reader().readAllArrayList(&list, 1024)? The latter is only slightly more verbose because you need to separately declare the list, but it is more readable (because it isn’t hidden behind a random function) and is likely more efficient as well.
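
A minimal sketch of that call, assuming the Zig 0.13-era std API (the reader interface has changed in later releases). The temporary file and its contents exist only to make the example self-contained. Note that, as far as the 0.13 documentation describes it, the second argument is a cap on how many bytes may be appended rather than a buffer size; exceeding it yields `error.StreamTooLong`.

```zig
const std = @import("std");

test "readAllArrayList appends to a caller-owned list" {
    // Set up a throwaway file so the example can run on its own.
    var tmp = std.testing.tmpDir(.{});
    defer tmp.cleanup();
    const file = try tmp.dir.createFile("hello.txt", .{ .read = true });
    defer file.close();
    try file.writeAll("hello zig");
    try file.seekTo(0);

    // The caller declares and owns the list, so ownership is visible here.
    var list = std.ArrayList(u8).init(std.testing.allocator);
    defer list.deinit();

    // 1024 is the maximum number of bytes that may be appended.
    try file.reader().readAllArrayList(&list, 1024);
    try std.testing.expectEqualStrings("hello zig", list.items);
}
```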

removewingman · December 28, 2024, 10:37pm

The fact that the build system needs the Zig compiler can be expressed as a dependency. There is not much overhead if you need to install 3 binaries instead of one. You could even configure your package manager to install zigfmt and zigbuild automatically when zig is installed.

The Zig std lib should stay as low level as it currently is. I was trying to demonstrate what one could build on top of it to get higher abstraction. The examples could be better, I agree; I just tried to get the idea across.

The lists are because I want to get into a somewhat functional programming style as soon as possible, that is why I create the list in the function and return it. To not create the incentive to have a mutable state all over your application, that is also why I want a list and not an iterator. Just my personal preference and on this low level I do not care about performance.

Thanks

IntegratedQuantum · December 28, 2024, 11:01pm

To install one binary, you need to download it, extract it and add it to the path. New users will need to do all of these 3 steps, and each of them is a possible point of failure. And doing it 3 times increases the chance of failure.
And no, putting the responsibility on some third party package manager is not good enough. What about all the Windows users without a package manager? What about people who need multiple versions of Zig because some old project only works on that older version?

But isn’t a list the ultimate form of mutable state? You can add, remove and modify all entries, whereas with a split iterator you can only modify the iterator state itself.
I guess maybe if you do not even want to have var in your code at all cost, then a []const []const u8 would be the better choice here.

removewingman · December 29, 2024, 5:35am

Yes, []const []const u8 would be better, but it is not possible as far as I know/tried, because the size of the list is not known at compile time.

IntegratedQuantum · December 29, 2024, 10:37am

[]const []const u8 can have a runtime-known size.
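
To make this concrete, here is a sketch of a function that returns a `[]const []const u8` whose length is only known at runtime. `splitToSlice` is a hypothetical helper written for this example, and the `std.ArrayList(...).init(allocator)` form assumes the managed list API of Zig 0.13.

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;

/// Hypothetical helper: splits `input` on `delimiter` and returns an owned,
/// immutable slice of fields. The fields point into `input`; only the outer
/// slice is allocated, so the caller frees just that.
fn splitToSlice(allocator: Allocator, input: []const u8, delimiter: u8) ![]const []const u8 {
    var list = std.ArrayList([]const u8).init(allocator);
    errdefer list.deinit();

    var it = std.mem.splitScalar(u8, input, delimiter);
    while (it.next()) |part| try list.append(part);

    // Hands the backing memory to the caller as a plain slice.
    return try list.toOwnedSlice();
}

test "slice length is only known at runtime" {
    const parts = try splitToSlice(std.testing.allocator, "a,b,c", ',');
    defer std.testing.allocator.free(parts);
    try std.testing.expectEqual(@as(usize, 3), parts.len);
    try std.testing.expectEqualStrings("b", parts[1]);
}
```

A slice is a pointer plus a length, so the length is a runtime value; only arrays such as `[3][]const u8` need a compile-time size.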

removewingman · December 29, 2024, 4:59pm

Thank you, I did not know that, changed it.

n0s4 · December 30, 2024, 5:12pm

removewingman:

file.contentWith1024Buffer(allocator: Allocator, file: File) !ArrayList(u8)
stdio.readStdinWith1024Buffer(allocator: Allocator) !ArrayList(u8)
split.splittedScalar(comptime T: type, allocator: Allocator, toSplit: []const T, delimiter: T) !ArrayList([]const T)
// these could be made more declarative

All of these already exist in std in a more sensible form:

std.fs.File.readToEndAlloc
std.fs.File.readToEndAlloc (with stdin as the file argument)
std.mem.splitScalar
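
A short sketch of how these fit together, assuming the Zig 0.13-era std API. `readAll` and the 1 MiB limit are assumptions made for this example; the temporary file only exists so the test can run on its own.

```zig
const std = @import("std");

/// Hypothetical wrapper: reads everything from `file` into an owned buffer.
/// Any std.fs.File works here; in Zig 0.13 stdin could be passed as
/// std.io.getStdIn(). The 1 MiB cap is an arbitrary choice for the sketch.
fn readAll(allocator: std.mem.Allocator, file: std.fs.File) ![]u8 {
    return file.readToEndAlloc(allocator, 1024 * 1024);
}

test "readToEndAlloc followed by splitScalar" {
    var tmp = std.testing.tmpDir(.{});
    defer tmp.cleanup();
    const file = try tmp.dir.createFile("data.txt", .{ .read = true });
    defer file.close();
    try file.writeAll("x y z");
    try file.seekTo(0);

    const allocator = std.testing.allocator;
    const content = try readAll(allocator, file);
    defer allocator.free(content);

    // Iterate over the words without any further allocation.
    var it = std.mem.splitScalar(u8, content, ' ');
    var count: usize = 0;
    while (it.next()) |_| count += 1;
    try std.testing.expectEqual(@as(usize, 3), count);
}
```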