How can "reusable" and "no dependencies" software coexist?

One of the stated goals of the ZSF is “reusable” software.

We also see “zero dependencies” worn as a badge of honor by many Zig projects.

How can these two ideas coexist? Should we not be encouraging depending on each other’s code?

Here is a concrete example: I have an application requiring data exchange in the CBOR format. I could quite feasibly hand-write the encoding for the ~10 fixed-size types I need to encode/decode, or I could depend on r4gus/zbor: CBOR de-/serializer written for Zig - Codeberg.org. I chose to depend on zbor. It has been well maintained, and it seems to have gotten the API right early on, with few breaking changes over the last year for my limited use case. I am happy with this decision.
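
For readers curious what “hand-writing the encoding” would actually involve, here is a minimal sketch in C of CBOR unsigned-integer encoding (major type 0) following the RFC 8949 wire layout. The function name and error convention are made up for illustration; a real encoder would also need the other major types:

```c
#include <stddef.h>
#include <stdint.h>

/* Encode a CBOR unsigned integer (major type 0) into buf.
   Returns the number of bytes written, or 0 if buf is too small.
   Sketch only: the name and 0-on-error convention are hypothetical. */
size_t cbor_encode_uint(uint64_t v, uint8_t *buf, size_t cap) {
    if (v < 24) {                     /* value fits in the initial byte */
        if (cap < 1) return 0;
        buf[0] = (uint8_t)v;
        return 1;
    } else if (v <= UINT8_MAX) {      /* additional info 24: 1-byte argument */
        if (cap < 2) return 0;
        buf[0] = 0x18;
        buf[1] = (uint8_t)v;
        return 2;
    } else if (v <= UINT16_MAX) {     /* additional info 25: 2-byte big-endian */
        if (cap < 3) return 0;
        buf[0] = 0x19;
        buf[1] = (uint8_t)(v >> 8);
        buf[2] = (uint8_t)v;
        return 3;
    } else if (v <= UINT32_MAX) {     /* additional info 26: 4-byte big-endian */
        if (cap < 5) return 0;
        buf[0] = 0x1a;
        for (int i = 0; i < 4; i++)
            buf[1 + i] = (uint8_t)(v >> (24 - 8 * i));
        return 5;
    } else {                          /* additional info 27: 8-byte big-endian */
        if (cap < 9) return 0;
        buf[0] = 0x1b;
        for (int i = 0; i < 8; i++)
            buf[1 + i] = (uint8_t)(v >> (56 - 8 * i));
        return 9;
    }
}
```

For a fixed set of ~10 types this kind of code is small and auditable, which is exactly what makes the hand-written option plausible in the first place.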

Another frustration I have is that it feels hard to produce truly reusable software. When I write a library, I believe I need to provide multiple allocation methodologies (gpa, arena, no-allocation, iteration) and multi-OS/multi-arch compatibility. I even have to think about embedded (like, I don’t want to require a file… the user might not have a file system!). I don’t own a Mac, and I don’t know anything about Windows either. This is hard. I think the Writer changes alone have made this a lot easier lately by enabling a default abstraction. Hopefully the I/O changes make this easier still.

8 Likes

“zero dependencies” should be seen as a goal, not an actual concrete characteristic of a project. As in, one should strive to have as few dependencies as possible. If your project is a library for parsing text, it is reasonable to literally have no dependencies. However, if your project is a library that provides higher level functions that depend on a certain text format, it is also reasonable to rely on a library that already does the parsing well.

15 Likes

The curse of a rich stdlib :wink:

In a way, in C these decisions are easier. Since the C stdlib sucks, it’s better for reusable libraries to not depend (too much) on it.

For instance, IMHO it’s always a good idea for a “data processing library” to have low-level functions that accept input data in memory (for instance, as a Zig slice). Higher-level functions are optional convenience goodies (like reading from a Reader or directly from the filesystem).

Once you have that low-level “read data from memory chunks” function in a library, users of the library can build their own higher level wrapper layers around it, even going as far as calling the processing function for little chunks of data in parallel from multiple threads (assuming the processing function is “pure”).
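A rough sketch of that layering in C (all names hypothetical, with a simple FNV-1a hash standing in for the “data processing”): the pure core consumes memory chunks and carries its state explicitly, so callers can feed it chunks from a file, a socket, or parallel workers; the stream wrapper is a thin optional layer on top, and it is the only part that touches libc I/O:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Low-level core: a pure function over a chunk of memory.
   Chunked calls compose: feeding "ab" at once equals feeding
   "a" then "b", so callers control the chunking strategy. */
uint32_t fnv1a_update(uint32_t state, const uint8_t *data, size_t len) {
    for (size_t i = 0; i < len; i++) {
        state ^= data[i];
        state *= 16777619u;     /* FNV-1a 32-bit prime */
    }
    return state;
}

/* Optional higher-level convenience wrapper: reads from a stream.
   This is the only layer that depends on FILE / the OS. */
uint32_t fnv1a_stream(FILE *f) {
    uint32_t state = 2166136261u;   /* FNV-1a 32-bit offset basis */
    uint8_t buf[4096];
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        state = fnv1a_update(state, buf, n);
    return state;
}
```

Because the core never allocates and never performs I/O, it ports unchanged to embedded targets without a file system.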

TL;DR: IMHO, even with a rich stdlib I think it’s a good idea to think of it as “just another dependency”.

2 Likes

There are two very different perspectives on dependencies and security.

The first perspective holds that reducing external dependencies lowers the attack surface, because external code often contains features you don’t need that can be vulnerable. A typical example is log4j: depending on it also pulls in code that loads remote resources via the JNDI protocol, functionality most users never need, and that code turned out to be exploitable (Log4Shell). Centralized vulnerabilities have a greater impact, and more attackers scan for and attempt to exploit them.

The second viewpoint holds that reliance on centralization helps reduce the cost of code review. The code in this world is growing exponentially, and if the same functionality is repeatedly implemented, it is difficult to review each instance for potential vulnerabilities. Relying as much as possible on existing solutions rather than reinventing the wheel helps focus review efforts, reduces the amount of code that could be exploited, and enhances security.

A balance can be found between these two viewpoints. Introduce high-quality dependencies that meet your minimal needs, and avoid dependencies that are overly bulky. Review the relevant code of your dependencies to catch whatever vulnerabilities you can, which also contributes to supply-chain review for everyone.

1 Like

I think this is an excellent and very reasonable question to ask.

When introducing a dependency, the most important question you should be asking yourself is “how vulnerable will I be if the worst thing that I can imagine happens to this dependency?”. What will you do if it turns out that the package has a long-time critical security bug, or its development is taken in a direction that you don’t like, or it stops being maintained altogether?

The fewer dependencies you have, the fewer times you need to ask yourself this question, and the less key a dependency is, the easier it will be for you to answer. The way I would phrase my advice is as an ordered list of recommendations:

  1. No dependencies is better than some dependencies

    If you own and control 100% of the parts that make your software run, you will always be able to fix any problems that arise on your own, without needing to rely on someone else.

  2. No transitive dependencies is better than some transitive dependencies

    If you need to depend on something, it’s better if that dependency doesn’t have dependencies of its own, since that affects their ability to fix their and your problems (i.e., the recommendations in this list apply recursively).

  3. One dependent is better than many dependents

    Something that is depended upon by only one dependent (meaning: the node in the dependency graph has only one arrow going into it) will be easier to swap out or remove entirely than something that is depended upon by multiple dependents. If you have many dependencies, you at the very least want to avoid an xkcd 2347-like situation.

  4. One ubiquitous dependency that solves a specific problem is better than multiple ubiquitous dependencies that solve the same problem

    If you have a keystone dependency that you simply can’t replace, good interoperability will make the situation more bearable. You don’t want 80% of your stack to require you to use one specific solution to a problem (e.g. a specific logging library or a specific DateTime type) but the remaining 20% to use a different incompatible solution, since that would put more work on your plate to solve the incompatibilities.

The beautiful thing is that the less far down this list you traverse, the more reusable your software is. “Reusable” and “no dependencies” form a very symbiotic relationship because the most reusable software is that which has no dependencies.

(As an aside, this is also why initiatives like Zig’s std.mem.Allocator and std.Io interfaces are so important, since they provide you with a means to continue to use most parts of std even when std’s own default OS-specific implementations don’t work for you or fulfill all your needs.)

Of course, like @hachanuy already stated, zero dependencies should be an aspirational goal, not a be-all and end-all. Having zero dependencies is impossible. You depend on society to continue to function. All living people depend on the Earth to continue to orbit the Sun at just the right safe distance. Even autonomous space probes depend on the laws of the Universe (and possibly intergalactic laws that I’m not privy to :slight_smile:).

6 Likes

Good answers already, but I’ll add my take:

Reusable software isn’t just about shipping libraries in packages that are easily installed. It’s about:

  • Writing code that has clean abstractions with understandable effects. Somebody reusing it should only need to understand “what” it does, and not “how” it does it. So, no side-effects or side-channels.
  • Code should be composable, so it can be used with other code from another source. That might mean it exhibits flexibility in how it can be applied, or that it works with accepted standard practices. (Edit: a good example here is being able to inject allocators, writers, and concurrency models into library code, so it’s all under the control of the library user rather than the library writer.)
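
A rough C sketch of that injection idea (all names hypothetical; Zig’s std.mem.Allocator works similarly in spirit). The library never calls malloc directly; the caller supplies the allocation policy:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator interface: a pair of function pointers plus
   a context pointer, chosen by the caller, not the library. */
typedef struct {
    void *(*alloc)(void *ctx, size_t n);
    void  (*free)(void *ctx, void *p);
    void  *ctx;
} Allocator;

/* A library function that duplicates a string using the injected
   allocator. The user controls where the memory comes from. */
char *dup_string(Allocator a, const char *s) {
    size_t n = strlen(s) + 1;
    char *p = a.alloc(a.ctx, n);
    if (p) memcpy(p, s, n);
    return p;
}

/* One possible implementation backed by libc malloc/free. An embedded
   user could supply a fixed-buffer or arena implementation instead. */
static void *libc_alloc(void *ctx, size_t n) { (void)ctx; return malloc(n); }
static void  libc_free(void *ctx, void *p)   { (void)ctx; free(p); }

Allocator libc_allocator(void) {
    return (Allocator){ .alloc = libc_alloc, .free = libc_free, .ctx = NULL };
}
```

The same pattern generalizes to writers and concurrency primitives: the library codes against a small interface, and the application decides the concrete implementation.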

So the Zig language should make writing code that exhibits these qualities painless. It still requires a skilled author to do it, but the language shouldn’t get in your way. The standard library should also exhibit these qualities.

The “zero dependencies” aspect is an acknowledgement that the current “npm / cargo / pip install >>>> Downloading 573 packages” way of the world is madness. Just having a mindset where you see dependencies as something to be minimised is a good thing.

7 Likes

Dependencies and reusability are tightly connected. Every dependency you add reduces the ability of others to use your code, so choosing what to depend on should always be a deliberate decision (at least in the context of library code).

Dependencies create a real long-term burden for maintainers. As the dependency graph grows, fewer people are willing (or able) to maintain the resulting house of cards, and unmaintained code is not what I would call reusable.

A large dependency set also reduces portability in practice: not everyone can afford the time to fetch and compile everything, or the disk space, RAM, and system resources required to build the project.

Excessive dependencies also take flexibility away from downstream users. By pulling in a specific library to solve a problem, you often prevent someone from integrating your code with a different library that solves the same problem in a way better suited to their context.

Dependencies also harm readability and auditability. Locality matters: the larger the dependency surface, the harder the code becomes to review, reason about, and truly understand, because “understanding the project” now means traversing dozens of external repositories.

This is one reason Zig gets it right: the package manager is convenient and reliable, but not so frictionless that it encourages pointless dependencies for trivial functionality, and Zig’s std tries to provide the most useful common denominator. If the team hadn’t pushed for the Reader/Writer, Allocator, and Io interfaces, plus JSON, cryptography, and data structures in the std, we would have essential building blocks implemented in non-composable ways all around us, each depending on a subset of the others, and reusing code would be much harder.

As a rule of thumb, I believe we should aim to depend on libraries that do one thing and do it well, at least from a library standpoint. One of my favorite examples is rxi/log.c - C99 logging library. In the context of C, it’s nearly ideal: a single file, simple, readable, auditable, and reliable, and it solves a focused problem cleanly (I can’t stress enough how much I love and have used that library).

This doesn’t mean every dependency must be tiny. But dependencies should be focused and composable. That’s what enables real reuse imho.

At the same time, it’s hard to build something meaningful without using someone else’s code, so dependencies aren’t bad in themselves, but too many or too few of them is obviously an issue.

4 Likes

I think this talk does a good job tackling the heart of this question:

Warning: it’s loud in the beginning

4 Likes

I would start with a dependency like a CBOR parser, especially if it’s not fully clear which parser features you might need.

Once the use case is well defined and you know that you will, for example, only parse a few message types with a fixed schema, it might be nice to drop the dependency and implement a highly optimized SIMD parser for just those message types, which could be much faster.

That is, if it fits the tool to optimize for speed instead of providing easy schema extensibility.

Out of curiosity, who is the intended user for the libraries you write?

I feel like ideally, there’s a little bit of an evolution. First you start with something that solves just your own problem, or maybe a problem your company has. Then you share the code, find that others get a lot of value from it too, so you put more effort into documentation and making it more general purpose. Ideally, this is happening while people with these expanded use cases are contributing code, ideas, fixes, or just testing it in production, and everyone wins.

However, a pattern I see pretty often is people starting a library by trying to solve the problem for everyone and every use case, which feels like putting the cart before the horse.

I like this perspective.

Attempting to write for general use falls all too readily into the YAGNI trap.

It’s also easy to mistake a working implementation for an API. There’s a lot of that from some giant software corps: just expose an implementation and call it an API. It takes a lot more than a working implementation to make a good library/API (documentation being just one part of that).