Zig API naming: single-letter generics with no explanation?

This is somewhat theoretical, using HashMap as an example, but the underlying concern is about readability when trying to understand less-used parts of std.

Zig std API signature as shown via LSP:

pub fn ArrayHashMapWithAllocator(
    comptime K: type,
    comptime V: type,
    /// A namespace that provides these two functions:
    /// * `pub fn hash(self, K) u32`
    /// * `pub fn eql(self, K, K, usize) bool`
    ///
    /// The final `usize` in the `eql` function represents the index of the key
    /// that's already inside the map.
    comptime Context: type,
    /// When `false`, this data structure is biased towards cheap `eql`
    /// functions and avoids storing each key's hash in the table. Setting
    /// `store_hash` to `true` incurs more memory cost but limits `eql` to
    /// being called only once per insertion/deletion (provided there are no
    /// hash collisions).
    comptime store_hash: bool,
)

I came across ArrayHashMap, and when I read the API signature, the K and V caused me to stutter. I am still new to Zig, and I am not fast at deducing meaning from single letters. I can eventually infer it here because it is a hash map, but I had to stop and think.

The main problem for me was that there is no direct way in the signature to deduce what K and V are. I tried to find this information in the comment below:

    /// A namespace that provides these two functions:
    /// * `pub fn hash(self, K) u32`
    /// * `pub fn eql(self, K, K, usize) bool`
    ///
    /// The final `usize` in the `eql` function represents the index of the key
    /// that's already inside the map.

But this does not actually tell you what K or V represent. It explains usage and constraints, but not the roles of the types themselves.

So in my mind this causes multiple problems:

  1. If I want to search for hash maps to read up on them, I cannot easily search for something like “hash map key” or “hash map value” based on the API. Searching for “K and V in HashMap” is noisy and assumes I already know what those letters mean.

  2. If I go into the source, there is still no comment explaining what K and V are, and I now have to deduce it from even more code that also uses single-letter names. At that point I am effectively reverse-engineering the structure. That can be a good learning exercise in Zig, but it is not a fast way to understand how to use a standard library function and then move on.

  3. This is a hash map, so I can lean on prior knowledge to infer the meaning. My concern is about lesser-known parts of std. If they follow the same naming convention, the initial confusion will be worse, because I do not already know what the structure is supposed to represent.

  4. Point 3 can be somewhat mitigated if std treats K and V as stable conventions, where K always means “key” and V always means “value”. Once learned, that makes signatures readable. But this feels fragile if the same letters are ever reused with different meanings (for example K meaning “keyword” in some other API). In that case the reader may assume the wrong meaning based on habit.

To address this, I see two possible improvements.

One is to keep the short names but add a brief comment in the signature:

pub fn ArrayHashMapWithAllocator(
    /// Key
    comptime K: type,
    /// Value
    comptime V: type,
    ...
)

Another is to spell the names out directly:

pub fn ArrayHashMapWithAllocator(
    comptime Key: type, 
    comptime Value: type, 
    ...
)
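
As a minimal sketch of both options, here they are applied to a hypothetical `Pair` type (it is not part of std and exists only for illustration). Zig attaches a `///` doc comment to the declaration that follows it, so in the first variant each comment sits on the line above its parameter.

```zig
const std = @import("std");

/// Hypothetical type, option 1: short names, defined at the declaration site.
pub fn Pair(
    /// Key type.
    comptime K: type,
    /// Value type.
    comptime V: type,
) type {
    return struct {
        key: K,
        value: V,
    };
}

/// Hypothetical type, option 2: the names are spelled out directly.
pub fn PairSpelledOut(comptime Key: type, comptime Value: type) type {
    return struct {
        key: Key,
        value: Value,
    };
}

test "both spellings describe the same shape" {
    const a = Pair(u8, bool){ .key = 1, .value = true };
    const b = PairSpelledOut(u8, bool){ .key = 1, .value = true };
    try std.testing.expectEqual(a.key, b.key);
    try std.testing.expectEqual(a.value, b.value);
}
```

Either way the call site is identical; the only difference is what a reader sees when hovering over the signature.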

My main question is whether there is a strong rationale for using single-letter type parameter names in Zig std public APIs, instead of either spelling them out or adding short clarifying comments at the declaration site.


It is common across programming languages to use single-letter names for type parameters.
Rust also uses K and V for HashMap.
C++ std::map calls the parameters Key and T (what surprises me is that C++ spells out Key for the key type, not that it calls the value type T).

Indeed, a documentation comment is missing for K and V.


While I agree that Key and Value might be more helpful and would adhere more strictly to best practices, I think this is just one of those shortcuts that are common enough in the programming world that the “helpful names” rule gets bent, similar to how we all often use i in for loops, or T for a generic type, etc.

There is probably not going to be any great argument why it shouldn’t be Key and Value, but likewise I doubt you will find many who take great issue with it, even though it indeed is not very descriptive.


I agree that single-letter names are common, but I think the comparison matters.

  • T is nearly universal across languages and documentation. Its meaning (“some generic type”) is stable across domains.
  • i is usually a local variable, short-lived, and not part of a public API surface or function signature.
  • K/V sit closer to T in that they are type parameters, but they are far more domain-specific. Outside of associative containers, their meaning is not universally stable.

So the issue for me is not that single letters are used, but where they are used. In this case the function signature is an LSP surface, often the first place where the API is encountered. At that point, guessing the meaning of a single letter adds friction, especially for less common APIs.

That’s why I suggested two possible fixes and am fine with either:

  • spell the names out, or
  • keep the single letters but define them with a short comment at the declaration site.

I see after my first post that my problem statement was not clear, and that the discussion got anchored on the “hash map association”.

In Zig std there are multiple similar occurrences of this pattern: single-letter type parameters with no declaration-site explanation. Here are some signatures as examples.

Problem 1

Here is one occurrence. Nothing in the signature tells what E or V mean. You can often deduce E, but not V, unless you already know the pattern.

std.EnumArray(comptime E: type, comptime V: type) type

/// An array keyed by an enum, backed by a dense array.
/// If the enum is not dense, a mapping will be constructed from
/// enum values to dense indices. This type does no dynamic
/// allocation and can be copied by value.

The core problem here is the same as in the HashMap example: the abbreviations (E, V) are not expanded at first contact. It does not matter whether you discover the API through LSP hover, the generated docs, or by skimming the source. The first surface you see is the signature, and the signature does not define its own abbreviations.

What I observed while reading docs is that you often encounter the single-letter type parameters first, with no explanation, and only later you can infer the meaning from method parameter names or prose. In normal technical writing, abbreviations are defined at first use so readers do not have to stop, infer, and then backtrack. In this context, “first use” is the function/type signature, because that is what LSP and docs present first.
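
To make the two roles concrete, here is a minimal usage sketch. It assumes the std.EnumArray methods `initFill`, `set`, and `get`; the `Color` enum is hypothetical and only there for illustration. E is the enum used as the index, and V is the type of the stored elements.

```zig
const std = @import("std");

// Hypothetical enum, used only for illustration.
const Color = enum { red, green, blue };

test "E is the enum key type, V is the stored value type" {
    // Here E = Color and V = u8.
    var counts = std.EnumArray(Color, u8).initFill(0);
    counts.set(.green, 3);
    try std.testing.expectEqual(@as(u8, 3), counts.get(.green));
    try std.testing.expectEqual(@as(u8, 0), counts.get(.red));
}
```

Nothing in the signature itself conveys this; it only becomes visible once you look at how `set` and `get` are used.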

Problem 2

The second problem is inconsistency in naming, for example Treap:

std.Treap(comptime Key: type, comptime compareFn: anytype) type

Here std uses Key in a place where other APIs often use K. That suggests the convention is not uniform across std, which makes inference less reliable.

Problem 3

Method signatures contradicting the entry signature naming layer

std.EnumMap(comptime E: type, comptime V: type)

...

fn put(self: *Self, key: Key, value: Value) void

 (fn (*EnumMap(E,V), E, V) void)

/// Go to EnumMap | E | V
/// Adds the key to the map with the supplied value.
/// If the key is already in the map, overwrites the value.

Here the entry signature uses V, while methods use Value. This does not mean V is “something else”; it means the public surface mixes two naming layers for the same concepts. The reader must maintain a mapping between V ↔ Value (and similarly for keys) while reading. That is workable once you already know the convention, but it increases first-contact parsing cost, especially for less familiar parts of std.
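
The two naming layers can be sketched with a hypothetical, simplified stand-in. `TinyEnumMap` is not std.EnumMap and makes no claim about how std implements it; it only mirrors the pattern of single-letter parameters in the entry signature and full-name aliases in the methods. It assumes an enum with default tag values (0, 1, 2, ...).

```zig
const std = @import("std");

/// Hypothetical stand-in for an enum-keyed map; illustrates naming only.
pub fn TinyEnumMap(comptime E: type, comptime V: type) type {
    return struct {
        const Self = @This();
        const len = std.meta.fields(E).len;

        /// Full-name aliases for the single-letter parameters.
        pub const Key = E;
        pub const Value = V;

        /// One optional slot per enum field; assumes default tag values.
        values: [len]?V = [_]?V{null} ** len,

        pub fn put(self: *Self, key: Key, value: Value) void {
            self.values[@intFromEnum(key)] = value;
        }

        pub fn get(self: Self, key: Key) ?Value {
            return self.values[@intFromEnum(key)];
        }
    };
}

test "Key and Value are just other names for E and V" {
    const Color = enum { red, green };
    const Map = TinyEnumMap(Color, u32);
    try std.testing.expect(Map.Key == Color);
    try std.testing.expect(Map.Value == u32);
}
```

The aliases are the same types as the parameters, so nothing is wrong semantically; the cost is that the reader has to connect E to Key and V to Value on their own.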

When I looked through std, I could infer an implicit convention that is often followed:

T = type
V = value
N = number / length
K = key
E = enum

But there are deviations (for example std.Treap, and the mixed naming layers shown above), and the single-letter names are usually not defined at the declaration site.

Suggestion

I am not against single-letter type parameters in signatures. They shorten signatures, and if std is consistent they can improve long-term readability once learned, even if that trades some early learning speed for it. My main problems are first-contact clarity and consistency: single-letter names are used without declaration-site meaning, and the same concept is sometimes expressed with both letters and full names across the same API surface.

So the most sane fix that preserves the current shape of std would be:

  1. Add short declaration-site comments for single-letter names (define the abbreviation at first contact).
  2. Treat letters T, V, N, K, E as reserved conventions for their intended meanings, and avoid reusing them with different meanings.
  3. Avoid mixing letters and full names for the same role across public-facing entry signatures and method signatures, unless there is a strong reason; if letters are the dominant convention, then keep letters and use comments to define them.

That is the actual question I am trying to ask: is the current inconsistency intentional, and would std accept a minimal convention such as declaration-site comments to define the meaning of single-letter type parameters at first contact?


The fact that they are child types of a map/dictionary type, while perhaps not quite as obvious or universal as i or T, does provide the context for their meaning, since you have to go through that type to even reach the child type.

I am not opposed to the full names being used; I agree with you and support it. I was offering a description, not a prescription. If I were implementing my own dictionary type, I would personally have opted for Key and Value and avoided the shorthand.

As for consistency, as I stated, there is unlikely to be any great argument against it beyond “this is how we do it”, but the exact same argument could be used against i, T, and the other aforementioned shorthands. The reality is that we draw an arbitrary line for conventions that go against the rule. No matter how many books and blog posts get written about how non-descriptive or single-letter names go against best practice, they continue to be widely used in almost every project in existence. There seems to be a line past which people stop caring about zealous adherence to orthodoxy, and I think this might be one of those edge cases on the cusp: most seem to view it as too trivial to bother refactoring or making a breaking change over.

It’s kind of the same as the basic exposition rules of human language: if you’re going to use acronyms or abbreviations, you spell them out at first use. In code, that would be in documentation comments.

But truly, this is just a continuation of bad legacy habits. Perhaps the argument is that they’re no different from the short symbols used in mathematics, but that is arguably one of the problems with math exposition, to the point that it has become a meme.

There’s nothing wrong with typing out Key and Value (or KeyEnum) for the types when you’re implementing containers.
