How to handle Environment Variables after Juicy Main

I have a side project I’ve been working on for the past year that relies heavily on environment variables. It’s called anyline and it’s intended to be a drop-in replacement for GNU’s readline. You can find the source code here and my Ziggit posts about it here.

Before 0.16.x, I would handle environment variables by calling std.process.getEnvVarOwned and retrieving the variables wherever needed. You can find an example here.

After 0.16.x, I see a few options. The first naive approach that I could think of is to simply require my library’s users to pass each necessary environment variable to each respective function that may require them, i.e.

pub fn readHistory(io: std.Io, alloc: Allocator, file: std.Io.File) ReadHistoryError!void {
    ...
}

Note how the file parameter is now required and the user doesn’t have the option to opt out. While this is functional, I don’t want to force that constraint on my (hypothetical) users. Instead, I want them to still have the option to say no.

My second, less naive thought was to allow the user to pass an optional environment variable map, i.e.

pub fn readHistory(
    io: std.Io,
    alloc: Allocator,
    maybe_environ_map: ?*std.process.Environ.Map,
) ReadHistoryError!void {
    ...
}

Then users could simply opt out by passing null. However, now the user has lost the ability to override the default with their own path. I could add both parameters, i.e.

pub fn readHistory(
    io: std.Io,
    alloc: Allocator,
    maybe_absolute_path: ?[]const u8,
    maybe_environ_map: ?*std.process.Environ.Map,
) ReadHistoryError!void {
    ...
} 

But now the interface just feels clumsy. What happens if a user passes null for both maybe_absolute_path and maybe_environ_map? Do they simply get a runtime error saying they misused the function? I’m a bit stumped on how to get the best of both worlds.

What are your guys’ thoughts?

Why not a union?

pub const HistorySource = union(enum) {
    path: []const u8,
    environ_map: *std.process.Environ.Map,
};

pub fn readHistory(
    io: std.Io,
    alloc: Allocator,
    source: HistorySource,
) ReadHistoryError!void {
    ...
}
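To make the union idea concrete, here is a minimal sketch of how the function body might resolve a path from it. Several things here are assumptions and not part of anyline: resolveHistoryPath is a hypothetical helper, std.StringHashMap stands in for the real environment map type so the sketch does not depend on the exact std.process API, wrapping the union in an optional is one possible way to keep the opt-out, and the HOME + "/.history" default is the Unix-only shortcut, not a portable rule.

```zig
const std = @import("std");

// Stand-in for the real environment map type (an assumption for this sketch).
const EnvMap = std.StringHashMap([]const u8);

const HistorySource = union(enum) {
    path: []const u8,
    environ_map: *const EnvMap,
};

/// Hypothetical helper: returns an owned copy of the history path, or null
/// when the caller opted out or HOME is not set. The caller frees the result.
fn resolveHistoryPath(alloc: std.mem.Allocator, source: ?HistorySource) !?[]u8 {
    const src = source orelse return null; // caller opted out
    switch (src) {
        // Explicit override: copy it so both branches return owned memory.
        .path => |p| return try alloc.dupe(u8, p),
        // Default: derive the path from the environment map.
        .environ_map => |env| {
            const home = env.get("HOME") orelse return null;
            return try std.mem.concat(alloc, u8, &.{ home, "/.history" });
        },
    }
}
```

With this shape there is no "both null" case to misuse: the caller passes exactly one source, or null to opt out.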

My starting point would be:

pub fn readHistory(
    io: std.Io,
    gpa: Allocator,
    options: Options, 
) ReadHistoryError!void {
    ...
}

const Options = struct {
    absolute_path: []const u8,

    pub fn from_env(env: *const std.process.Environ.Map) Options {
        ...
    }
};
  • Start by raising the level of abstraction, giving a name/type to the thing under discussion.
  • Avoid the midlayer mistake by making sure the user can manually configure everything.
  • Provide a shortcut to construct options directly from env.

PS: I also made a blog article out of me answering this question: Programming Aphorisms :sweat_smile:


Nice article, I always enjoy reading your posts. Just one note:

Having learned the trick, I remember it, where “remembering” is an act of active recall at the opportune moment. This recall powers “horizontal gene transfer” across domains, stealing shortcuts from Django and midlayer mistake from the kernel. Did you notice that applying “horizontal gene transfer” to the domain of software engineering tacit knowledge is horizontal gene transfer?

Isn’t this more of a “horizontal meme transfer”?


My silly question was able to generate a whole blog post? I’ll take it as a compliment :blush:

When I find the time, I’ll take a crack at it and see how it turns out!

Hmm, I hadn’t considered that. I’ll give this a spin as well!

Libraries reading env variables is not nice due to potential issues like this.

I think it’s great that Zig now forces environments into the library API surface, allowing me to pass null.


I finished up trying your idea, it’s not half bad! It works pretty well for both the native Zig code and the C bindings. Here’s a (hopefully perma-) link to the diff. Let me know if you have any feedback!

Bad news… Your option approach didn’t pan out very well. Here’s a perma-link to the diff.

The crux of the problem is memory lifetimes. When I went to implement fromEnvironMap (in your post you named it from_env), I couldn’t find an easy way to manage the lifetime of the absolute_path. The best way I could find was to require an Allocator from the caller, dynamically allocate memory for the absolute_path, and add a bool flag to indicate whether Options.deinit would need to free the memory or not.
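For readers who don’t want to dig through the diff, the flag approach looks roughly like the sketch below. This is not the actual anyline code: the owns_path field name and the error.HomeNotSet error are made up for illustration, and std.StringHashMap stands in for the real environment map type.

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;

// Stand-in for the real environment map type (an assumption for this sketch).
const EnvMap = std.StringHashMap([]const u8);

const Options = struct {
    absolute_path: []const u8,
    // Hypothetical flag: true only when fromEnvironMap allocated the path.
    owns_path: bool = false,

    /// Builds the path from HOME; the result must be released with deinit.
    pub fn fromEnvironMap(gpa: Allocator, env: *const EnvMap) !Options {
        const home = env.get("HOME") orelse return error.HomeNotSet;
        // Naive Unix-only concatenation, same caveat as in the prototype.
        const path = try std.mem.concat(gpa, u8, &.{ home, "/.history" });
        return .{ .absolute_path = path, .owns_path = true };
    }

    /// Frees the path only if this Options allocated it.
    pub fn deinit(self: Options, gpa: Allocator) void {
        if (self.owns_path) gpa.free(self.absolute_path);
    }
};
```

A manually built `Options{ .absolute_path = "..." }` leaves the flag false, so calling deinit on it is a no-op. The cost is that every caller has to remember to call deinit, and with the same allocator.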

Quick aside

I’m sure the astute readers of the diff will notice I am naively concatenating the home path with /.history. I’m aware this is error-prone and won’t translate to non-Unix systems. I was in prototyping mode and didn’t want to get hung up on technicalities.

It may also be worth mentioning that I did investigate File.realpath. While I could ask the caller to pass me a buffer and dance around the lifetime problem, the doc comments seem quite adamant that using this function is a bad idea:

/// Obtains the canonicalized absolute path name of `sub_path` relative to this
/// `Dir`. If `sub_path` is absolute, ignores this `Dir` handle and obtains the
/// canonicalized absolute pathname of `sub_path` argument.
///
/// This function has limited platform support, and using it can lead to
/// unnecessary failures and race conditions. It is generally advisable to
/// avoid this function entirely.

Feel free to nitpick my implementation. I’m happy to hear constructive criticism :slightly_smiling_face:

The silver lining of this exercise is that it did spark one worthwhile idea. If we move the layer of abstraction down just a little bit, we can actually get away with using a File instead of an absolute_path. This not only solves the memory lifetime problem (because files can be copied by value), but also simplifies the overall implementation of the read_history functions. The main tradeoff is that my users are now forced to open a File themselves.

I’d make it so that anyline actually owns the memory of Options. That means the memory of the options only needs to be valid until it’s given to anyline, which then dupes it.

This means you can give absolute_path from temporary buffers as well.
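Roughly what I have in mind, as a sketch only. The History struct and its init/deinit are hypothetical names, not anything that exists in anyline today.

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;

const Options = struct {
    // Borrowed: only needs to stay valid until init returns.
    absolute_path: []const u8,
};

// Hypothetical library-side state that keeps its own copy of the path.
const History = struct {
    gpa: Allocator,
    path: []u8,

    pub fn init(gpa: Allocator, options: Options) !History {
        // Dupe the path so the caller's buffer can be reused or go out of scope.
        return .{ .gpa = gpa, .path = try gpa.dupe(u8, options.absolute_path) };
    }

    pub fn deinit(self: *History) void {
        self.gpa.free(self.path);
        self.* = undefined;
    }
};
```

The caller can then build the path in a stack buffer (with std.fmt.bufPrint, for example), pass it in, and forget about it. No ownership flag is needed, because the library always owns its copy.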