Error code disambiguation / specializing

Sze · January 27, 2024, 2:19pm

Continuing from:

What is the best way to handle unrecoverable errors like OutOfMemory in an application?

Imagine we have an error like so…
           alloc1
              |
         {   OOM, X, Y, Z  }
              |
           alloc2
You now have two allocators using the same error. So which one failed? If you have a long enough call stack, this can become impractical to solve.

I just thought of a way you could handle the case, where you have multiple allocators, want to handle OOM, but in different ways and want to bubble up.

You would use helper/wrapper functions that change error.OutOfMemory into something more specialized, such as error.OOMDoSomethingSpecial.

It might also make sense to write a comptime function that turns an allocator into one that behaves like the original but returns a different error code. That way you wouldn’t have to add wrapper calls everywhere, and you could reuse the flexibility the allocator interface already gives us.

Looking into it, that isn’t possible, because error.OutOfMemory is the only error code a std.mem.Allocator can return. I guess you could have an Allocator(error_code) function that returns a distinct allocator interface, but that seems to lead toward too much code complexity. So it’s probably better to use wrapper/helper functions where they are needed instead.

Of course all of this only makes sense if your application can use this in some sensible way.

cancername · January 27, 2024, 2:25pm

This makes sense and is easy to do:

fn doSomeStuff(ally: std.mem.Allocator) error{OomDoSomeStuff}![]u8 {
    return ally.alloc(u8, 69) catch error.OomDoSomeStuff;
}

(or by handling it in the parent)
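A minimal sketch of the "handling it in the parent" variant, assuming two hypothetical helpers (makeScratch and makeResult, names not from this thread) that each rename error.OutOfMemory. A small FixedBufferAllocator is used only to force a failure:

```zig
const std = @import("std");

// Hypothetical helper: reports allocation failure under its own error name.
fn makeScratch(ally: std.mem.Allocator) error{OomScratch}![]u8 {
    return ally.alloc(u8, 16) catch return error.OomScratch;
}

// Hypothetical helper: same allocator interface, different error name.
fn makeResult(ally: std.mem.Allocator) error{OomResult}![]u8 {
    return ally.alloc(u8, 4096) catch return error.OomResult;
}

// `try` still bubbles errors up, but they no longer coalesce into one OOM.
fn run(ally: std.mem.Allocator) error{ OomScratch, OomResult }!void {
    const scratch = try makeScratch(ally);
    defer ally.free(scratch);
    const result = try makeResult(ally);
    defer ally.free(result);
}

pub fn main() void {
    // 64 bytes: the 16-byte scratch fits, the 4096-byte result does not.
    var buf: [64]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buf);

    run(fba.allocator()) catch |err| switch (err) {
        error.OomScratch => std.debug.print("scratch allocation failed\n", .{}),
        error.OomResult => std.debug.print("result allocation failed\n", .{}),
    };
}
```

Because the error set is explicit, the switch in the parent has to cover both cases, so a newly added specialized error can't be silently ignored.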

AndrewCodeDev · January 27, 2024, 4:40pm

Additionally, the benefit of this is it allows allocator composition. I’d argue that in the composed case, it shouldn’t really matter which one fails because they’re being treated like the same allocator (even though we are technically crossing the streams here).

One thing that we may want to consider is that the allocator interface is a standard utility that’s meant to work with other standard utilities. It’s one of the floorboards of the standard library. That makes sense for the context of the standard library, but I don’t know if we should restrict our thinking to just that context.

For instance, one of the things I was thinking about recently is using non-virtual-table allocation strategies. That’s beside the point here, but it’s something that requires us to think outside of std.

So here’s where I’m going with this - if an application begins to need custom errors for allocators, then maybe it actually needs custom, non-standard data structures instead?

cancername · January 28, 2024, 12:24am

Great points about allocator composition and allocator usage.

I don’t think indicating the allocation failure of a specific allocator is the goal here. Informing the library user about where the allocation failure occurred is useful, however. To give an example relevant to my proclivities, consider a function fn decodeVideoFrame(ally: Allocator) ![]u8. Depending on the implementation, allocation failure could happen either for the decoding process itself or while allocating the frame. Allocation failure in the former could indicate a bug or memory exhaustion in general, while allocation failure in the latter could indicate that the dimensions would be too large for the system to handle. The library user should be allowed to handle such errors differently.
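To make that concrete, here is a rough sketch. The error names and the width/height parameters are assumptions added for illustration (the signature above only takes an allocator), and the "decoding" is just two allocations:

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;

// Hypothetical error names, one per allocation site.
const DecodeError = error{ OomDecoderState, OomFrameBuffer };

fn decodeVideoFrame(ally: Allocator, width: usize, height: usize) DecodeError![]u8 {
    // Working memory for the decoding process itself.
    const state = ally.alloc(u8, 256) catch return error.OomDecoderState;
    defer ally.free(state);

    // Output frame, sized by the stream's dimensions. An overflowing
    // product is treated the same as a frame that is too large.
    const len = std.math.mul(usize, width, height) catch return error.OomFrameBuffer;
    return ally.alloc(u8, len) catch return error.OomFrameBuffer;
}

pub fn main() void {
    // 1 KiB is enough for the decoder state but not for a 4096x4096 frame.
    var buf: [1024]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buf);
    const ally = fba.allocator();

    if (decodeVideoFrame(ally, 4096, 4096)) |frame| {
        defer ally.free(frame);
        std.debug.print("decoded {d} bytes\n", .{frame.len});
    } else |err| switch (err) {
        error.OomDecoderState => std.debug.print("out of memory while decoding\n", .{}),
        error.OomFrameBuffer => std.debug.print("frame dimensions too large\n", .{}),
    }
}
```

The caller can now reject an oversized frame and keep going, while treating a decoder-state failure as a sign of general memory exhaustion.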

AndrewCodeDev · January 28, 2024, 12:56am

It’s possible that it’s not related to your specific goal, but it is certainly related to the issue that started this topic: https://ziggit.dev/t/what-is-the-best-way-to-handle-unrecoverable-errors-like-outofmemory-in-an-application/

The issue we were discussing is that errors from two different allocators can coalesce into one OOM signal if they’re on the same code path. That’s a confusing signal because if you’re relying on the try functionality, you lose information along the way. It allows for composition of allocators, but that’s a benefit and a cost.

I completely agree - how, why, and where something fails is important information as well - no argument from me here.

Yes - this is an interesting approach that maintains part of the solution. Allocation for special cases can be wrapped and either handled or signal unique information. The issue that @Sze is running into is the compatibility with standard containers since they often expect the allocator interface which has a dedicated error set. So in that case, it’s not easy to swap out in a standard-compliant way. That’s why I’m saying that it may be beneficial to have containers that signal what you need but I wouldn’t expect the standard library to provide those.

What you’re suggesting here is one of the better approaches that can be used to handle the ambiguous-signal issues. Keeping the information distinct or at least on different paths seems to be the most tenable option… but it’s an interesting problem either way.

AndrewCodeDev · January 28, 2024, 2:00am

I tried playing around with providing a single valued error to a data structure and I think this is the most stable way to do it:

const std = @import("std");
const Allocator = std.mem.Allocator;

fn isErrorValue(comptime E: type) bool {
    return switch (@typeInfo(E)) {
        .ErrorSet => true,
        else => false,
    };
}

pub fn MakeStruct(comptime error_value: anytype) type {
    if (comptime !isErrorValue(@TypeOf(error_value))) {
        @compileError("Parameter must have parent type ErrorSet.");
    }
    return struct {
        const ErrorType = @TypeOf(ErrorValue);
        const ErrorValue = error_value;

        pub fn myCreate(_: @This(), comptime T: type, alloc: Allocator) ErrorType!*T {
            return alloc.create(T) catch ErrorValue;
        }
        fn errorValue(_: @This()) ErrorType {
            return error_value;
        }
    };
}

const MyError = error { MyOOM };

pub fn main() !void {

    const foo = MakeStruct(MyError.MyOOM){};

    // compiles fine...
    const x = try foo.myCreate(usize, std.heap.page_allocator);

    // still works fine...
    defer std.heap.page_allocator.destroy(x);

    // we get the right error
    std.debug.print("\n{s}\n", .{ @errorName(foo.errorValue()) });
}

I tried being clever with @errorFromInt but that actually plucks values out of the global error set. The MyError.MyOOM value turned out to be 11, which is definitely unstable unless you’re grabbing that value directly.
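A small sketch of that round trip, assuming a Zig version that has the @intFromError and @errorFromInt builtins (0.11 or later). The printed integer depends on which other errors end up in the program, so no particular value should be expected:

```zig
const std = @import("std");

const MyError = error{MyOOM};

pub fn main() void {
    // The integer is an index into the global error set for this compilation.
    const n = @intFromError(MyError.MyOOM);

    // Converting back yields a member of the global set (anyerror),
    // not a value typed as MyError.
    const back: anyerror = @errorFromInt(n);

    std.debug.print("{d} -> {s}\n", .{ n, @errorName(back) });
}
```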

Anyhow, to @IntegratedQuantum… you mentioned not wanting to rewrite half the standard library (lol) so I was wondering what it would take to make mirrored structures (just to see what the overhead for that would be)…

I analyzed array_list.zig and I actually think there is a way out here. It’s actually quite easy. Check this out…

Probably a good function to start with:

        pub fn ensureTotalCapacityPrecise(self: *Self, new_capacity: usize) Allocator.Error!void {
            if (@sizeOf(T) == 0) {
                self.capacity = math.maxInt(usize);
                return;
            }

            if (self.capacity >= new_capacity) return;

            // Here we avoid copying allocated but unused bytes by
            // attempting a resize in place, and falling back to allocating
            // a new buffer and doing our own copy. With a realloc() call,
            // the allocator implementation would pointlessly copy our
            // extra capacity.
            const old_memory = self.allocatedSlice();
            if (self.allocator.resize(old_memory, new_capacity)) {
                self.capacity = new_capacity;
            } else {
                const new_memory = try self.allocator.alignedAlloc(T, alignment, new_capacity);
                @memcpy(new_memory[0..self.items.len], self.items);
                self.allocator.free(old_memory);
                self.items.ptr = new_memory.ptr;
                self.capacity = new_memory.len;
            }
        }

First, the return type: it is fixed to Allocator.Error. But you could give the type a second parameter that customizes the error at the type-definition level, like…

const whatever = MyArrayList(T, my_error)...

And then save that error type like in the example I provided above…

const ErrorValue = error_value;
const ErrorType = @TypeOf(error_value);

You could then text replace all instances of Allocator.Error with ErrorType so all the return types are correct.

Okay, now the fun part - handling the allocator stuff. I did a search for try allocator and only found 3 matches in the entire array_list.zig file.

You could go to those places, write a catch return ErrorValue instead of try. Now, they won’t be directly assignable with other ArrayLists, but that may not be a bad thing actually… however since you can get the ArrayList.items field, you can just pass the slice around if you need to view the data - maybe even qualify the child type as const while we’re at it.

I will say there’s one complication with ArrayListUnmanaged… that version uses temporary managed array lists, so those call sites would need the extra comptime value parameter passed to their definitions. So there’s a little more work to support the unmanaged versions of things, but if you just want one version, it’s easy.

Honestly, not bad all in all if you’re just trying to replace allocator errors (at least in this one case).
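Putting those steps together, a heavily cut-down sketch might look like this. It is not the real array_list.zig: MyArrayList is a hypothetical name, zero-sized types and in-place resize are ignored, and the type check on error_value is left out to keep it short:

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;

// Hypothetical mirror of std.ArrayList with a caller-chosen error value.
pub fn MyArrayList(comptime T: type, comptime error_value: anytype) type {
    return struct {
        const Self = @This();
        const ErrorValue = error_value;
        const ErrorType = @TypeOf(error_value);

        items: []T = &[_]T{},
        capacity: usize = 0,
        allocator: Allocator,

        pub fn init(allocator: Allocator) Self {
            return .{ .allocator = allocator };
        }

        pub fn deinit(self: *Self) void {
            self.allocator.free(self.items.ptr[0..self.capacity]);
        }

        // Return type: Allocator.Error replaced with ErrorType.
        pub fn ensureTotalCapacityPrecise(self: *Self, new_capacity: usize) ErrorType!void {
            if (self.capacity >= new_capacity) return;
            const old_memory = self.items.ptr[0..self.capacity];
            // `try` replaced with `catch return ErrorValue`.
            const new_memory = self.allocator.alloc(T, new_capacity) catch return ErrorValue;
            @memcpy(new_memory[0..self.items.len], self.items);
            self.allocator.free(old_memory);
            self.items.ptr = new_memory.ptr;
            self.capacity = new_memory.len;
        }

        pub fn append(self: *Self, item: T) ErrorType!void {
            if (self.items.len == self.capacity) {
                try self.ensureTotalCapacityPrecise(self.capacity * 2 + 1);
            }
            self.items.len += 1;
            self.items[self.items.len - 1] = item;
        }
    };
}

pub fn main() void {
    // A tiny buffer so that growth fails quickly.
    var buf: [32]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buf);

    var list = MyArrayList(u64, error.ListOom).init(fba.allocator());
    defer list.deinit();

    var i: u64 = 0;
    while (true) : (i += 1) {
        list.append(i) catch |err| {
            std.debug.print("stopped after {d} items: {s}\n", .{ list.items.len, @errorName(err) });
            break;
        };
    }
}
```

Since the error value is part of the type, two lists with different error values are distinct types, which matches the point above about them not being assignable to each other.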