openFile() and file.close() with different Io objects?

I think a basic use of the new Io model would look like this:

const io = std.testing.io;
var file = try std.Io.Dir.cwd().openFile(io, "filename", .{});
defer file.close(io);

What would be the meaning of providing a different io object to file.close()? Presumably there’s a use-case for this which I’m missing; otherwise I’d imagine close() would be more ergonomic without the argument (the file would remember the io it was opened with and reuse it in close(), invisibly to the caller). Maybe that invisibility is exactly the undesirable part, but presumably there’s something meaningful in exposing the argument.

Do you mean like std.array_list.Managed remembers its allocator? I think the reason is the same one for which they switched the default std.ArrayList to unmanaged.

Illegal Behavior?

Explicit is better than implicit. If you see Io, you know it may return an error.

Thank you. I’m a fan of “explicit is better”, and with the default unmanaged approach for ArrayList it certainly makes sense, since various calls through the lifetime of the ArrayList might take advantage of different allocators being provided (in my understanding). I suppose that, after opening the file with one Io, you might like to use a different Io for subsequent operations… but then, perhaps most often, the original Io (used for the open) would still make the most sense for the close()? Perhaps there’s a case for retaining the Io only for the close()? There were other arguments for preferring the unmanaged scenario for ArrayList, though, and some of them were pretty specific to allocating resources and may not translate to Io… perhaps a similar analysis applies here, and again the advantages outweigh the disadvantages?

The main disadvantage is that the io would then have to be stored somewhere. File is a struct that contains just the file descriptor integer. There are two ways the io parameter is going to be used: either you pass it to all your functions, or you store it in some context struct that you pass around in your application. That means you already have the io you are supposed to close the file with, so storing it again in the File is a waste of space.
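To make the “store it in a context struct” option concrete, here is a hypothetical sketch; `Ctx` and `touch` are made-up names, and the `std.Io`/`openFile`/`close` shapes are assumed from the snippet at the top of the thread, since the interface is still in flux:

```zig
const std = @import("std");

// Hypothetical sketch: carry the io (and the allocator) in one context
// struct, so every open/close pair uses the same io by construction.
const Ctx = struct {
    gpa: std.mem.Allocator,
    io: std.Io, // assumed shape of the in-progress Io interface value

    // `touch` is a made-up helper: open a file and immediately close it.
    fn touch(ctx: *const Ctx, path: []const u8) !void {
        var file = try std.Io.Dir.cwd().openFile(ctx.io, path, .{});
        defer file.close(ctx.io); // same io as the open, always
    }
};
```

With this pattern the question of “which io do I close with?” never arises at call sites, because the context owns the answer.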

I agree that it’s confusing, because it’s not completely obvious that you can’t use another io on a file that was opened with a different one. But the same confusion already happens with ArrayList: very strange things will happen if you switch allocators.

I guess the ability to accidentally provide a different io at close is an “unlikely footgun”, especially since close() is often called immediately with defer.

You can’t change the allocator of an ArrayList mid-flight, and similarly you can’t swap the Io implementation you use when reading/writing from/to a file that was opened with a given Io impl.

You can have different ArrayLists where each uses a different allocator. So you can have one that is long-lived and uses a gpa, while another is part of an arena.
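For instance, a minimal sketch against the 0.15-style unmanaged API, pairing a long-lived list with a general purpose allocator and a scratch list with an arena:

```zig
const std = @import("std");

test "two lists, two allocators" {
    // Long-lived list: grown and freed with a general purpose allocator.
    const gpa = std.testing.allocator;
    var long_lived: std.ArrayList(u32) = .empty;
    defer long_lived.deinit(gpa); // must match the allocator used to grow it

    // Scratch list: lives inside an arena, freed all at once with it.
    var arena = std.heap.ArenaAllocator.init(gpa);
    defer arena.deinit(); // frees scratch's memory too; no deinit needed
    var scratch: std.ArrayList(u32) = .empty;

    try long_lived.append(gpa, 1);
    try scratch.append(arena.allocator(), 2);
    try std.testing.expectEqual(@as(usize, 1), long_lived.items.len);
}
```

Each list works fine with its own allocator; the footgun is only in mixing allocators on the *same* list.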

And similarly you can have different files that use different Io implementations. So far it seems less common to want to do that than using different allocators with different ArrayLists, but it doesn’t seem an unreasonable thing to want to do.

If you cannot switch allocators, why the switch to unmanaged ArrayLists? Would it not make sense to keep the allocator used during init within the ArrayList struct? It would still be possible to have different array lists use different allocators, but it would be less confusing and less error-prone if the ArrayList always “knows” which allocator to use.
I read the 0.15.1 release notes and my conclusion is completely opposite: the managed versions would cause less of a headache.

I disagree, I think unmanaged is good (once you are used to it) and I think this existing topic is a good place to discuss it:

Ah! I’d never tried, but thought I saw somewhere a discussion about this possibility.

I went to bed wondering if there was a use-case where you’d want to open a bunch of files, e.g., blocking, but when it came time to serving data from those files, you might decide to read asynchronously. Perhaps even if one WANTED to do this, it would be forbidden/discouraged and one would learn the lore or abide by documentation to this end.

I think with embedded programming, or when building a virtual machine or emulator, it could be useful to use one Io implementation for the host system and another one for a connected device / the guest system.

As @lalinsky mentioned, storing the allocator in every ArrayList would be a waste of memory. On most general purpose computers, this waste is vanishingly negligible. Zig aims to be able to use high level abstractions on the most resource constrained systems in the exact same way that they are used on systems with abundant resources. Specifying the allocator (and io) explicitly makes data structures that rely on them more portable.

In some backends, it just plainly doesn’t work. For example, in my implementation of the Io interface in the zio project, I support IOCP, and if you open a file for async read/write with IOCP, you can’t use the handle in a synchronous context. So if you opened the file using my runtime and then tried reading from it using the std.Io.Threaded implementation, it would fail. Even worse behavior results if you e.g. tried to lock a std.Io.Mutex using one io and unlock it using another one.

Sorry, I should probably clarify (in response to the comment about using one Io implementation for the host system and another one for a connected device), as I think all will agree that using two or more Io schemes within a program/library/etc. can make plenty of sense. I can also think of many times you’d like to use one or the other for a different build target or a different flavor of your lib, e.g. The focus here is on using one Io to open a file and potentially another to do I/O operations on that same file while it’s still open. And, originally, my question was just an expression of curiosity about closing that file with a different Io - a use-case I can’t think of.

If I understand the summary correctly, and the allusion to managed/unmanaged data structs, then one argument is that maintaining the simplicity of those structs, and not burdening them with having to maintain their own copy (or reference) of the allocator (or Io), is worth the tradeoff. I do predict that this “question” may live on for years, though, especially with newbies who are used to composition conveniences like this, and who are asked to get used to allocators and ios in the first place.

One ploy might be to really sell the pattern of making your own struct which implements your needs in various ways, including holding the one allocator (and/or io) that makes sense for your use. Then users of your library can’t “mess up”, because your close() uses the Io you used to open() (and read/write), and no harm is done.

This might seem a little ridiculous, but I guess Zig std implementations could keep the underlying Io, under the hood, in debug mode only, in order to compare it with the Io provided in subsequent operations; if there was a mismatch, it could be called out, perhaps even sometimes at compile-time. This seems like strange hand-holding, though.
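A hand-rolled version of that idea is already expressible in userland today. This is a hypothetical sketch (`CheckedFile` is a made-up name, and the `std.Io.File`/`openFile`/`close` shapes are assumed from the snippet at the top of the thread); the remembered Io field is zero-sized outside Debug builds, so release binaries pay nothing for it:

```zig
const std = @import("std");
const builtin = @import("builtin");

// Hypothetical: a file handle that, in Debug builds only, remembers the
// Io it was opened with and asserts that close() receives the same one.
const CheckedFile = struct {
    file: std.Io.File, // assumed location of the new File type
    debug_io: if (builtin.mode == .Debug) std.Io else void,

    fn open(io: std.Io, path: []const u8) !CheckedFile {
        return .{
            .file = try std.Io.Dir.cwd().openFile(io, path, .{}),
            .debug_io = if (builtin.mode == .Debug) io else {},
        };
    }

    fn close(self: *CheckedFile, io: std.Io) void {
        if (builtin.mode == .Debug)
            std.debug.assert(std.meta.eql(self.debug_io, io)); // mismatch caught here
        self.file.close(io);
    }
};
```

Comparing interface values for identity like this is itself a judgment call, but it illustrates how the check could live in Debug mode only.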

I think eventually there could be static analyzers for Zig that detect and point out when a different Io was used for closing than was used for opening. Static analyzers might not be able to catch all cases, so the remaining cases would require dynamic techniques; theoretically you could use techniques like DynamoRIO Automated Function Tracing - #4 by Sze to create arbitrary runtime checks.

Zig could also use these or similar techniques, though I am not sure whether it makes sense to have a safety check for using the wrong allocator / Io - maybe. Zig might not need something like DynamoRIO itself (although I think it could be useful when working on projects that use a lot of non-Zig code), but I think Zig should eventually enable some of the kinds of things you can do with DynamoRIO, even if it chooses a different way to implement those capabilities. Considering something in that direction could lead to a very interactive and pleasant way to edit, debug and profile code in development.

Maybe once incremental compilation and the compiler communication protocol are implemented, they would allow things similar to DynamoRIO (by instructing the compiler to compile and hot-swap a version of the code that includes some kind of tracing code). I think there is a bunch of tradeoffs between the kinds of applications you can use it with vs how nicely it is integrated into the build system / language / ecosystem. I think eventually it would be nice if you could use --webui and then toggle on a bunch of optional debugging features for modules where you want them (if some of those are too expensive to always run in debug mode).

Overall it seems to me that if such checks would be done, they should be implemented through some mostly automatic way, whether that is by compiling code with some kind of tracing enabled or similar techniques.

If you prefer this, it is very easy for you to do this by wrapping ArrayList with a struct containing the allocator. Or, as I do, you can store the allocator at a higher level in the “object hierarchy” to avoid repeating it in many places, and pass it down when needed. In either case, there is no danger of using the wrong allocator.
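A minimal sketch of such a wrapper (`Managed` here is a made-up name; the standard library’s own std.array_list.Managed mentioned earlier serves the same purpose):

```zig
const std = @import("std");

// A thin "managed" wrapper: the list remembers its allocator from init,
// so deinit/append can never be handed the wrong one.
fn Managed(comptime T: type) type {
    return struct {
        const Self = @This();

        list: std.ArrayList(T) = .empty,
        gpa: std.mem.Allocator,

        fn init(gpa: std.mem.Allocator) Self {
            return .{ .gpa = gpa };
        }

        fn deinit(self: *Self) void {
            self.list.deinit(self.gpa); // always the allocator from init
        }

        fn append(self: *Self, item: T) !void {
            try self.list.append(self.gpa, item);
        }
    };
}

test Managed {
    var xs = Managed(u8).init(std.testing.allocator);
    defer xs.deinit();
    try xs.append(42);
    try std.testing.expectEqual(@as(u8, 42), xs.list.items[0]);
}
```

The cost is exactly the one discussed above: every list carries one extra Allocator field.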

I guess just creating a wrapper Io implementation (one that takes an Io interface during its creation and simply forwards calls while adding bookkeeping) would be enough.

Yes, I think you are right when it comes to half-manual testing, basically the io equivalent of a debug allocator. Still I think it would be good to eventually also add some more infrastructure/capabilities for testing/verifying things without needing manual code changes and having some kind of support for custom tracing could be really useful for project specific checks.

I agree that this seems to be a theme, and one I’m increasingly in favor of, but, to be clear, I think the programmer should tend to make this a part of his/her practice - that is, it’s not the responsibility of the zig std to provide this. Not that that couldn’t be done, but I think often it’s at the library or application level that important choices are to be made about allocators and (now) io and the simplicity or granularity desired. I do also think that a static analyzer could really help where the language doesn’t want to go out of its way to put baby-bumpers on every corner (for the costs they’d incur), but some spot-checking would help many of us avoid oopses.