std.zip.extract in memory

Hi! :smiley:

I have some zip files in memory and I want to extract them to disk. The std.zip.extract function does not take a regular std.Io.Reader but rather a std.fs.File.Reader, which makes sense because it needs to seek back and forth while unzipping. The problem is that I only have a std.Io.Reader, not a std.fs.File.Reader.

I couldn’t find any way to create a std.fs.File.Reader backed by a memory buffer instead of an actual file on the filesystem. Did I miss anything?

I would like to avoid dumping the zip to disk only to then read it back again while unzipping and then just delete it.

Well, I would do one of two things:

  1. If possible, restructure the code so that I have a std.fs.File.Reader. Obviously, if your std.Io.Reader isn’t backed by a file, this may not be possible, in which case I would do 2.
  2. Create a temporary anonymous file (I know you can do it on Linux with memfd_create, but I don’t think that Zig has an API for that besides its POSIX bindings) and stream from the std.Io.Reader into the temporary file. Next, get a std.fs.File.Reader (or std.Io.File.Reader in the future) and use that for the Zip API. This would work no matter what the original std.Io.Reader is backed by (even if it’s a socket), but of course it costs an extra copy that you would otherwise not need.
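A rough sketch of option 2 using an ordinary scratch file instead of memfd_create, so it is not tied to one OS. It assumes the Zig 0.15-era signature std.zip.extract(dest, *std.fs.File.Reader, options) and that the zip bytes are already in a slice. The helper name extractViaTempFile and the scratch file name are made up for illustration.

```zig
const std = @import("std");

/// Hypothetical helper: writes `zip_bytes` to a scratch file inside
/// `scratch_dir`, extracts it into `dest`, then deletes the scratch file.
/// Assumes Zig 0.15.x, where std.zip.extract takes a *std.fs.File.Reader.
pub fn extractViaTempFile(
    zip_bytes: []const u8,
    scratch_dir: std.fs.Dir,
    dest: std.fs.Dir,
) !void {
    // Hypothetical fixed name; real code should pick something unique.
    const scratch_name = "scratch.zip";

    // `.read = true` so the same handle can be read back after writing.
    var file = try scratch_dir.createFile(scratch_name, .{ .read = true });
    // Defers run in reverse order: the file is closed first, then deleted.
    defer scratch_dir.deleteFile(scratch_name) catch {};
    defer file.close();

    try file.writeAll(zip_bytes);

    // The File.Reader needs a caller-provided buffer.
    var read_buffer: [4096]u8 = undefined;
    var file_reader = file.reader(&read_buffer);
    try std.zip.extract(dest, &file_reader, .{});
}
```

If the data only exists as a std.Io.Reader rather than a slice, the writeAll call would be replaced by streaming from that reader into the file.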

I get the zips from a build step that fetches them from the internet, and I pull them in with @embedFile, so I can’t restructure the code.

The code runs on both Windows and Linux, so POSIX-only solutions are not great.

I was hoping that there was some way to create the file reader from a fixed buffer, but it seems that the stdlib can’t do it right now.

In version 0.14.1, std.zip accepts an instance generated by std.io.SeekableStream(...). By converting a std.io.FixedBufferStream(...) into such an instance, you can extract a zip held in memory.
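For reference, a minimal sketch of what that looks like on 0.14.x, assuming the zip bytes are available as a slice (for example from @embedFile). The helper name extractFromMemory is made up. This does not compile against the newer Reader-based std.zip.

```zig
const std = @import("std");

/// Hypothetical helper for Zig 0.14.x: extracts an in-memory zip into `dest`.
pub fn extractFromMemory(zip_bytes: []const u8, dest: std.fs.Dir) !void {
    // Wrap the slice so it exposes both read and seek operations.
    var fbs = std.io.fixedBufferStream(zip_bytes);
    // In 0.14.x, std.zip.extract takes any seekable stream (anytype).
    try std.zip.extract(dest, fbs.seekableStream(), .{});
}
```

Usage would be something like `try extractFromMemory(@embedFile("assets.zip"), out_dir);`, where assets.zip and out_dir are placeholders.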

The current std.Io.Reader does not support this approach: std.Io.Reader.fixed() gives you a Reader without seek functionality.
I believe the most reasonable solution is to implement a seekable fixed-buffer reader ourselves and rework zip.extract to be generic over anything seekable, drawing inspiration from the standard library code.
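To give an idea of the first half of that, here is a minimal sketch of a seekable reader over a byte slice. SeekableSliceReader and its method names are hypothetical and not part of the standard library. A reworked zip.extract would still have to be written against an interface like this.

```zig
const std = @import("std");

/// Hypothetical type, not in the standard library: a read cursor over a
/// byte slice that can also seek, which is what zip extraction needs.
pub const SeekableSliceReader = struct {
    bytes: []const u8,
    pos: usize = 0,

    /// Total size of the underlying buffer.
    pub fn getEndPos(self: *const SeekableSliceReader) usize {
        return self.bytes.len;
    }

    /// Moves the cursor to an absolute offset within the buffer.
    pub fn seekTo(self: *SeekableSliceReader, offset: usize) error{SeekPastEnd}!void {
        if (offset > self.bytes.len) return error.SeekPastEnd;
        self.pos = offset;
    }

    /// Copies up to dest.len bytes and advances the cursor.
    /// Returns the number of bytes copied, which is 0 at the end.
    pub fn read(self: *SeekableSliceReader, dest: []u8) usize {
        const n = @min(dest.len, self.bytes.len - self.pos);
        @memcpy(dest[0..n], self.bytes[self.pos..][0..n]);
        self.pos += n;
        return n;
    }
};
```

A zip reader built on this would first seek near the end of the buffer to find the end-of-central-directory record, then seek back to each entry.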

It’s particularly tantalising, because there’s a public std.zip.Iterator struct that allows us to perform extractions on a file-by-file basis.
All it’d take would be adding a method to extract an iterator entry to a std.Io.Writer instead of a std.fs.Dir.
You would think that an iterator would be the ideal structure for extracting the files to, for instance, a hash-map in memory…