Hi, I’ve been wanting to give Zig a try for a while. I need a CLI tool to read a range of cells from a .ods spreadsheet and can’t find anything already available. I usually make CLI tools with Rust, but I thought this would be a good little project to get my feet wet with Zig. .ods files are just zip archives, and I saw that zip support was just added to the standard library. The problem I’m having is the lack of documentation. I can’t figure out how to read a file from a zip archive using the standard library. This is what I have…
I know it’s not right. std.zip.Iterator needs comptime SeekableStream: type. I’m not sure what that means, and I can’t figure it out from the documentation. Also, I don’t see an entry.name: []u8, but I do see entry.filename_len: u32?
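For anyone else stuck on the `comptime SeekableStream: type` part: in Zig a generic type is usually just a function that takes a type at compile time and returns a new struct type. Here is a minimal sketch of that pattern. `Counter` is a made-up name for illustration and has nothing to do with std.zip; std.zip.Iterator appears to follow the same shape, with the stream type as the parameter.

```zig
const std = @import("std");

/// Hypothetical generic type: calling `Counter(u32)` returns a struct type
/// specialised for u32, the same way a type function is given a stream type.
fn Counter(comptime T: type) type {
    return struct {
        value: T,

        const Self = @This();

        fn init(start: T) Self {
            return .{ .value = start };
        }

        /// Increments the stored value and returns the new value.
        fn next(self: *Self) T {
            self.value += 1;
            return self.value;
        }
    };
}
```

When you already have a value and only need its type, `@TypeOf(value)` gives you something you can pass as the `comptime T: type` argument.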
Thanks for the replies. I got a little farther with both of your recommendations, but I’m still on the struggle bus. I’ll admit I’m not an experienced programmer, just a hobbyist, but I really want to figure this out.
It seems like the std.zip library is incomplete at this point, as it lacks a lot of basic functionality. I figured out how to find the file I need in the archive, but I’m trying to figure out how to actually decompress it into memory so I can parse it. I think I figured it out, but I don’t know what kind of writer I need to write to memory instead of a file.writer. I allocated a u8 buffer of entry.uncompressed_size bytes with the page allocator to store the data, but I don’t know how to get the data from std.zip.decompress into it, if that makes sense.
One should always be wary of drawing this conclusion when learning a new language. It might be correct, but it blocks the process of trying to figure out how to do what you want to do, and it often isn’t correct.
Readers and Writers are defined in std.io. std.zip takes any kind of Writer, so with a bit more exploring, I’m confident you can figure out the type that you’ll need.
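As a hint in that direction, here is a minimal sketch of a writer that targets memory instead of a file. It assumes the std.io API as it looked around Zig 0.13 (`std.io.fixedBufferStream`); `readAllInto` is a hypothetical helper, not something from std.zip. The writer it creates is the kind of value you can hand to any function that takes `writer: anytype`.

```zig
const std = @import("std");

/// Hypothetical helper: drains `reader` into `dest` through an in-memory
/// writer and returns the part of `dest` that was actually filled.
/// Fails with error.NoSpaceLeft if `dest` is too small.
fn readAllInto(reader: anytype, dest: []u8) ![]u8 {
    // A fixed buffer stream turns a plain byte slice into a Writer.
    var fbs = std.io.fixedBufferStream(dest);
    const writer = fbs.writer();

    var chunk: [64]u8 = undefined;
    while (true) {
        const n = try reader.read(&chunk);
        if (n == 0) break; // end of stream
        try writer.writeAll(chunk[0..n]);
    }
    return fbs.getWritten();
}
```

If the final size is not known up front, a `std.ArrayList(u8)` and its `writer()` is the usual growable alternative.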
You need two streams. One will read the metadata from the zip file, the other one you use to do the actual reading.
This is modified from my own codebase, untested:
    // Targets the std.zip / std.io API roughly as of Zig 0.13.
    const std = @import("std");

    fn readZipFile(
        allocator: std.mem.Allocator,
        file: std.fs.File,
    ) !void {
        const seekable = file.seekableStream();
        var zipIterator = try std.zip
            .Iterator(@TypeOf(seekable))
            .init(seekable);
        // zipIterator created a copy of seekable.
        // We can use seekable for ourselves.
        while (true) {
            const maybeNext = try zipIterator.next();
            if (maybeNext) |entry| {
                // Zip allows 0-sized entries, I think they're folders.
                if (entry.uncompressed_size == 0) continue;
                // Only DEFLATE is handled here; stored entries are skipped.
                if (entry.compression_method != .deflate) continue;
                // The data starts after the local header, its filename and
                // its extra field. Take those lengths from the local header,
                // since they can differ from the central directory entry.
                try seekable.seekTo(entry.file_offset);
                const header = try file.reader().readStructEndian(
                    std.zip.LocalFileHeader,
                    .little,
                );
                const totalOffset = entry.file_offset +
                    @sizeOf(std.zip.LocalFileHeader) +
                    header.filename_len +
                    header.extra_len;
                try seekable.seekTo(totalOffset);
                // file now points to the beginning of the
                // compressed data stream.
                // File readers are unbuffered. You probably want some buffering.
                var baseReader = std.io.bufferedReader(file.reader());
                var decompressor = std.compress
                    .flate
                    .decompressor(baseReader.reader());
                // decompressor.reader() is now a reader that spits out decompressed data.
                // You don't need to dump it all into memory, you can just read it in pieces,
                // but here is how you decompress the entire file.
                const buffer = try allocator.alloc(u8, @intCast(entry.uncompressed_size));
                defer allocator.free(buffer);
                try decompressor.reader().readNoEof(buffer);
                // buffer now holds the entire decompressed data.
            } else break; // No more files in the archive.
        }
    }
The std.zip module seems to be designed to extract a zip file to a directory. Entry.extract accepts a std.fs.Dir, and you can’t just pass in a std.io.AnyWriter. As a result, trying to get it to write to an in-memory buffer is tricky. When I tried to build a working example from what you’ve provided, I kept hitting error.ZipDeflateTruncated from the decompress function.
Instead I tried directly using std.compress.flate.decompressor like in @LucasSantos91’s code. That may work if the underlying files are indeed compressed with DEFLATE. That might help you continue your journey.
Ya, that’s what I meant when I said it seemed incomplete. I figured there should be an entry.decompress geared towards what I was attempting. Having to calculate offsets seems a little convoluted, but I get that it’s a new library, so I’m not complaining. It looks like you can use std.zip.decompress for a single file, because Entry.extract calls it the way I was trying to call it, except it passes in file.writer(). It says it can take any writer, but when I look at the documentation for std.io there really isn’t any information or explanation on what the different writers are for or how to use them, and I am not familiar with the concept.
That error seems to mean there is still data in the buffer? Maybe my offsets are off? I’m not sure why br.start and br.end are expected to be equal?
    .deflate => {
        var br = std.io.bufferedReader(reader);
        var decompressor = std.compress.flate.decompressor(br.reader());
        while (try decompressor.next()) |chunk| {
            try writer.writeAll(chunk);
            hash.update(chunk);
            total_uncompressed += @intCast(chunk.len);
            if (total_uncompressed > uncompressed_size)
                return error.ZipUncompressSizeTooSmall;
        }
        if (br.end != br.start)
            return error.ZipDeflateTruncated;
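A possible explanation for the br.start / br.end check, going by the 0.13-era source: Entry.extract wraps the file reader in a std.io.limitedReader capped at compressed_size before it calls decompress, so the buffered reader can never pull in bytes that belong to the next entry. If you pass a raw file reader instead, the buffer fills past the end of your entry and the check trips even though the offsets are fine. The sketch below shows only that buffering effect, with plain bytes and no deflate involved; `leftoverAfterEntry` is a hypothetical helper.

```zig
const std = @import("std");

/// Hypothetical helper: reads `entry_len` bytes (at most 64) through a
/// buffered reader and reports how many bytes are still sitting unread in
/// the buffer afterwards, i.e. `br.end - br.start`.
fn leftoverAfterEntry(data: []const u8, entry_len: usize, limit: bool) !usize {
    var src = std.io.fixedBufferStream(data);
    var out: [64]u8 = undefined;
    if (limit) {
        // Cap the underlying reader at the size of the entry.
        var limited = std.io.limitedReader(src.reader(), entry_len);
        var br = std.io.bufferedReader(limited.reader());
        try br.reader().readNoEof(out[0..entry_len]);
        return br.end - br.start;
    } else {
        // No cap: the buffer greedily reads whatever follows the entry too.
        var br = std.io.bufferedReader(src.reader());
        try br.reader().readNoEof(out[0..entry_len]);
        return br.end - br.start;
    }
}
```

So when calling std.zip.decompress yourself, wrapping the reader in `std.io.limitedReader(file.reader(), entry.compressed_size)` first is probably what makes that check pass.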
Now that I’m looking for it, decompress just calls std.compress.flate.decompressor. This seems to work, but now I’m confused about why decomp_data can be const. Isn’t it mutated by try decompressor.reader().readNoEof(decomp_data);?
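On the const question, a short sketch of what is going on: a slice is a pointer plus a length, and `const` on the variable only freezes that pair. Whether the bytes can be written is decided by the slice type, `[]u8` versus `[]const u8`. `fillDemo` is a hypothetical function name; the variable is called decomp_data only to match the post above.

```zig
const std = @import("std");

/// Hypothetical demo: allocates four bytes and writes to them through a
/// `const` slice variable. The caller owns the returned memory.
fn fillDemo(allocator: std.mem.Allocator) ![]u8 {
    // `const` fixes the slice itself (pointer and length), not the bytes.
    const decomp_data = try allocator.alloc(u8, 4);
    @memset(decomp_data, 'z'); // allowed: the element type is u8, not const u8
    decomp_data[0] = 'a'; // also allowed, for the same reason
    // decomp_data = decomp_data[0..2]; // this would NOT compile: reassigns the const
    return decomp_data;
}
```

readNoEof takes a `[]u8` and writes through the pointer, so it never needs to change the slice variable itself.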