Why does readSliceAll corrupt my data, but allocRemaining works? (new IO interface)

Hi,

I’m trying to get used to the new IO.Reader system in Zig 0.15.
I’m reading the pixel data of a FITS astronomy image (BITPIX = -32, so 32-bit float big-endian).
I have two versions of my file-reading code.


1

This is the simple version that allocates a buffer and reads directly into it:

pub fn readImageData(
    allocator: std.mem.Allocator,
    file: std.fs.File,
    hdu: PrimaryHDU,
) ![]u8 {
    const buf = try allocator.alloc(u8, hdu.data_size);
    errdefer allocator.free(buf);

    try file.seekTo(hdu.data_offset);

    var reader = file.reader(&.{});
    try reader.interface.readSliceAll(buf);

    return buf;
}

This reads the correct number of bytes, but the float32 values become garbage after decoding:

=== Image Data Analysis ===
Pixel type: f32
Total pixels: 2334784
Min value: 0.000000
Max value: 78012644000000000000000000000000000.000000
Mean value: 34545638973025680000000000000.000000

These numbers are impossible for this type of image, so something is off in the read.


2

This second version works but allocates twice, which should not happen (I assume):

pub fn readImageData(
    allocator: std.mem.Allocator,
    file: std.fs.File,
    hdu: PrimaryHDU,
) ![]u8 {
    var buf = try allocator.alloc(u8, hdu.data_size);
    errdefer allocator.free(buf);

    try file.seekTo(hdu.data_offset);
    var file_reader = file.reader(buf);
    const reader = &file_reader.interface;

    try file_reader.seekTo(hdu.data_offset);
    allocator.free(buf);

    buf = try reader.allocRemaining(allocator, .unlimited);

    return buf;
}

This version is not ideal (allocating twice, not really how IO should be used),
but the pixel values become correct:

=== Image Data Analysis ===
Pixel type: f32
Total pixels: 2334784
Min value: 0.000000
Max value: 64340.543000
Mean value: 893.507832

So my float decoding is fine; the problem is how the bytes are read.

I am still trying to understand the new IO interface, but I am struggling a lot.


Does anyone see what I am missing here, or what the ideal approach is in this case?

Thank you very much.

Have you tried this?

pub fn readImageData(
    allocator: std.mem.Allocator,
    file: std.fs.File,
    hdu: PrimaryHDU,
) ![]u8 {
    const buf = try allocator.alloc(u8, hdu.data_size);
    errdefer allocator.free(buf);

    var reader = file.reader(&.{});
    try reader.seekTo(hdu.data_offset);

    try reader.interface.readSliceAll(buf);

    return buf;
}

i.e. seek using reader instead of file.

The reader keeps its own seek offset, separate from the handle’s seek offset (which is managed by the OS/host system), and it prefers positional file operations over streaming operations when possible. So if the handle’s offset is adjusted before the reader is created, the reader may still read from the very beginning of the file.

Alternatively, you can create a streaming reader using file.readerStreaming(), which will respect the file handle’s offset by virtue of only using streaming file operations.
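
A minimal sketch of the streaming alternative, assuming Zig 0.15's std.fs.File.readerStreaming. The function name readAt and its offset/len parameters are hypothetical stand-ins for the hdu fields in the question; here the seek is done on the file handle, as in the original code, and the streaming reader continues from wherever the handle currently points.

```zig
const std = @import("std");

/// Hypothetical helper: reads `len` bytes starting at `offset`,
/// seeking on the file handle itself rather than on the reader.
pub fn readAt(
    allocator: std.mem.Allocator,
    file: std.fs.File,
    offset: u64,
    len: usize,
) ![]u8 {
    const buf = try allocator.alloc(u8, len);
    errdefer allocator.free(buf);

    // Move the OS-level offset of the handle.
    try file.seekTo(offset);

    // A streaming reader only issues streaming reads, so it continues
    // from the handle's current offset instead of tracking its own.
    var reader = file.readerStreaming(&.{});
    try reader.interface.readSliceAll(buf);

    return buf;
}
```

Seeking on the reader, as in the snippet above, is probably the better default; the streaming variant is mostly useful when the handle's offset has already been set somewhere else.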

Thanks, that worked. I don’t know why I didn’t try this. Always something to learn!

Best regards

You initialize the reader with buf, then you free(buf), then you use the reader. Isn’t that a use-after-free?

I don’t know. I kept trying things until I ended up with this reader setup, because I could not get it working. But the solution from @castholm works perfectly.

By default, file readers and writers track their own position and use positional reads/writes.

An alternative solution would be to use readerStreaming, which uses the global file position; that is the position your file.seekTo call was changing.

Thanks. What would I do if I didn’t know hdu.data_size?

You would need some way to either get the size, or to know when you have reached the end of the hdu data.
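
For the simplest case, where the data is known to run to the end of the stream, a sketch along these lines might do, using the same allocRemaining call as in the question but with a cap instead of .unlimited. The name readRest and the max_bytes parameter are made up for illustration, and this would not be enough if more data follows the pixel data in the same file.

```zig
const std = @import("std");

/// Hypothetical helper: reads everything left in `reader`, failing with
/// error.StreamTooLong if the remainder exceeds the `max_bytes` cap.
pub fn readRest(
    allocator: std.mem.Allocator,
    reader: *std.Io.Reader,
    max_bytes: usize,
) ![]u8 {
    // The caller owns the returned slice and must free it.
    return reader.allocRemaining(allocator, .limited(max_bytes));
}
```

The cap is there so that a truncated or unexpected file cannot make the program allocate without bound.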

Something I should have mentioned is that your code can just take a *Io.Reader and call reader.discard(.limited(hdu.data_size)).
This lets you read from more than just files, which is wonderful for testing.
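
A rough sketch of that idea, with some assumptions: the function takes plain data_offset and data_size values instead of the PrimaryHDU type (which is not shown in this thread), it skips data_offset bytes to reach the pixel data, and it uses discardAll rather than discard so that a short stream is reported as an error instead of a smaller count.

```zig
const std = @import("std");

/// Skips `data_offset` bytes, then reads exactly `data_size` bytes.
/// Works with any *std.Io.Reader, not only file readers.
pub fn readImageData(
    allocator: std.mem.Allocator,
    reader: *std.Io.Reader,
    data_offset: usize,
    data_size: usize,
) ![]u8 {
    const buf = try allocator.alloc(u8, data_size);
    errdefer allocator.free(buf);

    // Skip everything before the pixel data; this fails with
    // error.EndOfStream if the stream is shorter than `data_offset`.
    try reader.discardAll(data_offset);
    try reader.readSliceAll(buf);

    return buf;
}

test "reads pixel bytes from an in-memory stream" {
    // A fixed reader serves bytes straight from a slice, no file needed.
    var reader: std.Io.Reader = .fixed("HEADERpixels");
    const got = try readImageData(std.testing.allocator, &reader, 6, 6);
    defer std.testing.allocator.free(got);
    try std.testing.expectEqualStrings("pixels", got);
}
```

With a real file, the caller would create the file reader once and pass &file_reader.interface.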
