Strange behavior (possible UB?) in specific situations

In a repo of mine (GitHub - mgord9518/squashfuse-zig: Idomatic Zig bindings for squashfuse), I recently have vendored liblzma into the build.zig to allow easier cross-compilation, but this caused my tests to fail with an apparent libxz error of 8 (LZMA_DATA_ERROR) on linux-musl targets and in only one specific scenario that I’ve been able to find.

Basically, if I open up a SquashFS image with zlib or zstd compression, close it, then open one with xz compression, it will either fail to open or fail to read a file in the image because liblzma believes the data is corrupt. I’ve narrowed this further to notice that if I don’t deinit the SquashFs image, it will work as intended (my deinit function currently just wraps the internal squashfuse destroy function). If I open the xz compressed image first, everything also seems to work.

I believe this might be relating to malloc due to the fact that the internal squashfuse functions that manifest this behavior appear to just wrap free. Even more bizzare, linux-gnu targets appear to work as intended (I’ve tested on both x86_64 and aarch64).

I’ve been on this shit for days and can’t seem to make any further progress. If anyone would like to help investigate, my test file is under squashfuse-zig/lib/test.zig and the repo my bindings are for is GitHub - vasi/squashfuse: FUSE filesystem to mount squashfs archives. The destroy function is located at squashfuse/fs.c. In order to replicate this behavior, it must be built with zig build test -Duse-zig-xz=false

This is me quickly skimming through code and noticing some things with zero context as for what Squashfuse is or does. I have no idea if this has any relation to the problem you’re facing.

Your SquashFs.deinit function is implemented like this:

pub fn deinit(sqfs: *SquashFs) void {
    c.sqfs_table_destroy(&sqfs.internal.id_table);
    c.sqfs_table_destroy(&sqfs.internal.frag_table);

    if (c.sqfs_export_ok(&sqfs.internal) == c.SQFS_OK) {
        c.sqfs_table_destroy(&sqfs.internal.export_table);
    }

    c.sqfs_cache_destroy(&sqfs.internal.md_cache);
    c.sqfs_cache_destroy(&sqfs.internal.data_cache);
    c.sqfs_cache_destroy(&sqfs.internal.frag_cache);
    c.sqfs_cache_destroy(&sqfs.internal.blockidx);

    sqfs.file.close();
}

The original sqfs_destroy function is implemented like this:

void sqfs_destroy(sqfs *fs) {
	sqfs_table_destroy(&fs->id_table);
	sqfs_table_destroy(&fs->frag_table);
	if (sqfs_export_ok(fs))
		sqfs_table_destroy(&fs->export_table);
	sqfs_cache_destroy(&fs->md_cache);
	sqfs_cache_destroy(&fs->data_cache);
	sqfs_cache_destroy(&fs->frag_cache);
	sqfs_cache_destroy(&fs->blockidx);
}

These two implementations are not equivalent. Do you see what’s different?

Hint: SQFS_OK is defined as part of this enumeration:

typedef enum {
	SQFS_OK,
	SQFS_ERR,
	SQFS_BADFORMAT,		/* unsupported file format */
	SQFS_BADVERSION,	/* unsupported squashfs version */
	SQFS_BADCOMP,		/* unsupported compression method */
	SQFS_UNSUP			/* unsupported feature */
} sqfs_err;

(I also don’t see why your Zig code doesn’t just call sqfs_destroy directly instead of attempting to reimplement the same sequence of steps, but maybe there’s some detail I’m missing.)

Also, while unlikely to be related to your test failures, another issue are these pointer casts:

c.sqfs_version(
    &sqfs.internal,
    @ptrCast(&sqfs.version.major),
    @ptrCast(&sqfs.version.minor),
);

std.SemanticVersion fields are usize, while sqfs_version writes its values to pointers of c_int, which makes these casts non-portable.

1 Like

Really good catches, I should add that in testing I actually did notice my accidental inversion checking sqfs_export_ok and fixed it (but haven’t yet pushed it) because I was at work using my phone. I made the following correction:

if (c.sqfs_export_ok(&sqfs.internal) != 0) {
...
}

After testing the current deinit function vs just calling sqfs_image_destroy in the deinit function (with and without closing sqfs.file, the behavior remains. As far as why, I’m slowly trying to port the functionality itself to Zig making it an actual implementation instead of just bindings.

The version setting is suspicious, I’ll definitely fix that and see if it helps.

For context, Squashfuse is a library to read SquashFS images, get file metadata, extract, etc.

1 Like

Currently, if anyone wants to help yoy, they would have to open the github, understand which objects and which functions are needes to perform the steps that you describe, and then examine the implementation of said functions. It would be easir to help you if you provided a minimal reproducible example.
You description sounds like a use-after-free. Just a guess, but it sounds that when closing the image the first time, it releases something that would be needed by the following open calls. It could be releasing a file handle, or a buffer, for example.

1 Like

In a few days I’ll see if I can replicate it on a smaller scale, but being that the behavior somehow relates to squashfuse I’m not really sure if there’s much I can do besides making some stripped down version of the bindings

I’ll definitely look into UAF and see if I can narrow it down, but shouldn’t that cause a crash?

Not always, sadly. A lot of times it fails silently.