In the C world, is using a negative value as an unsigned integer normal?

I’m using the miniz C library in my Zig project and I encountered a strange thing. Here are some code pieces from the C library:

    typedef uint32_t mz_uint;

    enum
    {
        MZ_NO_COMPRESSION = 0,
        MZ_BEST_SPEED = 1,
        MZ_BEST_COMPRESSION = 9,
        MZ_UBER_COMPRESSION = 10,
        MZ_DEFAULT_LEVEL = 6,
        MZ_DEFAULT_COMPRESSION = -1
    };

    MINIZ_EXPORT mz_bool mz_zip_writer_add_mem(mz_zip_archive *pZip, const char *pArchive_name, const void *pBuf, size_t buf_size, mz_uint level_and_flags);

    b = mz_zip_writer_add_mem(&zip_archive, "test.txt", "foo", 3, MZ_DEFAULT_COMPRESSION);

You can see that MZ_DEFAULT_COMPRESSION is used as an mz_uint.

This might be okay in C code, but when the library is pulled in as a dependency of a Zig project, the Zig compiler refuses to compile it, because -1 can’t be used as a u32.
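
The underlying coercion Zig refuses can be shown without any of the miniz specifics (level is just an illustrative name):

    const level: u32 = -1;
    // error: type 'u32' cannot represent integer value '-1'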

This is not a development blocker; I can use MZ_BEST_COMPRESSION for now. But it is interesting.

Negative values assigned to unsigned integers are treated by C compilers as the maximum possible value for that type plus one (2^32 in this case) minus the magnitude of the number you typed in, so -1 becomes 2^32 - 1.
Since unsigned arithmetic in C wraps around with no overflow checks, adding a “negative” unsigned number (such as -5, stored as 2^32 - 5) to a positive unsigned number (such as 10) gives a result that makes sense (such as 5).
You can imitate this behaviour in Zig by using ~@as(u32, 0) (which is equivalent to (uint32_t)-1 in C) and the wrapping addition/subtraction operators +% and -%.
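
A minimal sketch of that idea (the names are just for illustration):

    const std = @import("std");

    pub fn main() void {
        // All bits set, i.e. what (uint32_t)-1 produces in C.
        const minus_one: u32 = ~@as(u32, 0);

        // "Negative five" as an unsigned value: 2^32 - 5.
        const minus_five: u32 = @as(u32, 0) -% 5;

        // Wrapping addition gives C's modulo-2^32 behaviour: 10 + (2^32 - 5) wraps to 5.
        std.debug.print("{d} {d} {d}\n", .{ minus_one, minus_five, @as(u32, 10) +% minus_five });
    }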

2 Likes

Your link is broken FYI.

The problem is that the enum is anonymous, so it’s essentially the same as #define MZ_DEFAULT_COMPRESSION -1. In C, the size and signedness of enums are implementation-defined (except that the type must be able to represent all enum values). Had the enum been defined like typedef enum mz_compression_levels { ... } mz_compression_levels and the function taken mz_compression_levels level_and_flags, Zig would have no problem using the translated code.

C and Zig have fundamentally different ideas about implicit conversions, so there’s no way around this without changing the original C header. You will have to use @bitCast(MZ_DEFAULT_COMPRESSION) in Zig when using that enum constant where an unsigned integer is expected.
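
Roughly what that looks like at a call site (a sketch; the constant is declared locally here with an assumed signed 32-bit type instead of coming from @cImport):

    const std = @import("std");

    // Stand-in for the translated constant; assume it ends up as a signed 32-bit value.
    const MZ_DEFAULT_COMPRESSION: i32 = -1;

    pub fn main() void {
        // Reinterpret the signed bits as the unsigned parameter type (mz_uint is u32).
        const level: u32 = @bitCast(MZ_DEFAULT_COMPRESSION);
        std.debug.print("level = {d} (0x{x})\n", .{ level, level });
    }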

2 Likes

It’s usually a shortcut for ‘set all bits to 1’ without having to worry about the actual width of the integer type, and the underlying type of enum in C is poorly defined anyway.
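
In Zig the same width-agnostic idiom can be written generically (a small sketch; allOnes is just a made-up name):

    const std = @import("std");

    // Mirrors the C "(T)-1" idiom: all bits set, for any unsigned integer width.
    fn allOnes(comptime T: type) T {
        return ~@as(T, 0);
    }

    pub fn main() void {
        std.debug.print("{d} {d}\n", .{ allOnes(u16), allOnes(u32) });
    }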

Some compilers (like GCC or Clang) have a -Wsign-conversion warning for implicit signed/unsigned conversions, and at least in my own C projects I have that warning enabled (unfortunately it’s not in any of the common warning sets like -Wall -Wextra).

2 Likes

I would use std.math.maxInt(u32) instead of -1 when defining this kind of enum. It produces the same bit pattern for unsigned integers.
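
A quick comptime check of that equivalence (just to illustrate):

    const std = @import("std");

    comptime {
        // Same bit pattern as C's (uint32_t)-1: all 32 bits set.
        std.debug.assert(std.math.maxInt(u32) == ~@as(u32, 0));
        std.debug.assert(std.math.maxInt(u32) == 0xFFFF_FFFF);
    }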

1 Like

I wanted to say that the closest equivalent in Zig would be:

const x: u32 = 0 -% 1;

But that doesn’t actually compile, because Zig seems to turn this into const x: u32 = -1:

error: type 'u32' cannot represent integer value '-1'

Zig’s expression resolution really works in mysterious ways…

Both 0 and 1 are comptime_int, so the result of 0 -% 1 is a comptime_int. A comptime_int is arbitrary-precision, so it can’t underflow and the expression simply evaluates to -1. Only then is an attempt made to coerce that -1 to u32, which fails.
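
Forcing one operand to u32 first makes the wrapping subtraction happen in u32, so this compiles (a small sketch):

    const std = @import("std");

    pub fn main() void {
        // 0 -% 1 in u32 arithmetic wraps around to 4294967295.
        const x = @as(u32, 0) -% 1;
        std.debug.print("{d}\n", .{x});
    }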

2 Likes

I was going to counter with:

    const x: u32 = ~0;

but lol, that doesn’t compile either:

main.zig:4:20: error: unable to perform binary not operation on type 'comptime_int'
    const x: u32 = ~0;

(Not that people like Python here much I guess, but you can do this in Python even though Python’s ints are bigints too.)

1 Like

Someone attempted to make it compile just recently. But it turns out to be a bad idea.

1 Like

const x = ~@as(u32, 0);

Has worked for me.
I think the issue is that 0 is assumed to be a comptime_int by default, and Zig can’t do a bitwise not on a comptime_int because it doesn’t have a specific number of bits.

4 Likes

I believe the following is more readable:

const x = std.math.maxInt(u32);

2 Likes

I don’t know, I think it depends. If x is a bitmask, I would argue that expressing it with a bitwise operation communicates intent more clearly.
In the general case I think it’s pretty subjective. Personally, I prefer writing it out myself instead of calling a function when they take the same amount of space, because I can verify the code is correct by looking at it instead of going to the function definition.

1 Like