Zig already has ranged integer types; however, the range is required to be a signed or unsigned power of 2. This proposal is for generalizing them further, to allow any arbitrary range.
```zig
comptime {
    assert(i32 == @Int(-1 << 31, 1 << 31));
    assert(u32 == @Int(0, 1 << 32));
    assert(u0 == @Int(0, 1));
    assert(noreturn == @Int(0, 0));
}
```
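The identities above can be sanity-checked by counting representable values. Here is a small Python sketch (not Zig — the `bits_needed` helper is hypothetical, introduced only for illustration) of the information content of the proposed `@Int(lo, hi)`, treating the range as half-open as the assertions above do:

```python
import math

def bits_needed(lo, hi):
    """Minimal bits to store one value of the proposed @Int(lo, hi),
    where hi is one past the largest value: ceil(log2(hi - lo))."""
    count = hi - lo
    return 0 if count <= 1 else math.ceil(math.log2(count))

assert bits_needed(-1 << 31, 1 << 31) == 32  # i32: 2**32 values
assert bits_needed(0, 1 << 32) == 32         # u32: 2**32 values
assert bits_needed(0, 1) == 0                # u0: a single value, zero bits
```

`@Int(0, 0)` denotes an empty range, which is why it is equated with `noreturn`: no value of the type can exist.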
----
Let's consider some reasons to do this:
One common practice for C developers is to use `-1` or `MAX_UINT32` (and related constants) as an *in-band* indicator of metadata. For example, the stage1 compiler uses a `size_t` field to indicate the ABI size of a type, but the value `SIZE_MAX` indicates that the size is not yet computed.
In Zig we want people to use [Optionals](https://ziglang.org/documentation/master/#Optionals) for this, but there's a catch: the in-band special value uses less memory for the type. In Zig on 64-bit targets, `@sizeOf(usize) == 8` and `@sizeOf(?usize) == 16`. That's a huge cost to pay for something that could take up 0 bits of information, if you are willing to give up a single value inside the range of a `usize`.
With ranged integers, this could be made type-safe:
```zig
const AbiSize = @Int(0, (1 << usize.bit_count) - 1);
const MyType = struct {
    abi_size: ?AbiSize,
};

var my_type: MyType = undefined;

test "catching a bug" {
    var other_thing: usize = 1234;
    my_type.abi_size = other_thing; // error: expected @Int(0, 18446744073709551615), found usize
}
```
Now, not only do we have the Optionals feature of zig protecting against accidentally using a very large integer when it is supposed to indicate `null`, but we also have the compile error helping out with range checks. One can choose to deal with the larger ranged value by handling the possibility, and returning an error, or with `@intCast`, which inserts a handy safety check.
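The safety check that `@intCast` would insert can be modeled in a few lines of Python (a sketch, not Zig — `checked_cast` is a hypothetical helper standing in for the proposed behavior): the cast succeeds only when the value lies inside the target's half-open range, and panics (here, raises) otherwise.

```python
def checked_cast(value, lo, hi):
    """Model of @intCast into @Int(lo, hi): runtime range check."""
    if not (lo <= value < hi):
        raise ValueError("value out of range for target type")
    return value

# Casting into the AbiSize range @Int(0, 2**64 - 1):
assert checked_cast(1234, 0, 2**64 - 1) == 1234

# The one reserved value is rejected, so it can never masquerade as a size:
try:
    checked_cast(2**64 - 1, 0, 2**64 - 1)
except ValueError:
    pass  # safety check fired, as desired
```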
How about if there are 2 special values rather than 1?
```zig
const N = union(enum) {
    special1,
    special2,
    normal: @Int(0, (1 << 32) - 2),
};
```
Here, the size of `N` would be 4 bytes: the payload range leaves exactly two 32-bit patterns unused, and the two special tags can occupy them.
----
Let's consider another example, with enums.
Enums allow defining a set of possible values for a type:
```zig
const E = enum {
    one,
    two,
    three,
};
```
There are 3 possible values of this type, so Zig chooses `u2` as the tag type. That still takes 1 byte to represent, wasting 6 bits. If you wrap it in an optional, that becomes 16 bits to represent something that, according to information theory, requires only 2 bits. And Zig's hands are tied: because each field currently requires ABI alignment, each byte is necessary.
If #3802 is accepted and implemented, and the `is_null` bit of optionals becomes `align(0)`, then `?E` can remain 1 byte, and `?E` in a struct with `align(0)` will take up 3 bits.
However, consider if the enum were allowed to choose a ranged integer type. It would choose `@Int(0, 3)`. Wrapped in an optional, it could use the integer value 3 as the null state. Then `?E` in a struct will take up 2 bits.
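The counting argument above is easy to verify (Python sketch, not Zig): `E` has 3 states, `?E` adds one null state, and 4 states need exactly 2 bits.

```python
import math

enum_states = 3                      # one, two, three
optional_states = enum_states + 1    # plus null
bits = math.ceil(math.log2(optional_states))
assert bits == 2
# Versus today's byte-aligned layout: 16 bits for ?E on its own.
assert 16 - bits == 14               # bits of pure padding eliminated
```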
Again, assuming #3802 is implemented, Zig would even be able to "flatten" several enums into the same integer:
```zig
const Mode = enum { // 2-bit tag type
    Debug,
    ReleaseSafe,
    ReleaseFast,
    ReleaseSmall,
};

const Endian = enum { // 1-bit tag type
    big,
    little,
};

pub const AtomicOrder = enum { // 3-bit tag type
    Unordered,
    Monotonic,
    Acquire,
    Release,
    AcqRel,
    SeqCst,
};

pub const AtomicRmwOp = enum { // 4-bit tag type
    Xchg,
    Add,
    Sub,
    And,
    Nand,
    Or,
    Xor,
    Max,
    Min,
};

const MyFancyType = struct {
    mode: Mode align(0),
    endian: Endian align(0),
    atomic_order: AtomicOrder align(0),
    op: AtomicRmwOp align(0),
};
```
If you add up all the bits of the tag types, it comes out to 10, meaning that the size of MyFancyType would have to be 2 bytes. However, with ranged integers as tag types, Zig would be able to flatten out all the enum tag values into one byte. In fact, there are only 21 total tags here, leaving room for 235 more before MyFancyType would have to gain another byte of size.
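The arithmetic in the paragraph above checks out (Python sketch, not Zig): per-enum tag widths sum to 10 bits, while the total number of distinct tags is only 21.

```python
# Number of tags per enum, from the declarations above.
tag_counts = {"Mode": 4, "Endian": 2, "AtomicOrder": 6, "AtomicRmwOp": 9}

# Bits per enum when each gets its own power-of-two tag type.
bits = sum(max(1, (n - 1).bit_length()) for n in tag_counts.values())
assert bits == 10                    # 2 + 1 + 3 + 4: forces 2 bytes

# Total distinct tags across all four enums.
total_tags = sum(tag_counts.values())
assert total_tags == 21
assert 256 - total_tags == 235       # spare values left in one byte
```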
----
This proposal would solve #747. Peer type resolution of comptime ints would produce a ranged integer:
```zig
export fn foo(b: bool) void {
    // master branch: error: cannot store runtime value in type 'comptime_int'
    const x = if (b) -10 else 100;
    // proposal: @typeOf(x) == @Int(-10, 101)
}
```
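The resolution rule can be modeled as "smallest half-open range containing every peer value" (a Python sketch of the proposed semantics; `resolve_range` is a hypothetical name):

```python
def resolve_range(*values):
    """Peer type resolution for comptime ints under this proposal:
    the tightest half-open range covering all peer values."""
    return (min(values), max(values) + 1)

assert resolve_range(-10, 100) == (-10, 101)  # i.e. @Int(-10, 101)
assert resolve_range(0, 255) == (0, 256)      # i.e. u8
```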
----
With optional pointers, Zig has an optimization to use the zero address as the null value. The `allowzero` property can be used to indicate that the address 0 is valid. This is effectively treating the address as a ranged integer type! This optimization for optional pointers could now be described in userland types:
```zig
const PointerAddress = @Int(1, 1 << usize.bit_count);
const Pointer = ?PointerAddress;
comptime {
    assert(@sizeOf(PointerAddress) == @sizeOf(usize));
    assert(@sizeOf(Pointer) == @sizeOf(usize));
}
```
One possible extension to this proposal would be to allow pointer types to override the address integer type. Rather than `allowzero`, which is single-purpose, they could do something like this:
```zig
comptime {
    assert(*addrtype(usize) i32 == *allowzero i32);
    assert(*addrtype(@Int(1, 1 << usize.bit_count)) i32 == *i32);
}
```
This would also introduce type safety to using more than just 0x0 as a special pointer address, which is perfectly acceptable on most hosted operating systems and is typically set up in freestanding environments as well. Typically, the entire first page of memory is unmapped, and the virtual address space is often limited to 48 bits, making `@Int(os.page_size, 1 << 48)` a good default address type for pointers on many targets! Combining this with the fact that pointers also have alignment bits to play with, this would give Zig's type system the ability to pack a lot of data into pointers annotated with `align(0)`.
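To get a feel for how many bits that frees up, here is a rough count in Python (all parameters are assumptions for illustration: a 4096-byte unmapped first page, a 48-bit virtual address space, and an 8-byte-aligned pointee):

```python
import math

page_size = 4096   # assumed: first page unmapped
addr_bits = 48     # assumed: 48-bit virtual address space
alignment = 8      # assumed: pointee alignment, low 3 bits always zero

# Distinct valid addresses: aligned addresses in [page_size, 2**48).
valid_addresses = ((1 << addr_bits) - page_size) // alignment
bits_needed = math.ceil(math.log2(valid_addresses))
assert bits_needed == 45       # vs. 64 bits in a raw usize
assert 64 - bits_needed == 19  # bits a packed align(0) pointer could free up
```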
----
What about two's complement wrapping math operations? Two's complement only works on power-of-two integer types, so wrapping math operations would not be allowed on non-power-of-two integer types. Compile error.
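The reason is that hardware gives reduction modulo 2^n for free, while an arbitrary range would need an explicit (and costly) modulo. A Python sketch of the difference:

```python
def wrap(value, modulus):
    """The reduction a wrapping add would need for a 'modulus'-value range."""
    return value % modulus

# u8 +%: mod 256 -- what two's complement hardware does for free.
assert wrap(200 + 100, 2**8) == 44

# A hypothetical @Int(0, 100) "wrap" would be mod 100 -- a genuinely
# different result, and a division the hardware doesn't do for free.
assert wrap(200 + 100, 100) == 0
```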