*T and [*]T values can't be coerced into each other, but one can be subtracted from the other. Intended?

The rule that *T values can’t be coerced into [*]T can be bypassed, as the following code shows.

pub fn main() void {
    const a: [3]u8 = .{1, 2, 3};
    var ps = &a[0];
    var pm: [*]const u8 = &a;
    
    ps = @ptrCast(pm);
    pm = @ptrCast(ps); // can be bypassed
    
    const d = ps - pm;
    
    pm += ps - pm; // bypassed
    
    _ = &ps;
    _ = &pm;
    _ = &d;
}

What is the reason why they can not be coerced into each other?

The above code actually has a bug: (ps - pm) might result in a bad usize value if ps < pm. (It is not a bug, see below.)

Currently, ps - pm doesn’t panic at run time if ps < pm. Should it panic instead?

Or should ps - pm return a signed integer?

My completely non-official take:

  • A pointer to single (PTS) is useful because it’s a pointer with an inferred length of 1, giving a bound equal to the size of the pointee.
  • A pointer to many (PTM) is just a pointer and has no length. That’s why we prefer slices, which have a pointer and a length (giving a bound of num_items * size_of_item).

Coercing a PTS to a PTM would lose information because you’d drop the length, so that’s invalid without a cast. Coercing a PTM to a PTS wouldn’t lose any information, but you’d only be able to access the first item through it. Not sure if that’s invalid or not. I’d expect to need a cast.

When you perform arithmetic on pointers, you’re calculating offsets. d in your code is of type usize; it’s just an integer. So the line pm += ps - pm is just adding an integer to a PTM, which is perfectly legal.

Address spaces are defined to wrap around, i.e. 0xffff_ffff_ffff_ffff is next to 0x0000_0000_0000_0000. Therefore, signed or unsigned, the maths is the same. If ps < pm, it calculates the very big offset to the next repetition of ps.

0                                      0                                      0
|------pm---------ps-------------------|------pm---------ps-------------------|
                    <----------- d ---------->
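
The wrap-around can be sketched with plain integer addresses. This is only an illustration, assuming Zig 0.11 or later (for @intFromPtr and the single-argument @bitCast); it uses the wrapping operators -% and +% on usize values rather than subtracting pointers directly, and the names lo, hi and d are made up for the example.

```zig
const std = @import("std");

pub fn main() void {
    const a: [3]u8 = .{ 1, 2, 3 };

    // Integer addresses of two elements; `lo` is the lower address.
    const lo: usize = @intFromPtr(&a[0]);
    const hi: usize = @intFromPtr(&a[2]);

    // lo - hi would underflow, so wrapping subtraction yields a very
    // large usize instead of a negative number.
    const d: usize = lo -% hi;

    // Adding that large offset back with wrapping addition lands on `lo`.
    std.debug.assert(hi +% d == lo);

    // Reinterpreted as a signed integer, the same bits are -2.
    const signed: isize = @bitCast(d);
    std.debug.assert(signed == -2);
}
```

Whether the offset is read as a huge unsigned value or a small negative one, adding it back gives the same address.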

True, thanks. I just verified that it is not a bug.

True. However, I have the inverse expectation: a PTM should be coercible into a PTS, as the bypass suggests. I’m not sure whether the following holds: the operands of the subtraction operator must be of the same type, or must be resolvable to the same type by peer type resolution. If that holds and a PTM can’t be coerced into a PTS, then the fact that ps - pm is legal loses its rationale.

how is that more valid than saying usize should coerce to [*]T because you can do ptr arithmetic with usize?

because that is literally what your bypass is;
ps - pm produces a usize, which you are then adding to pm.

ps - pm is certainly useful, if ps is a pointer to an element within pm, then ps - pm will give you its index.
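
As a rough sketch of that index calculation: the helper below (indexOf is a hypothetical name, not something from the standard library) recovers the index from integer addresses, so it does not depend on how a given compiler version treats ps - pm. It assumes Zig 0.11 or later and that elem really points into the array starting at base.

```zig
const std = @import("std");

/// Hypothetical helper: index of `elem` within the array starting at `base`.
/// Assumes `elem` points at an element of that array.
fn indexOf(comptime T: type, base: [*]const T, elem: *const T) usize {
    // Byte distance between the two addresses, divided by the element size.
    return (@intFromPtr(elem) - @intFromPtr(base)) / @sizeOf(T);
}

pub fn main() void {
    const a: [4]u32 = .{ 10, 20, 30, 40 };
    const pm: [*]const u32 = &a; // pointer to many
    const ps: *const u32 = &a[2]; // pointer to a single element

    const i = indexOf(u32, pm, ps);
    std.debug.assert(i == 2);
    std.debug.assert(pm[i] == 30);
}
```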

pm + n is useful, it is literally just indexing, whether zig should have multiple ways to index is a different discussion.

This whole thing is just the result of [*]T not having bounds, which lets you access arbitrary memory.
This is not at all an argument for allowing *T -> [*]T coercion.

Think of it this way. A PTM is a pointer to a completely unbounded array of items. A PTS is a pointer to a single item. They point at two different types of things and so are not compatible. Hence no coercing, just like a pointer to u8 and a pointer to f32 won’t coerce.

If you want to point to a single item in the array, you take the address of that single item, just as you did with var ps = &a[0]. That’s the right syntax for that. No coercion necessary. Not var ps: *u8 = pm, which would require coercion of an array pointer to an item pointer.

Zig is not C. C pointers fill multiple roles which Zig separates out into separate types. PTS for references. Slices for bounded arrays. PTM for unbounded arrays. C pointers for C compatibility.
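
A small sketch of those roles side by side, assuming a recent Zig (0.11 or later); the variable names are only illustrative:

```zig
const std = @import("std");

pub fn main() void {
    const a: [3]u8 = .{ 1, 2, 3 };

    // PTM: an array pointer coerces to a pointer to many; the length is gone.
    const pm: [*]const u8 = &a;

    // PTS: take the address of one element; no coercion involved.
    const ps: *const u8 = &pm[0];

    // Slice: pointer plus length, so accesses can be bounds-checked.
    const sl: []const u8 = &a;

    std.debug.assert(ps.* == 1);
    std.debug.assert(pm[2] == 3);
    std.debug.assert(sl.len == 3);
}
```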

Zig is also not Rust. It doesn’t ban you from doing pointer arithmetic. What it does instead is say “Here are these bounded pointer types. They’re safer because they have lengths. Use them instead, but if you need them, pointers-to-many are available. We trust you to be an adult when you use them.”

I never expected *T -> [*]T coercion; I expected [*]T -> *T coercion.
(Edit: sorry, after looking back at my comments above, I did cause some confusion. Actually, I expected coercion in both directions.)

Thanks for your explanation, which made me realize that the subtract operator doesn’t require its operands to be of the same type. (I should have realized this earlier.)

My current understanding is that ps - pm is treated differently from numeric subtraction operations. There is no peer type resolution for this subtraction expression.

After thinking for a while, I think your viewpoint

Coercing a PTS to a PTM would lose information because you’d drop the length, so that’s invalid without a cast.

doesn’t hold, because the coercion *[N]T -> [*]T also drops the length info. There should be another reason to disallow *T -> [*]T.

BTW, I found another way to bypass the restriction on *T -> [*]T:

pub fn main() void {
    const a: [3]u8 = .{1, 2, 3};
    const ps = &a[0];
    var pm: [*]const u8 = undefined;

    // pm = ps; // error
    const p1a: *const [1]u8 = ps;
    pm = p1a;
}

If a function takes ptr: [*]T, it probably intends to access ptr[1] and beyond.
So automatically converting a *T to [*]T seems dangerous. I prefer having an explicit conversion.

I agree with the viewpoint, though I don’t think it is a strong one, because both implicit coercion and explicit casting generally need an explicit conversion expression. For example, even if such coercion were allowed, an extra expression would still be needed to use a *T as a [*]T:

var pm: [*]T = psArg;
// or
@as([*]T, psArg)

The restriction does prevent careless unintended *T -> [*]T assignments.

I have marked @vulpesx’s comment as the answer; it explains why subtractions with a *T operand and a [*]T operand are intended.

This is not true: the only requirement for coercion (in the sense you are speaking of) is that there is an expected type.

You are confusing yourself by thinking of the clearly visible n: T = x; and @as(T, x) as somehow different from foo.bar = x; or foo(x) which also have destination types.
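
A quick sketch of that point, using the *[N]T -> [*]T coercion that is allowed today (Holder and first are hypothetical names, and a recent Zig is assumed). Every line below supplies a destination type, so the same coercion happens in each without any extra conversion expression:

```zig
const std = @import("std");

// Hypothetical struct with a pointer-to-many field.
const Holder = struct { ptr: [*]const u8 };

// Hypothetical function with a pointer-to-many parameter.
fn first(ptr: [*]const u8) u8 {
    return ptr[0];
}

pub fn main() void {
    const a: [3]u8 = .{ 1, 2, 3 };

    const n: [*]const u8 = &a; // destination type from the declaration
    const c = @as([*]const u8, &a); // destination type from @as
    const h = Holder{ .ptr = &a }; // destination type from the field
    const f = first(&a); // destination type from the parameter

    std.debug.assert(n[1] == 2);
    std.debug.assert(c[0] == 1);
    std.debug.assert(h.ptr[2] == 3);
    std.debug.assert(f == 1);
}
```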

The loss of information is not as hard of a line as you interpret it.

Besides, [*]T on its own is not that useful unless you have some way to determine its length; otherwise you just can’t write useful logic. If you don’t realise that beforehand, you will realise it while writing the code. Remember, Zig is not aimed at people new to these low-level things.

Not to mention how easy zig makes it to not assume things like sentinel values or lengths.

What I am saying is that in practice the coercion from array pointer to pointer-to-many doesn’t exist in a vacuum; the programmer has likely provided, or will likely provide, the length information somewhere else.

On the other hand, *T -> [*]T is just far less needed. It is more likely to be a mistake than intended, so Zig forces you to do an explicit conversion, which is still pretty easy to do.


This is my logic; I am not speaking for Andrew’s reasoning.
I use likely/unlikely based on my own experience and reasoning; it would be interesting for this to be investigated more concretely.

I suddenly realized that pointers in Zig have an addr_space property, so the first bypass shown in the first comment will not always work.

And the second bypass will be removed. (Though there is still a third way: pm = ps[0..1].ptr)

Okay, not much confusion now. Thanks, all, for the explanations.