Sugar proposal: @_ as unused fields name generator

Foreword: after a couple of years of trying Zig in embedded/bare-metal development, I have convinced myself that it is relatively safe to switch to it completely (at least for certain types of projects).

During my adoption of Zig, I tried several approaches to handling I/O registers for MCUs/MPUs and even bus-connected devices, and in the end packed structs won (obviously).

The thing that tired me a lot was writing and rewriting names for unused fields. Maybe my approach seems odd, but I prefer to describe only those I/O registers, and the fields within them, that are really necessary for the change at hand. Describing the whole bunch of regs/fields for even a simple STM MCU would take too long, so this seems a rewarding way of developing: add definitions only for the regs/fields we use at the moment.

So it would be beneficial to have a similar mechanism, one that does for unused fields the same thing the compiler does for the ignored variable _. These are the fields that have to be declared only to fill the gaps between used registers/fields.

Instead of writing and maintaining (renaming) unused fields like this:

const BareMetalReg = packed struct(u32) {
   usefulField: u5 = undefined,
   _: u3 = 0, // three ignored bitfields
   anotherUsefulField: enum(u1) {....} = undefined,
   __: u3 = 0, // two ignored fields
   mostUsefulField: u1 = undefined,
   ___: u19 = 0, // more ignored fields so far...
};

we can write:

const BareMetalReg = packed struct(u32) {
   usefulField: u5 = undefined,
   _unused: u3 = 0,
   anotherUsefulField: enum(u1) {....} = undefined,
   _unused: u3 = 0,
   mostUsefulField: u1 = undefined,
   _unused: u19 = 0,
};

And then, when we need to insert a new useful field, we don’t have to rename the others at all. And GitNazis cannot punish us any more in reviews for this!!!

In my implementation I decided to use the unused @_ builtin as a placeholder symbol. It mimics the unused variable in function scope (_) and does not break the current usage of _ in struct scope. Moreover, it attracts attention and signals that the compiler will do something unusual under the hood at this point. It is also pretty fast to write (sorry, I know that Zig must be easy to read, not easy to write :slight_smile: ). There was an option to stick to something like @"_", but try typing it yourself without making a mistake.

Here is my draft implementation for @_; sorry, I haven’t moved my copy of Zig’s source to Codeberg. It works for me so far, though of course there are bugs! Anyway, give it a try!

Related proposal: Change meaning of underscore from actual identifier name to ignoring the value being named · Issue #4164 · ziglang/zig · GitHub

Also, unrelated to your main issue: avoid setting packed struct members to undefined, as that doesn’t have well-defined behavior. Related: Proposal: make `packed` `undefined` semantics consistent with those of integers · Issue #25279 · ziglang/zig · GitHub

I believe this proposal is based on an incorrect use of default fields. Even if undefined had well-defined behavior in packed structs, you are still violating the struct’s data invariants by default when initializing fields to undefined like this.

My personal solution would probably be to rewrite your struct like this, naming the unused/reserved fields after the bit ranges they occupy, removing the default values, and adding an init function to construct the struct:

const BareMetalReg = packed struct(u32) {
   useful_field: u5,
   unused_5_7: u3,
   another_useful_field: UsefulEnum,
   unused_9_11: u3,
   most_useful_field: u1,
   unused_13_31: u19,

   const UsefulEnum = enum(u1) { ... };

   fn init(
      useful_field: u5,
      another_useful_field: UsefulEnum,
      most_useful_field: u1,
   ) BareMetalReg {
      return .{
         .useful_field = useful_field,
         .unused_5_7 = 0,
         .another_useful_field = another_useful_field,
         .unused_9_11 = 0,
         .most_useful_field = most_useful_field,
         .unused_13_31 = 0,
      };
   }
};

Granted, it is a bit more to write initially, but favors correctness and solves your renaming problem :slightly_smiling_face:

The proposal is mainly about the new operator (@_) rather than field initialization, which my commit has nothing to do with. Indeed, I (maybe erroneously) used the undefined initializer so that I could derive a field’s type at comptime with @TypeOf((BareMetalReg{}).useful_field), where the undefined value does not matter (I believe). For example, to declare the type of a function parameter through which I intend to modify the register.

Actually, this struct was not intended for direct instantiation, since in bare-metal programming you rather access the fields of such structs through pointers, and get those pointers by casting some hardcoded base address + offset. Sorry that I sent the crowd in the wrong direction…
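
To make that access pattern concrete, here is a minimal sketch, not taken from the draft implementation. The helper name regAt is hypothetical, the register layout is copied from the example above with a plain u1 standing in for the elided enum, and a local array stands in for the hardcoded peripheral address so that the snippet can run on a host:

```zig
const std = @import("std");

// Same layout as the example register earlier in the thread, without defaults.
const BareMetalReg = packed struct(u32) {
    useful_field: u5,
    unused_5_7: u3,
    another_useful_field: u1, // simplified: the original uses an enum(u1)
    unused_9_11: u3,
    most_useful_field: u1,
    unused_13_31: u19,
};

/// Hypothetical helper: returns a typed volatile pointer to the register
/// located at `base + offset`.
fn regAt(comptime T: type, base: usize, offset: usize) *volatile T {
    return @ptrFromInt(base + offset);
}

test "access a register through a pointer built from an address" {
    // On real hardware `base` would be a hardcoded peripheral address; here a
    // plain array stands in for the register block.
    var backing = [_]u32{ 0, 0, 0, 0 };
    const reg = regAt(BareMetalReg, @intFromPtr(&backing), 8);
    reg.most_useful_field = 1; // bit 12 of the word at offset 8
    try std.testing.expectEqual(@as(u32, 1 << 12), backing[2]);
}
```

With this style the struct is never instantiated, so default values on its fields play no role in normal use.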

You can use @FieldType for this, without needing to resort to “hacks” with default values :slightly_smiling_face:

@FieldType(BareMetalReg, "useful_field")
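
As a rough illustration of how that could replace the default-value trick, here is a sketch of a setter whose parameter type follows the field declaration. The function name setUsefulField is made up for the example, another_useful_field is simplified to u1, and @FieldType is assumed to be available (it was added in Zig 0.14):

```zig
const std = @import("std");

const BareMetalReg = packed struct(u32) {
    useful_field: u5,
    unused_5_7: u3,
    another_useful_field: u1, // simplified: the original uses an enum(u1)
    unused_9_11: u3,
    most_useful_field: u1,
    unused_13_31: u19,
};

/// Hypothetical setter: the parameter type is derived from the field, so
/// changing the field's width later updates the signature automatically.
fn setUsefulField(
    reg: *volatile BareMetalReg,
    value: @FieldType(BareMetalReg, "useful_field"),
) void {
    reg.useful_field = value;
}

test "parameter type derived with @FieldType" {
    var reg: BareMetalReg = @bitCast(@as(u32, 0));
    setUsefulField(&reg, 17);
    try std.testing.expectEqual(@as(u5, 17), reg.useful_field);
    try std.testing.expect(@FieldType(BareMetalReg, "useful_field") == u5);
}
```

No instance of the struct is needed to name the type, so the fields do not need defaults for this purpose.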

Ah, got it. Still, I would personally rather just name the fields the way I proposed in my previous reply.

I used to as well… until I found myself losing concentration when I needed to rename again and again on every commit. No, enough manual work. Who are we?! We are programmers, here to make machines do the work for us!

Clever! I missed it. Although when you go this route you lose the helpers from the LSP, which enrich your coding experience with autocompletion and automated renaming…

This strikes me as a heavyweight anti-feature. We get a whole new syntactic ‘thing’, @_, which is exceptional in a bunch of ways: it’s an @ identifier which doesn’t use @"the usual @ syntax", and it’s a field name which doesn’t have to be unique to the type. It wouldn’t be useful anywhere else either.

That’s a lot of weirdness for the purpose: it’s a big change with a small result. I will grant that the way you combine @ and _ is broadly consistent with how Zig already uses them.

What makes it an anti-feature is this: I’ve named this kind of field _pad, _unused1 (etc), and _reserved, and those mean different things! So adding a hieroglyph encourages code to be as gnomic as possible about why a field is not intended for use, which is bad, because it means that it’s better style to never have the @_ syntax object in your code.

I recognize that this kind of thing comes up in embedded more often than ‘normal’ (hosted) code, I see that as more reason to document the reason why a field exists, even in the case where the code doesn’t make use of that field. “We need this for alignment”, “This has a purpose but our code doesn’t use it”, and “We’re holding on to some bits / bytes here, because we might want them someday”, are all distinct cases, I think calling them all @_ is the wrong thing.

Minor quibble, but I consider unused fields a good candidate for default values: 0 for packed and undefined for not-packed are my usual choices, although I suppose anything will do.

A default value should be used only when that value is always valid for any initial instance of the structure. For unused fields, that’s trivially the case. So setting them once and then ignoring them strikes me as better:

const BareMetalReg = packed struct(u32) {
   useful_field: u5,
   _unused_5_7: u3 = 0,
   another_useful_field: UsefulEnum,
   _unused_9_11: u3 = 0,
   most_useful_field: u1,
   _unused_13_31: u19 = 0,

   const UsefulEnum = enum(u1) { ... };

   pub fn init(
      useful_field: u5,
      another_useful_field: UsefulEnum,
      most_useful_field: u1,
   ) BareMetalReg {
      return .{
         .useful_field = useful_field,
         .another_useful_field = another_useful_field,
         .most_useful_field = most_useful_field,
      };
   }
};

Personal taste, but I find this substantially clearer, and nicer to look at.

Thank you very much for the reasoning and the arguments! This is what I am really interested in!
I’m not sure how many projects you have done in an embedded/bare-metal environment. If you have to (or feel you have to) spend hours writing comments for every bit field and accept the cost of maintaining tons of dead lines of code, then your projects are far, far more mission-critical than the ones I have been happy to serve in. Yes, the proposed feature costs you nothing, because there are a lot of things in the language / std lib that you don’t use; one more unused feature does not break anything. The way I think is a little bit different: when I start a project, I prefer to write less (Zig is hard to write) and get things running as fast as possible. So I prefer to avoid describing the whole HAL and to leave room to convert my POC into production effortlessly (sorry if I have just broken someone’s world of perfection). In the struggle to get this pipeline, I came to this style of coding/documenting:

/// RM0440 Rev 8: p. 1799
const StatusRegister = packed struct {
    ReceiveBufferStatus: enum(u1) { Empty = 0, NotEmpty = 1 } = undefined,
    TransmitBufferStatus: enum(u1) { Empty = 1, NotEmpty = 0 } = undefined,
    @_: u5 = undefined,
    BusyFlag: enum(u1) { Busy = 1, Idle = 0 } = undefined,
    @_: u1 = undefined,
    FifoReceptionLevel: enum(u2) { Empty = 0, QuaterFull = 1, HalfFull = 0b10, Full = 0b11 } = undefined,
    FifoTransmissionLevel: enum(u2) { Empty = 0, QuaterFull = 1, HalfFull = 0b10, Full = 0b11 } = undefined,
    @_: u3 = undefined,

    pub const offset = 0x8;
};

So, the first comment just sends you to the correct page of that FM, where this shitty register is described. That’s it. Give me one reason to write comments for unused stuff in this methodology. To start working with some device/peripheral I sometimes need to touch 50 fields in 30 registers. Every register is a bunch of weird bit fields, and I have no idea whether I will use them or not. So to describe all those 400+ fields, or to maintain comments for temporarily unused fields, I would need to spend 10x the time to make things happen. No. This is a bad idea. I disagree.

I appreciate your argument about breaking the usual @ usage, and this was expected. Let me share my bottom → top thinking when choosing a placeholder sequence.

  1. it must be as short as possible. One symbol?… which one?
  2. it must not break anyone’s codebase… a special (usually avoided, but meaningful) sequence?
  3. @unused or @"_"?

In the end, I just stopped reinventing the wheel and fell back to @_. Is this perfect? No. Is this “acceptable”? That depends on whom you ask.

As for naming the fields differently (_pad, _unused1 (etc), and _reserved), I see no difference, because you don’t use any of them. And you don’t use them in exactly the same manner, irrespective of the name! :slight_smile: So why drain energy naming these things? You simply cannot remember ALL the nuances of all registers on all your platforms, what is what, and how you have to use it or avoid using it. You just read the proper page of that FM, which is sometimes 10k+ pages (ARM reference). So just don’t try to move that FM into your code. I used to, and it was a terrible idea.

Again, I’m happy with your arguments; they helped me to narrow the topic (I hope).

How about this?

/// RM0440 Rev 8: p. 1799
const StatusRegister = packed struct {
    ReceiveBufferStatus: enum(u1) { Empty = 0, NotEmpty = 1 } = undefined,
    TransmitBufferStatus: enum(u1) { Empty = 1, NotEmpty = 0 } = undefined,
    @pad(u5, undefined),
    BusyFlag: enum(u1) { Busy = 1, Idle = 0 } = undefined,
    @pad(u1, undefined), 
    FifoReceptionLevel: enum(u2) { Empty = 0, QuaterFull = 1, HalfFull = 0b10, Full = 0b11 } = undefined,
    FifoTransmissionLevel: enum(u2) { Empty = 0, QuaterFull = 1, HalfFull = 0b10, Full = 0b11 } = undefined,
    @pad(u3, undefined),

    pub const offset = 0x8;
};

One of Zig’s values is “favor reading code over writing code”, one of the aphorisms in the Zen. The case you’re making is focused on writing code, not reading it.

I think it would be hard to make a case for this on the basis of reading code.

You can get pretty far like this:

_1: u3,
_2: u5,

Same number of characters for the first nine (10 if you want to start with 0). Me, I type pretty fast, so _un1 is not going to slow me down at all. It wouldn’t break the byte budget, and it probably needs no explanation at all. _u1 is too close to a type name IMHO.

I do get that even numbering them causes a bit of friction, you need to track which number comes next, and (speaking personally) I would be unhappy if I rearranged things so that they ended up out-of-order, and would probably waste time putting them back in order. But how often is that going to happen?

That’s if you are actually in a situation where you don’t want to waste cycles on naming unused fields: great, don’t. Any terse and simple schema which starts with _ is broadly equivalent here, after that it’s mechanical.

I’m looking at it from the other direction: the language should not explicitly encourage this behavior. Having a special glyph says “this is what you should do, unused fields should be @_”, I think there are many cases where this should not be the default.

It’s defensible when, as you say, you’re racing through hundreds of structs translating some PDF so you can start doing the important stuff. But I think a few bytes of extra friction there is justified by not encouraging that kind of thing explicitly.

As you say:

Mostly, in your use case, these fields do have canonical names, and in a perfect world with infinite time and a vast stable of code minions, they would all have that name, prepended with _ to indicate that the code isn’t using them. You must admit that, coming back to a project like this years later, or onboarding someone, it would be a lot nicer. Say the task is “do $thing with uart_fmm_8”, it’s better if it’s called _uart_fmm_8 already, not _un25 or whatever.

In the real world, time is money, and neither come easy. But I don’t think the language should explicitly support shortcuts, in a way which encourages coders to use them even when they can afford not to.

I’m not sure where the underscore _ came from (Go? something older?), but I imagine Andrew Kelley, at the beginning of the road, stopped writing @ignored() = ... and dropped that principle for himself just because naming unused things is a dumb idea. (Of course there never was such a thing as @ignored() :slight_smile: )

_ is more legible than @ignore().

It’s a very old convention. So far as I know, Go was the first language to give it special semantics (write-only), rather than just treating it as a convenient way to spell “this is discarded / unused deliberately”.

In Go, _ has the same meaning for a field: it can’t be referenced and the struct can have as many blank fields as one would like. In Zig it’s just a field name, so you can have only one, and it can be referenced:

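const std = @import("std");
const expectEqual = std.testing.expectEqual;

const UnderStruct = struct {
    // `_` is an ordinary field name here, so only one field can use it.
    _: usize,
};

test UnderStruct {
    // The field can be both initialized and read back by name.
    const u_s: UnderStruct = .{ ._ = 42 };
    try expectEqual(42, u_s._);
}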

I can’t point you at an issue on the subject, but it’s a safe bet that the alternative was considered and rejected.

If I had to pick between @_, and just making _ work in structs the way it works in Go, I’d pick the latter. Given that Zig doesn’t have the latter, it’s unlikely to acquire the former.

Zig also decided against the Plan 9-style struct embedding which Go uses, and the reason was legibility. It’s very convenient to write, but it’s not as convenient to read, fields start mysteriously appearing which are not explicitly defined on the struct.

Zig also decided against private fields, and _ or @_, whatever you want to call it, creates an ‘unmentionable’ field.

In any case, C is fairly popular in embedded, despite lacking this feature as well. So probably not an insurmountable barrier to adoption.

Sure. I’m not trying to convince anyone that a private field is an option. Although this type of field would be better called inaccessible-by-any-means. The facts are:

  1. unused fields are inevitable in development
  2. the obstacles to writing (and maintaining) less dumb code are kind of mental, not technical.
  3. Zig is indeed meant to be compact and encourages writing less useless code (and spawning fewer bugs) by providing a huge bunch of useful features.

When I decided on this sugar, my main concern was R3 (repeated routine rewrites), which consequently spawn new bugs.

So, basically, this sugar is a good candidate to land, maybe not with @_ as the driver.

I gave an alternative above with @pad(<type>, <initializer>). This may work, but it is worse.

Just wanted to share my approach in case any ideas help. I have almost 300 registers implemented so far, and eventually settled on the approach below for a number of reasons:

  1. Field names match the SoC specification. (simple copy paste, and easy to search for later)
  2. SoC ‘unused’ bits get mapped to _<bitPos>, which is quick to type, and useful for re-checking definitions against the Spec.
  3. Default values match the Spec.

I name all fields, firstly for ease of future use, but also because it is faster to copy n paste complete definition at the time I have the thousands page doc open.

I also use a few enums, but only where I will read the code a lot.

    var reg = pio.PD_CFG2_REG.peek();
    reg.P23 = pio.PIN_CFG.OUT;
    pio.PD_CFG2_REG.poke(reg);

The nicest feature I found with this approach is being able to use reflection on the packed struct to discard the fields that have a name starting with ‘_’, so then I have nice readable serial console output for a register contents; REG_NAME{ FIELD_NAME=value, ... which also shows enum names or true/false where applicable.

I love the current design of packed struct, and hope it never changes. I especially don’t want to see unused fields become an error, which I think was on the table for a while.

//8.5.6.4.
pub const RSB_STAT_ADDR = Reg32(RSB_STAT, RSB_BASE_ADDR + 0x2c);
pub const RSB_STAT = packed struct(u32) {
    TRANS_OVER: bool = false,
    TRANS_ERR: bool = false,
    LOAD_BSY: bool = false,
    _3: u5 = 0,
    TRANS_ERR_DATA: u4 = 0,
    _12: u4 = 0,
    TRANS_ERR_ACK: bool = false,
    _17: u15 = 0,
};

I changed a few things over time like replacing the original register name name_REG with name_ADDR so that I have a nice Type name too.
I also recently started adding page numbers to the section comment, because pdf search is a bit slow on my setup.
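
The reflection-based dump described above might look roughly like the following. This is only a sketch under assumptions: formatReg is a hypothetical name, the output goes into a caller-supplied buffer instead of a serial console, and the exact text produced by the `{any}` specifier (especially for enums) can differ between Zig versions:

```zig
const std = @import("std");

const RSB_STAT = packed struct(u32) {
    TRANS_OVER: bool = false,
    TRANS_ERR: bool = false,
    LOAD_BSY: bool = false,
    _3: u5 = 0,
    TRANS_ERR_DATA: u4 = 0,
    _12: u4 = 0,
    TRANS_ERR_ACK: bool = false,
    _17: u15 = 0,
};

/// Hypothetical helper: formats `reg` into `buf` as `NAME{ FIELD=value ... }`,
/// skipping every field whose name starts with '_'.
fn formatReg(buf: []u8, name: []const u8, reg: anytype) ![]const u8 {
    var len: usize = 0;
    len += (try std.fmt.bufPrint(buf[len..], "{s}{{", .{name})).len;
    inline for (std.meta.fields(@TypeOf(reg))) |field| {
        // The field name is comptime-known, so reserved fields generate no code.
        if (field.name[0] != '_') {
            const value = @field(reg, field.name);
            len += (try std.fmt.bufPrint(buf[len..], " {s}={any}", .{ field.name, value })).len;
        }
    }
    len += (try std.fmt.bufPrint(buf[len..], " }}", .{})).len;
    return buf[0..len];
}

test "reserved fields are left out of the dump" {
    var buf: [256]u8 = undefined;
    const text = try formatReg(&buf, "RSB_STAT", RSB_STAT{ .TRANS_OVER = true, .TRANS_ERR_DATA = 5 });
    try std.testing.expect(std.mem.indexOf(u8, text, "TRANS_ERR_DATA=") != null);
    try std.testing.expect(std.mem.indexOf(u8, text, "_12") == null);
    try std.testing.expect(std.mem.indexOf(u8, text, "_3=") == null);
}
```

This only works because the reserved fields have real, distinct names that follow a convention the code can test for.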

This seems more than adequate, and future-you will thank you for the bit of extra attention up front.

This supports my conjecture that @_ would discourage using better names, and should therefore not exist.

That’s another issue with unmentionable fields: either StructField would have to be modified to identify them, or the invariant that all StructField names are unique would be broken, and neither of those is very nice.

Surmountable, of course, most things are. But a strike against. Interestingly Fn handles this by giving a Param no name, which won’t work for structs due to @field and a bunch of other reasons.

Random observation: you could even check the number at comptime! Parse the 12 in _12, and confirm it matches the result of @bitOffsetOf the field.
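
A sketch of what such a check could look like, assuming the `_<bit offset>` naming from the RSB_STAT example above. The helper name checkReservedNames is hypothetical, and the error messages are just placeholders:

```zig
const std = @import("std");

const RSB_STAT = packed struct(u32) {
    TRANS_OVER: bool = false,
    TRANS_ERR: bool = false,
    LOAD_BSY: bool = false,
    _3: u5 = 0,
    TRANS_ERR_DATA: u4 = 0,
    _12: u4 = 0,
    TRANS_ERR_ACK: bool = false,
    _17: u15 = 0,
};

/// Hypothetical helper: for every field named `_<N>`, require that N equals
/// the field's bit offset within T. A mismatch is reported as a compile error.
fn checkReservedNames(comptime T: type) void {
    inline for (std.meta.fields(T)) |field| {
        if (field.name[0] == '_') {
            // Parse the digits after the underscore at compile time.
            const claimed = comptime std.fmt.parseInt(usize, field.name[1..], 10) catch
                @compileError("reserved field '" ++ field.name ++ "' is not named _<bit offset>");
            if (claimed != @bitOffsetOf(T, field.name)) {
                @compileError("reserved field '" ++ field.name ++ "' does not start at that bit");
            }
        }
    }
}

// Runs during compilation; the build fails if a reserved field is misnamed.
comptime {
    checkReservedNames(RSB_STAT);
}
```

Renaming `_12` to `_13` in the struct above should then stop the build, which catches a miscounted gap right where it is introduced.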

    @compileLog(@typeInfo(StatusRegister).@"struct".fields[2]);
...

Compile Log Output:
@as(builtin.Type.StructField, .{ .name = "__unused_field_278_1"[0..20], .type = u5, .default_value_ptr = @as(*const anyopaque, @ptrCast(&@as(u5, undefined))), .is_comptime = false, .alignment = 0 })

Nothing is broken, btw, so no worries.

I like your approach and usage of _bitNum!

A couple of unrelated notes:

I used the same approach of naming regs/fields, trying to keep them 1:1 with the documentation, until I had several ICs to work with simultaneously. The engineering teams of IC vendors quickly taught me that this is not a good idea any more, because the different naming conventions turned such mnemonics into complete cryptography. So I switched to PascalCase, while trying to keep the capital letters in line with the vendor’s abbreviations.

I’m happy that we independently arrived at the same comments for regs. I prefer to refer to the doc ID with revision and page because 1) it makes an unambiguous address; 2) it allows searching with fewer keystrokes (page number G in zathura).

The last comment is about searching by reg/field name or chapter name… This works well with relatively small documents, or when you can spend time going through the Contents of the doc. I was forced to stop that practice and switch to page numbers after my first attempt to work with ARM DDI 0487… 14k pages is a really tough weight for almost all PDF viewers.

One more thing that I had not checked in advance: thanks to your notes, it appears that these fields are not unmentionable! :slight_smile:

const field_name = try std.fmt.allocPrint(gpa, "{s}_{d}_{d}", .{ field_prefix, @intFromEnum(container_node), field_index });

If you remove the container node from the printed name, you can easily refer to the field as .__unused_field_1. Moreover, any name can be rendered, referring to the field size or bit number (or byte/word for extern structs), and any checks can be performed.

Thanks. Yeah the Arm docs are monsters! I’m currently working with the Arm A.R.M. (11k pages) and there I have a slightly different approach for processor registers that uses a bit of Global Assembly:

//p.5857 System Control Register
pub const SCTLR_EL1 = packed struct(u64) {
    M: u1 = 0,
    A: u1 = 0,
    C: u1 = 0,
    SA: u1 = 0,
    SA0: u1 = 0,
    CP15BEN: u1 = 0,
 ...
    pub fn peek() @This() {
        return @bitCast(gasm.peekSCTLR_EL1());
    }
    pub fn poke(v: @This()) void {
        gasm.pokeSCTLR_EL1(@bitCast(v));
    }
};

I must admit that I was very tempted to rename these! but they are so universally recognized/referenced I realized I’d be making more work for future me.

I use this approach mainly because I already have quite a lot of Global Assembly, my old brain can’t parse inline assembly, and I like to edit my GA with syntax highlighting.

I’m sad that there is talk about removing Global Assembly, and even packed struct naming is currently in flux!

What I haven’t got my head around yet is whether I need some calling convention fluff (naked? c?) for calling down to GA, or from GA into my Zig.

Anyway, thanks for the reassurance regarding large .pdf docs. I’m new to Linux (Asahi) and wasn’t sure if I was holding it wrong.

Out of interest how do you handle undocumented registers? I’m not happy with my approach:

    Reg32Raw(0x1ca1018).poke(0x1e);
    Reg32Raw(0x1ca101c).poke(0x0);
    Reg32Raw(0x1ca1020).poke(0x303);

Which is just a wrapper around a *volatile u32, but I’m not sure what else I could do. It niggles my OCD to have unknowns, but not as much as the errors/typos in this SoC document :slight_smile:
