0xFF from intFromBool impossible?

Once again I had 2 bools with non-binary values today. Others working fine.

So I have this at top level of main zig file:

var impossibleCodeEntered: bool = false;

Later on some code is triggered as though the value is true.

When I do an intFromBool, it returns 0xFF (pretty sure my bare-metal uart logging is telling the truth here).

const madness: u8 = @intFromBool(impossibleCodeEntered);
gasm.logU8(madness);

However this still evaluates true:

if (madness == 1)

All of this works as expected if I add an assignment at the top of main().

impossibleCodeEntered = false;

Is it possible that 0xFF is some kind of internal marker for uninitialized memory that I should not be seeing? I am developing a bare-metal OS in Zig, so there is a chance the problem is between the chair and keyboard :slight_smile: but it seems such a basic thing to go wrong that I have no idea what to think.

This isn’t the first time I’m seeing strange aarch64 memory values: (ignore the FP stuff, that is fixed)

I am guessing that @sizeOf(bool) is 4 (32bit) and a “binary not” instruction is used to toggle the boolean value (i.e. zeros become ones for all 32bit and vice versa).
Examining the generated assembly code can confirm my hypothesis.

From the machine code perspective, a true/false decision is a test for zero. So true can be any nonzero value.

Depending on optimization level, the compiler might even invert the logic, so you have an ‘is_valid’ Boolean in a register and the optimizer might decide to keep a not_is_valid value, as long as the code behaves the same.

The expression result might even live only in a single bit flag register on some architectures.

The if statement in C is defined accordingly, taking the false branch for zero and the true branch for every other value.

If you just cast such a value, only 0 for false is guaranteed, otherwise the compiler would need to add a load for a true literal.

So even if true is defined as 1 that’s not what’s necessarily used by the machine code to signify a truth value.

If you want exactly 0 or 1, use an explicit if (flag) 1 else 0;
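
A minimal sketch of that explicit branch, assuming a recent Zig; the helper name boolToU8 is hypothetical and not from the original post. Note that this only spells out the mapping for a valid bool. It does not by itself repair a variable whose storage was never initialized.

```zig
/// Map a bool to exactly 0 or 1 with an explicit branch.
/// (boolToU8 is a hypothetical helper name used for illustration.)
fn boolToU8(flag: bool) u8 {
    return if (flag) 1 else 0;
}
```

For example, boolToU8(impossibleCodeEntered) could be logged in place of the @intFromBool result.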

>Examining the generated assembly
Thanks. My ARM cheat-sheet says MVN, but objdump -d hasn’t revealed any, and my attempts to add some extra instructions to help seem to be optimized away.

Thanks. The problem here is that the bool var starts life as false, but is later evaluated as true unless I set it to false again at the top of main, i.e. I am never setting it to true. It just seems to default to a nonzero value.

What you say makes a lot of sense, but I would still expect intFromBool to honor its contract in a binary way, i.e. returning 1 or 0.

If you’re on bare metal, make sure you zero your .bss section. IIRC the zig compiler places zero-init values in .bss and does not emit code to zero that section.

Ah ha! moment, that would explain a lot, and I am actually amazed anything works :slight_smile:

I’m new to linker files. I have a .bss section, but I’m not sure it is getting used, since I use objcopy with .bin to generate a binary file, which is just loaded and executed at the fixed address I provide.

From what I have just been reading, I would need an ELF loader to have things set up correctly. Since I don’t have that, is there a way I can solve this from within my linker file? This is all I have so far:

.bss : {
    *(.bss)
}

Thanks, hopefully this will explain most of the other memory oddities! I suspect the misaligned packed struct read wouldn’t be affected by this, but it would certainly explain the bad var byte value and default enum values :slight_smile:

In the linker script, you can create some symbols that hold the start and end of bss, and probably more sections.

Then you have some startup code that uses those symbols to initialise sections properly.

It looks like:

  .bss (NOLOAD) :
  {
    _bss_start = .;
    *(.bss*)
    _bss_end = .;
  } > ram
  _bss_size = SIZEOF(.bss);

extern var _bss_start: u32;
extern var _bss_size: u32;

export fn _start() noreturn {
    // must set the stack first

    // copy .data from flash

    // clear .bss (@memset or a plain loop)
    // _bss_size is an absolute linker symbol, so its address is the size in bytes
    const bss_size = @intFromPtr(&_bss_size);
    const bss: [*]u8 = @ptrCast(&_bss_start);
    for (bss[0..bss_size]) |*b| {
        b.* = 0;
    }
    main();
    // main is not expected to return; spin if it does
    while (true) {}
}

NOTE: The value goes into .data because it is initialized.
If the program is not loaded in RAM, you must copy the data from the read-only flash region into RAM (these regions are defined in the linker script).
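
The "@memset" option mentioned in the comment above could be sketched like this, assuming Zig 0.11 or later (where @memset takes a slice and a value). The helper name zeroRegion is hypothetical, and the usage shown in the comments assumes the linker script exports _bss_start and _bss_end as in the fragment above.

```zig
/// Zero every byte in the half-open range [start, end).
/// (zeroRegion is a hypothetical helper, not part of any library.)
fn zeroRegion(start: [*]u8, end: [*]u8) void {
    const len = @intFromPtr(end) - @intFromPtr(start);
    @memset(start[0..len], 0);
}

// Assumed usage from startup code, before any global is read:
//   extern var _bss_start: u8;
//   extern var _bss_end: u8;
//   zeroRegion(@ptrCast(&_bss_start), @ptrCast(&_bss_end));
```

Taking start and end pointers avoids relying on the address of a size symbol, at the cost of one subtraction.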

Thanks! Of course these things are a bit non-deterministic and thus hard to test, but I took out my temporary bool hacks, and my Zig OS has been booting smoothly all day :slight_smile:

I boot from SD card, but I think U-Boot loads my OS directly into RAM (I see a boot.scr), so would I be right to assume I need the ‘>ram’ option only once I am brave / stupid :wink: enough to install my OS to eMMC? I’ve seen some example linker files with memory layouts described, which you mentioned, but I haven’t investigated further. I do have a base address defined (. = 0x40080000;) and a stack much further into the RAM (2GB total). That seems to work, since I am not doing any allocations and have a shallow max stack depth.

Can I ask the difference between *(.bss) and *(.bss*) please? I assume this means Zig exports its var symbols prefixed with .bss to ensure they land in the .bss section? Or is Zig also somehow generating its own section names?

PS: What did you mean by the value goes to .data? Do you mean any calculated values/symbols in the linker file end up in .data?

>ram instructs the linker to place the output section in the MEMORY region named ram. See the ld manual for the concepts and the linker script command reference.

*(.bss) copies the input section .bss to output.
*(.bss*) copies any input section named .bss* to output (where * can match any string including the empty one).

Whatever input section is mentioned inside *() is copied to the output section.
e.g. *(.data .rodata) copies both .data and .rodata, *(.data_*) copies any region with a name that starts with .data_.

It is placed in the .data section because there is an initial value (false):
var impossibleCodeEntered: bool = false;

Zig uses the same conventions as C. You can control the output section using the linksection zig keyword and the memory region using the addrspace zig keyword. Both keywords can be used in fn and var declarations.

The most common sections are:

  • .text for code
  • .rodata for read only data (like the contents of string literals)
  • .data for global or static variables with an initial value
  • .bss for global or static variables without an initial value (initialized to 0).
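
As a rough illustration of the linksection keyword on a var and a fn: the names boot_flags and earlyInit and both section names are hypothetical, and the section names assume an ELF target. The linker script would still need matching patterns such as *(.data*) and *(.text*) to collect them.

```zig
// Hypothetical variable placed in a named input section.
// Because the name starts with ".data", a *(.data*) pattern would match it.
var boot_flags: u32 linksection(".data.boot_flags") = 1;

// Hypothetical function placed in its own code section,
// e.g. so the linker script can put it first in the image.
fn earlyInit() linksection(".text.boot") void {
    boot_flags += 1;
}
```

Running objdump -h on the linked ELF is one way to check which output section these ended up in.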

Thank you for the extra details, and the documentation link! I haven’t done much C since the K&R 2nd ed. days, and don’t recall needing a linker before, so I appreciate the guidance very much.

I put the .bss memory clear into my OS boot global assembly (core0), so that I know for sure it is done as early as possible, and before I call into Zig, especially because I hope soon to be booting part of my OS in parallel on a 2nd core, which I realise now will be another Zig entry point from the same linker output. That’s gonna be fun I’m sure :slight_smile:
