Help with inline assembly x86_64

I am fairly new to LLVM/GCC-style inline assembly. I am mainly focused on x86_64 for now…

What I am trying to do is jump to the address stored in a global variable.

var jmp_back: usize = 0;
export fn DoSomething() callconv(.naked) noreturn {
    @setRuntimeSafety(false);

    asm volatile (
        \\ jmp *%[jmp_back]
        :
        : [jmp_back] "m" (jmp_back),
        : "memory", "cc"
    );
}

This generates assembly I did not expect: it moves the value of the variable to [rsp + 8], then stores a pointer to rsp + 8 at [rsp], then does jmp [rsp], so it ends up jumping to the address rsp + 8 itself (Compiler Explorer).

I also don’t want to use the “r” input constraint; I would like to avoid using any general purpose registers.

No, it jumps to the address stored in rsp+08.


I don’t get it. lea rax, [rsp + 8] is supposed to store the address rsp + 8 in rax, right? Then rax gets stored to [rsp], so when jmp [rsp] happens it reads the address rsp + 8 and jumps there. At least that’s what my debugger shows happening.

For those who didn’t get the problem:

Start
rax: undefined 
[rsp]: undefined
[rsp + 8]: undefined

mov     rax, qword ptr [example.do_something_trampoline]
rax: value of do_something_trampoline
[rsp]: undefined
[rsp + 8]: undefined

mov     qword ptr [rsp + 8], rax
rax: value of do_something_trampoline
[rsp]: undefined
[rsp + 8]: value of do_something_trampoline

lea     rax, [rsp + 8]
rax: address of rsp + 8
[rsp]: undefined
[rsp + 8]: value of do_something_trampoline

mov     qword ptr [rsp], rax
rax: address of rsp + 8
[rsp]: address of rsp + 8
[rsp + 8]: value of do_something_trampoline

jmp     qword ptr [rsp]
This will jump to address of rsp + 8,
but we should be jumping to do_something_trampoline

No, [] means indirect. It stores in rax the value that is loaded from rsp+8 address.
In zig notation it is: rax = (rsp + 8).*

EDIT: I was wrong, load effective address (lea) loads the address and not the value.

Then the whole thing doesn’t make sense: it loads the value into rax from the variable’s address, stores it to [rsp + 8], then overwrites rax with the address of that slot via lea rax, [rsp + 8], then stores rax to [rsp], and finally does jmp [rsp]…

Some C volatile keyword shenanigans?

EDIT: I hope that both examples are correct now

var do_something_trampoline: usize = 0;

export fn DoSomething() callconv(.naked) noreturn {
    @setRuntimeSafety(false);
    asm volatile (
        \\mov do_something_trampoline,%rax
        \\jmp *%rax
        ::: "rax"
    );
}

The above code produces:

DoSomething:
        mov     rax, qword ptr [do_something_trampoline]
        jmp     rax

Another possibility is:

var do_something_trampoline: usize = 0;

export fn DoSomething() callconv(.naked) noreturn {
    @setRuntimeSafety(false);
    asm volatile (
        \\lea (do_something_trampoline),%rax
        \\jmp *(%rax)
        ::: "rax"
    );
}

producing:

DoSomething:
        lea     rax, [do_something_trampoline]
        jmp     qword ptr [rax]

Probably completely off topic, sorry.
In the far past I did quite a bit of assembly in Delphi.
When I see the asm syntax we need in Zig I am kind of confused.
Why all this super weird stuff and not just an asm block with real statements?

\\jmp *(%rax)
::: "rax"
 // omg...

Assembly is a black box for the compiler, so without more information it would have to disable all optimizations around it whenever it finds one. Constraints are a way to recover some of those optimizations. You tell the compiler what you’re reading and writing, and it ensures that the assembly block is properly ordered relative to the statements around it. Without that, the compiler could reorder the statements in such a way that you read a value that hasn’t been computed yet, or your write gets overwritten.
Constraints are not optional: reading or writing values without setting the proper constraints is undefined behavior.
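
To make that concrete, here is a minimal sketch of an asm block that declares everything it reads and writes. The helper name addOne is made up for illustration and is not from this thread; it assumes an x86_64 target and Zig's AT&T-style operand syntax.

```zig
const std = @import("std");

// Hypothetical helper, not from the thread: computes x + 1 with a single lea.
fn addOne(x: u64) u64 {
    // [in]  "r"  : the compiler may place the input in any general purpose register.
    // [out] "=r" : the result is written to a register the compiler chooses.
    // lea does not modify the flags or memory, so no clobbers are listed.
    // The block has an output and no side effects, so it is not marked volatile.
    return asm ("lea 1(%[in]), %[out]"
        : [out] "=r" (-> u64),
        : [in] "r" (x),
    );
}

pub fn main() void {
    std.debug.print("{d}\n", .{addOne(41)});
}
```

Because the input and output are declared, the compiler knows it must have x ready before the block and that the result only becomes available after it, so it can still schedule and optimize the surrounding code.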

I was wondering why there isn’t some way to do a direct jmp [do_something_trampoline].

Why is this kind of indirection needed, like loading to rsp… rax?

I think x86_64 supports something like jmp [rip + offset]?

Edit: it actually works (Compiler Explorer).

You still need to set the proper constraints:

var do_something_trampoline: usize = 0;

export fn DoSomething() callconv(.naked) noreturn {
    @setRuntimeSafety(false);
    asm volatile (
        \\jmp *do_something_trampoline
        :: [dummy] "rmx" (do_something_trampoline): "memory"
    );
}

With optimizations enabled, this generates the desired assembly: godbolt.


Can you explain a little bit more about “rmx”?

Interestingly

test.zig:

const std = @import("std");

var do_something_trampoline: usize = 0;

export fn DoSomething() callconv(.naked) noreturn {
    @setRuntimeSafety(false);
    asm volatile (
        \\jmp *do_something_trampoline
        :
        : [dummy] "rmx" (do_something_trampoline),
        : "memory"
    );
}

pub fn main() void {}

zig build-exe test.zig

gives error: lld-link: undefined symbol: do_something_trampoline

zig version: 0.15.0-dev.631+9a3540d61

It needs export var do_something_trampoline: usize = 0; for it to compile, which is kind of weird. Is it fixed in some newer version, or in whatever https://godbolt.org/ is using?


That’s intended. My bad, I forgot about export.
Godbolt only runs the compilation phase; it doesn’t do linking, which is why it didn’t error.
You can avoid the export by doing this:

    asm volatile (
-        \\jmp *do_something_trampoline
+        \\jmp *dummy
        :: [dummy] "rmx" (do_something_trampoline): "memory"

The body of an inline assembly block is a separate compilation unit, so it can only access things that are exported, or that are explicitly provided to it in the constraints.
"rmx" is explained here.
Concatenating constraints by juxtaposition like this means alternatives. r is register, m is memory, and x is supposed to be “anything”. The compiler can choose to pass the value of do_something_trampoline to the body of the inline assembly in any of these places. You’d imagine that x would make everything else redundant, but, for some reason, it doesn’t. :man_shrugging: (Note that in GCC’s constraint documentation it is uppercase X that means “any operand”; lowercase x on x86 is an SSE register, which may be why it doesn’t subsume the others.)
I tried each individual constraint and none of them generated the desired assembly; it only worked when I provided all the alternatives.


Is it kind of a workaround for Zig’s unstable inline assembly, or is that how “rmx” is supposed to work?

By the way,

this is also giving me an undefined symbol error.

I don’t know. Inline assembly is kind of an afterthought in LLVM, and is poorly documented.

Oops, I always forget the syntax of inline assembly. The correct way is

\\jmp *%[dummy]
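
For reference, this is what the whole function would presumably look like with that correction applied. It is only the snippets from this thread assembled into one piece, with no new names introduced; I have not checked the output beyond what was reported above, and the generated code apparently depends on the optimization mode.

```zig
var do_something_trampoline: usize = 0;

export fn DoSomething() callconv(.naked) noreturn {
    @setRuntimeSafety(false);
    asm volatile (
        // %[dummy] refers to the named input operand below, so the assembly
        // body no longer needs the symbol itself to be exported.
        \\jmp *%[dummy]
        :
        : [dummy] "rmx" (do_something_trampoline),
        : "memory"
    );
}
```

Since the body only names the operand, the compiler is free to pick any of the listed alternatives, which means the value may still pass through a register or a stack slot before the jump.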

The debug version moves the value to rax, then to [rsp], then does jmp [rsp] :") Not trying to be annoying, but it’s not consistent, which is kind of weird…
