# Inline assembly

Rust provides support for inline assembly via the `asm!` macro.
It can be used to embed handwritten assembly in the assembly output generated by the compiler.
Generally this should not be necessary, but might be where the required performance or timing
cannot be otherwise achieved. Accessing low level hardware primitives, e.g. in kernel code, may also demand this functionality.

> **Note**: the examples here are given in x86/x86-64 assembly, but other architectures are also supported.

Inline assembly is currently supported on the following architectures:
- x86 and x86-64
- ARM
- AArch64
- RISC-V

## Basic usage

Let us start with the simplest possible example:

```rust
# #[cfg(target_arch = "x86_64")] {
use std::arch::asm;

unsafe {
    asm!("nop");
}
# }
```

This will insert a NOP (no operation) instruction into the assembly generated by the compiler.
Note that all `asm!` invocations have to be inside an `unsafe` block, as they could insert
arbitrary instructions and break various invariants. The instructions to be inserted are listed
in the first argument of the `asm!` macro as a string literal.

## Inputs and outputs

Now inserting an instruction that does nothing is rather boring. Let us do something that
actually acts on data:

```rust
# #[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let x: u64;
unsafe {
    asm!("mov {}, 5", out(reg) x);
}
assert_eq!(x, 5);
# }
```

This will write the value `5` into the `u64` variable `x`.
You can see that the string literal we use to specify instructions is actually a template string.
It is governed by the same rules as Rust [format strings][format-syntax].
The arguments that are inserted into the template however look a bit different than you may
be familiar with. First we need to specify if the variable is an input or an output of the
inline assembly. In this case it is an output. We declared this by writing `out`.
We also need to specify in what kind of register the assembly expects the variable.
In this case we put it in an arbitrary general purpose register by specifying `reg`.
The compiler will choose an appropriate register to insert into
the template and will read the variable from there after the inline assembly finishes executing.

[format-syntax]: https://doc.rust-lang.org/std/fmt/#syntax

Let us see another example that also uses an input:

```rust
# #[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let i: u64 = 3;
let o: u64;
unsafe {
    asm!(
        "mov {0}, {1}",
        "add {0}, 5",
        out(reg) o,
        in(reg) i,
    );
}
assert_eq!(o, 8);
# }
```

This will add `5` to the input in variable `i` and write the result to variable `o`.
The particular way this assembly does this is first copying the value from `i` to the output,
and then adding `5` to it.

The example shows a few things:

First, we can see that `asm!` allows multiple template string arguments; each
one is treated as a separate line of assembly code, as if they were all joined
together with newlines between them. This makes it easy to format assembly
code.

Second, we can see that inputs are declared by writing `in` instead of `out`.

Third, we can see that we can specify an argument number, or name as in any format string.
For inline assembly templates this is particularly useful as arguments are often used more than once.
For more complex inline assembly using this facility is generally recommended, as it improves
readability, and allows reordering instructions without changing the argument order.
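
The previous example used argument numbers. As a minimal sketch of the named form, the same computation can be written with named operands. The function name `add_five_named` and the operand names `dst` and `src` are made up for this illustration:

```rust
use std::arch::asm;

// Hypothetical helper: returns `input + 5` using named template operands.
#[cfg(target_arch = "x86_64")]
fn add_five_named(input: u64) -> u64 {
    let output: u64;
    unsafe {
        asm!(
            // Copy the input into the output register, then add 5 to it.
            "mov {dst}, {src}",
            "add {dst}, 5",
            // `name = operand` binds an operand to a name used in the template.
            dst = out(reg) output,
            src = in(reg) input,
        );
    }
    output
}
```

Because the template refers to operands by name, the operand list could be reordered without touching the instructions.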

We can further refine the above example to avoid the `mov` instruction:

```rust
# #[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let mut x: u64 = 3;
unsafe {
    asm!("add {0}, 5", inout(reg) x);
}
assert_eq!(x, 8);
# }
```

We can see that `inout` is used to specify an argument that is both input and output.
This is different from specifying an input and output separately in that it is guaranteed to assign both to the same register.

It is also possible to specify different variables for the input and output parts of an `inout` operand:

```rust
# #[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let x: u64 = 3;
let y: u64;
unsafe {
    asm!("add {0}, 5", inout(reg) x => y);
}
assert_eq!(y, 8);
# }
```

## Late output operands

The Rust compiler is conservative with its allocation of operands. It is assumed that an `out`
can be written at any time, and can therefore not share its location with any other argument.
However, to guarantee optimal performance it is important to use as few registers as possible,
so they won't have to be saved and reloaded around the inline assembly block.
To achieve this Rust provides a `lateout` specifier. This can be used on any output that is
written only after all inputs have been consumed.
There is also an `inlateout` variant of this specifier.

Here is an example where `inlateout` *cannot* be used in `release` mode or other optimized cases:

```rust
# #[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let mut a: u64 = 4;
let b: u64 = 4;
let c: u64 = 4;
unsafe {
    asm!(
        "add {0}, {1}",
        "add {0}, {2}",
        inout(reg) a,
        in(reg) b,
        in(reg) c,
    );
}
assert_eq!(a, 12);
# }
```
If `inlateout` were used here instead of `inout`, the code could still work in unoptimized builds (`Debug` mode), but it might fail in optimized builds (`release` mode or other optimized cases).

That is because in optimized cases, the compiler is free to allocate the same register for inputs `b` and `c` since it knows they have the same value. However it must allocate a separate register for `a` since it uses `inout` and not `inlateout`. If `inlateout` were used, then `a` and `c` could be allocated to the same register, in which case the first instruction would overwrite the value of `c` and cause the assembly code to produce the wrong result.

However the following example can use `inlateout` since the output is only modified after all input registers have been read:

```rust
# #[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let mut a: u64 = 4;
let b: u64 = 4;
unsafe {
    asm!("add {0}, {1}", inlateout(reg) a, in(reg) b);
}
assert_eq!(a, 8);
# }
```

As you can see, this assembly fragment will still work correctly if `a` and `b` are assigned to the same register.
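
The examples above only show `inlateout`. As a rough sketch of plain `lateout`, a single `lea` instruction reads its input and writes its output in one step, so the output can safely share a register with the input. The function name `add_five_late` is hypothetical:

```rust
use std::arch::asm;

// Hypothetical helper: returns `input + 5` using a single `lea` instruction.
#[cfg(target_arch = "x86_64")]
fn add_five_late(input: u64) -> u64 {
    let output: u64;
    unsafe {
        asm!(
            // `lea` reads `{1}` and writes `{0}` within the same instruction,
            // so nothing reads the input after the output has been written.
            "lea {0}, [{1} + 5]",
            lateout(reg) output,
            in(reg) input,
        );
    }
    output
}
```

If the compiler happens to pick the same register for both operands, the instruction becomes something like `lea rax, [rax + 5]`, which still produces the expected result.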

## Explicit register operands

Some instructions require that the operands be in a specific register.
Therefore, Rust inline assembly provides some more specific constraint specifiers.
While `reg` is generally available on any architecture, explicit registers are highly architecture specific. E.g. for x86 the general purpose registers `eax`, `ebx`, `ecx`, `edx`, `ebp`, `esi`, and `edi` among others can be addressed by their name.

```rust,no_run
# #[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let cmd = 0xd1;
unsafe {
    asm!("out 0x64, eax", in("eax") cmd);
}
# }
```

In this example we call the `out` instruction to output the content of the `cmd` variable to port `0x64`. Since the `out` instruction only accepts `eax` (and its sub registers) as operand we had to use the `eax` constraint specifier.

> **Note**: unlike other operand types, explicit register operands cannot be used in the template string: you can't use `{}` and should write the register name directly instead. Also, they must appear at the end of the operand list after all other operand types.
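
Since the `out` example cannot be run in user space, here is a minimal runnable sketch of the same rules. The x86 `shl` instruction accepts a variable shift count only in `cl`, so the count is pinned to `ecx`, while the value being shifted can live in any general purpose register. The function name `shift_left` is invented for this illustration:

```rust
use std::arch::asm;

// Hypothetical helper: shifts `value` left by `count` bits.
// For 64-bit operands the CPU only uses the low 6 bits of the count.
#[cfg(target_arch = "x86_64")]
fn shift_left(value: u64, count: u32) -> u64 {
    let mut result = value;
    unsafe {
        asm!(
            // `cl` is written literally because explicit register operands
            // cannot be referenced through `{}` placeholders.
            "shl {0}, cl",
            inout(reg) result,
            // The explicit register operand comes last in the operand list.
            in("ecx") count,
        );
    }
    result
}
```

The compiler will not place `result` in `rcx`, because that register is already claimed by the explicit `ecx` input.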

Consider this example which uses the x86 `mul` instruction:

```rust
# #[cfg(target_arch = "x86_64")] {
use std::arch::asm;

fn mul(a: u64, b: u64) -> u128 {
    let lo: u64;
    let hi: u64;

    unsafe {
        asm!(
            // The x86 mul instruction takes rax as an implicit input and writes
            // the 128-bit result of the multiplication to rdx:rax.
            "mul {}",
            in(reg) a,
            inlateout("rax") b => lo,
            lateout("rdx") hi
        );
    }

    ((hi as u128) << 64) + lo as u128
}
# }
```

This uses the `mul` instruction to multiply two 64-bit inputs with a 128-bit result.
The only explicit operand is a register, which we fill from the variable `a`.
The second operand is implicit, and must be the `rax` register, which we fill from the variable `b`.
The lower 64 bits of the result are stored in `rax` from which we fill the variable `lo`.
The higher 64 bits are stored in `rdx` from which we fill the variable `hi`.

## Clobbered registers

In many cases inline assembly will modify state that is not needed as an output.
Usually this is either because we have to use a scratch register in the assembly or because instructions modify state that we don't need to further examine.
This state is generally referred to as being "clobbered".
We need to tell the compiler about this since it may need to save and restore this state around the inline assembly block.

```rust
use std::arch::asm;

# #[cfg(target_arch = "x86_64")]
fn main() {
    // three entries of four bytes each
    let mut name_buf = [0_u8; 12];
    // String is stored as ascii in ebx, edx, ecx in order
    // Because ebx is reserved, the asm needs to preserve the value of it.
    // So we push and pop it around the main asm.
    // (in 64 bit mode for 64 bit processors, 32 bit processors would use ebx)

    unsafe {
        asm!(
            "push rbx",
            "cpuid",
            "mov [rdi], ebx",
            "mov [rdi + 4], edx",
            "mov [rdi + 8], ecx",
            "pop rbx",
            // We use a pointer to an array for storing the values to simplify
            // the Rust code at the cost of a couple more asm instructions
            // This is more explicit with how the asm works however, as opposed
            // to explicit register outputs such as `out("ecx") val`
            // The *pointer itself* is only an input even though it's written behind
            in("rdi") name_buf.as_mut_ptr(),
            // select cpuid 0, also specify eax as clobbered
            inout("eax") 0 => _,
            // cpuid clobbers these registers too
            out("ecx") _,
            out("edx") _,
        );
    }

    let name = core::str::from_utf8(&name_buf).unwrap();
    println!("CPU Manufacturer ID: {}", name);
}

# #[cfg(not(target_arch = "x86_64"))]
# fn main() {}
```

In the example above we use the `cpuid` instruction to read the CPU manufacturer ID.
This instruction writes to `eax` with the maximum supported `cpuid` argument and `ebx`, `edx`, and `ecx` with the CPU manufacturer ID as ASCII bytes in that order.

Even though `eax` is never read we still need to tell the compiler that the register has been modified so that the compiler can save any value that was in it before the asm. This is done by declaring it as an output but with `_` instead of a variable name, which indicates that the output value is to be discarded.

This code also works around the limitation that `ebx` is reserved by LLVM. That means that LLVM assumes that it has full control over the register and it must be restored to its original state before exiting the asm block, so it cannot be used as an input or output **except** if the compiler uses it to fulfill a general register class (e.g. `in(reg)`). This makes `reg` operands dangerous when using reserved registers as we could unknowingly corrupt our input or output because they share the same register.

To work around this we use `rdi` to store the pointer to the output array, save `ebx` via `push`, read from `ebx` inside the asm block into the array and then restore `ebx` to its original state via `pop`. The `push` and `pop` use the full 64-bit `rbx` version of the register to ensure that the entire register is saved. On 32 bit targets the code would instead use `ebx` in the `push`/`pop`.

The `_` placeholder can also be used with a general register class to obtain a scratch register for use inside the asm code:

```rust
# #[cfg(target_arch = "x86_64")] {
use std::arch::asm;

// Multiply x by 6 using shifts and adds
let mut x: u64 = 4;
unsafe {
    asm!(
        "mov {tmp}, {x}",
        "shl {tmp}, 1",
        "shl {x}, 2",
        "add {x}, {tmp}",
        x = inout(reg) x,
        tmp = out(reg) _,
    );
}
assert_eq!(x, 4 * 6);
# }
```

## Symbol operands and ABI clobbers

By default, `asm!` assumes that any register not specified as an output will have its contents preserved by the assembly code. The [`clobber_abi`] argument to `asm!` tells the compiler to automatically insert the necessary clobber operands according to the given calling convention ABI: any register which is not fully preserved in that ABI will be treated as clobbered. Multiple `clobber_abi` arguments may be provided and all clobbers from all specified ABIs will be inserted.

[`clobber_abi`]: ../../reference/inline-assembly.html#abi-clobbers

```rust
# #[cfg(target_arch = "x86_64")] {
use std::arch::asm;

extern "C" fn foo(arg: i32) -> i32 {
    println!("arg = {}", arg);
    arg * 2
}

fn call_foo(arg: i32) -> i32 {
    unsafe {
        let result;
        asm!(
            "call {}",
            // Function pointer to call
            in(reg) foo,
            // 1st argument in rdi
            in("rdi") arg,
            // Return value in rax
            out("rax") result,
            // Mark all registers which are not preserved by the "C" calling
            // convention as clobbered.
            clobber_abi("C"),
        );
        result
    }
}
# }
```
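
The example above passes the function as a pointer in a register. As a sketch of the symbol operand form, the `sym` operand inserts the function's symbol name directly into the template, so no register is needed for the call target. This assumes a toolchain where `sym` operands are available and, like the example above, a target whose "C" calling convention passes the first argument in `rdi` (System V, e.g. Linux). The names `double_it` and `call_double_it` are hypothetical:

```rust
use std::arch::asm;

// Hypothetical callee; `wrapping_mul` avoids an overflow panic inside a call
// that was made from assembly.
extern "C" fn double_it(arg: i32) -> i32 {
    arg.wrapping_mul(2)
}

#[cfg(target_arch = "x86_64")]
fn call_double_it(arg: i32) -> i32 {
    let result: i32;
    unsafe {
        asm!(
            "call {}",
            // The symbol name of `double_it` is substituted into the template.
            sym double_it,
            // 1st argument in rdi (System V calling convention assumed)
            in("rdi") arg,
            // Return value in rax
            out("rax") result,
            // Everything the "C" ABI does not preserve is treated as clobbered.
            clobber_abi("C"),
        );
    }
    result
}
```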

## Register template modifiers

In some cases, fine control is needed over the way a register name is formatted when inserted into the template string. This is needed when an architecture's assembly language has several names for the same register, each typically being a "view" over a subset of the register (e.g. the low 32 bits of a 64-bit register).

By default the compiler will always choose the name that refers to the full register size (e.g. `rax` on x86-64, `eax` on x86, etc).

This default can be overridden by using modifiers on the template string operands, just like you would with format strings:

```rust
# #[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let mut x: u16 = 0xab;

unsafe {
    asm!("mov {0:h}, {0:l}", inout(reg_abcd) x);
}

assert_eq!(x, 0xabab);
# }
```

In this example, we use the `reg_abcd` register class to restrict the register allocator to the 4 legacy x86 registers (`ax`, `bx`, `cx`, `dx`) of which the first two bytes can be addressed independently.

Let us assume that the register allocator has chosen to allocate `x` in the `ax` register.
The `h` modifier will emit the register name for the high byte of that register and the `l` modifier will emit the register name for the low byte. The asm code will therefore be expanded as `mov ah, al` which copies the low byte of the value into the high byte.

If you use a smaller data type (e.g. `u16`) with an operand and forget to use template modifiers, the compiler will emit a warning and suggest the correct modifier to use.
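
As an illustration of the smaller data type case, here is a minimal sketch that operates on a `u32` in a general purpose register. On x86-64 the `e` modifier selects the 32-bit name of the allocated register (for example `eax` instead of `rax`). The function name `double_u32` is made up for this example:

```rust
use std::arch::asm;

// Hypothetical helper: doubles a `u32` by adding the register to itself.
#[cfg(target_arch = "x86_64")]
fn double_u32(value: u32) -> u32 {
    let mut x = value;
    unsafe {
        // `{0:e}` formats operand 0 with its 32-bit register name,
        // which matches the width of the `u32` operand.
        asm!("add {0:e}, {0:e}", inout(reg) x);
    }
    x
}
```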

## Memory address operands

Sometimes assembly instructions require operands passed via memory addresses/memory locations.
You have to manually use the memory address syntax specified by the target architecture.
For example, on x86/x86_64 using Intel assembly syntax, you should wrap inputs/outputs in `[]` to indicate they are memory operands:

```rust
# #[cfg(target_arch = "x86_64")] {
use std::arch::asm;

fn load_fpu_control_word(control: u16) {
    unsafe {
        asm!("fldcw [{}]", in(reg) &control, options(nostack));
    }
}
# }
```
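
The example above reads from memory. As a sketch of the opposite direction, the following writes through a pointer held in a register. The pointer itself is only an input; the write is visible to Rust because the block does not claim `nomem`. The function name `store_forty_two` is hypothetical:

```rust
use std::arch::asm;

// Hypothetical helper: stores the constant 42 into the `u64` behind `slot`.
#[cfg(target_arch = "x86_64")]
fn store_forty_two(slot: &mut u64) {
    let ptr: *mut u64 = slot;
    unsafe {
        asm!(
            // `qword ptr` gives the store its 64-bit size, since neither
            // operand is a register that would imply one.
            "mov qword ptr [{0}], 42",
            in(reg) ptr,
            options(nostack),
        );
    }
}
```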

## Labels

Any reuse of a named label, local or otherwise, can result in an assembler or linker error or may cause other strange behavior. Reuse of a named label can happen in a variety of ways including:

- explicitly: using a label more than once in one `asm!` block, or multiple times across blocks.
- implicitly via inlining: the compiler is allowed to instantiate multiple copies of an `asm!` block, for example when the function containing it is inlined in multiple places.
- implicitly via LTO: LTO can cause code from *other crates* to be placed in the same codegen unit, and so could bring in arbitrary labels.

As a consequence, you should only use GNU assembler **numeric** [local labels] inside inline assembly code. Defining symbols in assembly code may lead to assembler and/or linker errors due to duplicate symbol definitions.

Moreover, on x86 when using the default Intel syntax, due to [an LLVM bug], you shouldn't use labels exclusively made of `0` and `1` digits, e.g. `0`, `11` or `101010`, as they may end up being interpreted as binary values. Using `options(att_syntax)` will avoid any ambiguity, but that affects the syntax of the _entire_ `asm!` block. (See [Options](#options), below, for more on `options`.)

```rust
# #[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let mut a = 0;
unsafe {
    asm!(
        "mov {0}, 10",
        "2:",
        "sub {0}, 1",
        "cmp {0}, 3",
        "jle 2f",
        "jmp 2b",
        "2:",
        "add {0}, 2",
        out(reg) a
    );
}
assert_eq!(a, 5);
# }
```

This will decrement the `{0}` register value from 10 to 3, then add 2 and store it in `a`.

This example shows a few things:

- First, that the same number can be used as a label multiple times in the same inline block.
- Second, that when a numeric label is used as a reference (as an instruction operand, for example), the suffixes “b” (“backward”) or “f” (“forward”) should be added to the numeric label. It will then refer to the nearest label defined by this number in this direction.

[local labels]: https://sourceware.org/binutils/docs/as/Symbol-Names.html#Local-Labels
[an LLVM bug]: https://bugs.llvm.org/show_bug.cgi?id=36144

## Options

By default, an inline assembly block is treated the same way as an external FFI function call with a custom calling convention: it may read/write memory, have observable side effects, etc. However, in many cases it is desirable to give the compiler more information about what the assembly code is actually doing so that it can optimize better.

Let's take our previous example of an `add` instruction:

```rust
# #[cfg(target_arch = "x86_64")] {
use std::arch::asm;

let mut a: u64 = 4;
let b: u64 = 4;
unsafe {
    asm!(
        "add {0}, {1}",
        inlateout(reg) a, in(reg) b,
        options(pure, nomem, nostack),
    );
}
assert_eq!(a, 8);
# }
```

Options can be provided as an optional final argument to the `asm!` macro. We specified three options here:
- `pure` means that the asm code has no observable side effects and that its output depends only on its inputs. This allows the compiler optimizer to call the inline asm fewer times or even eliminate it entirely.
- `nomem` means that the asm code does not read or write to memory. By default the compiler will assume that inline assembly can read or write any memory address that is accessible to it (e.g. through a pointer passed as an operand, or a global).
- `nostack` means that the asm code does not push any data onto the stack. This allows the compiler to use optimizations such as the stack red zone on x86-64 to avoid stack pointer adjustments.

These allow the compiler to better optimize code using `asm!`, for example by eliminating pure `asm!` blocks whose outputs are not needed.

See the [reference](../../reference/inline-assembly.html) for the full list of available options and their effects.
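
The `att_syntax` option mentioned in the Labels section can be combined with the options above. As a rough sketch, here is the same addition written in AT&T syntax, where the source operand comes first and the instruction carries a size suffix. The function name `add_att` is invented for this illustration:

```rust
use std::arch::asm;

// Hypothetical helper: returns `a + b`, written in AT&T syntax.
#[cfg(target_arch = "x86_64")]
fn add_att(a: u64, b: u64) -> u64 {
    let mut sum = a;
    unsafe {
        asm!(
            // AT&T operand order is `source, destination`, so this adds
            // operand 1 (`b`) into operand 0 (`sum`).
            "addq {1}, {0}",
            inlateout(reg) sum,
            in(reg) b,
            options(att_syntax, pure, nomem, nostack),
        );
    }
    sum
}
```

Note that `att_syntax` applies to the whole `asm!` block, so every line of the template has to be written in AT&T style.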