New upstream version 1.63.0+dfsg1
# Inline assembly

Rust provides support for inline assembly via the `asm!` macro.
It can be used to embed handwritten assembly in the assembly output generated by the compiler.
Generally this should not be necessary, but it might be where the required performance or timing
cannot be otherwise achieved. Accessing low-level hardware primitives, e.g. in kernel code, may also demand this functionality.

> **Note**: the examples here are given in x86/x86-64 assembly, but other architectures are also supported.

Inline assembly is currently supported on the following architectures:
- x86 and x86-64
- ARM
- AArch64
- RISC-V

## Basic usage

Let us start with the simplest possible example:

```rust
use std::arch::asm;

unsafe {
    asm!("nop");
}
```

This will insert a NOP (no operation) instruction into the assembly generated by the compiler.
Note that all `asm!` invocations have to be inside an `unsafe` block, as they could insert
arbitrary instructions and break various invariants. The instructions to be inserted are listed
in the first argument of the `asm!` macro as a string literal.

## Inputs and outputs

Now inserting an instruction that does nothing is rather boring. Let us do something that
actually acts on data:

```rust
use std::arch::asm;

let x: u64;
unsafe {
    asm!("mov {}, 5", out(reg) x);
}
assert_eq!(x, 5);
```

This will write the value `5` into the `u64` variable `x`.
You can see that the string literal we use to specify instructions is actually a template string.
It is governed by the same rules as Rust [format strings][format-syntax].
The arguments that are inserted into the template, however, look a bit different than you may
be familiar with. First we need to specify if the variable is an input or an output of the
inline assembly. In this case it is an output. We declared this by writing `out`.
We also need to specify in what kind of register the assembly expects the variable.
In this case we put it in an arbitrary general purpose register by specifying `reg`.
The compiler will choose an appropriate register to insert into
the template and will read the variable from there after the inline assembly finishes executing.

[format-syntax]: https://doc.rust-lang.org/std/fmt/#syntax

Let us see another example that also uses an input:

```rust
use std::arch::asm;

let i: u64 = 3;
let o: u64;
unsafe {
    asm!(
        "mov {0}, {1}",
        "add {0}, 5",
        out(reg) o,
        in(reg) i,
    );
}
assert_eq!(o, 8);
```

This will add `5` to the input in variable `i` and write the result to variable `o`.
The particular way this assembly does this is first copying the value from `i` to the output,
and then adding `5` to it.

The example shows a few things:

First, we can see that `asm!` allows multiple template string arguments; each
one is treated as a separate line of assembly code, as if they were all joined
together with newlines between them. This makes it easy to format assembly
code.

Second, we can see that inputs are declared by writing `in` instead of `out`.

Third, we can see that we can specify an argument number or name, as in any format string.
For inline assembly templates this is particularly useful as arguments are often used more than once.
For more complex inline assembly using this facility is generally recommended, as it improves
readability, and allows reordering instructions without changing the argument order.

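The named form mentioned in the third point can be sketched as follows. This is only an illustration, assuming an x86-64 target; the function name `add_five` and the operand names `dst` and `src` are hypothetical and were chosen for this sketch. It performs the same `mov` and `add` sequence as the previous example, but the template refers to operands by name rather than by position.

```rust
use std::arch::asm;

// Hypothetical helper (not part of any library): returns `input + 5`.
fn add_five(input: u64) -> u64 {
    let output: u64;
    unsafe {
        asm!(
            // Named operands keep the template readable even if the
            // operand list below is reordered.
            "mov {dst}, {src}",
            "add {dst}, 5",
            dst = out(reg) output,
            src = in(reg) input,
        );
    }
    output
}
```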
We can further refine the above example to avoid the `mov` instruction:

```rust
use std::arch::asm;

let mut x: u64 = 3;
unsafe {
    asm!("add {0}, 5", inout(reg) x);
}
assert_eq!(x, 8);
```

We can see that `inout` is used to specify an argument that is both input and output.
This is different from specifying an input and output separately in that it is guaranteed to assign both to the same register.

It is also possible to specify different variables for the input and output parts of an `inout` operand:

```rust
use std::arch::asm;

let x: u64 = 3;
let y: u64;
unsafe {
    asm!("add {0}, 5", inout(reg) x => y);
}
assert_eq!(y, 8);
```

## Late output operands

The Rust compiler is conservative with its allocation of operands. It is assumed that an `out`
can be written at any time, and can therefore not share its location with any other argument.
However, to guarantee optimal performance it is important to use as few registers as possible,
so they won't have to be saved and reloaded around the inline assembly block.
To achieve this Rust provides a `lateout` specifier. This can be used on any output that is
written only after all inputs have been consumed.
There is also an `inlateout` variant of this specifier.

Here is an example where `inlateout` *cannot* be used in `release` mode or other optimized cases:

```rust
use std::arch::asm;

let mut a: u64 = 4;
let b: u64 = 4;
let c: u64 = 4;
unsafe {
    asm!(
        "add {0}, {1}",
        "add {0}, {2}",
        inout(reg) a,
        in(reg) b,
        in(reg) c,
    );
}
assert_eq!(a, 12);
```

In unoptimized cases (e.g. `Debug` mode), replacing `inout(reg) a` with `inlateout(reg) a` in the above example may still give the expected result. In `release` mode or other optimized cases, however, it can produce the wrong result.

That is because in optimized cases, the compiler is free to allocate the same register for inputs `b` and `c` since it knows they have the same value. However, it must allocate a separate register for `a` since it uses `inout` and not `inlateout`. If `inlateout` were used, then `a` and `c` could be allocated to the same register, in which case the first instruction would overwrite the value of `c` and cause the assembly code to produce the wrong result.

However, the following example can use `inlateout` since the output is only modified after all input registers have been read:

```rust
use std::arch::asm;

let mut a: u64 = 4;
let b: u64 = 4;
unsafe {
    asm!("add {0}, {1}", inlateout(reg) a, in(reg) b);
}
assert_eq!(a, 8);
```

As you can see, this assembly fragment will still work correctly if `a` and `b` are assigned to the same register.

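The plain `lateout` specifier is described above but not shown. A minimal sketch, assuming an x86-64 target, might look like the following; the function name `add_with_lea` and the operand names are hypothetical. The single `lea` instruction reads both inputs before it writes its result, so the output can safely be allowed to share a register with either input.

```rust
use std::arch::asm;

// Hypothetical helper for illustration: wrapping addition via `lea`.
fn add_with_lea(a: u64, b: u64) -> u64 {
    let sum: u64;
    unsafe {
        asm!(
            // `lea` only computes the address `a + b`; it does not access
            // memory, and it consumes both inputs before writing `sum`.
            "lea {sum}, [{a} + {b}]",
            a = in(reg) a,
            b = in(reg) b,
            // `lateout` lets the compiler reuse an input register here.
            sum = lateout(reg) sum,
        );
    }
    sum
}
```

If the template were extended so that an input is read after `sum` has been written, `lateout` would have to be changed back to `out`.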
## Explicit register operands

Some instructions require that the operands be in a specific register.
Therefore, Rust inline assembly provides some more specific constraint specifiers.
While `reg` is generally available on any architecture, explicit registers are highly architecture specific. E.g. for x86 the general purpose registers `eax`, `ebx`, `ecx`, `edx`, `ebp`, `esi`, and `edi` among others can be addressed by their name.

```rust,no_run
use std::arch::asm;

let cmd = 0xd1;
unsafe {
    asm!("out 0x64, eax", in("eax") cmd);
}
```

In this example we call the `out` instruction to output the content of the `cmd` variable to port `0x64`. Since the `out` instruction only accepts `eax` (and its sub registers) as operand we had to use the `eax` constraint specifier.

> **Note**: unlike other operand types, explicit register operands cannot be used in the template string: you can't use `{}` and should write the register name directly instead. Also, they must appear at the end of the operand list after all other operand types.

Consider this example which uses the x86 `mul` instruction:

```rust
use std::arch::asm;

fn mul(a: u64, b: u64) -> u128 {
    let lo: u64;
    let hi: u64;

    unsafe {
        asm!(
            // The x86 mul instruction takes rax as an implicit input and writes
            // the 128-bit result of the multiplication to rdx:rax
            // (high half in rdx, low half in rax).
            "mul {}",
            in(reg) a,
            inlateout("rax") b => lo,
            lateout("rdx") hi
        );
    }

    ((hi as u128) << 64) + lo as u128
}
```

This uses the `mul` instruction to multiply two 64-bit inputs with a 128-bit result.
The only explicit operand is a register, which we fill from the variable `a`.
The second operand is implicit, and must be the `rax` register, which we fill from the variable `b`.
The lower 64 bits of the result are stored in `rax` from which we fill the variable `lo`.
The higher 64 bits are stored in `rdx` from which we fill the variable `hi`.

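The same pattern applies to other instructions with implicit operands. As a further sketch, assuming an x86-64 target, the unsigned `div` instruction divides the 128-bit value in `rdx:rax` by its explicit operand, leaving the quotient in `rax` and the remainder in `rdx`. The function name `div_rem` is hypothetical and only serves this illustration.

```rust
use std::arch::asm;

// Hypothetical helper for illustration: unsigned division with remainder.
fn div_rem(dividend: u64, divisor: u64) -> (u64, u64) {
    // `div` raises a hardware exception on division by zero, so rule it out.
    assert!(divisor != 0, "division by zero");

    let quotient: u64;
    let remainder: u64;
    unsafe {
        asm!(
            "div {divisor}",
            divisor = in(reg) divisor,
            // Low half of the dividend goes in via rax; quotient comes out.
            inout("rax") dividend => quotient,
            // High half of the dividend is zero; remainder comes out in rdx.
            inout("rdx") 0_u64 => remainder,
        );
    }
    (quotient, remainder)
}
```

Because the high half of the dividend is zero and the divisor is non-zero, the quotient always fits in 64 bits, so the instruction cannot overflow in this sketch.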
## Clobbered registers

In many cases inline assembly will modify state that is not needed as an output.
Usually this is either because we have to use a scratch register in the assembly or because instructions modify state that we don't need to further examine.
This state is generally referred to as being "clobbered".
We need to tell the compiler about this since it may need to save and restore this state around the inline assembly block.

```rust
use core::arch::asm;

fn main() {
    // three entries of four bytes each
    let mut name_buf = [0_u8; 12];
    // String is stored as ascii in ebx, edx, ecx in order
    // Because ebx is reserved, the asm needs to preserve the value of it.
    // So we push and pop it around the main asm.
    // (in 64 bit mode for 64 bit processors, 32 bit processors would use ebx)

    unsafe {
        asm!(
            "push rbx",
            "cpuid",
            "mov [rdi], ebx",
            "mov [rdi + 4], edx",
            "mov [rdi + 8], ecx",
            "pop rbx",
            // We use a pointer to an array for storing the values to simplify
            // the Rust code at the cost of a couple more asm instructions
            // This is more explicit with how the asm works however, as opposed
            // to explicit register outputs such as `out("ecx") val`
            // The *pointer itself* is only an input even though it's written behind
            in("rdi") name_buf.as_mut_ptr(),
            // select cpuid 0, also specify eax as clobbered
            inout("eax") 0 => _,
            // cpuid clobbers these registers too
            out("ecx") _,
            out("edx") _,
        );
    }

    let name = core::str::from_utf8(&name_buf).unwrap();
    println!("CPU Manufacturer ID: {}", name);
}
```

In the example above we use the `cpuid` instruction to read the CPU manufacturer ID.
This instruction writes to `eax` with the maximum supported `cpuid` argument and `ebx`, `edx`, and `ecx` with the CPU manufacturer ID as ASCII bytes in that order.

Even though `eax` is never read we still need to tell the compiler that the register has been modified so that the compiler can save any values that were in this register before the asm. This is done by declaring it as an output but with `_` instead of a variable name, which indicates that the output value is to be discarded.

This code also works around the limitation that `ebx` is a register reserved by LLVM. That means that LLVM assumes that it has full control over the register and it must be restored to its original state before exiting the asm block, so it cannot be used as an input or output **except** if the compiler uses it to fulfill a general register class (e.g. `in(reg)`). This makes `reg` operands dangerous when using reserved registers as we could unknowingly corrupt our input or output because they share the same register.

To work around this we use `rdi` to store the pointer to the output array, save `ebx` via `push`, read from `ebx` inside the asm block into the array, and then restore `ebx` to its original state via `pop`. The `push` and `pop` use the full 64-bit `rbx` version of the register to ensure that the entire register is saved. On 32 bit targets the code would instead use `ebx` in the `push`/`pop`.

The `_` placeholder can also be used with a general register class to obtain a scratch register for use inside the asm code:

```rust
use std::arch::asm;

// Multiply x by 6 using shifts and adds
let mut x: u64 = 4;
unsafe {
    asm!(
        "mov {tmp}, {x}",
        "shl {tmp}, 1",
        "shl {x}, 2",
        "add {x}, {tmp}",
        x = inout(reg) x,
        tmp = out(reg) _,
    );
}
assert_eq!(x, 4 * 6);
```

## Symbol operands and ABI clobbers

By default, `asm!` assumes that any register not specified as an output will have its contents preserved by the assembly code. The [`clobber_abi`] argument to `asm!` tells the compiler to automatically insert the necessary clobber operands according to the given calling convention ABI: any register which is not fully preserved in that ABI will be treated as clobbered. Multiple `clobber_abi` arguments may be provided and all clobbers from all specified ABIs will be inserted.

[`clobber_abi`]: ../../reference/inline-assembly.html#abi-clobbers

```rust
use std::arch::asm;

extern "C" fn foo(arg: i32) -> i32 {
    println!("arg = {}", arg);
    arg * 2
}

fn call_foo(arg: i32) -> i32 {
    unsafe {
        let result;
        asm!(
            // Indirect call through a register; the default Intel syntax
            // does not use the AT&T-style `*` prefix.
            "call {}",
            // Function pointer to call
            in(reg) foo,
            // 1st argument in rdi
            in("rdi") arg,
            // Return value in rax
            out("rax") result,
            // Mark all registers which are not preserved by the "C" calling
            // convention as clobbered.
            clobber_abi("C"),
        );
        result
    }
}
```

## Register template modifiers

In some cases, fine control is needed over the way a register name is formatted when inserted into the template string. This is needed when an architecture's assembly language has several names for the same register, each typically being a "view" over a subset of the register (e.g. the low 32 bits of a 64-bit register).

By default the compiler will always choose the name that refers to the full register size (e.g. `rax` on x86-64, `eax` on x86, etc).

This default can be overridden by using modifiers on the template string operands, just like you would with format strings:

```rust
use std::arch::asm;

let mut x: u16 = 0xab;

unsafe {
    asm!("mov {0:h}, {0:l}", inout(reg_abcd) x);
}

assert_eq!(x, 0xabab);
```

In this example, we use the `reg_abcd` register class to restrict the register allocator to the 4 legacy x86 registers (`ax`, `bx`, `cx`, `dx`) of which the first two bytes can be addressed independently.

Let us assume that the register allocator has chosen to allocate `x` in the `ax` register.
The `h` modifier will emit the register name for the high byte of that register and the `l` modifier will emit the register name for the low byte. The asm code will therefore be expanded as `mov ah, al` which copies the low byte of the value into the high byte.

If you use a smaller data type (e.g. `u16`) with an operand and forget to use template modifiers, the compiler will emit a warning and suggest the correct modifier to use.

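As a further sketch, assuming an x86-64 target, the `e` modifier selects the 32-bit name of a general purpose register (e.g. `eax` instead of `rax`), which matches a `u32` operand. The function name `add_u32` is hypothetical and exists only for this illustration.

```rust
use std::arch::asm;

// Hypothetical helper for illustration: wrapping 32-bit addition.
fn add_u32(a: u32, b: u32) -> u32 {
    let mut sum = a;
    unsafe {
        // `{0:e}` and `{1:e}` expand to the 32-bit register names, so the
        // `add` operates on 32 bits and wraps around on overflow.
        asm!("add {0:e}, {1:e}", inout(reg) sum, in(reg) b);
    }
    sum
}
```

Without the modifiers the template would use the 64-bit register names, which is what triggers the compiler warning described above.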
## Memory address operands

Sometimes assembly instructions require operands passed via memory addresses/memory locations.
You have to manually use the memory address syntax specified by the target architecture.
For example, on x86/x86_64 using Intel assembly syntax, you should wrap inputs/outputs in `[]` to indicate they are memory operands:

```rust
use std::arch::asm;

fn load_fpu_control_word(control: u16) {
    unsafe {
        asm!("fldcw [{}]", in(reg) &control, options(nostack));
    }
}
```

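The same `[]` syntax works for ordinary loads. The following is a minimal sketch, assuming an x86-64 target and Intel syntax; the function name `load_u64` and the operand names are hypothetical. The address is passed in a register as a raw pointer, and the template dereferences it with `qword ptr [..]`.

```rust
use std::arch::asm;

// Hypothetical helper for illustration: loads a u64 through a pointer.
fn load_u64(value: &u64) -> u64 {
    // Pass the address as a raw pointer operand.
    let ptr: *const u64 = value;
    let out: u64;
    unsafe {
        asm!(
            // The brackets mark `{ptr}` as a memory operand: the instruction
            // reads 8 bytes from the address held in that register.
            "mov {out}, qword ptr [{ptr}]",
            ptr = in(reg) ptr,
            out = out(reg) out,
        );
    }
    out
}
```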
## Labels

Any reuse of a named label, local or otherwise, can result in an assembler or linker error or may cause other strange behavior. Reuse of a named label can happen in a variety of ways including:

- explicitly: using a label more than once in one `asm!` block, or multiple times across blocks.
- implicitly via inlining: the compiler is allowed to instantiate multiple copies of an `asm!` block, for example when the function containing it is inlined in multiple places.
- implicitly via LTO: LTO can cause code from *other crates* to be placed in the same codegen unit, and so could bring in arbitrary labels.

As a consequence, you should only use GNU assembler **numeric** [local labels] inside inline assembly code. Defining symbols in assembly code may lead to assembler and/or linker errors due to duplicate symbol definitions.

Moreover, on x86 when using the default Intel syntax, due to [an LLVM bug], you shouldn't use labels exclusively made of `0` and `1` digits, e.g. `0`, `11` or `101010`, as they may end up being interpreted as binary values. Using `options(att_syntax)` will avoid any ambiguity, but that affects the syntax of the _entire_ `asm!` block. (See [Options](#options), below, for more on `options`.)

```rust
use std::arch::asm;

let mut a = 0;
unsafe {
    asm!(
        "mov {0}, 10",
        "2:",
        "sub {0}, 1",
        "cmp {0}, 3",
        "jle 2f",
        "jmp 2b",
        "2:",
        "add {0}, 2",
        out(reg) a
    );
}
assert_eq!(a, 5);
```

This will decrement the `{0}` register value from 10 to 3, then add 2 and store it in `a`.

This example shows a few things:

- First, that the same number can be used as a label multiple times in the same inline block.
- Second, that when a numeric label is used as a reference (as an instruction operand, for example), the suffixes “b” (“backward”) or “f” (“forward”) should be added to the numeric label. It will then refer to the nearest label defined by this number in this direction.

[local labels]: https://sourceware.org/binutils/docs/as/Symbol-Names.html#Local-Labels
[an LLVM bug]: https://bugs.llvm.org/show_bug.cgi?id=36144

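As another sketch of numeric local labels, assuming an x86-64 target, the loop below sums the integers from `n` down to 1. The function name `sum_to` and the operand names are hypothetical. The labels `2` and `3` are used because they are not made up exclusively of `0` and `1` digits.

```rust
use std::arch::asm;

// Hypothetical helper for illustration: computes 1 + 2 + ... + n.
fn sum_to(n: u64) -> u64 {
    let sum: u64;
    unsafe {
        asm!(
            "xor {sum}, {sum}",  // sum = 0
            "test {n}, {n}",     // skip the loop entirely when n == 0
            "jz 3f",
            "2:",
            "add {sum}, {n}",    // sum += n
            "dec {n}",           // n -= 1, sets the zero flag at 0
            "jnz 2b",            // jump backward to the nearest `2:`
            "3:",
            // The counter is modified, so it is an `inout` whose final
            // value is discarded.
            n = inout(reg) n => _,
            sum = out(reg) sum,
        );
    }
    sum
}
```

Here `jz 3f` jumps forward to the nearest following `3:` label, and `jnz 2b` jumps backward to the nearest preceding `2:` label.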
## Options

By default, an inline assembly block is treated the same way as an external FFI function call with a custom calling convention: it may read/write memory, have observable side effects, etc. However, in many cases it is desirable to give the compiler more information about what the assembly code is actually doing so that it can optimize better.

Let's take our previous example of an `add` instruction:

```rust
use std::arch::asm;

let mut a: u64 = 4;
let b: u64 = 4;
unsafe {
    asm!(
        "add {0}, {1}",
        inlateout(reg) a, in(reg) b,
        options(pure, nomem, nostack),
    );
}
assert_eq!(a, 8);
```

Options can be provided as an optional final argument to the `asm!` macro. We specified three options here:
- `pure` means that the asm code has no observable side effects and that its output depends only on its inputs. This allows the compiler optimizer to call the inline asm fewer times or even eliminate it entirely.
- `nomem` means that the asm code does not read or write to memory. By default the compiler will assume that inline assembly can read or write any memory address that is accessible to it (e.g. through a pointer passed as an operand, or a global).
- `nostack` means that the asm code does not push any data onto the stack. This allows the compiler to use optimizations such as the stack red zone on x86-64 to avoid stack pointer adjustments.

These allow the compiler to better optimize code using `asm!`, for example by eliminating pure `asm!` blocks whose outputs are not needed.

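As one more sketch, assuming an x86-64 target, the `bswap` instruction is a natural fit for these options: it only transforms a register, touches neither memory nor the stack, and leaves the flags untouched. The last property is expressed here with `preserves_flags`, one of the additional options described in the reference. The function name `swap_bytes` is hypothetical.

```rust
use std::arch::asm;

// Hypothetical helper for illustration: reverses the byte order of a u64.
fn swap_bytes(value: u64) -> u64 {
    let mut v = value;
    unsafe {
        asm!(
            "bswap {0}",
            inout(reg) v,
            // The result depends only on the input (pure), no memory or
            // stack is accessed, and `bswap` does not modify the flags.
            options(pure, nomem, nostack, preserves_flags),
        );
    }
    v
}
```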
See the [reference](../../reference/inline-assembly.html) for the full list of available options and their effects.