# Code generation attributes

The following [attributes] are used for controlling code generation.

## Optimization hints

The `cold` and `inline` [attributes] suggest ways of generating code that may
be faster than what the compiler would produce without the hint. The attributes
are only hints, and may be ignored.

Both attributes can be used on [functions]. When applied to a function in a
[trait], they apply only to that function when used as a default function for
a trait implementation and not to all trait implementations. The attributes
have no effect on a trait function without a body.

### The `inline` attribute

The *`inline` [attribute]* suggests that a copy of the attributed function
should be placed in the caller, rather than generating code to call the
function where it is defined.

> ***Note***: The `rustc` compiler automatically inlines functions based on
> internal heuristics. Incorrectly inlining functions can make the program
> slower, so this attribute should be used with care.

There are three ways to use the inline attribute:

* `#[inline]` *suggests* performing an inline expansion.
* `#[inline(always)]` *suggests* that an inline expansion should always be
  performed.
* `#[inline(never)]` *suggests* that an inline expansion should never be
  performed.

> ***Note***: `#[inline]` in every form is a hint, with no *requirements*
> on the language to place a copy of the attributed function in the caller.

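As a minimal sketch of the three forms listed above, the functions below each carry one variant. The names `add`, `square`, and `describe` are hypothetical and chosen only for illustration. The attributes do not change what the functions return; whether any inlining takes place remains up to the compiler.

```rust
/// Small and frequently called: a plain hint is often enough.
#[inline]
fn add(a: u32, b: u32) -> u32 {
    a + b
}

/// A stronger hint, for cases where measurement suggests inlining helps.
#[inline(always)]
fn square(x: u32) -> u32 {
    x * x
}

/// A hint to keep this function out of line, for example to limit code size
/// in its callers.
#[inline(never)]
fn describe(x: u32) -> String {
    format!("value = {}", x)
}
```

Because all three forms are hints, the choice between them is best guided by profiling rather than by intuition.
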
### The `cold` attribute

The *`cold` [attribute]* suggests that the attributed function is unlikely to
be called.

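A typical candidate for `cold` is an error path. The sketch below assumes a hypothetical digit parser (`parse_digit`) whose failure branch calls a hypothetical helper (`invalid_digit`); neither name comes from a real library.

```rust
/// Hypothetical error path, marked `cold` because it is expected to run rarely.
#[cold]
fn invalid_digit(c: char) -> String {
    format!("invalid digit: {:?}", c)
}

/// Parses a single decimal digit; only the failure branch calls the cold function.
fn parse_digit(c: char) -> Result<u32, String> {
    match c.to_digit(10) {
        Some(d) => Ok(d),
        None => Err(invalid_digit(c)),
    }
}
```

The attribute is only a hint: the program behaves the same with or without it.
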
## The `no_builtins` attribute

The *`no_builtins` [attribute]* may be applied at the crate level to disable
optimizing certain code patterns to invocations of library functions that are
assumed to exist.

## The `target_feature` attribute

The *`target_feature` [attribute]* may be applied to a function to
enable code generation of that function for specific platform architecture
features. It uses the [_MetaListNameValueStr_] syntax with a single key of
`enable` whose value is a string of comma-separated feature names to enable.

```rust
# #[cfg(target_feature = "avx2")]
#[target_feature(enable = "avx2")]
unsafe fn foo_avx2() {}
```

Each [target architecture] has a set of features that may be enabled. It is an
error to specify a feature for a target architecture that the crate is not
being compiled for.

It is [undefined behavior] to call a function that is compiled with a feature
that is not supported on the platform the code is currently running on.

Functions marked with `target_feature` are not inlined into a context that
does not support the given features. The `#[inline(always)]` attribute may not
be used with a `target_feature` attribute.

### Available features

The following is a list of the available feature names.

#### `x86` or `x86_64`

This platform requires that `#[target_feature]` is only applied to [`unsafe`
functions][unsafe function].

Feature     | Implicitly Enables | Description
------------|--------------------|-------------------
`aes`       | `sse2`             | [AES] — Advanced Encryption Standard
`avx`       | `sse4.2`           | [AVX] — Advanced Vector Extensions
`avx2`      | `avx`              | [AVX2] — Advanced Vector Extensions 2
`bmi1`      |                    | [BMI1] — Bit Manipulation Instruction Sets
`bmi2`      |                    | [BMI2] — Bit Manipulation Instruction Sets 2
`fma`       | `avx`              | [FMA3] — Three-operand fused multiply-add
`fxsr`      |                    | [`fxsave`] and [`fxrstor`] — Save and restore x87 FPU, MMX Technology, and SSE State
`lzcnt`     |                    | [`lzcnt`] — Leading zeros count
`pclmulqdq` | `sse2`             | [`pclmulqdq`] — Packed carry-less multiplication quadword
`popcnt`    |                    | [`popcnt`] — Count of bits set to 1
`rdrand`    |                    | [`rdrand`] — Read random number
`rdseed`    |                    | [`rdseed`] — Read random seed
`sha`       | `sse2`             | [SHA] — Secure Hash Algorithm
`sse`       |                    | [SSE] — Streaming <abbr title="Single Instruction Multiple Data">SIMD</abbr> Extensions
`sse2`      | `sse`              | [SSE2] — Streaming SIMD Extensions 2
`sse3`      | `sse2`             | [SSE3] — Streaming SIMD Extensions 3
`sse4.1`    | `ssse3`            | [SSE4.1] — Streaming SIMD Extensions 4.1
`sse4.2`    | `sse4.1`           | [SSE4.2] — Streaming SIMD Extensions 4.2
`ssse3`     | `sse3`             | [SSSE3] — Supplemental Streaming SIMD Extensions 3
`xsave`     |                    | [`xsave`] — Save processor extended states
`xsavec`    |                    | [`xsavec`] — Save processor extended states with compaction
`xsaveopt`  |                    | [`xsaveopt`] — Save processor extended states optimized
`xsaves`    |                    | [`xsaves`] — Save processor extended states supervisor

<!-- Keep links near each table to make it easier to move and update. -->

[AES]: https://en.wikipedia.org/wiki/AES_instruction_set
[AVX]: https://en.wikipedia.org/wiki/Advanced_Vector_Extensions
[AVX2]: https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#AVX2
[BMI1]: https://en.wikipedia.org/wiki/Bit_Manipulation_Instruction_Sets
[BMI2]: https://en.wikipedia.org/wiki/Bit_Manipulation_Instruction_Sets#BMI2
[FMA3]: https://en.wikipedia.org/wiki/FMA_instruction_set
[`fxsave`]: https://www.felixcloutier.com/x86/fxsave
[`fxrstor`]: https://www.felixcloutier.com/x86/fxrstor
[`lzcnt`]: https://www.felixcloutier.com/x86/lzcnt
[`pclmulqdq`]: https://www.felixcloutier.com/x86/pclmulqdq
[`popcnt`]: https://www.felixcloutier.com/x86/popcnt
[`rdrand`]: https://en.wikipedia.org/wiki/RdRand
[`rdseed`]: https://en.wikipedia.org/wiki/RdRand
[SHA]: https://en.wikipedia.org/wiki/Intel_SHA_extensions
[SSE]: https://en.wikipedia.org/wiki/Streaming_SIMD_Extensions
[SSE2]: https://en.wikipedia.org/wiki/SSE2
[SSE3]: https://en.wikipedia.org/wiki/SSE3
[SSE4.1]: https://en.wikipedia.org/wiki/SSE4#SSE4.1
[SSE4.2]: https://en.wikipedia.org/wiki/SSE4#SSE4.2
[SSSE3]: https://en.wikipedia.org/wiki/SSSE3
[`xsave`]: https://www.felixcloutier.com/x86/xsave
[`xsavec`]: https://www.felixcloutier.com/x86/xsavec
[`xsaveopt`]: https://www.felixcloutier.com/x86/xsaveopt
[`xsaves`]: https://www.felixcloutier.com/x86/xsaves

#### `wasm32` or `wasm64`

This platform allows `#[target_feature]` to be applied to both safe and
[`unsafe` functions][unsafe function].

Feature     | Description
------------|-------------------
`simd128`   | [WebAssembly simd proposal][simd128]

[simd128]: https://github.com/webassembly/simd

### Additional information

See the [`target_feature` conditional compilation option] for selectively
enabling or disabling compilation of code based on compile-time settings. Note
that this option is not affected by the `target_feature` attribute, and is
only driven by the features enabled for the entire crate.

See the [`is_x86_feature_detected`] macro in the standard library for runtime
feature detection on the x86 platforms.

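Runtime detection is commonly paired with `target_feature` so that the feature-specific function is only reached on CPUs that support it. The following is a minimal sketch of that pattern, not a prescribed design: the function names `sum`, `sum_avx2`, and `sum_scalar` are hypothetical, and the loop bodies are deliberately identical so that only the code generation differs.

```rust
/// Portable fallback used when AVX2 is unavailable.
fn sum_scalar(data: &[u32]) -> u32 {
    let mut total = 0u32;
    for &x in data {
        total = total.wrapping_add(x);
    }
    total
}

/// Same computation, but the compiler may use AVX2 instructions here.
/// Calling this on a CPU without AVX2 is undefined behavior.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(data: &[u32]) -> u32 {
    let mut total = 0u32;
    for &x in data {
        total = total.wrapping_add(x);
    }
    total
}

/// Safe entry point: checks the CPU at run time before taking the AVX2 path.
fn sum(data: &[u32]) -> u32 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was just confirmed on the running CPU.
            return unsafe { sum_avx2(data) };
        }
    }
    sum_scalar(data)
}
```

Keeping the `unsafe` call behind a safe wrapper confines the obligation to check the CPU to a single place. The `cfg` on `target_arch` keeps the sketch compiling on other architectures, where only the scalar path exists.
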
> Note: `rustc` has a default set of features enabled for each target and CPU.
> The CPU may be chosen with the [`-C target-cpu`] flag. Individual features
> may be enabled or disabled for an entire crate with the
> [`-C target-feature`] flag.

## The `track_caller` attribute

The `track_caller` attribute may be applied to any function with [`"Rust"` ABI][rust-abi]
with the exception of the entry point `fn main`. When applied to functions and methods in
trait declarations, the attribute applies to all implementations. If the trait provides a
default implementation with the attribute, then the attribute also applies to override implementations.

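The rule for trait declarations can be illustrated with a small sketch. The trait `Report` and the type `Unit` are hypothetical names used only here; the implementation carries no attribute of its own and relies on the one in the trait declaration.

```rust
use std::panic::Location;

trait Report {
    /// The attribute is written once, on the declaration.
    #[track_caller]
    fn report_line(&self) -> u32;
}

struct Unit;

impl Report for Unit {
    // No attribute here: it applies because the trait declaration has it.
    fn report_line(&self) -> u32 {
        Location::caller().line()
    }
}
```

With a direct (statically dispatched) call such as `Unit.report_line()`, the returned line is expected to be that of the call expression rather than a line inside the `impl` block.
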
When applied to a function in an `extern` block the attribute must also be applied to any linked
implementations, otherwise undefined behavior results. When applied to a function which is made
available to an `extern` block, the declaration in the `extern` block must also have the attribute,
otherwise undefined behavior results.

### Behavior

Applying the attribute to a function `f` allows code within `f` to get a hint of the [`Location`] of
the "topmost" tracked call that led to `f`'s invocation. At the point of observation, an
implementation behaves as if it walks up the stack from `f`'s frame to find the nearest frame of an
*unattributed* function `outer`, and it returns the [`Location`] of the tracked call in `outer`.

```rust
#[track_caller]
fn f() {
    println!("{}", std::panic::Location::caller());
}
```

> Note: `core` provides [`core::panic::Location::caller`] for observing caller locations. It wraps
> the [`core::intrinsics::caller_location`] intrinsic implemented by `rustc`.

> Note: because the resulting `Location` is a hint, an implementation may halt its walk up the stack
> early. See [Limitations](#limitations) for important caveats.

#### Examples

When `f` is called directly by `calls_f`, code in `f` observes its callsite within `calls_f`:

```rust
# #[track_caller]
# fn f() {
#     println!("{}", std::panic::Location::caller());
# }
fn calls_f() {
    f(); // <-- f() prints this location
}
```

When `f` is called by another attributed function `g` which is in turn called by `calls_g`, code in
both `f` and `g` observes `g`'s callsite within `calls_g`:

```rust
# #[track_caller]
# fn f() {
#     println!("{}", std::panic::Location::caller());
# }
#[track_caller]
fn g() {
    println!("{}", std::panic::Location::caller());
    f();
}

fn calls_g() {
    g(); // <-- g() prints this location twice, once itself and once from f()
}
```

When `g` is called by another attributed function `h` which is in turn called by `calls_h`, all code
in `f`, `g`, and `h` observes `h`'s callsite within `calls_h`:

```rust
# #[track_caller]
# fn f() {
#     println!("{}", std::panic::Location::caller());
# }
# #[track_caller]
# fn g() {
#     println!("{}", std::panic::Location::caller());
#     f();
# }
#[track_caller]
fn h() {
    println!("{}", std::panic::Location::caller());
    g();
}

fn calls_h() {
    h(); // <-- prints this location three times, once itself, once from g(), once from f()
}
```

And so on.

### Limitations

This information is a hint and implementations are not required to preserve it.

In particular, coercing a function with `#[track_caller]` to a function pointer creates a shim which
appears to observers to have been called at the attributed function's definition site, losing actual
caller information across virtual calls. A common example of this coercion is the creation of a
trait object whose methods are attributed.

> Note: The aforementioned shim for function pointers is necessary because `rustc` implements
> `track_caller` in a codegen context by appending an implicit parameter to the function ABI, but
> this would be unsound for an indirect call because the parameter is not a part of the function's
> type and a given function pointer type may or may not refer to a function with the attribute. The
> creation of a shim hides the implicit parameter from callers of the function pointer, preserving
> soundness.

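The function pointer limitation can be observed with a small experiment. The sketch below uses hypothetical helper names (`caller_line`, `direct`, `through_pointer`) and compares the reported line with the line of the call expression. It reflects how `rustc` is described as behaving in this section; since the location is only a hint, other implementations may differ.

```rust
use std::panic::Location;

#[track_caller]
fn caller_line() -> u32 {
    Location::caller().line()
}

/// Calls the attributed function directly: the call site is observed.
/// Returns (reported line, line of the call expression).
fn direct() -> (u32, u32) {
    (caller_line(), line!())
}

/// Calls through a function pointer: the coercion goes through a shim, so the
/// reported line is not that of the call expression.
fn through_pointer() -> (u32, u32) {
    let f: fn() -> u32 = caller_line;
    (f(), line!())
}
```

In `direct` the two lines agree, while in `through_pointer` the reported line points at the definition of `caller_line` instead of the call. Code that depends on accurate locations should therefore avoid passing attributed functions around as function pointers or calling them through trait objects.
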
[_MetaListNameValueStr_]: ../attributes.md#meta-item-attribute-syntax
[`-C target-cpu`]: ../../rustc/codegen-options/index.html#target-cpu
[`-C target-feature`]: ../../rustc/codegen-options/index.html#target-feature
[`is_x86_feature_detected`]: ../../std/macro.is_x86_feature_detected.html
[`target_feature` conditional compilation option]: ../conditional-compilation.md#target_feature
[attribute]: ../attributes.md
[attributes]: ../attributes.md
[functions]: ../items/functions.md
[target architecture]: ../conditional-compilation.md#target_arch
[trait]: ../items/traits.md
[undefined behavior]: ../behavior-considered-undefined.md
[unsafe function]: ../unsafe-functions.md
[rust-abi]: ../items/external-blocks.md#abi
[`core::intrinsics::caller_location`]: ../../core/intrinsics/fn.caller_location.html
[`core::panic::Location::caller`]: ../../core/panic/struct.Location.html#method.caller
[`Location`]: ../../core/panic/struct.Location.html