Commit | Line | Data |
---|---|---|
1a4d82fc JJ |
1 | % The Rust Reference |
2 | ||
3 | # Introduction | |
4 | ||
5 | This document is the primary reference for the Rust programming language. It | |
6 | provides three kinds of material: | |
7 | ||
85aaf69f | 8 | - Chapters that informally describe each language construct and its use. |
1a4d82fc JJ |
9 | - Chapters that informally describe the memory model, concurrency model, |
10 | runtime services, linkage model and debugging facilities. | |
11 | - Appendix chapters providing rationale and references to languages that | |
12 | influenced the design. | |
13 | ||
14 | This document does not serve as an introduction to the language. Background | |
15 | familiarity with the language is assumed. A separate [book] is available to | |
16 | help acquire such background familiarity. | |
17 | ||
18 | This document also does not serve as a reference to the [standard] library | |
19 | included in the language distribution. Those libraries are documented | |
20 | separately by extracting documentation attributes from their source code. Many | |
21 | of the features that one might expect to be language features are library | |
22 | features in Rust, so what you're looking for may be there, not here. | |
23 | ||
85aaf69f SL |
24 | You may also be interested in the [grammar]. |
25 | ||
1a4d82fc JJ |
26 | [book]: book/index.html |
27 | [standard]: std/index.html | |
85aaf69f | 28 | [grammar]: grammar.html |
1a4d82fc JJ |
29 | |
30 | # Notation | |
31 | ||
1a4d82fc JJ |
32 | ## Unicode productions |
33 | ||
bd371182 AL |
34 | A few productions in Rust's grammar permit Unicode code points outside the |
35 | ASCII range. We define these productions in terms of character properties | |
36 | specified in the Unicode standard, rather than in terms of ASCII-range code | |
37 | points. The grammar has a [Special Unicode Productions][unicodeproductions] | |
38 | section that lists these productions. | |
39 | ||
40 | [unicodeproductions]: grammar.html#special-unicode-productions | |
1a4d82fc JJ |
41 | |
42 | ## String table productions | |
43 | ||
44 | Some rules in the grammar — notably [unary | |
45 | operators](#unary-operator-expressions), [binary | |
bd371182 | 46 | operators](#binary-operator-expressions), and [keywords][keywords] — are |
1a4d82fc JJ |
47 | given in a simplified form: as a listing of a table of unquoted, printable |
48 | whitespace-separated strings. These cases form a subset of the rules regarding | |
49 | the [token](#tokens) rule, and are assumed to be the result of a | |
50 | lexical-analysis phase feeding the parser, driven by a DFA, operating over the | |
51 | disjunction of all such string table entries. | |
52 | ||
bd371182 AL |
53 | [keywords]: grammar.html#keywords |
54 | ||
1a4d82fc JJ |
55 | When such a string, enclosed in double-quotes (`"`), occurs inside the grammar, |
56 | it is an implicit reference to a single member of such a string table | |
57 | production. See [tokens](#tokens) for more information. | |
58 | ||
59 | # Lexical structure | |
60 | ||
61 | ## Input format | |
62 | ||
bd371182 | 63 | Rust input is interpreted as a sequence of Unicode code points encoded in UTF-8. |
1a4d82fc | 64 | Most Rust grammar rules are defined in terms of printable ASCII-range |
bd371182 AL |
65 | code points, but a small number are defined in terms of Unicode properties or |
66 | explicit code point lists. [^inputformat] | |
1a4d82fc JJ |
67 | |
68 | [^inputformat]: Substitute definitions for the special Unicode productions are | |
69 | provided to the grammar verifier, restricted to ASCII range, when verifying the | |
70 | grammar in this document. | |
71 | ||
bd371182 | 72 | ## Identifiers |
1a4d82fc | 73 | |
bd371182 | 74 | An identifier is any nonempty Unicode[^non_ascii_idents] string of the following form: |
1a4d82fc | 75 | |
bd371182 AL |
76 | [^non_ascii_idents]: Non-ASCII characters in identifiers are currently feature |
77 | gated. This is expected to improve soon. | |
1a4d82fc | 78 | |
b039eaaf SL |
79 | Either |
80 | ||
81 | * The first character has property `XID_start` | |
82 | * The remaining characters have property `XID_continue` | |
83 | ||
84 | Or | |
85 | ||
86 | * The first character is `_` | |
87 | * The identifier is more than one character; `_` alone is not an identifier | |
88 | * The remaining characters have property `XID_continue` | |
1a4d82fc | 89 | |
bd371182 | 90 | that does _not_ occur in the set of [keywords][keywords]. |
1a4d82fc JJ |
91 | |
92 | > **Note**: `XID_start` and `XID_continue` as character properties cover the | |
93 | > character ranges used to form the more familiar C and Java language-family | |
94 | > identifiers. | |
95 | ||
1a4d82fc JJ |
96 | ## Comments |
97 | ||
bd371182 AL |
98 | Comments in Rust code follow the general C++ style of line (`//`) and |
99 | block (`/* ... */`) comment forms. Nested block comments are supported. | |
1a4d82fc JJ |
100 | |
101 | Line comments beginning with exactly _three_ slashes (`///`), and block | |
92a42be0 | 102 | comments (`/** ... */`), are interpreted as a special syntax for `doc` |
1a4d82fc | 103 | [attributes](#attributes). That is, they are equivalent to writing |
bd371182 AL |
104 | `#[doc="..."]` around the body of the comment; for example, `/// Foo` turns into |
105 | `#[doc="Foo"]`. | |
1a4d82fc | 106 | |
7453a54e | 107 | Line comments beginning with `//!` and block comments `/*! ... */` are |
bd371182 AL |
108 | doc comments that apply to the parent of the comment, rather than the item |
109 | that follows. That is, they are equivalent to writing `#![doc="..."]` around | |
110 | the body of the comment. `//!` comments are usually used to document | |
111 | modules that occupy a source file. | |
1a4d82fc JJ |
112 | |
113 | Non-doc comments are interpreted as a form of whitespace. | |
114 | ||
115 | ## Whitespace | |
116 | ||
bd371182 | 117 | Whitespace is any non-empty string containing only the following characters: |
1a4d82fc | 118 | |
bd371182 AL |
119 | - `U+0020` (space, `' '`) |
120 | - `U+0009` (tab, `'\t'`) | |
121 | - `U+000A` (LF, `'\n'`) | |
122 | - `U+000D` (CR, `'\r'`) | |
1a4d82fc JJ |
123 | |
124 | Rust is a "free-form" language, meaning that all forms of whitespace serve only | |
125 | to separate _tokens_ in the grammar, and have no semantic significance. | |
126 | ||
127 | A Rust program has identical meaning if each whitespace element is replaced | |
128 | with any other legal whitespace element, such as a single space character. | |
129 | ||
130 | ## Tokens | |
131 | ||
1a4d82fc JJ |
132 | Tokens are primitive productions in the grammar defined by regular |
133 | (non-recursive) languages. "Simple" tokens are given in [string table | |
134 | production](#string-table-productions) form, and occur in the rest of the | |
135 | grammar as double-quoted strings. Other tokens have exact rules given. | |
136 | ||
1a4d82fc JJ |
137 | ### Literals |
138 | ||
139 | A literal is an expression consisting of a single token, rather than a sequence | |
140 | of tokens, that immediately and directly denotes the value it evaluates to, | |
141 | rather than referring to it by name or some other evaluation rule. A literal is | |
142 | a form of constant expression, so is evaluated (primarily) at compile time. | |
143 | ||
1a4d82fc JJ |
144 | #### Examples |
145 | ||
146 | ##### Characters and strings | |
147 | ||
c34b1796 AL |
148 | | | Example | `#` sets | Characters | Escapes | |
149 | |----------------------------------------------|-----------------|------------|-------------|---------------------| | |
92a42be0 SL |
150 | | [Character](#character-literals) | `'H'` | `N/A` | All Unicode | [Quote](#quote-escapes) & [Byte](#byte-escapes) & [Unicode](#unicode-escapes) | |
151 | | [String](#string-literals) | `"hello"` | `N/A` | All Unicode | [Quote](#quote-escapes) & [Byte](#byte-escapes) & [Unicode](#unicode-escapes) | | |
c34b1796 | 152 | | [Raw](#raw-string-literals) | `r#"hello"#` | `0...` | All Unicode | `N/A` | |
92a42be0 SL |
153 | | [Byte](#byte-literals) | `b'H'` | `N/A` | All ASCII | [Quote](#quote-escapes) & [Byte](#byte-escapes) | |
154 | | [Byte string](#byte-string-literals) | `b"hello"` | `N/A` | All ASCII | [Quote](#quote-escapes) & [Byte](#byte-escapes) | | |
c34b1796 | 155 | | [Raw byte string](#raw-byte-string-literals) | `br#"hello"#` | `0...` | All ASCII | `N/A` | |
1a4d82fc JJ |
156 | |
157 | ##### Byte escapes | |
158 | ||
159 | | | Name | | |
160 | |---|------| | |
161 | | `\x7F` | 8-bit character code (exactly 2 digits) | | |
162 | | `\n` | Newline | | |
163 | | `\r` | Carriage return | | |
164 | | `\t` | Tab | | |
165 | | `\\` | Backslash | | |
92a42be0 | 166 | | `\0` | Null | |
1a4d82fc JJ |
167 | |
168 | ##### Unicode escapes | |
169 | | | Name | | |
170 | |---|------| | |
85aaf69f | 171 | | `\u{7FFF}` | 24-bit Unicode character code (up to 6 digits) | |
1a4d82fc | 172 | |
92a42be0 SL |
173 | ##### Quote escapes |
174 | | | Name | | |
175 | |---|------| | |
176 | | `\'` | Single quote | | |
177 | | `\"` | Double quote | | |
178 | ||
1a4d82fc JJ |
179 | ##### Numbers |
180 | ||
181 | | [Number literals](#number-literals)`*` | Example | Exponentiation | Suffixes | | |
182 | |----------------------------------------|---------|----------------|----------| | |
85aaf69f SL |
183 | | Decimal integer | `98_222` | `N/A` | Integer suffixes | |
184 | | Hex integer | `0xff` | `N/A` | Integer suffixes | | |
185 | | Octal integer | `0o77` | `N/A` | Integer suffixes | | |
186 | | Binary integer | `0b1111_0000` | `N/A` | Integer suffixes | | |
187 | | Floating-point | `123.0E+77` | `Optional` | Floating-point suffixes | | |
1a4d82fc JJ |
188 | |
189 | `*` All number literals allow `_` as a visual separator: `1_234.0E+18f64` | |
190 | ||
191 | ##### Suffixes | |
192 | | Integer | Floating-point | | |
193 | |---------|----------------| | |
bd371182 | 194 | | `u8`, `i8`, `u16`, `i16`, `u32`, `i32`, `u64`, `i64`, `isize`, `usize` | `f32`, `f64` | |
1a4d82fc JJ |
195 | |
196 | #### Character and string literals | |
197 | ||
1a4d82fc JJ |
198 | ##### Character literals |
199 | ||
200 | A _character literal_ is a single Unicode character enclosed within two | |
201 | `U+0027` (single-quote) characters, with the exception of `U+0027` itself, | |
c34b1796 | 202 | which must be _escaped_ by a preceding `U+005C` character (`\`). |
1a4d82fc JJ |
203 | |
204 | ##### String literals | |
205 | ||
206 | A _string literal_ is a sequence of any Unicode characters enclosed within two | |
207 | `U+0022` (double-quote) characters, with the exception of `U+0022` itself, | |
bd371182 | 208 | which must be _escaped_ by a preceding `U+005C` character (`\`). |
1a4d82fc | 209 | |
bd371182 | 210 | Line-break characters are allowed in string literals. Normally they represent |
9cc50fc6 SL |
211 | themselves (i.e. no translation), but as a special exception, when an unescaped |
212 | `U+005C` character (`\`) occurs immediately before the newline (`U+000A`), the | |
213 | `U+005C` character, the newline, and all whitespace at the beginning of the | |
214 | next line are ignored. Thus `a` and `b` are equal: | |
c34b1796 AL |
215 | |
216 | ```rust | |
217 | let a = "foobar"; | |
218 | let b = "foo\ | |
219 | bar"; | |
220 | ||
221 | assert_eq!(a,b); | |
222 | ``` | |
223 | ||
1a4d82fc JJ |
224 | ##### Character escapes |
225 | ||
226 | Some additional _escapes_ are available in either character or non-raw string | |
227 | literals. An escape starts with a `U+005C` (`\`) and continues with one of the | |
228 | following forms: | |
229 | ||
bd371182 AL |
230 | * An _8-bit code point escape_ starts with `U+0078` (`x`) and is |
231 | followed by exactly two _hex digits_. It denotes the Unicode code point | |
1a4d82fc | 232 | equal to the provided hex value, which must not exceed `0x7F`. |
bd371182 | 233 | * A _24-bit code point escape_ starts with `U+0075` (`u`) and is followed |
85aaf69f | 234 | by up to six _hex digits_ surrounded by braces `U+007B` (`{`) and `U+007D` |
bd371182 | 235 | (`}`). It denotes the Unicode code point equal to the provided hex value. |
1a4d82fc | 236 | * A _whitespace escape_ is one of the characters `U+006E` (`n`), `U+0072` |
bd371182 | 237 | (`r`), or `U+0074` (`t`), denoting the Unicode values `U+000A` (LF), |
1a4d82fc | 238 | `U+000D` (CR) or `U+0009` (HT) respectively. |
7453a54e SL |
239 | * The _null escape_ is the character `U+0030` (`0`) and denotes the Unicode |
240 | value `U+0000` (NUL). | |
1a4d82fc JJ |
241 | * The _backslash escape_ is the character `U+005C` (`\`) which must be |
242 | escaped in order to denote *itself*. | |
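
The escape forms listed above can be checked against the values they denote. The following is a minimal sketch rather than part of the grammar; the function name `escapes_match` is hypothetical and exists only for illustration.

```rust
// Hypothetical helper: compares each escape form with the value it denotes.
fn escapes_match() -> bool {
    '\x41' == 'A'                          // 8-bit escape: exactly two hex digits
        && '\u{41}' == 'A'                 // 24-bit escape: up to six hex digits in braces
        && '\u{1F600}' as u32 == 0x1F600   // a code point outside the ASCII range
        && '\n' as u32 == 0x0A             // whitespace escapes: LF, CR, HT
        && '\r' as u32 == 0x0D
        && '\t' as u32 == 0x09
        && '\0' as u32 == 0x00             // null escape
        && "\\".len() == 1                 // backslash escape denotes a single `\`
}

fn main() {
    assert!(escapes_match());
}
```
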
243 | ||
244 | ##### Raw string literals | |
245 | ||
246 | Raw string literals do not process any escapes. They start with the character | |
247 | `U+0072` (`r`), followed by zero or more of the character `U+0023` (`#`) and a | |
bd371182 AL |
248 | `U+0022` (double-quote) character. The _raw string body_ can contain any sequence |
249 | of Unicode characters and is terminated only by another `U+0022` (double-quote) | |
250 | character, followed by the same number of `U+0023` (`#`) characters that preceded | |
251 | the opening `U+0022` (double-quote) character. | |
1a4d82fc JJ |
252 | |
253 | All Unicode characters contained in the raw string body represent themselves; | |
254 | the characters `U+0022` (double-quote) (except when followed by at least as | |
255 | many `U+0023` (`#`) characters as were used to start the raw string literal) or | |
256 | `U+005C` (`\`) do not have any special meaning. | |
257 | ||
258 | Examples for string literals: | |
259 | ||
260 | ``` | |
261 | "foo"; r"foo"; // foo | |
262 | "\"foo\""; r#""foo""#; // "foo" | |
263 | ||
264 | "foo #\"# bar"; | |
265 | r##"foo #"# bar"##; // foo #"# bar | |
266 | ||
267 | "\x52"; "R"; r"R"; // R | |
268 | "\\x52"; r"\x52"; // \x52 | |
269 | ``` | |
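
The same pairs can be written as assertions. This is an illustrative sketch; the function name `raw_equivalents` is hypothetical. Each tuple pairs an escaped string literal with a raw string literal that should denote the same text.

```rust
// Hypothetical helper: (escaped literal, raw literal) pairs that denote the same text.
fn raw_equivalents() -> Vec<(&'static str, &'static str)> {
    vec![
        ("foo", r"foo"),
        ("\"foo\"", r#""foo""#),
        ("foo #\"# bar", r##"foo #"# bar"##),
        ("\x52", r"R"),
        ("\\x52", r"\x52"),
    ]
}

fn main() {
    for (escaped, raw) in raw_equivalents() {
        assert_eq!(escaped, raw);
    }
    // No escape processing in a raw string: `\x52` stays four characters long.
    assert_eq!(r"\x52".len(), 4);
}
```
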
270 | ||
271 | #### Byte and byte string literals | |
272 | ||
1a4d82fc JJ |
273 | ##### Byte literals |
274 | ||
275 | A _byte literal_ is a single ASCII character (in the `U+0000` to `U+007F` | |
bd371182 AL |
276 | range) or a single _escape_ preceded by the characters `U+0062` (`b`) and |
277 | `U+0027` (single-quote), and followed by the character `U+0027`. If the character | |
278 | `U+0027` is present within the literal, it must be _escaped_ by a preceding | |
279 | `U+005C` (`\`) character. It is equivalent to a `u8` unsigned 8-bit integer | |
280 | _number literal_. | |
1a4d82fc JJ |
281 | |
282 | ##### Byte string literals | |
283 | ||
85aaf69f SL |
284 | A non-raw _byte string literal_ is a sequence of ASCII characters and _escapes_, |
285 | preceded by the characters `U+0062` (`b`) and `U+0022` (double-quote), and | |
286 | followed by the character `U+0022`. If the character `U+0022` is present within | |
287 | the literal, it must be _escaped_ by a preceding `U+005C` (`\`) character. | |
288 | Alternatively, a byte string literal can be a _raw byte string literal_, defined | |
bd371182 | 289 | below. A byte string literal of length `n` is equivalent to a `&'static [u8; n]` borrowed fixed-sized array |
85aaf69f | 290 | of unsigned 8-bit integers. |
1a4d82fc JJ |
291 | |
292 | Some additional _escapes_ are available in either byte or non-raw byte string | |
293 | literals. An escape starts with a `U+005C` (`\`) and continues with one of the | |
294 | following forms: | |
295 | ||
bd371182 | 296 | * A _byte escape_ starts with `U+0078` (`x`) and is |
1a4d82fc JJ |
297 | followed by exactly two _hex digits_. It denotes the byte |
298 | equal to the provided hex value. | |
299 | * A _whitespace escape_ is one of the characters `U+006E` (`n`), `U+0072` | |
300 | (`r`), or `U+0074` (`t`), denoting the byte values `0x0A` (ASCII LF), | |
301 | `0x0D` (ASCII CR) or `0x09` (ASCII HT) respectively. | |
7453a54e SL |
302 | * The _null escape_ is the character `U+0030` (`0`) and denotes the byte |
303 | value `0x00` (ASCII NUL). | |
1a4d82fc JJ |
304 | * The _backslash escape_ is the character `U+005C` (`\`) which must be |
305 | escaped in order to denote its ASCII encoding `0x5C`. | |
306 | ||
307 | ##### Raw byte string literals | |
308 | ||
309 | Raw byte string literals do not process any escapes. They start with the | |
310 | character `U+0062` (`b`), followed by `U+0072` (`r`), followed by zero or more | |
311 | of the character `U+0023` (`#`), and a `U+0022` (double-quote) character. The | |
bd371182 AL |
312 | _raw string body_ can contain any sequence of ASCII characters and is terminated |
313 | only by another `U+0022` (double-quote) character, followed by the same number of | |
314 | `U+0023` (`#`) characters that preceded the opening `U+0022` (double-quote) | |
315 | character. A raw byte string literal cannot contain any non-ASCII byte. | |
1a4d82fc JJ |
316 | |
317 | All characters contained in the raw string body represent their ASCII encoding; | |
318 | the characters `U+0022` (double-quote) (except when followed by at least as | |
319 | many `U+0023` (`#`) characters as were used to start the raw string literal) or | |
320 | `U+005C` (`\`) do not have any special meaning. | |
321 | ||
322 | Examples for byte string literals: | |
323 | ||
324 | ``` | |
325 | b"foo"; br"foo"; // foo | |
326 | b"\"foo\""; br#""foo""#; // "foo" | |
327 | ||
328 | b"foo #\"# bar"; | |
329 | br##"foo #"# bar"##; // foo #"# bar | |
330 | ||
331 | b"\x52"; b"R"; br"R"; // R | |
332 | b"\\x52"; br"\x52"; // \x52 | |
333 | ``` | |
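
The types and lengths described in this section can be observed directly. This is a small sketch under the rules stated above; the function name `byte_literal_facts` is hypothetical.

```rust
// Hypothetical helper: returns a byte literal, a byte string's contents,
// and the lengths of an escaped and a raw byte string.
fn byte_literal_facts() -> (u8, [u8; 3], usize, usize) {
    let byte: u8 = b'H';                  // a byte literal is a `u8`
    let bytes: &'static [u8; 3] = b"foo"; // a byte string of length 3
    let escaped = b"\x52";                // one byte: 0x52 (ASCII `R`)
    let raw = br"\x52";                   // four bytes: `\`, `x`, `5`, `2`
    (byte, *bytes, escaped.len(), raw.len())
}

fn main() {
    assert_eq!(byte_literal_facts(), (72, [102, 111, 111], 1, 4));
    assert_eq!(b"\x52", b"R");
}
```
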
334 | ||
335 | #### Number literals | |
336 | ||
1a4d82fc JJ |
337 | A _number literal_ is either an _integer literal_ or a _floating-point |
338 | literal_. The grammar for recognizing the two kinds of literals is mixed. | |
339 | ||
340 | ##### Integer literals | |
341 | ||
342 | An _integer literal_ has one of four forms: | |
343 | ||
344 | * A _decimal literal_ starts with a *decimal digit* and continues with any | |
345 | mixture of *decimal digits* and _underscores_. | |
346 | * A _hex literal_ starts with the character sequence `U+0030` `U+0078` | |
347 | (`0x`) and continues as any mixture of hex digits and underscores. | |
348 | * An _octal literal_ starts with the character sequence `U+0030` `U+006F` | |
349 | (`0o`) and continues as any mixture of octal digits and underscores. | |
350 | * A _binary literal_ starts with the character sequence `U+0030` `U+0062` | |
351 | (`0b`) and continues as any mixture of binary digits and underscores. | |
352 | ||
353 | Like any literal, an integer literal may be followed (immediately, | |
354 | without any spaces) by an _integer suffix_, which forcibly sets the | |
85aaf69f SL |
355 | type of the literal. The integer suffix must be the name of one of the |
356 | integral types: `u8`, `i8`, `u16`, `i16`, `u32`, `i32`, `u64`, `i64`, | |
357 | `isize`, or `usize`. | |
1a4d82fc | 358 | |
c1a9b12d SL |
359 | The type of an _unsuffixed_ integer literal is determined by type inference: |
360 | ||
361 | * If an integer type can be _uniquely_ determined from the surrounding | |
362 | program context, the unsuffixed integer literal has that type. | |
363 | ||
364 | * If the program context under-constrains the type, it defaults to the | |
365 | signed 32-bit integer `i32`. | |
366 | ||
367 | * If the program context over-constrains the type, it is considered a | |
368 | static type error. | |
1a4d82fc JJ |
369 | |
370 | Examples of integer literals of various forms: | |
371 | ||
372 | ``` | |
85aaf69f SL |
373 | 123i32; // type i32 |
374 | 123u32; // type u32 | |
375 | 123_u32; // type u32 | |
1a4d82fc JJ |
376 | 0xff_u8; // type u8 |
377 | 0o70_i16; // type i16 | |
378 | 0b1111_1111_1001_0000_i32; // type i32 | |
85aaf69f | 379 | 0usize; // type usize |
1a4d82fc JJ |
380 | ``` |
381 | ||
54a0048b SL |
382 | Note that the Rust syntax considers `-1i8` as an application of the [unary minus |
383 | operator](#unary-operator-expressions) to an integer literal `1i8`, rather than | |
384 | a single integer literal. | |
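
The rules for integer literals can be illustrated with a short sketch. The function names `integer_literal_facts` and `negated_abs` are hypothetical. The second function relies on the fact that a method call binds more tightly than unary minus, so `-1i8.abs()` negates the result of `1i8.abs()`.

```rust
use std::mem::size_of_val;

// Hypothetical helper: literal values in each base, plus the sizes (in bytes)
// of an unconstrained and a constrained literal.
fn integer_literal_facts() -> (u32, u32, u32, u32, usize, usize) {
    let unconstrained = 7;    // context under-constrains the type: defaults to `i32`
    let constrained: u64 = 7; // context determines the type uniquely
    (
        98_222,               // underscores are visual separators only
        0xff,
        0o77,
        0b1111_0000,
        size_of_val(&unconstrained),
        size_of_val(&constrained),
    )
}

// Hypothetical helper: `-1i8` is unary minus applied to the literal `1i8`,
// so this parses as `-(1i8.abs())`.
fn negated_abs() -> i8 {
    -1i8.abs()
}

fn main() {
    assert_eq!(integer_literal_facts(), (98222, 255, 63, 240, 4, 8));
    assert_eq!(negated_abs(), -1);
}
```
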
385 | ||
1a4d82fc JJ |
386 | ##### Floating-point literals |
387 | ||
388 | A _floating-point literal_ has one of two forms: | |
389 | ||
390 | * A _decimal literal_ followed by a period character `U+002E` (`.`). This is | |
391 | optionally followed by another decimal literal, with an optional _exponent_. | |
392 | * A single _decimal literal_ followed by an _exponent_. | |
393 | ||
bd371182 AL |
394 | Like integer literals, a floating-point literal may be followed by a |
395 | suffix, so long as the pre-suffix part does not end with `U+002E` (`.`). | |
396 | The suffix forcibly sets the type of the literal. There are two valid | |
1a4d82fc JJ |
397 | _floating-point suffixes_, `f32` and `f64` (the 32-bit and 64-bit floating point |
398 | types), which explicitly determine the type of the literal. | |
399 | ||
c1a9b12d SL |
400 | The type of an _unsuffixed_ floating-point literal is determined by |
401 | type inference: | |
402 | ||
403 | * If a floating-point type can be _uniquely_ determined from the | |
404 | surrounding program context, the unsuffixed floating-point literal | |
405 | has that type. | |
406 | ||
407 | * If the program context under-constrains the type, it defaults to `f64`. | |
408 | ||
409 | * If the program context over-constrains the type, it is considered a | |
410 | static type error. | |
bd371182 | 411 | |
1a4d82fc JJ |
412 | Examples of floating-point literals of various forms: |
413 | ||
414 | ``` | |
415 | 123.0f64; // type f64 | |
416 | 0.1f64; // type f64 | |
417 | 0.1f32; // type f32 | |
418 | 12E+99_f64; // type f64 | |
419 | let x: f64 = 2.; // type f64 | |
420 | ``` | |
421 | ||
422 | This last example is different because it is not possible to use the suffix | |
423 | syntax with a floating point literal ending in a period. `2.f64` would attempt | |
424 | to call a method named `f64` on `2`. | |
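
A brief sketch of these rules follows; the function name `float_literal_facts` is hypothetical. It shows a literal ending in a period, a suffixed literal, and an unsuffixed literal whose type falls back to `f64`.

```rust
use std::mem::size_of_val;

// Hypothetical helper: returns two literal values and the size (in bytes)
// of an unsuffixed, unconstrained floating-point literal.
fn float_literal_facts() -> (f64, f32, usize) {
    let x: f64 = 2.;   // ends in a period, so no suffix can follow
    let y = 0.5f32;    // the suffix sets the type
    let z = 1.5;       // under-constrained: defaults to `f64`
    (x, y, size_of_val(&z))
}

fn main() {
    assert_eq!(float_literal_facts(), (2.0, 0.5, 8));
    // Underscores and an exponent may be combined with a suffix.
    assert_eq!(1_234.0E+18f64, 1234.0e18);
}
```
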
425 | ||
c34b1796 AL |
426 | The representation semantics of floating-point numbers are described in |
427 | ["Machine Types"](#machine-types). | |
428 | ||
1a4d82fc JJ |
429 | #### Boolean literals |
430 | ||
431 | The two values of the boolean type are written `true` and `false`. | |
432 | ||
433 | ### Symbols | |
434 | ||
b039eaaf SL |
435 | Symbols are a general class of printable [tokens](#tokens) that play structural |
436 | roles in a variety of grammar productions. They are the | |
437 | remaining miscellaneous printable tokens that do not | |
1a4d82fc | 438 | otherwise appear as [unary operators](#unary-operator-expressions), [binary |
bd371182 | 439 | operators](#binary-operator-expressions), or [keywords][keywords]. |
b039eaaf SL |
440 | They are catalogued in [the Symbols section][symbols] of the Grammar document. |
441 | ||
442 | [symbols]: grammar.html#symbols | |
1a4d82fc JJ |
443 | |
444 | ||
445 | ## Paths | |
446 | ||
1a4d82fc JJ |
447 | A _path_ is a sequence of one or more path components _logically_ separated by |
448 | a namespace qualifier (`::`). If a path consists of only one component, it may | |
bd371182 | 449 | refer to either an [item](#items) or a [variable](#variables) in a local control |
1a4d82fc JJ |
450 | scope. If a path has multiple components, it refers to an item. |
451 | ||
452 | Every item has a _canonical path_ within its crate, but the path naming an item | |
453 | is only meaningful within a given crate. There is no global namespace across | |
454 | crates; an item's canonical path merely identifies it within the crate. | |
455 | ||
456 | Two examples of simple paths consisting of only identifier components: | |
457 | ||
458 | ```{.ignore} | |
459 | x; | |
460 | x::y::z; | |
461 | ``` | |
462 | ||
bd371182 AL |
463 | Path components are usually [identifiers](#identifiers), but they may |
464 | also include angle-bracket-enclosed lists of type arguments. In | |
465 | [expression](#expressions) context, the type argument list is given | |
466 | after a `::` namespace qualifier in order to disambiguate it from a | |
467 | relational expression involving the less-than symbol (`<`). In type | |
468 | expression context, the final namespace qualifier is omitted. | |
1a4d82fc JJ |
469 | |
470 | Two examples of paths with type arguments: | |
471 | ||
472 | ``` | |
85aaf69f | 473 | # struct HashMap<K, V>(K,V); |
1a4d82fc JJ |
474 | # fn f() { |
475 | # fn id<T>(t: T) -> T { t } | |
85aaf69f SL |
476 | type T = HashMap<i32,String>; // Type arguments used in a type expression |
477 | let x = id::<i32>(10); // Type arguments used in a call expression | |
1a4d82fc JJ |
478 | # } |
479 | ``` | |
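
A runnable variant of the example above, using the standard library's `HashMap`, might look like the following sketch. The names `id`, `Counts`, and `type_argument_demo` are hypothetical and chosen only for illustration.

```rust
use std::collections::HashMap;

// Hypothetical generic function used to show type arguments in a call expression.
fn id<T>(t: T) -> T {
    t
}

fn type_argument_demo() -> (i32, usize, i32) {
    // Type context: the argument list follows the name directly, with no `::`.
    type Counts = HashMap<i32, String>;
    let mut counts = Counts::new();
    counts.insert(1, String::from("one"));

    // Expression context: `::` precedes `<` so it is not read as less-than.
    let x = id::<i32>(10);
    let parsed = "42".parse::<i32>().unwrap();
    (x, counts.len(), parsed)
}

fn main() {
    assert_eq!(type_argument_demo(), (10, 1, 42));
}
```
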
480 | ||
481 | Paths can be written with various leading qualifiers that change how they | |
482 | are resolved: | |
483 | ||
484 | * Paths starting with `::` are considered to be global paths where the | |
485 | components of the path start being resolved from the crate root. Each | |
486 | identifier in the path must resolve to an item. | |
487 | ||
488 | ```rust | |
489 | mod a { | |
490 | pub fn foo() {} | |
491 | } | |
492 | mod b { | |
493 | pub fn foo() { | |
494 | ::a::foo(); // call a's foo function | |
495 | } | |
496 | } | |
497 | # fn main() {} | |
498 | ``` | |
499 | ||
500 | * Paths starting with the keyword `super` begin resolution relative to the | |
85aaf69f | 501 | parent module. Each further identifier must resolve to an item. |
1a4d82fc JJ |
502 | |
503 | ```rust | |
504 | mod a { | |
505 | pub fn foo() {} | |
506 | } | |
507 | mod b { | |
508 | pub fn foo() { | |
509 | super::a::foo(); // call a's foo function | |
510 | } | |
511 | } | |
512 | # fn main() {} | |
513 | ``` | |
514 | ||
515 | * Paths starting with the keyword `self` begin resolution relative to the | |
516 | current module. Each further identifier must resolve to an item. | |
517 | ||
518 | ```rust | |
519 | fn foo() {} | |
520 | fn bar() { | |
521 | self::foo(); | |
522 | } | |
523 | # fn main() {} | |
524 | ``` | |
525 | ||
92a42be0 SL |
526 | Additionally, the keyword `super` may be repeated several times after the first |
527 | `super` or `self` to refer to ancestor modules. | |
528 | ||
529 | ```rust | |
530 | mod a { | |
531 | fn foo() {} | |
532 | ||
533 | mod b { | |
534 | mod c { | |
535 | fn foo() { | |
536 | super::super::foo(); // call a's foo function | |
537 | self::super::super::foo(); // call a's foo function | |
538 | } | |
539 | } | |
540 | } | |
541 | } | |
542 | # fn main() {} | |
543 | ``` | |
544 | ||
1a4d82fc JJ |
545 | # Syntax extensions |
546 | ||
547 | A number of minor features of Rust are not central enough to have their own | |
548 | syntax, and yet are not implementable as functions. Instead, they are given | |
c34b1796 | 549 | names, and invoked through a consistent syntax: `some_extension!(...)`. |
1a4d82fc JJ |
550 | |
551 | Users of `rustc` can define new syntax extensions in two ways: | |
552 | ||
bd371182 AL |
553 | * [Compiler plugins][plugin] can include arbitrary Rust code that |
554 | manipulates syntax trees at compile time. Note that the interface | |
555 | for compiler plugins is considered highly unstable. | |
1a4d82fc JJ |
556 | |
557 | * [Macros](book/macros.html) define new syntax in a higher-level, | |
558 | declarative way. | |
559 | ||
560 | ## Macros | |
561 | ||
1a4d82fc JJ |
562 | `macro_rules` allows users to define syntax extensions in a declarative way. We |
563 | call such extensions "macros by example" or simply "macros" — to be distinguished | |
564 | from the "procedural macros" defined in [compiler plugins][plugin]. | |
565 | ||
566 | Currently, macros can expand to expressions, statements, items, or patterns. | |
567 | ||
568 | (A `sep_token` is any token other than `*` and `+`. A `non_special_token` is | |
569 | any token other than a delimiter or `$`.) | |
570 | ||
571 | The macro expander looks up macro invocations by name, and tries each macro | |
572 | rule in turn. It transcribes the first successful match. Matching and | |
573 | transcription are closely related to each other, and we will describe them | |
574 | together. | |
575 | ||
576 | ### Macro By Example | |
577 | ||
578 | The macro expander matches and transcribes every token that does not begin with | |
579 | a `$` literally, including delimiters. For parsing reasons, delimiters must be | |
580 | balanced, but they are otherwise not special. | |
581 | ||
582 | In the matcher, `$` _name_ `:` _designator_ matches the nonterminal in the Rust | |
92a42be0 SL |
583 | syntax named by _designator_. Valid designators are: |
584 | ||
585 | * `item`: an [item](#items) | |
586 | * `block`: a [block](#block-expressions) | |
587 | * `stmt`: a [statement](#statements) | |
588 | * `pat`: a [pattern](#match-expressions) | |
589 | * `expr`: an [expression](#expressions) | |
590 | * `ty`: a [type](#types) | |
591 | * `ident`: an [identifier](#identifiers) | |
592 | * `path`: a [path](#paths) | |
593 | * `tt`: either side of the `=>` in macro rules | |
594 | * `meta`: the contents of an [attribute](#attributes) | |
595 | ||
596 | In the transcriber, the | |
e9174d1e SL |
597 | designator is already known, and so only the name of a matched nonterminal comes |
598 | after the dollar sign. | |
1a4d82fc JJ |
599 | |
600 | In both the matcher and transcriber, the Kleene star-like operator indicates | |
bd371182 | 601 | repetition. The Kleene star operator consists of `$` and parentheses, optionally |
1a4d82fc | 602 | followed by a separator token, followed by `*` or `+`. `*` means zero or more |
bd371182 | 603 | repetitions, `+` means at least one repetition. The parentheses are not matched or |
1a4d82fc JJ |
604 | transcribed. On the matcher side, a name is bound to _all_ of the names it |
605 | matches, in a structure that mimics the structure of the repetition encountered | |
606 | on a successful match. The job of the transcriber is to sort that structure | |
607 | out. | |
608 | ||
609 | The rules for transcription of these repetitions are called "Macro By Example". | |
610 | Essentially, one "layer" of repetition is discharged at a time, and all of them | |
611 | must be discharged by the time a name is transcribed. Therefore, `( $( $i:ident | |
612 | ),* ) => ( $i )` is an invalid macro, but `( $( $i:ident ),* ) => ( $( $i | |
613 | ),* )` is acceptable (if trivial). | |
614 | ||
615 | When Macro By Example encounters a repetition, it examines all of the `$` | |
616 | _name_ s that occur in its body. At the "current layer", they all must repeat | |
617 | the same number of times, so ` ( $( $i:ident ),* ; $( $j:ident ),* ) => ( $( | |
618 | ($i,$j) ),* )` is valid if given the argument `(a,b,c ; d,e,f)`, but not | |
619 | `(a,b,c ; d,e)`. The repetition walks through the choices at that layer in | |
bd371182 | 620 | lockstep, so the former input transcribes to `(a,d), (b,e), (c,f)`. |
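
The lockstep rule can be sketched with a small macro. The names `pairs` and `zipped` are hypothetical. The transcriber wraps each identifier in `stringify!` so that the result is an ordinary value that can be inspected.

```rust
// Hypothetical macro: both repetitions are discharged together in the
// transcriber, so `$i` and `$j` must repeat the same number of times.
macro_rules! pairs {
    ( $( $i:ident ),* ; $( $j:ident ),* ) => {
        vec![ $( (stringify!($i), stringify!($j)) ),* ]
    };
}

fn zipped() -> Vec<(&'static str, &'static str)> {
    pairs!(a, b, c ; d, e, f)
}

fn main() {
    assert_eq!(zipped(), vec![("a", "d"), ("b", "e"), ("c", "f")]);
}
```

An invocation such as `pairs!(a, b, c ; d, e)` would be rejected during expansion, because the two repetitions have different lengths.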
1a4d82fc JJ |
621 | |
622 | Nested repetitions are allowed. | |
623 | ||
624 | ### Parsing limitations | |
625 | ||
626 | The parser used by the macro system is reasonably powerful, but the parsing of | |
627 | Rust syntax is restricted in two ways: | |
628 | ||
bd371182 AL |
629 | 1. Macro definitions are required to include suitable separators after parsing |
630 | expressions and other bits of the Rust grammar. This implies that | |
631 | a macro definition like `$i:expr [ , ]` is not legal, because `[` could be part | |
632 | of an expression. A macro definition like `$i:expr,` or `$i:expr;` would be legal, | |
633 | however, because `,` and `;` are legal separators. See [RFC 550] for more information. | |
1a4d82fc JJ |
634 | 2. The parser must have eliminated all ambiguity by the time it reaches a `$` |
635 | _name_ `:` _designator_. This requirement most often affects name-designator | |
636 | pairs when they occur at the beginning of, or immediately after, a `$(...)*`; | |
637 | requiring a distinctive token in front can solve the problem. | |
638 | ||
bd371182 AL |
639 | [RFC 550]: https://github.com/rust-lang/rfcs/blob/master/text/0550-macro-future-proofing.md |
640 | ||
1a4d82fc JJ |
641 | # Crates and source files |
642 | ||
bd371182 | 643 | Although Rust, like any other language, can be implemented by an interpreter as |
b039eaaf SL |
644 | well as a compiler, the only existing implementation is a compiler, |
645 | and the language has | |
bd371182 AL |
646 | always been designed to be compiled. For these reasons, this section assumes a |
647 | compiler. | |
648 | ||
649 | Rust's semantics obey a *phase distinction* between compile-time and | |
c1a9b12d SL |
650 | run-time.[^phase-distinction] Semantic rules that have a *static |
651 | interpretation* govern the success or failure of compilation, while | |
652 | semantic rules | |
bd371182 AL |
653 | that have a *dynamic interpretation* govern the behavior of the program at |
654 | run-time. | |
655 | ||
656 | [^phase-distinction]: This distinction would also exist in an interpreter. | |
657 | Static checks like syntactic analysis, type checking, and lints should | |
658 | happen before the program is executed regardless of when it is executed. | |
1a4d82fc JJ |
659 | |
660 | The compilation model centers on artifacts called _crates_. Each compilation | |
661 | processes a single crate in source form, and if successful, produces a single | |
bd371182 AL |
662 | crate in binary form: either an executable or some sort of |
663 | library.[^cratesourcefile] | |
1a4d82fc JJ |
664 | |
665 | [^cratesourcefile]: A crate is somewhat analogous to an *assembly* in the | |
666 | ECMA-335 CLI model, a *library* in the SML/NJ Compilation Manager, a *unit* | |
667 | in the Owens and Flatt module system, or a *configuration* in Mesa. | |
668 | ||
669 | A _crate_ is a unit of compilation and linking, as well as versioning, | |
670 | distribution and runtime loading. A crate contains a _tree_ of nested | |
671 | [module](#modules) scopes. The top level of this tree is a module that is | |
672 | anonymous (from the point of view of paths within the module) and any item | |
673 | within a crate has a canonical [module path](#paths) denoting its location | |
674 | within the crate's module tree. | |
675 | ||
676 | The Rust compiler is always invoked with a single source file as input, and | |
677 | always produces a single output crate. The processing of that source file may | |
678 | result in other source files being loaded as modules. Source files have the | |
679 | extension `.rs`. | |
680 | ||
681 | A Rust source file describes a module, the name and location of which — | |
682 | in the module tree of the current crate — are defined from outside the | |
683 | source file: either by an explicit `mod_item` in a referencing source file, or | |
bd371182 AL |
684 | by the name of the crate itself. Every source file is a module, but not every |
685 | module needs its own source file: [module definitions](#modules) can be nested | |
686 | within one file. | |
1a4d82fc JJ |
687 | |
688 | Each source file contains a sequence of zero or more `item` definitions, and | |
bd371182 AL |
689 | may optionally begin with any number of [attributes](#items-and-attributes) |
690 | that apply to the containing module, most of which influence the behavior of | |
691 | the compiler. The anonymous crate module can have additional attributes that | |
692 | apply to the crate as a whole. | |
1a4d82fc | 693 | |
c34b1796 | 694 | ```no_run |
bd371182 | 695 | // Specify the crate name. |
1a4d82fc JJ |
696 | #![crate_name = "projx"] |
697 | ||
bd371182 | 698 | // Specify the type of output artifact. |
1a4d82fc JJ |
699 | #![crate_type = "lib"] |
700 | ||
bd371182 AL |
701 | // Turn on a warning. |
702 | // This can be done in any module, not just the anonymous crate module. | |
1a4d82fc JJ |
703 | #![warn(non_camel_case_types)] |
704 | ``` | |
705 | ||
706 | A crate that contains a `main` function can be compiled to an executable. If a | |
92a42be0 SL |
707 | `main` function is present, its return type must be `()` |
708 | ("[unit](#tuple-types)") and it must take no arguments. | |
1a4d82fc JJ |
709 | |
710 | # Items and attributes | |
711 | ||
712 | Crates contain [items](#items), each of which may have some number of | |
713 | [attributes](#attributes) attached to it. | |
714 | ||
715 | ## Items | |
716 | ||
85aaf69f SL |
717 | An _item_ is a component of a crate. Items are organized within a crate by a |
718 | nested set of [modules](#modules). Every crate has a single "outermost" | |
1a4d82fc JJ |
719 | anonymous module; all further items within the crate have [paths](#paths) |
720 | within the module tree of the crate. | |
721 | ||
722 | Items are entirely determined at compile-time, generally remain fixed during | |
723 | execution, and may reside in read-only memory. | |
724 | ||
725 | There are several kinds of item: | |
726 | ||
85aaf69f SL |
727 | * [`extern crate` declarations](#extern-crate-declarations) |
728 | * [`use` declarations](#use-declarations) | |
1a4d82fc JJ |
729 | * [modules](#modules) |
730 | * [functions](#functions) | |
bd371182 | 731 | * [type definitions](grammar.html#type-definitions) |
b039eaaf | 732 | * [structs](#structs) |
1a4d82fc | 733 | * [enumerations](#enumerations) |
bd371182 | 734 | * [constant items](#constant-items) |
1a4d82fc JJ |
735 | * [static items](#static-items) |
736 | * [traits](#traits) | |
737 | * [implementations](#implementations) | |
738 | ||
739 | Some items form an implicit scope for the declaration of sub-items. In other | |
740 | words, within a function or module, declarations of items can (in many cases) | |
741 | be mixed with the statements, control blocks, and similar artifacts that | |
742 | otherwise compose the item body. The meaning of these scoped items is the same | |
743 | as if the item was declared outside the scope — it is still a static item | |
744 | — except that the item's *path name* within the module namespace is | |
745 | qualified by the name of the enclosing item, or is private to the enclosing | |
746 | item (in the case of functions). The grammar specifies the exact locations in | |
747 | which sub-item declarations may appear. | |
748 | ||
749 | ### Type Parameters | |
750 | ||
bd371182 AL |
751 | All items except modules, constants and statics may be *parameterized* by type. |
752 | Type parameters are given as a comma-separated list of identifiers enclosed in | |
753 | angle brackets (`<...>`), after the name of the item and before its definition. | |
754 | The type parameters of an item are considered "part of the name", not part of | |
755 | the type of the item. A referencing [path](#paths) must (in principle) provide | |
756 | type arguments as a list of comma-separated types enclosed within angle | |
757 | brackets, in order to refer to the type-parameterized item. In practice, the | |
758 | type-inference system can usually infer such argument types from context. There | |
759 | are no general type-parametric types, only type-parametric items. That is, Rust | |
760 | has no notion of type abstraction: there are no higher-ranked (or "forall") types | |
761 | abstracted over other types, though higher-ranked types do exist for lifetimes. | |
1a4d82fc JJ |
762 | |
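As a rough illustration, the following sketch declares a type-parameterized struct and refers to it both with an explicit type argument and with an inferred one. The names `Pair` and `demo` are made up for this example.

```rust
// Hypothetical generic struct: the parameter list follows the item name.
#[derive(Debug, PartialEq)]
struct Pair<T> {
    first: T,
    second: T,
}

fn demo() -> (Pair<u8>, Pair<f64>) {
    // Type argument written out in the referencing path.
    let a = Pair::<u8> { first: 1, second: 2 };
    // Type argument inferred from context (here, the return type).
    let b = Pair { first: 1.5, second: 2.5 };
    (a, b)
}
```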
763 | ### Modules | |
764 | ||
85aaf69f | 765 | A module is a container for zero or more [items](#items). |
1a4d82fc JJ |
766 | |
767 | A _module item_ is a module, surrounded in braces, named, and prefixed with the | |
768 | keyword `mod`. A module item introduces a new, named module into the tree of | |
769 | modules making up a crate. Modules can nest arbitrarily. | |
770 | ||
771 | An example of a module: | |
772 | ||
773 | ``` | |
774 | mod math { | |
775 | type Complex = (f64, f64); | |
776 | fn sin(f: f64) -> f64 { | |
777 | /* ... */ | |
778 | # panic!(); | |
779 | } | |
780 | fn cos(f: f64) -> f64 { | |
781 | /* ... */ | |
782 | # panic!(); | |
783 | } | |
784 | fn tan(f: f64) -> f64 { | |
785 | /* ... */ | |
786 | # panic!(); | |
787 | } | |
788 | } | |
789 | ``` | |
790 | ||
791 | Modules and types share the same namespace. Declaring a named type with | |
792 | the same name as a module in scope is forbidden: that is, a type definition, | |
793 | trait, struct, enumeration, or type parameter can't shadow the name of a module | |
794 | in scope, or vice versa. | |
795 | ||
796 | A module without a body is loaded from an external file, by default with the | |
797 | same name as the module, plus the `.rs` extension. When a nested submodule is | |
798 | loaded from an external file, it is loaded from a subdirectory path that | |
799 | mirrors the module hierarchy. | |
800 | ||
801 | ```{.ignore} | |
802 | // Load the `vec` module from `vec.rs` | |
803 | mod vec; | |
804 | ||
805 | mod thread { | |
806 | // Load the `local_data` module from `thread/local_data.rs` | |
bd371182 | 807 | // or `thread/local_data/mod.rs`. |
1a4d82fc JJ |
808 | mod local_data; |
809 | } | |
810 | ``` | |
811 | ||
812 | The directories and files used for loading external file modules can be | |
813 | influenced with the `path` attribute. | |
814 | ||
815 | ```{.ignore} | |
816 | #[path = "thread_files"] | |
817 | mod thread { | |
818 | // Load the `local_data` module from `thread_files/tls.rs` | |
819 | #[path = "tls.rs"] | |
820 | mod local_data; | |
821 | } | |
822 | ``` | |
823 | ||
bd371182 | 824 | #### Extern crate declarations |
1a4d82fc JJ |
825 | |
826 | An _`extern crate` declaration_ specifies a dependency on an external crate. | |
827 | The external crate is then bound into the declaring scope as the `ident` | |
828 | provided in the `extern_crate_decl`. | |
829 | ||
830 | The external crate is resolved to a specific `soname` at compile time, and a | |
831 | runtime linkage requirement to that `soname` is passed to the linker for | |
832 | loading at runtime. The `soname` is resolved at compile time by scanning the | |
bd371182 AL |
833 | compiler's library path and matching the optional `crateid` provided against |
834 | the `crateid` attributes that were declared on the external crate when it was | |
835 | compiled. If no `crateid` is provided, a default `name` attribute is assumed, | |
836 | equal to the `ident` given in the `extern_crate_decl`. | |
1a4d82fc JJ |
837 | |
838 | Three examples of `extern crate` declarations: | |
839 | ||
840 | ```{.ignore} | |
841 | extern crate pcre; | |
842 | ||
843 | extern crate std; // equivalent to: extern crate std as std; | |
844 | ||
c34b1796 | 845 | extern crate std as ruststd; // linking to 'std' under another name |
1a4d82fc JJ |
846 | ``` |
847 | ||
bd371182 | 848 | #### Use declarations |
1a4d82fc JJ |
849 | |
850 | A _use declaration_ creates one or more local name bindings synonymous with | |
851 | some other [path](#paths). Usually a `use` declaration is used to shorten the | |
7453a54e SL |
852 | path required to refer to a module item. These declarations may appear in |
853 | [modules](#modules) and [blocks](grammar.html#block-expressions), usually at the top. | |
1a4d82fc JJ |
854 | |
855 | > **Note**: Unlike in many languages, | |
856 | > `use` declarations in Rust do *not* declare linkage dependency with external crates. | |
857 | > Rather, [`extern crate` declarations](#extern-crate-declarations) declare linkage dependencies. | |
858 | ||
859 | Use declarations support a number of convenient shortcuts: | |
860 | ||
85aaf69f | 861 | * Rebinding the target name as a new local name, using the syntax `use p::q::r as x;` |
1a4d82fc JJ |
862 | * Simultaneously binding a list of paths differing only in their final element, |
863 | using the glob-like brace syntax `use a::b::{c,d,e,f};` | |
864 | * Binding all paths matching a given prefix, using the asterisk wildcard syntax | |
865 | `use a::b::*;` | |
866 | * Simultaneously binding a list of paths differing only in their final element | |
85aaf69f SL |
867 | and their immediate parent module, using the `self` keyword, such as |
868 | `use a::b::{self, c, d};` | |
1a4d82fc JJ |
869 | |
870 | An example of `use` declarations: | |
871 | ||
bd371182 | 872 | ```rust |
1a4d82fc | 873 | use std::option::Option::{Some, None}; |
85aaf69f | 874 | use std::collections::hash_map::{self, HashMap}; |
1a4d82fc JJ |
875 | |
876 | fn foo<T>(_: T){} | |
85aaf69f | 877 | fn bar(map1: HashMap<String, usize>, map2: hash_map::HashMap<String, usize>){} |
1a4d82fc JJ |
878 | |
879 | fn main() { | |
1a4d82fc JJ |
880 | // Equivalent to 'foo(vec![std::option::Option::Some(1.0f64), |
881 | // std::option::Option::None]);' | |
882 | foo(vec![Some(1.0f64), None]); | |
883 | ||
884 | // Both `hash_map` and `HashMap` are in scope. | |
885 | let map1 = HashMap::new(); | |
886 | let map2 = hash_map::HashMap::new(); | |
887 | bar(map1, map2); | |
888 | } | |
889 | ``` | |
890 | ||
891 | Like items, `use` declarations are private to the containing module, by | |
892 | default. Also like items, a `use` declaration can be public, if qualified by | |
893 | the `pub` keyword. Such a `use` declaration serves to _re-export_ a name. A | |
894 | public `use` declaration can therefore _redirect_ some public name to a | |
895 | different target definition: even a definition with a private canonical path, | |
896 | inside a different module. If a sequence of such redirections form a cycle or | |
897 | cannot be resolved unambiguously, they represent a compile-time error. | |
898 | ||
899 | An example of re-exporting: | |
900 | ||
901 | ``` | |
902 | # fn main() { } | |
903 | mod quux { | |
904 | pub use quux::foo::{bar, baz}; | |
905 | ||
906 | pub mod foo { | |
907 | pub fn bar() { } | |
908 | pub fn baz() { } | |
909 | } | |
910 | } | |
911 | ``` | |
912 | ||
913 | In this example, the module `quux` re-exports two public names defined in | |
914 | `foo`. | |
915 | ||
916 | Also note that the paths contained in `use` items are relative to the crate | |
917 | root. So, in the previous example, the `use` refers to `quux::foo::{bar, | |
918 | baz}`, and not simply to `foo::{bar, baz}`. This also means that top-level | |
919 | module declarations should be at the crate root if direct usage of the declared | |
920 | modules within `use` items is desired. It is also possible to use `self` and | |
921 | `super` at the beginning of a `use` item to refer to the current and direct | |
922 | parent modules respectively. All rules regarding accessing declared modules in | |
bd371182 | 923 | `use` declarations apply to both module declarations and `extern crate` |
1a4d82fc JJ |
924 | declarations. |
925 | ||
926 | An example of what will and will not work for `use` items: | |
927 | ||
928 | ``` | |
929 | # #![allow(unused_imports)] | |
1a4d82fc JJ |
930 | use foo::baz::foobaz; // good: foo is at the root of the crate |
931 | ||
932 | mod foo { | |
1a4d82fc | 933 | |
bd371182 AL |
934 | mod example { |
935 | pub mod iter {} | |
936 | } | |
937 | ||
938 | use foo::example::iter; // good: foo is at crate root | |
b039eaaf | 939 | // use example::iter; // bad: example is not at the crate root |
1a4d82fc JJ |
940 | use self::baz::foobaz; // good: self refers to module 'foo' |
941 | use foo::bar::foobar; // good: foo is at crate root | |
942 | ||
943 | pub mod bar { | |
944 | pub fn foobar() { } | |
945 | } | |
946 | ||
947 | pub mod baz { | |
948 | use super::bar::foobar; // good: super refers to module 'foo' | |
949 | pub fn foobaz() { } | |
950 | } | |
951 | } | |
952 | ||
953 | fn main() {} | |
954 | ``` | |
955 | ||
956 | ### Functions | |
957 | ||
b039eaaf SL |
958 | A _function item_ defines a sequence of [statements](#statements) and a |
959 | final [expression](#expressions), along with a name and a set of | |
960 | parameters. Other than a name, all these are optional. | |
961 | Functions are declared with the keyword `fn`. Functions may declare a | |
bd371182 AL |
962 | set of *input* [*variables*](#variables) as parameters, through which the caller |
963 | passes arguments into the function, and the *output* [*type*](#types) | |
964 | of the value the function will return to its caller on completion. | |
1a4d82fc | 965 | |
85aaf69f | 966 | A function may also be copied into a first-class *value*, in which case the |
1a4d82fc JJ |
967 | value has the corresponding [*function type*](#function-types), and can be used |
968 | otherwise exactly as a function item (with a minor additional cost of calling | |
969 | the function indirectly). | |
970 | ||
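A short sketch of a function item being copied into a value of function type. The helper names `apply` and `demo` are assumptions made for this example.

```rust
fn add(x: i32, y: i32) -> i32 {
    x + y
}

// Hypothetical helper that accepts any value of type `fn(i32, i32) -> i32`.
fn apply(f: fn(i32, i32) -> i32, a: i32, b: i32) -> i32 {
    f(a, b) // indirect call through the function value
}

fn demo() -> (i32, i32) {
    // Copy the function item into a first-class value.
    let f: fn(i32, i32) -> i32 = add;
    // A direct call and an indirect call produce the same result.
    (add(1, 2), apply(f, 1, 2))
}
```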
971 | Every control path in a function logically ends with a `return` expression or a | |
972 | diverging expression. If the outermost block of a function has a | |
973 | value-producing expression in its final-expression position, that expression is | |
974 | interpreted as an implicit `return` expression applied to the final-expression. | |
975 | ||
976 | An example of a function: | |
977 | ||
978 | ``` | |
85aaf69f | 979 | fn add(x: i32, y: i32) -> i32 { |
b039eaaf | 980 | x + y |
1a4d82fc JJ |
981 | } |
982 | ``` | |
983 | ||
984 | As with `let` bindings, function arguments are irrefutable patterns, so any | |
985 | pattern that is valid in a let binding is also valid as an argument. | |
986 | ||
987 | ``` | |
85aaf69f | 988 | fn first((value, _): (i32, i32)) -> i32 { value } |
1a4d82fc JJ |
989 | ``` |
990 | ||
991 | ||
992 | #### Generic functions | |
993 | ||
994 | A _generic function_ allows one or more _parameterized types_ to appear in its | |
7453a54e SL |
995 | signature. Each type parameter must be explicitly declared in an |
996 | angle-bracket-enclosed and comma-separated list, following the function name. | |
1a4d82fc | 997 | |
62682a34 SL |
998 | ```rust,ignore |
999 | // foo is generic over A and B | |
1000 | ||
1001 | fn foo<A, B>(x: A, y: B) { /* body elided */ } | |
1a4d82fc JJ |
1002 | ``` |
1003 | ||
1004 | Inside the function signature and body, the name of the type parameter can be | |
bd371182 AL |
1005 | used as a type name. [Trait](#traits) bounds can be specified for type parameters |
1006 | to allow methods of that trait to be called on values of that type. This is | |
62682a34 SL |
1007 | specified using the `where` syntax: |
1008 | ||
1009 | ```rust,ignore | |
1010 | fn foo<T>(x: T) where T: Debug { /* body elided */ } | |
1011 | ``` | |
1a4d82fc JJ |
1012 | |
1013 | When a generic function is referenced, its type is instantiated based on the | |
62682a34 SL |
1014 | context of the reference. For example, calling the `foo` function here: |
1015 | ||
1016 | ``` | |
1017 | use std::fmt::Debug; | |
1018 | ||
1019 | fn foo<T>(x: &[T]) where T: Debug { | |
1020 | // details elided | |
1021 | # () | |
1022 | } | |
1023 | ||
1024 | foo(&[1, 2]); | |
1025 | ``` | |
1026 | ||
1027 | will instantiate type parameter `T` with `i32`. | |
1a4d82fc JJ |
1028 | |
1029 | The type parameters can also be explicitly supplied in a trailing | |
1030 | [path](#paths) component after the function name. This might be necessary if | |
1031 | there is not sufficient context to determine the type parameters. For example, | |
1032 | `mem::size_of::<u32>() == 4`. | |
1033 | ||
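The following sketch shows explicitly supplied type parameters in two places where an argument alone cannot determine them. The function names `sizes` and `parsed` are hypothetical.

```rust
use std::mem;

fn sizes() -> (usize, usize) {
    // `size_of` takes no arguments, so `T` must be given explicitly.
    (mem::size_of::<u32>(), mem::size_of::<[u8; 3]>())
}

fn parsed() -> u16 {
    // The target type of `parse` is supplied in the trailing path component.
    // Here the return type could also drive inference; the explicit form is
    // shown for comparison.
    "42".parse::<u16>().unwrap()
}
```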
1a4d82fc JJ |
1034 | #### Diverging functions |
1035 | ||
1036 | A special kind of function can be declared with a `!` character where the | |
bd371182 | 1037 | output type would normally be. For example: |
1a4d82fc JJ |
1038 | |
1039 | ``` | |
1040 | fn my_err(s: &str) -> ! { | |
1041 | println!("{}", s); | |
1042 | panic!(); | |
1043 | } | |
1044 | ``` | |
1045 | ||
1046 | We call such functions "diverging" because they never return a value to the | |
1047 | caller. Every control path in a diverging function must end with a `panic!()` or | |
1048 | a call to another diverging function. The `!` annotation | |
85aaf69f | 1049 | does *not* denote a type. |
1a4d82fc JJ |
1050 | |
1051 | It might be necessary to declare a diverging function because as mentioned | |
1052 | previously, the typechecker checks that every control path in a function ends | |
1053 | with a [`return`](#return-expressions) or diverging expression. So, if `my_err` | |
1054 | were declared without the `!` annotation, the following code would not | |
1055 | typecheck: | |
1056 | ||
1057 | ``` | |
1058 | # fn my_err(s: &str) -> ! { panic!() } | |
1059 | ||
85aaf69f | 1060 | fn f(i: i32) -> i32 { |
e9174d1e SL |
1061 | if i == 42 { |
1062 | return 42; | |
1063 | } | |
1064 | else { | |
1065 | my_err("Bad number!"); | |
1066 | } | |
1a4d82fc JJ |
1067 | } |
1068 | ``` | |
1069 | ||
1070 | This will not compile without the `!` annotation on `my_err`, since the `else` | |
85aaf69f | 1071 | branch of the conditional in `f` does not return an `i32`, as required by the |
1a4d82fc JJ |
1072 | signature of `f`. Adding the `!` annotation to `my_err` informs the |
1073 | typechecker that, should control ever enter `my_err`, no further type judgments | |
1074 | about `f` need to hold, since control will never resume in any context that | |
1075 | relies on those judgments. Thus the return type on `f` only needs to reflect | |
1076 | the `if` branch of the conditional. | |
1077 | ||
1078 | #### Extern functions | |
1079 | ||
1080 | Extern functions are part of Rust's foreign function interface, providing the | |
1081 | opposite functionality to [external blocks](#external-blocks). Whereas | |
1082 | external blocks allow Rust code to call foreign code, extern functions with | |
1083 | bodies defined in Rust code _can be called by foreign code_. They are defined | |
1084 | in the same way as any other Rust function, except that they have the `extern` | |
1085 | modifier. | |
1086 | ||
1087 | ``` | |
1088 | // Declares an extern fn, the ABI defaults to "C" | |
85aaf69f | 1089 | extern fn new_i32() -> i32 { 0 } |
1a4d82fc JJ |
1090 | |
1091 | // Declares an extern fn with "stdcall" ABI | |
85aaf69f | 1092 | extern "stdcall" fn new_i32_stdcall() -> i32 { 0 } |
1a4d82fc JJ |
1093 | ``` |
1094 | ||
62682a34 | 1095 | Unlike normal functions, extern fns have type `extern "ABI" fn()`. This is the |
1a4d82fc JJ |
1096 | same type as the functions declared in an extern block. |
1097 | ||
1098 | ``` | |
85aaf69f SL |
1099 | # extern fn new_i32() -> i32 { 0 } |
1100 | let fptr: extern "C" fn() -> i32 = new_i32; | |
1a4d82fc JJ |
1101 | ``` |
1102 | ||
1103 | Extern functions may be called directly from Rust code as Rust uses large, | |
1104 | contiguous stack segments like C. | |
1105 | ||
1106 | ### Type aliases | |
1107 | ||
1108 | A _type alias_ defines a new name for an existing [type](#types). Type | |
1109 | aliases are declared with the keyword `type`. Every value has a single, | |
bd371182 AL |
1110 | specific type, but may implement several different traits, or be compatible with |
1111 | several different type constraints. | |
1a4d82fc | 1112 | |
bd371182 AL |
1113 | For example, the following defines the type `Point` as a synonym for the type |
1114 | `(u8, u8)`, the type of pairs of unsigned 8 bit integers: | |
1a4d82fc JJ |
1115 | |
1116 | ``` | |
1117 | type Point = (u8, u8); | |
1118 | let p: Point = (41, 68); | |
1119 | ``` | |
1120 | ||
54a0048b SL |
1121 | Currently a type alias to an enum type cannot be used to qualify the |
1122 | constructors: | |
1123 | ||
1124 | ``` | |
1125 | enum E { A } | |
1126 | type F = E; | |
1127 | let _: F = E::A; // OK | |
1128 | // let _: F = F::A; // Doesn't work | |
1129 | ``` | |
1130 | ||
b039eaaf | 1131 | ### Structs |
1a4d82fc | 1132 | |
b039eaaf | 1133 | A _struct_ is a nominal [struct type](#struct-types) defined with the |
1a4d82fc JJ |
1134 | keyword `struct`. |
1135 | ||
1136 | An example of a `struct` item and its use: | |
1137 | ||
1138 | ``` | |
85aaf69f | 1139 | struct Point {x: i32, y: i32} |
1a4d82fc | 1140 | let p = Point {x: 10, y: 11}; |
85aaf69f | 1141 | let px: i32 = p.x; |
1a4d82fc JJ |
1142 | ``` |
1143 | ||
b039eaaf | 1144 | A _tuple struct_ is a nominal [tuple type](#tuple-types), also defined with |
1a4d82fc JJ |
1145 | the keyword `struct`. For example: |
1146 | ||
1147 | ``` | |
85aaf69f | 1148 | struct Point(i32, i32); |
1a4d82fc | 1149 | let p = Point(10, 11); |
85aaf69f | 1150 | let px: i32 = match p { Point(x, _) => x }; |
1a4d82fc JJ |
1151 | ``` |
1152 | ||
b039eaaf SL |
1153 | A _unit-like struct_ is a struct without any fields, defined by leaving off |
1154 | the list of fields entirely. Such a struct implicitly defines a constant of | |
1155 | its type with the same name. For example: | |
1a4d82fc JJ |
1156 | |
1157 | ``` | |
1158 | struct Cookie; | |
b039eaaf SL |
1159 | let c = [Cookie, Cookie {}, Cookie, Cookie {}]; |
1160 | ``` | |
1161 | ||
1162 | is equivalent to | |
1163 | ||
1164 | ``` | |
b039eaaf SL |
1165 | struct Cookie {} |
1166 | const Cookie: Cookie = Cookie {}; | |
1167 | let c = [Cookie, Cookie {}, Cookie, Cookie {}]; | |
1a4d82fc JJ |
1168 | ``` |
1169 | ||
b039eaaf | 1170 | The precise memory layout of a struct is not specified. One can specify a |
1a4d82fc JJ |
1171 | particular layout using the [`repr` attribute](#ffi-attributes). |
1172 | ||
1173 | ### Enumerations | |
1174 | ||
1175 | An _enumeration_ is a simultaneous definition of a nominal [enumerated | |
1176 | type](#enumerated-types) as well as a set of *constructors*, that can be used | |
1177 | to create or pattern-match values of the corresponding enumerated type. | |
1178 | ||
1179 | Enumerations are declared with the keyword `enum`. | |
1180 | ||
1181 | An example of an `enum` item and its use: | |
1182 | ||
1183 | ``` | |
1184 | enum Animal { | |
c1a9b12d SL |
1185 | Dog, |
1186 | Cat, | |
1a4d82fc JJ |
1187 | } |
1188 | ||
1189 | let mut a: Animal = Animal::Dog; | |
1190 | a = Animal::Cat; | |
1191 | ``` | |
1192 | ||
1193 | Enumeration constructors can have either named or unnamed fields: | |
1194 | ||
bd371182 | 1195 | ```rust |
1a4d82fc JJ |
1196 | enum Animal { |
1197 | Dog (String, f64), | |
7453a54e | 1198 | Cat { name: String, weight: f64 }, |
1a4d82fc JJ |
1199 | } |
1200 | ||
1201 | let mut a: Animal = Animal::Dog("Cocoa".to_string(), 37.2); | |
1202 | a = Animal::Cat { name: "Spotty".to_string(), weight: 2.7 }; | |
1a4d82fc JJ |
1203 | ``` |
1204 | ||
1205 | In this example, `Cat` is a _struct-like enum variant_, | |
1206 | whereas `Dog` is simply called an enum variant. | |
1207 | ||
54a0048b SL |
1208 | Each enum value has a _discriminant_, which is an integer associated with it. You | |
1209 | can specify it explicitly: | |
85aaf69f SL |
1210 | |
1211 | ``` | |
1212 | enum Foo { | |
1213 | Bar = 123, | |
1214 | } | |
1215 | ``` | |
1216 | ||
54a0048b SL |
1217 | The right hand side of the specification is interpreted as an `isize` value, |
1218 | but the compiler is allowed to use a smaller type in the actual memory layout. | |
1219 | The [`repr` attribute](#ffi-attributes) can be added in order to change | |
1220 | the type of the right hand side and specify the memory layout. | |
1221 | ||
1222 | If discriminants aren't specified, they start at zero and increase by one for each | |
85aaf69f SL |
1223 | variant, in order. |
1224 | ||
54a0048b | 1225 | You can cast an enum to get its discriminant: |
85aaf69f SL |
1226 | |
1227 | ``` | |
1228 | # enum Foo { Bar = 123 } | |
1229 | let x = Foo::Bar as u32; // x is now 123u32 | |
1230 | ``` | |
1231 | ||
1232 | This only works as long as none of the variants have data attached. If the | |
1233 | variant were `Bar(i32)`, the cast would be disallowed. | |
1234 | ||
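A small sketch combining implicit and explicit discriminants. The enum `Level` and the function `discriminants` are hypothetical; the example also relies on the fact that a variant following an explicit discriminant continues counting from that value.

```rust
// Hypothetical enum for illustration.
#[allow(dead_code)]
enum Level {
    Low,       // implicit discriminant 0
    Mid,       // implicit discriminant 1
    High = 10, // explicit discriminant
    Max,       // one more than the previous variant: 11
}

fn discriminants() -> [i32; 4] {
    // Casting is allowed because no variant carries data.
    [
        Level::Low as i32,
        Level::Mid as i32,
        Level::High as i32,
        Level::Max as i32,
    ]
}
```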
1a4d82fc JJ |
1235 | ### Constant items |
1236 | ||
1a4d82fc JJ |
1237 | A *constant item* is a named _constant value_ which is not associated with a |
1238 | specific memory location in the program. Constants are essentially inlined | |
1239 | wherever they are used, meaning that they are copied directly into the relevant | |
1240 | context when used. References to the same constant are not necessarily | |
1241 | guaranteed to refer to the same memory address. | |
1242 | ||
1243 | Constant values must not have destructors, and otherwise permit most forms of | |
1244 | data. Constants may refer to the address of other constants, in which case the | |
1245 | address will have the `static` lifetime. The compiler is, however, still at | |
1246 | liberty to translate the constant many times, so the address referred to may not | |
1247 | be stable. | |
1248 | ||
1249 | Constants must be explicitly typed. The type may be `bool`, `char`, a number, or | |
1250 | a type derived from those primitive types. The derived types are references with | |
1251 | the `static` lifetime, fixed-size arrays, tuples, enum variants, and structs. | |
1252 | ||
1253 | ``` | |
85aaf69f SL |
1254 | const BIT1: u32 = 1 << 0; |
1255 | const BIT2: u32 = 1 << 1; | |
1a4d82fc | 1256 | |
85aaf69f | 1257 | const BITS: [u32; 2] = [BIT1, BIT2]; |
1a4d82fc JJ |
1258 | const STRING: &'static str = "bitstring"; |
1259 | ||
1260 | struct BitsNStrings<'a> { | |
85aaf69f | 1261 | mybits: [u32; 2], |
7453a54e | 1262 | mystring: &'a str, |
1a4d82fc JJ |
1263 | } |
1264 | ||
1265 | const BITS_N_STRINGS: BitsNStrings<'static> = BitsNStrings { | |
1266 | mybits: BITS, | |
7453a54e | 1267 | mystring: STRING, |
1a4d82fc JJ |
1268 | }; |
1269 | ``` | |
1270 | ||
1271 | ### Static items | |
1272 | ||
1a4d82fc JJ |
1273 | A *static item* is similar to a *constant*, except that it represents a precise |
1274 | memory location in the program. A static is never "inlined" at the usage site, | |
1275 | and all references to it refer to the same memory location. Static items have | |
1276 | the `static` lifetime, which outlives all other lifetimes in a Rust program. | |
1277 | Static items may be placed in read-only memory if they do not contain any | |
1278 | interior mutability. | |
1279 | ||
1280 | Statics may contain interior mutability through the `UnsafeCell` language item. | |
1281 | All access to a static is safe, but there are a number of restrictions on | |
1282 | statics: | |
1283 | ||
1284 | * Statics may not contain any destructors. | |
c1a9b12d | 1285 | * The types of static values must implement `Sync` to allow thread-safe access. |
1a4d82fc JJ |
1286 | * Statics may not refer to other statics by value, only by reference. |
1287 | * Constants cannot refer to statics. | |
1288 | ||
1289 | Constants should in general be preferred over statics, unless large amounts of | |
1290 | data are being stored, or single-address and mutability properties are required. | |
1291 | ||
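As a sketch of a static with interior mutability, the example below uses `std::sync::atomic::AtomicUsize`, which implements `Sync` and can be initialized in a static. The names `COUNTER`, `bump` and `same_address` are assumptions made for this illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// A non-`mut` static whose contents can still change through interior
// mutability; access does not require `unsafe`.
static COUNTER: AtomicUsize = AtomicUsize::new(0);

// Atomically increments the counter and returns the previous value.
fn bump() -> usize {
    COUNTER.fetch_add(1, Ordering::SeqCst)
}

// All references to a static refer to the same memory location.
fn same_address() -> bool {
    std::ptr::eq(&COUNTER, &COUNTER)
}
```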
1a4d82fc JJ |
1292 | #### Mutable statics |
1293 | ||
1294 | If a static item is declared with the `mut` keyword, then it is allowed to | |
1295 | be modified by the program. One of Rust's goals is to make concurrency bugs | |
1296 | hard to run into, and mutable statics are an obvious source of race conditions | |
1297 | or other bugs. For this reason, an `unsafe` block is required when either | |
1298 | reading or writing a mutable static variable. Care should be taken to ensure | |
1299 | that modifications to a mutable static are safe with respect to other threads | |
1300 | running in the same process. | |
1301 | ||
1302 | Mutable statics are still very useful, however. They can be used with C | |
1303 | libraries and can also be bound from C libraries (in an `extern` block). | |
1304 | ||
1305 | ``` | |
85aaf69f | 1306 | # fn atomic_add(_: &mut u32, _: u32) -> u32 { 2 } |
1a4d82fc | 1307 | |
85aaf69f | 1308 | static mut LEVELS: u32 = 0; |
1a4d82fc JJ |
1309 | |
1310 | // This violates the idea of no shared state, and this doesn't internally | |
1311 | // protect against races, so this function is `unsafe` | |
85aaf69f | 1312 | unsafe fn bump_levels_unsafe1() -> u32 { |
1a4d82fc JJ |
1313 | let ret = LEVELS; |
1314 | LEVELS += 1; | |
1315 | return ret; | |
1316 | } | |
1317 | ||
1318 | // Assuming that we have an atomic_add function which returns the old value, | |
1319 | // this function is "safe" but the meaning of the return value may not be what | |
1320 | // callers expect, so it's still marked as `unsafe` | |
85aaf69f | 1321 | unsafe fn bump_levels_unsafe2() -> u32 { |
1a4d82fc JJ |
1322 | return atomic_add(&mut LEVELS, 1); |
1323 | } | |
1324 | ``` | |
1325 | ||
1326 | Mutable statics have the same restrictions as normal statics, except that the | |
1327 | type of the value is not required to implement `Sync`. | |
1328 | ||
1329 | ### Traits | |
1330 | ||
bd371182 AL |
1331 | A _trait_ describes an abstract interface that types can |
1332 | implement. This interface consists of associated items, which come in | |
1333 | three varieties: | |
1a4d82fc | 1334 | |
bd371182 AL |
1335 | - functions |
1336 | - constants | |
1337 | - types | |
1338 | ||
1339 | Associated functions whose first parameter is named `self` are called | |
1340 | methods and may be invoked using `.` notation (e.g., `x.foo()`). | |
1341 | ||
1342 | All traits define an implicit type parameter `Self` that refers to | |
1343 | "the type that is implementing this interface". Traits may also | |
1344 | contain additional type parameters. These type parameters (including | |
1345 | `Self`) may be constrained by other traits and so forth as usual. | |
1346 | ||
1347 | Trait bounds on `Self` are considered "supertraits". These are | |
1348 | required to be acyclic. Supertraits are somewhat different from other | |
1349 | constraints in that they affect what methods are available in the | |
1350 | vtable when the trait is used as a [trait object](#trait-objects). | |
1a4d82fc JJ |
1351 | |
1352 | Traits are implemented for specific types through separate | |
1353 | [implementations](#implementations). | |
1354 | ||
d9579d0f AL |
1355 | Consider the following trait: |
1356 | ||
1a4d82fc | 1357 | ``` |
85aaf69f SL |
1358 | # type Surface = i32; |
1359 | # type BoundingBox = i32; | |
1a4d82fc JJ |
1360 | trait Shape { |
1361 | fn draw(&self, Surface); | |
1362 | fn bounding_box(&self) -> BoundingBox; | |
1363 | } | |
1364 | ``` | |
1365 | ||
1366 | This defines a trait with two methods. All values that have | |
1367 | [implementations](#implementations) of this trait in scope can have their | |
1368 | `draw` and `bounding_box` methods called, using `value.bounding_box()` | |
1369 | [syntax](#method-call-expressions). | |
1370 | ||
d9579d0f AL |
1371 | Traits can include default implementations of methods, as in: |
1372 | ||
1373 | ``` | |
1374 | trait Foo { | |
1375 | fn bar(&self); | |
d9579d0f AL |
1376 | fn baz(&self) { println!("We called baz."); } |
1377 | } | |
1378 | ``` | |
1379 | ||
1380 | Here the `baz` method has a default implementation, so types that implement | |
1381 | `Foo` need only implement `bar`. It is also possible for implementing types | |
1382 | to override a method that has a default implementation. | |
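
As a rough illustration of both cases, the sketch below implements a variant of `Foo` for two types. The types `UsesDefault` and `Overrides` are hypothetical, and the methods return a `String` instead of printing (an assumption made here so that the difference is observable).

```rust
trait Foo {
    fn bar(&self) -> String;
    // Default implementation, used unless an implementing type overrides it.
    fn baz(&self) -> String { String::from("default baz") }
}

struct UsesDefault;
struct Overrides;

impl Foo for UsesDefault {
    // Only the required method is provided; `baz` falls back to the default.
    fn bar(&self) -> String { String::from("bar") }
}

impl Foo for Overrides {
    fn bar(&self) -> String { String::from("bar") }
    // Replaces the default implementation for this type only.
    fn baz(&self) -> String { String::from("custom baz") }
}
```

Calling `baz` on a `UsesDefault` value runs the body written in the trait, while calling it on an `Overrides` value runs the body written in the `impl`.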
1383 | ||
1a4d82fc JJ |
1384 | Type parameters can be specified for a trait to make it generic. These appear |
1385 | after the trait name, using the same syntax used in [generic | |
1386 | functions](#generic-functions). | |
1387 | ||
1388 | ``` | |
1389 | trait Seq<T> { | |
e9174d1e SL |
1390 | fn len(&self) -> u32; |
1391 | fn elt_at(&self, n: u32) -> T; | |
1392 | fn iter<F>(&self, F) where F: Fn(T); | |
1a4d82fc JJ |
1393 | } |
1394 | ``` | |
1395 | ||
d9579d0f AL |
1396 | It is also possible to define associated types for a trait. Consider the |
1397 | following example of a `Container` trait. Notice how the type is available | |
1398 | for use in the method signatures: | |
1399 | ||
1400 | ``` | |
1401 | trait Container { | |
1402 | type E; | |
1403 | fn empty() -> Self; | |
1404 | fn insert(&mut self, Self::E); | |
1405 | } | |
1406 | ``` | |
1407 | ||
1408 | In order for a type to implement this trait, it must not only provide | |
1409 | implementations for every method, but it must specify the type `E`. Here's | |
1410 | an implementation of `Container` for the standard library type `Vec`: | |
1411 | ||
1412 | ``` | |
1413 | # trait Container { | |
1414 | # type E; | |
1415 | # fn empty() -> Self; | |
1416 | # fn insert(&mut self, Self::E); | |
1417 | # } | |
1418 | impl<T> Container for Vec<T> { | |
1419 | type E = T; | |
1420 | fn empty() -> Vec<T> { Vec::new() } | |
1421 | fn insert(&mut self, x: T) { self.push(x); } | |
1422 | } | |
1423 | ``` | |
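
Generic code can then name the associated type through the type parameter. The following is a minimal sketch under that assumption; the helper `singleton` is hypothetical, and the trait is repeated (with a named parameter in `insert`) so that the block stands on its own.

```rust
trait Container {
    type E;
    fn empty() -> Self;
    fn insert(&mut self, x: Self::E);
}

impl<T> Container for Vec<T> {
    type E = T;
    fn empty() -> Vec<T> { Vec::new() }
    fn insert(&mut self, x: T) { self.push(x); }
}

// Builds a container holding exactly one element. The element type is
// named through the associated type `C::E`, not a second type parameter.
fn singleton<C: Container>(x: C::E) -> C {
    let mut c = C::empty();
    c.insert(x);
    c
}
```

With `let v: Vec<i32> = singleton(7);`, the annotation fixes `C` to `Vec<i32>`, which in turn fixes `C::E` to `i32`.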
1424 | ||
1a4d82fc | 1425 | Generic functions may use traits as _bounds_ on their type parameters. This |
62682a34 SL |
1426 | will have two effects: |
1427 | ||
1428 | - Only types that have the trait may instantiate the parameter. | |
1429 | - Within the generic function, the methods of the trait can be | |
1430 | called on values that have the parameter's type. | |
1431 | ||
1432 | For example: | |
1a4d82fc JJ |
1433 | |
1434 | ``` | |
85aaf69f | 1435 | # type Surface = i32; |
1a4d82fc JJ |
1436 | # trait Shape { fn draw(&self, Surface); } |
1437 | fn draw_twice<T: Shape>(surface: Surface, sh: T) { | |
1438 | sh.draw(surface); | |
1439 | sh.draw(surface); | |
1440 | } | |
1441 | ``` | |
1442 | ||
e9174d1e | 1443 | Traits also define a [trait object](#trait-objects) with the same |
bd371182 AL |
1444 | name as the trait. Values of this type are created by coercing from a |
1445 | pointer of some specific type to a pointer of trait type. For example, | |
1446 | `&T` could be coerced to `&Shape` if `T: Shape` holds (and similarly | |
1447 | for `Box<T>`). This coercion can either be implicit or | |
1448 | [explicit](#type-cast-expressions). Here is an example of an explicit | |
1449 | coercion: | |
1a4d82fc JJ |
1450 | |
1451 | ``` | |
bd371182 AL |
1452 | trait Shape { } |
1453 | impl Shape for i32 { } | |
1454 | let mycircle = 0i32; | |
1a4d82fc JJ |
1455 | let myshape: Box<Shape> = Box::new(mycircle) as Box<Shape>; |
1456 | ``` | |
1457 | ||
1458 | The resulting value is a box containing the value that was cast, along with | |
1459 | information that identifies the methods of the implementation that was used. | |
1460 | Values with a trait type can have [methods called](#method-call-expressions) on | |
1461 | them, for any method in the trait, and can be used to instantiate type | |
1462 | parameters that are bounded by the trait. | |
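
The sketch below shows method calls through trait objects of differing concrete types. It is illustrative only: `Square`, `Rect`, `total_area` and `sample_shapes` are hypothetical names, and the trait object type is spelled `Box<dyn Shape>`, the form newer compilers expect, where this document writes `Box<Shape>`.

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square { side: f64 }
struct Rect { w: f64, h: f64 }

impl Shape for Square {
    fn area(&self) -> f64 { self.side * self.side }
}

impl Shape for Rect {
    fn area(&self) -> f64 { self.w * self.h }
}

// Sums the areas of a heterogeneous list. Each call to `area` is
// dispatched at run time through the trait object.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn sample_shapes() -> Vec<Box<dyn Shape>> {
    let mut shapes: Vec<Box<dyn Shape>> = Vec::new();
    // Explicit coercion with `as`.
    shapes.push(Box::new(Square { side: 2.0 }) as Box<dyn Shape>);
    // Implicit coercion: the expected argument type drives it.
    shapes.push(Box::new(Rect { w: 3.0, h: 1.0 }));
    shapes
}
```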
1463 | ||
1464 | Trait methods may be static, which means that they lack a `self` argument. | |
1465 | This means that they can only be called with function call syntax (`f(x)`) and | |
1466 | not method call syntax (`obj.f()`). The way to refer to the name of a static | |
1467 | method is to qualify it with the trait name, treating the trait name like a | |
1468 | module. For example: | |
1469 | ||
1470 | ``` | |
1471 | trait Num { | |
85aaf69f | 1472 | fn from_i32(n: i32) -> Self; |
1a4d82fc JJ |
1473 | } |
1474 | impl Num for f64 { | |
85aaf69f | 1475 | fn from_i32(n: i32) -> f64 { n as f64 } |
1a4d82fc | 1476 | } |
85aaf69f | 1477 | let x: f64 = Num::from_i32(42); |
1a4d82fc JJ |
1478 | ``` |
1479 | ||
e9174d1e | 1480 | Traits may inherit from other traits. Consider the following example: |
1a4d82fc JJ |
1481 | |
1482 | ``` | |
85aaf69f SL |
1483 | trait Shape { fn area(&self) -> f64; } |
1484 | trait Circle : Shape { fn radius(&self) -> f64; } | |
1a4d82fc JJ |
1485 | ``` |
1486 | ||
e9174d1e | 1487 | The syntax `Circle : Shape` means that types that implement `Circle` must also |
1a4d82fc JJ |
1488 | have an implementation for `Shape`. Multiple supertraits are separated by `+`, as in | |
1489 | `trait Circle : Shape + PartialEq { }`. In an implementation of `Circle` for a | |
1490 | given type `T`, methods can refer to `Shape` methods, since the typechecker | |
1491 | checks that any type with an implementation of `Circle` also has an | |
e9174d1e SL |
1492 | implementation of `Shape`: |
1493 | ||
1494 | ```rust | |
1495 | struct Foo; | |
1496 | ||
1497 | trait Shape { fn area(&self) -> f64; } | |
1498 | trait Circle : Shape { fn radius(&self) -> f64; } | |
b039eaaf SL |
1499 | impl Shape for Foo { |
1500 | fn area(&self) -> f64 { | |
1501 | 0.0 | |
1502 | } | |
1503 | } | |
e9174d1e SL |
1504 | impl Circle for Foo { |
1505 | fn radius(&self) -> f64 { | |
1506 | println!("calling area: {}", self.area()); | |
1507 | ||
1508 | 0.0 | |
1509 | } | |
1510 | } | |
1511 | ||
1512 | let c = Foo; | |
1513 | c.radius(); | |
1514 | ``` | |
1a4d82fc JJ |
1515 | |
1516 | In type-parameterized functions, methods of the supertrait may be called on | |
1517 | values of subtrait-bound type parameters. Referring to the previous example of | |
1518 | `trait Circle : Shape`: | |
1519 | ||
1520 | ``` | |
1521 | # trait Shape { fn area(&self) -> f64; } | |
1522 | # trait Circle : Shape { fn radius(&self) -> f64; } | |
1523 | fn radius_times_area<T: Circle>(c: T) -> f64 { | |
1524 | // `c` is both a Circle and a Shape | |
1525 | c.radius() * c.area() | |
1526 | } | |
1527 | ``` | |
1528 | ||
1529 | Likewise, supertrait methods may also be called on trait objects. | |
1530 | ||
1531 | ```{.ignore} | |
1a4d82fc JJ |
1532 | # trait Shape { fn area(&self) -> f64; } |
1533 | # trait Circle : Shape { fn radius(&self) -> f64; } | |
85aaf69f SL |
1534 | # impl Shape for i32 { fn area(&self) -> f64 { 0.0 } } |
1535 | # impl Circle for i32 { fn radius(&self) -> f64 { 0.0 } } | |
1536 | # let mycircle = 0i32; | |
1a4d82fc JJ |
1537 | let mycircle = Box::new(mycircle) as Box<Circle>; |
1538 | let nonsense = mycircle.radius() * mycircle.area(); | |
1539 | ``` | |
1540 | ||
1541 | ### Implementations | |
1542 | ||
1543 | An _implementation_ is an item that implements a [trait](#traits) for a | |
1544 | specific type. | |
1545 | ||
1546 | Implementations are defined with the keyword `impl`. | |
1547 | ||
1548 | ``` | |
c34b1796 | 1549 | # #[derive(Copy, Clone)] |
1a4d82fc | 1550 | # struct Point {x: f64, y: f64}; |
85aaf69f | 1551 | # type Surface = i32; |
1a4d82fc JJ |
1552 | # struct BoundingBox {x: f64, y: f64, width: f64, height: f64}; |
1553 | # trait Shape { fn draw(&self, Surface); fn bounding_box(&self) -> BoundingBox; } | |
1554 | # fn do_draw_circle(s: Surface, c: Circle) { } | |
1555 | struct Circle { | |
1556 | radius: f64, | |
1557 | center: Point, | |
1558 | } | |
1559 | ||
1560 | impl Copy for Circle {} | |
1561 | ||
c34b1796 AL |
1562 | impl Clone for Circle { |
1563 | fn clone(&self) -> Circle { *self } | |
1564 | } | |
1565 | ||
1a4d82fc JJ |
1566 | impl Shape for Circle { |
1567 | fn draw(&self, s: Surface) { do_draw_circle(s, *self); } | |
1568 | fn bounding_box(&self) -> BoundingBox { | |
1569 | let r = self.radius; | |
e9174d1e SL |
1570 | BoundingBox { |
1571 | x: self.center.x - r, | |
1572 | y: self.center.y - r, | |
1573 | width: 2.0 * r, | |
1574 | height: 2.0 * r, | |
1575 | } | |
1a4d82fc JJ |
1576 | } |
1577 | } | |
1578 | ``` | |
1579 | ||
1580 | It is possible to define an implementation without referring to a trait. The | |
b039eaaf SL |
1581 | methods in such an implementation can only be used as direct calls on the values |
1582 | of the type that the implementation targets. In such an implementation, the | |
1583 | trait type and `for` after `impl` are omitted. Such implementations are limited | |
1584 | to nominal types (enums, structs, trait objects), and the implementation must | |
1585 | appear in the same crate as the `self` type: | |
1a4d82fc JJ |
1586 | |
1587 | ``` | |
85aaf69f | 1588 | struct Point {x: i32, y: i32} |
1a4d82fc JJ |
1589 | |
1590 | impl Point { | |
1591 | fn log(&self) { | |
1592 | println!("Point is at ({}, {})", self.x, self.y); | |
1593 | } | |
1594 | } | |
1595 | ||
1596 | let my_point = Point {x: 10, y:11}; | |
1597 | my_point.log(); | |
1598 | ``` | |
1599 | ||
1600 | When a trait _is_ specified in an `impl`, all methods declared as part of the | |
1601 | trait must be implemented, with matching types and type parameter counts. | |
1602 | ||
1603 | An implementation can take type parameters, which can be different from the | |
1604 | type parameters taken by the trait it implements. Implementation parameters | |
1605 | are written after the `impl` keyword. | |
1606 | ||
1607 | ``` | |
85aaf69f | 1608 | # trait Seq<T> { fn dummy(&self, _: T) { } } |
1a4d82fc | 1609 | impl<T> Seq<T> for Vec<T> { |
e9174d1e | 1610 | /* ... */ |
1a4d82fc JJ |
1611 | } |
1612 | impl Seq<bool> for u32 { | |
e9174d1e | 1613 | /* Treat the integer as a sequence of bits */ |
1a4d82fc JJ |
1614 | } |
1615 | ``` | |
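
To make the elided bodies concrete, here is one possible way to fill them in. It is a sketch only: it assumes a reduced `Seq<T>` with just `len` and `elt_at`, adds a `Clone` bound that the original does not have, and picks the least significant bit as element 0.

```rust
trait Seq<T> {
    fn len(&self) -> u32;
    fn elt_at(&self, n: u32) -> T;
}

// `T` is a parameter of the implementation, introduced after `impl`.
impl<T: Clone> Seq<T> for Vec<T> {
    // `Vec::len` names the inherent method explicitly; writing `self.len()`
    // here would call this trait method again and recurse forever.
    fn len(&self) -> u32 { Vec::len(self) as u32 }
    fn elt_at(&self, n: u32) -> T { self[n as usize].clone() }
}

// No implementation parameters: the trait's parameter is fixed to `bool`.
impl Seq<bool> for u32 {
    fn len(&self) -> u32 { 32 }
    // Bit 0 is the least significant bit; `n` must be less than 32.
    fn elt_at(&self, n: u32) -> bool { (*self >> n) & 1 == 1 }
}
```

Under these assumptions, `5u32.elt_at(0)` and `5u32.elt_at(2)` are `true` and `5u32.elt_at(1)` is `false`, since 5 is `101` in binary.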
1616 | ||
1617 | ### External blocks | |
1618 | ||
1a4d82fc JJ |
1619 | External blocks form the basis for Rust's foreign function interface. |
1620 | Declarations in an external block describe symbols in external, non-Rust | |
1621 | libraries. | |
1622 | ||
1623 | Functions within external blocks are declared in the same way as other Rust | |
1624 | functions, with the exception that they may not have a body and are instead | |
1625 | terminated by a semicolon. | |
1626 | ||
1a4d82fc JJ |
1627 | Functions within external blocks may be called by Rust code, just like |
1628 | functions defined in Rust. The Rust compiler automatically translates between | |
1629 | the Rust ABI and the foreign ABI. | |
1630 | ||
1631 | A number of [attributes](#attributes) control the behavior of external blocks. | |
1632 | ||
1633 | By default external blocks assume that the library they are calling uses the | |
1634 | standard C "cdecl" ABI. Other ABIs may be specified using an `abi` string, as | |
1635 | shown here: | |
1636 | ||
bd371182 | 1637 | ```ignore |
1a4d82fc JJ |
1638 | // Interface to the Windows API |
1639 | extern "stdcall" { } | |
1640 | ``` | |
1641 | ||
1642 | The `link` attribute allows the name of the library to be specified. When | |
1643 | specified the compiler will attempt to link against the native library of the | |
1644 | specified name. | |
1645 | ||
1646 | ```{.ignore} | |
1647 | #[link(name = "crypto")] | |
1648 | extern { } | |
1649 | ``` | |
1650 | ||
1651 | The type of a function declared in an extern block is `extern "abi" fn(A1, ..., | |
1652 | An) -> R`, where `A1...An` are the declared types of its arguments and `R` is | |
1653 | the declared return type. | |
1654 | ||
c1a9b12d SL |
1655 | It is valid to add the `link` attribute on an empty extern block. You can use |
1656 | this to satisfy the linking requirements of extern blocks elsewhere in your code | |
1657 | (including upstream crates) instead of adding the attribute to each extern block. | |
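
A small sketch of a non-empty external block follows. It is an illustration under stated assumptions, not a definitive recipe: it declares the C standard library function `abs`, assumes that library is already linked (as it normally is when `std` is used), and wraps the call in a hypothetical helper named `checked_c_abs`. The block is written as `unsafe extern "C"`, the spelling recent compilers accept, where this document writes plain `extern`.

```rust
use std::os::raw::c_int;

// Declares `abs` from the C standard library. The declaration has no body
// and ends with a semicolon; the symbol is resolved at link time.
unsafe extern "C" {
    fn abs(input: c_int) -> c_int;
}

// Calling a foreign function requires `unsafe`, because the compiler cannot
// check the declaration against the foreign definition. C leaves
// `abs(INT_MIN)` undefined, so that input is rejected before the call.
fn checked_c_abs(x: c_int) -> Option<c_int> {
    if x == c_int::MIN {
        None
    } else {
        Some(unsafe { abs(x) })
    }
}
```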
1658 | ||
1a4d82fc JJ |
1659 | ## Visibility and Privacy |
1660 | ||
1661 | These two terms are often used interchangeably, and what they are attempting to | |
1662 | convey is the answer to the question "Can this item be used at this location?" | |
1663 | ||
1664 | Rust's name resolution operates on a global hierarchy of namespaces. Each level | |
1665 | in the hierarchy can be thought of as some item. The items are one of those | |
1666 | mentioned above, but also include external crates. Declaring or defining a new | |
1667 | module can be thought of as inserting a new tree into the hierarchy at the | |
1668 | location of the definition. | |
1669 | ||
1670 | To control whether interfaces can be used across modules, Rust checks each use | |
1671 | of an item to see whether it should be allowed or not. This is where privacy | |
1672 | errors are generated, of the form "you used a private item of another module | |
1673 | and weren't allowed to." | |
1674 | ||
1675 | By default, everything in Rust is *private*, with one exception: enum variants | |
bd371182 | 1676 | in a `pub` enum are public by default. When an item is declared as `pub`, |
1a4d82fc JJ |
1677 | it can be thought of as being accessible to the outside world. For example: |
1678 | ||
1679 | ``` | |
1a4d82fc JJ |
1680 | # fn main() {} |
1681 | // Declare a private struct | |
1682 | struct Foo; | |
1683 | ||
1684 | // Declare a public struct with a private field | |
1685 | pub struct Bar { | |
7453a54e | 1686 | field: i32, |
1a4d82fc JJ |
1687 | } |
1688 | ||
1689 | // Declare a public enum with two public variants | |
1690 | pub enum State { | |
1691 | PubliclyAccessibleState, | |
1692 | PubliclyAccessibleState2, | |
1693 | } | |
1694 | ``` | |
1695 | ||
1696 | With the notion of an item being either public or private, Rust allows item | |
1697 | accesses in two cases: | |
1698 | ||
1699 | 1. If an item is public, then it can be used externally through any of its | |
1700 | public ancestors. | |
1701 | 2. If an item is private, it may be accessed by the current module and its | |
1702 | descendants. | |
1703 | ||
1704 | These two cases are surprisingly powerful for creating module hierarchies | |
1705 | exposing public APIs while hiding internal implementation details. To help | |
85aaf69f | 1706 | explain, here are a few use cases and what they would entail: |
1a4d82fc JJ |
1707 | |
1708 | * A library developer needs to expose functionality to crates which link | |
1709 | against their library. As a consequence of the first case, this means that | |
1710 | anything which is usable externally must be `pub` from the root down to the | |
1711 | destination item. Any private item in the chain will disallow external | |
1712 | accesses. | |
1713 | ||
1714 | * A crate needs a globally available "helper module" to itself, but it doesn't | |
1715 | want to expose the helper module as a public API. To accomplish this, the | |
1716 | root of the crate's hierarchy would have a private module which then | |
c1a9b12d | 1717 | internally has a "public API". Because the entire crate is a descendant of |
1a4d82fc JJ |
1718 | the root, then the entire local crate can access this private module through |
1719 | the second case. | |
1720 | ||
1721 | * When writing unit tests for a module, it's a common idiom to have an | |
1722 | immediate child of the module to-be-tested named `mod test`. This module | |
1723 | could access any items of the parent module through the second case, meaning | |
1724 | that internal implementation details could also be seamlessly tested from the | |
1725 | child module. | |
1726 | ||
1727 | The second case states that a private item "may be accessed" by the | |
1728 | current module and its descendants, but the exact meaning of accessing an item | |
1729 | depends on what the item is. Accessing a module, for example, would mean | |
1730 | looking inside of it (to import more items). On the other hand, accessing a | |
1731 | function would mean that it is invoked. Additionally, path expressions and | |
1732 | import statements are considered to access an item in the sense that the | |
1733 | import/expression is only valid if the destination is in the current visibility | |
1734 | scope. | |
1735 | ||
1736 | Here's an example of a program which exemplifies the three cases outlined | |
85aaf69f | 1737 | above: |
1a4d82fc JJ |
1738 | |
1739 | ``` | |
1740 | // This module is private, meaning that no external crate can access this | |
1741 | // module. Because it is private at the root of this current crate, however, any | |
1742 | // module in the crate may access any publicly visible item in this module. | |
1743 | mod crate_helper_module { | |
1744 | ||
1745 | // This function can be used by anything in the current crate | |
1746 | pub fn crate_helper() {} | |
1747 | ||
1748 | // This function *cannot* be used by anything else in the crate. It is not | |
1749 | // publicly visible outside of the `crate_helper_module`, so only this | |
1750 | // current module and its descendants may access it. | |
1751 | fn implementation_detail() {} | |
1752 | } | |
1753 | ||
1754 | // This function is "public to the root" meaning that it's available to external | |
1755 | // crates linking against this one. | |
1756 | pub fn public_api() {} | |
1757 | ||
1758 | // Similarly to 'public_api', this module is public so external crates may look | |
1759 | // inside of it. | |
1760 | pub mod submodule { | |
1761 | use crate_helper_module; | |
1762 | ||
1763 | pub fn my_method() { | |
1764 | // Any item in the local crate may invoke the helper module's public | |
1765 | // interface through a combination of the two rules above. | |
1766 | crate_helper_module::crate_helper(); | |
1767 | } | |
1768 | ||
1769 | // This function is hidden to any module which is not a descendant of | |
1770 | // `submodule` | |
1771 | fn my_implementation() {} | |
1772 | ||
1773 | #[cfg(test)] | |
1774 | mod test { | |
1775 | ||
1776 | #[test] | |
1777 | fn test_my_implementation() { | |
1778 | // Because this module is a descendant of `submodule`, it's allowed | |
1779 | // to access private items inside of `submodule` without a privacy | |
1780 | // violation. | |
1781 | super::my_implementation(); | |
1782 | } | |
1783 | } | |
1784 | } | |
1785 | ||
1786 | # fn main() {} | |
1787 | ``` | |
1788 | ||
7453a54e | 1789 | For a Rust program to pass the privacy checking pass, all paths must be valid |
1a4d82fc JJ |
1790 | accesses given the two rules above. This includes all use statements, |
1791 | expressions, types, etc. | |
1792 | ||
1793 | ### Re-exporting and Visibility | |
1794 | ||
1795 | Rust allows publicly re-exporting items through a `pub use` directive. Because | |
1796 | this is a public directive, this allows the item to be used in the current | |
1797 | module through the rules above. It essentially allows public access into the | |
1798 | re-exported item. For example, this program is valid: | |
1799 | ||
1800 | ``` | |
c34b1796 | 1801 | pub use self::implementation::api; |
1a4d82fc JJ |
1802 | |
1803 | mod implementation { | |
c34b1796 AL |
1804 | pub mod api { |
1805 | pub fn f() {} | |
1806 | } | |
1a4d82fc JJ |
1807 | } |
1808 | ||
1809 | # fn main() {} | |
1810 | ``` | |
1811 | ||
c34b1796 | 1812 | This means that any external crate referencing `implementation::api::f` would |
1a4d82fc JJ |
1813 | receive a privacy violation, while the path `api::f` would be allowed. |
1814 | ||
1815 | When re-exporting a private item, it can be thought of as allowing the "privacy | |
1816 | chain" to be short-circuited through the re-export instead of passing through | |
1817 | the namespace hierarchy as it normally would. | |
1818 | ||
1819 | ## Attributes | |
1820 | ||
1a4d82fc JJ |
1821 | Any item declaration may have an _attribute_ applied to it. Attributes in Rust |
1822 | are modeled on Attributes in ECMA-335, with the syntax coming from ECMA-334 | |
1823 | (C#). An attribute is a general, free-form metadatum that is interpreted | |
1824 | according to name, convention, and language and compiler version. Attributes | |
1825 | may appear as any of: | |
1826 | ||
1827 | * A single identifier, the attribute name | |
1828 | * An identifier followed by the equals sign '=' and a literal, providing a | |
1829 | key/value pair | |
1830 | * An identifier followed by a parenthesized list of sub-attribute arguments | |
1831 | ||
1832 | Attributes with a bang ("!") after the hash ("#") apply to the item that the | |
1833 | attribute is declared within. Attributes that do not have a bang after the hash | |
1834 | apply to the item that follows the attribute. | |
1835 | ||
1836 | An example of attributes: | |
1837 | ||
1838 | ```{.rust} | |
1839 | // General metadata applied to the enclosing module or crate. | |
1840 | #![crate_type = "lib"] | |
1841 | ||
1842 | // A function marked as a unit test | |
1843 | #[test] | |
1844 | fn test_foo() { | |
e9174d1e | 1845 | /* ... */ |
1a4d82fc JJ |
1846 | } |
1847 | ||
1848 | // A conditionally-compiled module | |
1849 | #[cfg(target_os="linux")] | |
1850 | mod bar { | |
e9174d1e | 1851 | /* ... */ |
1a4d82fc JJ |
1852 | } |
1853 | ||
1854 | // A lint attribute used to suppress a warning/error | |
1855 | #[allow(non_camel_case_types)] | |
1856 | type int8_t = i8; | |
1857 | ``` | |
1858 | ||
1859 | > **Note:** At some point in the future, the compiler will distinguish between | |
1860 | > language-reserved and user-available attributes. Until then, there is | |
1861 | > effectively no difference between an attribute handled by a loadable syntax | |
1862 | > extension and the compiler. | |
1863 | ||
1864 | ### Crate-only attributes | |
1865 | ||
bd371182 | 1866 | - `crate_name` - specify the crate's name. |
1a4d82fc JJ |
1867 | - `crate_type` - see [linkage](#linkage). |
1868 | - `feature` - see [compiler features](#compiler-features). | |
1869 | - `no_builtins` - disable optimizing certain code patterns to invocations of | |
1870 | library functions that are assumed to exist | |
1871 | - `no_main` - disable emitting the `main` symbol. Useful when some other | |
1872 | object being linked to defines `main`. | |
1873 | - `no_start` - disable linking to the `native` crate, which specifies the | |
1874 | "start" language item. | |
1875 | - `no_std` - disable linking to the `std` crate. | |
e9174d1e | 1876 | - `plugin` - load a list of named crates as compiler plugins, e.g. |
85aaf69f SL |
1877 | `#![plugin(foo, bar)]`. Optional arguments for each plugin, |
1878 | i.e. `#![plugin(foo(... args ...))]`, are provided to the plugin's | |
1879 | registrar function. The `plugin` feature gate is required to use | |
1880 | this attribute. | |
e9174d1e SL |
1881 | - `recursion_limit` - Sets the maximum depth for potentially |
1882 | infinitely-recursive compile-time operations like | |
1883 | auto-dereference or macro expansion. The default is | |
1884 | `#![recursion_limit="64"]`. | |
1a4d82fc JJ |
1885 | |
1886 | ### Module-only attributes | |
1887 | ||
1888 | - `no_implicit_prelude` - disable injecting `use std::prelude::*` in this | |
1889 | module. | |
1890 | - `path` - specifies the file to load the module from. `#[path="foo.rs"] mod | |
1891 | bar;` is equivalent to `mod bar { /* contents of foo.rs */ }`. The path is | |
1892 | taken relative to the directory that the current module is in. | |
1893 | ||
1894 | ### Function-only attributes | |
1895 | ||
1896 | - `main` - indicates that this function should be passed to the entry point, | |
1897 | rather than the function in the crate root named `main`. | |
1898 | - `plugin_registrar` - mark this function as the registration point for | |
1899 | [compiler plugins][plugin], such as loadable syntax extensions. | |
1900 | - `start` - indicates that this function should be used as the entry point, | |
1901 | overriding the "start" language item. See the "start" [language | |
1902 | item](#language-items) for more details. | |
1903 | - `test` - indicates that this function is a test function, to only be compiled | |
1904 | in case of `--test`. | |
c34b1796 | 1905 | - `should_panic` - indicates that this test function should panic, inverting the success condition. |
85aaf69f SL |
1906 | - `cold` - The function is unlikely to be executed, so optimize it (and calls |
1907 | to it) differently. | |
54a0048b SL |
1908 | - `naked` - The function utilizes a custom ABI or custom inline ASM that requires |
1909 | epilogue and prologue to be skipped. | |
1a4d82fc JJ |
1910 | |
1911 | ### Static-only attributes | |
1912 | ||
1913 | - `thread_local` - on a `static mut`, this signals that the value of this | |
1914 | static may change depending on the current thread. The exact consequences of | |
1915 | this are implementation-defined. | |
1916 | ||
1917 | ### FFI attributes | |
1918 | ||
1919 | On an `extern` block, the following attributes are interpreted: | |
1920 | ||
1921 | - `link_args` - specify arguments to the linker, rather than just the library | |
1922 | name and type. This is feature gated and the exact behavior is | |
1923 | implementation-defined (due to variety of linker invocation syntax). | |
1924 | - `link` - indicate that a native library should be linked to for the | |
e9174d1e SL |
1925 | declarations in this block to be linked correctly. `link` supports an optional |
1926 | `kind` key with three possible values: `dylib`, `static`, and `framework`. See | |
1927 | [external blocks](#external-blocks) for more about external blocks. Two | |
1a4d82fc JJ |
1928 | examples: `#[link(name = "readline")]` and |
1929 | `#[link(name = "CoreFoundation", kind = "framework")]`. | |
e9174d1e SL |
1930 | - `linked_from` - indicates what native library this block of FFI items is |
1931 | coming from. This attribute is of the form `#[linked_from = "foo"]` where | |
1932 | `foo` is the name of a library in either `#[link]` or a `-l` flag. This | |
1933 | attribute is currently required to export symbols from a Rust dynamic library | |
1934 | on Windows, and it is feature gated behind the `linked_from` feature. | |
1a4d82fc JJ |
1935 | |
1936 | On declarations inside an `extern` block, the following attributes are | |
1937 | interpreted: | |
1938 | ||
1939 | - `link_name` - the name of the symbol that this function or static should be | |
1940 | imported as. | |
1941 | - `linkage` - on a static, this specifies the [linkage | |
1942 | type](http://llvm.org/docs/LangRef.html#linkage-types). | |
1943 | ||
1944 | On `enum`s: | |
1945 | ||
1946 | - `repr` - on C-like enums, this sets the underlying type used for | |
1947 | representation. Takes one argument, which is the primitive | |
1948 | type this enum should be represented as, or `C`, which specifies that it | |
1949 | should be the default `enum` size of the C ABI for that platform. Note that | |
1950 | enum representation in C is undefined, and this may be incorrect when the C | |
1951 | code is compiled with certain flags. | |
1952 | ||
1953 | On `struct`s: | |
1954 | ||
1955 | - `repr` - specifies the representation to use for this struct. Takes a list | |
1956 | of options. The currently accepted ones are `C` and `packed`, which may be | |
1957 | combined. `C` will use a C ABI compatible struct layout, and `packed` will | |
1958 | remove any padding between fields (note that this is very fragile and may | |
1959 | break platforms which require aligned access). | |
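
The effect of these `repr` options can be observed with `std::mem::size_of`. The sketch below uses hypothetical types (`Level`, `Padded`, `Packed`); the exact size of `Padded` depends on the target's alignment rules, so only the packed and enum sizes are stated as fixed.

```rust
use std::mem::size_of;

// C-like enum stored as a single byte.
#[repr(u8)]
enum Level { Low = 1, High = 2 }

// C-compatible layout: fields in declaration order, padded for alignment.
#[repr(C)]
struct Padded { tag: u8, value: u32 }

// C field order with all padding between fields removed.
#[repr(C, packed)]
struct Packed { tag: u8, value: u32 }

// Returns the sizes of `Level`, `Padded` and `Packed`, in that order.
fn sizes() -> (usize, usize, usize) {
    (size_of::<Level>(), size_of::<Padded>(), size_of::<Packed>())
}
```

`Level` occupies 1 byte and `Packed` occupies 5 bytes (1 + 4). `Padded` is at least as large as `Packed`, and is typically 8 bytes on targets where `u32` is 4-byte aligned.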
1960 | ||
85aaf69f | 1961 | ### Macro-related attributes |
1a4d82fc JJ |
1962 | |
1963 | - `macro_use` on a `mod` — macros defined in this module will be visible in the | |
1964 | module's parent, after this module has been included. | |
1965 | ||
1966 | - `macro_use` on an `extern crate` — load macros from this crate. An optional | |
1967 | list of names `#[macro_use(foo, bar)]` restricts the import to just those | |
1968 | macros named. The `extern crate` must appear at the crate root, not inside | |
1969 | `mod`, which ensures proper function of the [`$crate` macro | |
b039eaaf | 1970 | variable](book/macros.html#the-variable-crate). |
1a4d82fc JJ |
1971 | |
1972 | - `macro_reexport` on an `extern crate` — re-export the named macros. | |
1973 | ||
1974 | - `macro_export` - export a macro for cross-crate usage. | |
1975 | ||
85aaf69f SL |
1976 | - `no_link` on an `extern crate` — even if we load this crate for macros, don't |
1977 | link it into the output. | |
1a4d82fc JJ |
1978 | |
1979 | See the [macros section of the | |
b039eaaf | 1980 | book](book/macros.html#scoping-and-macro-importexport) for more information on |
1a4d82fc JJ |
1981 | macro scope. |
1982 | ||
1983 | ||
1984 | ### Miscellaneous attributes | |
1985 | ||
1986 | - `export_name` - on statics and functions, this determines the name of the | |
1987 | exported symbol. | |
1988 | - `link_section` - on statics and functions, this specifies the section of the | |
1989 | object file that this item's contents will be placed into. | |
1990 | - `no_mangle` - on any item, do not apply the standard name mangling. Set the | |
1991 | symbol for this item to its identifier. | |
1a4d82fc JJ |
1992 | - `simd` - on certain tuple structs, derive the arithmetic operators, which |
1993 | lower to the target's SIMD instructions, if any; the `simd` feature gate | |
1994 | is necessary to use this attribute. | |
b039eaaf SL |
1995 | - `unsafe_destructor_blind_to_params` - on `Drop::drop` method, asserts that the |
1996 | destructor code (and all potential specializations of that code) will | |
1997 | never attempt to read from nor write to any references with lifetimes | |
1998 | that come in via generic parameters. This is a constraint we cannot | |
1999 | currently express via the type system, and therefore we rely on the | |
2000 | programmer to assert that it holds. Adding this to a Drop impl causes | |
2001 | the associated destructor to be considered "uninteresting" by the | |
2002 | Drop-Check rule, and thus it can help sidestep data ordering | |
2003 | constraints that would otherwise be introduced by the Drop-Check | |
2004 | rule. Such sidestepping of the constraints, if done incorrectly, can | |
2005 | lead to undefined behavior (in the form of reading or writing to data | |
2006 | outside of its dynamic extent), and thus this attribute has the word | |
2007 | "unsafe" in its name. To use this, the | |
2008 | `unsafe_destructor_blind_to_params` feature gate must be enabled. | |
1a4d82fc JJ |
2009 | - `unsafe_no_drop_flag` - on structs, remove the flag that prevents |
2010 | destructors from being run twice. Destructors might be run multiple times on | |
bd371182 AL |
2011 | the same object with this attribute. To use this, the `unsafe_no_drop_flag` feature |
2012 | gate must be enabled. | |
1a4d82fc | 2013 | - `doc` - Doc comments such as `/// foo` are equivalent to `#[doc = "foo"]`. |
85aaf69f SL |
2014 | - `rustc_on_unimplemented` - Write a custom note to be shown along with the error |
2015 | when the trait is found to be unimplemented on a type. | |
2016 | You may use format arguments like `{T}` or `{A}`, which are replaced with the |
2017 | types supplied at the point of use for the type parameters of the trait |
2018 | that have the same name. `{Self}` will be replaced with the type that is supposed |
2019 | to implement the trait but doesn't. To use this, the `on_unimplemented` feature gate | |
2020 | must be enabled. | |
1a4d82fc JJ |
2021 | |
2022 | ### Conditional compilation | |
2023 | ||
2024 | Sometimes one wants to have different compiler outputs from the same code, | |
2025 | depending on build target, such as targeted operating system, or to enable | |
2026 | release builds. | |
2027 | ||
2028 | There are two kinds of configuration options, one that is either defined or not | |
2029 | (`#[cfg(foo)]`), and the other that contains a string that can be checked | |
bd371182 AL |
2030 | against (`#[cfg(bar = "baz")]`). Currently, only compiler-defined configuration |
2031 | options can have the latter form. | |
1a4d82fc JJ |
2032 | |
2033 | ``` | |
2034 | // This function is only included in the build when compiling for OS X |
2035 | #[cfg(target_os = "macos")] | |
2036 | fn macos_only() { | |
2037 | // ... | |
2038 | } | |
2039 | ||
2040 | // This function is only included when either foo or bar is defined | |
2041 | #[cfg(any(foo, bar))] | |
2042 | fn needs_foo_or_bar() { | |
2043 | // ... | |
2044 | } | |
2045 | ||
2046 | // This function is only included when compiling for a unixish OS with a 32-bit | |
2047 | // architecture | |
c34b1796 | 2048 | #[cfg(all(unix, target_pointer_width = "32"))] |
1a4d82fc JJ |
2049 | fn on_32bit_unix() { |
2050 | // ... | |
2051 | } | |
2052 | ||
2053 | // This function is only included when foo is not defined | |
2054 | #[cfg(not(foo))] | |
2055 | fn needs_not_foo() { | |
2056 | // ... | |
2057 | } | |
2058 | ``` | |
2059 | ||
2060 | This illustrates how some conditional compilation can be achieved using the |
2061 | `#[cfg(...)]` attribute. `any`, `all` and `not` can be used to assemble | |
2062 | arbitrarily complex configurations through nesting. | |
2063 | ||
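As a minimal sketch of such nesting, the following pairs each predicate with its negation so that exactly one definition of each function is compiled on any target. The function names `family_known` and `describe` and the returned strings are hypothetical, chosen only for illustration.

```rust
// Exactly one of these two definitions is compiled, because the second
// predicate is the negation of the first.
#[cfg(any(unix, windows))]
pub fn family_known() -> bool {
    true
}

#[cfg(not(any(unix, windows)))]
pub fn family_known() -> bool {
    false
}

// Predicates nest: a Unix-like target that is neither macOS nor iOS.
#[cfg(all(unix, not(any(target_os = "macos", target_os = "ios"))))]
pub fn describe() -> &'static str {
    "unix, not an Apple OS"
}

// The complementary case, so that `describe` exists on every target.
#[cfg(not(all(unix, not(any(target_os = "macos", target_os = "ios")))))]
pub fn describe() -> &'static str {
    "something else"
}
```

Writing the negated predicate out in full is verbose, but it keeps the two definitions mutually exclusive without relying on any other mechanism.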
2064 | The following configurations must be defined by the implementation: | |
2065 | ||
e9174d1e | 2066 | * `debug_assertions` - Enabled by default when compiling without optimizations. |
62682a34 SL |
2067 | This can be used to enable extra debugging code in development but not in |
2068 | production. For example, it controls the behavior of the standard library's | |
2069 | `debug_assert!` macro. | |
e9174d1e | 2070 | * `target_arch = "..."` - Target CPU architecture, such as `"x86"`, `"x86_64"`, |
7453a54e | 2071 | `"mips"`, `"powerpc"`, `"powerpc64"`, `"arm"`, or `"aarch64"`. |
e9174d1e | 2072 | * `target_endian = "..."` - Endianness of the target CPU, either `"little"` or |
1a4d82fc | 2073 | `"big"`. |
e9174d1e SL |
2074 | * `target_env = "..."` - An option provided by the compiler by default |
2075 | describing the runtime environment of the target platform. Some examples of | |
2076 | this are `musl` for builds targeting the MUSL libc implementation, `msvc` for | |
2077 | Windows builds targeting MSVC, and `gnu` frequently the rest of the time. This | |
2078 | option may also be blank on some platforms. | |
2079 | * `target_family = "..."` - Operating system family of the target, e.g. |
1a4d82fc JJ |
2080 | `"unix"` or `"windows"`. The value of this configuration option is defined |
2081 | as a configuration itself, like `unix` or `windows`. | |
e9174d1e | 2082 | * `target_os = "..."` - Operating system of the target, examples include |
bd371182 | 2083 | `"windows"`, `"macos"`, `"ios"`, `"linux"`, `"android"`, `"freebsd"`, `"dragonfly"`, |
c1a9b12d | 2084 | `"bitrig"`, `"openbsd"` or `"netbsd"`. |
e9174d1e | 2085 | * `target_pointer_width = "..."` - Target pointer width in bits. This is set |
c34b1796 AL |
2086 | to `"32"` for targets with 32-bit pointers, and likewise set to `"64"` for |
2087 | 64-bit pointers. | |
b039eaaf SL |
2088 | * `target_vendor = "..."` - Vendor of the target, for example `"apple"`, `"pc"`, or |
2089 | simply `"unknown"`. | |
e9174d1e SL |
2090 | * `test` - Enabled when compiling the test harness (using the `--test` flag). |
2091 | * `unix` - See `target_family`. | |
2092 | * `windows` - See `target_family`. | |
1a4d82fc | 2093 | |
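A small sketch of how these implementation-defined options might be queried. It uses the `#[cfg(...)]` attribute on items and the standard `cfg!` macro, which evaluates a configuration predicate to a `bool` inside an expression. The function names are hypothetical, and only the pointer widths `"16"`, `"32"` and `"64"` are assumed to occur.

```rust
use std::mem;

// One definition per pointer width; only the matching one is compiled.
#[cfg(target_pointer_width = "16")]
pub fn pointer_width_bits() -> usize {
    16
}

#[cfg(target_pointer_width = "32")]
pub fn pointer_width_bits() -> usize {
    32
}

#[cfg(target_pointer_width = "64")]
pub fn pointer_width_bits() -> usize {
    64
}

// `cfg!` yields a compile-time boolean, so both branches must type-check
// on every target, unlike items removed by `#[cfg(...)]`.
pub fn endianness() -> &'static str {
    if cfg!(target_endian = "little") {
        "little"
    } else {
        "big"
    }
}

// Cross-check the configuration value against the size of `usize`.
pub fn pointer_size_matches() -> bool {
    pointer_width_bits() == mem::size_of::<usize>() * 8
}
```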
bd371182 AL |
2094 | You can also set another attribute based on a `cfg` variable with `cfg_attr`: |
2095 | ||
2096 | ```rust,ignore | |
2097 | #[cfg_attr(a, b)] | |
2098 | ``` | |
2099 | ||
2100 | The attribute above is the same as `#[b]` if `a` is set by `cfg`, and expands to nothing otherwise. |
2101 | ||
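To make the behaviour of `cfg_attr` concrete, here is a hedged sketch. The empty predicates `all()` and `any()` are used only because they are always true and always false respectively, which makes the outcome independent of the build configuration; real code would more likely name an option such as `debug_assertions`. The type and function names are hypothetical.

```rust
// `all()` with no arguments is always true, so this is the same as
// writing `#[derive(Clone, PartialEq)]` directly.
#[cfg_attr(all(), derive(Clone, PartialEq))]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

// `any()` with no arguments is always false, so the attribute expands to
// nothing and `Opaque` does not implement `Clone`.
#[cfg_attr(any(), derive(Clone))]
pub struct Opaque;

// A more typical use: derive `Debug` only in builds that have
// `debug_assertions` enabled.
#[cfg_attr(debug_assertions, derive(Debug))]
pub struct Settings {
    pub verbose: bool,
}

// Relies on the `Clone` and `PartialEq` impls that `cfg_attr` produced.
pub fn points_equal_after_clone(p: &Point) -> bool {
    p.clone() == *p
}
```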
1a4d82fc JJ |
2102 | ### Lint check attributes |
2103 | ||
2104 | A lint check names a potentially undesirable coding pattern, such as | |
2105 | unreachable code or omitted documentation, for the static entity to which the | |
2106 | attribute applies. | |
2107 | ||
2108 | For any lint check `C`: | |
2109 | ||
2110 | * `allow(C)` overrides the check for `C` so that violations will go | |
2111 | unreported, | |
2112 | * `deny(C)` signals an error after encountering a violation of `C`, | |
2113 | * `forbid(C)` is the same as `deny(C)`, but also forbids changing the lint | |
2114 | level afterwards, | |
2115 | * `warn(C)` warns about violations of `C` but continues compilation. | |
2116 | ||
2117 | The lint checks supported by the compiler can be found via `rustc -W help`, | |
2118 | along with their default settings. [Compiler | |
bd371182 | 2119 | plugins](book/compiler-plugins.html#lint-plugins) can provide additional lint checks. |
1a4d82fc JJ |
2120 | |
2121 | ```{.ignore} | |
7453a54e | 2122 | pub mod m1 { |
1a4d82fc JJ |
2123 | // Missing documentation is ignored here |
2124 | #[allow(missing_docs)] | |
85aaf69f | 2125 | pub fn undocumented_one() -> i32 { 1 } |
1a4d82fc JJ |
2126 | |
2127 | // Missing documentation signals a warning here | |
2128 | #[warn(missing_docs)] | |
85aaf69f | 2129 | pub fn undocumented_too() -> i32 { 2 } |
1a4d82fc JJ |
2130 | |
2131 | // Missing documentation signals an error here | |
2132 | #[deny(missing_docs)] | |
85aaf69f | 2133 | pub fn undocumented_end() -> i32 { 3 } |
1a4d82fc JJ |
2134 | } |
2135 | ``` | |
2136 | ||
2137 | This example shows how one can use `allow` and `warn` to toggle a particular | |
85aaf69f | 2138 | check on and off: |
1a4d82fc JJ |
2139 | |
2140 | ```{.ignore} | |
2141 | #[warn(missing_docs)] | |
7453a54e | 2142 | pub mod m2 { |
1a4d82fc | 2143 | #[allow(missing_docs)] |
7453a54e | 2144 | pub mod nested { |
1a4d82fc | 2145 | // Missing documentation is ignored here |
85aaf69f | 2146 | pub fn undocumented_one() -> i32 { 1 } |
1a4d82fc JJ |
2147 | |
2148 | // Missing documentation signals a warning here, | |
2149 | // despite the allow above. | |
2150 | #[warn(missing_docs)] | |
85aaf69f | 2151 | pub fn undocumented_two() -> i32 { 2 } |
1a4d82fc JJ |
2152 | } |
2153 | ||
2154 | // Missing documentation signals a warning here | |
85aaf69f | 2155 | pub fn undocumented_too() -> i32 { 3 } |
1a4d82fc JJ |
2156 | } |
2157 | ``` | |
2158 | ||
2159 | This example shows how one can use `forbid` to disallow uses of `allow` for | |
85aaf69f | 2160 | that lint check: |
1a4d82fc JJ |
2161 | |
2162 | ```{.ignore} | |
2163 | #[forbid(missing_docs)] | |
7453a54e | 2164 | pub mod m3 { |
1a4d82fc JJ |
2165 | // Attempting to toggle warning signals an error here |
2166 | #[allow(missing_docs)] | |
2167 | /// Returns 2. | |
85aaf69f | 2168 | pub fn undocumented_too() -> i32 { 2 } |
1a4d82fc JJ |
2169 | } |
2170 | ``` | |
2171 | ||
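The examples above are marked as ignored because some of them are meant to fail. As a complementary sketch that is expected to compile, the following uses the `unused_variables` lint instead of `missing_docs`; the module and function names are hypothetical.

```rust
// Unused variables are errors anywhere inside `strict`...
#[deny(unused_variables)]
pub mod strict {
    // Fine: every parameter is used.
    pub fn sum(a: i32, b: i32) -> i32 {
        a + b
    }

    // ...except where an inner `allow` overrides the check. This works
    // because the outer level is `deny`, not `forbid`.
    #[allow(unused_variables)]
    pub fn first(a: i32, b: i32) -> i32 {
        a
    }
}
```

Replacing `deny` with `forbid` on the module would turn the inner `allow` into an error, as in the `m3` example.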
2172 | ### Language items | |
2173 | ||
2174 | Some primitive Rust operations are defined in Rust code, rather than being | |
2175 | implemented directly in C or assembly language. The definitions of these | |
2176 | operations have to be easy for the compiler to find. The `lang` attribute | |
2177 | makes it possible to declare these operations. For example, the `str` module | |
2178 | in the Rust standard library defines the string equality function: | |
2179 | ||
2180 | ```{.ignore} | |
bd371182 | 2181 | #[lang = "str_eq"] |
1a4d82fc JJ |
2182 | pub fn eq_slice(a: &str, b: &str) -> bool { |
2183 | // details elided | |
2184 | } | |
2185 | ``` | |
2186 | ||
2187 | The name `str_eq` has a special meaning to the Rust compiler, and the presence | |
2188 | of this definition means that it will use this definition when generating calls | |
2189 | to the string equality function. | |
2190 | ||
bd371182 AL |
2191 | The set of language items is currently considered unstable. A complete |
2192 | list of the built-in language items will be added in the future. | |
1a4d82fc JJ |
2193 | |
2194 | ### Inline attributes | |
2195 | ||
bd371182 AL |
2196 | The inline attribute suggests that the compiler should place a copy of |
2197 | the function or static in the caller, rather than generating code to | |
2198 | call the function or access the static where it is defined. | |
1a4d82fc JJ |
2199 | |
2200 | The compiler automatically inlines functions based on internal heuristics. | |
bd371182 | 2201 | Incorrectly inlining functions can actually make the program slower, so this attribute |
1a4d82fc JJ |
2202 | should be used with care. |
2203 | ||
bd371182 AL |
2204 | `#[inline]` and `#[inline(always)]` always cause the function to be serialized |
2205 | into the crate metadata to allow cross-crate inlining. | |
1a4d82fc JJ |
2206 | |
2207 | There are three different types of inline attributes: | |
2208 | ||
2209 | * `#[inline]` suggests that the compiler perform an inline expansion. |
2210 | * `#[inline(always)]` asks the compiler to always perform an inline expansion. | |
2211 | * `#[inline(never)]` asks the compiler to never perform an inline expansion. | |
2212 | ||
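A brief sketch of the three forms applied to functions. The function names are hypothetical, and whether any call is actually inlined remains the compiler's decision except where the attribute says otherwise; the attributes do not change what the functions compute.

```rust
// A hint: the compiler may inline calls, including across crates.
#[inline]
pub fn add_one(x: u32) -> u32 {
    x.wrapping_add(1)
}

// A stronger request to inline at every call site.
#[inline(always)]
pub fn double(x: u32) -> u32 {
    x.wrapping_mul(2)
}

// A request to keep this function out of line, for example because it
// sits on a rarely taken path.
#[inline(never)]
pub fn report(x: u32) -> String {
    format!("value = {}", x)
}
```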
85aaf69f | 2213 | ### `derive` |
1a4d82fc | 2214 | |
85aaf69f | 2215 | The `derive` attribute allows certain traits to be automatically implemented |
1a4d82fc JJ |
2216 | for data structures. For example, the following will create an `impl` for the |
2217 | `PartialEq` and `Clone` traits for `Foo`; the type parameter `T` will be given |
2218 | the `PartialEq` or `Clone` constraints for the appropriate `impl`: | |
2219 | ||
2220 | ``` | |
85aaf69f | 2221 | #[derive(PartialEq, Clone)] |
1a4d82fc | 2222 | struct Foo<T> { |
85aaf69f | 2223 | a: i32, |
1a4d82fc JJ |
2224 | b: T |
2225 | } | |
2226 | ``` | |
2227 | ||
2228 | The generated `impl` for `PartialEq` is equivalent to | |
2229 | ||
2230 | ``` | |
85aaf69f | 2231 | # struct Foo<T> { a: i32, b: T } |
1a4d82fc JJ |
2232 | impl<T: PartialEq> PartialEq for Foo<T> { |
2233 | fn eq(&self, other: &Foo<T>) -> bool { | |
2234 | self.a == other.a && self.b == other.b | |
2235 | } | |
2236 | ||
2237 | fn ne(&self, other: &Foo<T>) -> bool { | |
2238 | self.a != other.a || self.b != other.b | |
2239 | } | |
2240 | } | |
2241 | ``` | |
2242 | ||
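As a usage sketch, the derived implementations can be exercised through generic helper functions whose bounds mirror the constraints placed on `T`. The helpers `unchanged_by_clone` and `differ` are hypothetical names introduced for illustration.

```rust
#[derive(PartialEq, Clone)]
pub struct Foo<T> {
    pub a: i32,
    pub b: T,
}

// Needs both derived impls: `clone` comes from `Clone`, `==` from `PartialEq`.
pub fn unchanged_by_clone<T: PartialEq + Clone>(foo: &Foo<T>) -> bool {
    foo.clone() == *foo
}

// `!=` dispatches to the `ne` method of the `PartialEq` impl.
pub fn differ<T: PartialEq>(x: &Foo<T>, y: &Foo<T>) -> bool {
    x != y
}
```

If `T` does not implement `PartialEq`, `Foo<T>` still exists as a type, but the derived `PartialEq` impl does not apply to it.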
1a4d82fc JJ |
2243 | ### Compiler Features |
2244 | ||
2245 | Certain aspects of Rust may be implemented in the compiler, but they're not | |
2246 | necessarily ready for every-day use. These features are often of "prototype | |
2247 | quality" or "almost production ready", but may not be stable enough to be | |
2248 | considered a full-fledged language feature. | |
2249 | ||
2250 | For this reason, Rust recognizes a special crate-level attribute of the form: | |
2251 | ||
2252 | ```{.ignore} | |
2253 | #![feature(feature1, feature2, feature3)] | |
2254 | ``` | |
2255 | ||
2256 | This directive informs the compiler that the feature list: `feature1`, | |
2257 | `feature2`, and `feature3` should all be enabled. This is only recognized at the |
2258 | crate level, not at the module level. Without this directive, all features are |
2259 | considered off, and using the features will result in a compiler error. | |
2260 | ||
2261 | The currently implemented features of the reference compiler are: | |
2262 | ||
c34b1796 | 2263 | * `advanced_slice_patterns` - See the [match expressions](#match-expressions) |
85aaf69f | 2264 | section for discussion; the exact semantics of |
c34b1796 | 2265 | slice patterns are subject to change, so some types |
b039eaaf | 2266 | are still unstable. |
c34b1796 AL |
2267 | |
2268 | * `slice_patterns` - OK, actually, slice patterns are just scary and | |
2269 | completely unstable. | |
85aaf69f | 2270 | |
1a4d82fc JJ |
2271 | * `asm` - The `asm!` macro provides a means for inline assembly. This is often |
2272 | useful, but the exact syntax for this feature along with its | |
2273 | semantics are likely to change, so this macro usage must be opted | |
2274 | into. | |
2275 | ||
bd371182 AL |
2276 | * `associated_consts` - Allows constants to be defined in `impl` and `trait` |
2277 | blocks, so that they can be associated with a type or | |
2278 | trait in a similar manner to methods and associated | |
2279 | types. | |
85aaf69f SL |
2280 | |
2281 | * `box_patterns` - Allows `box` patterns, the exact semantics of which | |
2282 | is subject to change. | |
2283 | ||
2284 | * `box_syntax` - Allows use of `box` expressions, the exact semantics of which | |
2285 | is subject to change. | |
2286 | ||
b039eaaf SL |
2287 | * `cfg_target_vendor` - Allows conditional compilation using the `target_vendor` |
2288 | matcher which is subject to change. | |
2289 | ||
1a4d82fc JJ |
2290 | * `concat_idents` - Allows use of the `concat_idents` macro, which is in many |
2291 | ways insufficient for concatenating identifiers, and may be | |
2292 | removed entirely for something more wholesome. | |
2293 | ||
85aaf69f | 2294 | * `custom_attribute` - Allows the usage of attributes unknown to the compiler |
bd371182 | 2295 | so that new attributes can be added in a backwards compatible |
85aaf69f | 2296 | manner (RFC 572). |
1a4d82fc | 2297 | |
c34b1796 AL |
2298 | * `custom_derive` - Allows the use of `#[derive(Foo,Bar)]` as sugar for |
2299 | `#[derive_Foo] #[derive_Bar]`, which can be user-defined syntax | |
2300 | extensions. | |
2301 | ||
54a0048b SL |
2302 | * `inclusive_range_syntax` - Allows use of the `a...b` and `...b` syntax for inclusive ranges. |
2303 | ||
2304 | * `inclusive_range` - Allows use of the types that represent desugared inclusive ranges. | |
2305 | ||
1a4d82fc JJ |
2306 | * `intrinsics` - Allows use of the "rust-intrinsics" ABI. Compiler intrinsics |
2307 | are inherently unstable and no promise about them is made. | |
2308 | ||
2309 | * `lang_items` - Allows use of the `#[lang]` attribute. Like `intrinsics`, | |
2310 | lang items are inherently unstable and no promise about them | |
2311 | is made. | |
2312 | ||
2313 | * `link_args` - This attribute is used to specify custom flags to the linker, | |
2314 | but usage is strongly discouraged. The compiler's usage of the | |
2315 | system linker is not guaranteed to continue in the future, and | |
2316 | if the system linker is not used then specifying custom flags | |
2317 | doesn't have much meaning. | |
2318 | ||
2319 | * `link_llvm_intrinsics` - Allows linking to LLVM intrinsics via |
2320 | `#[link_name="llvm.*"]`. | |
2321 | ||
2322 | * `linkage` - Allows use of the `linkage` attribute, which is not portable. | |
2323 | ||
2324 | * `log_syntax` - Allows use of the `log_syntax` macro attribute, which is a | |
2325 | nasty hack that will certainly be removed. | |
2326 | ||
85aaf69f | 2327 | * `main` - Allows use of the `#[main]` attribute, which changes the entry point |
bd371182 | 2328 | into a Rust program. This capability is subject to change. |
85aaf69f SL |
2329 | |
2330 | * `macro_reexport` - Allows macros to be re-exported from one crate after being imported | |
2331 | from another. This feature was originally designed with the sole | |
2332 | use case of the Rust standard library in mind, and is subject to | |
2333 | change. | |
2334 | ||
1a4d82fc JJ |
2335 | * `non_ascii_idents` - The compiler supports the use of non-ascii identifiers, |
2336 | but the implementation is a little rough around the | |
2337 | edges, so this can be seen as an experimental feature | |
2338 | for now until the specification of identifiers is fully | |
2339 | fleshed out. | |
2340 | ||
85aaf69f SL |
2341 | * `no_std` - Allows the `#![no_std]` crate attribute, which disables the implicit |
2342 | `extern crate std`. This typically requires use of the unstable APIs | |
2343 | behind the libstd "facade", such as libcore and libcollections. It | |
2344 | may also cause problems when using syntax extensions, including | |
2345 | `#[derive]`. | |
2346 | ||
2347 | * `on_unimplemented` - Allows the `#[rustc_on_unimplemented]` attribute, which allows | |
2348 | trait definitions to add specialized notes to error messages | |
2349 | when an implementation was expected but not found. | |
2350 | ||
2351 | * `optin_builtin_traits` - Allows the definition of default and negative trait | |
2352 | implementations. Experimental. | |
1a4d82fc JJ |
2353 | |
2354 | * `plugin` - Usage of [compiler plugins][plugin] for custom lints or syntax extensions. | |
2355 | These depend on compiler internals and are subject to change. | |
2356 | ||
2357 | * `plugin_registrar` - Indicates that a crate provides [compiler plugins][plugin]. | |
2358 | ||
2359 | * `quote` - Allows use of the `quote_*!` family of macros, which are | |
2360 | implemented very poorly and will likely change significantly | |
2361 | with a proper implementation. | |
2362 | ||
85aaf69f SL |
2363 | * `rustc_attrs` - Gates internal `#[rustc_*]` attributes which may be |
2364 | for internal use only or have meaning added to them in the future. | |
2365 | ||
1a4d82fc JJ |
2366 | * `rustc_diagnostic_macros` - A mysterious feature, used in the implementation |
2367 | of rustc, not meant for mortals. | |
2368 | ||
2369 | * `simd` - Allows use of the `#[simd]` attribute, which is overly simple and | |
2370 | not the SIMD interface we want to expose in the long term. | |
2371 | ||
85aaf69f SL |
2372 | * `simd_ffi` - Allows use of SIMD vectors in signatures for foreign functions. |
2373 | The SIMD interface is subject to change. | |
2374 | ||
85aaf69f | 2375 | * `start` - Allows use of the `#[start]` attribute, which changes the entry point |
bd371182 | 2376 | into a Rust program. This capability, especially the signature for the |
85aaf69f SL |
2377 | annotated function, is subject to change. |
2378 | ||
1a4d82fc JJ |
2379 | * `thread_local` - The usage of the `#[thread_local]` attribute is experimental |
2380 | and should be seen as unstable. This attribute is used to | |
2381 | declare a `static` as being unique per-thread leveraging | |
2382 | LLVM's implementation which works in concert with the kernel | |
2383 | loader and dynamic linker. This is not necessarily available | |
85aaf69f | 2384 | on all platforms, and usage of it is discouraged. |
1a4d82fc JJ |
2385 | |
2386 | * `trace_macros` - Allows use of the `trace_macros` macro, which is a nasty | |
2387 | hack that will certainly be removed. | |
2388 | ||
2389 | * `unboxed_closures` - Rust's new closure design, which is currently a work in | |
2390 | progress feature with many known bugs. | |
2391 | ||
85aaf69f SL |
2392 | * `unsafe_no_drop_flag` - Allows use of the `#[unsafe_no_drop_flag]` attribute, |
2393 | which removes the hidden flag added to a type that |
2394 | implements the `Drop` trait. The design for the | |
2395 | `Drop` flag is subject to change, and this feature | |
2396 | may be removed in the future. | |
2397 | ||
2398 | * `unmarked_api` - Allows use of items within a `#![staged_api]` crate | |
2399 | which have not been marked with a stability marker. | |
2400 | Such items should not be allowed by the compiler to exist, | |
2401 | so if you need this there probably is a compiler bug. | |
2402 | ||
c34b1796 AL |
2403 | * `allow_internal_unstable` - Allows `macro_rules!` macros to be tagged with the |
2404 | `#[allow_internal_unstable]` attribute, designed | |
2405 | to allow `std` macros to call | |
2406 | `#[unstable]`/feature-gated functionality | |
2407 | internally without imposing on callers | |
2408 | (i.e. making them behave like function calls in | |
2409 | terms of encapsulation). | |
c1a9b12d SL |
2410 | * `default_type_parameter_fallback` - Allows type parameter defaults to |
2411 | influence type inference. | |
c34b1796 | 2412 | |
92a42be0 SL |
2413 | * `stmt_expr_attributes` - Allows attributes on expressions and |
2414 | non-item statements. | |
2415 | ||
9cc50fc6 SL |
2416 | * `deprecated` - Allows using the `#[deprecated]` attribute. |
2417 | ||
2418 | * `type_ascription` - Allows type ascription expressions `expr: Type`. |
2419 | ||
2420 | * `abi_vectorcall` - Allows the usage of the vectorcall calling convention |
2421 | (e.g. `extern "vectorcall" fn fn_();`) |
2422 | ||
1a4d82fc | 2423 | If a feature is promoted to a language feature, then all existing programs will |
bd371182 | 2424 | start to receive compilation warnings about `#![feature]` directives which enabled |
1a4d82fc JJ |
2425 | the new feature (because the directive is no longer necessary). However, if it is |
2426 | decided to remove a feature from the language, errors will be issued (if |
2427 | there isn't a parser error first). The directive in this case is no longer | |
2428 | necessary, and it's likely that existing code will break if the feature isn't | |
2429 | removed. | |
2430 | ||
2431 | If an unknown feature is found in a directive, it results in a compiler error. | |
2432 | An unknown feature is one which has never been recognized by the compiler. | |
2433 | ||
2434 | # Statements and expressions | |
2435 | ||
2436 | Rust is _primarily_ an expression language. This means that most forms of | |
2437 | value-producing or effect-causing evaluation are directed by the uniform syntax | |
2438 | category of _expressions_. Each kind of expression can typically _nest_ within | |
2439 | each other kind of expression, and rules for evaluation of expressions involve | |
2440 | specifying both the value produced by the expression and the order in which its | |
2441 | sub-expressions are themselves evaluated. | |
2442 | ||
2443 | In contrast, statements in Rust serve _mostly_ to contain and explicitly | |
2444 | sequence expression evaluation. | |
2445 | ||
2446 | ## Statements | |
2447 | ||
2448 | A _statement_ is a component of a block, which is in turn a component of an | |
2449 | outer [expression](#expressions) or [function](#functions). | |
2450 | ||
2451 | Rust has two kinds of statement: [declaration | |
2452 | statements](#declaration-statements) and [expression | |
2453 | statements](#expression-statements). | |
2454 | ||
2455 | ### Declaration statements | |
2456 | ||
2457 | A _declaration statement_ is one that introduces one or more *names* into the | |
bd371182 | 2458 | enclosing statement block. The declared names may denote new variables or new |
1a4d82fc JJ |
2459 | items. |
2460 | ||
2461 | #### Item declarations | |
2462 | ||
2463 | An _item declaration statement_ has a syntactic form identical to an | |
2464 | [item](#items) declaration within a module. Declaring an item — a | |
b039eaaf | 2465 | function, enumeration, struct, type, static, trait, implementation or module |
1a4d82fc JJ |
2466 | — locally within a statement block is simply a way of restricting its |
2467 | scope to a narrow region containing all of its uses; it is otherwise identical | |
2468 | in meaning to declaring the item outside the statement block. | |
2469 | ||
2470 | > **Note**: there is no implicit capture of the function's dynamic environment when | |
2471 | > declaring a function-local item. | |
2472 | ||
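A minimal sketch of item declaration statements. The names `clamp_doubled`, `LIMIT` and `double` are hypothetical. The nested function illustrates the note above: it cannot refer to `input`, so the value has to be passed as an argument.

```rust
pub fn clamp_doubled(input: i32) -> i32 {
    // Items declared inside the block are visible only within it.
    const LIMIT: i32 = 100;

    // A function-local item. It does not capture `input` or any other
    // local variable; referring to `input` here would be an error.
    fn double(x: i32) -> i32 {
        x * 2
    }

    let doubled = double(input);
    if doubled > LIMIT { LIMIT } else { doubled }
}
```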
92a42be0 | 2473 | #### `let` statements |
1a4d82fc | 2474 | |
92a42be0 | 2475 | A _`let` statement_ introduces a new set of variables, given by a pattern. The |
1a4d82fc JJ |
2476 | pattern may be followed by a type annotation, and/or an initializer expression. |
2477 | When no type annotation is given, the compiler will infer the type, or signal | |
2478 | an error if insufficient type information is available for definite inference. | |
bd371182 | 2479 | Any variables introduced by a variable declaration are visible from the point of |
1a4d82fc JJ |
2480 | declaration until the end of the enclosing block scope. |
2481 | ||
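The forms described above can be sketched as follows; the function name `let_forms` and the values are hypothetical.

```rust
pub fn let_forms() -> (i32, i32, u8, &'static str) {
    // A pattern introducing two variables at once; their types are inferred.
    let (a, b) = (1, 2);

    // An explicit type annotation together with an initializer.
    let c: u8 = 7;

    // No annotation and no initializer: the type is inferred from the
    // later assignment, which must happen before the first use.
    let d;
    d = "late";

    (a, b, c, d)
}
```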
2482 | ### Expression statements | |
2483 | ||
2484 | An _expression statement_ is one that evaluates an [expression](#expressions) | |
2485 | and ignores its result. The type of an expression statement `e;` is always | |
2486 | `()`, regardless of the type of `e`. As a rule, an expression statement's | |
2487 | purpose is to trigger the effects of evaluating its expression. | |
2488 | ||
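The effect of the trailing semicolon can be seen by comparing two blocks that differ only in that respect. This is a sketch with hypothetical names `add` and `tail_vs_statement`.

```rust
fn add(a: i32, b: i32) -> i32 {
    a + b
}

pub fn tail_vs_statement() -> (i32, ()) {
    // The block ends in an expression, so the block has that value.
    let with_tail = { add(1, 1) };

    // The same call as an expression statement: it is still evaluated,
    // but its result is discarded and the block has type `()`.
    let with_semicolon = { add(1, 1); };

    (with_tail, with_semicolon)
}
```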
2489 | ## Expressions | |
2490 | ||
2491 | An expression may have two roles: it always produces a *value*, and it may have | |
2492 | *effects* (otherwise known as "side effects"). An expression *evaluates to* a | |
2493 | value, and has effects during *evaluation*. Many expressions contain | |
2494 | sub-expressions (operands). The meaning of each kind of expression dictates | |
2495 | several things: | |
2496 | ||
2497 | * Whether or not to evaluate the sub-expressions when evaluating the expression | |
2498 | * The order in which to evaluate the sub-expressions | |
2499 | * How to combine the sub-expressions' values to obtain the value of the expression | |
2500 | ||
2501 | In this way, the structure of expressions dictates the structure of execution. | |
2502 | Blocks are just another kind of expression, so blocks, statements, expressions, | |
2503 | and blocks again can recursively nest inside each other to an arbitrary depth. | |
2504 | ||
2505 | #### Lvalues, rvalues and temporaries | |
2506 | ||
2507 | Expressions are divided into two main categories: _lvalues_ and _rvalues_. | |
2508 | Likewise within each expression, sub-expressions may occur in _lvalue context_ | |
2509 | or _rvalue context_. The evaluation of an expression depends both on its own | |
2510 | category and the context it occurs within. | |
2511 | ||
2512 | An lvalue is an expression that represents a memory location. These expressions | |
2513 | are [paths](#path-expressions) (which refer to local variables, function and | |
2514 | method arguments, or static variables), dereferences (`*expr`), [indexing | |
2515 | expressions](#index-expressions) (`expr[expr]`), and [field | |
2516 | references](#field-expressions) (`expr.f`). All other expressions are rvalues. | |
2517 | ||
2518 | The left operand of an [assignment](#assignment-expressions) or | |
bd371182 AL |
2519 | [compound-assignment](#compound-assignment-expressions) expression is |
2520 | an lvalue context, as is the single operand of a unary | |
2521 | [borrow](#unary-operator-expressions). The discriminant or subject of | |
2522 | a [match expression](#match-expressions) may be an lvalue context, if | |
2523 | ref bindings are made, but is otherwise an rvalue context. All other | |
2524 | expression contexts are rvalue contexts. | |
1a4d82fc JJ |
2525 | |
2526 | When an lvalue is evaluated in an _lvalue context_, it denotes a memory | |
2527 | location; when evaluated in an _rvalue context_, it denotes the value held _in_ | |
2528 | that memory location. | |
2529 | ||
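As an illustration, the same field expression can appear in both kinds of context. The function name `lvalue_contexts` and the values are hypothetical.

```rust
pub fn lvalue_contexts() -> (i32, i32) {
    let mut pair = (1, 2);

    // Rvalue context: `pair.0` denotes the value held in the location.
    let old_first = pair.0;

    // Lvalue context (left operand of an assignment): `pair.0` denotes
    // the location itself.
    pair.0 = 10;

    // Lvalue context (operand of a borrow).
    let second = &mut pair.1;

    // A dereference is itself an lvalue, so it can be assigned through.
    *second += 5;

    (old_first + pair.0, pair.1)
}
```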
bd371182 AL |
2530 | ##### Temporary lifetimes |
2531 | ||
2532 | When an rvalue is used in an lvalue context, a temporary un-named | |
2533 | lvalue is created and used instead. The lifetime of temporary values | |
2534 | is typically the innermost enclosing statement; the tail expression of | |
2535 | a block is considered part of the statement that encloses the block. | |
2536 | ||
2537 | When a temporary rvalue is being created that is assigned into a `let` | |
2538 | declaration, however, the temporary is created with the lifetime of | |
2539 | the enclosing block instead, as using the enclosing statement (the | |
2540 | `let` declaration) would be a guaranteed error (since a pointer to the | |
2541 | temporary would be stored into a variable, but the temporary would be | |
2542 | freed before the variable could be used). The compiler uses simple | |
2543 | syntactic rules to decide which values are being assigned into a `let` | |
2544 | binding, and therefore deserve a longer temporary lifetime. | |
2545 | ||
2546 | Here are some examples: | |
2547 | ||
2548 | - `let x = foo(&temp())`. The expression `temp()` is an rvalue. As it | |
2549 | is being borrowed, a temporary is created which will be freed after | |
2550 | the innermost enclosing statement (the `let` declaration, in this case). | |
2551 | - `let x = temp().foo()`. This is the same as the previous example, | |
2552 | except that the value of `temp()` is being borrowed via autoref on a | |
2553 | method-call. Here we are assuming that `foo()` is an `&self` method | |
2554 | defined in some trait, say `Foo`. In other words, the expression | |
2555 | `temp().foo()` is equivalent to `Foo::foo(&temp())`. | |
2556 | - `let x = &temp()`. Here, the same temporary is being assigned into | |
2557 | `x`, rather than being passed as a parameter, and hence the | |
2558 | temporary's lifetime is considered to be the enclosing block. | |
2559 | - `let x = SomeStruct { foo: &temp() }`. As in the previous case, the | |
2560 | temporary is assigned into a struct which is then assigned into a | |
2561 | binding, and hence it is given the lifetime of the enclosing block. | |
2562 | - `let x = [ &temp() ]`. As in the previous case, the | |
2563 | temporary is assigned into an array which is then assigned into a | |
2564 | binding, and hence it is given the lifetime of the enclosing block. | |
2565 | - `let ref x = temp()`. In this case, the temporary is created using a ref binding, | |
2566 | but the result is the same: the lifetime is extended to the enclosing block. | |
1a4d82fc JJ |
2567 | |
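Several of the cases listed above can be checked with a short sketch. The names `temp`, `Holder` and `temporaries` are hypothetical; `temp` simply returns a fresh `String` each time it is called.

```rust
pub struct Holder<'a> {
    pub value: &'a String,
}

fn temp() -> String {
    String::from("tmp")
}

pub fn temporaries() -> usize {
    // Borrowed via autoref for the method call; the temporary is freed
    // at the end of this `let` statement, which is fine because only the
    // resulting `usize` is kept.
    let n = temp().len();

    // In each of the following, the temporary is given the lifetime of
    // the enclosing block, so the references remain usable below.
    let a = &temp();
    let b = Holder { value: &temp() };
    let c = [&temp()];
    let ref d = temp();

    n + a.len() + b.value.len() + c[0].len() + d.len()
}
```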
2568 | #### Moved and copied types | |
2569 | ||
bd371182 | 2570 | When a [local variable](#variables) is used as an |
b039eaaf | 2571 | [rvalue](#lvalues-rvalues-and-temporaries), the variable will be copied |
c1a9b12d | 2572 | if its type implements `Copy`. All others are moved. |
1a4d82fc JJ |
2573 | |
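A short sketch of the difference, using `i32` (which implements `Copy`) and `String` (which does not). The function name `copy_and_move` is hypothetical.

```rust
pub fn copy_and_move() -> (i32, String) {
    let n = 5;
    // `i32` implements `Copy`, so `n` is copied and stays usable.
    let m = n;

    let s = String::from("hi");
    // `String` does not implement `Copy`, so `s` is moved into `t`.
    // Using `s` after this line would be a compile-time error.
    let t = s;

    (n + m, t)
}
```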
2574 | ### Literal expressions | |
2575 | ||
2576 | A _literal expression_ consists of one of the [literal](#literals) forms | |
2577 | described earlier. It directly describes a number, character, string, boolean | |
2578 | value, or the unit value. | |
2579 | ||
2580 | ```{.literals} | |
2581 | (); // unit type | |
2582 | "hello"; // string type | |
2583 | '5'; // character type | |
2584 | 5; // integer type | |
2585 | ``` | |
2586 | ||
2587 | ### Path expressions | |
2588 | ||
2589 | A [path](#paths) used in an expression context denotes either a local variable |
b039eaaf | 2590 | or an item. Path expressions are [lvalues](#lvalues-rvalues-and-temporaries). |
1a4d82fc JJ |
2591 | |
2592 | ### Tuple expressions | |
2593 | ||
2594 | Tuples are written by enclosing zero or more comma-separated expressions in | |
2595 | parentheses. They are used to create [tuple-typed](#tuple-types) values. | |
2596 | ||
2597 | ```{.tuple} | |
1a4d82fc | 2598 | (0.0, 4.5); |
bd371182 | 2599 | ("a", 4usize, true); |
1a4d82fc JJ |
2600 | ``` |
2601 | ||
bd371182 AL |
2602 | You can disambiguate a single-element tuple from a value in parentheses with a |
2603 | comma: | |
1a4d82fc | 2604 | |
bd371182 AL |
2605 | ``` |
2606 | (0,); // single-element tuple | |
2607 | (0); // zero in parentheses | |
2608 | ``` | |
1a4d82fc | 2609 | |
b039eaaf | 2610 | ### Struct expressions |
1a4d82fc | 2611 | |
b039eaaf SL |
2612 | There are several forms of struct expressions. A _struct expression_ |
2613 | consists of the [path](#paths) of a [struct item](#structs), followed by | |
1a4d82fc | 2614 | a brace-enclosed list of one or more comma-separated name-value pairs, |
b039eaaf | 2615 | providing the field values of a new instance of the struct. A field name |
1a4d82fc | 2616 | can be any identifier, and is separated from its value expression by a colon. |
b039eaaf SL |
2617 | The location denoted by a struct field is mutable if and only if the |
2618 | enclosing struct is mutable. | |
1a4d82fc | 2619 | |
b039eaaf SL |
2620 | A _tuple struct expression_ consists of the [path](#paths) of a [struct |
2621 | item](#structs), followed by a parenthesized list of one or more | |
2622 | comma-separated expressions (in other words, the path of a struct item | |
2623 | followed by a tuple expression). The struct item must be a tuple struct | |
1a4d82fc JJ |
2624 | item. |
2625 | ||
b039eaaf SL |
2626 | A _unit-like struct expression_ consists only of the [path](#paths) of a |
2627 | [struct item](#structs). | |
1a4d82fc | 2628 | |
b039eaaf | 2629 | The following are examples of struct expressions: |
1a4d82fc JJ |
2630 | |
2631 | ``` | |
2632 | # struct Point { x: f64, y: f64 } | |
2633 | # struct TuplePoint(f64, f64); | |
c34b1796 | 2634 | # mod game { pub struct User<'a> { pub name: &'a str, pub age: u32, pub score: usize } } |
1a4d82fc JJ |
2635 | # struct Cookie; fn some_fn<T>(t: T) {} |
2636 | Point {x: 10.0, y: 20.0}; | |
2637 | TuplePoint(10.0, 20.0); | |
2638 | let u = game::User {name: "Joe", age: 35, score: 100_000}; | |
2639 | some_fn::<Cookie>(Cookie); | |
2640 | ``` | |
2641 | ||
b039eaaf SL |
2642 | A struct expression forms a new value of the named struct type. Note |
2643 | that for a given *unit-like* struct type, this will always be the same | |
1a4d82fc JJ |
2644 | value. |
2645 | ||
b039eaaf | 2646 | A struct expression can terminate with the syntax `..` followed by an |
1a4d82fc | 2647 | expression to denote a functional update. The expression following `..` (the |
b039eaaf SL |
2648 | base) must have the same struct type as the new struct type being formed. |
2649 | The entire expression denotes the result of constructing a new struct (with | |
1a4d82fc JJ |
2650 | the same type as the base expression) with the given values for the fields that |
2651 | were explicitly specified and the values in the base expression for all other | |
2652 | fields. | |
2653 | ||
2654 | ``` | |
85aaf69f | 2655 | # struct Point3d { x: i32, y: i32, z: i32 } |
1a4d82fc JJ |
2656 | let base = Point3d {x: 1, y: 2, z: 3}; |
2657 | Point3d {y: 0, z: 10, .. base}; | |
2658 | ``` | |
2659 | ||
2660 | ### Block expressions | |
2661 | ||
1a4d82fc | 2662 | A _block expression_ is similar to a module in terms of the declarations that |
85aaf69f | 2663 | are possible. Each block conceptually introduces a new namespace scope. Use |
1a4d82fc JJ |
2664 | items can bring new names into scopes and declared items are in scope for only |
2665 | the block itself. | |
2666 | ||
2667 | A block will execute each statement sequentially, and then execute the | |
85aaf69f SL |
2668 | expression (if given). If the block ends in a statement, its value is `()`: |
2669 | ||
2670 | ``` | |
2671 | let x: () = { println!("Hello."); }; | |
2672 | ``` | |
2673 | ||
2674 | If it ends in an expression, its value and type are that of the expression: | |
2675 | ||
2676 | ``` | |
2677 | let x: i32 = { println!("Hello."); 5 }; | |
2678 | ||
2679 | assert_eq!(5, x); | |
2680 | ``` | |
1a4d82fc JJ |
2681 | |
2682 | ### Method-call expressions | |
2683 | ||
1a4d82fc JJ |
2684 | A _method call_ consists of an expression followed by a single dot, an |
2685 | identifier, and a parenthesized expression-list. Method calls are resolved to | |
2686 | methods on specific traits, either statically dispatching to a method if the | |
2687 | exact `self`-type of the left-hand-side is known, or dynamically dispatching if | |
bd371182 | 2688 | the left-hand-side expression is an indirect [trait object](#trait-objects). |
1a4d82fc JJ |
2689 | |
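The two dispatch modes can be sketched as follows. `Shape`, `Square` and `Circle` are hypothetical names, and the trait object is written with the `dyn` keyword, which assumes a recent compiler.

```rust
// Hypothetical trait and types for illustration only.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Circle(f64);

impl Shape for Square {
    fn area(&self) -> f64 { self.0 * self.0 }
}

impl Shape for Circle {
    // Uses 3.0 in place of pi so the results stay exact.
    fn area(&self) -> f64 { 3.0 * self.0 * self.0 }
}

// The exact `self`-type is known, so the call is statically dispatched.
fn static_call(s: &Square) -> f64 {
    s.area()
}

// The receiver is a trait object, so the call is dynamically dispatched.
fn dynamic_call(s: &dyn Shape) -> f64 {
    s.area()
}
```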
2690 | ### Field expressions | |
2691 | ||
1a4d82fc JJ |
2692 | A _field expression_ consists of an expression followed by a single dot and an |
2693 | identifier, when not immediately followed by a parenthesized expression-list | |
2694 | (the latter is a [method call expression](#method-call-expressions)). A field | |
b039eaaf | 2695 | expression denotes a field of a [struct](#struct-types). |
1a4d82fc JJ |
2696 | |
2697 | ```{.ignore .field} | |
2698 | mystruct.myfield; | |
2699 | foo().x; | |
2700 | (Struct {a: 10, b: 20}).a; | |
2701 | ``` | |
2702 | ||
b039eaaf | 2703 | A field access is an [lvalue](#lvalues-rvalues-and-temporaries) referring to |
1a4d82fc JJ |
2704 | the value of that field. When the type providing the field inherits mutability, |
2705 | it can be [assigned](#assignment-expressions) to. | |
2706 | ||
bd371182 AL |
2707 | Also, if the type of the expression to the left of the dot is a |
2708 | pointer, it is automatically dereferenced as many times as necessary | |
2709 | to make the field access possible. In cases of ambiguity, we prefer | |
2710 | fewer autoderefs to more. | |
1a4d82fc JJ |
2711 | |
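A short sketch of automatic dereferencing and field assignment, using a hypothetical `Point` struct and helper functions that are assumptions made for this example:

```rust
// Hypothetical struct for illustration only.
struct Point { x: i32, y: i32 }

fn x_through_references(p: &&Point) -> i32 {
    // `p` is a reference to a reference; both levels are dereferenced
    // automatically to reach the field.
    p.x
}

fn y_through_box(p: Box<Point>) -> i32 {
    p.y // the box is dereferenced automatically
}

fn set_x(p: &mut Point, value: i32) {
    p.x = value; // the field is reached through a mutable reference, so it can be assigned to
}
```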
2712 | ### Array expressions | |
2713 | ||
b039eaaf | 2714 | An [array](#array-and-slice-types) _expression_ is written by enclosing zero |
1a4d82fc JJ |
2715 | or more comma-separated expressions of uniform type in square brackets. |
2716 | ||
85aaf69f | 2717 | In the `[expr ';' expr]` form, the expression after the `';'` must be a |
1a4d82fc JJ |
2718 | constant expression that can be evaluated at compile time, such as a |
2719 | [literal](#literals) or a [static item](#static-items). | |
2720 | ||
2721 | ``` | |
85aaf69f | 2722 | [1, 2, 3, 4]; |
1a4d82fc | 2723 | ["a", "b", "c", "d"]; |
85aaf69f | 2724 | [0; 128]; // array with 128 zeros |
1a4d82fc JJ |
2725 | [0u8, 0u8, 0u8, 0u8]; |
2726 | ``` | |
2727 | ||
2728 | ### Index expressions | |
2729 | ||
b039eaaf | 2730 | [Array](#array-and-slice-types)-typed expressions can be indexed by |
1a4d82fc | 2731 | writing a square-bracket-enclosed expression (the index) after them. When the |
b039eaaf | 2732 | array is mutable, the resulting [lvalue](#lvalues-rvalues-and-temporaries) can |
1a4d82fc JJ |
2733 | be assigned to. |
2734 | ||
2735 | Indices are zero-based, and are of type `usize`. Array access is | |
bd371182 AL |
2736 | bounds-checked at compile-time for constant arrays being accessed with a constant index value. |
2737 | Otherwise a check will be performed at run-time that will put the thread in a _panicked state_ if it fails. | |
1a4d82fc JJ |
2738 | |
2739 | ```{should-fail} | |
2740 | ([1, 2, 3, 4])[0]; | |
bd371182 AL |
2741 | |
2742 | let x = (["a", "b"])[10]; // compiler error: const index-expr is out of bounds | |
2743 | ||
2744 | let n = 10; | |
2745 | let y = (["a", "b"])[n]; // panics | |
2746 | ||
2747 | let arr = ["a", "b"]; | |
2748 | arr[10]; // panics | |
2749 | ``` | |
2750 | ||
2751 | Also, if the type of the expression to the left of the brackets is a | |
2752 | pointer, it is automatically dereferenced as many times as necessary | |
2753 | to make the indexing possible. In cases of ambiguity, we prefer fewer | |
2754 | autoderefs to more. | |
2755 | ||
2756 | ### Range expressions | |
2757 | ||
2758 | The `..` operator will construct an object of one of the `std::ops::Range` variants. | |
2759 | ||
2760 | ``` | |
2761 | 1..2; // std::ops::Range | |
2762 | 3..; // std::ops::RangeFrom | |
2763 | ..4; // std::ops::RangeTo | |
2764 | ..; // std::ops::RangeFull | |
2765 | ``` | |
2766 | ||
2767 | The following expressions are equivalent. | |
2768 | ||
2769 | ``` | |
2770 | let x = std::ops::Range {start: 0, end: 10}; | |
2771 | let y = 0..10; | |
2772 | ||
b039eaaf | 2773 | assert_eq!(x, y); |
1a4d82fc JJ |
2774 | ``` |
2775 | ||
54a0048b SL |
2776 | Similarly, the `...` operator will construct an object of one of the |
2777 | `std::ops::RangeInclusive` variants. | |
2778 | ||
2779 | ``` | |
2780 | # #![feature(inclusive_range_syntax)] | |
2781 | 1...2; // std::ops::RangeInclusive | |
2782 | ...4; // std::ops::RangeToInclusive | |
2783 | ``` | |
2784 | ||
2785 | The following expressions are equivalent. | |
2786 | ||
2787 | ``` | |
2788 | # #![feature(inclusive_range_syntax, inclusive_range)] | |
2789 | let x = std::ops::RangeInclusive::NonEmpty {start: 0, end: 10}; | |
2790 | let y = 0...10; | |
2791 | ||
2792 | assert_eq!(x, y); | |
2793 | ``` | |
2794 | ||
1a4d82fc JJ |
2795 | ### Unary operator expressions |
2796 | ||
bd371182 | 2797 | Rust defines the following unary operators. They are all written as prefix operators, |
85aaf69f | 2798 | before the expression they apply to. |
1a4d82fc JJ |
2799 | |
2800 | * `-` | |
54a0048b SL |
2801 | : Negation. Signed integer types and floating-point types support negation. It |
2802 | is an error to apply negation to unsigned types; for example, the compiler | |
2803 | rejects `-1u32`. | |
1a4d82fc JJ |
2804 | * `*` |
2805 | : Dereference. When applied to a [pointer](#pointer-types) it denotes the | |
2806 | pointed-to location. For pointers to mutable locations, the resulting | |
b039eaaf | 2807 | [lvalue](#lvalues-rvalues-and-temporaries) can be assigned to. |
1a4d82fc JJ |
2808 | On non-pointer types, it calls the `deref` method of the `std::ops::Deref` |
2809 | trait, or the `deref_mut` method of the `std::ops::DerefMut` trait (if | |
2810 | implemented by the type and required for an outer expression that will or | |
2811 | could mutate the dereference), and produces the result of dereferencing the | |
2812 | `&` or `&mut` borrowed pointer returned from the overload method. | |
1a4d82fc JJ |
2813 | * `!` |
2814 | : Logical negation. On the boolean type, this flips between `true` and | |
2815 | `false`. On integer types, this inverts the individual bits in the | |
2816 | two's complement representation of the value. | |
bd371182 AL |
2817 | * `&` and `&mut` |
2818 | : Borrowing. When applied to an lvalue, these operators produce a | |
2819 | reference (pointer) to the lvalue. The lvalue is also placed into | |
2820 | a borrowed state for the duration of the reference. For a shared | |
2821 | borrow (`&`), this implies that the lvalue may not be mutated, but | |
2822 | it may be read or shared again. For a mutable borrow (`&mut`), the | |
2823 | lvalue may not be accessed in any way until the borrow expires. | |
2824 | If the `&` or `&mut` operators are applied to an rvalue, a | |
2825 | temporary value is created; the lifetime of this temporary value | |
2826 | is defined by [syntactic rules](#temporary-lifetimes). | |
1a4d82fc JJ |
2827 | |
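The operators listed above can be exercised with a few small functions. This is only a sketch: `Wrapper` is a hypothetical type that overloads `*` by implementing `std::ops::Deref`.

```rust
use std::ops::Deref;

// Hypothetical wrapper type that overloads `*` through `Deref`.
struct Wrapper(i32);

impl Deref for Wrapper {
    type Target = i32;
    fn deref(&self) -> &i32 { &self.0 }
}

fn negate(x: i32) -> i32 { -x }

fn invert_bits(x: u8) -> u8 { !x }

fn invert_bool(b: bool) -> bool { !b }

// The first `*` dereferences the reference; the second calls `Wrapper::deref`.
fn unwrap_value(w: &Wrapper) -> i32 { **w }

// Dereferencing a pointer to a mutable location yields an assignable lvalue.
fn increment_through(r: &mut i32) { *r = *r + 1; }
```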
2828 | ### Binary operator expressions | |
2829 | ||
1a4d82fc JJ |
2830 | Binary operator expressions are given in terms of [operator |
2831 | precedence](#operator-precedence). | |
2832 | ||
2833 | #### Arithmetic operators | |
2834 | ||
2835 | Binary arithmetic expressions are syntactic sugar for calls to built-in traits, | |
2836 | defined in the `std::ops` module of the `std` library. This means that | |
2837 | arithmetic operators can be overridden for user-defined types. The default | |
2838 | meaning of the operators on standard types is given here. | |
2839 | ||
2840 | * `+` | |
2841 | : Addition and string concatenation. | |
2842 | Calls the `add` method on the `std::ops::Add` trait. | |
2843 | * `-` | |
2844 | : Subtraction. | |
2845 | Calls the `sub` method on the `std::ops::Sub` trait. | |
2846 | * `*` | |
2847 | : Multiplication. | |
2848 | Calls the `mul` method on the `std::ops::Mul` trait. | |
2849 | * `/` | |
2850 | : Quotient. | |
2851 | Calls the `div` method on the `std::ops::Div` trait. | |
2852 | * `%` | |
2853 | : Remainder. | |
2854 | Calls the `rem` method on the `std::ops::Rem` trait. | |
2855 | ||
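To illustrate overriding an arithmetic operator for a user-defined type, here is a minimal sketch. `Vec2` is a hypothetical type; the example shows that `a + b` and an explicit call to `add` produce the same value.

```rust
use std::ops::Add;

// Hypothetical two-component vector, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Vec2 { x: i32, y: i32 }

impl Add for Vec2 {
    type Output = Vec2;

    fn add(self, rhs: Vec2) -> Vec2 {
        Vec2 { x: self.x + rhs.x, y: self.y + rhs.y }
    }
}

// Uses the operator form.
fn sum_with_operator(a: Vec2, b: Vec2) -> Vec2 { a + b }

// Calls the trait method that the operator is sugar for.
fn sum_with_method(a: Vec2, b: Vec2) -> Vec2 { a.add(b) }
```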
2856 | #### Bitwise operators | |
2857 | ||
2858 | Like the [arithmetic operators](#arithmetic-operators), bitwise operators are | |
2859 | syntactic sugar for calls to methods of built-in traits. This means that | |
2860 | bitwise operators can be overridden for user-defined types. The default | |
62682a34 SL |
2861 | meaning of the operators on standard types is given here. Bitwise `&`, `|` and |
2862 | `^` applied to boolean arguments are equivalent to logical `&&`, `||` and `!=` | |
2863 | evaluated in non-lazy fashion. | |
1a4d82fc JJ |
2864 | |
2865 | * `&` | |
62682a34 | 2866 | : Bitwise AND. |
1a4d82fc JJ |
2867 | Calls the `bitand` method of the `std::ops::BitAnd` trait. |
2868 | * `|` | |
62682a34 | 2869 | : Bitwise inclusive OR. |
1a4d82fc JJ |
2870 | Calls the `bitor` method of the `std::ops::BitOr` trait. |
2871 | * `^` | |
62682a34 | 2872 | : Bitwise exclusive OR. |
1a4d82fc JJ |
2873 | Calls the `bitxor` method of the `std::ops::BitXor` trait. |
2874 | * `<<` | |
c34b1796 | 2875 | : Left shift. |
1a4d82fc JJ |
2876 | Calls the `shl` method of the `std::ops::Shl` trait. |
2877 | * `>>` | |
62682a34 | 2878 | : Right shift (arithmetic on signed integer types, logical on unsigned ones). |
1a4d82fc JJ |
2879 | Calls the `shr` method of the `std::ops::Shr` trait. |
2880 | ||
2881 | #### Lazy boolean operators | |
2882 | ||
2883 | The operators `||` and `&&` may be applied to operands of boolean type. The | |
2884 | `||` operator denotes logical 'or', and the `&&` operator denotes logical | |
2885 | 'and'. They differ from `|` and `&` in that the right-hand operand is only | |
2886 | evaluated when the left-hand operand does not already determine the result of | |
2887 | the expression. That is, `||` only evaluates its right-hand operand when the | |
2888 | left-hand operand evaluates to `false`, and `&&` only when it evaluates to | |
2889 | `true`. | |
2890 | ||
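The difference between the lazy and non-lazy operators can be observed by counting how often the right-hand operand runs. The helpers `noted` and `right_hand_evaluations` below are hypothetical and exist only for this sketch.

```rust
use std::cell::Cell;

// Records that it was evaluated, then returns `value`.
fn noted(counter: &Cell<u32>, value: bool) -> bool {
    counter.set(counter.get() + 1);
    value
}

// Returns how many times the right-hand operand was evaluated
// for `||`, `&&` and `|`, in that order.
fn right_hand_evaluations(lhs: bool) -> (u32, u32, u32) {
    let lazy_or = Cell::new(0);
    let lazy_and = Cell::new(0);
    let eager_or = Cell::new(0);

    let _ = lhs || noted(&lazy_or, true);
    let _ = lhs && noted(&lazy_and, true);
    let _ = lhs | noted(&eager_or, true);

    (lazy_or.get(), lazy_and.get(), eager_or.get())
}
```

With a `true` left-hand operand only `&&` and `|` evaluate the right-hand side; with `false`, only `||` and `|` do.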
2891 | #### Comparison operators | |
2892 | ||
2893 | Comparison operators are, like the [arithmetic | |
2894 | operators](#arithmetic-operators) and [bitwise operators](#bitwise-operators), | |
2895 | syntactic sugar for calls to built-in traits. This means that comparison | |
2896 | operators can be overridden for user-defined types. The default meaning of the | |
2897 | operators on standard types is given here. | |
2898 | ||
2899 | * `==` | |
2900 | : Equal to. | |
2901 | Calls the `eq` method on the `std::cmp::PartialEq` trait. | |
2902 | * `!=` | |
2903 | : Unequal to. | |
2904 | Calls the `ne` method on the `std::cmp::PartialEq` trait. | |
2905 | * `<` | |
2906 | : Less than. | |
2907 | Calls the `lt` method on the `std::cmp::PartialOrd` trait. | |
2908 | * `>` | |
2909 | : Greater than. | |
2910 | Calls the `gt` method on the `std::cmp::PartialOrd` trait. | |
2911 | * `<=` | |
2912 | : Less than or equal. | |
2913 | Calls the `le` method on the `std::cmp::PartialOrd` trait. | |
2914 | * `>=` | |
2915 | : Greater than or equal. | |
2916 | Calls the `ge` method on the `std::cmp::PartialOrd` trait. | |
2917 | ||
2918 | #### Type cast expressions | |
2919 | ||
2920 | A type cast expression is denoted with the binary operator `as`. | |
2921 | ||
2922 | Executing an `as` expression casts the value on the left-hand side to the type | |
2923 | on the right-hand side. | |
2924 | ||
1a4d82fc JJ |
2925 | An example of an `as` expression: |
2926 | ||
2927 | ``` | |
62682a34 SL |
2928 | # fn sum(values: &[f64]) -> f64 { 0.0 } |
2929 | # fn len(values: &[f64]) -> i32 { 0 } | |
1a4d82fc | 2930 | |
62682a34 | 2931 | fn average(values: &[f64]) -> f64 { |
e9174d1e SL |
2932 | let sum: f64 = sum(values); |
2933 | let size: f64 = len(values) as f64; | |
2934 | sum / size | |
1a4d82fc JJ |
2935 | } |
2936 | ``` | |
2937 | ||
bd371182 AL |
2938 | Some of the conversions which can be done through the `as` operator |
2939 | can also be done implicitly at various points in the program, such as | |
2940 | argument passing and assignment to a `let` binding with an explicit | |
2941 | type. Implicit conversions are limited to "harmless" conversions that | |
2942 | do not lose information and which have minimal or no risk of | |
2943 | surprising side-effects on the dynamic execution semantics. | |
2944 | ||
1a4d82fc JJ |
2945 | #### Assignment expressions |
2946 | ||
2947 | An _assignment expression_ consists of an | |
b039eaaf SL |
2948 | [lvalue](#lvalues-rvalues-and-temporaries) expression followed by an equals |
2949 | sign (`=`) and an [rvalue](#lvalues-rvalues-and-temporaries) expression. | |
1a4d82fc JJ |
2950 | |
2951 | Evaluating an assignment expression [either copies or | |
2952 | moves](#moved-and-copied-types) its right-hand operand to its left-hand | |
2953 | operand. | |
2954 | ||
2955 | ``` | |
85aaf69f | 2956 | # let mut x = 0; |
1a4d82fc | 2957 | # let y = 0; |
1a4d82fc JJ |
2958 | x = y; |
2959 | ``` | |
2960 | ||
2961 | #### Compound assignment expressions | |
2962 | ||
2963 | The `+`, `-`, `*`, `/`, `%`, `&`, `|`, `^`, `<<`, and `>>` operators may be | |
2964 | composed with the `=` operator. The expression `lval OP= val` is equivalent to | |
2965 | `lval = lval OP val`. For example, `x = x + 1` may be written as `x += 1`. | |
2966 | ||
62682a34 | 2967 | Any such expression always has the [`unit`](#tuple-types) type. |
1a4d82fc JJ |
2968 | |
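A brief sketch of the equivalence described above, using hypothetical helper functions. The last function also checks that a compound assignment has the unit type.

```rust
// Applies three compound assignments in sequence.
fn with_compound(mut x: i32) -> i32 {
    x += 1;
    x *= 3;
    x <<= 2;
    x
}

// The same computation, spelled out as `lval = lval OP val`.
fn with_plain(mut x: i32) -> i32 {
    x = x + 1;
    x = x * 3;
    x = x << 2;
    x
}

// The compound assignment expression itself has type `()`.
fn unit_typed() -> i32 {
    let mut x = 0;
    let _: () = x += 1;
    x
}
```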
2969 | #### Operator precedence | |
2970 | ||
2971 | The precedence of Rust binary operators is ordered as follows, going from | |
2972 | strong to weak: | |
2973 | ||
2974 | ```{.text .precedence} | |
1a4d82fc | 2975 | as |
85aaf69f | 2976 | * / % |
1a4d82fc JJ |
2977 | + - |
2978 | << >> | |
2979 | & | |
2980 | ^ | |
2981 | | | |
85aaf69f | 2982 | == != < > <= >= |
1a4d82fc JJ |
2983 | && |
2984 | || | |
85aaf69f | 2985 | = .. |
1a4d82fc JJ |
2986 | ``` |
2987 | ||
2988 | Operators at the same precedence level are evaluated left-to-right. [Unary | |
2989 | operators](#unary-operator-expressions) have the same precedence level and are | |
2990 | stronger than any of the binary operators. | |
2991 | ||
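The following sketch shows how a few rows of the table affect parsing. The function names are hypothetical. Note in particular that `&` binds more tightly than `==`, which differs from C.

```rust
// `*` binds tighter than `+`: parsed as `a + (b * c)`.
fn mul_before_add(a: i32, b: i32, c: i32) -> i32 {
    a + b * c
}

// `+` binds tighter than `<<`: parsed as `a << (b + c)`.
fn add_before_shift(a: i32, b: u32, c: u32) -> i32 {
    a << b + c
}

// `as` binds tighter than `*`: parsed as `(x as u32) * 300`.
fn cast_before_mul(x: u8) -> u32 {
    x as u32 * 300
}

// `&` binds tighter than `==`: parsed as `(x & 1) == 1`.
fn and_before_eq(x: u8) -> bool {
    x & 1 == 1
}
```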
2992 | ### Grouped expressions | |
2993 | ||
2994 | An expression enclosed in parentheses evaluates to the result of the enclosed | |
2995 | expression. Parentheses can be used to explicitly specify evaluation order | |
2996 | within an expression. | |
2997 | ||
1a4d82fc JJ |
2998 | An example of a parenthesized expression: |
2999 | ||
3000 | ``` | |
85aaf69f | 3001 | let x: i32 = (2 + 3) * 4; |
1a4d82fc JJ |
3002 | ``` |
3003 | ||
3004 | ||
3005 | ### Call expressions | |
3006 | ||
bd371182 AL |
3007 | A _call expression_ invokes a function, providing zero or more input variables |
3008 | and an optional location to move the function's output into. If the function | |
3009 | eventually returns, then the expression completes. | |
1a4d82fc JJ |
3010 | |
3011 | Some examples of call expressions: | |
3012 | ||
3013 | ``` | |
85aaf69f | 3014 | # fn add(x: i32, y: i32) -> i32 { 0 } |
1a4d82fc | 3015 | |
85aaf69f SL |
3016 | let x: i32 = add(1i32, 2i32); |
3017 | let pi: Result<f32, _> = "3.14".parse(); | |
1a4d82fc JJ |
3018 | ``` |
3019 | ||
3020 | ### Lambda expressions | |
3021 | ||
1a4d82fc JJ |
3022 | A _lambda expression_ (sometimes called an "anonymous function expression") |
3023 | defines a function and denotes it as a value, in a single expression. A lambda | |
3024 | expression is a pipe-symbol-delimited (`|`) list of identifiers followed by an | |
3025 | expression. | |
3026 | ||
3027 | A lambda expression denotes a function that maps a list of parameters | |
3028 | (`ident_list`) onto the expression that follows the `ident_list`. The | |
3029 | identifiers in the `ident_list` are the parameters to the function. These | |
3030 | parameters' types need not be specified, as the compiler infers them from | |
3031 | context. | |
3032 | ||
3033 | Lambda expressions are most useful when passing functions as arguments to other | |
3034 | functions, as an abbreviation for defining and capturing a separate function. | |
3035 | ||
3036 | Significantly, lambda expressions _capture their environment_, which regular | |
3037 | [function definitions](#functions) do not. The exact type of capture depends | |
3038 | on the [function type](#function-types) inferred for the lambda expression. In | |
3039 | the simplest and least-expensive form (analogous to a `|| { }` expression), | |
3040 | the lambda expression captures its environment by reference, effectively | |
3041 | borrowing pointers to all outer variables mentioned inside the function. | |
3042 | Alternately, the compiler may infer that a lambda expression should copy or | |
85aaf69f | 3043 | move values (depending on their type) from the environment into the lambda |
1a4d82fc JJ |
3044 | expression's captured environment. |
3045 | ||
3046 | In this example, we define a function `ten_times` that takes a higher-order | |
c1a9b12d | 3047 | function argument, and we then call it with a lambda expression as an argument: |
1a4d82fc JJ |
3048 | |
3049 | ``` | |
85aaf69f | 3050 | fn ten_times<F>(f: F) where F: Fn(i32) { |
c1a9b12d SL |
3051 | for index in 0..10 { |
3052 | f(index); | |
1a4d82fc JJ |
3053 | } |
3054 | } | |
3055 | ||
3056 | ten_times(|j| println!("hello, {}", j)); | |
3057 | ``` | |
3058 | ||
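The capture behaviour described earlier in this section can be sketched as follows. `make_adder` and `count_calls` are hypothetical helpers, and the boxed closure type is written with the `dyn` keyword, which assumes a recent compiler.

```rust
// `move` makes the closure take its own copy of `n`, so the closure
// can outlive the call that created it.
fn make_adder(n: i32) -> Box<dyn Fn(i32) -> i32> {
    Box::new(move |x| x + n)
}

// Without `move`, the closure borrows `calls` (mutably, because it
// assigns to it) for as long as the closure is in use.
fn count_calls() -> u32 {
    let mut calls = 0;
    {
        let mut bump = || calls += 1;
        bump();
        bump();
    }
    calls
}
```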
1a4d82fc JJ |
3059 | ### Infinite loops |
3060 | ||
3061 | A `loop` expression denotes an infinite loop. | |
3062 | ||
bd371182 AL |
3063 | A `loop` expression may optionally have a _label_. The label is written as |
3064 | a lifetime preceding the loop expression, as in `'foo: loop { }`. If a | |
3065 | label is present, then labeled `break` and `continue` expressions nested | |
3066 | within this loop may exit out of this loop or return control to its head. | |
b039eaaf | 3067 | See [break expressions](#break-expressions) and [continue |
1a4d82fc JJ |
3068 | expressions](#continue-expressions). |
3069 | ||
b039eaaf | 3070 | ### `break` expressions |
1a4d82fc | 3071 | |
1a4d82fc JJ |
3072 | A `break` expression has an optional _label_. If the label is absent, then |
3073 | executing a `break` expression immediately terminates the innermost loop | |
3074 | enclosing it. It is only permitted in the body of a loop. If the label is | |
bd371182 | 3075 | present, then `break 'foo` terminates the loop with label `'foo`, which need not |
1a4d82fc JJ |
3076 | be the innermost label enclosing the `break` expression, but must enclose it. |
3077 | ||
b039eaaf | 3078 | ### `continue` expressions |
1a4d82fc | 3079 | |
1a4d82fc JJ |
3080 | A `continue` expression has an optional _label_. If the label is absent, then |
3081 | executing a `continue` expression immediately terminates the current iteration | |
3082 | of the innermost loop enclosing it, returning control to the loop *head*. In | |
3083 | the case of a `while` loop, the head is the conditional expression controlling | |
3084 | the loop. In the case of a `for` loop, the head is the call-expression | |
bd371182 AL |
3085 | controlling the loop. If the label is present, then `continue 'foo` returns |
3086 | control to the head of the loop with label `'foo`, which need not be the | |
7453a54e | 3087 | innermost label enclosing the `continue` expression, but must enclose it. |
1a4d82fc JJ |
3088 | |
3089 | A `continue` expression is only permitted in the body of a loop. | |
3090 | ||
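Labeled `break` and `continue` can be sketched with nested loops. Both functions below are hypothetical examples; they use `for` loops, which accept labels in the same way as `loop`.

```rust
// Returns the first pair `(i, j)` with `i * j == target`,
// searching both factors in `1..limit + 1`.
fn find_product(target: u32, limit: u32) -> Option<(u32, u32)> {
    let mut found = None;
    'outer: for i in 1..limit + 1 {
        for j in 1..limit + 1 {
            if i * j == target {
                found = Some((i, j));
                break 'outer; // terminates both loops
            }
        }
    }
    found
}

// Counts the rows of `grid` that contain no zero.
fn rows_without_zero(grid: &[[i32; 3]]) -> usize {
    let mut count = 0;
    'rows: for row in grid {
        for &cell in row {
            if cell == 0 {
                continue 'rows; // returns to the head of the outer loop
            }
        }
        count += 1;
    }
    count
}
```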
b039eaaf | 3091 | ### `while` loops |
bd371182 AL |
3092 | |
3093 | A `while` loop begins by evaluating the boolean loop conditional expression. | |
3094 | If the loop conditional expression evaluates to `true`, the loop body block | |
3095 | executes and control returns to the loop conditional expression. If the loop | |
3096 | conditional expression evaluates to `false`, the `while` expression completes. | |
3097 | ||
3098 | An example: | |
3099 | ||
3100 | ``` | |
3101 | let mut i = 0; | |
1a4d82fc | 3102 | |
bd371182 AL |
3103 | while i < 10 { |
3104 | println!("hello"); | |
3105 | i = i + 1; | |
3106 | } | |
1a4d82fc JJ |
3107 | ``` |
3108 | ||
bd371182 AL |
3109 | Like `loop` expressions, `while` loops can be controlled with `break` or |
3110 | `continue`, and may optionally have a _label_. See [infinite | |
3111 | loops](#infinite-loops), [break expressions](#break-expressions), and | |
3112 | [continue expressions](#continue-expressions) for more information. | |
3113 | ||
b039eaaf | 3114 | ### `for` expressions |
bd371182 | 3115 | |
1a4d82fc | 3116 | A `for` expression is a syntactic construct for looping over elements provided |
bd371182 | 3117 | by an implementation of `std::iter::IntoIterator`. |
1a4d82fc | 3118 | |
b039eaaf | 3119 | An example of a `for` loop over the contents of an array: |
1a4d82fc JJ |
3120 | |
3121 | ``` | |
85aaf69f | 3122 | # type Foo = i32; |
bd371182 | 3123 | # fn bar(f: &Foo) { } |
1a4d82fc JJ |
3124 | # let a = 0; |
3125 | # let b = 0; | |
3126 | # let c = 0; | |
3127 | ||
3128 | let v: &[Foo] = &[a, b, c]; | |
3129 | ||
bd371182 AL |
3130 | for e in v { |
3131 | bar(e); | |
1a4d82fc JJ |
3132 | } |
3133 | ``` | |
3134 | ||
3135 | An example of a `for` loop over a series of integers: | |
3136 | ||
3137 | ``` | |
85aaf69f SL |
3138 | # fn bar(b:usize) { } |
3139 | for i in 0..256 { | |
1a4d82fc JJ |
3140 | bar(i); |
3141 | } | |
3142 | ``` | |
3143 | ||
bd371182 AL |
3144 | Like `loop` expressions, `for` loops can be controlled with `break` or |
3145 | `continue`, and may optionally have a _label_. See [infinite | |
3146 | loops](#infinite-loops), [break expressions](#break-expressions), and | |
3147 | [continue expressions](#continue-expressions) for more information. | |
1a4d82fc | 3148 | |
b039eaaf | 3149 | ### `if` expressions |
1a4d82fc JJ |
3150 | |
3151 | An `if` expression is a conditional branch in program control. The form of an | |
3152 | `if` expression is a condition expression, followed by a consequent block, any | |
3153 | number of `else if` conditions and blocks, and an optional trailing `else` | |
3154 | block. The condition expressions must have type `bool`. If a condition | |
3155 | expression evaluates to `true`, the consequent block is executed and any | |
3156 | subsequent `else if` or `else` block is skipped. If a condition expression | |
3157 | evaluates to `false`, the consequent block is skipped and any subsequent `else | |
3158 | if` condition is evaluated. If all `if` and `else if` conditions evaluate to | |
3159 | `false` then any `else` block is executed. | |
3160 | ||
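A minimal sketch of an `if` expression with an `else if` condition and a trailing `else` block. The function `describe` is a hypothetical example; the whole `if` expression is the function's value.

```rust
fn describe(n: i32) -> &'static str {
    if n < 0 {
        "negative"
    } else if n == 0 {
        "zero"
    } else {
        "positive"
    }
}
```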
b039eaaf | 3161 | ### `match` expressions |
1a4d82fc | 3162 | |
1a4d82fc JJ |
3163 | A `match` expression branches on a *pattern*. The exact form of matching that |
3164 | occurs depends on the pattern. Patterns consist of some combination of | |
b039eaaf | 3165 | literals, destructured arrays or enum constructors, structs and tuples, |
1a4d82fc JJ |
3166 | variable binding specifications, wildcards (`..`), and placeholders (`_`). A |
3167 | `match` expression has a *head expression*, which is the value to compare to | |
3168 | the patterns. The type of the patterns must equal the type of the head | |
3169 | expression. | |
3170 | ||
3171 | In a pattern whose head expression has an `enum` type, a placeholder (`_`) | |
3172 | stands for a *single* data field, whereas a wildcard `..` stands for *all* the | |
bd371182 | 3173 | fields of a particular variant. |
1a4d82fc JJ |
3174 | |
3175 | A `match` behaves differently depending on whether or not the head expression | |
b039eaaf | 3176 | is an [lvalue or an rvalue](#lvalues-rvalues-and-temporaries). If the head |
1a4d82fc JJ |
3177 | expression is an rvalue, it is first evaluated into a temporary location, and |
3178 | the resulting value is sequentially compared to the patterns in the arms until | |
3179 | a match is found. The first arm with a matching pattern is chosen as the branch | |
3180 | target of the `match`, any variables bound by the pattern are assigned to local | |
3181 | variables in the arm's block, and control enters the block. | |
3182 | ||
3183 | When the head expression is an lvalue, the match does not allocate a temporary | |
3184 | location (however, a by-value binding may copy or move from the lvalue). When | |
3185 | possible, it is preferable to match on lvalues, as the lifetime of these | |
3186 | matches inherits the lifetime of the lvalue, rather than being restricted to | |
3187 | the inside of the match. | |
3188 | ||
3189 | An example of a `match` expression: | |
3190 | ||
3191 | ``` | |
bd371182 | 3192 | let x = 1; |
1a4d82fc | 3193 | |
bd371182 AL |
3194 | match x { |
3195 | 1 => println!("one"), | |
3196 | 2 => println!("two"), | |
3197 | 3 => println!("three"), | |
3198 | 4 => println!("four"), | |
3199 | 5 => println!("five"), | |
3200 | _ => println!("something else"), | |
1a4d82fc JJ |
3201 | } |
3202 | ``` | |
3203 | ||
3204 | Patterns that bind variables default to binding to a copy or move of the | |
3205 | matched value (depending on the matched value's type). This can be changed to | |
3206 | bind to a reference by using the `ref` keyword, or to a mutable reference using | |
3207 | `ref mut`. | |
3208 | ||
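As a sketch of `ref` and `ref mut` bindings, the two hypothetical functions below match on an lvalue without moving out of it, so the matched value can still be returned afterwards.

```rust
// `ref s` borrows the string rather than moving it out of `slot`.
fn peek_len(slot: Option<String>) -> (usize, Option<String>) {
    let len = match slot {
        Some(ref s) => s.len(),
        None => 0,
    };
    (len, slot)
}

// `ref mut n` binds a mutable reference into `slot`.
fn bump(mut slot: Option<i32>) -> Option<i32> {
    match slot {
        Some(ref mut n) => *n += 1,
        None => {}
    }
    slot
}
```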
3209 | Subpatterns can also be bound to variables by the use of the syntax `variable @ | |
3210 | subpattern`. For example: | |
3211 | ||
3212 | ``` | |
bd371182 | 3213 | let x = 1; |
1a4d82fc | 3214 | |
bd371182 AL |
3215 | match x { |
3216 | e @ 1 ... 5 => println!("got a range element {}", e), | |
3217 | _ => println!("anything"), | |
1a4d82fc | 3218 | } |
1a4d82fc JJ |
3219 | ``` |
3220 | ||
3221 | Patterns can also dereference pointers by using the `&`, `&mut` and `box` | |
85aaf69f | 3222 | symbols, as appropriate. For example, these two matches on `x: &i32` are |
1a4d82fc JJ |
3223 | equivalent: |
3224 | ||
3225 | ``` | |
85aaf69f | 3226 | # let x = &3; |
1a4d82fc JJ |
3227 | let y = match *x { 0 => "zero", _ => "some" }; |
3228 | let z = match x { &0 => "zero", _ => "some" }; | |
3229 | ||
3230 | assert_eq!(y, z); | |
3231 | ``` | |
3232 | ||
1a4d82fc JJ |
3233 | Multiple match patterns may be joined with the `|` operator. A range of values |
3234 | may be specified with `...`. For example: | |
3235 | ||
3236 | ``` | |
85aaf69f | 3237 | # let x = 2; |
1a4d82fc JJ |
3238 | |
3239 | let message = match x { | |
e9174d1e SL |
3240 | 0 | 1 => "not many", |
3241 | 2 ... 9 => "a few", | |
3242 | _ => "lots" | |
1a4d82fc JJ |
3243 | }; |
3244 | ``` | |
3245 | ||
3246 | Range patterns only work on scalar types (like integers and characters; not | |
3247 | like arrays and structs, which have sub-components). A range pattern may not | |
3248 | be a sub-range of another range pattern inside the same `match`. | |
3249 | ||
3250 | Finally, match patterns can accept *pattern guards* to further refine the | |
3251 | criteria for matching a case. Pattern guards appear after the pattern and | |
3252 | consist of a bool-typed expression following the `if` keyword. A pattern guard | |
3253 | may refer to the variables bound within the pattern it follows. | |
3254 | ||
3255 | ``` | |
3256 | # let maybe_digit = Some(0); | |
85aaf69f SL |
3257 | # fn process_digit(i: i32) { } |
3258 | # fn process_other(i: i32) { } | |
1a4d82fc JJ |
3259 | |
3260 | let message = match maybe_digit { | |
e9174d1e SL |
3261 | Some(x) if x < 10 => process_digit(x), |
3262 | Some(x) => process_other(x), | |
7453a54e | 3263 | None => panic!(), |
1a4d82fc JJ |
3264 | }; |
3265 | ``` | |
3266 | ||
b039eaaf | 3267 | ### `if let` expressions |
1a4d82fc | 3268 | |
92a42be0 SL |
3269 | An `if let` expression is semantically identical to an `if` expression but in |
3270 | place of a condition expression it expects a `let` statement with a refutable | |
3271 | pattern. If the value of the expression on the right hand side of the `let` | |
3272 | statement matches the pattern, the corresponding block will execute, otherwise | |
3273 | flow proceeds to the first `else` block that follows. | |
1a4d82fc | 3274 | |
bd371182 AL |
3275 | ``` |
3276 | let dish = ("Ham", "Eggs"); | |
1a4d82fc | 3277 | |
bd371182 AL |
3278 | // this body will be skipped because the pattern is refuted |
3279 | if let ("Bacon", b) = dish { | |
3280 | println!("Bacon is served with {}", b); | |
3281 | } | |
3282 | ||
3283 | // this body will execute | |
3284 | if let ("Ham", b) = dish { | |
3285 | println!("Ham is served with {}", b); | |
3286 | } | |
1a4d82fc JJ |
3287 | ``` |
3288 | ||
b039eaaf | 3289 | ### `while let` loops |
bd371182 | 3290 | |
92a42be0 SL |
3291 | A `while let` loop is semantically similar to a `while` loop but in place of |
3292 | a condition expression it expects a `let` statement with a refutable pattern. If |
3293 | the value of the expression on the right hand side of the `let` statement | |
3294 | matches the pattern, the loop body block executes and control returns to the | |
3295 | pattern matching statement. Otherwise, the while expression completes. | |
1a4d82fc | 3296 | |
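The behaviour described above can be sketched as follows. This is a minimal illustration rather than normative text, and the function name `drain_stack` is invented for the example. The loop body runs for as long as `Vec::pop` yields a value matching `Some(top)`, and the loop completes the first time the pattern is refuted by `None`.

```rust
// Hypothetical helper: pops every element off `stack` and returns the
// elements in the order in which they were popped.
fn drain_stack(mut stack: Vec<i32>) -> Vec<i32> {
    let mut popped = Vec::new();
    // `Vec::pop` returns `Option<i32>`. The body executes while the result
    // matches `Some(top)`; the first `None` ends the loop.
    while let Some(top) = stack.pop() {
        popped.push(top);
    }
    popped
}
```

Calling `drain_stack(vec![1, 2, 3])` visits the elements from the back of the vector to the front.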
b039eaaf | 3297 | ### `return` expressions |
1a4d82fc | 3298 | |
1a4d82fc | 3299 | Return expressions are denoted with the keyword `return`. Evaluating a `return` |
bd371182 AL |
3300 | expression moves its argument into the designated output location for the |
3301 | current function call, destroys the current function activation frame, and | |
3302 | transfers control to the caller frame. | |
1a4d82fc JJ |
3303 | |
3304 | An example of a `return` expression: | |
3305 | ||
3306 | ``` | |
85aaf69f | 3307 | fn max(a: i32, b: i32) -> i32 { |
e9174d1e SL |
3308 | if a > b { |
3309 | return a; | |
3310 | } | |
3311 | return b; | |
1a4d82fc JJ |
3312 | } |
3313 | ``` | |
3314 | ||
3315 | # Type system | |
3316 | ||
3317 | ## Types | |
3318 | ||
bd371182 | 3319 | Every variable, item and value in a Rust program has a type. The _type_ of a |
1a4d82fc JJ |
3320 | *value* defines the interpretation of the memory holding it. |
3321 | ||
3322 | Built-in types and type-constructors are tightly integrated into the language, | |
3323 | in nontrivial ways that are not possible to emulate in user-defined types. | |
3324 | User-defined types have limited capabilities. | |
3325 | ||
3326 | ### Primitive types | |
3327 | ||
3328 | The primitive types are the following: | |
3329 | ||
1a4d82fc | 3330 | * The boolean type `bool` with values `true` and `false`. |
62682a34 SL |
3331 | * The machine types (integer and floating-point). |
3332 | * The machine-dependent integer types. | |
54a0048b SL |
3333 | * Arrays |
3334 | * Tuples | |
3335 | * Slices | |
3336 | * Function pointers | |
1a4d82fc | 3337 | |
1a4d82fc JJ |
3338 | #### Machine types |
3339 | ||
3340 | The machine types are the following: | |
3341 | ||
3342 | * The unsigned word types `u8`, `u16`, `u32` and `u64`, with values drawn from | |
3343 | the integer intervals [0, 2^8 - 1], [0, 2^16 - 1], [0, 2^32 - 1] and | |
3344 | [0, 2^64 - 1] respectively. | |
3345 | ||
3346 | * The signed two's complement word types `i8`, `i16`, `i32` and `i64`, with | |
3347 | values drawn from the integer intervals [-(2^(7)), 2^7 - 1], | |
3348 | [-(2^(15)), 2^15 - 1], [-(2^(31)), 2^31 - 1], [-(2^(63)), 2^63 - 1] | |
3349 | respectively. | |
3350 | ||
3351 | * The IEEE 754-2008 `binary32` and `binary64` floating-point types: `f32` and | |
3352 | `f64`, respectively. | |
3353 | ||
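The intervals listed above can be checked against the bounds exposed by the standard library. The sketch below assumes a toolchain recent enough to provide the associated constants `MIN` and `MAX` on the integer types; the function names are invented for illustration.

```rust
// Hypothetical helpers returning the (minimum, maximum) values of a type.
fn u8_bounds() -> (u8, u8) {
    (u8::MIN, u8::MAX)
}

fn i8_bounds() -> (i8, i8) {
    (i8::MIN, i8::MAX)
}

// Computes 2^32 - 1 in a wider type, for comparison with `u32::MAX`.
fn u32_max_from_formula() -> u64 {
    2u64.pow(32) - 1
}
```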
3354 | #### Machine-dependent integer types | |
3355 | ||
85aaf69f | 3356 | The `usize` type is an unsigned integer type with the same number of bits as the |
1a4d82fc JJ |
3357 | platform's pointer type. It can represent every memory address in the process. |
3358 | ||
85aaf69f | 3359 | The `isize` type is a signed integer type with the same number of bits as the |
1a4d82fc | 3360 | platform's pointer type. The theoretical upper bound on object and array size |
85aaf69f | 3361 | is the maximum `isize` value. This ensures that `isize` can be used to calculate |
1a4d82fc JJ |
3362 | differences between pointers into an object or array and can address every byte |
3363 | within an object along with one byte past the end. | |
3364 | ||
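As a small sketch of the relationship described above, the widths of `usize` and `isize` can be compared with the width of a thin raw pointer using `std::mem::size_of`. The function names are invented for illustration.

```rust
use std::mem::size_of;

// Width, in bits, of a thin raw pointer on the target platform.
fn pointer_width_bits() -> usize {
    size_of::<*const u8>() * 8
}

// Width, in bits, of `usize` on the target platform.
fn usize_width_bits() -> usize {
    size_of::<usize>() * 8
}

// Width, in bits, of `isize` on the target platform.
fn isize_width_bits() -> usize {
    size_of::<isize>() * 8
}
```

On a typical 64-bit target all three functions return 64, and on a 32-bit target they return 32.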
3365 | ### Textual types | |
3366 | ||
3367 | The types `char` and `str` hold textual data. | |
3368 | ||
3369 | A value of type `char` is a [Unicode scalar value]( | |
85aaf69f | 3370 | http://www.unicode.org/glossary/#unicode_scalar_value) (i.e. a code point that |
1a4d82fc JJ |
3371 | is not a surrogate), represented as a 32-bit unsigned word in the 0x0000 to |
3372 | 0xD7FF or 0xE000 to 0x10FFFF range. A `[char]` array is effectively a UCS-4 / |
3373 | UTF-32 string. | |
3374 | ||
3375 | A value of type `str` is a Unicode string, represented as an array of 8-bit | |
bd371182 | 3376 | unsigned bytes holding a sequence of UTF-8 encoded code points. Since `str` is of |
85aaf69f | 3377 | unknown size, it is not a _first-class_ type, but can only be instantiated |
bd371182 | 3378 | through a pointer type, such as `&str`. |
1a4d82fc JJ |
3379 | |
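The ranges above can be illustrated with a short sketch. It assumes the standard library function `char::from_u32`, which returns `None` for values that are not Unicode scalar values; the helper names are invented for the example.

```rust
// Returns true if `code_point` is a Unicode scalar value, that is, if it
// can be represented as a `char`.
fn is_scalar_value(code_point: u32) -> bool {
    char::from_u32(code_point).is_some()
}

// Length of a `str` in bytes of its UTF-8 encoding.
fn utf8_byte_len(s: &str) -> usize {
    s.len()
}

// Number of `char` values (Unicode scalar values) in a `str`.
fn scalar_count(s: &str) -> usize {
    s.chars().count()
}
```

For the string `"\u{e9}"`, a single scalar value outside the ASCII range, the byte length and the scalar count differ because UTF-8 is a variable-width encoding.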
3380 | ### Tuple types | |
3381 | ||
3382 | A tuple *type* is a heterogeneous product of other types, called the *elements* | |
3383 | of the tuple. It has no nominal name and is instead structurally typed. | |
3384 | ||
3385 | Tuple types and values are denoted by listing the types or values of their | |
3386 | elements, respectively, in a parenthesized, comma-separated list. | |
3387 | ||
3388 | Because tuple elements don't have a name, they can only be accessed by | |
c34b1796 AL |
3389 | pattern-matching or by using `N` directly as a field to access the |
3390 | `N`th element. | |
1a4d82fc | 3391 | |
1a4d82fc JJ |
3392 | An example of a tuple type and its use: |
3393 | ||
3394 | ``` | |
85aaf69f | 3395 | type Pair<'a> = (i32, &'a str); |
c1a9b12d | 3396 | let p: Pair<'static> = (10, "ten"); |
1a4d82fc | 3397 | let (a, b) = p; |
c1a9b12d SL |
3398 | |
3399 | assert_eq!(a, 10); | |
3400 | assert_eq!(b, "ten"); | |
3401 | assert_eq!(p.0, 10); | |
3402 | assert_eq!(p.1, "ten"); | |
1a4d82fc JJ |
3403 | ``` |
3404 | ||
62682a34 SL |
3405 | For historical reasons and convenience, the tuple type with no elements (`()`) |
3406 | is often called ‘unit’ or ‘the unit type’. | |
3407 | ||
1a4d82fc JJ |
3408 | ### Array and Slice types |
3409 | ||
3410 | Rust has two different types for a list of items: | |
3411 | ||
c1a9b12d SL |
3412 | * `[T; N]`, an 'array' |
3413 | * `&[T]`, a 'slice' | |
1a4d82fc JJ |
3414 | |
3415 | An array has a fixed size, and can be allocated on either the stack or the | |
3416 | heap. | |
3417 | ||
3418 | A slice is a 'view' into an array. It doesn't own the data it points | |
3419 | to, it borrows it. | |
3420 | ||
c1a9b12d | 3421 | Examples: |
1a4d82fc JJ |
3422 | |
3423 | ```{rust} | |
c1a9b12d SL |
3424 | // A stack-allocated array |
3425 | let array: [i32; 3] = [1, 2, 3]; | |
3426 | ||
3427 | // A heap-allocated array | |
3428 | let vector: Vec<i32> = vec![1, 2, 3]; | |
3429 | ||
3430 | // A slice into the vector's elements |
3431 | let slice: &[i32] = &vector[..]; | |
1a4d82fc JJ |
3432 | ``` |
3433 | ||
3434 | As you can see, the `vec!` macro allows you to create a `Vec<T>` easily. The | |
3435 | `vec!` macro is also part of the standard library, rather than the language. | |
3436 | ||
c1a9b12d | 3437 | All in-bounds elements of arrays and slices are always initialized, and access |
1a4d82fc JJ |
3438 | to an array or slice is always bounds-checked. |
3439 | ||
b039eaaf | 3440 | ### Struct types |
1a4d82fc JJ |
3441 | |
3442 | A `struct` *type* is a heterogeneous product of other types, called the | |
3443 | *fields* of the type.[^structtype] | |
3444 | ||
bd371182 | 3445 | [^structtype]: `struct` types are analogous to `struct` types in C, |
1a4d82fc | 3446 | the *record* types of the ML family, |
b039eaaf | 3447 | or the *struct* types of the Lisp family. |
1a4d82fc JJ |
3448 | |
3449 | New instances of a `struct` can be constructed with a [struct | |
b039eaaf | 3450 | expression](#struct-expressions). |
1a4d82fc JJ |
3451 | |
3452 | The memory layout of a `struct` is undefined by default to allow for compiler | |
3453 | optimizations like field reordering, but it can be fixed with the | |
3454 | `#[repr(...)]` attribute. In either case, fields may be given in any order in | |
3455 | a corresponding struct *expression*; the resulting `struct` value will always | |
3456 | have the same memory layout. | |
3457 | ||
3458 | The fields of a `struct` may be qualified by [visibility | |
bd371182 | 3459 | modifiers](#visibility-and-privacy), to allow access to data in a |
b039eaaf | 3460 | struct outside a module. |
1a4d82fc | 3461 | |
b039eaaf | 3462 | A _tuple struct_ type is just like a struct type, except that the fields are |
1a4d82fc JJ |
3463 | anonymous. |
3464 | ||
b039eaaf SL |
3465 | A _unit-like struct_ type is like a struct type, except that it has no |
3466 | fields. The one value constructed by the associated [struct | |
3467 | expression](#struct-expressions) is the only value that inhabits such a | |
1a4d82fc JJ |
3468 | type. |
3469 | ||
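The three flavors of struct described above can be sketched side by side. The type names `Point`, `Pair` and `Marker` and the function `sum_fields` are invented for illustration.

```rust
// A struct with named fields.
struct Point {
    x: i32,
    y: i32,
}

// A tuple struct: the fields are anonymous and accessed by position.
struct Pair(i32, i32);

// A unit-like struct: it has no fields and exactly one value.
struct Marker;

fn sum_fields() -> i32 {
    // Fields may be given in any order in a struct expression.
    let p = Point { y: 2, x: 1 };
    let q = Pair(3, 4);
    // The only value of type `Marker`.
    let _m = Marker;
    p.x + p.y + q.0 + q.1
}
```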
3470 | ### Enumerated types | |
3471 | ||
3472 | An *enumerated type* is a nominal, heterogeneous disjoint union type, denoted | |
3473 | by the name of an [`enum` item](#enumerations). [^enumtype] | |
3474 | ||
3475 | [^enumtype]: The `enum` type is analogous to a `data` constructor declaration in | |
3476 | ML, or a *pick ADT* in Limbo. | |
3477 | ||
3478 | An [`enum` item](#enumerations) declares both the type and a number of *variant | |
3479 | constructors*, each of which is independently named and takes an optional tuple | |
3480 | of arguments. | |
3481 | ||
3482 | New instances of an `enum` can be constructed by calling one of the variant | |
3483 | constructors, in a [call expression](#call-expressions). | |
3484 | ||
3485 | Any `enum` value consumes as much memory as the largest variant of its |
3486 | corresponding `enum` type, plus any space needed to store a discriminant. |
3487 | ||
3488 | Enum types cannot be denoted *structurally* as types, but must be denoted by | |
3489 | named reference to an [`enum` item](#enumerations). | |
3490 | ||
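As a minimal sketch of the points above, the following declares an `enum` whose variant constructors take zero, one or two arguments, and consumes values of it with a `match`. The names `Shape` and `area` are invented for illustration.

```rust
// Each variant constructor is independently named and takes an optional
// tuple of arguments.
enum Shape {
    Empty,
    Square(u32),
    Rect(u32, u32),
}

// Computes the area by matching on the variant that was constructed.
fn area(shape: &Shape) -> u32 {
    match *shape {
        Shape::Empty => 0,
        Shape::Square(side) => side * side,
        Shape::Rect(width, height) => width * height,
    }
}
```

A value such as `Shape::Rect(2, 3)` is built by calling the variant constructor in a call expression, while `Shape::Empty` takes no arguments.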
3491 | ### Recursive types | |
3492 | ||
3493 | Nominal types — [enumerations](#enumerated-types) and | |
b039eaaf | 3494 | [structs](#struct-types) — may be recursive. That is, each `enum` |
1a4d82fc JJ |
3495 | constructor or `struct` field may refer, directly or indirectly, to the |
3496 | enclosing `enum` or `struct` type itself. Such recursion has restrictions: | |
3497 | ||
3498 | * Recursive types must include a nominal type in the recursion | |
bd371182 | 3499 | (not mere [type definitions](grammar.html#type-definitions), |
b039eaaf | 3500 | or other structural types such as [arrays](#array-and-slice-types) or [tuples](#tuple-types)). |
1a4d82fc JJ |
3501 | * A recursive `enum` item must have at least one non-recursive constructor |
3502 | (in order to give the recursion a basis case). | |
3503 | * The size of a recursive type must be finite; | |
3504 | in other words the recursive fields of the type must be [pointer types](#pointer-types). | |
3505 | * Recursive type definitions can cross module boundaries, but not module *visibility* boundaries, | |
3506 | or crate boundaries (in order to simplify the module system and type checker). | |
3507 | ||
3508 | An example of a *recursive* type and its use: | |
3509 | ||
3510 | ``` | |
1a4d82fc JJ |
3511 | enum List<T> { |
3512 | Nil, | |
3513 | Cons(T, Box<List<T>>) | |
3514 | } | |
3515 | ||
85aaf69f | 3516 | let a: List<i32> = List::Cons(7, Box::new(List::Cons(13, Box::new(List::Nil)))); |
1a4d82fc JJ |
3517 | ``` |
3518 | ||
3519 | ### Pointer types | |
3520 | ||
3521 | All pointers in Rust are explicit first-class values. They can be copied, | |
b039eaaf | 3522 | stored into data structures, and returned from functions. There are two |
1a4d82fc JJ |
3523 | varieties of pointer in Rust: |
3524 | ||
3525 | * References (`&`) | |
3526 | : These point to memory _owned by some other value_. | |
bd371182 AL |
3527 | A reference type is written `&type`, |
3528 | or `&'a type` when you need to specify an explicit lifetime. | |
1a4d82fc JJ |
3529 | Copying a reference is a "shallow" operation: |
3530 | it involves only copying the pointer itself. | |
bd371182 AL |
3531 | Releasing a reference has no effect on the value it points to, |
3532 | but a reference to a temporary value will keep it alive during the scope |
3533 | of the reference itself. | |
1a4d82fc JJ |
3534 | |
3535 | * Raw pointers (`*`) | |
3536 | : Raw pointers are pointers without safety or liveness guarantees. | |
3537 | Raw pointers are written as `*const T` or `*mut T`, | |
bd371182 | 3538 | for example `*const i32` means a raw pointer to a 32-bit integer. |
1a4d82fc JJ |
3539 | Copying or dropping a raw pointer has no effect on the lifecycle of any |
3540 | other value. Dereferencing a raw pointer or converting it to any other | |
3541 | pointer type is an [`unsafe` operation](#unsafe-functions). | |
3542 | Raw pointers are generally discouraged in Rust code; | |
3543 | they exist to support interoperability with foreign code, | |
3544 | and writing performance-critical or low-level functions. | |
3545 | ||
3546 | The standard library contains additional 'smart pointer' types beyond references | |
3547 | and raw pointers. | |
3548 | ||
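The two varieties of pointer can be contrasted in a short sketch. The function name `read_through_pointers` is invented for illustration. Copying the reference copies only the pointer, while reading through the raw pointer requires an `unsafe` block.

```rust
// Reads `value` once through a copied reference and once through a raw
// pointer derived from that reference.
fn read_through_pointers(value: i32) -> (i32, i32) {
    let r: &i32 = &value;
    // Shallow copy: only the pointer is copied, not the pointed-to value.
    let r2 = r;
    // A reference can be converted to a raw pointer without `unsafe`.
    let p: *const i32 = r;
    // Dereferencing the raw pointer is an unsafe operation. It is sound
    // here because `value` is still alive and is not mutated.
    let via_raw = unsafe { *p };
    (*r2, via_raw)
}
```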
3549 | ### Function types | |
3550 | ||
3551 | The function type constructor `fn` forms new function types. A function type | |
3552 | consists of a possibly-empty set of function-type modifiers (such as `unsafe` | |
3553 | or `extern`), a sequence of input types and an output type. | |
3554 | ||
3555 | An example of a `fn` type: | |
3556 | ||
3557 | ``` | |
85aaf69f | 3558 | fn add(x: i32, y: i32) -> i32 { |
7453a54e | 3559 | x + y |
1a4d82fc JJ |
3560 | } |
3561 | ||
3562 | let mut x = add(5,7); | |
3563 | ||
85aaf69f | 3564 | type Binop = fn(i32, i32) -> i32; |
1a4d82fc JJ |
3565 | let bo: Binop = add; |
3566 | x = bo(5,7); | |
3567 | ``` | |
3568 | ||
bd371182 AL |
3569 | #### Function types for specific items |
3570 | ||
c1a9b12d | 3571 | Internal to the compiler, there are also function types that are specific to a particular |
bd371182 AL |
3572 | function item. In the following snippet, for example, the internal types of the functions |
3573 | `foo` and `bar` are different, despite the fact that they have the same signature: | |
1a4d82fc | 3574 | |
bd371182 AL |
3575 | ``` |
3576 | fn foo() { } | |
3577 | fn bar() { } | |
1a4d82fc JJ |
3578 | ``` |
3579 | ||
bd371182 AL |
3580 | The types of `foo` and `bar` can both be implicitly coerced to the fn |
3581 | pointer type `fn()`. There is currently no syntax for unique fn types, | |
3582 | though the compiler will emit a type like `fn() {foo}` in error | |
3583 | messages to indicate "the unique fn type for the function `foo`". | |
1a4d82fc | 3584 | |
bd371182 | 3585 | ### Closure types |
1a4d82fc | 3586 | |
bd371182 AL |
3587 | A [lambda expression](#lambda-expressions) produces a closure value with |
3588 | a unique, anonymous type that cannot be written out. | |
1a4d82fc | 3589 | |
bd371182 AL |
3590 | Depending on the requirements of the closure, its type implements one or |
3591 | more of the closure traits: | |
1a4d82fc | 3592 | |
bd371182 AL |
3593 | * `FnOnce` |
3594 | : The closure can be called once. A closure called as `FnOnce` | |
3595 | can move out values from its environment. | |
1a4d82fc | 3596 | |
bd371182 AL |
3597 | * `FnMut` |
3598 | : The closure can be called multiple times as mutable. A closure called as | |
c1a9b12d SL |
3599 | `FnMut` can mutate values from its environment. `FnMut` inherits from |
3600 | `FnOnce` (i.e. anything implementing `FnMut` also implements `FnOnce`). | |
1a4d82fc | 3601 | |
bd371182 AL |
3602 | * `Fn` |
3603 | : The closure can be called multiple times through a shared reference. | |
3604 | A closure called as `Fn` can neither move out from nor mutate values | |
c1a9b12d SL |
3605 | from its environment. `Fn` inherits from `FnMut`, which itself |
3606 | inherits from `FnOnce`. | |
1a4d82fc | 3607 | |
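The three closure traits can be illustrated with one closure per trait. This is a sketch with invented helper names (`call_once`, `call_twice_mut`, `call_twice`, `closure_demo`); each helper accepts a closure through a generic bound on the corresponding trait.

```rust
// Accepts any closure that can be called at least once.
fn call_once<F: FnOnce() -> String>(f: F) -> String {
    f()
}

// Accepts a closure that may mutate its environment; calls it twice and
// returns the second result.
fn call_twice_mut<F: FnMut() -> i32>(mut f: F) -> i32 {
    f();
    f()
}

// Accepts a closure that only reads its environment; calls it twice.
fn call_twice<F: Fn() -> i32>(f: F) -> i32 {
    f() + f()
}

fn closure_demo() -> (String, i32, i32) {
    let owned = String::from("moved");
    // Moves `owned` out of its environment when called: `FnOnce` only.
    let once = move || owned;

    let mut counter = 0;
    // Mutates `counter`: implements `FnMut` (and `FnOnce`), but not `Fn`.
    let bump = || {
        counter += 1;
        counter
    };

    let base = 20;
    // Only reads `base`: implements `Fn`, `FnMut` and `FnOnce`.
    let read = || base + 1;

    (call_once(once), call_twice_mut(bump), call_twice(read))
}
```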
1a4d82fc | 3608 | |
bd371182 | 3609 | ### Trait objects |
1a4d82fc | 3610 | |
d9579d0f AL |
3611 | In Rust, a type like `&SomeTrait` or `Box<SomeTrait>` is called a _trait object_. |
3612 | Each instance of a trait object includes: | |
3613 | ||
3614 | - a pointer to an instance of a type `T` that implements `SomeTrait` | |
3615 | - a _virtual method table_, often just called a _vtable_, which contains, for | |
3616 | each method of `SomeTrait` that `T` implements, a pointer to `T`'s | |
3617 | implementation (i.e. a function pointer). | |
3618 | ||
7453a54e SL |
3619 | The purpose of trait objects is to permit "late binding" of methods. Calling a |
3620 | method on a trait object results in virtual dispatch at runtime: that is, a | |
3621 | function pointer is loaded from the trait object vtable and invoked indirectly. | |
d9579d0f AL |
3622 | The actual implementation for each vtable entry can vary on an object-by-object |
3623 | basis. | |
3624 | ||
3625 | Note that for a trait object to be instantiated, the trait must be | |
3626 | _object-safe_. Object safety rules are defined in [RFC 255]. | |
3627 | ||
3628 | [RFC 255]: https://github.com/rust-lang/rfcs/blob/master/text/0255-object-safety.md | |
1a4d82fc JJ |
3629 | |
3630 | Given a pointer-typed expression `E` of type `&T` or `Box<T>`, where `T` | |
3631 | implements trait `R`, casting `E` to the corresponding pointer type `&R` or | |
bd371182 | 3632 | `Box<R>` results in a value of the _trait object_ `R`. This result is |
1a4d82fc JJ |
3633 | represented as a pair of pointers: the vtable pointer for the `T` |
3634 | implementation of `R`, and the pointer value of `E`. | |
3635 | ||
bd371182 | 3636 | An example of a trait object: |
1a4d82fc JJ |
3637 | |
3638 | ``` | |
1a4d82fc | 3639 | trait Printable { |
e9174d1e | 3640 | fn stringify(&self) -> String; |
1a4d82fc JJ |
3641 | } |
3642 | ||
85aaf69f | 3643 | impl Printable for i32 { |
e9174d1e | 3644 | fn stringify(&self) -> String { self.to_string() } |
1a4d82fc JJ |
3645 | } |
3646 | ||
3647 | fn print(a: Box<Printable>) { | |
e9174d1e | 3648 | println!("{}", a.stringify()); |
1a4d82fc JJ |
3649 | } |
3650 | ||
3651 | fn main() { | |
e9174d1e | 3652 | print(Box::new(10) as Box<Printable>); |
1a4d82fc JJ |
3653 | } |
3654 | ``` | |
3655 | ||
bd371182 | 3656 | In this example, the trait `Printable` occurs as a trait object in both the |
1a4d82fc JJ |
3657 | type signature of `print`, and the cast expression in `main`. |
3658 | ||
3659 | ### Type parameters | |
3660 | ||
3661 | Within the body of an item that has type parameter declarations, the names of | |
3662 | its type parameters are types: | |
3663 | ||
3664 | ```ignore | |
bd371182 | 3665 | fn to_vec<A: Clone>(xs: &[A]) -> Vec<A> { |
9346a6ac | 3666 | if xs.is_empty() { |
e9174d1e | 3667 | return vec![]; |
1a4d82fc | 3668 | } |
bd371182 AL |
3669 | let first: A = xs[0].clone(); |
3670 | let mut rest: Vec<A> = to_vec(&xs[1..]); | |
1a4d82fc | 3671 | rest.insert(0, first); |
bd371182 | 3672 | rest |
1a4d82fc JJ |
3673 | } |
3674 | ``` | |
3675 | ||
bd371182 AL |
3676 | Here, `first` has type `A`, referring to `to_vec`'s `A` type parameter; and `rest` |
3677 | has type `Vec<A>`, a vector with element type `A`. | |
1a4d82fc JJ |
3678 | |
3679 | ### Self types | |
3680 | ||
bd371182 AL |
3681 | The special type `Self` has a meaning within traits and impls. In a trait definition, it refers |
3682 | to an implicit type parameter representing the "implementing" type. In an impl, | |
3683 | it is an alias for the implementing type. For example, in: | |
1a4d82fc JJ |
3684 | |
3685 | ``` | |
3686 | trait Printable { | |
e9174d1e | 3687 | fn make_string(&self) -> String; |
1a4d82fc JJ |
3688 | } |
3689 | ||
3690 | impl Printable for String { | |
3691 | fn make_string(&self) -> String { | |
3692 | (*self).clone() | |
3693 | } | |
3694 | } | |
3695 | ``` | |
3696 | ||
bd371182 AL |
3697 | The notation `&self` is a shorthand for `self: &Self`. In this case, |
3698 | in the impl, `Self` refers to the type `String`, which is the type of the |
3699 | receiver for a call to the method `make_string`. |
1a4d82fc | 3700 | |
62682a34 SL |
3701 | ## Subtyping |
3702 | ||
3703 | Subtyping is implicit and can occur at any stage in type checking or | |
3704 | inference. Subtyping in Rust is very restricted and occurs only due to | |
3705 | variance with respect to lifetimes and between types with higher ranked | |
3706 | lifetimes. If we were to erase lifetimes from types, then the only subtyping | |
3707 | would be due to type equality. | |
3708 | ||
3709 | Consider the following example: string literals always have `'static` | |
3710 | lifetime. Nevertheless, we can assign `s` to `t`: | |
3711 | ||
3712 | ``` | |
3713 | fn bar<'a>() { | |
3714 | let s: &'static str = "hi"; | |
3715 | let t: &'a str = s; | |
3716 | } | |
3717 | ``` | |
3718 | Since `'static` "lives longer" than `'a`, `&'static str` is a subtype of | |
3719 | `&'a str`. | |
3720 | ||
3721 | ## Type coercions | |
3722 | ||
3723 | Coercions are defined in [RFC401]. A coercion is implicit and has no syntax. | |
3724 | ||
3725 | [RFC401]: https://github.com/rust-lang/rfcs/blob/master/text/0401-coercions.md | |
3726 | ||
3727 | ### Coercion sites | |
3728 | ||
3729 | A coercion can only occur at certain coercion sites in a program; these are | |
c1a9b12d | 3730 | typically places where the desired type is explicit or can be derived by |
62682a34 SL |
3731 | propagation from explicit types (without type inference). Possible coercion |
3732 | sites are: | |
3733 | ||
3734 | * `let` statements where an explicit type is given. | |
3735 | ||
9cc50fc6 | 3736 | For example, `42` is coerced to have type `i8` in the following: |
c1a9b12d SL |
3737 | |
3738 | ```rust | |
9cc50fc6 | 3739 | let _: i8 = 42; |
c1a9b12d | 3740 | ``` |
62682a34 SL |
3741 | |
3742 | * `static` and `const` statements (similar to `let` statements). | |
3743 | ||
c1a9b12d SL |
3744 | * Arguments for function calls |
3745 | ||
3746 | The value being coerced is the actual parameter, and it is coerced to | |
3747 | the type of the formal parameter. | |
62682a34 | 3748 | |
9cc50fc6 | 3749 | For example, `42` is coerced to have type `i8` in the following: |
62682a34 | 3750 | |
c1a9b12d SL |
3751 | ```rust |
3752 | fn bar(_: i8) { } | |
62682a34 | 3753 | |
c1a9b12d | 3754 | fn main() { |
9cc50fc6 | 3755 | bar(42); |
c1a9b12d SL |
3756 | } |
3757 | ``` | |
62682a34 | 3758 | |
c1a9b12d | 3759 | * Instantiations of struct or variant fields |
62682a34 | 3760 | |
9cc50fc6 | 3761 | For example, `42` is coerced to have type `i8` in the following: |
c1a9b12d SL |
3762 | |
3763 | ```rust | |
3764 | struct Foo { x: i8 } | |
3765 | ||
3766 | fn main() { | |
9cc50fc6 | 3767 | Foo { x: 42 }; |
c1a9b12d SL |
3768 | } |
3769 | ``` | |
3770 | ||
3771 | * Function results, either the final line of a block if it is not | |
3772 | semicolon-terminated or any expression in a `return` statement | |
3773 | ||
9cc50fc6 | 3774 | For example, `42` is coerced to have type `i8` in the following: |
c1a9b12d SL |
3775 | |
3776 | ```rust | |
3777 | fn foo() -> i8 { | |
9cc50fc6 | 3778 | 42 |
c1a9b12d SL |
3779 | } |
3780 | ``` | |
62682a34 SL |
3781 | |
3782 | If the expression in one of these coercion sites is a coercion-propagating | |
3783 | expression, then the relevant sub-expressions in that expression are also | |
3784 | coercion sites. Propagation recurses from these new coercion sites. | |
3785 | Propagating expressions and their relevant sub-expressions are: | |
3786 | ||
c1a9b12d | 3787 | * Array literals, where the array has type `[U; n]`. Each sub-expression in |
62682a34 SL |
3788 | the array literal is a coercion site for coercion to type `U`. |
3789 | ||
c1a9b12d | 3790 | * Array literals with repeating syntax, where the array has type `[U; n]`. The |
62682a34 SL |
3791 | repeated sub-expression is a coercion site for coercion to type `U`. |
3792 | ||
c1a9b12d | 3793 | * Tuples, where a tuple is a coercion site to type `(U_0, U_1, ..., U_n)`. |
62682a34 SL |
3794 | Each sub-expression is a coercion site to the respective type, e.g. the |
3795 | zeroth sub-expression is a coercion site to type `U_0`. | |
3796 | ||
b039eaaf | 3797 | * Parenthesized sub-expressions (`(e)`): if the expression has type `U`, then |
62682a34 SL |
3798 | the sub-expression is a coercion site to `U`. |
3799 | ||
c1a9b12d | 3800 | * Blocks: if a block has type `U`, then the last expression in the block (if |
62682a34 SL |
3801 | it is not semicolon-terminated) is a coercion site to `U`. This includes |
3802 | blocks which are part of control flow statements, such as `if`/`else`, if | |
3803 | the block has a known type. | |
3804 | ||
3805 | ### Coercion types | |
3806 | ||
3807 | Coercion is allowed between the following types: | |
3808 | ||
c1a9b12d | 3809 | * `T` to `U` if `T` is a subtype of `U` (*reflexive case*) |
62682a34 SL |
3810 | |
3811 | * `T_1` to `T_3` where `T_1` coerces to `T_2` and `T_2` coerces to `T_3` | |
c1a9b12d | 3812 | (*transitive case*) |
62682a34 SL |
3813 | |
3814 | Note that this is not fully supported yet. |
3815 | ||
c1a9b12d | 3816 | * `&mut T` to `&T` |
62682a34 | 3817 | |
c1a9b12d | 3818 | * `*mut T` to `*const T` |
62682a34 | 3819 | |
c1a9b12d | 3820 | * `&T` to `*const T` |
62682a34 | 3821 | |
c1a9b12d | 3822 | * `&mut T` to `*mut T` |
62682a34 SL |
3823 | |
3824 | * `&T` to `&U` if `T` implements `Deref<Target = U>`. For example: | |
3825 | ||
c1a9b12d SL |
3826 | ```rust |
3827 | use std::ops::Deref; | |
62682a34 | 3828 | |
c1a9b12d SL |
3829 | struct CharContainer { |
3830 | value: char | |
3831 | } | |
62682a34 | 3832 | |
c1a9b12d SL |
3833 | impl Deref for CharContainer { |
3834 | type Target = char; | |
62682a34 | 3835 | |
c1a9b12d SL |
3836 | fn deref<'a>(&'a self) -> &'a char { |
3837 | &self.value | |
3838 | } | |
3839 | } | |
62682a34 | 3840 | |
c1a9b12d SL |
3841 | fn foo(arg: &char) {} |
3842 | ||
3843 | fn main() { | |
3844 | let x = &mut CharContainer { value: 'y' }; | |
3845 | foo(x); // &mut CharContainer is coerced to &char. |
3846 | } | |
3847 | ``` | |
62682a34 | 3848 | |
62682a34 SL |
3849 | * `&mut T` to `&mut U` if `T` implements `DerefMut<Target = U>`. |
3850 | ||
3851 | * TyCtor(`T`) to TyCtor(coerce_inner(`T`)), where TyCtor(`T`) is one of | |
3852 | - `&T` | |
3853 | - `&mut T` | |
3854 | - `*const T` | |
3855 | - `*mut T` | |
3856 | - `Box<T>` | |
3857 | ||
3858 | and where | |
3859 | - coerce_inner(`[T; n]`) = `[T]` |
3860 | - coerce_inner(`T`) = `U` where `T` is a concrete type which implements the | |
3861 | trait `U`. | |
3862 | ||
3863 | In the future, coerce_inner will be recursively extended to tuples and | |
3864 | structs. In addition, coercions from sub-traits to super-traits will be | |
3865 | added. See [RFC401] for more details. | |
3866 | ||
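Several of the coercions listed above can be observed at function-argument and `let` coercion sites. The following is a sketch with invented names (`read_shared`, `len_of`, `coercion_demo`); it shows `&mut T` to `&T`, a reference to an array coerced to a slice, and `&T` to `*const T`.

```rust
fn read_shared(r: &i32) -> i32 {
    *r
}

fn len_of(slice: &[i32]) -> usize {
    slice.len()
}

fn coercion_demo() -> (i32, usize, i32) {
    let mut n = 5;
    let m: &mut i32 = &mut n;
    // `&mut i32` is coerced to `&i32` at the argument site.
    let a = read_shared(m);

    let array = [1, 2, 3];
    // `&[i32; 3]` is coerced to `&[i32]` at the argument site.
    let b = len_of(&array);

    // `&i32` is coerced to `*const i32` by the explicit type on the `let`.
    let p: *const i32 = &array[0];
    // Sound because `array` is still alive and is not mutated.
    let c = unsafe { *p };

    (a, b, c)
}
```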
bd371182 | 3867 | # Special traits |
c34b1796 | 3868 | |
bd371182 | 3869 | Several traits define special evaluation behavior. |
c34b1796 | 3870 | |
bd371182 | 3871 | ## The `Copy` trait |
c34b1796 | 3872 | |
bd371182 AL |
3873 | The `Copy` trait changes the semantics of a type implementing it. Values whose |
3874 | type implements `Copy` are copied rather than moved upon assignment. | |
c34b1796 | 3875 | |
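As a brief sketch of the difference, a type that implements `Copy` remains usable after it has been assigned to another variable. The type `Meters` and the function `copy_demo` are invented for illustration.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Meters(u32);

fn copy_demo() -> (Meters, Meters) {
    let a = Meters(3);
    // Because `Meters` implements `Copy`, this assignment copies `a`.
    // Without `Copy`, `a` would be moved and the use of `a` below would
    // be rejected by the compiler.
    let b = a;
    (a, b)
}
```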
bd371182 AL |
3876 | ## The `Sized` trait |
3877 | ||
3878 | The `Sized` trait indicates that the size of this type is known at compile-time. | |
3879 | ||
3880 | ## The `Drop` trait | |
c34b1796 AL |
3881 | |
3882 | The `Drop` trait provides a destructor, to be run whenever a value of this type | |
3883 | is to be destroyed. | |
3884 | ||
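The point at which a destructor runs can be observed by counting calls to `drop`. The sketch below uses an invented type `Noisy` that increments a shared `Cell` counter from its destructor; `count_drops` is likewise a hypothetical name.

```rust
use std::cell::Cell;

// Holds a reference to a counter that is incremented when the value is
// destroyed.
struct Noisy<'a> {
    drops: &'a Cell<u32>,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

fn count_drops() -> u32 {
    let drops = Cell::new(0);
    {
        let _first = Noisy { drops: &drops };
        let _second = Noisy { drops: &drops };
        // Both values go out of scope at the end of this block, so both
        // destructors run here.
    }
    drops.get()
}
```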
62682a34 SL |
3885 | ## The `Deref` trait |
3886 | ||
3887 | The `Deref<Target = U>` trait allows a type to implicitly implement all the methods | |
3888 | of the type `U`. When attempting to resolve a method call, the compiler will search | |
3889 | the top-level type for the implementation of the called method. If no such method is | |
3890 | found, `.deref()` is called and the compiler continues to search for the method | |
3891 | implementation in the returned type `U`. | |
3892 | ||
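The method lookup described above can be sketched with a wrapper type that defines no methods of its own. The names `Wrapper` and `wrapped_len` are invented for illustration.

```rust
use std::ops::Deref;

struct Wrapper {
    inner: String,
}

impl Deref for Wrapper {
    type Target = String;

    fn deref(&self) -> &String {
        &self.inner
    }
}

fn wrapped_len(text: &str) -> usize {
    let w = Wrapper { inner: text.to_string() };
    // `Wrapper` has no `len` method, so the compiler dereferences `w`
    // and resolves the call to `String::len`.
    w.len()
}
```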
c34b1796 AL |
3893 | # Memory model |
3894 | ||
3895 | A Rust program's memory consists of a static set of *items* and a *heap*. | |
bd371182 AL |
3896 | Immutable portions of the heap may be safely shared between threads. Mutable |
3897 | portions may not be safely shared directly, but the standard library provides |
3898 | several mechanisms for effectively-safe sharing of mutable values, built on |
3899 | unsafe code but enforcing a safe locking discipline. |
1a4d82fc | 3900 | |
bd371182 | 3901 | Allocations in the stack consist of *variables*, and allocations in the heap |
1a4d82fc JJ |
3902 | consist of *boxes*. |
3903 | ||
3904 | ### Memory allocation and lifetime | |
3905 | ||
3906 | The _items_ of a program are those functions, modules and types that have their | |
3907 | value calculated at compile-time and stored uniquely in the memory image of the | |
3908 | Rust process. Items are neither dynamically allocated nor freed. |
3909 | ||
1a4d82fc JJ |
3910 | The _heap_ is a general term that describes boxes. The lifetime of an |
3911 | allocation in the heap depends on the lifetime of the box values pointing to | |
3912 | it. Since box values may themselves be passed in and out of frames, or stored | |
3913 | in the heap, heap allocations may outlive the frame they are allocated within. | |
54a0048b SL |
3914 | An allocation in the heap is guaranteed to reside at a single location in the |
3915 | heap for the whole lifetime of the allocation - it will never be relocated as | |
3916 | a result of moving a box value. | |
1a4d82fc JJ |
3917 | |
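The guarantee that a heap allocation is not relocated when its box is moved can be observed by comparing addresses before and after a move. This is a sketch only, and the function name `box_address_is_stable` is invented for illustration.

```rust
fn box_address_is_stable() -> bool {
    let b = Box::new(42u64);
    // Record the address of the heap allocation as a raw pointer.
    let before = &*b as *const u64;
    // Moving the box value transfers ownership; only the pointer moves.
    let moved = b;
    let after = &*moved as *const u64;
    before == after
}
```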
3918 | ### Memory ownership | |
3919 | ||
1a4d82fc JJ |
3920 | When a stack frame is exited, its local allocations are all released, and its |
3921 | references to boxes are dropped. | |
3922 | ||
bd371182 | 3923 | ### Variables |
1a4d82fc | 3924 | |
bd371182 | 3925 | A _variable_ is a component of a stack frame, either a named function parameter, |
b039eaaf | 3926 | an anonymous [temporary](#lvalues-rvalues-and-temporaries), or a named local |
bd371182 | 3927 | variable. |
1a4d82fc JJ |
3928 | |
3929 | A _local variable_ (or *stack-local* allocation) holds a value directly, | |
3930 | allocated within the stack's memory. The value is a part of the stack frame. | |
3931 | ||
3932 | Local variables are immutable unless declared with `mut`, as in `let mut x = ...`. |
3933 | ||
3934 | Function parameters are immutable unless declared with `mut`. The `mut` keyword | |
3935 | applies only to the following parameter (so `|mut x, y|` and `fn f(mut x: | |
85aaf69f | 3936 | Box<i32>, y: Box<i32>)` declare one mutable variable `x` and one immutable |
1a4d82fc JJ |
3937 | variable `y`). |
3938 | ||
3939 | Methods that take either `self` or `Box<Self>` can optionally place them in a | |
bd371182 | 3940 | mutable variable by prefixing them with `mut` (similar to regular arguments): |
1a4d82fc JJ |
3941 | |
3942 | ``` | |
3943 | trait Changer { | |
3944 | fn change(mut self) -> Self; | |
3945 | fn modify(mut self: Box<Self>) -> Box<Self>; | |
3946 | } | |
3947 | ``` | |
3948 | ||
3949 | Local variables are not initialized when allocated; the entire frame's worth of |
3950 | local variables is allocated at once, on frame entry, in an uninitialized |
3951 | state. Subsequent statements within a function may or may not initialize the | |
3952 | local variables. Local variables can be used only after they have been | |
3953 | initialized; this is enforced by the compiler. | |
3954 | ||
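Deferred initialization can be sketched as follows. The function name `describe` is invented for illustration. The variable `label` is declared without a value, and the compiler accepts its later use only because every path through the `if` initializes it first.

```rust
fn describe(n: i32) -> &'static str {
    // Declared but not yet initialized.
    let label;
    if n < 0 {
        label = "negative";
    } else {
        label = "non-negative";
    }
    // Allowed: `label` is initialized on every path that reaches here.
    label
}
```

Removing the `else` branch would leave `label` possibly uninitialized at the point of use, which the compiler rejects.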
bd371182 | 3955 | # Linkage |
1a4d82fc JJ |
3956 | |
3957 | The Rust compiler supports various methods to link crates together both | |
3958 | statically and dynamically. This section will explore the various methods to | |
3959 | link Rust crates together, and more information about native libraries can be | |
b039eaaf | 3960 | found in the [FFI section of the book][ffi]. |
1a4d82fc JJ |
3961 | |
3962 | In one session of compilation, the compiler can generate multiple artifacts | |
3963 | through the usage of either command line flags or the `crate_type` attribute. | |
b039eaaf | 3964 | If one or more command line flags are specified, all `crate_type` attributes will |
1a4d82fc JJ |
3965 | be ignored in favor of only building the artifacts specified by command line. |
3966 | ||
3967 | * `--crate-type=bin`, `#[crate_type = "bin"]` - A runnable executable will be | |
3968 | produced. This requires that there is a `main` function in the crate which | |
3969 | will be run when the program begins executing. This will link in all Rust and | |
3970 | native dependencies, producing a distributable binary. | |
3971 | ||
3972 | * `--crate-type=lib`, `#[crate_type = "lib"]` - A Rust library will be produced. | |
3973 | What exactly is produced is ambiguous, because a library | |
3974 | can manifest itself in several forms. The purpose of this generic `lib` option | |
3975 | is to generate the "compiler recommended" style of library. The output library | |
3976 | will always be usable by rustc, but the actual type of library may change from | |
3977 | time to time. The remaining output types are all different flavors of | |
3978 | libraries, and the `lib` type can be seen as an alias for one of them (but the | |
3979 | actual one is compiler-defined). | |
3980 | ||
3981 | * `--crate-type=dylib`, `#[crate_type = "dylib"]` - A dynamic Rust library will | |
3982 | be produced. This is different from the `lib` output type in that this forces | |
3983 | dynamic library generation. The resulting dynamic library can be used as a | |
3984 | dependency for other libraries and/or executables. This output type will | |
3985 | create `*.so` files on Linux, `*.dylib` files on OS X, and `*.dll` files on | |
3986 | Windows. | |
3987 | ||
3988 | * `--crate-type=staticlib`, `#[crate_type = "staticlib"]` - A static system | |
3989 | library will be produced. This is different from other library outputs in that | |
3990 | the Rust compiler will never attempt to link to `staticlib` outputs. The | |
3991 | purpose of this output type is to create a static library containing all of | |
3992 | the local crate's code along with all upstream dependencies. The static | |
3993 | library is actually a `*.a` archive on Linux and OS X and a `*.lib` file on | |
3994 | Windows. This format is recommended for use in situations such as linking | |
3995 | Rust code into an existing non-Rust application because it will not have | |
3996 | dynamic dependencies on other Rust code. | |
3997 | ||
3998 | * `--crate-type=rlib`, `#[crate_type = "rlib"]` - A "Rust library" file will be | |
3999 | produced. This is used as an intermediate artifact and can be thought of as a | |
4000 | "static Rust library". These `rlib` files, unlike `staticlib` files, are | |
4001 | interpreted by the Rust compiler in future linkage. This essentially means | |
4002 | that `rustc` will look for metadata in `rlib` files like it looks for metadata | |
4003 | in dynamic libraries. This form of output is used to produce statically linked | |
4004 | executables as well as `staticlib` outputs. | |
4005 | ||
4006 | Note that these outputs are stackable in the sense that if multiple are | |
4007 | specified, then the compiler will produce each form of output at once without | |
4008 | having to recompile. However, this only applies for outputs specified by the | |
4009 | same method. If only `crate_type` attributes are specified, then they will all | |
b039eaaf | 4010 | be built, but if one or more `--crate-type` command line flags are specified, |
1a4d82fc JJ |
4011 | then only those outputs will be built. |
4012 | ||
4013 | With all these different kinds of outputs, if crate A depends on crate B, then | |
4014 | the compiler could find B in various different forms throughout the system. The | |
4015 | only forms looked for by the compiler, however, are the `rlib` format and the | |
4016 | dynamic library format. With these two options for a dependent library, the | |
4017 | compiler must at some point make a choice between these two formats. With this | |
4018 | in mind, the compiler follows these rules when determining what format of | |
4019 | dependencies will be used: | |
4020 | ||
4021 | 1. If a static library is being produced, all upstream dependencies are | |
4022 | required to be available in the `rlib` format. This requirement exists because | |
4023 | a dynamic library cannot be converted into a static format. | |
4024 | ||
4025 | Note that it is impossible to link in native dynamic dependencies to a static | |
4026 | library, and in this case warnings will be printed about all unlinked native | |
4027 | dynamic dependencies. | |
4028 | ||
4029 | 2. If an `rlib` file is being produced, then there are no restrictions on what | |
4030 | format the upstream dependencies are available in. It is simply required that | |
4031 | all upstream dependencies be available for reading metadata from. | |
4032 | ||
4033 | The reason for this is that `rlib` files do not contain any of their upstream | |
4034 | dependencies. It wouldn't be very efficient for all `rlib` files to contain a | |
4035 | copy of `libstd.rlib`! | |
4036 | ||
4037 | 3. If an executable is being produced and the `-C prefer-dynamic` flag is not | |
4038 | specified, then the compiler first attempts to find dependencies in the `rlib` | |
4039 | format. If some dependencies are not available in the `rlib` format, then | |
4040 | dynamic linking is attempted (see below). | |
4041 | ||
4042 | 4. If a dynamic library or an executable that is being dynamically linked is | |
4043 | being produced, then the compiler will attempt to reconcile the available | |
4044 | dependencies in either the rlib or dylib format to create a final product. | |
4045 | ||
4046 | A major goal of the compiler is to ensure that a library never appears more | |
4047 | than once in any artifact. For example, if dynamic libraries B and C were | |
4048 | each statically linked to library A, then a crate could not link to B and C | |
4049 | together because there would be two copies of A. The compiler allows mixing | |
4050 | the rlib and dylib formats, but this restriction must be satisfied. | |
4051 | ||
4052 | The compiler currently implements no method of hinting what format a library | |
4053 | should be linked with. When dynamically linking, the compiler will attempt to | |
4054 | maximize dynamic dependencies while still allowing some dependencies to be | |
4055 | linked in via an rlib. | |
4056 | ||
4057 | For most situations, having all libraries available as a dylib is recommended | |
4058 | if dynamically linking. For other situations, the compiler will emit a | |
4059 | warning if it is unable to determine which formats to link each library with. | |
4060 | ||
4061 | In general, `--crate-type=bin` or `--crate-type=lib` should be sufficient for | |
4062 | all compilation needs, and the other options are just available if more | |
4063 | fine-grained control is desired over the output format of a Rust crate. | |
4064 | ||
b039eaaf SL |
4065 | # Unsafety |
4066 | ||
4067 | Unsafe operations are those that potentially violate the memory-safety | |
4068 | guarantees of Rust's static semantics. | |
4069 | ||
4070 | The following language level features cannot be used in the safe subset of | |
4071 | Rust: | |
4072 | ||
4073 | - Dereferencing a [raw pointer](#pointer-types). | |
4074 | - Reading or writing a [mutable static variable](#mutable-statics). | |
4075 | - Calling an unsafe function (including an intrinsic or foreign function). | |
4076 | ||
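A minimal sketch of the three operations listed above, each of which is accepted only inside an `unsafe` context. The names `COUNTER`, `read_raw` and `three_unsafe_operations` are hypothetical and chosen only for illustration:

```rust
static mut COUNTER: u32 = 0;

// Hypothetical unsafe function: the caller must pass a valid, aligned pointer.
unsafe fn read_raw(p: *const u32) -> u32 {
    unsafe { *p } // dereferencing a raw pointer
}

fn three_unsafe_operations() -> (u32, u32) {
    let local: u32 = 41;
    let p = &local as *const u32; // creating a raw pointer is safe
    unsafe {
        COUNTER += 1;           // writing a mutable static
        let seen = COUNTER;     // reading a mutable static
        (read_raw(p) + 1, seen) // calling an unsafe function
    }
}
```

Removing the `unsafe` block from `three_unsafe_operations` causes the compiler to reject all three marked lines. The sketch assumes single-threaded use; concurrent access to `COUNTER` would be a data race.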
4077 | ## Unsafe functions | |
4078 | ||
4079 | Unsafe functions are functions that are not safe in all contexts and/or for all | |
4080 | possible inputs. Such a function must be prefixed with the keyword `unsafe` and | |
4081 | can only be called from an `unsafe` block or another `unsafe` function. | |
4082 | ||
4083 | ## Unsafe blocks | |
4084 | ||
4085 | A block of code can be prefixed with the `unsafe` keyword, to permit calling | |
4086 | `unsafe` functions or dereferencing raw pointers within a safe function. | |
4087 | ||
4088 | When a programmer has sufficient conviction that a sequence of potentially | |
4089 | unsafe operations is actually safe, they can encapsulate that sequence (taken | |
4090 | as a whole) within an `unsafe` block. The compiler will consider uses of such | |
4091 | code safe, in the surrounding context. | |
4092 | ||
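As a sketch of such encapsulation, the hypothetical function `first_or_zero` below wraps a call to the unsafe standard library method `get_unchecked` behind a bounds check, so that callers see an ordinary safe function:

```rust
fn first_or_zero(values: &[i32]) -> i32 {
    if values.is_empty() {
        return 0;
    }
    // Sound: the emptiness check above guarantees that index 0 is in bounds,
    // which is the precondition `get_unchecked` relies on.
    unsafe { *values.get_unchecked(0) }
}
```

The compiler does not verify the reasoning in the comment; the programmer takes responsibility for upholding it.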
4093 | Unsafe blocks are used to wrap foreign libraries, make direct use of hardware | |
4094 | or implement features not directly present in the language. For example, Rust | |
4095 | provides the language features necessary to implement memory-safe concurrency | |
4096 | in the language, but the implementation of threads and message passing is in the | |
4097 | standard library. | |
4098 | ||
4099 | Rust's type system is a conservative approximation of the dynamic safety | |
4100 | requirements, so in some cases there is a performance cost to using safe code. | |
4101 | For example, a doubly-linked list is not a tree structure and can only be | |
4102 | represented with reference-counted pointers in safe code. By using `unsafe` | |
4103 | blocks to represent the reverse links as raw pointers, it can be implemented | |
4104 | with only boxes. | |
4105 | ||
4106 | ## Behavior considered undefined | |
4107 | ||
4108 | The following is a list of behavior which is forbidden in all Rust code, | |
4109 | including within `unsafe` blocks and `unsafe` functions. Type checking provides | |
4110 | the guarantee that these issues are never caused by safe code. | |
1a4d82fc | 4111 | |
b039eaaf SL |
4112 | * Data races |
4113 | * Dereferencing a null/dangling raw pointer | |
4114 | * Reads of [undef](http://llvm.org/docs/LangRef.html#undefined-values) | |
4115 | (uninitialized) memory | |
4116 | * Breaking the [pointer aliasing | |
4117 | rules](http://llvm.org/docs/LangRef.html#pointer-aliasing-rules) | |
4118 | with raw pointers (a subset of the rules used by C) | |
7453a54e | 4119 | * `&mut T` and `&T` follow LLVM’s scoped [noalias] model, except if the `&T` |
b039eaaf SL |
4120 | contains an `UnsafeCell<U>`. Unsafe code must not violate these aliasing |
4121 | guarantees. | |
4122 | * Mutating non-mutable data (that is, data reached through a shared reference or | |
4123 | data owned by a `let` binding), unless that data is contained within an `UnsafeCell<U>`. | |
4124 | * Invoking undefined behavior via compiler intrinsics: | |
4125 | * Indexing outside of the bounds of an object with `std::ptr::offset` | |
4126 | (`offset` intrinsic), with | |
4127 | the exception of one byte past the end which is permitted. | |
4128 | * Using `std::ptr::copy_nonoverlapping` (`memcpy32`/`memcpy64` | |
4129 | intrinsics) on overlapping buffers | |
4130 | * Invalid values in primitive types, even in private fields/locals: | |
4131 | * Dangling/null references or boxes | |
4132 | * A value other than `false` (0) or `true` (1) in a `bool` | |
4133 | * A discriminant in an `enum` not included in the type definition | |
4134 | * A value in a `char` which is a surrogate or above `char::MAX` | |
4135 | * Non-UTF-8 byte sequences in a `str` | |
4136 | * Unwinding into Rust from foreign code or unwinding from Rust into foreign | |
4137 | code. Rust's unwinding mechanism is not compatible with exception handling in | |
4138 | other languages. Unwinding must be caught and handled at FFI boundaries. | |
4139 | ||
4140 | [noalias]: http://llvm.org/docs/LangRef.html#noalias | |
4141 | ||
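The `UnsafeCell<U>` exception in the list above can be sketched as follows. The type `Counter` is hypothetical; it mutates data reached through a shared reference, which is permitted only because the data is contained within an `UnsafeCell`:

```rust
use std::cell::UnsafeCell;

// Hypothetical type with interior mutability.
struct Counter {
    hits: UnsafeCell<u32>,
}

impl Counter {
    fn new() -> Counter {
        Counter { hits: UnsafeCell::new(0) }
    }

    // Takes `&self`, yet updates the count.
    fn hit(&self) -> u32 {
        // `get` returns a `*mut u32`. No reference to the contents escapes
        // this method, and `UnsafeCell` is not `Sync`, so a `Counter` cannot
        // be shared across threads.
        unsafe {
            let p = self.hits.get();
            *p += 1;
            *p
        }
    }
}
```

Performing the same write through a plain `u32` field behind `&self`, for example by casting the shared reference to a mutable raw pointer, would be undefined behavior under the rules listed above.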
4142 | ## Behavior not considered unsafe | |
4143 | ||
4144 | This is a list of behavior not considered *unsafe* in Rust terms, but that may | |
4145 | be undesired. | |
4146 | ||
4147 | * Deadlocks | |
4148 | * Leaks of memory and other resources | |
4149 | * Exiting without calling destructors | |
4150 | * Integer overflow | |
4151 | - Overflow is considered "unexpected" behavior and is always a user error, | |
4152 | unless the `wrapping` primitives are used. In non-optimized builds, the compiler | |
4153 | will insert debug checks that panic on overflow, but in optimized builds overflow | |
4154 | instead results in wrapped values. See [RFC 560] for the rationale and more details. | |
4155 | ||
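Where overflow is intended or needs to be detected, the standard integer methods make the choice explicit. A small sketch, using a hypothetical function `overflow_demo`:

```rust
fn overflow_demo() -> (u8, Option<u8>, (u8, bool), u8) {
    let x: u8 = 250;
    (
        x.wrapping_add(10),    // wraps modulo 256
        x.checked_add(10),     // `None` on overflow
        x.overflowing_add(10), // wrapped value plus an overflow flag
        x.saturating_add(10),  // clamps at `u8::MAX`
    )
}
```

These methods behave the same in optimized and non-optimized builds, unlike the plain `+` operator described above.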
4156 | [RFC 560]: https://github.com/rust-lang/rfcs/blob/master/text/0560-integer-overflow.md | |
1a4d82fc JJ |
4157 | |
4158 | # Appendix: Influences | |
4159 | ||
4160 | Rust is not a particularly original language, with design elements coming from | |
4161 | a wide range of sources. Some of these are listed below (including elements | |
4162 | that have since been removed): | |
4163 | ||
c1a9b12d | 4164 | * SML, OCaml: algebraic data types, pattern matching, type inference, |
1a4d82fc | 4165 | semicolon statement separation |
b039eaaf | 4166 | * C++: references, RAII, smart pointers, move semantics, monomorphization, |
1a4d82fc JJ |
4167 | memory model |
4168 | * ML Kit, Cyclone: region based memory management | |
4169 | * Haskell (GHC): typeclasses, type families | |
4170 | * Newsqueak, Alef, Limbo: channels, concurrency | |
bd371182 | 4171 | * Erlang: message passing, thread failure, ~~linked thread failure~~, |
1a4d82fc JJ |
4172 | ~~lightweight concurrency~~ |
4173 | * Swift: optional bindings | |
4174 | * Scheme: hygienic macros | |
4175 | * C#: attributes | |
4176 | * Ruby: ~~block syntax~~ | |
4177 | * NIL, Hermes: ~~typestate~~ | |
4178 | * [Unicode Annex #31](http://www.unicode.org/reports/tr31/): identifier and | |
4179 | pattern syntax | |
4180 | ||
4181 | [ffi]: book/ffi.html | |
bd371182 | 4182 | [plugin]: book/compiler-plugins.html |