Commit | Line | Data |
---|---|---|
8bb4bdeb XL |
1 | # Tokens |
2 | ||
3 | Tokens are primitive productions in the grammar defined by regular | |
b7449926 XL |
4 | (non-recursive) languages. Rust source input can be broken down |
5 | into the following kinds of tokens: | |
6 | ||
7 | * [Keywords] | |
8 | * [Identifiers][identifier] | |
9 | * [Literals](#literals) | |
10 | * [Lifetimes](#lifetimes-and-loop-labels) | |
11 | * [Punctuation](#punctuation) | |
12 | * [Delimiters](#delimiters) | |
13 | ||
14 | Within this documentation's grammar, "simple" tokens are given in [string | |
15 | table production] form, and appear in `monospace` font. | |
8bb4bdeb | 16 | |
416331ca | 17 | [string table production]: notation.md#string-table-productions |
8bb4bdeb XL |
18 | |
19 | ## Literals | |
20 | ||
21 | A literal is an expression consisting of a single token, rather than a sequence | |
22 | of tokens, that immediately and directly denotes the value it evaluates to, | |
23 | rather than referring to it by name or some other evaluation rule. A literal is | |
416331ca | 24 | a form of [constant expression](const_eval.md#constant-expressions), so it is |
041b39d2 | 25 | evaluated (primarily) at compile time. |
8bb4bdeb XL |
26 | |
27 | ### Examples | |
28 | ||
29 | #### Characters and strings | |
30 | ||
31 | | | Example | `#` sets | Characters | Escapes | | |
94b46f34 XL |
32 | |----------------------------------------------|-----------------|-------------|-------------|---------------------| |
33 | | [Character](#character-literals) | `'H'` | 0 | All Unicode | [Quote](#quote-escapes) & [ASCII](#ascii-escapes) & [Unicode](#unicode-escapes) | | |
34 | | [String](#string-literals) | `"hello"` | 0 | All Unicode | [Quote](#quote-escapes) & [ASCII](#ascii-escapes) & [Unicode](#unicode-escapes) | | |
dc9dc135 | 35 | | [Raw string](#raw-string-literals) | `r#"hello"#` | 0 or more\* | All Unicode | `N/A` | |
94b46f34 XL |
36 | | [Byte](#byte-literals) | `b'H'` | 0 | All ASCII | [Quote](#quote-escapes) & [Byte](#byte-escapes) | |
37 | | [Byte string](#byte-string-literals) | `b"hello"` | 0 | All ASCII | [Quote](#quote-escapes) & [Byte](#byte-escapes) | | |
38 | | [Raw byte string](#raw-byte-string-literals) | `br#"hello"#` | 0 or more\* | All ASCII | `N/A` | | |
39 | ||
40 | \* The number of `#`s on each side of the same literal must be equal. | |
8bb4bdeb | 41 | |
ea8adc8c XL |
42 | #### ASCII escapes |
43 | ||
44 | | | Name | | |
45 | |---|------| | |
46 | | `\x41` | 7-bit character code (exactly 2 digits, up to 0x7F) | | |
47 | | `\n` | Newline | | |
48 | | `\r` | Carriage return | | |
49 | | `\t` | Tab | | |
50 | | `\\` | Backslash | | |
51 | | `\0` | Null | | |
52 | ||
8bb4bdeb XL |
53 | #### Byte escapes |
54 | ||
55 | | | Name | | |
56 | |---|------| | |
57 | | `\x7F` | 8-bit character code (exactly 2 digits) | | |
58 | | `\n` | Newline | | |
59 | | `\r` | Carriage return | | |
60 | | `\t` | Tab | | |
61 | | `\\` | Backslash | | |
62 | | `\0` | Null | | |
63 | ||
64 | #### Unicode escapes | |
65 | ||
66 | | | Name | | |
67 | |---|------| | |
68 | | `\u{7FFF}` | 24-bit Unicode character code (up to 6 digits) | | |
69 | ||
70 | #### Quote escapes | |
71 | ||
72 | | | Name | | |
73 | |---|------| | |
74 | | `\'` | Single quote | | |
75 | | `\"` | Double quote | | |
76 | ||
77 | #### Numbers | |
78 | ||
79 | | [Number literals](#number-literals)`*` | Example | Exponentiation | Suffixes | | |
80 | |----------------------------------------|---------|----------------|----------| | |
81 | | Decimal integer | `98_222` | `N/A` | Integer suffixes | | |
82 | | Hex integer | `0xff` | `N/A` | Integer suffixes | | |
83 | | Octal integer | `0o77` | `N/A` | Integer suffixes | | |
84 | | Binary integer | `0b1111_0000` | `N/A` | Integer suffixes | | |
85 | | Floating-point | `123.0E+77` | `Optional` | Floating-point suffixes | | |
86 | ||
87 | `*` All number literals allow `_` as a visual separator: `1_234.0E+18f64` | |
88 | ||
89 | #### Suffixes | |
90 | ||
dc9dc135 XL |
91 | A suffix is a non-raw identifier immediately (without whitespace) |
92 | following the primary part of a literal. | |
93 | ||
94 | Any kind of literal (string, integer, etc.) with any suffix is valid as a token, | |
e1599b0c | 95 | and can be passed to a macro without producing an error. |
dc9dc135 XL |
96 | The macro itself will decide how to interpret such a token and whether to produce an error or not. |
97 | ||
98 | ```rust | |
99 | macro_rules! blackhole { ($tt:tt) => () } | |
100 | ||
101 | blackhole!("string"suffix); // OK | |
102 | ``` | |
103 | ||
e1599b0c | 104 | However, suffixes on literal tokens parsed as Rust code are restricted. |
dc9dc135 XL |
105 | Any suffixes are rejected on non-numeric literal tokens, |
106 | and numeric literal tokens are accepted only with suffixes from the list below. | |
107 | ||
8bb4bdeb XL |
108 | | Integer | Floating-point | |
109 | |---------|----------------| | |
94b46f34 | 110 | | `u8`, `i8`, `u16`, `i16`, `u32`, `i32`, `u64`, `i64`, `u128`, `i128`, `usize`, `isize` | `f32`, `f64` | |
8bb4bdeb XL |
111 | |
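The two rules above can be seen side by side in a minimal sketch. The macro `token_text` and the functions `suffixed_tokens` and `suffix_sizes` are hypothetical names introduced only for this illustration. The first function passes arbitrarily suffixed literals through a macro as tokens, and the second uses suffixes from the table on literals that are parsed as ordinary Rust code.

```rust
use std::mem::size_of_val;

// Hypothetical helper: returns the source text of the single token tree it receives.
macro_rules! token_text {
    ($tt:tt) => {
        stringify!($tt)
    };
}

// Arbitrary suffixes are fine while the literal is only a token in macro input.
fn suffixed_tokens() -> (&'static str, &'static str) {
    (token_text!("string"suffix), token_text!(1custom))
}

// In ordinary code, only the suffixes from the table are accepted on numeric literals.
fn suffix_sizes() -> (usize, usize) {
    (size_of_val(&1u8), size_of_val(&1.0f32))
}

fn main() {
    let (string_token, number_token) = suffixed_tokens();
    println!("{} {}", string_token, number_token);
    println!("{:?}", suffix_sizes());
}
```

Writing `let x = 1custom;` outside of a macro would be rejected, because at that point the literal is parsed as Rust code.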
112 | ### Character and string literals | |
113 | ||
114 | #### Character literals | |
115 | ||
8faf50e0 XL |
116 | > **<sup>Lexer</sup>**\ |
117 | > CHAR_LITERAL :\ | |
29967ef6 | 118 | > `'` ( ~\[`'` `\` \\n \\r \\t] | QUOTE_ESCAPE | ASCII_ESCAPE | UNICODE_ESCAPE ) `'` |
8faf50e0 XL |
119 | > |
120 | > QUOTE_ESCAPE :\ | |
121 | > `\'` | `\"` | |
122 | > | |
123 | > ASCII_ESCAPE :\ | |
124 | > `\x` OCT_DIGIT HEX_DIGIT\ | |
125 | > | `\n` | `\r` | `\t` | `\\` | `\0` | |
126 | > | |
127 | > UNICODE_ESCAPE :\ | |
128 | > `\u{` ( HEX_DIGIT `_`<sup>\*</sup> )<sup>1..6</sup> `}` | |
ea8adc8c | 129 | |
8bb4bdeb XL |
130 | A _character literal_ is a single Unicode character enclosed within two |
131 | `U+0027` (single-quote) characters, with the exception of `U+0027` itself, | |
132 | which must be _escaped_ by a preceding `U+005C` character (`\`). | |
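As a small illustration of the quoting rule, the sketch below builds three character literals and checks their code points. The function name `char_examples` is made up for this example.

```rust
// Hypothetical helper returning a plain character, an escaped single quote,
// and a double quote (which needs no escape inside a character literal).
fn char_examples() -> [char; 3] {
    ['H', '\'', '"']
}

fn main() {
    let chars = char_examples();
    assert_eq!(chars[0] as u32, 0x48); // U+0048
    assert_eq!(chars[1] as u32, 0x27); // U+0027, written with a backslash
    assert_eq!(chars[2], '\"'); // the escaped and unescaped forms are the same char
}
```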
133 | ||
134 | #### String literals | |
135 | ||
8faf50e0 XL |
136 | > **<sup>Lexer</sup>**\ |
137 | > STRING_LITERAL :\ | |
138 | > `"` (\ | |
29967ef6 | 139 | > ~\[`"` `\` _IsolatedCR_]\ |
8faf50e0 XL |
140 | > | QUOTE_ESCAPE\ |
141 | > | ASCII_ESCAPE\ | |
142 | > | UNICODE_ESCAPE\ | |
143 | > | STRING_CONTINUE\ | |
144 | > )<sup>\*</sup> `"` | |
145 | > | |
146 | > STRING_CONTINUE :\ | |
147 | > `\` _followed by_ \\n | |
ea8adc8c | 148 | |
8bb4bdeb XL |
149 | A _string literal_ is a sequence of any Unicode characters enclosed within two |
150 | `U+0022` (double-quote) characters, with the exception of `U+0022` itself, | |
151 | which must be _escaped_ by a preceding `U+005C` character (`\`). | |
152 | ||
e1599b0c XL |
153 | Line-breaks are allowed in string literals. A line-break is either a newline |
154 | (`U+000A`) or a pair of carriage return and newline (`U+000D`, `U+000A`). Both | |
155 | byte sequences are normally translated to `U+000A`, but as a special exception, | |
156 | when an unescaped `U+005C` character (`\`) occurs immediately before the | |
ba9703b0 | 157 | line-break, then the `U+005C` character, the line-break, and all whitespace at the |
e1599b0c | 158 | beginning of the next line are ignored. Thus `a` and `b` are equal: |
8bb4bdeb XL |
159 | |
160 | ```rust | |
161 | let a = "foobar"; | |
162 | let b = "foo\ | |
163 | bar"; | |
164 | ||
165 | assert_eq!(a,b); | |
166 | ``` | |
167 | ||
168 | #### Character escapes | |
169 | ||
170 | Some additional _escapes_ are available in either character or non-raw string | |
171 | literals. An escape starts with a `U+005C` (`\`) and continues with one of the | |
172 | following forms: | |
173 | ||
94b46f34 | 174 | * A _7-bit code point escape_ starts with `U+0078` (`x`) and is |
0531ce1d XL |
175 | followed by exactly two _hex digits_ with value up to `0x7F`. It denotes the |
176 | ASCII character with value equal to the provided hex value. Higher values are | |
177 | not permitted because it is ambiguous whether they mean Unicode code points or | |
178 | byte values. | |
8bb4bdeb XL |
179 | * A _24-bit code point escape_ starts with `U+0075` (`u`) and is followed |
180 | by up to six _hex digits_ surrounded by braces `U+007B` (`{`) and `U+007D` | |
181 | (`}`). It denotes the Unicode code point equal to the provided hex value. | |
182 | * A _whitespace escape_ is one of the characters `U+006E` (`n`), `U+0072` | |
183 | (`r`), or `U+0074` (`t`), denoting the Unicode values `U+000A` (LF), | |
184 | `U+000D` (CR) or `U+0009` (HT) respectively. | |
185 | * The _null escape_ is the character `U+0030` (`0`) and denotes the Unicode | |
186 | value `U+0000` (NUL). | |
187 | * The _backslash escape_ is the character `U+005C` (`\`) which must be | |
0531ce1d | 188 | escaped in order to denote itself. |
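The escape forms listed above can be checked with a short sketch. The function name `escaped_chars` is an assumption made for this illustration; the string combines one escape of each kind.

```rust
// Hypothetical helper: collects the characters of a string built only from escapes.
fn escaped_chars() -> Vec<char> {
    // 7-bit escape, two 24-bit escapes, whitespace escapes, null, backslash
    "\x41\u{42}\u{1F980}\n\r\t\0\\".chars().collect()
}

fn main() {
    let chars = escaped_chars();
    assert_eq!(chars.len(), 8);
    assert_eq!(chars[0], 'A'); // \x41
    assert_eq!(chars[1], 'B'); // \u{42}
    assert_eq!(chars[2] as u32, 0x1F980); // \u{1F980}
    assert_eq!(chars[7], '\\'); // the backslash escape denotes itself
}
```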
8bb4bdeb XL |
189 | |
190 | #### Raw string literals | |
191 | ||
8faf50e0 XL |
192 | > **<sup>Lexer</sup>**\ |
193 | > RAW_STRING_LITERAL :\ | |
194 | > `r` RAW_STRING_CONTENT | |
195 | > | |
196 | > RAW_STRING_CONTENT :\ | |
197 | > `"` ( ~ _IsolatedCR_ )<sup>* (non-greedy)</sup> `"`\ | |
198 | > | `#` RAW_STRING_CONTENT `#` | |
ea8adc8c | 199 | |
8bb4bdeb XL |
200 | Raw string literals do not process any escapes. They start with the character |
201 | `U+0072` (`r`), followed by zero or more of the character `U+0023` (`#`) and a | |
202 | `U+0022` (double-quote) character. The _raw string body_ can contain any sequence | |
203 | of Unicode characters and is terminated only by another `U+0022` (double-quote) | |
204 | character, followed by the same number of `U+0023` (`#`) characters that preceded | |
205 | the opening `U+0022` (double-quote) character. | |
206 | ||
207 | All Unicode characters contained in the raw string body represent themselves; | |
208 | the characters `U+0022` (double-quote) (except when followed by at least as | |
209 | many `U+0023` (`#`) characters as were used to start the raw string literal) or | |
210 | `U+005C` (`\`) do not have any special meaning. | |
211 | ||
212 | Examples for string literals: | |
213 | ||
cc61c64b | 214 | ```rust |
8bb4bdeb XL |
215 | "foo"; r"foo"; // foo |
216 | "\"foo\""; r#""foo""#; // "foo" | |
217 | ||
218 | "foo #\"# bar"; | |
219 | r##"foo #"# bar"##; // foo #"# bar | |
220 | ||
221 | "\x52"; "R"; r"R"; // R | |
222 | "\\x52"; r"\x52"; // \x52 | |
223 | ``` | |
224 | ||
225 | ### Byte and byte string literals | |
226 | ||
227 | #### Byte literals | |
228 | ||
8faf50e0 XL |
229 | > **<sup>Lexer</sup>**\ |
230 | > BYTE_LITERAL :\ | |
231 | > `b'` ( ASCII_FOR_CHAR | BYTE_ESCAPE ) `'` | |
232 | > | |
233 | > ASCII_FOR_CHAR :\ | |
234 | > _any ASCII (i.e. 0x00 to 0x7F), except_ `'`, `\`, \\n, \\r or \\t | |
235 | > | |
236 | > BYTE_ESCAPE :\ | |
237 | > `\x` HEX_DIGIT HEX_DIGIT\ | |
3c0e092e | 238 | > | `\n` | `\r` | `\t` | `\\` | `\0` | `\'` | `\"` |
ea8adc8c | 239 | |
8bb4bdeb XL |
240 | A _byte literal_ is a single ASCII character (in the `U+0000` to `U+007F` |
241 | range) or a single _escape_ preceded by the characters `U+0062` (`b`) and | |
242 | `U+0027` (single-quote), and followed by the character `U+0027`. If the character | |
243 | `U+0027` is present within the literal, it must be _escaped_ by a preceding | |
244 | `U+005C` (`\`) character. It is equivalent to a `u8` unsigned 8-bit integer | |
245 | _number literal_. | |
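A brief sketch of the equivalence with `u8` values follows. The function name `byte_examples` is hypothetical and exists only for this illustration.

```rust
// Hypothetical helper: each byte literal is just a `u8` value.
fn byte_examples() -> [u8; 4] {
    [b'H', b'\'', b'\x7F', b'\xFF']
}

fn main() {
    // Unlike `\x` in character literals, `\x` in byte literals may exceed 0x7F.
    assert_eq!(byte_examples(), [72, 39, 127, 255]);
}
```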
246 | ||
247 | #### Byte string literals | |
248 | ||
8faf50e0 XL |
249 | > **<sup>Lexer</sup>**\ |
250 | > BYTE_STRING_LITERAL :\ | |
251 | > `b"` ( ASCII_FOR_STRING | BYTE_ESCAPE | STRING_CONTINUE )<sup>\*</sup> `"` | |
252 | > | |
253 | > ASCII_FOR_STRING :\ | |
254 | > _any ASCII (i.e. 0x00 to 0x7F), except_ `"`, `\` _and IsolatedCR_ | |
ea8adc8c | 255 | |
8bb4bdeb XL |
256 | A non-raw _byte string literal_ is a sequence of ASCII characters and _escapes_, |
257 | preceded by the characters `U+0062` (`b`) and `U+0022` (double-quote), and | |
258 | followed by the character `U+0022`. If the character `U+0022` is present within | |
259 | the literal, it must be _escaped_ by a preceding `U+005C` (`\`) character. | |
260 | Alternatively, a byte string literal can be a _raw byte string literal_, defined | |
0531ce1d | 261 | below. The type of a byte string literal of length `n` is `&'static [u8; n]`. |
8bb4bdeb XL |
262 | |
263 | Some additional _escapes_ are available in either byte or non-raw byte string | |
264 | literals. An escape starts with a `U+005C` (`\`) and continues with one of the | |
265 | following forms: | |
266 | ||
267 | * A _byte escape_ starts with `U+0078` (`x`) and is | |
268 | followed by exactly two _hex digits_. It denotes the byte | |
269 | equal to the provided hex value. | |
270 | * A _whitespace escape_ is one of the characters `U+006E` (`n`), `U+0072` | |
271 | (`r`), or `U+0074` (`t`), denoting the byte values `0x0A` (ASCII LF), | |
272 | `0x0D` (ASCII CR) or `0x09` (ASCII HT) respectively. | |
273 | * The _null escape_ is the character `U+0030` (`0`) and denotes the byte | |
274 | value `0x00` (ASCII NUL). | |
275 | * The _backslash escape_ is the character `U+005C` (`\`) which must be | |
276 | escaped in order to denote its ASCII encoding `0x5C`. | |
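The following sketch puts the byte string type and the escapes above together. The function name `escaped_bytes` is an assumption for this example. The return type spells out the `&'static [u8; n]` form, with `n` equal to the number of bytes after escape processing.

```rust
// Hypothetical helper: six bytes after the escapes are processed.
fn escaped_bytes() -> &'static [u8; 6] {
    // plain ASCII, byte escape, whitespace escape, null, backslash, byte above 0x7F
    b"R\x52\n\0\\\xFF"
}

fn main() {
    assert_eq!(escaped_bytes(), &[0x52, 0x52, 0x0A, 0x00, 0x5C, 0xFF]);
}
```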
277 | ||
278 | #### Raw byte string literals | |
279 | ||
8faf50e0 XL |
280 | > **<sup>Lexer</sup>**\ |
281 | > RAW_BYTE_STRING_LITERAL :\ | |
282 | > `br` RAW_BYTE_STRING_CONTENT | |
283 | > | |
284 | > RAW_BYTE_STRING_CONTENT :\ | |
285 | > `"` ASCII<sup>* (non-greedy)</sup> `"`\ | |
f9f354fc | 286 | > | `#` RAW_BYTE_STRING_CONTENT `#` |
8faf50e0 XL |
287 | > |
288 | > ASCII :\ | |
289 | > _any ASCII (i.e. 0x00 to 0x7F)_ | |
ea8adc8c | 290 | |
8bb4bdeb XL |
291 | Raw byte string literals do not process any escapes. They start with the |
292 | character `U+0062` (`b`), followed by `U+0072` (`r`), followed by zero or more | |
293 | of the character `U+0023` (`#`), and a `U+0022` (double-quote) character. The | |
294 | _raw string body_ can contain any sequence of ASCII characters and is terminated | |
295 | only by another `U+0022` (double-quote) character, followed by the same number of | |
296 | `U+0023` (`#`) characters that preceded the opening `U+0022` (double-quote) | |
297 | character. A raw byte string literal cannot contain any non-ASCII byte. | |
298 | ||
299 | All characters contained in the raw string body represent their ASCII encoding; | |
300 | the characters `U+0022` (double-quote) (except when followed by at least as | |
301 | many `U+0023` (`#`) characters as were used to start the raw string literal) or | |
302 | `U+005C` (`\`) do not have any special meaning. | |
303 | ||
304 | Examples for byte string literals: | |
305 | ||
cc61c64b | 306 | ```rust |
8bb4bdeb XL |
307 | b"foo"; br"foo"; // foo |
308 | b"\"foo\""; br#""foo""#; // "foo" | |
309 | ||
310 | b"foo #\"# bar"; | |
311 | br##"foo #"# bar"##; // foo #"# bar | |
312 | ||
313 | b"\x52"; b"R"; br"R"; // R | |
314 | b"\\x52"; br"\x52"; // \x52 | |
315 | ``` | |
316 | ||
317 | ### Number literals | |
318 | ||
319 | A _number literal_ is either an _integer literal_ or a _floating-point | |
320 | literal_. The grammar for recognizing the two kinds of literals is mixed. | |
321 | ||
322 | #### Integer literals | |
323 | ||
8faf50e0 XL |
324 | > **<sup>Lexer</sup>**\ |
325 | > INTEGER_LITERAL :\ | |
ea8adc8c XL |
326 | > ( DEC_LITERAL | BIN_LITERAL | OCT_LITERAL | HEX_LITERAL ) |
327 | > INTEGER_SUFFIX<sup>?</sup> | |
8faf50e0 XL |
328 | > |
329 | > DEC_LITERAL :\ | |
330 | > DEC_DIGIT (DEC_DIGIT|`_`)<sup>\*</sup> | |
331 | > | |
8faf50e0 XL |
332 | > BIN_LITERAL :\ |
333 | > `0b` (BIN_DIGIT|`_`)<sup>\*</sup> BIN_DIGIT (BIN_DIGIT|`_`)<sup>\*</sup> | |
334 | > | |
335 | > OCT_LITERAL :\ | |
336 | > `0o` (OCT_DIGIT|`_`)<sup>\*</sup> OCT_DIGIT (OCT_DIGIT|`_`)<sup>\*</sup> | |
337 | > | |
338 | > HEX_LITERAL :\ | |
339 | > `0x` (HEX_DIGIT|`_`)<sup>\*</sup> HEX_DIGIT (HEX_DIGIT|`_`)<sup>\*</sup> | |
340 | > | |
29967ef6 | 341 | > BIN_DIGIT : \[`0`-`1`] |
8faf50e0 | 342 | > |
29967ef6 | 343 | > OCT_DIGIT : \[`0`-`7`] |
8faf50e0 | 344 | > |
29967ef6 | 345 | > DEC_DIGIT : \[`0`-`9`] |
8faf50e0 | 346 | > |
29967ef6 | 347 | > HEX_DIGIT : \[`0`-`9` `a`-`f` `A`-`F`] |
8faf50e0 XL |
348 | > |
349 | > INTEGER_SUFFIX :\ | |
350 | > `u8` | `u16` | `u32` | `u64` | `u128` | `usize`\ | |
94b46f34 | 351 | > | `i8` | `i16` | `i32` | `i64` | `i128` | `isize` |
ea8adc8c | 352 | |
8bb4bdeb XL |
353 | An _integer literal_ has one of four forms: |
354 | ||
355 | * A _decimal literal_ starts with a *decimal digit* and continues with any | |
356 | mixture of *decimal digits* and _underscores_. | |
357 | * A _hex literal_ starts with the character sequence `U+0030` `U+0078` | |
ea8adc8c XL |
358 | (`0x`) and continues as any mixture (with at least one digit) of hex digits |
359 | and underscores. | |
8bb4bdeb | 360 | * An _octal literal_ starts with the character sequence `U+0030` `U+006F` |
ea8adc8c XL |
361 | (`0o`) and continues as any mixture (with at least one digit) of octal digits |
362 | and underscores. | |
8bb4bdeb | 363 | * A _binary literal_ starts with the character sequence `U+0030` `U+0062` |
ea8adc8c XL |
364 | (`0b`) and continues as any mixture (with at least one digit) of binary digits |
365 | and underscores. | |
8bb4bdeb XL |
366 | |
367 | Like any literal, an integer literal may be followed (immediately, | |
368 | without any spaces) by an _integer suffix_, which forcibly sets the | |
369 | type of the literal. The integer suffix must be the name of one of the | |
370 | integral types: `u8`, `i8`, `u16`, `i16`, `u32`, `i32`, `u64`, `i64`, | |
94b46f34 | 371 | `u128`, `i128`, `usize`, or `isize`. |
8bb4bdeb XL |
372 | |
373 | The type of an _unsuffixed_ integer literal is determined by type inference: | |
374 | ||
375 | * If an integer type can be _uniquely_ determined from the surrounding | |
376 | program context, the unsuffixed integer literal has that type. | |
377 | ||
378 | * If the program context under-constrains the type, it defaults to the | |
379 | signed 32-bit integer `i32`. | |
380 | ||
381 | * If the program context over-constrains the type, it is considered a | |
382 | static type error. | |
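One way to observe these inference rules is to look at the size of the resulting values. This is only a sketch: the function names `literal_sizes` and `same_value` are invented for the illustration, and `std::mem::size_of_val` stands in for inspecting the type directly.

```rust
use std::mem::size_of_val;

// Hypothetical helper: sizes in bytes of three differently typed literals.
fn literal_sizes() -> (usize, usize, usize) {
    let defaulted = 123; // under-constrained, so it falls back to i32
    let constrained: u64 = 123; // the annotation uniquely determines u64
    let suffixed = 123_u8; // the suffix forces u8
    (
        size_of_val(&defaulted),
        size_of_val(&constrained),
        size_of_val(&suffixed),
    )
}

// The four forms are different spellings of the same value.
fn same_value() -> bool {
    255 == 0xff && 0xff == 0o377 && 0o377 == 0b1111_1111
}

fn main() {
    assert_eq!(literal_sizes(), (4, 8, 1));
    assert!(same_value());
}
```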
383 | ||
384 | Examples of integer literals of various forms: | |
385 | ||
cc61c64b | 386 | ```rust |
ea8adc8c | 387 | 123; // type i32 |
8bb4bdeb XL |
388 | 123i32; // type i32 |
389 | 123u32; // type u32 | |
390 | 123_u32; // type u32 | |
ea8adc8c XL |
391 | let a: u64 = 123; // type u64 |
392 | ||
393 | 0xff; // type i32 | |
8bb4bdeb | 394 | 0xff_u8; // type u8 |
ea8adc8c XL |
395 | |
396 | 0o70; // type i32 | |
8bb4bdeb | 397 | 0o70_i16; // type i16 |
ea8adc8c XL |
398 | |
399 | 0b1111_1111_1001_0000; // type i32 | |
2c00a5a8 | 400 | 0b1111_1111_1001_0000i64; // type i64 |
ea8adc8c XL |
401 | 0b________1; // type i32 |
402 | ||
8bb4bdeb XL |
403 | 0usize; // type usize |
404 | ``` | |
405 | ||
ea8adc8c XL |
406 | Examples of invalid integer literals: |
407 | ||
60c5eb7d | 408 | ```rust,compile_fail |
ea8adc8c XL |
409 | // invalid suffixes |
410 | ||
411 | 0invalidSuffix; | |
412 | ||
413 | // uses numbers of the wrong base | |
414 | ||
415 | 123AFB43; | |
416 | 0b0102; | |
417 | 0o0581; | |
418 | ||
419 | // integers too big for their type (they overflow) | |
420 | ||
421 | 128_i8; | |
422 | 256_u8; | |
423 | ||
e1599b0c | 424 | // bin, hex, and octal literals must have at least one digit |
ea8adc8c XL |
425 | |
426 | 0b_; | |
427 | 0b____; | |
428 | ``` | |
429 | ||
8bb4bdeb XL |
430 | Note that the Rust syntax considers `-1i8` as an application of the [unary minus |
431 | operator] to an integer literal `1i8`, rather than | |
432 | a single integer literal. | |
433 | ||
416331ca | 434 | [unary minus operator]: expressions/operator-expr.md#negation-operators |
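Two consequences of this rule are sketched below. The macro `is_single_token` and the function `negation_examples` are hypothetical names used only here. The macro reports whether its input is exactly one token tree, and the function shows that a method call binds more tightly than the unary minus.

```rust
// Hypothetical helper: `true` only when the input is exactly one token tree.
macro_rules! is_single_token {
    ($t:tt) => {
        true
    };
    ($($t:tt)*) => {
        false
    };
}

fn negation_examples() -> (bool, bool, i32, i32) {
    (
        is_single_token!(1i8),  // one literal token
        is_single_token!(-1i8), // `-` followed by the literal `1i8`
        -1i32.pow(2),           // parsed as -(1i32.pow(2))
        (-1i32).pow(2),         // parentheses negate first
    )
}

fn main() {
    assert_eq!(negation_examples(), (true, false, -1, 1));
}
```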
8bb4bdeb | 435 | |
f9f354fc XL |
436 | #### Tuple index |
437 | ||
438 | > **<sup>Lexer</sup>**\ | |
439 | > TUPLE_INDEX: \ | |
440 | > INTEGER_LITERAL | |
441 | ||
442 | A tuple index is used to refer to the fields of [tuples], [tuple structs], and | |
443 | [tuple variants]. | |
444 | ||
445 | Tuple indices are compared with the literal token directly. Tuple indices | |
446 | start with `0` and each successive index increments the value by `1` as a | |
447 | decimal value. Thus, only decimal values will match, and the value must not | |
448 | have any extra `0` prefix characters. | |
449 | ||
450 | ```rust,compile_fail | |
451 | let example = ("dog", "cat", "horse"); | |
452 | let dog = example.0; | |
453 | let cat = example.1; | |
454 | // The following examples are invalid. | |
455 | let cat = example.01; // ERROR no field named `01` | |
456 | let horse = example.0b10; // ERROR no field named `0b10` | |
457 | ``` | |
458 | ||
459 | > **Note**: The tuple index may include an `INTEGER_SUFFIX`, but this is not | |
460 | > intended to be valid, and may be removed in a future version. See | |
461 | > <https://github.com/rust-lang/rust/issues/60210> for more information. | |
462 | ||
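For contrast with the invalid forms, here is a sketch of valid tuple indices on a tuple and on a tuple struct. The type `Pair` and the function `fields` are made up for this illustration.

```rust
// Hypothetical tuple struct used only for this example.
struct Pair(i32, &'static str);

fn fields() -> (i32, &'static str, &'static str) {
    let pair = Pair(7, "seven");
    let animals = ("dog", "cat", "horse");
    // Plain decimal indices without leading zeros select the fields.
    (pair.0, pair.1, animals.2)
}

fn main() {
    assert_eq!(fields(), (7, "seven", "horse"));
}
```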
8bb4bdeb XL |
463 | #### Floating-point literals |
464 | ||
8faf50e0 XL |
465 | > **<sup>Lexer</sup>**\ |
466 | > FLOAT_LITERAL :\ | |
ea8adc8c | 467 | > DEC_LITERAL `.` |
8faf50e0 XL |
468 | > _(not immediately followed by `.`, `_` or an [identifier]_)\ |
469 | > | DEC_LITERAL FLOAT_EXPONENT\ | |
470 | > | DEC_LITERAL `.` DEC_LITERAL FLOAT_EXPONENT<sup>?</sup>\ | |
ea8adc8c | 471 | > | DEC_LITERAL (`.` DEC_LITERAL)<sup>?</sup> |
8faf50e0 XL |
472 | > FLOAT_EXPONENT<sup>?</sup> FLOAT_SUFFIX |
473 | > | |
474 | > FLOAT_EXPONENT :\ | |
ea8adc8c | 475 | > (`e`|`E`) (`+`|`-`)? |
8faf50e0 XL |
476 | > (DEC_DIGIT|`_`)<sup>\*</sup> DEC_DIGIT (DEC_DIGIT|`_`)<sup>\*</sup> |
477 | > | |
478 | > FLOAT_SUFFIX :\ | |
ea8adc8c XL |
479 | > `f32` | `f64` |
480 | ||
8bb4bdeb XL |
481 | A _floating-point literal_ has one of two forms: |
482 | ||
483 | * A _decimal literal_ followed by a period character `U+002E` (`.`). This is | |
484 | optionally followed by another decimal literal, with an optional _exponent_. | |
485 | * A single _decimal literal_ followed by an _exponent_. | |
486 | ||
487 | Like integer literals, a floating-point literal may be followed by a | |
488 | suffix, so long as the pre-suffix part does not end with `U+002E` (`.`). | |
489 | The suffix forcibly sets the type of the literal. There are two valid | |
490 | _floating-point suffixes_: `f32` and `f64`, the names of the 32-bit and 64-bit | |
491 | floating-point types. | |
492 | ||
493 | The type of an _unsuffixed_ floating-point literal is determined by | |
494 | type inference: | |
495 | ||
496 | * If a floating-point type can be _uniquely_ determined from the | |
497 | surrounding program context, the unsuffixed floating-point literal | |
498 | has that type. | |
499 | ||
500 | * If the program context under-constrains the type, it defaults to `f64`. | |
501 | ||
502 | * If the program context over-constrains the type, it is considered a | |
503 | static type error. | |
504 | ||
505 | Examples of floating-point literals of various forms: | |
506 | ||
cc61c64b | 507 | ```rust |
8bb4bdeb XL |
508 | 123.0f64; // type f64 |
509 | 0.1f64; // type f64 | |
510 | 0.1f32; // type f32 | |
511 | 12E+99_f64; // type f64 | |
5869c6ff | 512 | 5f32; // type f32 |
8bb4bdeb XL |
513 | let x: f64 = 2.; // type f64 |
514 | ``` | |
515 | ||
516 | This last example is different because it is not possible to use the suffix | |
517 | syntax with a floating-point literal ending in a period. `2.f64` would attempt | |
518 | to call a method named `f64` on `2`. | |
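The defaulting and suffix rules for floating-point literals can be checked in the same way as for integers. The sketch below uses the hypothetical function names `float_sizes` and `float_values`, and again relies on `std::mem::size_of_val` as a stand-in for inspecting the type.

```rust
use std::mem::size_of_val;

// Hypothetical helper: sizes in bytes of three differently typed literals.
fn float_sizes() -> (usize, usize, usize) {
    let defaulted = 2.; // under-constrained, so it falls back to f64
    let annotated: f32 = 2.; // the annotation uniquely determines f32
    let suffixed = 2.0f32; // a suffix needs a pre-suffix part that does not end in `.`
    (
        size_of_val(&defaulted),
        size_of_val(&annotated),
        size_of_val(&suffixed),
    )
}

// Exponents and `_` separators do not change the value that is denoted.
fn float_values() -> bool {
    1e3 == 1000.0 && 1_234.0E+2 == 123_400.0 && 5f32 == 5.0
}

fn main() {
    assert_eq!(float_sizes(), (8, 4, 4));
    assert!(float_values());
}
```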
519 | ||
520 | The representation semantics of floating-point numbers are described in | |
5869c6ff | 521 | ["Machine Types"][machine types]. |
8bb4bdeb XL |
522 | |
523 | ### Boolean literals | |
524 | ||
8faf50e0 XL |
525 | > **<sup>Lexer</sup>**\ |
526 | > BOOLEAN_LITERAL :\ | |
527 | > `true`\ | |
528 | > | `false` | |
ea8adc8c | 529 | |
8bb4bdeb XL |
530 | The two values of the boolean type are written `true` and `false`. |
531 | ||
0531ce1d XL |
532 | ## Lifetimes and loop labels |
533 | ||
8faf50e0 XL |
534 | > **<sup>Lexer</sup>**\ |
535 | > LIFETIME_TOKEN :\ | |
536 | > `'` [IDENTIFIER_OR_KEYWORD][identifier]\ | |
537 | > | `'_` | |
538 | > | |
539 | > LIFETIME_OR_LABEL :\ | |
b7449926 | 540 | > `'` [NON_KEYWORD_IDENTIFIER][identifier] |
0531ce1d XL |
541 | |
542 | Lifetime parameters and [loop labels] use LIFETIME_OR_LABEL tokens. Any | |
543 | LIFETIME_TOKEN will be accepted by the lexer and can, for example, be used in | |
544 | macros. | |
545 | ||
416331ca | 546 | [loop labels]: expressions/loop-expr.md |
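The three uses mentioned above are sketched below: a lifetime parameter, a loop label, and a lifetime token passed to a macro. The names `first_word`, `find_pair`, and `lifetime_text` are hypothetical and were chosen only for this illustration.

```rust
// A lifetime parameter: `'a` ties the returned slice to the input.
fn first_word<'a>(text: &'a str) -> &'a str {
    text.split_whitespace().next().unwrap_or("")
}

// A loop label: `'outer` lets the inner loop leave both loops at once.
fn find_pair(target: u32) -> Option<(u32, u32)> {
    let mut found = None;
    'outer: for a in 1..10 {
        for b in 1..10 {
            if a * b == target {
                found = Some((a, b));
                break 'outer;
            }
        }
    }
    found
}

// Hypothetical helper: a macro that accepts a lifetime token and returns its text.
macro_rules! lifetime_text {
    ($l:lifetime) => {
        stringify!($l)
    };
}

fn main() {
    assert_eq!(first_word("hello world"), "hello");
    assert_eq!(find_pair(12), Some((2, 6)));
    println!("{}", lifetime_text!('static));
}
```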
0531ce1d | 547 | |
b7449926 XL |
548 | ## Punctuation |
549 | ||
550 | Punctuation symbol tokens are listed here for completeness. Their individual | |
551 | usages and meanings are defined in the linked pages. | |
552 | ||
553 | | Symbol | Name | Usage | | |
554 | |--------|-------------|-------| | |
555 | | `+` | Plus | [Addition][arith], [Trait Bounds], [Macro Kleene Matcher][macros] | |
556 | | `-` | Minus | [Subtraction][arith], [Negation] | |
3dfed10e | 557 | | `*` | Star | [Multiplication][arith], [Dereference], [Raw Pointers], [Macro Kleene Matcher][macros], [Use wildcards] |
b7449926 XL |
558 | | `/` | Slash | [Division][arith] |
559 | | `%` | Percent | [Remainder][arith] | |
560 | | `^` | Caret | [Bitwise and Logical XOR][arith] | |
3dfed10e | 561 | | `!` | Not | [Bitwise and Logical NOT][negation], [Macro Calls][macros], [Inner Attributes][attributes], [Never Type], [Negative impls] |
0bf4aa26 | 562 | | `&` | And | [Bitwise and Logical AND][arith], [Borrow], [References], [Reference patterns] |
3dfed10e | 563 | | <code>\|</code> | Or | [Bitwise and Logical OR][arith], [Closures], Patterns in [match], [if let], and [while let] |
0bf4aa26 | 564 | | `&&` | AndAnd | [Lazy AND][lazy-bool], [Borrow], [References], [Reference patterns] |
b7449926 XL |
565 | | <code>\|\|</code> | OrOr | [Lazy OR][lazy-bool], [Closures] |
566 | | `<<` | Shl | [Shift Left][arith], [Nested Generics][generics] | |
567 | | `>>` | Shr | [Shift Right][arith], [Nested Generics][generics] | |
568 | | `+=` | PlusEq | [Addition assignment][compound] | |
569 | | `-=` | MinusEq | [Subtraction assignment][compound] | |
570 | | `*=` | StarEq | [Multiplication assignment][compound] | |
571 | | `/=` | SlashEq | [Division assignment][compound] | |
572 | | `%=` | PercentEq | [Remainder assignment][compound] | |
573 | | `^=` | CaretEq | [Bitwise XOR assignment][compound] | |
574 | | `&=` | AndEq | [Bitwise And assignment][compound] | |
575 | | <code>\|=</code> | OrEq | [Bitwise Or assignment][compound] | |
576 | | `<<=` | ShlEq | [Shift Left assignment][compound] | |
577 | | `>>=` | ShrEq | [Shift Right assignment][compound], [Nested Generics][generics] | |
578 | | `=` | Eq | [Assignment], [Attributes], Various type definitions | |
579 | | `==` | EqEq | [Equal][comparison] | |
580 | | `!=` | Ne | [Not Equal][comparison] | |
581 | | `>` | Gt | [Greater than][comparison], [Generics], [Paths] | |
582 | | `<` | Lt | [Less than][comparison], [Generics], [Paths] | |
583 | | `>=` | Ge | [Greater than or equal to][comparison], [Generics] | |
584 | | `<=` | Le | [Less than or equal to][comparison] | |
0bf4aa26 | 585 | | `@` | At | [Subpattern binding] |
3dfed10e | 586 | | `_` | Underscore | [Wildcard patterns], [Inferred types], Unnamed items in [constants], [extern crates], and [use declarations] |
b7449926 | 587 | | `.` | Dot | [Field access][field], [Tuple index] |
c295e0f8 | 588 | | `..` | DotDot | [Range][range], [Struct expressions], [Patterns], [Range Patterns][rangepat] |
0bf4aa26 XL |
589 | | `...` | DotDotDot | [Variadic functions][extern], [Range patterns] |
590 | | `..=` | DotDotEq | [Inclusive Range][range], [Range patterns] | |
b7449926 XL |
591 | | `,` | Comma | Various separators |
592 | | `;` | Semi | Terminator for various items and statements, [Array types] | |
593 | | `:` | Colon | Various separators | |
594 | | `::` | PathSep | [Path separator][paths] | |
3dfed10e | 595 | | `->` | RArrow | [Function return type][functions], [Closure return type][closures], [Function pointer type] |
b7449926 XL |
596 | | `=>` | FatArrow | [Match arms][match], [Macros] |
597 | | `#` | Pound | [Attributes] | |
598 | | `$` | Dollar | [Macros] | |
74b04a01 | 599 | | `?` | Question | [Question mark operator][question], [Questionably sized][sized], [Macro Kleene Matcher][macros] |
b7449926 XL |
600 | |
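As one illustration of a symbol with more than one usage, the `>>` token appears both as the shift-right operator and as the closing of nested generics. The function names `nested` and `shifted` in this sketch are assumptions made for the example.

```rust
// `>>` closes two generic argument lists here.
fn nested() -> Vec<Vec<u8>> {
    vec![vec![1, 2], vec![3]]
}

// `>>` is the shift-right operator here.
fn shifted() -> u32 {
    256 >> 4
}

fn main() {
    assert_eq!(nested().len(), 2);
    assert_eq!(shifted(), 16);
}
```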
601 | ## Delimiters | |
602 | ||
603 | Bracket punctuation is used in various parts of the grammar. An open bracket | |
604 | must always be paired with a close bracket. Brackets and the tokens within | |
605 | them are referred to as "token trees" in [macros]. The three types of brackets are: | |
606 | ||
607 | | Bracket | Type | | |
608 | |---------|-----------------| | |
609 | | `{` `}` | Curly braces | | |
610 | | `[` `]` | Square brackets | | |
611 | | `(` `)` | Parentheses | | |
612 | ||
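The notion of a token tree can be made concrete with a small counting macro. This is a sketch only; `count_token_trees` and `counts` are hypothetical names. Each bracketed group counts as a single token tree, regardless of how many tokens it contains.

```rust
// Hypothetical helper: counts the token trees in its input, one at a time.
macro_rules! count_token_trees {
    () => {
        0usize
    };
    ($head:tt $($tail:tt)*) => {
        1usize + count_token_trees!($($tail)*)
    };
}

fn counts() -> (usize, usize) {
    (
        count_token_trees!(a b c),               // three single tokens
        count_token_trees!(a (b c) [d, e] { f }), // one token plus three bracketed groups
    )
}

fn main() {
    assert_eq!(counts(), (3, 4));
}
```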
613 | ||
416331ca XL |
614 | [Inferred types]: types/inferred.md |
615 | [Range patterns]: patterns.md#range-patterns | |
616 | [Reference patterns]: patterns.md#reference-patterns | |
617 | [Subpattern binding]: patterns.md#identifier-patterns | |
618 | [Wildcard patterns]: patterns.md#wildcard-pattern | |
619 | [arith]: expressions/operator-expr.md#arithmetic-and-logical-binary-operators | |
620 | [array types]: types/array.md | |
621 | [assignment]: expressions/operator-expr.md#assignment-expressions | |
622 | [attributes]: attributes.md | |
623 | [borrow]: expressions/operator-expr.md#borrow-operators | |
624 | [closures]: expressions/closure-expr.md | |
625 | [comparison]: expressions/operator-expr.md#comparison-operators | |
626 | [compound]: expressions/operator-expr.md#compound-assignment-expressions | |
3dfed10e | 627 | [constants]: items/constant-items.md |
416331ca | 628 | [dereference]: expressions/operator-expr.md#the-dereference-operator |
3dfed10e | 629 | [extern crates]: items/extern-crates.md |
416331ca XL |
630 | [extern]: items/external-blocks.md |
631 | [field]: expressions/field-expr.md | |
3dfed10e | 632 | [function pointer type]: types/function-pointer.md |
416331ca XL |
633 | [functions]: items/functions.md |
634 | [generics]: items/generics.md | |
635 | [identifier]: identifiers.md | |
3dfed10e | 636 | [if let]: expressions/if-expr.md#if-let-expressions |
416331ca XL |
637 | [keywords]: keywords.md |
638 | [lazy-bool]: expressions/operator-expr.md#lazy-boolean-operators | |
5869c6ff | 639 | [machine types]: types/numeric.md |
416331ca XL |
640 | [macros]: macros-by-example.md |
641 | [match]: expressions/match-expr.md | |
642 | [negation]: expressions/operator-expr.md#negation-operators | |
3dfed10e | 643 | [negative impls]: items/implementations.md |
416331ca XL |
644 | [never type]: types/never.md |
645 | [paths]: paths.md | |
646 | [patterns]: patterns.md | |
647 | [question]: expressions/operator-expr.md#the-question-mark-operator | |
648 | [range]: expressions/range-expr.md | |
c295e0f8 | 649 | [rangepat]: patterns.md#range-patterns |
416331ca XL |
650 | [raw pointers]: types/pointer.md#raw-pointers-const-and-mut |
651 | [references]: types/pointer.md | |
652 | [sized]: trait-bounds.md#sized | |
653 | [struct expressions]: expressions/struct-expr.md | |
654 | [trait bounds]: trait-bounds.md | |
655 | [tuple index]: expressions/tuple-expr.md#tuple-indexing-expressions | |
656 | [tuple structs]: items/structs.md | |
657 | [tuple variants]: items/enumerations.md | |
658 | [tuples]: types/tuple.md | |
3dfed10e XL |
659 | [use declarations]: items/use-declarations.md |
660 | [use wildcards]: items/use-declarations.md | |
661 | [while let]: expressions/loop-expr.md#predicate-pattern-loops |