# Tokens

Tokens are primitive productions in the grammar defined by regular
(non-recursive) languages. "Simple" tokens are given in [string table
production] form, and occur in the rest of the
grammar as double-quoted strings. Other tokens have exact rules given.

[string table production]: notation.html#string-table-productions

## Literals

A literal is an expression consisting of a single token, rather than a sequence
of tokens, that immediately and directly denotes the value it evaluates to,
rather than referring to it by name or some other evaluation rule. A literal is
a form of [constant expression](expressions.html#constant-expressions), so it is
evaluated (primarily) at compile time.

### Examples

#### Characters and strings

|                                              | Example         | `#` sets   | Characters  | Escapes             |
|----------------------------------------------|-----------------|------------|-------------|---------------------|
| [Character](#character-literals)             | `'H'`           | `N/A`      | All Unicode | [Quote](#quote-escapes) & [ASCII](#ascii-escapes) & [Unicode](#unicode-escapes) |
| [String](#string-literals)                   | `"hello"`       | `N/A`      | All Unicode | [Quote](#quote-escapes) & [ASCII](#ascii-escapes) & [Unicode](#unicode-escapes) |
| [Raw](#raw-string-literals)                  | `r#"hello"#`    | `0...`     | All Unicode | `N/A`                                                      |
| [Byte](#byte-literals)                       | `b'H'`          | `N/A`      | All ASCII   | [Quote](#quote-escapes) & [Byte](#byte-escapes)                               |
| [Byte string](#byte-string-literals)         | `b"hello"`      | `N/A`      | All ASCII   | [Quote](#quote-escapes) & [Byte](#byte-escapes)                               |
| [Raw byte string](#raw-byte-string-literals) | `br#"hello"#`   | `0...`     | All ASCII   | `N/A`                                                      |

#### ASCII escapes

|   | Name |
|---|------|
| `\x41` | 7-bit character code (exactly 2 digits, up to 0x7F) |
| `\n` | Newline |
| `\r` | Carriage return |
| `\t` | Tab |
| `\\` | Backslash |
| `\0` | Null |

#### Byte escapes

|   | Name |
|---|------|
| `\x7F` | 8-bit character code (exactly 2 digits) |
| `\n` | Newline |
| `\r` | Carriage return |
| `\t` | Tab |
| `\\` | Backslash |
| `\0` | Null |

#### Unicode escapes

|   | Name |
|---|------|
| `\u{7FFF}` | 24-bit Unicode character code (up to 6 digits) |

#### Quote escapes

|   | Name |
|---|------|
| `\'` | Single quote |
| `\"` | Double quote |

#### Numbers

| [Number literals](#number-literals)`*` | Example | Exponentiation | Suffixes |
|----------------------------------------|---------|----------------|----------|
| Decimal integer | `98_222` | `N/A` | Integer suffixes |
| Hex integer | `0xff` | `N/A` | Integer suffixes |
| Octal integer | `0o77` | `N/A` | Integer suffixes |
| Binary integer | `0b1111_0000` | `N/A` | Integer suffixes |
| Floating-point | `123.0E+77` | `Optional` | Floating-point suffixes |

`*` All number literals allow `_` as a visual separator: `1_234.0E+18f64`

#### Suffixes

| Integer | Floating-point |
|---------|----------------|
| `u8`, `i8`, `u16`, `i16`, `u32`, `i32`, `u64`, `i64`, `isize`, `usize` | `f32`, `f64` |

### Character and string literals

#### Character literals

> **<sup>Lexer</sup>**  
> CHAR_LITERAL :  
> &nbsp;&nbsp; `'` ( ~[`'` `\` \\n \\r \\t] | QUOTE_ESCAPE | ASCII_ESCAPE | UNICODE_ESCAPE ) `'`  
>  
> QUOTE_ESCAPE :  
> &nbsp;&nbsp; `\'` | `\"`  
>  
> ASCII_ESCAPE :  
> &nbsp;&nbsp; &nbsp;&nbsp; `\x` OCT_DIGIT HEX_DIGIT  
> &nbsp;&nbsp; | `\n` | `\r` | `\t` | `\\` | `\0`  
>  
> UNICODE_ESCAPE :  
> &nbsp;&nbsp; `\u{` ( HEX_DIGIT `_`<sup>\*</sup> )<sup>1..6</sup> `}`  

A _character literal_ is a single Unicode character enclosed within two
`U+0027` (single-quote) characters, with the exception of `U+0027` itself,
which must be _escaped_ by a preceding `U+005C` character (`\`).
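
As a quick check of the rules above, the following sketch pairs a few character
literals with the code points they denote. The helper name
`char_literal_code_points` is hypothetical and exists only for this
illustration.

```rust
/// Hypothetical helper: pairs each character literal with its code point.
fn char_literal_code_points() -> Vec<(char, u32)> {
    vec![
        ('H', 0x48),          // plain character
        ('\'', 0x27),         // quote escape: a single quote must be escaped
        ('"', 0x22),          // a double quote needs no escape here
        ('\"', 0x22),         // but the quote escape is accepted as well
        ('\x41', 0x41),       // ASCII escape, same as 'A'
        ('\n', 0x0A),         // ASCII escape for newline
        ('\u{7FFF}', 0x7FFF), // Unicode escape
    ]
}
```

Casting each literal with `as u32` should yield the value in the second column.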

#### String literals

> **<sup>Lexer</sup>**  
> STRING_LITERAL :  
> &nbsp;&nbsp; `"` (  
> &nbsp;&nbsp; &nbsp;&nbsp; ~[`"` `\` _IsolatedCR_]  
> &nbsp;&nbsp; &nbsp;&nbsp; | QUOTE_ESCAPE  
> &nbsp;&nbsp; &nbsp;&nbsp; | ASCII_ESCAPE  
> &nbsp;&nbsp; &nbsp;&nbsp; | UNICODE_ESCAPE  
> &nbsp;&nbsp; &nbsp;&nbsp; | STRING_CONTINUE  
> &nbsp;&nbsp; )<sup>\*</sup> `"`  
>  
> STRING_CONTINUE :  
> &nbsp;&nbsp; `\` _followed by_ \\n  

A _string literal_ is a sequence of any Unicode characters enclosed within two
`U+0022` (double-quote) characters, with the exception of `U+0022` itself,
which must be _escaped_ by a preceding `U+005C` character (`\`).

Line-break characters are allowed in string literals. Normally they represent
themselves (i.e. no translation), but as a special exception, when an unescaped
`U+005C` character (`\`) occurs immediately before the newline (`U+000A`), the
`U+005C` character, the newline, and all whitespace at the beginning of the
next line are ignored. Thus `a` and `b` are equal:

```rust
let a = "foobar";
let b = "foo\
         bar";

assert_eq!(a, b);
```

#### Character escapes

Some additional _escapes_ are available in either character or non-raw string
literals. An escape starts with a `U+005C` (`\`) and continues with one of the
following forms:

* A _7-bit code point escape_ starts with `U+0078` (`x`) and is
  followed by exactly two _hex digits_ with a value up to `0x7F`. It denotes
  the ASCII character with the value equal to the provided hex value.
* A _24-bit code point escape_ starts with `U+0075` (`u`) and is followed
  by up to six _hex digits_ surrounded by braces `U+007B` (`{`) and `U+007D`
  (`}`). It denotes the Unicode code point equal to the provided hex value.
* A _whitespace escape_ is one of the characters `U+006E` (`n`), `U+0072`
  (`r`), or `U+0074` (`t`), denoting the Unicode values `U+000A` (LF),
  `U+000D` (CR) or `U+0009` (HT) respectively.
* The _null escape_ is the character `U+0030` (`0`) and denotes the Unicode
  value `U+0000` (NUL).
* The _backslash escape_ is the character `U+005C` (`\`) which must be
  escaped in order to denote *itself*.
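
The escape forms listed above can be exercised with a small sketch. Each entry
is a one-character string literal written with an escape, paired with the code
point it is expected to denote. The helper name `string_escape_code_points` is
hypothetical.

```rust
/// Hypothetical helper: one-character string literals written with an
/// escape, paired with the code point of that character.
fn string_escape_code_points() -> Vec<(&'static str, u32)> {
    vec![
        ("\x52", 0x52),   // `\x` escape with two hex digits ('R')
        ("\u{52}", 0x52), // `\u{...}` escape for the same code point
        ("\u{E9}", 0xE9), // `\u{...}` escape for a non-ASCII code point
        ("\n", 0x0A),     // whitespace escape: LF
        ("\r", 0x0D),     // whitespace escape: CR
        ("\t", 0x09),     // whitespace escape: HT
        ("\0", 0x00),     // null escape
        ("\\", 0x5C),     // backslash escape
    ]
}
```

Every string in the list contains exactly one `char`, so
`s.chars().next()` gives the character to compare against the second column.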

#### Raw string literals

> **<sup>Lexer</sup>**  
> RAW_STRING_LITERAL :  
> &nbsp;&nbsp; `r` RAW_STRING_CONTENT  
>  
> RAW_STRING_CONTENT :  
> &nbsp;&nbsp; &nbsp;&nbsp; `"` ( ~ _IsolatedCR_ )<sup>* (non-greedy)</sup> `"`  
> &nbsp;&nbsp; | `#` RAW_STRING_CONTENT `#`  

Raw string literals do not process any escapes. They start with the character
`U+0072` (`r`), followed by zero or more of the character `U+0023` (`#`) and a
`U+0022` (double-quote) character. The _raw string body_ can contain any sequence
of Unicode characters and is terminated only by another `U+0022` (double-quote)
character, followed by the same number of `U+0023` (`#`) characters that preceded
the opening `U+0022` (double-quote) character.

All Unicode characters contained in the raw string body represent themselves.
The characters `U+0022` (double-quote) (except when followed by at least as
many `U+0023` (`#`) characters as were used to start the raw string literal) and
`U+005C` (`\`) do not have any special meaning.

Examples for string literals:

```rust
"foo"; r"foo";                     // foo
"\"foo\""; r#""foo""#;             // "foo"

"foo #\"# bar";
r##"foo #"# bar"##;                // foo #"# bar

"\x52"; "R"; r"R";                 // R
"\\x52"; r"\x52";                  // \x52
```
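
The same equivalences can be written as assertions. This is a minimal sketch;
the helper name `raw_string_pairs` is hypothetical.

```rust
/// Hypothetical helper: each pair holds a raw string literal and an
/// equivalent non-raw string literal.
fn raw_string_pairs() -> Vec<(&'static str, &'static str)> {
    vec![
        (r"foo", "foo"),
        (r#""foo""#, "\"foo\""),
        (r##"foo #"# bar"##, "foo #\"# bar"),
        // No escape processing in the raw form: backslash, x, 5, 2.
        (r"\x52", "\\x52"),
    ]
}
```

Comparing the two members of each pair with `==` should succeed, and the last
raw literal should have a length of four bytes rather than one.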

### Byte and byte string literals

#### Byte literals

> **<sup>Lexer</sup>**  
> BYTE_LITERAL :  
> &nbsp;&nbsp; `b'` ( ASCII_FOR_CHAR | BYTE_ESCAPE )  `'`  
>  
> ASCII_FOR_CHAR :  
> &nbsp;&nbsp; _any ASCII (i.e. 0x00 to 0x7F), except_ `'`, `\`, \\n, \\r or \\t  
>  
> BYTE_ESCAPE :  
> &nbsp;&nbsp; &nbsp;&nbsp; `\x` HEX_DIGIT HEX_DIGIT  
> &nbsp;&nbsp; | `\n` | `\r` | `\t` | `\\` | `\0`  

A _byte literal_ is a single ASCII character (in the `U+0000` to `U+007F`
range) or a single _escape_ preceded by the characters `U+0062` (`b`) and
`U+0027` (single-quote), and followed by the character `U+0027`. If the character
`U+0027` is present within the literal, it must be _escaped_ by a preceding
`U+005C` (`\`) character. It is equivalent to a `u8` unsigned 8-bit integer
_number literal_.
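
Because a byte literal is a `u8`, it can be used anywhere a `u8` value or
pattern is expected. The sketch below uses byte literals in a range pattern and
in arithmetic. The function `upper_ascii` is a hypothetical helper written for
this illustration; the standard library already offers
`u8::to_ascii_uppercase` for real use.

```rust
/// Hypothetical helper: maps an ASCII lowercase byte to uppercase and
/// leaves every other byte unchanged.
fn upper_ascii(b: u8) -> u8 {
    match b {
        // Byte literals are `u8` values, so they work as range bounds
        // and in ordinary integer arithmetic.
        b'a'..=b'z' => b - (b'a' - b'A'),
        _ => b,
    }
}
```

For example, `upper_ascii(b'h')` should return `b'H'`, which is the same value
as `72u8`.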

#### Byte string literals

> **<sup>Lexer</sup>**  
> BYTE_STRING_LITERAL :  
> &nbsp;&nbsp; `b"` ( ASCII_FOR_STRING | BYTE_ESCAPE | STRING_CONTINUE )<sup>\*</sup> `"`  
>  
> ASCII_FOR_STRING :  
> &nbsp;&nbsp; _any ASCII (i.e. 0x00 to 0x7F), except_ `"`, `\` _and IsolatedCR_

A non-raw _byte string literal_ is a sequence of ASCII characters and _escapes_,
preceded by the characters `U+0062` (`b`) and `U+0022` (double-quote), and
followed by the character `U+0022`. If the character `U+0022` is present within
the literal, it must be _escaped_ by a preceding `U+005C` (`\`) character.
Alternatively, a byte string literal can be a _raw byte string literal_, defined
below. A byte string literal of length `n` is equivalent to a
`&'static [u8; n]` borrowed fixed-size array of unsigned 8-bit integers.
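
The array type can be observed directly: in the sketch below the return type
only matches because the literal has exactly five bytes. The function name
`hello_bytes` is hypothetical.

```rust
/// Hypothetical helper: returns a byte string literal using its exact type,
/// a reference to a fixed-size array of five `u8` values.
fn hello_bytes() -> &'static [u8; 5] {
    b"hello"
}
```

The result should compare equal to the ASCII codes of the five letters and to
`"hello".as_bytes()`.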

Some additional _escapes_ are available in either byte or non-raw byte string
literals. An escape starts with a `U+005C` (`\`) and continues with one of the
following forms:

* A _byte escape_ starts with `U+0078` (`x`) and is
  followed by exactly two _hex digits_. It denotes the byte
  equal to the provided hex value.
* A _whitespace escape_ is one of the characters `U+006E` (`n`), `U+0072`
  (`r`), or `U+0074` (`t`), denoting the byte values `0x0A` (ASCII LF),
  `0x0D` (ASCII CR) or `0x09` (ASCII HT) respectively.
* The _null escape_ is the character `U+0030` (`0`) and denotes the byte
  value `0x00` (ASCII NUL).
* The _backslash escape_ is the character `U+005C` (`\`) which must be
  escaped in order to denote its ASCII encoding `0x5C`.
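
As a sketch of these forms, the byte string below uses one escape of each kind,
so its seven bytes can be compared with the values given in the list. The
function name `byte_escape_examples` is hypothetical.

```rust
/// Hypothetical helper: a byte string built only from escapes.
/// Unlike `\x` in a string literal, a byte escape may exceed `0x7F`.
fn byte_escape_examples() -> &'static [u8; 7] {
    b"\x52\xFF\n\r\t\0\\"
}
```

The expected bytes are `0x52`, `0xFF`, `0x0A`, `0x0D`, `0x09`, `0x00` and
`0x5C`, in that order.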

#### Raw byte string literals

> **<sup>Lexer</sup>**  
> RAW_BYTE_STRING_LITERAL :  
> &nbsp;&nbsp; `br` RAW_BYTE_STRING_CONTENT  
>  
> RAW_BYTE_STRING_CONTENT :  
> &nbsp;&nbsp; &nbsp;&nbsp; `"` ASCII<sup>* (non-greedy)</sup> `"`  
> &nbsp;&nbsp; | `#` RAW_BYTE_STRING_CONTENT `#`  
>  
> ASCII :  
> &nbsp;&nbsp; _any ASCII (i.e. 0x00 to 0x7F)_  

Raw byte string literals do not process any escapes. They start with the
character `U+0062` (`b`), followed by `U+0072` (`r`), followed by zero or more
of the character `U+0023` (`#`), and a `U+0022` (double-quote) character. The
_raw string body_ can contain any sequence of ASCII characters and is terminated
only by another `U+0022` (double-quote) character, followed by the same number of
`U+0023` (`#`) characters that preceded the opening `U+0022` (double-quote)
character. A raw byte string literal cannot contain any non-ASCII byte.

All characters contained in the raw string body represent their ASCII encoding.
The characters `U+0022` (double-quote) (except when followed by at least as
many `U+0023` (`#`) characters as were used to start the raw string literal) and
`U+005C` (`\`) do not have any special meaning.

Examples for byte string literals:

```rust
b"foo"; br"foo";                     // foo
b"\"foo\""; br#""foo""#;             // "foo"

b"foo #\"# bar";
br##"foo #"# bar"##;                 // foo #"# bar

b"\x52"; b"R"; br"R";                // R
b"\\x52"; br"\x52";                  // \x52
```
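
The difference between the escaped and raw forms shows up in the length of the
resulting arrays. This is a minimal sketch with hypothetical helper names.

```rust
/// Hypothetical helper: lengths of a non-raw and a raw byte string that
/// have the same source text between the quotes.
fn escaped_and_raw_lengths() -> (usize, usize) {
    // `b"\x52"` is the single byte 0x52; `br"\x52"` keeps all four bytes.
    (b"\x52".len(), br"\x52".len())
}

/// Hypothetical helper: a raw byte string compared with its escaped spelling.
fn raw_matches_escaped() -> bool {
    br#""foo""# == b"\"foo\""
}
```

The first function should return `(1, 4)` and the second should return `true`.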

### Number literals

A _number literal_ is either an _integer literal_ or a _floating-point
literal_. The grammar for recognizing the two kinds of literals is mixed.

#### Integer literals

> **<sup>Lexer</sup>**  
> INTEGER_LITERAL :  
> &nbsp;&nbsp; ( DEC_LITERAL | BIN_LITERAL | OCT_LITERAL | HEX_LITERAL )
>              INTEGER_SUFFIX<sup>?</sup>
>   
> DEC_LITERAL :  
> &nbsp;&nbsp; DEC_DIGIT (DEC_DIGIT|`_`)<sup>\*</sup>  
>  
> BIN_LITERAL :  
> &nbsp;&nbsp; `0b` (BIN_DIGIT|`_`)<sup>\*</sup> BIN_DIGIT (BIN_DIGIT|`_`)<sup>\*</sup>  
>  
> OCT_LITERAL :  
> &nbsp;&nbsp; `0o` (OCT_DIGIT|`_`)<sup>\*</sup> OCT_DIGIT (OCT_DIGIT|`_`)<sup>\*</sup>  
>  
> HEX_LITERAL :  
> &nbsp;&nbsp; `0x` (HEX_DIGIT|`_`)<sup>\*</sup> HEX_DIGIT (HEX_DIGIT|`_`)<sup>\*</sup>  
>  
> BIN_DIGIT : [`0`-`1`]  
>  
> OCT_DIGIT : [`0`-`7`]  
>  
> DEC_DIGIT : [`0`-`9`]  
>  
> HEX_DIGIT : [`0`-`9` `a`-`f` `A`-`F`]  
>  
> INTEGER_SUFFIX :  
> &nbsp;&nbsp; &nbsp;&nbsp; `u8` | `u16` | `u32` | `u64` | `usize`  
> &nbsp;&nbsp; | `i8` | `i16` | `i32` | `i64` | `isize`

<!-- FIXME: separate the DECIMAL_LITERAL with no prefix or suffix (used on tuple indexing and float_literal -->
<!-- FIXME: u128 and i128 -->

An _integer literal_ has one of four forms:

* A _decimal literal_ starts with a *decimal digit* and continues with any
  mixture of *decimal digits* and _underscores_.
* A _hex literal_ starts with the character sequence `U+0030` `U+0078`
  (`0x`) and continues as any mixture (with at least one digit) of hex digits
  and underscores.
* An _octal literal_ starts with the character sequence `U+0030` `U+006F`
  (`0o`) and continues as any mixture (with at least one digit) of octal digits
  and underscores.
* A _binary literal_ starts with the character sequence `U+0030` `U+0062`
  (`0b`) and continues as any mixture (with at least one digit) of binary digits
  and underscores.

Like any literal, an integer literal may be followed (immediately,
without any spaces) by an _integer suffix_, which forcibly sets the
type of the literal. The integer suffix must be the name of one of the
integral types: `u8`, `i8`, `u16`, `i16`, `u32`, `i32`, `u64`, `i64`,
`isize`, or `usize`.

The type of an _unsuffixed_ integer literal is determined by type inference:

* If an integer type can be _uniquely_ determined from the surrounding
  program context, the unsuffixed integer literal has that type.

* If the program context under-constrains the type, it defaults to the
  signed 32-bit integer `i32`.

* If the program context over-constrains the type, it is considered a
  static type error.

Examples of integer literals of various forms:

```rust
123;                               // type i32
123i32;                            // type i32
123u32;                            // type u32
123_u32;                           // type u32
let a: u64 = 123;                  // type u64

0xff;                              // type i32
0xff_u8;                           // type u8

0o70;                              // type i32
0o70_i16;                          // type i16

0b1111_1111_1001_0000;             // type i32
0b1111_1111_1001_0000i64;          // type i64
0b________1;                       // type i32

0usize;                            // type usize
```
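
The inferred types can be observed indirectly through their sizes. The sketch
below uses `std::mem::size_of_val`; the function names are hypothetical and
chosen only for this illustration.

```rust
/// Hypothetical helper: size in bytes of an unsuffixed literal with no
/// other constraints, which defaults to `i32`.
fn default_int_size() -> usize {
    let x = 123;
    std::mem::size_of_val(&x)
}

/// Hypothetical helper: the annotation determines the literal's type.
fn annotated_int_size() -> usize {
    let x: u64 = 123;
    std::mem::size_of_val(&x)
}

/// Hypothetical helper: the suffix determines the literal's type.
fn suffixed_int_size() -> usize {
    let x = 0xff_u8;
    std::mem::size_of_val(&x)
}

/// Hypothetical helper: the base and the `_` separators affect only the
/// notation, not the value.
fn same_value_in_all_bases() -> bool {
    255 == 0xff && 0xff == 0o377 && 0o377 == 0b1111_1111 && 98_222 == 98222
}
```

The three size functions should return 4, 8 and 1 respectively, and the last
function should return `true`.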

Examples of invalid integer literals:

```rust,ignore
// invalid suffixes

0invalidSuffix;

// uses numbers of the wrong base

123AFB43;
0b0102;
0o0581;

// integers too big for their type (they overflow)

128_i8;
256_u8;

// bin, hex and octal literals must have at least one digit

0b_;
0b____;
```

Note that the Rust syntax considers `-1i8` as an application of the [unary minus
operator] to an integer literal `1i8`, rather than
a single integer literal.

[unary minus operator]: expressions/operator-expr.html#negation-operators
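
One visible consequence is that a method call written after the literal applies
to the literal alone, and the negation is applied afterwards. The sketch below
contrasts the two groupings; the function names are hypothetical.

```rust
/// Hypothetical helper: the parentheses make the negated value the receiver.
fn negate_then_square() -> i32 {
    (-1i32).pow(2)
}

/// Hypothetical helper: without parentheses the receiver is the literal
/// `1i32`, and the unary minus is applied to the result of `pow`.
fn square_then_negate() -> i32 {
    -1i32.pow(2)
}
```

The first function should return `1` and the second `-1`.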

#### Floating-point literals

> **<sup>Lexer</sup>**  
> FLOAT_LITERAL :  
> &nbsp;&nbsp; &nbsp;&nbsp; DEC_LITERAL `.`
>   _(not immediately followed by `.`, `_` or an identifier)_  
> &nbsp;&nbsp; | DEC_LITERAL FLOAT_EXPONENT  
> &nbsp;&nbsp; | DEC_LITERAL `.` DEC_LITERAL FLOAT_EXPONENT<sup>?</sup>  
> &nbsp;&nbsp; | DEC_LITERAL (`.` DEC_LITERAL)<sup>?</sup>
>                    FLOAT_EXPONENT<sup>?</sup> FLOAT_SUFFIX  
>  
> FLOAT_EXPONENT :  
> &nbsp;&nbsp; (`e`|`E`) (`+`|`-`)?
>               (DEC_DIGIT|`_`)<sup>\*</sup> DEC_DIGIT (DEC_DIGIT|`_`)<sup>\*</sup>   
>  
> FLOAT_SUFFIX :  
> &nbsp;&nbsp; `f32` | `f64`

A _floating-point literal_ has one of two forms:

* A _decimal literal_ followed by a period character `U+002E` (`.`). This is
  optionally followed by another decimal literal, with an optional _exponent_.
* A single _decimal literal_ followed by an _exponent_.

Like integer literals, a floating-point literal may be followed by a
suffix, so long as the pre-suffix part does not end with `U+002E` (`.`).
The suffix forcibly sets the type of the literal. There are two valid
_floating-point suffixes_, `f32` and `f64` (the 32-bit and 64-bit
floating-point types).

The type of an _unsuffixed_ floating-point literal is determined by
type inference:

* If a floating-point type can be _uniquely_ determined from the
  surrounding program context, the unsuffixed floating-point literal
  has that type.

* If the program context under-constrains the type, it defaults to `f64`.

* If the program context over-constrains the type, it is considered a
  static type error.

Examples of floating-point literals of various forms:

```rust
123.0f64;        // type f64
0.1f64;          // type f64
0.1f32;          // type f32
12E+99_f64;      // type f64
let x: f64 = 2.; // type f64
```

This last example is different because it is not possible to use the suffix
syntax with a floating point literal ending in a period. `2.f64` would attempt
to call a method named `f64` on `2`.
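
To tie the forms together, the sketch below writes the same value in several
notations and checks the default and suffixed types through their sizes. The
function names are hypothetical.

```rust
/// Hypothetical helper: one value written in several literal forms.
fn thousand_in_several_forms() -> [f64; 5] {
    [1000., 1_000.0, 1e3, 1E+3_f64, 0.1e4]
}

/// Hypothetical helper: an unconstrained, unsuffixed literal defaults to `f64`.
fn default_float_size() -> usize {
    let x = 1.5;
    std::mem::size_of_val(&x)
}

/// Hypothetical helper: the suffix selects `f32`.
fn suffixed_f32_size() -> usize {
    let x = 1.5f32;
    std::mem::size_of_val(&x)
}
```

Every element of the array should equal `1000.0`, and the two size functions
should return 8 and 4.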

The representation semantics of floating-point numbers are described in
["Machine Types"].

["Machine Types"]: types.html#machine-types

### Boolean literals

> **<sup>Lexer</sup>**  
> BOOLEAN_LITERAL :  
> &nbsp;&nbsp; &nbsp;&nbsp; `true`  
> &nbsp;&nbsp; | `false`  

The two values of the boolean type are written `true` and `false`.

## Symbols

Symbols are a general class of printable [tokens] that play structural
roles in a variety of grammar productions. They are the remaining
miscellaneous printable tokens that do not otherwise appear as
[unary operators], [binary operators], or [keywords].
They are catalogued in [the Symbols section][symbols] of the Grammar document.

[unary operators]: expressions/operator-expr.html#borrow-operators
[binary operators]: expressions/operator-expr.html#arithmetic-and-logical-binary-operators
[tokens]: #tokens
[symbols]: ../grammar.html#symbols
[keywords]: keywords.html
Commit	Line	Data
	1	# Tokens
	2
	3	Tokens are primitive productions in the grammar defined by regular
	4	(non-recursive) languages. "Simple" tokens are given in [string table
	5	production] form, and occur in the rest of the
	6	grammar as double-quoted strings. Other tokens have exact rules given.
	7
	8	[string table production]: notation.html#string-table-productions
	9
	10	## Literals
	11
	12	A literal is an expression consisting of a single token, rather than a sequence
	13	of tokens, that immediately and directly denotes the value it evaluates to,
	14	rather than referring to it by name or some other evaluation rule. A literal is
	15	a form of [constant expression](expressions.html#constant-expressions), so is
	16	evaluated (primarily) at compile time.
	17
	18	### Examples
	19
	20	#### Characters and strings
	21
	22	\| \| Example \| `#` sets \| Characters \| Escapes \|
	23	\|----------------------------------------------\|-----------------\|------------\|-------------\|---------------------\|
	24	\| [Character](#character-literals) \| `'H'` \| `N/A` \| All Unicode \| [Quote](#quote-escapes) & [ASCII](#ascii-escapes) & [Unicode](#unicode-escapes) \|
	25	\| [String](#string-literals) \| `"hello"` \| `N/A` \| All Unicode \| [Quote](#quote-escapes) & [ASCII](#ascii-escapes) & [Unicode](#unicode-escapes) \|
	26	\| [Raw](#raw-string-literals) \| `r#"hello"#` \| `0...` \| All Unicode \| `N/A` \|
	27	\| [Byte](#byte-literals) \| `b'H'` \| `N/A` \| All ASCII \| [Quote](#quote-escapes) & [Byte](#byte-escapes) \|
	28	\| [Byte string](#byte-string-literals) \| `b"hello"` \| `N/A` \| All ASCII \| [Quote](#quote-escapes) & [Byte](#byte-escapes) \|
	29	\| [Raw byte string](#raw-byte-string-literals) \| `br#"hello"#` \| `0...` \| All ASCII \| `N/A` \|
	30
	31	#### ASCII escapes
	32
	33	\| \| Name \|
	34	\|---\|------\|
	35	\| `\x41` \| 7-bit character code (exactly 2 digits, up to 0x7F) \|
	36	\| `\n` \| Newline \|
	37	\| `\r` \| Carriage return \|
	38	\| `\t` \| Tab \|
	39	\| `\\` \| Backslash \|
	40	\| `\0` \| Null \|
	41
	42	#### Byte escapes
	43
	44	\| \| Name \|
	45	\|---\|------\|
	46	\| `\x7F` \| 8-bit character code (exactly 2 digits) \|
	47	\| `\n` \| Newline \|
	48	\| `\r` \| Carriage return \|
	49	\| `\t` \| Tab \|
	50	\| `\\` \| Backslash \|
	51	\| `\0` \| Null \|
	52
	53	#### Unicode escapes
	54
	55	\| \| Name \|
	56	\|---\|------\|
	57	\| `\u{7FFF}` \| 24-bit Unicode character code (up to 6 digits) \|
	58
	59	#### Quote escapes
	60
	61	\| \| Name \|
	62	\|---\|------\|
	63	\| `\'` \| Single quote \|
	64	\| `\"` \| Double quote \|
	65
	66	#### Numbers
	67
	68	\| [Number literals](#number-literals)`*` \| Example \| Exponentiation \| Suffixes \|
	69	\|----------------------------------------\|---------\|----------------\|----------\|
	70	\| Decimal integer \| `98_222` \| `N/A` \| Integer suffixes \|
	71	\| Hex integer \| `0xff` \| `N/A` \| Integer suffixes \|
	72	\| Octal integer \| `0o77` \| `N/A` \| Integer suffixes \|
	73	\| Binary integer \| `0b1111_0000` \| `N/A` \| Integer suffixes \|
	74	\| Floating-point \| `123.0E+77` \| `Optional` \| Floating-point suffixes \|
	75
	76	`*` All number literals allow `_` as a visual separator: `1_234.0E+18f64`
	77
	78	#### Suffixes
	79
	80	\| Integer \| Floating-point \|
	81	\|---------\|----------------\|
	82	\| `u8`, `i8`, `u16`, `i16`, `u32`, `i32`, `u64`, `i64`, `isize`, `usize` \| `f32`, `f64` \|
	83
	84	### Character and string literals
	85
	86	#### Character literals
	87
	88	> <sup>Lexer</sup>
	89	> CHAR_LITERAL :
	90	>    `'` ( ~[`'` `\` \\n \\r \\t] \| QUOTE_ESCAPE \| ASCII_ESCAPE \| UNICODE_ESCAPE ) `'`
	91	>
	92	> QUOTE_ESCAPE :
	93	>    `\'` \| `\"`
	94	>
	95	> ASCII_ESCAPE :
	96	>       `\x` OCT_DIGIT HEX_DIGIT
	97	>    \| `\n` \| `\r` \| `\t` \| `\\` \| `\0`
	98	>
	99	> UNICODE_ESCAPE :
	100	>    `\u{` ( HEX_DIGIT `_`<sup>\*</sup> )<sup>1..6</sup> `}`
	101
	102	A _character literal_ is a single Unicode character enclosed within two
	103	`U+0027` (single-quote) characters, with the exception of `U+0027` itself,
	104	which must be _escaped_ by a preceding `U+005C` character (`\`).
	105
	106	#### String literals
	107
	108	> <sup>Lexer</sup>
	109	> STRING_LITERAL :
	110	>    `"` (
	111	>       ~[`"` `\` _IsolatedCR_]
	112	>       \| QUOTE_ESCAPE
	113	>       \| ASCII_ESCAPE
	114	>       \| UNICODE_ESCAPE
	115	>       \| STRING_CONTINUE
	116	>    )<sup>\*</sup> `"`
	117	>
	118	> STRING_CONTINUE :
	119	>    `\` _followed by_ \\n
	120
	121	A _string literal_ is a sequence of any Unicode characters enclosed within two
	122	`U+0022` (double-quote) characters, with the exception of `U+0022` itself,
	123	which must be _escaped_ by a preceding `U+005C` character (`\`).
	124
	125	Line-break characters are allowed in string literals. Normally they represent
	126	themselves (i.e. no translation), but as a special exception, when an unescaped
	127	`U+005C` character (`\`) occurs immediately before the newline (`U+000A`), the
	128	`U+005C` character, the newline, and all whitespace at the beginning of the
	129	next line are ignored. Thus `a` and `b` are equal:
	130
	131	```rust
	132	let a = "foobar";
	133	let b = "foo\
	134	bar";
	135
	136	assert_eq!(a,b);
	137	```
	138
	139	#### Character escapes
	140
	141	Some additional _escapes_ are available in either character or non-raw string
	142	literals. An escape starts with a `U+005C` (`\`) and continues with one of the
	143	following forms:
	144
	145	* An _8-bit code point escape_ starts with `U+0078` (`x`) and is
	146	followed by exactly two _hex digits_. It denotes the Unicode code point
	147	equal to the provided hex value.
	148	* A _24-bit code point escape_ starts with `U+0075` (`u`) and is followed
	149	by up to six _hex digits_ surrounded by braces `U+007B` (`{`) and `U+007D`
	150	(`}`). It denotes the Unicode code point equal to the provided hex value.
	151	* A _whitespace escape_ is one of the characters `U+006E` (`n`), `U+0072`
	152	(`r`), or `U+0074` (`t`), denoting the Unicode values `U+000A` (LF),
	153	`U+000D` (CR) or `U+0009` (HT) respectively.
	154	* The _null escape_ is the character `U+0030` (`0`) and denotes the Unicode
	155	value `U+0000` (NUL).
	156	* The _backslash escape_ is the character `U+005C` (`\`) which must be
	157	escaped in order to denote itself.
	158
	159	#### Raw string literals
	160
	161	> <sup>Lexer</sup>
	162	> RAW_STRING_LITERAL :
	163	>    `r` RAW_STRING_CONTENT
	164	>
	165	> RAW_STRING_CONTENT :
	166	>       `"` ( ~ _IsolatedCR_ )<sup>* (non-greedy)</sup> `"`
	167	>    \| `#` RAW_STRING_CONTENT `#`
	168
	169	Raw string literals do not process any escapes. They start with the character
	170	`U+0072` (`r`), followed by zero or more of the character `U+0023` (`#`) and a
	171	`U+0022` (double-quote) character. The _raw string body_ can contain any sequence
	172	of Unicode characters and is terminated only by another `U+0022` (double-quote)
	173	character, followed by the same number of `U+0023` (`#`) characters that preceded
	174	the opening `U+0022` (double-quote) character.
	175
	176	All Unicode characters contained in the raw string body represent themselves,
	177	the characters `U+0022` (double-quote) (except when followed by at least as
	178	many `U+0023` (`#`) characters as were used to start the raw string literal) or
	179	`U+005C` (`\`) do not have any special meaning.
	180
	181	Examples for string literals:
	182
	183	```rust
	184	"foo"; r"foo"; // foo
	185	"\"foo\""; r#""foo""#; // "foo"
	186
	187	"foo #\"# bar";
	188	r##"foo #"# bar"##; // foo #"# bar
	189
	190	"\x52"; "R"; r"R"; // R
	191	"\\x52"; r"\x52"; // \x52
	192	```
	193
	194	### Byte and byte string literals
	195
	196	#### Byte literals
	197
	198	> <sup>Lexer</sup>
	199	> BYTE_LITERAL :
	200	>    `b'` ( ASCII_FOR_CHAR \| BYTE_ESCAPE ) `'`
	201	>
	202	> ASCII_FOR_CHAR :
	203	>    _any ASCII (i.e. 0x00 to 0x7F), except_ `'`, `/`, \\n, \\r or \\t
	204	>
	205	> BYTE_ESCAPE :
	206	>       `\x` HEX_DIGIT HEX_DIGIT
	207	>    \| `\n` \| `\r` \| `\t` \| `\\` \| `\0`
	208
	209	A _byte literal_ is a single ASCII character (in the `U+0000` to `U+007F`
	210	range) or a single _escape_ preceded by the characters `U+0062` (`b`) and
	211	`U+0027` (single-quote), and followed by the character `U+0027`. If the character
	212	`U+0027` is present within the literal, it must be _escaped_ by a preceding
	213	`U+005C` (`\`) character. It is equivalent to a `u8` unsigned 8-bit integer
	214	_number literal_.
	215
	216	#### Byte string literals
	217
	218	> <sup>Lexer</sup>
	219	> BYTE_STRING_LITERAL :
	220	>    `b"` ( ASCII_FOR_STRING \| BYTE_ESCAPE \| STRING_CONTINUE )<sup>\*</sup> `"`
	221	>
	222	> ASCII_FOR_STRING :
	223	>    _any ASCII (i.e 0x00 to 0x7F), except_ `"`, `/` _and IsolatedCR_
	224
	225	A non-raw _byte string literal_ is a sequence of ASCII characters and _escapes_,
	226	preceded by the characters `U+0062` (`b`) and `U+0022` (double-quote), and
	227	followed by the character `U+0022`. If the character `U+0022` is present within
	228	the literal, it must be _escaped_ by a preceding `U+005C` (`\`) character.
	229	Alternatively, a byte string literal can be a _raw byte string literal_, defined
	230	below. A byte string literal of length `n` is equivalent to a `&'static [u8; n]` borrowed fixed-sized array
	231	of unsigned 8-bit integers.
	232
	233	Some additional _escapes_ are available in either byte or non-raw byte string
	234	literals. An escape starts with a `U+005C` (`\`) and continues with one of the
	235	following forms:
	236
	237	* A _byte escape_ escape starts with `U+0078` (`x`) and is
	238	followed by exactly two _hex digits_. It denotes the byte
	239	equal to the provided hex value.
	240	* A _whitespace escape_ is one of the characters `U+006E` (`n`), `U+0072`
	241	(`r`), or `U+0074` (`t`), denoting the bytes values `0x0A` (ASCII LF),
	242	`0x0D` (ASCII CR) or `0x09` (ASCII HT) respectively.
	243	* The _null escape_ is the character `U+0030` (`0`) and denotes the byte
	244	value `0x00` (ASCII NUL).
	245	* The _backslash escape_ is the character `U+005C` (`\`) which must be
	246	escaped in order to denote its ASCII encoding `0x5C`.
	247
	248	#### Raw byte string literals
	249
	250	> <sup>Lexer</sup>
	251	> RAW_BYTE_STRING_LITERAL :
	252	>    `br` RAW_BYTE_STRING_CONTENT
	253	>
	254	> RAW_BYTE_STRING_CONTENT :
	255	>       `"` ASCII<sup>* (non-greedy)</sup> `"`
	256	>    \| `#` RAW_STRING_CONTENT `#`
	257	>
	258	> ASCII :
	259	>    _any ASCII (i.e. 0x00 to 0x7F)_
	260
	261	Raw byte string literals do not process any escapes. They start with the
	262	character `U+0062` (`b`), followed by `U+0072` (`r`), followed by zero or more
	263	of the character `U+0023` (`#`), and a `U+0022` (double-quote) character. The
	264	_raw string body_ can contain any sequence of ASCII characters and is terminated
	265	only by another `U+0022` (double-quote) character, followed by the same number of
	266	`U+0023` (`#`) characters that preceded the opening `U+0022` (double-quote)
	267	character. A raw byte string literal can not contain any non-ASCII byte.
	268
	269	All characters contained in the raw string body represent their ASCII encoding,
	270	the characters `U+0022` (double-quote) (except when followed by at least as
	271	many `U+0023` (`#`) characters as were used to start the raw string literal) or
	272	`U+005C` (`\`) do not have any special meaning.
	273
	274	Examples for byte string literals:
	275
	276	```rust
	277	b"foo"; br"foo"; // foo
	278	b"\"foo\""; br#""foo""#; // "foo"
	279
	280	b"foo #\"# bar";
	281	br##"foo #"# bar"##; // foo #"# bar
	282
	283	b"\x52"; b"R"; br"R"; // R
	284	b"\\x52"; br"\x52"; // \x52
	285	```
	286
	287	### Number literals
	288
	289	A _number literal_ is either an _integer literal_ or a _floating-point
	290	literal_. The grammar for recognizing the two kinds of literals is mixed.
	291
	292	#### Integer literals
	293
	294	> <sup>Lexer</sup>
	295	> INTEGER_LITERAL :
	296	>    ( DEC_LITERAL \| BIN_LITERAL \| OCT_LITERAL \| HEX_LITERAL )
	297	> INTEGER_SUFFIX<sup>?</sup>
	298	>
	299	> DEC_LITERAL :
	300	>    DEC_DIGIT (DEC_DIGIT\|`_`)<sup>\*</sup>
	301	>
	302	> BIN_LITERAL :
	303	>    `0b` (BIN_DIGIT\|`_`)<sup>\</sup> BIN_DIGIT (BIN_DIGIT\|`_`)<sup>\</sup>
	304	>
	305	> OCT_LITERAL :
	306	>    `0o` (OCT_DIGIT\|`_`)<sup>\</sup> OCT_DIGIT (OCT_DIGIT\|`_`)<sup>\</sup>
	307	>
	308	> HEX_LITERAL :
	309	>    `0x` (HEX_DIGIT\|`_`)<sup>\</sup> HEX_DIGIT (HEX_DIGIT\|`_`)<sup>\</sup>
	310	>
	311	> BIN_DIGIT : [`0`-`1`]
	312	>
	313	> OCT_DIGIT : [`0`-`7`]
	314	>
	315	> DEC_DIGIT : [`0`-`9`]
	316	>
	317	> HEX_DIGIT : [`0`-`9` `a`-`f` `A`-`F`]
	318	>
	319	> INTEGER_SUFFIX :
	320	>       `u8` \| `u16` \| `u32` \| `u64` \| `usize`
	321	>    \| `i8` \| `u16` \| `i32` \| `i64` \| `usize`
	322
	323	<!-- FIXME: separate the DECIMAL_LITERAL with no prefix or suffix (used on tuple indexing and float_literal -->
	324	<!-- FIXME: u128 and i128 -->
	325
	326	An _integer literal_ has one of four forms:
	327
	328	* A _decimal literal_ starts with a decimal digit and continues with any
	329	mixture of decimal digits and _underscores_.
	330	* A _hex literal_ starts with the character sequence `U+0030` `U+0078`
	331	(`0x`) and continues as any mixture (with at least one digit) of hex digits
	332	and underscores.
	333	* An _octal literal_ starts with the character sequence `U+0030` `U+006F`
	334	(`0o`) and continues as any mixture (with at least one digit) of octal digits
	335	and underscores.
	336	* A _binary literal_ starts with the character sequence `U+0030` `U+0062`
	337	(`0b`) and continues as any mixture (with at least one digit) of binary digits
	338	and underscores.
	339
	340	Like any literal, an integer literal may be followed (immediately,
	341	without any spaces) by an _integer suffix_, which forcibly sets the
	342	type of the literal. The integer suffix must be the name of one of the
	343	integral types: `u8`, `i8`, `u16`, `i16`, `u32`, `i32`, `u64`, `i64`,
	344	`isize`, or `usize`.

The type of an _unsuffixed_ integer literal is determined by type inference:

* If an integer type can be _uniquely_ determined from the surrounding
program context, the unsuffixed integer literal has that type.

* If the program context under-constrains the type, it defaults to the
signed 32-bit integer `i32`.

* If the program context over-constrains the type, it is considered a
static type error.
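
The first two rules can be sketched as follows, again using the value's size as a stand-in for its type. Both function names are hypothetical; the commented-out line shows one way a literal could become over-constrained.

```rust
use std::mem::size_of_val;

/// Hypothetical helper: nothing constrains `x`, so it falls back to `i32`.
fn default_integer_size() -> usize {
    let x = 123;
    size_of_val(&x)
}

/// Hypothetical helper: a later use determines the type of `y` uniquely.
fn inferred_from_use_size() -> usize {
    let y = 123; // no annotation here ...
    let _bytes: Vec<u8> = vec![y]; // ... but this use requires `y: u8`
    // let _wide: u16 = y; // would over-constrain `y` (u8 vs u16): type error
    size_of_val(&y)
}

fn main() {
    assert_eq!(default_integer_size(), 4); // i32
    assert_eq!(inferred_from_use_size(), 1); // u8
}
```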

Examples of integer literals of various forms:

```rust
123; // type i32
123i32; // type i32
123u32; // type u32
123_u32; // type u32
let a: u64 = 123; // type u64

0xff; // type i32
0xff_u8; // type u8

0o70; // type i32
0o70_i16; // type i16

0b1111_1111_1001_0000; // type i32
0b1111_1111_1001_0000i64; // type i64
0b________1; // type i32

0usize; // type usize
```

Examples of invalid integer literals:

```rust,ignore
// invalid suffixes

0invalidSuffix;

// uses numbers of the wrong base

123AFB43;
0b0102;
0o0581;

// integers too big for their type (they overflow)

128_i8;
256_u8;

// bin, hex and octal literals must have at least one digit

0b_;
0b____;
```

Note that the Rust syntax considers `-1i8` as an application of the [unary minus
operator] to an integer literal `1i8`, rather than
a single integer literal.

[unary minus operator]: expressions/operator-expr.html#negation-operators
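
One visible consequence is that a method call on the literal is evaluated before the negation. The sketch below illustrates this with `i8::abs`; the two function names are hypothetical.

```rust
/// Hypothetical helper: parsed as `-(1i8.abs())`, because the literal is
/// `1i8` and the minus sign is a separate operator applied afterwards.
fn negation_applied_last() -> i8 {
    -1i8.abs()
}

/// Hypothetical helper: parentheses make the negated value the receiver.
fn negation_applied_first() -> i8 {
    (-1i8).abs()
}

fn main() {
    assert_eq!(negation_applied_last(), -1);
    assert_eq!(negation_applied_first(), 1);
}
```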

#### Floating-point literals

> <sup>Lexer</sup>
> FLOAT_LITERAL :
>       DEC_LITERAL `.`
> _(not immediately followed by `.`, `_` or an identifier_)
>    \| DEC_LITERAL FLOAT_EXPONENT
>    \| DEC_LITERAL `.` DEC_LITERAL FLOAT_EXPONENT<sup>?</sup>
>    \| DEC_LITERAL (`.` DEC_LITERAL)<sup>?</sup>
> FLOAT_EXPONENT<sup>?</sup> FLOAT_SUFFIX
>
> FLOAT_EXPONENT :
>    (`e`\|`E`) (`+`\|`-`)<sup>?</sup>
> (DEC_DIGIT\|`_`)<sup>\*</sup> DEC_DIGIT (DEC_DIGIT\|`_`)<sup>\*</sup>
>
> FLOAT_SUFFIX :
>    `f32` \| `f64`

A _floating-point literal_ has one of two forms:

* A _decimal literal_ followed by a period character `U+002E` (`.`). This is
optionally followed by another decimal literal, with an optional _exponent_.
* A single _decimal literal_ followed by an _exponent_.
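
As an illustration of these forms, the sketch below writes the same value in each accepted shape. The helper name `two_in_every_form` is hypothetical.

```rust
/// Hypothetical helper: the value two in each shape described above.
fn two_in_every_form() -> [f64; 4] {
    [
        2.,    // decimal literal followed by a period
        2.0,   // ... followed by another decimal literal
        2.0e0, // ... and an exponent
        2e0,   // decimal literal followed directly by an exponent
    ]
}

fn main() {
    assert_eq!(two_in_every_form(), [2.0; 4]);

    // Exponents may be written with `e` or `E` and an optional sign.
    assert_eq!(1.5e3, 1500.0);
    assert_eq!(25E-2, 0.25);
    // Underscores are allowed in the decimal parts.
    assert_eq!(1_000.0, 1e3);
}
```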

Like integer literals, a floating-point literal may be followed by a
suffix, so long as the pre-suffix part does not end with `U+002E` (`.`).
The suffix forcibly sets the type of the literal. There are two valid
_floating-point suffixes_, `f32` and `f64` (the 32-bit and 64-bit
floating-point types).

The type of an _unsuffixed_ floating-point literal is determined by
type inference:

* If a floating-point type can be _uniquely_ determined from the
surrounding program context, the unsuffixed floating-point literal
has that type.

* If the program context under-constrains the type, it defaults to `f64`.

* If the program context over-constrains the type, it is considered a
static type error.
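
The first two rules can be sketched in the same way as for integers, using the value's size to reveal the inferred type. Both function names are hypothetical.

```rust
use std::mem::size_of_val;

/// Hypothetical helper: nothing constrains `x`, so it falls back to `f64`.
fn default_float_size() -> usize {
    let x = 0.5;
    size_of_val(&x)
}

/// Hypothetical helper: a later use determines the type of `y` uniquely.
fn inferred_float_size() -> usize {
    let y = 0.5; // no annotation here ...
    let _single: f32 = y; // ... but this use requires `y: f32`
    size_of_val(&y)
}

fn main() {
    assert_eq!(default_float_size(), 8); // f64
    assert_eq!(inferred_float_size(), 4); // f32
}
```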

Examples of floating-point literals of various forms:

```rust
123.0f64; // type f64
0.1f64; // type f64
0.1f32; // type f32
12E+99_f64; // type f64
let x: f64 = 2.; // type f64
```

This last example is different because it is not possible to use the suffix
syntax with a floating-point literal ending in a period. `2.f64` would attempt
to call a method named `f64` on `2`.
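
If an explicit `f64` is wanted for such a value, the grammar above leaves several spellings that avoid the trailing period. The sketch below lists them next to the annotated form; `two_as_f64_variants` is a hypothetical helper.

```rust
/// Hypothetical helper: four ways to write the `f64` value two.
fn two_as_f64_variants() -> [f64; 4] {
    // A trailing period takes no suffix, so the type comes from the annotation.
    let annotated: f64 = 2.;
    [
        annotated,
        2.0f64, // suffix after a fractional part
        2f64,   // suffix directly after the decimal literal
        2_f64,  // the same, with an underscore before the suffix
    ]
    // `2.f64` is deliberately absent: it does not compile.
}

fn main() {
    assert_eq!(two_as_f64_variants(), [2.0; 4]);
}
```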

The representation semantics of floating-point numbers are described in
["Machine Types"].

["Machine Types"]: types.html#machine-types

### Boolean literals

> <sup>Lexer</sup>
> BOOLEAN_LITERAL :
>       `true`
>    \| `false`

The two values of the boolean type are written `true` and `false`.

## Symbols

Symbols are a general class of printable [tokens] that play structural
roles in a variety of grammar productions. They are the remaining
miscellaneous printable tokens that do not otherwise appear as
[unary operators], [binary operators], or [keywords].
They are catalogued in [the Symbols section][symbols] of the Grammar document.

[unary operators]: expressions/operator-expr.html#borrow-operators
[binary operators]: expressions/operator-expr.html#arithmetic-and-logical-binary-operators
[tokens]: #tokens
[symbols]: ../grammar.html#symbols
[keywords]: keywords.html