## Data Types

Every value in Rust is of a certain *data type*, which tells Rust what kind of
data is being specified so it knows how to work with that data. We’ll look at
two data type subsets: scalar and compound.

Keep in mind that Rust is a *statically typed* language, which means that it
must know the types of all variables at compile time. The compiler can usually
infer what type we want to use based on the value and how we use it. In cases
when many types are possible, such as when we converted a `String` to a numeric
type using `parse` in the [“Comparing the Guess to the Secret
Number”][comparing-the-guess-to-the-secret-number]<!-- ignore --> section in
Chapter 2, we must add a type annotation, like this:

```rust
let guess: u32 = "42".parse().expect("Not a number!");
```

If we don’t add the type annotation here, Rust will display the following
error, which means the compiler needs more information from us to know which
type we want to use:

```text
error[E0282]: type annotations needed
 --> src/main.rs:2:9
  |
2 |     let guess = "42".parse().expect("Not a number!");
  |         ^^^^^
  |         |
  |         cannot infer type for `_`
  |         consider giving `guess` a type
```

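As a hedged aside, the annotation doesn’t have to sit on the variable. The
sketch below assumes only the standard `str::parse` method and shows that the
target type can also be named on the call itself. The helper name
`parse_guess` is made up for illustration.

```rust
// Hypothetical helper (not part of the chapter): parse a decimal string as u32.
fn parse_guess(text: &str) -> u32 {
    // Option 1: annotate the binding, as in the example above.
    let annotated: u32 = text.parse().expect("Not a number!");

    // Option 2: name the target type on the call with `::<u32>`.
    let on_call = text.parse::<u32>().expect("Not a number!");

    // Both spellings give the compiler the same information.
    assert_eq!(annotated, on_call);
    annotated
}

fn main() {
    println!("guess = {}", parse_guess("42"));
}
```
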
You’ll see different type annotations for other data types.

### Scalar Types

A *scalar* type represents a single value. Rust has four primary scalar types:
integers, floating-point numbers, Booleans, and characters. You may recognize
these from other programming languages. Let’s jump into how they work in Rust.

#### Integer Types

An *integer* is a number without a fractional component. We used one integer
type in Chapter 2, the `u32` type. This type declaration indicates that the
value it’s associated with should be an unsigned integer (signed integer types
start with `i`, instead of `u`) that takes up 32 bits of space. Table 3-1 shows
the built-in integer types in Rust. Each variant in the Signed and Unsigned
columns (for example, `i16`) can be used to declare the type of an integer
value.

<span class="caption">Table 3-1: Integer Types in Rust</span>

| Length  | Signed  | Unsigned |
|---------|---------|----------|
| 8-bit   | `i8`    | `u8`     |
| 16-bit  | `i16`   | `u16`    |
| 32-bit  | `i32`   | `u32`    |
| 64-bit  | `i64`   | `u64`    |
| 128-bit | `i128`  | `u128`   |
| arch    | `isize` | `usize`  |

Each variant can be either signed or unsigned and has an explicit size.
*Signed* and *unsigned* refer to whether it’s possible for the number to be
negative; in other words, whether the number needs to have a sign with it
(signed) or whether it will only ever be zero or positive and can therefore be
represented without a sign (unsigned). It’s like writing numbers on paper: when
the sign matters, a number is shown with a plus sign or a minus sign; however,
when it’s safe to assume the number is positive, it’s shown with no sign.
Signed numbers are stored using [two’s complement](https://en.wikipedia.org/wiki/Two%27s_complement) representation.

Each signed variant can store numbers from -(2<sup>n - 1</sup>) to 2<sup>n -
1</sup> - 1 inclusive, where *n* is the number of bits that variant uses. So an
`i8` can store numbers from -(2<sup>7</sup>) to 2<sup>7</sup> - 1, which equals
-128 to 127. Unsigned variants can store numbers from 0 to 2<sup>n</sup> - 1,
so a `u8` can store numbers from 0 to 2<sup>8</sup> - 1, which equals 0 to 255.

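To make the range formulas concrete, here is a small sketch that computes the
bounds from the number of bits and compares them with the limits the standard
library reports. The helpers `signed_range` and `unsigned_range` are
hypothetical names, and the `i8::MIN` style of constant assumes a reasonably
recent toolchain (older ones spell it `std::i8::MIN`).

```rust
// Hypothetical helpers: compute the bounds from the formulas in the text.
// The arithmetic is done in i128, so this sketch is only meant for
// widths up to 64 bits.
fn signed_range(bits: u32) -> (i128, i128) {
    let half = 2_i128.pow(bits - 1); // 2^(n - 1)
    (-half, half - 1)
}

fn unsigned_range(bits: u32) -> (i128, i128) {
    (0, 2_i128.pow(bits) - 1) // 0 to 2^n - 1
}

fn main() {
    // The computed bounds agree with the standard library's constants.
    assert_eq!(signed_range(8), (i8::MIN as i128, i8::MAX as i128));
    assert_eq!(unsigned_range(8), (0, u8::MAX as i128));
    assert_eq!(signed_range(32), (i32::MIN as i128, i32::MAX as i128));

    println!("i8: {:?}, u8: {:?}", signed_range(8), unsigned_range(8));
}
```
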
Additionally, the `isize` and `usize` types depend on the kind of computer your
program is running on: 64 bits if you’re on a 64-bit architecture and 32 bits
if you’re on a 32-bit architecture.

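If you’re curious what that means on your own machine, a minimal sketch using
`std::mem::size_of` can report it. The helper name `pointer_width_bits` is an
assumption made for this example, and the printed number depends on the target
you compile for.

```rust
use std::mem::size_of;

// Hypothetical helper: width of `usize` in bits on the compilation target.
fn pointer_width_bits() -> usize {
    size_of::<usize>() * 8
}

fn main() {
    // `isize` and `usize` always have the same width as each other.
    assert_eq!(size_of::<isize>(), size_of::<usize>());

    println!("usize is {} bits wide on this target", pointer_width_bits());
}
```
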
You can write integer literals in any of the forms shown in Table 3-2. Note
that all number literals except the byte literal allow a type suffix, such as
`57u8`, and `_` as a visual separator, such as `1_000`.

<span class="caption">Table 3-2: Integer Literals in Rust</span>

| Number literals  | Example       |
|------------------|---------------|
| Decimal          | `98_222`      |
| Hex              | `0xff`        |
| Octal            | `0o77`        |
| Binary           | `0b1111_0000` |
| Byte (`u8` only) | `b'A'`        |

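The following sketch writes out the literals from Table 3-2 and checks the
values they denote. The function name `literal_values` is hypothetical; the
byte literal is widened with `as u32` only so that all five values fit in one
array.

```rust
// Hypothetical helper: the example literals from Table 3-2, in table order.
fn literal_values() -> [u32; 5] {
    [
        98_222,        // decimal, with `_` as a visual separator
        0xff,          // hex
        0o77,          // octal
        0b1111_0000,   // binary
        b'A' as u32,   // byte literal (a `u8`), widened for the array
    ]
}

fn main() {
    assert_eq!(literal_values(), [98222, 255, 63, 240, 65]);

    // A type suffix fixes the literal's type without a separate annotation.
    let small = 57u8;
    let big = 1_000i64;
    println!("{} {}", small, big);
}
```
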
So how do you know which type of integer to use? If you’re unsure, Rust’s
defaults are generally good choices, and integer types default to `i32`: this
type is generally the fastest, even on 64-bit systems. The primary situation in
which you’d use `isize` or `usize` is when indexing some sort of collection.

> ##### Integer Overflow
>
> Let’s say you have a variable of type `u8` that can hold values between 0 and 255.
> If you try to change the variable to a value outside of that range, such
> as 256, *integer overflow* will occur. Rust has some interesting rules
> involving this behavior. When you’re compiling in debug mode, Rust includes
> checks for integer overflow that cause your program to *panic* at runtime if
> this behavior occurs. Rust uses the term panicking when a program exits with
> an error; we’ll discuss panics in more depth in the [“Unrecoverable Errors
> with `panic!`”][unrecoverable-errors-with-panic]<!-- ignore --> section in
> Chapter 9.
>
> When you’re compiling in release mode with the `--release` flag, Rust does
> *not* include checks for integer overflow that cause panics. Instead, if
> overflow occurs, Rust performs *two’s complement wrapping*. In short, values
> greater than the maximum value the type can hold “wrap around” to the minimum
> of the values the type can hold. In the case of a `u8`, 256 becomes 0, 257
> becomes 1, and so on. The program won’t panic, but the variable will have a
> value that probably isn’t what you were expecting it to have. Relying on
> integer overflow’s wrapping behavior is considered an error. If you want to
> wrap explicitly, you can use the standard library type [`Wrapping`][wrapping].

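Here is a small sketch of what handling overflow explicitly can look like. It
uses the `Wrapping` type mentioned above together with the `wrapping_add`,
`checked_add`, and `overflowing_add` methods that the standard library defines
on the integer types. These behave the same in debug and release builds. The
helper `add_one` is a hypothetical name chosen for this example.

```rust
use std::num::Wrapping;

// Hypothetical helper: add 1 to a `u8` in three explicit ways.
fn add_one(value: u8) -> (u8, Option<u8>, (u8, bool)) {
    (
        value.wrapping_add(1),    // wraps around on overflow
        value.checked_add(1),     // `None` on overflow
        value.overflowing_add(1), // wrapped result plus an overflow flag
    )
}

fn main() {
    // 255 is the largest `u8`, so adding 1 overflows.
    assert_eq!(add_one(255), (0, None, (0, true)));
    // No overflow: all three agree on 42.
    assert_eq!(add_one(41), (42, Some(42), (42, false)));

    // `Wrapping` makes the ordinary `+` operator wrap.
    let wrapped = Wrapping(255u8) + Wrapping(1u8);
    assert_eq!(wrapped.0, 0);
}
```
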
#### Floating-Point Types

Rust also has two primitive types for *floating-point numbers*, which are
numbers with decimal points. Rust’s floating-point types are `f32` and `f64`,
which are 32 bits and 64 bits in size, respectively. The default type is `f64`
because on modern CPUs it’s roughly the same speed as `f32` but is capable of
more precision.

Here’s an example that shows floating-point numbers in action:

<span class="filename">Filename: src/main.rs</span>

```rust
fn main() {
    let x = 2.0; // f64

    let y: f32 = 3.0; // f32
}
```

Floating-point numbers are represented according to the IEEE-754 standard. The
`f32` type is a single-precision float, and `f64` has double precision.

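To get a feel for the difference in precision, the sketch below computes one
third in both types and measures how far apart the two results are. The helper
`one_third_gap` is a hypothetical name, and the exact gap is not something
you’d rely on in real code; the point is only that the two types round
differently.

```rust
// Hypothetical helper: distance between the f32 and f64 results of 1/3.
fn one_third_gap() -> f64 {
    let single: f32 = 1.0 / 3.0;
    let double: f64 = 1.0 / 3.0;
    // Widening an f32 to f64 is lossless, so the gap is the f32 rounding error.
    (f64::from(single) - double).abs()
}

fn main() {
    assert_eq!(std::mem::size_of::<f32>(), 4); // 32 bits
    assert_eq!(std::mem::size_of::<f64>(), 8); // 64 bits

    let gap = one_third_gap();
    assert!(gap > 0.0 && gap < 1e-7);
    println!("f32 and f64 disagree on 1/3 by about {:e}", gap);
}
```
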
#### Numeric Operations

Rust supports the basic mathematical operations you’d expect for all of the
number types: addition, subtraction, multiplication, division, and remainder.
The following code shows how you’d use each one in a `let` statement:

<span class="filename">Filename: src/main.rs</span>

```rust
fn main() {
    // addition
    let sum = 5 + 10;

    // subtraction
    let difference = 95.5 - 4.3;

    // multiplication
    let product = 4 * 30;

    // division
    let quotient = 56.7 / 32.2;

    // remainder
    let remainder = 43 % 5;
}
```

Each expression in these statements uses a mathematical operator and evaluates
to a single value, which is then bound to a variable. Appendix B contains a
list of all operators that Rust provides.

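One detail worth seeing in code is that division behaves differently for
integers and floating-point numbers. The sketch below uses a hypothetical
helper, `div_rem`, to show that integer division truncates toward zero and
that the remainder takes the sign of the left operand. It also shows that
mixing an integer with a float needs an explicit conversion.

```rust
// Hypothetical helper: integer quotient and remainder together.
fn div_rem(a: i32, b: i32) -> (i32, i32) {
    (a / b, a % b)
}

fn main() {
    // Integer division drops the fractional part.
    assert_eq!(div_rem(56, 32), (1, 24));
    // Truncation is toward zero; the remainder keeps the dividend's sign.
    assert_eq!(div_rem(-7, 2), (-3, -1));

    // Floating-point division keeps the fraction.
    assert_eq!(7.0 / 2.0, 3.5);

    // Integers and floats don't mix implicitly; convert with `as` first.
    let total = 5 as f64 + 0.5;
    assert_eq!(total, 5.5);
}
```
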
#### The Boolean Type

As in most other programming languages, a Boolean type in Rust has two possible
values: `true` and `false`. Booleans are one byte in size. The Boolean type in
Rust is specified using `bool`. For example:

<span class="filename">Filename: src/main.rs</span>

```rust
fn main() {
    let t = true;

    let f: bool = false; // with explicit type annotation
}
```

The main way to use Boolean values is through conditionals, such as an `if`
expression. We’ll cover how `if` expressions work in Rust in the [“Control
Flow”][control-flow]<!-- ignore --> section.

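As a brief preview, and only a sketch, here is a comparison producing a `bool`
that an `if` expression then consumes. The function `describe` is a
hypothetical name used for this example.

```rust
// Hypothetical helper: a comparison yields a `bool`, which `if` consumes.
fn describe(n: i32) -> &'static str {
    let is_even: bool = n % 2 == 0;
    if is_even {
        "even"
    } else {
        "odd"
    }
}

fn main() {
    // A `bool` occupies one byte.
    assert_eq!(std::mem::size_of::<bool>(), 1);

    assert_eq!(describe(4), "even");
    assert_eq!(describe(7), "odd");
}
```
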
#### The Character Type

So far we’ve worked only with numbers, but Rust supports letters too. Rust’s
`char` type is the language’s most primitive alphabetic type, and the following
code shows one way to use it. (Note that `char` literals are specified with
single quotes, as opposed to string literals, which use double quotes.)

<span class="filename">Filename: src/main.rs</span>

```rust
fn main() {
    let c = 'z';
    let z = 'ℤ';
    let heart_eyed_cat = '😻';
}
```

Rust’s `char` type is four bytes in size and represents a Unicode Scalar Value,
which means it can represent a lot more than just ASCII. Accented letters;
Chinese, Japanese, and Korean characters; emoji; and zero-width spaces are all
valid `char` values in Rust. Unicode Scalar Values range from `U+0000` to
`U+D7FF` and `U+E000` to `U+10FFFF` inclusive. However, a “character” isn’t
really a concept in Unicode, so your human intuition for what a “character” is
may not match up with what a `char` is in Rust. We’ll discuss this topic in
detail in [“Storing UTF-8 Encoded Text with Strings”][strings]<!-- ignore -->
in Chapter 8.

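The sketch below checks a few of these claims with standard library calls
(`char::len_utf8`, `std::char::from_u32`, and `str::chars`). The helper
`utf8_len` is a hypothetical name. The last assertion illustrates the gap
between intuition and `char`: an “e” followed by a combining accent looks like
one character on screen but is two `char` values.

```rust
// Hypothetical helper: bytes needed to encode the character as UTF-8.
fn utf8_len(c: char) -> usize {
    c.len_utf8()
}

fn main() {
    // A `char` itself always occupies four bytes...
    assert_eq!(std::mem::size_of::<char>(), 4);

    // ...while its UTF-8 encoding takes between one and four bytes.
    assert_eq!(utf8_len('z'), 1);
    assert_eq!(utf8_len('ℤ'), 3);
    assert_eq!(utf8_len('😻'), 4);

    // A `char` can be converted to its scalar value.
    assert_eq!('z' as u32, 0x7A);

    // U+D800 lies in the excluded gap, so it is not a valid `char`.
    assert!(std::char::from_u32(0xD800).is_none());
    assert_eq!(std::char::from_u32(0x2124), Some('ℤ'));

    // "e" plus a combining acute accent: one visible glyph, two `char`s.
    assert_eq!("e\u{301}".chars().count(), 2);
}
```
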
### Compound Types

*Compound types* can group multiple values into one type. Rust has two
primitive compound types: tuples and arrays.

#### The Tuple Type

A tuple is a general way of grouping together a number of values with a variety
of types into one compound type. Tuples have a fixed length: once declared,
they cannot grow or shrink in size.

We create a tuple by writing a comma-separated list of values inside
parentheses. Each position in the tuple has a type, and the types of the
different values in the tuple don’t have to be the same. We’ve added optional
type annotations in this example:

<span class="filename">Filename: src/main.rs</span>

```rust
fn main() {
    let tup: (i32, f64, u8) = (500, 6.4, 1);
}
```

The variable `tup` binds to the entire tuple, because a tuple is considered a
single compound element. To get the individual values out of a tuple, we can
use pattern matching to destructure a tuple value, like this:

<span class="filename">Filename: src/main.rs</span>

```rust
fn main() {
    let tup = (500, 6.4, 1);

    let (x, y, z) = tup;

    println!("The value of y is: {}", y);
}
```

This program first creates a tuple and binds it to the variable `tup`. It then
uses a pattern with `let` to take `tup` and turn it into three separate
variables, `x`, `y`, and `z`. This is called *destructuring*, because it breaks
the single tuple into three parts. Finally, the program prints the value of
`y`, which is `6.4`.

In addition to destructuring through pattern matching, we can access a tuple
element directly by using a period (`.`) followed by the index of the value we
want to access. For example:

<span class="filename">Filename: src/main.rs</span>

```rust
fn main() {
    let x: (i32, f64, u8) = (500, 6.4, 1);

    let five_hundred = x.0;

    let six_point_four = x.1;

    let one = x.2;
}
```

This program creates a tuple, `x`, and then makes new variables for each
element by using their respective indices. As with most programming languages,
the first index in a tuple is 0.

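One place where both access styles come up is a function that hands back more
than one value. The sketch below is only an illustration: `split_seconds` is a
hypothetical function that returns a tuple, which the caller either
destructures or indexes.

```rust
// Hypothetical helper: split a duration in seconds into (minutes, seconds).
fn split_seconds(total: u32) -> (u32, u32) {
    (total / 60, total % 60)
}

fn main() {
    // Destructure the returned tuple into two variables.
    let (minutes, seconds) = split_seconds(125);
    assert_eq!((minutes, seconds), (2, 5));

    // Or keep the tuple whole and index into it.
    let parts = split_seconds(59);
    assert_eq!(parts.0, 0);
    assert_eq!(parts.1, 59);
}
```
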
#### The Array Type

Another way to have a collection of multiple values is with an *array*. Unlike
a tuple, every element of an array must have the same type. Arrays in Rust are
different from arrays in some other languages because arrays in Rust have a
fixed length, like tuples.

In Rust, the values going into an array are written as a comma-separated list
inside square brackets:

<span class="filename">Filename: src/main.rs</span>

```rust
fn main() {
    let a = [1, 2, 3, 4, 5];
}
```

Arrays are useful when you want your data allocated on the stack rather than
the heap (we will discuss the stack and the heap more in Chapter 4) or when
you want to ensure you always have a fixed number of elements. An array isn’t
as flexible as the vector type, though. A vector is a similar collection type
provided by the standard library that *is* allowed to grow or shrink in size.
If you’re unsure whether to use an array or a vector, you should probably use a
vector. Chapter 8 discusses vectors in more detail.

An example of when you might want to use an array rather than a vector is in a
program that needs to know the names of the months of the year. It’s very
unlikely that such a program will need to add or remove months, so you can use
an array because you know it will always contain 12 elements:

```rust
let months = ["January", "February", "March", "April", "May", "June", "July",
              "August", "September", "October", "November", "December"];
```

You would write an array’s type by using square brackets, and within the
brackets include the type of each element, a semicolon, and then the number of
elements in the array, like so:

```rust
let a: [i32; 5] = [1, 2, 3, 4, 5];
```

Here, `i32` is the type of each element. After the semicolon, the number `5`
indicates the array contains five elements.

Writing an array’s type this way looks similar to an alternative syntax for
initializing an array: if you want to create an array that contains the same
value for each element, you can specify the initial value, followed by a
semicolon, and then the length of the array in square brackets, as shown here:

```rust
let a = [3; 5];
```

The array named `a` will contain `5` elements that will all be set to the value
`3` initially. This is the same as writing `let a = [3, 3, 3, 3, 3];` but in a
more concise way.

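A quick sketch can confirm that the two spellings are equal and that the
length really is part of the array’s type. The function `threes` is a
hypothetical name used only here.

```rust
// Hypothetical helper: build the repeated-value array from the text.
fn threes() -> [i32; 5] {
    [3; 5]
}

fn main() {
    // The repeat syntax and the written-out list produce equal arrays.
    assert_eq!(threes(), [3, 3, 3, 3, 3]);
    assert_eq!(threes().len(), 5);

    // Five `i32` elements stored back to back, with nothing extra.
    assert_eq!(
        std::mem::size_of::<[i32; 5]>(),
        5 * std::mem::size_of::<i32>()
    );
}
```
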
##### Accessing Array Elements

An array is a single chunk of memory allocated on the stack. You can access
elements of an array using indexing, like this:

<span class="filename">Filename: src/main.rs</span>

```rust
fn main() {
    let a = [1, 2, 3, 4, 5];

    let first = a[0];
    let second = a[1];
}
```

In this example, the variable named `first` will get the value `1`, because
that is the value at index `[0]` in the array. The variable named `second` will
get the value `2` from index `[1]` in the array.

##### Invalid Array Element Access

What happens if you try to access an element of an array that is past the end
of the array? Say you change the example to the following code, which will
compile but exit with an error when it runs:

<span class="filename">Filename: src/main.rs</span>

```rust,ignore,panics
fn main() {
    let a = [1, 2, 3, 4, 5];
    let index = 10;

    let element = a[index];

    println!("The value of element is: {}", element);
}
```

Running this code using `cargo run` produces the following result:

```text
$ cargo run
   Compiling arrays v0.1.0 (file:///projects/arrays)
    Finished dev [unoptimized + debuginfo] target(s) in 0.31 secs
     Running `target/debug/arrays`
thread 'main' panicked at 'index out of bounds: the len is 5 but the index is
 10', src/main.rs:5:19
note: Run with `RUST_BACKTRACE=1` for a backtrace.
```

The compilation didn’t produce any errors, but the program resulted in a
*runtime* error and didn’t exit successfully. When you attempt to access an
element using indexing, Rust will check that the index you’ve specified is less
than the array length. If the index is greater than or equal to the array
length, Rust will panic.

This is the first example of Rust’s safety principles in action. In many
low-level languages, this kind of check is not done, and when you provide an
incorrect index, invalid memory can be accessed. Rust protects you against this
kind of error by immediately exiting instead of allowing the memory access and
continuing. Chapter 9 discusses more of Rust’s error handling.

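When an index might be out of range and a panic isn’t what you want, the
standard library’s `get` method on slices and arrays returns an `Option`
instead of indexing directly. The sketch below is one possible way to use it,
not a pattern this chapter prescribes; `element_at` is a hypothetical helper
name.

```rust
// Hypothetical helper: look up an element without panicking.
fn element_at(values: &[i32], index: usize) -> Option<i32> {
    // `get` performs the same bounds check as `values[index]`,
    // but reports failure as `None` rather than panicking.
    values.get(index).copied()
}

fn main() {
    let a = [1, 2, 3, 4, 5];

    assert_eq!(element_at(&a, 0), Some(1));
    assert_eq!(element_at(&a, 10), None);

    match element_at(&a, 10) {
        Some(value) => println!("The value of element is: {}", value),
        None => println!("index 10 is out of bounds for length {}", a.len()),
    }
}
```
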
[comparing-the-guess-to-the-secret-number]:
ch02-00-guessing-game-tutorial.html#comparing-the-guess-to-the-secret-number
[control-flow]: ch03-05-control-flow.html#control-flow
[strings]: ch08-02-strings.html#storing-utf-8-encoded-text-with-strings
[unrecoverable-errors-with-panic]: ch09-01-unrecoverable-errors-with-panic.html
[wrapping]: ../std/num/struct.Wrapping.html