Commit | Line | Data |
---|---|---|
13cf67c4 XL |
1 | ## Data Types |
2 | ||
3 | Every value in Rust is of a certain *data type*, which tells Rust what kind of | |
4 | data is being specified so it knows how to work with that data. We’ll look at | |
5 | two data type subsets: scalar and compound. | |
6 | ||
7 | Keep in mind that Rust is a *statically typed* language, which means that it | |
8 | must know the types of all variables at compile time. The compiler can usually | |
9 | infer what type we want to use based on the value and how we use it. In cases | |
10 | when many types are possible, such as when we converted a `String` to a numeric | |
9fa01778 XL |
11 | type using `parse` in the [“Comparing the Guess to the Secret Number”] |
12 | [comparing-the-guess-to-the-secret-number]<!-- ignore --> section in Chapter 2, | |
13 | we must add a type annotation, like this: | |
13cf67c4 XL |
14 | |
15 | ```rust | |
16 | let guess: u32 = "42".parse().expect("Not a number!"); | |
17 | ``` | |
18 | ||
19 | If we don’t add the type annotation here, Rust will display the following | |
20 | error, which means the compiler needs more information from us to know which | |
21 | type we want to use: | |
22 | ||
23 | ```text | |
24 | error[E0282]: type annotations needed | |
25 | --> src/main.rs:2:9 | |
26 | | | |
27 | 2 | let guess = "42".parse().expect("Not a number!"); | |
28 | | ^^^^^ | |
29 | | | | |
30 | | cannot infer type for `_` | |
31 | | consider giving `guess` a type | |
32 | ``` | |
33 | ||
34 | You’ll see different type annotations for other data types. | |
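As a small sketch of the point above, the target type for `parse` can be supplied either on the binding, as shown earlier, or on the call itself with the `::<T>` syntax. The helper names `parse_with_annotation` and `parse_with_turbofish` are made up for this illustration.

```rust
// Two ways to tell the compiler which numeric type `parse` should produce.
// Both helper functions are hypothetical names used only for this sketch.

fn parse_with_annotation(input: &str) -> u32 {
    // The annotation on the binding drives type inference.
    let guess: u32 = input.parse().expect("Not a number!");
    guess
}

fn parse_with_turbofish(input: &str) -> u32 {
    // The target type is named on the method call instead of the binding.
    input.parse::<u32>().expect("Not a number!")
}

fn main() {
    println!("annotation: {}", parse_with_annotation("42"));
    println!("turbofish:  {}", parse_with_turbofish("42"));
}
```

Either form gives the compiler the same information, so the choice is mostly a matter of readability.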
35 | ||
36 | ### Scalar Types | |
37 | ||
38 | A *scalar* type represents a single value. Rust has four primary scalar types: | |
39 | integers, floating-point numbers, Booleans, and characters. You may recognize | |
40 | these from other programming languages. Let’s jump into how they work in Rust. | |
41 | ||
42 | #### Integer Types | |
43 | ||
44 | An *integer* is a number without a fractional component. We used one integer | |
45 | type in Chapter 2, the `u32` type. This type declaration indicates that the | |
46 | value it’s associated with should be an unsigned integer (signed integer types | |
47 | start with `i`, instead of `u`) that takes up 32 bits of space. Table 3-1 shows | |
48 | the built-in integer types in Rust. Each variant in the Signed and Unsigned | |
49 | columns (for example, `i16`) can be used to declare the type of an integer | |
50 | value. | |
51 | ||
52 | <span class="caption">Table 3-1: Integer Types in Rust</span> | |
53 | ||
54 | | Length | Signed | Unsigned | | |
55 | |---------|---------|----------| | |
56 | | 8-bit | `i8` | `u8` | | |
57 | | 16-bit | `i16` | `u16` | | |
58 | | 32-bit | `i32` | `u32` | | |
59 | | 64-bit | `i64` | `u64` | | |
60 | | 128-bit | `i128` | `u128` | | |
61 | | arch | `isize` | `usize` | | |
62 | ||
63 | Each variant can be either signed or unsigned and has an explicit size. | |
64 | *Signed* and *unsigned* refer to whether it’s possible for the number to be | |
65 | negative: in other words, whether the number needs to have a sign | |
66 | with it (signed) or whether it will never be negative and can therefore be | |
67 | represented without a sign (unsigned). It’s like writing numbers on paper: when | |
68 | the sign matters, a number is shown with a plus sign or a minus sign; however, | |
69 | when it’s safe to assume the number is positive, it’s shown with no sign. | |
9fa01778 | 70 | Signed numbers are stored using [two’s complement](https://en.wikipedia.org/wiki/Two%27s_complement) representation. |
13cf67c4 XL |
71 | |
72 | Each signed variant can store numbers from -(2<sup>n - 1</sup>) to 2<sup>n - | |
73 | 1</sup> - 1 inclusive, where *n* is the number of bits that variant uses. So an | |
74 | `i8` can store numbers from -(2<sup>7</sup>) to 2<sup>7</sup> - 1, which equals | |
75 | -128 to 127. Unsigned variants can store numbers from 0 to 2<sup>n</sup> - 1, | |
76 | so a `u8` can store numbers from 0 to 2<sup>8</sup> - 1, which equals 0 to 255. | |
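The range formulas can be checked against the constants the standard library exposes. The sketch below assumes widths of at most 64 bits so the arithmetic fits in 128-bit intermediates; `signed_range` and `unsigned_max` are illustrative helpers, not standard library functions.

```rust
// Compute the ranges described above for an n-bit integer.
// These helpers are hypothetical and only valid for widths up to 64 bits.

fn signed_range(bits: u32) -> (i128, i128) {
    let half = 1i128 << (bits - 1); // 2^(n - 1)
    (-half, half - 1)
}

fn unsigned_max(bits: u32) -> u128 {
    (1u128 << bits) - 1 // 2^n - 1
}

fn main() {
    // The computed ranges agree with the constants on the primitive types.
    assert_eq!(signed_range(8), (i8::MIN as i128, i8::MAX as i128));
    assert_eq!(unsigned_max(8), u8::MAX as u128);

    println!("i8 range: {:?}", signed_range(8));
    println!("u8 max:   {}", unsigned_max(8));
}
```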
77 | ||
78 | Additionally, the `isize` and `usize` types depend on the kind of computer your | |
79 | program is running on: 64 bits if you’re on a 64-bit architecture and 32 bits | |
80 | if you’re on a 32-bit architecture. | |
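One way to see this on your own machine is to ask for the size of `usize` in bytes. This is a minimal sketch; `usize_bits` is a hypothetical helper name.

```rust
use std::mem::size_of;

// Width of `usize` in bits on the target this program was compiled for.
fn usize_bits() -> usize {
    size_of::<usize>() * 8
}

fn main() {
    // `isize` and `usize` always have the same width as each other.
    assert_eq!(size_of::<isize>(), size_of::<usize>());
    println!("usize is {} bits wide on this target", usize_bits());
}
```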
81 | ||
82 | You can write integer literals in any of the forms shown in Table 3-2. Note | |
83 | that all number literals except the byte literal allow a type suffix, such as | |
84 | `57u8`, and `_` as a visual separator, such as `1_000`. | |
85 | ||
86 | <span class="caption">Table 3-2: Integer Literals in Rust</span> | |
87 | ||
88 | | Number literals | Example | | |
89 | |------------------|---------------| | |
90 | | Decimal | `98_222` | | |
91 | | Hex | `0xff` | | |
92 | | Octal | `0o77` | | |
93 | | Binary | `0b1111_0000` | | |
94 | | Byte (`u8` only) | `b'A'` | | |
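The literal forms in Table 3-2 can be compared directly, since they are all just different spellings of integer values. The function name `literal_values` is an assumption made for this sketch.

```rust
// Each entry uses one literal form from Table 3-2.
// `literal_values` is a hypothetical helper for this illustration.
fn literal_values() -> [u32; 5] {
    [
        98_222,        // decimal with a visual separator
        0xff,          // hex
        0o77,          // octal
        0b1111_0000,   // binary
        b'A' as u32,   // byte literal (a `u8`), widened for the array
    ]
}

fn main() {
    let small = 57u8; // type suffix selects `u8`
    let big = 1_000;  // same value as 1000
    println!("{:?} {} {}", literal_values(), small, big);
}
```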
95 | ||
96 | So how do you know which type of integer to use? If you’re unsure, Rust’s | |
97 | defaults are generally good choices, and integer types default to `i32`: this | |
98 | type is generally the fastest, even on 64-bit systems. The primary situation in | |
99 | which you’d use `isize` or `usize` is when indexing some sort of collection. | |
100 | ||
9fa01778 XL |
101 | > ##### Integer Overflow |
102 | > | |
103 | > Let’s say that you have a variable of type `u8`, which can hold values | |
104 | > between 0 and 255. What happens if you try to change the variable's value to | |
105 | > 256? This is called *integer overflow*, and Rust has some interesting rules | |
106 | > around this behavior. When compiling in debug mode, Rust includes checks for | |
107 | > integer overflow that will cause your program to *panic* at runtime if integer | |
108 | > overflow occurs. Panicking is the term Rust uses when a program exits with an | |
109 | > error; we’ll discuss panics more in the [“Unrecoverable Errors with | |
110 | > `panic!`”][unrecoverable-errors-with-panic] section of Chapter 9. | |
111 | > | |
112 | > When compiling in release mode with the `--release` flag, Rust does not | |
113 | > include checks for integer overflow that cause panics. Instead, if overflow | |
114 | > occurs, Rust will perform something called *two’s complement wrapping*. In | |
115 | > short, values greater than the maximum value the type can hold "wrap around" | |
116 | > to the minimum of the values the type can hold. In the case of a `u8`, 256 | |
117 | > becomes 0, 257 becomes 1, etc. Relying on the wrapping behavior of integer | |
118 | > overflow is considered an error. If you want to wrap explicitly, the standard | |
119 | > library has a type named `Wrapping` that provides this behavior. | |
13cf67c4 XL |
120 | |
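If wrapping or overflow detection is what you actually want, it can be requested explicitly so the behavior does not depend on the build mode. The sketch below uses the `wrapping_add` and `checked_add` methods on `u8` along with the `Wrapping` type mentioned above; the wrapper function names are made up for illustration.

```rust
use std::num::Wrapping;

// Wrap around on overflow, regardless of debug or release mode.
fn explicit_wrap(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

// Return `None` instead of wrapping or panicking when the sum does not fit.
fn checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

// The `Wrapping` type makes the ordinary `+` operator wrap.
fn wrap_with_type(a: u8, b: u8) -> u8 {
    (Wrapping(a) + Wrapping(b)).0
}

fn main() {
    println!("255 + 1 wrapping: {}", explicit_wrap(255, 1));
    println!("255 + 1 checked:  {:?}", checked(255, 1));
    println!("255 + 2 Wrapping: {}", wrap_with_type(255, 2));
}
```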
121 | #### Floating-Point Types | |
122 | ||
123 | Rust also has two primitive types for *floating-point numbers*, which are | |
124 | numbers with decimal points. Rust’s floating-point types are `f32` and `f64`, | |
125 | which are 32 bits and 64 bits in size, respectively. The default type is `f64` | |
126 | because on modern CPUs it’s roughly the same speed as `f32` but is capable of | |
127 | more precision. | |
128 | ||
129 | Here’s an example that shows floating-point numbers in action: | |
130 | ||
131 | <span class="filename">Filename: src/main.rs</span> | |
132 | ||
133 | ```rust | |
134 | fn main() { | |
135 | let x = 2.0; // f64 | |
136 | ||
137 | let y: f32 = 3.0; // f32 | |
138 | } | |
139 | ``` | |
140 | ||
141 | Floating-point numbers are represented according to the IEEE-754 standard. The | |
142 | `f32` type is a single-precision float, and `f64` has double precision. | |
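To make the precision difference concrete, the following sketch converts an `f64` value to `f32` and back and shows that information can be lost along the way. `round_trip_through_f32` is a hypothetical helper name.

```rust
// Narrow an `f64` to `f32` and widen it again.
// Values that need more precision than `f32` offers come back changed.
fn round_trip_through_f32(x: f64) -> f64 {
    x as f32 as f64
}

fn main() {
    let third = 1.0_f64 / 3.0;
    println!("f64:     {}", third);
    println!("via f32: {}", round_trip_through_f32(third));

    // Number of significand bits (including the implicit leading bit).
    println!(
        "f32 has {} significand bits, f64 has {}",
        f32::MANTISSA_DIGITS,
        f64::MANTISSA_DIGITS
    );
}
```

Values such as `0.5` that are exactly representable in both types survive the round trip unchanged.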
143 | ||
144 | #### Numeric Operations | |
145 | ||
146 | Rust supports the basic mathematical operations you’d expect for all of the | |
147 | number types: addition, subtraction, multiplication, division, and remainder. | |
148 | The following code shows how you’d use each one in a `let` statement: | |
149 | ||
150 | <span class="filename">Filename: src/main.rs</span> | |
151 | ||
152 | ```rust | |
153 | fn main() { | |
154 | // addition | |
155 | let sum = 5 + 10; | |
156 | ||
157 | // subtraction | |
158 | let difference = 95.5 - 4.3; | |
159 | ||
160 | // multiplication | |
161 | let product = 4 * 30; | |
162 | ||
163 | // division | |
164 | let quotient = 56.7 / 32.2; | |
165 | ||
166 | // remainder | |
167 | let remainder = 43 % 5; | |
168 | } | |
169 | ``` | |
170 | ||
171 | Each expression in these statements uses a mathematical operator and evaluates | |
172 | to a single value, which is then bound to a variable. Appendix B contains a | |
173 | list of all operators that Rust provides. | |
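One detail worth seeing in code is that division and remainder on integers behave differently from the floating-point versions: integer division truncates toward zero, and the remainder takes the sign of the left operand. The helper `div_rem` below is a hypothetical name used only for this sketch.

```rust
// Integer quotient and remainder of `a` and `b` (with `b` nonzero).
fn div_rem(a: i32, b: i32) -> (i32, i32) {
    (a / b, a % b)
}

fn main() {
    // Integer division discards the fractional part.
    println!("43 / 5 and 43 % 5:  {:?}", div_rem(43, 5));
    // With a negative left operand the quotient truncates toward zero.
    println!("-7 / 3 and -7 % 3:  {:?}", div_rem(-7, 3));
    // Floating-point division keeps the fractional part.
    println!("7.0 / 2.0:          {}", 7.0 / 2.0);
}
```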
174 | ||
175 | #### The Boolean Type | |
176 | ||
177 | As in most other programming languages, a Boolean type in Rust has two possible | |
9fa01778 XL |
178 | values: `true` and `false`. Booleans are one byte in size. The Boolean type in |
179 | Rust is specified using `bool`. For example: | |
13cf67c4 XL |
180 | |
181 | <span class="filename">Filename: src/main.rs</span> | |
182 | ||
183 | ```rust | |
184 | fn main() { | |
185 | let t = true; | |
186 | ||
187 | let f: bool = false; // with explicit type annotation | |
188 | } | |
189 | ``` | |
190 | ||
69743fb6 | 191 | The main way to use Boolean values is through conditionals, such as an `if` |
9fa01778 XL |
192 | expression. We’ll cover how `if` expressions work in Rust in the [“Control |
193 | Flow”][control-flow]<!-- ignore --> section. | |
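As a brief preview, here is a sketch of a `bool` steering an `if` expression. The function name `describe` and its strings are invented for this example.

```rust
use std::mem::size_of;

// Pick a message based on a Boolean value.
fn describe(is_ready: bool) -> &'static str {
    if is_ready {
        "ready"
    } else {
        "not ready"
    }
}

fn main() {
    let t = true;
    let f: bool = false;
    println!("{} / {}", describe(t), describe(f));

    // A `bool` occupies one byte.
    println!("size of bool: {} byte", size_of::<bool>());
}
```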
13cf67c4 XL |
194 | |
195 | #### The Character Type | |
196 | ||
197 | So far we’ve worked only with numbers, but Rust supports letters too. Rust’s | |
198 | `char` type is the language’s most primitive alphabetic type, and the following | |
9fa01778 | 199 | code shows one way to use it. (Note that `char` literals are specified with |
13cf67c4 XL |
200 | single quotes, as opposed to string literals, which use double quotes.) |
201 | ||
202 | <span class="filename">Filename: src/main.rs</span> | |
203 | ||
204 | ```rust | |
205 | fn main() { | |
206 | let c = 'z'; | |
207 | let z = 'ℤ'; | |
208 | let heart_eyed_cat = '😻'; | |
209 | } | |
210 | ``` | |
211 | ||
9fa01778 XL |
212 | Rust’s `char` type is four bytes in size and represents a Unicode Scalar Value, |
213 | which means it can represent a lot more than just ASCII. Accented letters; | |
214 | Chinese, Japanese, and Korean characters; emoji; and zero-width spaces are all | |
215 | valid `char` values in Rust. Unicode Scalar Values range from `U+0000` to | |
216 | `U+D7FF` and `U+E000` to `U+10FFFF` inclusive. However, a “character” isn’t | |
217 | really a concept in Unicode, so your human intuition for what a “character” is | |
218 | may not match up with what a `char` is in Rust. We’ll discuss this topic in | |
219 | detail in [“Storing UTF-8 Encoded Text with Strings”][strings]<!-- ignore --> | |
220 | in Chapter 8. | |
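The properties described above can be probed with a few standard library calls. In this sketch, `is_scalar_value` and `utf8_len` are hypothetical helpers wrapping `char::from_u32` and `char::len_utf8`.

```rust
use std::mem::size_of;

// True if `code` is a Unicode Scalar Value, that is, a valid `char`.
fn is_scalar_value(code: u32) -> bool {
    char::from_u32(code).is_some()
}

// Number of bytes the character needs when encoded as UTF-8.
fn utf8_len(c: char) -> usize {
    c.len_utf8()
}

fn main() {
    // Every `char` takes four bytes in memory, whatever it holds.
    println!("size of char: {}", size_of::<char>());

    // UTF-8 encoded length varies from one to four bytes.
    println!("{} {} {}", utf8_len('z'), utf8_len('ℤ'), utf8_len('😻'));

    // Code points in the gap between U+D7FF and U+E000 are not valid `char`s.
    println!("U+D800 valid? {}", is_scalar_value(0xD800));

    // One visible "character" may be made of several `char` values:
    // here, the letter e followed by a combining acute accent.
    println!("chars in e + accent: {}", "e\u{301}".chars().count());
}
```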
13cf67c4 XL |
221 | |
222 | ### Compound Types | |
223 | ||
224 | *Compound types* can group multiple values into one type. Rust has two | |
225 | primitive compound types: tuples and arrays. | |
226 | ||
227 | #### The Tuple Type | |
228 | ||
229 | A *tuple* is a general way of grouping together a number of values | |
230 | with a variety of types into one compound type. Tuples have a fixed length: | |
231 | once declared, they cannot grow or shrink in size. | |
232 | ||
233 | We create a tuple by writing a comma-separated list of values inside | |
234 | parentheses. Each position in the tuple has a type, and the types of the | |
235 | different values in the tuple don’t have to be the same. We’ve added optional | |
236 | type annotations in this example: | |
237 | ||
238 | <span class="filename">Filename: src/main.rs</span> | |
239 | ||
240 | ```rust | |
241 | fn main() { | |
242 | let tup: (i32, f64, u8) = (500, 6.4, 1); | |
243 | } | |
244 | ``` | |
245 | ||
246 | The variable `tup` binds to the entire tuple, because a tuple is considered a | |
247 | single compound element. To get the individual values out of a tuple, we can | |
248 | use pattern matching to destructure a tuple value, like this: | |
249 | ||
250 | <span class="filename">Filename: src/main.rs</span> | |
251 | ||
252 | ```rust | |
253 | fn main() { | |
254 | let tup = (500, 6.4, 1); | |
255 | ||
256 | let (x, y, z) = tup; | |
257 | ||
258 | println!("The value of y is: {}", y); | |
259 | } | |
260 | ``` | |
261 | ||
262 | This program first creates a tuple and binds it to the variable `tup`. It then | |
263 | uses a pattern with `let` to take `tup` and turn it into three separate | |
264 | variables, `x`, `y`, and `z`. This is called *destructuring*, because it breaks | |
265 | the single tuple into three parts. Finally, the program prints the value of | |
266 | `y`, which is `6.4`. | |
267 | ||
268 | In addition to destructuring through pattern matching, we can access a tuple | |
269 | element directly by using a period (`.`) followed by the index of the value we | |
270 | want to access. For example: | |
271 | ||
272 | <span class="filename">Filename: src/main.rs</span> | |
273 | ||
274 | ```rust | |
275 | fn main() { | |
276 | let x: (i32, f64, u8) = (500, 6.4, 1); | |
277 | ||
278 | let five_hundred = x.0; | |
279 | ||
280 | let six_point_four = x.1; | |
281 | ||
282 | let one = x.2; | |
283 | } | |
284 | ``` | |
285 | ||
286 | This program creates a tuple, `x`, and then makes a new variable for each | |
287 | element by using its index. As with most programming languages, the first | |
288 | index in a tuple is 0. | |
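Both access styles also work when a tuple is passed to or returned from a function. The sketch below uses two made-up helpers, `swap` and `sum_of_ints`, to show destructuring and index access side by side.

```rust
// Destructure the argument, then build a new tuple with the parts reversed.
fn swap(pair: (i32, f64)) -> (f64, i32) {
    let (a, b) = pair;
    (b, a)
}

// Use index access to pick out the integer elements and add them.
fn sum_of_ints(t: (i32, f64, u8)) -> i32 {
    t.0 + i32::from(t.2)
}

fn main() {
    let tup: (i32, f64, u8) = (500, 6.4, 1);
    println!("swapped: {:?}", swap((tup.0, tup.1)));
    println!("sum of integer elements: {}", sum_of_ints(tup));
}
```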
289 | ||
290 | #### The Array Type | |
291 | ||
292 | Another way to have a collection of multiple values is with an *array*. Unlike | |
293 | a tuple, every element of an array must have the same type. Arrays in Rust are | |
294 | different from arrays in some other languages because arrays in Rust have a | |
295 | fixed length, like tuples. | |
296 | ||
297 | In Rust, the values going into an array are written as a comma-separated list | |
298 | inside square brackets: | |
299 | ||
300 | <span class="filename">Filename: src/main.rs</span> | |
301 | ||
302 | ```rust | |
303 | fn main() { | |
304 | let a = [1, 2, 3, 4, 5]; | |
305 | } | |
306 | ``` | |
307 | ||
308 | Arrays are useful when you want your data allocated on the stack rather than | |
69743fb6 | 309 | the heap (we will discuss the stack and the heap more in Chapter 4) or when |
13cf67c4 XL |
310 | you want to ensure you always have a fixed number of elements. An array isn’t |
311 | as flexible as the vector type, though. A vector is a similar collection type | |
312 | provided by the standard library that *is* allowed to grow or shrink in size. | |
313 | If you’re unsure whether to use an array or a vector, you should probably use a | |
314 | vector. Chapter 8 discusses vectors in more detail. | |
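The difference in flexibility can be sketched briefly. The helper `grow` is a hypothetical name; `Vec::push` and `Vec::pop` are the standard methods for adding and removing an element at the end of a vector.

```rust
// Return the vector with one more element appended.
fn grow(mut v: Vec<i32>, extra: i32) -> Vec<i32> {
    v.push(extra);
    v
}

fn main() {
    // An array's length is part of its type and cannot change.
    let fixed = [1, 2, 3];
    println!("array length: {}", fixed.len());

    // A vector can grow and shrink while the program runs.
    let mut growable = grow(vec![1, 2, 3], 4);
    println!("after push: {:?}", growable);
    growable.pop();
    println!("after pop:  {:?}", growable);
}
```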
315 | ||
316 | An example of when you might want to use an array rather than a vector is in a | |
317 | program that needs to know the names of the months of the year. It’s very | |
318 | unlikely that such a program will need to add or remove months, so you can use | |
319 | an array because you know it will always contain 12 items: | |
320 | ||
321 | ```rust | |
322 | let months = ["January", "February", "March", "April", "May", "June", "July", | |
323 | "August", "September", "October", "November", "December"]; | |
324 | ``` | |
325 | ||
9fa01778 XL |
326 | You write an array’s type using square brackets that contain the type of |
327 | each element, a semicolon, and then the number of elements in the array, | |
328 | like so: | |
13cf67c4 XL |
329 | |
330 | ```rust | |
331 | let a: [i32; 5] = [1, 2, 3, 4, 5]; | |
332 | ``` | |
333 | ||
9fa01778 XL |
334 | Here, `i32` is the type of each element. After the semicolon, the number `5` |
335 | indicates the array contains five elements. | |
336 | ||
337 | The way an array's type is written looks similar to an alternative syntax for | |
338 | initializing an array: if you want to create an array that contains the same | |
339 | value for each element, you can specify the initial value, then a semicolon, | |
340 | then the length of the array in square brackets as shown here: | |
341 | ||
342 | ```rust | |
343 | let a = [3; 5]; | |
344 | ``` | |
345 | ||
346 | The array named `a` will contain 5 elements that will all be set to the value | |
347 | `3` initially. This is the same as writing `let a = [3, 3, 3, 3, 3];` but in a | |
348 | more concise way. | |
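The two syntaxes can be combined, with the type written on the left and the repeated-value initializer on the right. In this sketch, `zeroed_buffer` is a hypothetical function name.

```rust
// The return type `[u8; 4]` is an array type; `[0; 4]` is an initializer
// that repeats the value 0 four times.
fn zeroed_buffer() -> [u8; 4] {
    [0; 4]
}

fn main() {
    let a = [3; 5];
    // The repeat form produces the same array as listing each value.
    assert_eq!(a, [3, 3, 3, 3, 3]);
    assert_eq!(a.len(), 5);

    println!("{:?} {:?}", a, zeroed_buffer());
}
```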
13cf67c4 XL |
349 | |
350 | ##### Accessing Array Elements | |
351 | ||
352 | An array is a single chunk of memory allocated on the stack. You can access | |
353 | elements of an array using indexing, like this: | |
354 | ||
355 | <span class="filename">Filename: src/main.rs</span> | |
356 | ||
357 | ```rust | |
358 | fn main() { | |
359 | let a = [1, 2, 3, 4, 5]; | |
360 | ||
361 | let first = a[0]; | |
362 | let second = a[1]; | |
363 | } | |
364 | ``` | |
365 | ||
366 | In this example, the variable named `first` will get the value `1`, because | |
367 | that is the value at index `[0]` in the array. The variable named `second` will | |
368 | get the value `2` from index `[1]` in the array. | |
369 | ||
370 | ##### Invalid Array Element Access | |
371 | ||
372 | What happens if you try to access an element of an array that is past the end | |
373 | of the array? Say you change the example to the following code, which will | |
374 | compile but exit with an error when it runs: | |
375 | ||
376 | <span class="filename">Filename: src/main.rs</span> | |
377 | ||
378 | ```rust,ignore,panics | |
379 | fn main() { | |
380 | let a = [1, 2, 3, 4, 5]; | |
381 | let index = 10; | |
382 | ||
383 | let element = a[index]; | |
384 | ||
385 | println!("The value of element is: {}", element); | |
386 | } | |
387 | ``` | |
388 | ||
389 | Running this code using `cargo run` produces the following result: | |
390 | ||
391 | ```text | |
392 | $ cargo run | |
393 | Compiling arrays v0.1.0 (file:///projects/arrays) | |
394 | Finished dev [unoptimized + debuginfo] target(s) in 0.31 secs | |
395 | Running `target/debug/arrays` | |
396 | thread '<main>' panicked at 'index out of bounds: the len is 5 but the index is | |
397 | 10', src/main.rs:6 | |
398 | note: Run with `RUST_BACKTRACE=1` for a backtrace. | |
399 | ``` | |
400 | ||
401 | The compilation didn’t produce any errors, but the program resulted in a | |
402 | *runtime* error and didn’t exit successfully. When you attempt to access an | |
403 | element using indexing, Rust will check that the index you’ve specified is less | |
9fa01778 XL |
404 | than the array length. If the index is greater than or equal to the array |
405 | length, Rust will panic. | |
13cf67c4 XL |
406 | |
407 | This is the first example of Rust’s safety principles in action. In many | |
408 | low-level languages, this kind of check is not done, and when you provide an | |
409 | incorrect index, invalid memory can be accessed. Rust protects you against this | |
410 | kind of error by immediately exiting instead of allowing the memory access and | |
411 | continuing. Chapter 9 discusses more of Rust’s error handling. | |
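When an out-of-range index is an expected possibility rather than a bug, the `get` method on slices and arrays offers a non-panicking alternative: it returns an `Option` instead of exiting. The following is a minimal sketch, and `checked_element` is a hypothetical helper name.

```rust
// Look up an element without panicking.
// Returns `None` when `index` is past the end of the slice.
fn checked_element(values: &[i32], index: usize) -> Option<i32> {
    values.get(index).copied()
}

fn main() {
    let a = [1, 2, 3, 4, 5];

    match checked_element(&a, 10) {
        Some(element) => println!("The value of element is: {}", element),
        None => println!("Index 10 is out of bounds for length {}", a.len()),
    }
}
```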
9fa01778 XL |
412 | |
413 | [comparing-the-guess-to-the-secret-number]: | |
414 | ch02-00-guessing-game-tutorial.html#comparing-the-guess-to-the-secret-number | |
415 | [control-flow]: ch03-05-control-flow.html#control-flow | |
416 | [strings]: ch08-02-strings.html#storing-utf-8-encoded-text-with-strings | |
417 | [unrecoverable-errors-with-panic]: ch09-01-unrecoverable-errors-with-panic.html |