]>
Commit | Line | Data |
---|---|---|
04454e1e FG |
1 | <!-- DO NOT EDIT THIS FILE. |
2 | ||
3 | This file is periodically generated from the content in the `/src/` | |
4 | directory, so all fixes need to be made in `/src/`. | |
5 | --> | |
6 | ||
7 | [TOC] | |
8 | ||
9 | # Functional Language Features: Iterators and Closures | |
10 | ||
11 | Rust’s design has taken inspiration from many existing languages and | |
12 | techniques, and one significant influence is *functional programming*. | |
13 | Programming in a functional style often includes using functions as values by | |
14 | passing them in arguments, returning them from other functions, assigning them | |
15 | to variables for later execution, and so forth. | |
16 | ||
17 | In this chapter, we won’t debate the issue of what functional programming is or | |
18 | isn’t but will instead discuss some features of Rust that are similar to | |
19 | features in many languages often referred to as functional. | |
20 | ||
21 | More specifically, we’ll cover: | |
22 | ||
23 | * *Closures*, a function-like construct you can store in a variable | |
24 | * *Iterators*, a way of processing a series of elements | |
923072b8 | 25 | * How to use closures and iterators to improve the I/O project in Chapter 12 |
2b03887a FG |
26 | * The performance of closures and iterators (spoiler alert: they’re faster than |
27 | you might think!) | |
04454e1e | 28 | |
923072b8 FG |
29 | We’ve already covered some other Rust features, such as pattern matching and |
30 | enums, that are also influenced by the functional style. Because mastering | |
04454e1e | 31 | closures and iterators is an important part of writing idiomatic, fast Rust |
923072b8 | 32 | code, we’ll devote this entire chapter to them. |
04454e1e | 33 | |
2b03887a | 34 | ## Closures: Anonymous Functions That Capture Their Environment |
04454e1e FG |
35 | |
36 | Rust’s closures are anonymous functions you can save in a variable or pass as | |
37 | arguments to other functions. You can create the closure in one place and then | |
923072b8 FG |
38 | call the closure elsewhere to evaluate it in a different context. Unlike |
39 | functions, closures can capture values from the scope in which they’re defined. | |
40 | We’ll demonstrate how these closure features allow for code reuse and behavior | |
04454e1e FG |
41 | customization. |
42 | ||
43 | ### Capturing the Environment with Closures | |
44 | ||
923072b8 | 45 | We’ll first examine how we can use closures to capture values from the |
2b03887a FG |
46 | environment they’re defined in for later use. Here’s the scenario: every so |
47 | often, our T-shirt company gives away an exclusive, limited-edition shirt to | |
923072b8 FG |
48 | someone on our mailing list as a promotion. People on the mailing list can |
49 | optionally add their favorite color to their profile. If the person chosen for | |
50 | a free shirt has their favorite color set, they get that color shirt. If the | |
51 | person hasn’t specified a favorite color, they get whatever color the company | |
52 | currently has the most of. | |
04454e1e FG |
53 | |
54 | There are many ways to implement this. For this example, we’re going to use an | |
923072b8 | 55 | enum called `ShirtColor` that has the variants `Red` and `Blue` (limiting the |
064997fb FG |
56 | number of colors available for simplicity). We represent the company’s |
57 | inventory with an `Inventory` struct that has a field named `shirts` that | |
58 | contains a `Vec<ShirtColor>` representing the shirt colors currently in stock. | |
2b03887a FG |
59 | The method `giveaway` defined on `Inventory` gets the optional shirt color |
60 | preference of the free-shirt winner, and returns the shirt color the person | |
61 | will get. This setup is shown in Listing 13-1. | |
04454e1e FG |
62 | |
63 | Filename: src/main.rs | |
64 | ||
65 | ``` | |
66 | #[derive(Debug, PartialEq, Copy, Clone)] | |
67 | enum ShirtColor { | |
68 | Red, | |
69 | Blue, | |
70 | } | |
71 | ||
72 | struct Inventory { | |
73 | shirts: Vec<ShirtColor>, | |
74 | } | |
75 | ||
76 | impl Inventory { | |
2b03887a FG |
77 | fn giveaway( |
78 | &self, | |
79 | user_preference: Option<ShirtColor>, | |
80 | ) -> ShirtColor { | |
81 | 1 user_preference.unwrap_or_else(|| self.most_stocked()) | |
04454e1e FG |
82 | } |
83 | ||
84 | fn most_stocked(&self) -> ShirtColor { | |
85 | let mut num_red = 0; | |
86 | let mut num_blue = 0; | |
87 | ||
88 | for color in &self.shirts { | |
89 | match color { | |
90 | ShirtColor::Red => num_red += 1, | |
91 | ShirtColor::Blue => num_blue += 1, | |
92 | } | |
93 | } | |
94 | if num_red > num_blue { | |
95 | ShirtColor::Red | |
96 | } else { | |
97 | ShirtColor::Blue | |
98 | } | |
99 | } | |
100 | } | |
101 | ||
102 | fn main() { | |
103 | let store = Inventory { | |
2b03887a FG |
104 | 2 shirts: vec![ |
105 | ShirtColor::Blue, | |
106 | ShirtColor::Red, | |
107 | ShirtColor::Blue, | |
108 | ], | |
04454e1e FG |
109 | }; |
110 | ||
111 | let user_pref1 = Some(ShirtColor::Red); | |
2b03887a | 112 | 3 let giveaway1 = store.giveaway(user_pref1); |
04454e1e FG |
113 | println!( |
114 | "The user with preference {:?} gets {:?}", | |
115 | user_pref1, giveaway1 | |
116 | ); | |
117 | ||
118 | let user_pref2 = None; | |
2b03887a | 119 | 4 let giveaway2 = store.giveaway(user_pref2); |
04454e1e FG |
120 | println!( |
121 | "The user with preference {:?} gets {:?}", | |
122 | user_pref2, giveaway2 | |
123 | ); | |
124 | } | |
125 | ``` | |
126 | ||
923072b8 FG |
127 | Listing 13-1: Shirt company giveaway situation |
128 | ||
129 | The `store` defined in `main` has two blue shirts and one red shirt remaining | |
130 | to distribute for this limited-edition promotion [2]. We call the `giveaway` | |
131 | method for a user with a preference for a red shirt [3] and a user without any | |
132 | preference [4]. | |
923072b8 FG |
133 | |
134 | Again, this code could be implemented in many ways, and here, to focus on | |
2b03887a FG |
135 | closures, we’ve stuck to concepts you’ve already learned, except for the body |
136 | of the `giveaway` method that uses a closure. In the `giveaway` method, we get | |
137 | the user preference as a parameter of type `Option<ShirtColor>` and call the | |
138 | `unwrap_or_else` method on `user_preference` [1]. The `unwrap_or_else` method | |
139 | on `Option<T>` is defined by the standard library. It takes one argument: a | |
923072b8 FG |
140 | closure without any arguments that returns a value `T` (the same type stored in |
141 | the `Some` variant of the `Option<T>`, in this case `ShirtColor`). If the | |
142 | `Option<T>` is the `Some` variant, `unwrap_or_else` returns the value from | |
143 | within the `Some`. If the `Option<T>` is the `None` variant, `unwrap_or_else` | |
144 | calls the closure and returns the value returned by the closure. | |
145 | ||
146 | We specify the closure expression `|| self.most_stocked()` as the argument to | |
147 | `unwrap_or_else`. This is a closure that takes no parameters itself (if the | |
2b03887a | 148 | closure had parameters, they would appear between the two vertical pipes). The |
923072b8 FG |
149 | body of the closure calls `self.most_stocked()`. We’re defining the closure |
150 | here, and the implementation of `unwrap_or_else` will evaluate the closure | |
151 | later if the result is needed. | |
152 | ||
2b03887a | 153 | Running this code prints the following: |
04454e1e FG |
154 | |
155 | ``` | |
04454e1e FG |
156 | The user with preference Some(Red) gets Red |
157 | The user with preference None gets Blue | |
158 | ``` | |
159 | ||
923072b8 | 160 | One interesting aspect here is that we’ve passed a closure that calls |
04454e1e FG |
161 | `self.most_stocked()` on the current `Inventory` instance. The standard library |
162 | didn’t need to know anything about the `Inventory` or `ShirtColor` types we | |
923072b8 FG |
163 | defined, or the logic we want to use in this scenario. The closure captures an |
164 | immutable reference to the `self` `Inventory` instance and passes it with the | |
165 | code we specify to the `unwrap_or_else` method. Functions, on the other hand, | |
166 | are not able to capture their environment in this way. | |
04454e1e FG |
167 | |
168 | ### Closure Type Inference and Annotation | |
169 | ||
170 | There are more differences between functions and closures. Closures don’t | |
171 | usually require you to annotate the types of the parameters or the return value | |
064997fb FG |
172 | like `fn` functions do. Type annotations are required on functions because the |
173 | types are part of an explicit interface exposed to your users. Defining this | |
04454e1e | 174 | interface rigidly is important for ensuring that everyone agrees on what types |
923072b8 FG |
175 | of values a function uses and returns. Closures, on the other hand, aren’t used |
176 | in an exposed interface like this: they’re stored in variables and used without | |
177 | naming them and exposing them to users of our library. | |
04454e1e FG |
178 | |
179 | Closures are typically short and relevant only within a narrow context rather | |
180 | than in any arbitrary scenario. Within these limited contexts, the compiler can | |
181 | infer the types of the parameters and the return type, similar to how it’s able | |
182 | to infer the types of most variables (there are rare cases where the compiler | |
183 | needs closure type annotations too). | |
184 | ||
185 | As with variables, we can add type annotations if we want to increase | |
186 | explicitness and clarity at the cost of being more verbose than is strictly | |
187 | necessary. Annotating the types for a closure would look like the definition | |
064997fb | 188 | shown in Listing 13-2. In this example, we’re defining a closure and storing it |
923072b8 | 189 | in a variable rather than defining the closure in the spot we pass it as an |
2b03887a | 190 | argument, as we did in Listing 13-1. |
04454e1e FG |
191 | |
192 | Filename: src/main.rs | |
193 | ||
194 | ``` | |
195 | let expensive_closure = |num: u32| -> u32 { | |
196 | println!("calculating slowly..."); | |
197 | thread::sleep(Duration::from_secs(2)); | |
198 | num | |
199 | }; | |
200 | ``` | |
201 | ||
064997fb | 202 | Listing 13-2: Adding optional type annotations of the parameter and return |
04454e1e FG |
203 | value types in the closure |
204 | ||
205 | With type annotations added, the syntax of closures looks more similar to the | |
2b03887a FG |
206 | syntax of functions. Here, we define a function that adds 1 to its parameter |
207 | and a closure that has the same behavior, for comparison. We’ve added some | |
208 | spaces to line up the relevant parts. This illustrates how closure syntax is | |
209 | similar to function syntax except for the use of pipes and the amount of syntax | |
210 | that is optional: | |
04454e1e FG |
211 | |
212 | ``` | |
213 | fn add_one_v1 (x: u32) -> u32 { x + 1 } | |
214 | let add_one_v2 = |x: u32| -> u32 { x + 1 }; | |
215 | let add_one_v3 = |x| { x + 1 }; | |
216 | let add_one_v4 = |x| x + 1 ; | |
217 | ``` | |
218 | ||
2b03887a | 219 | The first line shows a function definition and the second line shows a fully |
923072b8 | 220 | annotated closure definition. In the third line, we remove the type annotations |
2b03887a FG |
221 | from the closure definition. In the fourth line, we remove the curly brackets, |
222 | which are optional because the closure body has only one expression. These are | |
223 | all valid definitions that will produce the same behavior when they’re called. | |
224 | The `add_one_v3` and `add_one_v4` lines require the closures to be evaluated to | |
225 | be able to compile because the types will be inferred from their usage. This is | |
923072b8 FG |
226 | similar to `let v = Vec::new();` needing either type annotations or values of |
227 | some type to be inserted into the `Vec` for Rust to be able to infer the type. | |
923072b8 FG |
228 | |
229 | For closure definitions, the compiler will infer one concrete type for each of | |
064997fb | 230 | their parameters and for their return value. For instance, Listing 13-3 shows |
923072b8 | 231 | the definition of a short closure that just returns the value it receives as a |
04454e1e | 232 | parameter. This closure isn’t very useful except for the purposes of this |
923072b8 FG |
233 | example. Note that we haven’t added any type annotations to the definition. |
234 | Because there are no type annotations, we can call the closure with any type, | |
235 | which we’ve done here with `String` the first time. If we then try to call | |
236 | `example_closure` with an integer, we’ll get an error. | |
04454e1e FG |
237 | |
238 | Filename: src/main.rs | |
239 | ||
240 | ``` | |
241 | let example_closure = |x| x; | |
242 | ||
243 | let s = example_closure(String::from("hello")); | |
244 | let n = example_closure(5); | |
245 | ``` | |
246 | ||
064997fb | 247 | Listing 13-3: Attempting to call a closure whose types are inferred with two |
04454e1e FG |
248 | different types |
249 | ||
250 | The compiler gives us this error: | |
251 | ||
252 | ``` | |
253 | error[E0308]: mismatched types | |
254 | --> src/main.rs:5:29 | |
255 | | | |
256 | 5 | let n = example_closure(5); | |
2b03887a FG |
257 | | ^- help: try using a conversion method: |
258 | `.to_string()` | |
04454e1e FG |
259 | | | |
260 | | expected struct `String`, found integer | |
261 | ``` | |
262 | ||
263 | The first time we call `example_closure` with the `String` value, the compiler | |
264 | infers the type of `x` and the return type of the closure to be `String`. Those | |
265 | types are then locked into the closure in `example_closure`, and we get a type | |
923072b8 | 266 | error when we next try to use a different type with the same closure. |
04454e1e FG |
267 | |
268 | ### Capturing References or Moving Ownership | |
269 | ||
270 | Closures can capture values from their environment in three ways, which | |
271 | directly map to the three ways a function can take a parameter: borrowing | |
272 | immutably, borrowing mutably, and taking ownership. The closure will decide | |
273 | which of these to use based on what the body of the function does with the | |
274 | captured values. | |
275 | ||
064997fb FG |
276 | In Listing 13-4, we define a closure that captures an immutable reference to |
277 | the vector named `list` because it only needs an immutable reference to print | |
2b03887a | 278 | the value. |
04454e1e FG |
279 | |
280 | Filename: src/main.rs | |
281 | ||
282 | ``` | |
283 | fn main() { | |
284 | let list = vec![1, 2, 3]; | |
285 | println!("Before defining closure: {:?}", list); | |
286 | ||
2b03887a | 287 | 1 let only_borrows = || println!("From closure: {:?}", list); |
04454e1e FG |
288 | |
289 | println!("Before calling closure: {:?}", list); | |
2b03887a | 290 | 2 only_borrows(); |
04454e1e FG |
291 | println!("After calling closure: {:?}", list); |
292 | } | |
293 | ``` | |
294 | ||
064997fb | 295 | Listing 13-4: Defining and calling a closure that captures an immutable |
923072b8 FG |
296 | reference |
297 | ||
298 | This example also illustrates that a variable can bind to a closure definition | |
299 | [1], and we can later call the closure by using the variable name and | |
300 | parentheses as if the variable name were a function name [2]. | |
04454e1e | 301 | |
923072b8 FG |
302 | Because we can have multiple immutable references to `list` at the same time, |
303 | `list` is still accessible from the code before the closure definition, after | |
04454e1e | 304 | the closure definition but before the closure is called, and after the closure |
923072b8 | 305 | is called. This code compiles, runs, and prints: |
04454e1e FG |
306 | |
307 | ``` | |
308 | Before defining closure: [1, 2, 3] | |
309 | Before calling closure: [1, 2, 3] | |
310 | From closure: [1, 2, 3] | |
311 | After calling closure: [1, 2, 3] | |
312 | ``` | |
313 | ||
064997fb | 314 | Next, in Listing 13-5, we change the closure body so that it adds an element to |
2b03887a | 315 | the `list` vector. The closure now captures a mutable reference. |
04454e1e FG |
316 | |
317 | Filename: src/main.rs | |
318 | ||
319 | ``` | |
320 | fn main() { | |
321 | let mut list = vec![1, 2, 3]; | |
322 | println!("Before defining closure: {:?}", list); | |
323 | ||
324 | let mut borrows_mutably = || list.push(7); | |
325 | ||
326 | borrows_mutably(); | |
327 | println!("After calling closure: {:?}", list); | |
328 | } | |
329 | ``` | |
330 | ||
064997fb | 331 | Listing 13-5: Defining and calling a closure that captures a mutable reference |
04454e1e FG |
332 | |
333 | This code compiles, runs, and prints: | |
334 | ||
335 | ``` | |
336 | Before defining closure: [1, 2, 3] | |
337 | After calling closure: [1, 2, 3, 7] | |
338 | ``` | |
339 | ||
340 | Note that there’s no longer a `println!` between the definition and the call of | |
341 | the `borrows_mutably` closure: when `borrows_mutably` is defined, it captures a | |
923072b8 FG |
342 | mutable reference to `list`. We don’t use the closure again after the closure |
343 | is called, so the mutable borrow ends. Between the closure definition and the | |
344 | closure call, an immutable borrow to print isn’t allowed because no other | |
345 | borrows are allowed when there’s a mutable borrow. Try adding a `println!` | |
346 | there to see what error message you get! | |
04454e1e FG |
347 | |
348 | If you want to force the closure to take ownership of the values it uses in the | |
349 | environment even though the body of the closure doesn’t strictly need | |
064997fb FG |
350 | ownership, you can use the `move` keyword before the parameter list. |
351 | ||
352 | This technique is mostly useful when passing a closure to a new thread to move | |
353 | the data so that it’s owned by the new thread. We’ll discuss threads and why | |
354 | you would want to use them in detail in Chapter 16 when we talk about | |
355 | concurrency, but for now, let’s briefly explore spawning a new thread using a | |
356 | closure that needs the `move` keyword. Listing 13-6 shows Listing 13-4 modified | |
2b03887a | 357 | to print the vector in a new thread rather than in the main thread. |
064997fb FG |
358 | |
359 | Filename: src/main.rs | |
360 | ||
361 | ``` | |
362 | use std::thread; | |
363 | ||
364 | fn main() { | |
365 | let list = vec![1, 2, 3]; | |
366 | println!("Before defining closure: {:?}", list); | |
367 | ||
2b03887a FG |
368 | 1 thread::spawn(move || { |
369 | 2 println!("From thread: {:?}", list) | |
064997fb FG |
370 | }).join().unwrap(); |
371 | } | |
372 | ``` | |
373 | ||
374 | Listing 13-6: Using `move` to force the closure for the thread to take | |
375 | ownership of `list` | |
376 | ||
377 | We spawn a new thread, giving the thread a closure to run as an argument. The | |
378 | closure body prints out the list. In Listing 13-4, the closure only captured | |
2b03887a | 379 | `list` using an immutable reference because that’s the least amount of access |
064997fb | 380 | to `list` needed to print it. In this example, even though the closure body |
2b03887a FG |
381 | still only needs an immutable reference [2], we need to specify that `list` |
382 | should be moved into the closure by putting the `move` keyword [1] at the | |
383 | beginning of the closure definition. The new thread might finish before the | |
384 | rest of the main thread finishes, or the main thread might finish first. If the | |
385 | main thread maintains ownership of `list` but ends before the new thread and | |
386 | drops `list`, the immutable reference in the thread would be invalid. | |
387 | Therefore, the compiler requires that `list` be moved into the closure given to | |
388 | the new thread so the reference will be valid. Try removing the `move` keyword | |
389 | or using `list` in the main thread after the closure is defined to see what | |
390 | compiler errors you get! | |
391 | ||
392 | ### Moving Captured Values Out of Closures and the Fn Traits | |
923072b8 | 393 | |
064997fb FG |
394 | Once a closure has captured a reference or captured ownership of a value from |
395 | the environment where the closure is defined (thus affecting what, if anything, | |
396 | is moved *into* the closure), the code in the body of the closure defines what | |
397 | happens to the references or values when the closure is evaluated later (thus | |
2b03887a FG |
398 | affecting what, if anything, is moved *out of* the closure). |
399 | ||
400 | A closure body can do any of the following: move a captured value out of the | |
401 | closure, mutate the captured value, neither move nor mutate the value, or | |
402 | capture nothing from the environment to begin with. | |
923072b8 FG |
403 | |
404 | The way a closure captures and handles values from the environment affects | |
064997fb FG |
405 | which traits the closure implements, and traits are how functions and structs |
406 | can specify what kinds of closures they can use. Closures will automatically | |
407 | implement one, two, or all three of these `Fn` traits, in an additive fashion, | |
408 | depending on how the closure’s body handles the values: | |
409 | ||
2b03887a FG |
410 | * `FnOnce` applies to closures that can be called once. All closures implement |
411 | at least this trait because all closures can be called. A closure that moves | |
412 | captured values out of its body will only implement `FnOnce` and none of the | |
413 | other `Fn` traits because it can only be called once. | |
414 | * `FnMut` applies to closures that don’t move captured values out of their | |
415 | body, but that might mutate the captured values. These closures can be called | |
416 | more than once. | |
417 | * `Fn` applies to closures that don’t move captured values out of their body | |
418 | and that don’t mutate captured values, as well as closures that capture nothing | |
419 | from their environment. These closures can be called more than once without | |
420 | mutating their environment, which is important in cases such as calling a | |
421 | closure multiple times concurrently. | |
923072b8 | 422 | |
04454e1e | 423 | Let’s look at the definition of the `unwrap_or_else` method on `Option<T>` that |
064997fb | 424 | we used in Listing 13-1: |
04454e1e FG |
425 | |
426 | ``` | |
427 | impl<T> Option<T> { | |
428 | pub fn unwrap_or_else<F>(self, f: F) -> T | |
429 | where | |
430 | F: FnOnce() -> T | |
431 | { | |
432 | match self { | |
433 | Some(x) => x, | |
434 | None => f(), | |
435 | } | |
436 | } | |
437 | } | |
438 | ``` | |
439 | ||
440 | Recall that `T` is the generic type representing the type of the value in the | |
441 | `Some` variant of an `Option`. That type `T` is also the return type of the | |
442 | `unwrap_or_else` function: code that calls `unwrap_or_else` on an | |
443 | `Option<String>`, for example, will get a `String`. | |
444 | ||
923072b8 FG |
445 | Next, notice that the `unwrap_or_else` function has the additional generic type |
446 | parameter `F`. The `F` type is the type of the parameter named `f`, which is | |
04454e1e FG |
447 | the closure we provide when calling `unwrap_or_else`. |
448 | ||
449 | The trait bound specified on the generic type `F` is `FnOnce() -> T`, which | |
064997fb FG |
450 | means `F` must be able to be called once, take no arguments, and return a `T`. |
451 | Using `FnOnce` in the trait bound expresses the constraint that | |
2b03887a | 452 | `unwrap_or_else` is only going to call `f` one time, at most. In the body of |
04454e1e FG |
453 | `unwrap_or_else`, we can see that if the `Option` is `Some`, `f` won’t be |
454 | called. If the `Option` is `None`, `f` will be called once. Because all | |
2b03887a FG |
455 | closures implement `FnOnce`, `unwrap_or_else` accepts the largest variety of |
456 | closures and is as flexible as it can be. | |
04454e1e FG |
457 | |
458 | > Note: Functions can implement all three of the `Fn` traits too. If what we | |
2b03887a FG |
459 | want to do doesn’t require capturing a value from the environment, we can use |
460 | the name of a function rather than a closure where we need something that | |
461 | implements one of the `Fn` traits. For example, on an `Option<Vec<T>>` value, | |
462 | we could call `unwrap_or_else(Vec::new)` to get a new, empty vector if the | |
463 | value is `None`. | |
04454e1e | 464 | |
2b03887a | 465 | Now let’s look at the standard library method `sort_by_key`, defined on slices, |
923072b8 | 466 | to see how that differs from `unwrap_or_else` and why `sort_by_key` uses |
064997fb FG |
467 | `FnMut` instead of `FnOnce` for the trait bound. The closure gets one argument |
468 | in the form of a reference to the current item in the slice being considered, | |
469 | and returns a value of type `K` that can be ordered. This function is useful | |
470 | when you want to sort a slice by a particular attribute of each item. In | |
471 | Listing 13-7, we have a list of `Rectangle` instances and we use `sort_by_key` | |
2b03887a | 472 | to order them by their `width` attribute from low to high. |
04454e1e FG |
473 | |
474 | Filename: src/main.rs | |
475 | ||
476 | ``` | |
477 | #[derive(Debug)] | |
478 | struct Rectangle { | |
479 | width: u32, | |
480 | height: u32, | |
481 | } | |
482 | ||
483 | fn main() { | |
484 | let mut list = [ | |
923072b8 FG |
485 | Rectangle { width: 10, height: 1 }, |
486 | Rectangle { width: 3, height: 5 }, | |
487 | Rectangle { width: 7, height: 12 }, | |
04454e1e FG |
488 | ]; |
489 | ||
490 | list.sort_by_key(|r| r.width); | |
491 | println!("{:#?}", list); | |
492 | } | |
493 | ``` | |
064997fb FG |
494 | |
495 | Listing 13-7: Using `sort_by_key` to order rectangles by width | |
04454e1e FG |
496 | |
497 | This code prints: | |
498 | ||
499 | ``` | |
500 | [ | |
501 | Rectangle { | |
502 | width: 3, | |
503 | height: 5, | |
504 | }, | |
505 | Rectangle { | |
506 | width: 7, | |
507 | height: 12, | |
508 | }, | |
509 | Rectangle { | |
510 | width: 10, | |
511 | height: 1, | |
512 | }, | |
513 | ] | |
514 | ``` | |
515 | ||
516 | The reason `sort_by_key` is defined to take an `FnMut` closure is that it calls | |
517 | the closure multiple times: once for each item in the slice. The closure `|r| | |
2b03887a | 518 | r.width` doesn’t capture, mutate, or move anything out from its environment, so |
04454e1e FG |
519 | it meets the trait bound requirements. |
520 | ||
064997fb | 521 | In contrast, Listing 13-8 shows an example of a closure that implements just |
923072b8 | 522 | the `FnOnce` trait, because it moves a value out of the environment. The |
2b03887a | 523 | compiler won’t let us use this closure with `sort_by_key`. |
04454e1e FG |
524 | |
525 | Filename: src/main.rs | |
526 | ||
527 | ``` | |
2b03887a | 528 | --snip-- |
04454e1e FG |
529 | |
530 | fn main() { | |
531 | let mut list = [ | |
923072b8 FG |
532 | Rectangle { width: 10, height: 1 }, |
533 | Rectangle { width: 3, height: 5 }, | |
534 | Rectangle { width: 7, height: 12 }, | |
04454e1e FG |
535 | ]; |
536 | ||
537 | let mut sort_operations = vec![]; | |
538 | let value = String::from("by key called"); | |
539 | ||
540 | list.sort_by_key(|r| { | |
541 | sort_operations.push(value); | |
542 | r.width | |
543 | }); | |
544 | println!("{:#?}", list); | |
545 | } | |
546 | ``` | |
547 | ||
064997fb | 548 | Listing 13-8: Attempting to use an `FnOnce` closure with `sort_by_key` |
04454e1e FG |
549 | |
550 | This is a contrived, convoluted way (that doesn’t work) to try and count the | |
551 | number of times `sort_by_key` gets called when sorting `list`. This code | |
923072b8 | 552 | attempts to do this counting by pushing `value`—a `String` from the closure’s |
2b03887a | 553 | environment—into the `sort_operations` vector. The closure captures `value` and |
04454e1e FG |
554 | then moves `value` out of the closure by transferring ownership of `value` to |
555 | the `sort_operations` vector. This closure can be called once; trying to call | |
556 | it a second time wouldn’t work because `value` would no longer be in the | |
557 | environment to be pushed into `sort_operations` again! Therefore, this closure | |
558 | only implements `FnOnce`. When we try to compile this code, we get this error | |
559 | that `value` can’t be moved out of the closure because the closure must | |
560 | implement `FnMut`: | |
561 | ||
562 | ``` | |
2b03887a FG |
563 | error[E0507]: cannot move out of `value`, a captured variable in an `FnMut` |
564 | closure | |
923072b8 | 565 | --> src/main.rs:18:30 |
04454e1e | 566 | | |
923072b8 | 567 | 15 | let value = String::from("by key called"); |
04454e1e | 568 | | ----- captured outer variable |
923072b8 FG |
569 | 16 | |
570 | 17 | list.sort_by_key(|r| { | |
04454e1e | 571 | | ______________________- |
923072b8 | 572 | 18 | | sort_operations.push(value); |
2b03887a FG |
573 | | | ^^^^^ move occurs because `value` has |
574 | type `String`, which does not implement the `Copy` trait | |
923072b8 FG |
575 | 19 | | r.width |
576 | 20 | | }); | |
04454e1e FG |
577 | | |_____- captured by this `FnMut` closure |
578 | ``` | |
579 | ||
580 | The error points to the line in the closure body that moves `value` out of the | |
581 | environment. To fix this, we need to change the closure body so that it doesn’t | |
2b03887a FG |
582 | move values out of the environment. Keeping a counter in the environment and |
583 | incrementing its value in the closure body is a more straightforward way to | |
584 | count the number of times `sort_by_key` is called. The closure in Listing 13-9 | |
585 | works with `sort_by_key` because it is only capturing a mutable reference to | |
586 | the `num_sort_operations` counter and can therefore be called more than once. | |
04454e1e FG |
587 | |
588 | Filename: src/main.rs | |
589 | ||
590 | ``` | |
2b03887a | 591 | --snip-- |
04454e1e FG |
592 | |
593 | fn main() { | |
2b03887a | 594 | --snip-- |
04454e1e FG |
595 | |
596 | let mut num_sort_operations = 0; | |
597 | list.sort_by_key(|r| { | |
598 | num_sort_operations += 1; | |
599 | r.width | |
600 | }); | |
2b03887a FG |
601 | println!( |
602 | "{:#?}, sorted in {num_sort_operations} operations", | |
603 | list | |
604 | ); | |
04454e1e FG |
605 | } |
606 | ``` | |
607 | ||
2b03887a | 608 | Listing 13-9: Using an `FnMut` closure with `sort_by_key` is allowed. |
04454e1e FG |
609 | |
610 | The `Fn` traits are important when defining or using functions or types that | |
923072b8 FG |
611 | make use of closures. In the next section, we’ll discuss iterators. Many |
612 | iterator methods take closure arguments, so keep these closure details in mind | |
613 | as we continue! | |
04454e1e FG |
614 | |
615 | ## Processing a Series of Items with Iterators | |
616 | ||
617 | The iterator pattern allows you to perform some task on a sequence of items in | |
618 | turn. An iterator is responsible for the logic of iterating over each item and | |
619 | determining when the sequence has finished. When you use iterators, you don’t | |
620 | have to reimplement that logic yourself. | |
621 | ||
622 | In Rust, iterators are *lazy*, meaning they have no effect until you call | |
623 | methods that consume the iterator to use it up. For example, the code in | |
064997fb | 624 | Listing 13-10 creates an iterator over the items in the vector `v1` by calling |
04454e1e FG |
625 | the `iter` method defined on `Vec<T>`. This code by itself doesn’t do anything |
626 | useful. | |
627 | ||
628 | ``` | |
629 | let v1 = vec![1, 2, 3]; | |
630 | ||
631 | let v1_iter = v1.iter(); | |
632 | ``` | |
633 | ||
064997fb | 634 | Listing 13-10: Creating an iterator |
04454e1e | 635 | |
923072b8 | 636 | The iterator is stored in the `v1_iter` variable. Once we’ve created an |
2b03887a FG |
637 | iterator, we can use it in a variety of ways. In Listing 3-5, we iterated over |
638 | an array using a `for` loop to execute some code on each of its items. Under | |
639 | the hood, this implicitly created and then consumed an iterator, but we glossed | |
640 | over how exactly that works until now. | |
04454e1e | 641 | |
064997fb | 642 | In the example in Listing 13-11, we separate the creation of the iterator from |
923072b8 FG |
643 | the use of the iterator in the `for` loop. When the `for` loop is called using |
644 | the iterator in `v1_iter`, each element in the iterator is used in one | |
645 | iteration of the loop, which prints out each value. | |
04454e1e FG |
646 | |
647 | ``` | |
648 | let v1 = vec![1, 2, 3]; | |
649 | ||
650 | let v1_iter = v1.iter(); | |
651 | ||
652 | for val in v1_iter { | |
2b03887a | 653 | println!("Got: {val}"); |
04454e1e FG |
654 | } |
655 | ``` | |
656 | ||
064997fb | 657 | Listing 13-11: Using an iterator in a `for` loop |
04454e1e FG |
658 | |
659 | In languages that don’t have iterators provided by their standard libraries, | |
660 | you would likely write this same functionality by starting a variable at index | |
661 | 0, using that variable to index into the vector to get a value, and | |
662 | incrementing the variable value in a loop until it reached the total number of | |
663 | items in the vector. | |
664 | ||
2b03887a | 665 | Iterators handle all of that logic for you, cutting down on repetitive code you |
04454e1e FG |
666 | could potentially mess up. Iterators give you more flexibility to use the same |
667 | logic with many different kinds of sequences, not just data structures you can | |
668 | index into, like vectors. Let’s examine how iterators do that. | |
669 | ||
2b03887a | 670 | ### The Iterator Trait and the next Method |
04454e1e FG |
671 | |
672 | All iterators implement a trait named `Iterator` that is defined in the | |
673 | standard library. The definition of the trait looks like this: | |
674 | ||
675 | ``` | |
676 | pub trait Iterator { | |
677 | type Item; | |
678 | ||
679 | fn next(&mut self) -> Option<Self::Item>; | |
680 | ||
681 | // methods with default implementations elided | |
682 | } | |
683 | ``` | |
684 | ||
2b03887a | 685 | Notice that this definition uses some new syntax: `type Item` and `Self::Item`, |
04454e1e FG |
686 | which are defining an *associated type* with this trait. We’ll talk about |
687 | associated types in depth in Chapter 19. For now, all you need to know is that | |
688 | this code says implementing the `Iterator` trait requires that you also define | |
689 | an `Item` type, and this `Item` type is used in the return type of the `next` | |
690 | method. In other words, the `Item` type will be the type returned from the | |
691 | iterator. | |
692 | ||
693 | The `Iterator` trait only requires implementors to define one method: the | |
2b03887a FG |
694 | `next` method, which returns one item of the iterator at a time, wrapped in |
695 | `Some`, and, when iteration is over, returns `None`. | |
04454e1e | 696 | |
064997fb | 697 | We can call the `next` method on iterators directly; Listing 13-12 demonstrates |
04454e1e FG |
698 | what values are returned from repeated calls to `next` on the iterator created |
699 | from the vector. | |
700 | ||
701 | Filename: src/lib.rs | |
702 | ||
703 | ``` | |
704 | #[test] | |
705 | fn iterator_demonstration() { | |
706 | let v1 = vec![1, 2, 3]; | |
707 | ||
708 | let mut v1_iter = v1.iter(); | |
709 | ||
710 | assert_eq!(v1_iter.next(), Some(&1)); | |
711 | assert_eq!(v1_iter.next(), Some(&2)); | |
712 | assert_eq!(v1_iter.next(), Some(&3)); | |
713 | assert_eq!(v1_iter.next(), None); | |
714 | } | |
715 | ``` | |
716 | ||
064997fb | 717 | Listing 13-12: Calling the `next` method on an iterator |
04454e1e FG |
718 | |
719 | Note that we needed to make `v1_iter` mutable: calling the `next` method on an | |
720 | iterator changes internal state that the iterator uses to keep track of where | |
721 | it is in the sequence. In other words, this code *consumes*, or uses up, the | |
722 | iterator. Each call to `next` eats up an item from the iterator. We didn’t need | |
723 | to make `v1_iter` mutable when we used a `for` loop because the loop took | |
724 | ownership of `v1_iter` and made it mutable behind the scenes. | |
725 | ||
726 | Also note that the values we get from the calls to `next` are immutable | |
727 | references to the values in the vector. The `iter` method produces an iterator | |
728 | over immutable references. If we want to create an iterator that takes | |
729 | ownership of `v1` and returns owned values, we can call `into_iter` instead of | |
730 | `iter`. Similarly, if we want to iterate over mutable references, we can call | |
731 | `iter_mut` instead of `iter`. | |
732 | ||
2b03887a | 733 | ### Methods That Consume the Iterator |
04454e1e FG |
734 | |
735 | The `Iterator` trait has a number of different methods with default | |
736 | implementations provided by the standard library; you can find out about these | |
737 | methods by looking in the standard library API documentation for the `Iterator` | |
738 | trait. Some of these methods call the `next` method in their definition, which | |
739 | is why you’re required to implement the `next` method when implementing the | |
740 | `Iterator` trait. | |
741 | ||
2b03887a | 742 | Methods that call `next` are called *consuming adapters* because calling them |
04454e1e FG |
743 | uses up the iterator. One example is the `sum` method, which takes ownership of |
744 | the iterator and iterates through the items by repeatedly calling `next`, thus | |
745 | consuming the iterator. As it iterates through, it adds each item to a running | |
064997fb | 746 | total and returns the total when iteration is complete. Listing 13-13 has a |
2b03887a | 747 | test illustrating a use of the `sum` method. |
04454e1e FG |
748 | |
749 | Filename: src/lib.rs | |
750 | ||
751 | ``` | |
752 | #[test] | |
753 | fn iterator_sum() { | |
754 | let v1 = vec![1, 2, 3]; | |
755 | ||
756 | let v1_iter = v1.iter(); | |
757 | ||
758 | let total: i32 = v1_iter.sum(); | |
759 | ||
760 | assert_eq!(total, 6); | |
761 | } | |
762 | ``` | |
763 | ||
064997fb | 764 | Listing 13-13: Calling the `sum` method to get the total of all items in the |
04454e1e FG |
765 | iterator |
766 | ||
767 | We aren’t allowed to use `v1_iter` after the call to `sum` because `sum` takes | |
768 | ownership of the iterator we call it on. | |
769 | ||
2b03887a | 770 | ### Methods That Produce Other Iterators |
04454e1e | 771 | |
2b03887a | 772 | *Iterator adapters* are methods defined on the `Iterator` trait that don’t |
923072b8 FG |
773 | consume the iterator. Instead, they produce different iterators by changing |
774 | some aspect of the original iterator. | |
775 | ||
2b03887a | 776 | Listing 13-14 shows an example of calling the iterator adapter method `map`, |
923072b8 FG |
777 | which takes a closure to call on each item as the items are iterated through. |
778 | The `map` method returns a new iterator that produces the modified items. The | |
779 | closure here creates a new iterator in which each item from the vector will be | |
2b03887a | 780 | incremented by 1. |
04454e1e FG |
781 | |
782 | Filename: src/main.rs | |
783 | ||
784 | ``` | |
785 | let v1: Vec<i32> = vec![1, 2, 3]; | |
786 | ||
787 | v1.iter().map(|x| x + 1); | |
788 | ``` | |
789 | ||
2b03887a | 790 | Listing 13-14: Calling the iterator adapter `map` to create a new iterator |
04454e1e | 791 | |
923072b8 | 792 | However, this code produces a warning: |
04454e1e FG |
793 | |
794 | ``` | |
795 | warning: unused `Map` that must be used | |
796 | --> src/main.rs:4:5 | |
797 | | | |
798 | 4 | v1.iter().map(|x| x + 1); | |
799 | | ^^^^^^^^^^^^^^^^^^^^^^^^^ | |
800 | | | |
801 | = note: `#[warn(unused_must_use)]` on by default | |
802 | = note: iterators are lazy and do nothing unless consumed | |
803 | ``` | |
804 | ||
064997fb | 805 | The code in Listing 13-14 doesn’t do anything; the closure we’ve specified |
2b03887a | 806 | never gets called. The warning reminds us why: iterator adapters are lazy, and |
04454e1e FG |
807 | we need to consume the iterator here. |
808 | ||
923072b8 | 809 | To fix this warning and consume the iterator, we’ll use the `collect` method, |
2b03887a FG |
810 | which we used with `env::args` in Listing 12-1. This method consumes the |
811 | iterator and collects the resultant values into a collection data type. | |
04454e1e | 812 | |
2b03887a FG |
813 | In Listing 13-15, we collect into a vector the results of iterating over the |
814 | iterator that’s returned from the call to `map`. This vector will end up | |
815 | containing each item from the original vector, incremented by 1. | |
04454e1e FG |
816 | |
817 | Filename: src/main.rs | |
818 | ||
819 | ``` | |
820 | let v1: Vec<i32> = vec![1, 2, 3]; | |
821 | ||
822 | let v2: Vec<_> = v1.iter().map(|x| x + 1).collect(); | |
823 | ||
824 | assert_eq!(v2, vec![2, 3, 4]); | |
825 | ``` | |
826 | ||
2b03887a | 827 | Listing 13-15: Calling the `map` method to create a new iterator, and then |
04454e1e FG |
828 | calling the `collect` method to consume the new iterator and create a vector |
829 | ||
830 | Because `map` takes a closure, we can specify any operation we want to perform | |
831 | on each item. This is a great example of how closures let you customize some | |
832 | behavior while reusing the iteration behavior that the `Iterator` trait | |
833 | provides. | |
834 | ||
2b03887a | 835 | You can chain multiple calls to iterator adapters to perform complex actions in |
923072b8 | 836 | a readable way. But because all iterators are lazy, you have to call one of the |
2b03887a | 837 | consuming adapter methods to get results from calls to iterator adapters. |
04454e1e | 838 | |
2b03887a | 839 | ### Using Closures That Capture Their Environment |
04454e1e | 840 | |
923072b8 FG |
841 | Many iterator adapters take closures as arguments, and commonly the closures |
842 | we’ll specify as arguments to iterator adapters will be closures that capture | |
843 | their environment. | |
064997fb | 844 | |
923072b8 | 845 | For this example, we’ll use the `filter` method that takes a closure. The |
064997fb | 846 | closure gets an item from the iterator and returns a `bool`. If the closure |
923072b8 FG |
847 | returns `true`, the value will be included in the iteration produced by |
848 | `filter`. If the closure returns `false`, the value won’t be included. | |
849 | ||
064997fb | 850 | In Listing 13-16, we use `filter` with a closure that captures the `shoe_size` |
04454e1e FG |
851 | variable from its environment to iterate over a collection of `Shoe` struct |
852 | instances. It will return only shoes that are the specified size. | |
853 | ||
854 | Filename: src/lib.rs | |
855 | ||
856 | ``` | |
857 | #[derive(PartialEq, Debug)] | |
858 | struct Shoe { | |
859 | size: u32, | |
860 | style: String, | |
861 | } | |
862 | ||
863 | fn shoes_in_size(shoes: Vec<Shoe>, shoe_size: u32) -> Vec<Shoe> { | |
864 | shoes.into_iter().filter(|s| s.size == shoe_size).collect() | |
865 | } | |
866 | ||
867 | #[cfg(test)] | |
868 | mod tests { | |
869 | use super::*; | |
870 | ||
871 | #[test] | |
872 | fn filters_by_size() { | |
873 | let shoes = vec![ | |
874 | Shoe { | |
875 | size: 10, | |
876 | style: String::from("sneaker"), | |
877 | }, | |
878 | Shoe { | |
879 | size: 13, | |
880 | style: String::from("sandal"), | |
881 | }, | |
882 | Shoe { | |
883 | size: 10, | |
884 | style: String::from("boot"), | |
885 | }, | |
886 | ]; | |
887 | ||
888 | let in_my_size = shoes_in_size(shoes, 10); | |
889 | ||
890 | assert_eq!( | |
891 | in_my_size, | |
892 | vec![ | |
893 | Shoe { | |
894 | size: 10, | |
895 | style: String::from("sneaker") | |
896 | }, | |
897 | Shoe { | |
898 | size: 10, | |
899 | style: String::from("boot") | |
900 | }, | |
901 | ] | |
902 | ); | |
903 | } | |
904 | } | |
905 | ``` | |
906 | ||
064997fb | 907 | Listing 13-16: Using the `filter` method with a closure that captures |
04454e1e FG |
908 | `shoe_size` |
909 | ||
910 | The `shoes_in_size` function takes ownership of a vector of shoes and a shoe | |
911 | size as parameters. It returns a vector containing only shoes of the specified | |
912 | size. | |
913 | ||
2b03887a FG |
914 | In the body of `shoes_in_size`, we call `into_iter` to create an iterator that |
915 | takes ownership of the vector. Then we call `filter` to adapt that iterator | |
916 | into a new iterator that only contains elements for which the closure returns | |
917 | `true`. | |
04454e1e FG |
918 | |
919 | The closure captures the `shoe_size` parameter from the environment and | |
920 | compares the value with each shoe’s size, keeping only shoes of the size | |
921 | specified. Finally, calling `collect` gathers the values returned by the | |
922 | adapted iterator into a vector that’s returned by the function. | |
923 | ||
2b03887a FG |
924 | The test shows that when we call `shoes_in_size`, we get back only shoes that |
925 | have the same size as the value we specified. | |
04454e1e FG |
926 | |
927 | ## Improving Our I/O Project | |
928 | ||
929 | With this new knowledge about iterators, we can improve the I/O project in | |
930 | Chapter 12 by using iterators to make places in the code clearer and more | |
931 | concise. Let’s look at how iterators can improve our implementation of the | |
923072b8 | 932 | `Config::build` function and the `search` function. |
04454e1e | 933 | |
2b03887a | 934 | ### Removing a clone Using an Iterator |
04454e1e FG |
935 | |
936 | In Listing 12-6, we added code that took a slice of `String` values and created | |
937 | an instance of the `Config` struct by indexing into the slice and cloning the | |
064997fb | 938 | values, allowing the `Config` struct to own those values. In Listing 13-17, |
923072b8 | 939 | we’ve reproduced the implementation of the `Config::build` function as it was |
2b03887a | 940 | in Listing 12-23. |
04454e1e FG |
941 | |
942 | Filename: src/lib.rs | |
943 | ||
944 | ``` | |
945 | impl Config { | |
2b03887a FG |
946 | pub fn build( |
947 | args: &[String] | |
948 | ) -> Result<Config, &'static str> { | |
04454e1e FG |
949 | if args.len() < 3 { |
950 | return Err("not enough arguments"); | |
951 | } | |
952 | ||
953 | let query = args[1].clone(); | |
923072b8 | 954 | let file_path = args[2].clone(); |
04454e1e | 955 | |
923072b8 | 956 | let ignore_case = env::var("IGNORE_CASE").is_ok(); |
04454e1e FG |
957 | |
958 | Ok(Config { | |
959 | query, | |
923072b8 FG |
960 | file_path, |
961 | ignore_case, | |
04454e1e FG |
962 | }) |
963 | } | |
964 | } | |
965 | ``` | |
966 | ||
064997fb | 967 | Listing 13-17: Reproduction of the `Config::build` function from Listing 12-23 |
04454e1e FG |
968 | |
969 | At the time, we said not to worry about the inefficient `clone` calls because | |
970 | we would remove them in the future. Well, that time is now! | |
971 | ||
972 | We needed `clone` here because we have a slice with `String` elements in the | |
923072b8 | 973 | parameter `args`, but the `build` function doesn’t own `args`. To return |
04454e1e FG |
974 | ownership of a `Config` instance, we had to clone the values from the `query` |
975 | and `filename` fields of `Config` so the `Config` instance can own its values. | |
976 | ||
923072b8 | 977 | With our new knowledge about iterators, we can change the `build` function to |
04454e1e FG |
978 | take ownership of an iterator as its argument instead of borrowing a slice. |
979 | We’ll use the iterator functionality instead of the code that checks the length | |
980 | of the slice and indexes into specific locations. This will clarify what the | |
923072b8 | 981 | `Config::build` function is doing because the iterator will access the values. |
04454e1e | 982 | |
923072b8 | 983 | Once `Config::build` takes ownership of the iterator and stops using indexing |
04454e1e FG |
984 | operations that borrow, we can move the `String` values from the iterator into |
985 | `Config` rather than calling `clone` and making a new allocation. | |
986 | ||
987 | #### Using the Returned Iterator Directly | |
988 | ||
989 | Open your I/O project’s *src/main.rs* file, which should look like this: | |
990 | ||
991 | Filename: src/main.rs | |
992 | ||
993 | ``` | |
994 | fn main() { | |
995 | let args: Vec<String> = env::args().collect(); | |
996 | ||
923072b8 FG |
997 | let config = Config::build(&args).unwrap_or_else(|err| { |
998 | eprintln!("Problem parsing arguments: {err}"); | |
04454e1e FG |
999 | process::exit(1); |
1000 | }); | |
1001 | ||
2b03887a | 1002 | --snip-- |
04454e1e FG |
1003 | } |
1004 | ``` | |
1005 | ||
923072b8 | 1006 | We’ll first change the start of the `main` function that we had in Listing |
064997fb | 1007 | 12-24 to the code in Listing 13-18, which this time uses an iterator. This |
923072b8 | 1008 | won’t compile until we update `Config::build` as well. |
04454e1e FG |
1009 | |
1010 | Filename: src/main.rs | |
1011 | ||
1012 | ``` | |
1013 | fn main() { | |
2b03887a FG |
1014 | let config = |
1015 | Config::build(env::args()).unwrap_or_else(|err| { | |
1016 | eprintln!("Problem parsing arguments: {err}"); | |
1017 | process::exit(1); | |
1018 | }); | |
04454e1e | 1019 | |
2b03887a | 1020 | --snip-- |
04454e1e FG |
1021 | } |
1022 | ``` | |
1023 | ||
064997fb | 1024 | Listing 13-18: Passing the return value of `env::args` to `Config::build` |
04454e1e FG |
1025 | |
1026 | The `env::args` function returns an iterator! Rather than collecting the | |
923072b8 | 1027 | iterator values into a vector and then passing a slice to `Config::build`, now |
04454e1e | 1028 | we’re passing ownership of the iterator returned from `env::args` to |
923072b8 | 1029 | `Config::build` directly. |
04454e1e | 1030 | |
923072b8 FG |
1031 | Next, we need to update the definition of `Config::build`. In your I/O |
1032 | project’s *src/lib.rs* file, let’s change the signature of `Config::build` to | |
2b03887a FG |
1033 | look like Listing 13-19. This still won’t compile, because we need to update |
1034 | the function body. | |
04454e1e FG |
1035 | |
1036 | Filename: src/lib.rs | |
1037 | ||
1038 | ``` | |
1039 | impl Config { | |
923072b8 | 1040 | pub fn build( |
04454e1e FG |
1041 | mut args: impl Iterator<Item = String>, |
1042 | ) -> Result<Config, &'static str> { | |
2b03887a | 1043 | --snip-- |
04454e1e FG |
1044 | ``` |
1045 | ||
064997fb | 1046 | Listing 13-19: Updating the signature of `Config::build` to expect an iterator |
04454e1e FG |
1047 | |
1048 | The standard library documentation for the `env::args` function shows that the | |
1049 | type of the iterator it returns is `std::env::Args`, and that type implements | |
1050 | the `Iterator` trait and returns `String` values. | |
1051 | ||
923072b8 | 1052 | We’ve updated the signature of the `Config::build` function so the parameter |
04454e1e FG |
1053 | `args` has a generic type with the trait bounds `impl Iterator<Item = String>` |
1054 | instead of `&[String]`. This usage of the `impl Trait` syntax we discussed in | |
2b03887a FG |
1055 | “Traits as Parameters” on page XX means that `args` can be any type that |
1056 | implements the `Iterator` type and returns `String` items. | |
04454e1e FG |
1057 | |
1058 | Because we’re taking ownership of `args` and we’ll be mutating `args` by | |
1059 | iterating over it, we can add the `mut` keyword into the specification of the | |
1060 | `args` parameter to make it mutable. | |
1061 | ||
2b03887a | 1062 | #### Using Iterator Trait Methods Instead of Indexing |
04454e1e | 1063 | |
923072b8 | 1064 | Next, we’ll fix the body of `Config::build`. Because `args` implements the |
064997fb | 1065 | `Iterator` trait, we know we can call the `next` method on it! Listing 13-20 |
2b03887a | 1066 | updates the code from Listing 12-23 to use the `next` method. |
04454e1e FG |
1067 | |
1068 | Filename: src/lib.rs | |
1069 | ||
1070 | ``` | |
1071 | impl Config { | |
923072b8 | 1072 | pub fn build( |
04454e1e FG |
1073 | mut args: impl Iterator<Item = String>, |
1074 | ) -> Result<Config, &'static str> { | |
1075 | args.next(); | |
1076 | ||
1077 | let query = match args.next() { | |
1078 | Some(arg) => arg, | |
1079 | None => return Err("Didn't get a query string"), | |
1080 | }; | |
1081 | ||
923072b8 | 1082 | let file_path = match args.next() { |
04454e1e | 1083 | Some(arg) => arg, |
923072b8 | 1084 | None => return Err("Didn't get a file path"), |
04454e1e FG |
1085 | }; |
1086 | ||
923072b8 | 1087 | let ignore_case = env::var("IGNORE_CASE").is_ok(); |
04454e1e FG |
1088 | |
1089 | Ok(Config { | |
1090 | query, | |
923072b8 FG |
1091 | file_path, |
1092 | ignore_case, | |
04454e1e FG |
1093 | }) |
1094 | } | |
1095 | } | |
1096 | ``` | |
1097 | ||
064997fb | 1098 | Listing 13-20: Changing the body of `Config::build` to use iterator methods |
04454e1e FG |
1099 | |
1100 | Remember that the first value in the return value of `env::args` is the name of | |
1101 | the program. We want to ignore that and get to the next value, so first we call | |
2b03887a FG |
1102 | `next` and do nothing with the return value. Then we call `next` to get the |
1103 | value we want to put in the `query` field of `Config`. If `next` returns | |
04454e1e FG |
1104 | `Some`, we use a `match` to extract the value. If it returns `None`, it means |
1105 | not enough arguments were given and we return early with an `Err` value. We do | |
1106 | the same thing for the `filename` value. | |
1107 | ||
2b03887a | 1108 | ### Making Code Clearer with Iterator Adapters |
04454e1e FG |
1109 | |
1110 | We can also take advantage of iterators in the `search` function in our I/O | |
2b03887a | 1111 | project, which is reproduced here in Listing 13-21 as it was in Listing 12-19. |
04454e1e FG |
1112 | |
1113 | Filename: src/lib.rs | |
1114 | ||
1115 | ``` | |
2b03887a FG |
1116 | pub fn search<'a>( |
1117 | query: &str, | |
1118 | contents: &'a str, | |
1119 | ) -> Vec<&'a str> { | |
04454e1e FG |
1120 | let mut results = Vec::new(); |
1121 | ||
1122 | for line in contents.lines() { | |
1123 | if line.contains(query) { | |
1124 | results.push(line); | |
1125 | } | |
1126 | } | |
1127 | ||
1128 | results | |
1129 | } | |
1130 | ``` | |
1131 | ||
064997fb | 1132 | Listing 13-21: The implementation of the `search` function from Listing 12-19 |
04454e1e | 1133 | |
2b03887a | 1134 | We can write this code in a more concise way using iterator adapter methods. |
04454e1e FG |
1135 | Doing so also lets us avoid having a mutable intermediate `results` vector. The |
1136 | functional programming style prefers to minimize the amount of mutable state to | |
1137 | make code clearer. Removing the mutable state might enable a future enhancement | |
2b03887a FG |
1138 | to make searching happen in parallel because we wouldn’t have to manage |
1139 | concurrent access to the `results` vector. Listing 13-22 shows this change. | |
04454e1e FG |
1140 | |
1141 | Filename: src/lib.rs | |
1142 | ||
1143 | ``` | |
2b03887a FG |
1144 | pub fn search<'a>( |
1145 | query: &str, | |
1146 | contents: &'a str, | |
1147 | ) -> Vec<&'a str> { | |
04454e1e FG |
1148 | contents |
1149 | .lines() | |
1150 | .filter(|line| line.contains(query)) | |
1151 | .collect() | |
1152 | } | |
1153 | ``` | |
1154 | ||
2b03887a | 1155 | Listing 13-22: Using iterator adapter methods in the implementation of the |
04454e1e FG |
1156 | `search` function |
1157 | ||
1158 | Recall that the purpose of the `search` function is to return all lines in | |
1159 | `contents` that contain the `query`. Similar to the `filter` example in Listing | |
2b03887a FG |
1160 | 13-16, this code uses the `filter` adapter to keep only the lines for which |
1161 | `line.contains(query)` returns `true`. We then collect the matching lines into | |
1162 | another vector with `collect`. Much simpler! Feel free to make the same change | |
1163 | to use iterator methods in the `search_case_insensitive` function as well. | |
04454e1e | 1164 | |
2b03887a | 1165 | ### Choosing Between Loops and Iterators |
923072b8 | 1166 | |
04454e1e | 1167 | The next logical question is which style you should choose in your own code and |
064997fb FG |
1168 | why: the original implementation in Listing 13-21 or the version using |
1169 | iterators in Listing 13-22. Most Rust programmers prefer to use the iterator | |
04454e1e | 1170 | style. It’s a bit tougher to get the hang of at first, but once you get a feel |
2b03887a | 1171 | for the various iterator adapters and what they do, iterators can be easier to |
04454e1e FG |
1172 | understand. Instead of fiddling with the various bits of looping and building |
1173 | new vectors, the code focuses on the high-level objective of the loop. This | |
1174 | abstracts away some of the commonplace code so it’s easier to see the concepts | |
1175 | that are unique to this code, such as the filtering condition each element in | |
1176 | the iterator must pass. | |
1177 | ||
1178 | But are the two implementations truly equivalent? The intuitive assumption | |
2b03887a | 1179 | might be that the lower-level loop will be faster. Let’s talk about performance. |
04454e1e FG |
1180 | |
1181 | ## Comparing Performance: Loops vs. Iterators | |
1182 | ||
1183 | To determine whether to use loops or iterators, you need to know which | |
1184 | implementation is faster: the version of the `search` function with an explicit | |
1185 | `for` loop or the version with iterators. | |
1186 | ||
1187 | We ran a benchmark by loading the entire contents of *The Adventures of | |
1188 | Sherlock Holmes* by Sir Arthur Conan Doyle into a `String` and looking for the | |
1189 | word *the* in the contents. Here are the results of the benchmark on the | |
1190 | version of `search` using the `for` loop and the version using iterators: | |
1191 | ||
1192 | ``` | |
1193 | test bench_search_for ... bench: 19,620,300 ns/iter (+/- 915,700) | |
1194 | test bench_search_iter ... bench: 19,234,900 ns/iter (+/- 657,200) | |
1195 | ``` | |
1196 | ||
1197 | The iterator version was slightly faster! We won’t explain the benchmark code | |
2b03887a FG |
1198 | here because the point is not to prove that the two versions are equivalent but |
1199 | to get a general sense of how these two implementations compare | |
04454e1e FG |
1200 | performance-wise. |
1201 | ||
1202 | For a more comprehensive benchmark, you should check using various texts of | |
1203 | various sizes as the `contents`, different words and words of different lengths | |
1204 | as the `query`, and all kinds of other variations. The point is this: | |
1205 | iterators, although a high-level abstraction, get compiled down to roughly the | |
1206 | same code as if you’d written the lower-level code yourself. Iterators are one | |
2b03887a | 1207 | of Rust’s *zero-cost abstractions*, by which we mean that using the abstraction |
04454e1e FG |
1208 | imposes no additional runtime overhead. This is analogous to how Bjarne |
1209 | Stroustrup, the original designer and implementor of C++, defines | |
1210 | *zero-overhead* in “Foundations of C++” (2012): | |
1211 | ||
1212 | > In general, C++ implementations obey the zero-overhead principle: What you | |
2b03887a FG |
1213 | don’t use, you don’t pay for. And further: What you do use, you couldn’t hand |
1214 | code any better.As another example, the following code is taken from an audio | |
1215 | decoder. The decoding algorithm uses the linear prediction mathematical | |
1216 | operation to estimate future values based on a linear function of the previous | |
1217 | samples. This code uses an iterator chain to do some math on three variables in | |
1218 | scope: a `buffer` slice of data, an array of 12 `coefficients`, and an amount | |
1219 | by which to shift data in `qlp_shift`. We’ve declared the variables within this | |
1220 | example but not given them any values; although this code doesn’t have much | |
1221 | meaning outside of its context, it’s still a concise, real-world example of how | |
1222 | Rust translates high-level ideas to low-level code. | |
04454e1e FG |
1223 | |
1224 | ``` | |
1225 | let buffer: &mut [i32]; | |
1226 | let coefficients: [i64; 12]; | |
1227 | let qlp_shift: i16; | |
1228 | ||
1229 | for i in 12..buffer.len() { | |
1230 | let prediction = coefficients.iter() | |
1231 | .zip(&buffer[i - 12..i]) | |
1232 | .map(|(&c, &s)| c * s as i64) | |
1233 | .sum::<i64>() >> qlp_shift; | |
1234 | let delta = buffer[i]; | |
1235 | buffer[i] = prediction as i32 + delta; | |
1236 | } | |
1237 | ``` | |
1238 | ||
1239 | To calculate the value of `prediction`, this code iterates through each of the | |
1240 | 12 values in `coefficients` and uses the `zip` method to pair the coefficient | |
2b03887a FG |
1241 | values with the previous 12 values in `buffer`. Then, for each pair, it |
1242 | multiplies the values together, sums all the results, and shifts the bits in | |
1243 | the sum `qlp_shift` bits to the right. | |
04454e1e FG |
1244 | |
1245 | Calculations in applications like audio decoders often prioritize performance | |
2b03887a | 1246 | most highly. Here, we’re creating an iterator, using two adapters, and then |
04454e1e FG |
1247 | consuming the value. What assembly code would this Rust code compile to? Well, |
1248 | as of this writing, it compiles down to the same assembly you’d write by hand. | |
1249 | There’s no loop at all corresponding to the iteration over the values in | |
1250 | `coefficients`: Rust knows that there are 12 iterations, so it “unrolls” the | |
1251 | loop. *Unrolling* is an optimization that removes the overhead of the loop | |
1252 | controlling code and instead generates repetitive code for each iteration of | |
1253 | the loop. | |
1254 | ||
1255 | All of the coefficients get stored in registers, which means accessing the | |
1256 | values is very fast. There are no bounds checks on the array access at runtime. | |
2b03887a | 1257 | All of these optimizations that Rust is able to apply make the resultant code |
04454e1e FG |
1258 | extremely efficient. Now that you know this, you can use iterators and closures |
1259 | without fear! They make code seem like it’s higher level but don’t impose a | |
1260 | runtime performance penalty for doing so. | |
1261 | ||
1262 | ## Summary | |
1263 | ||
1264 | Closures and iterators are Rust features inspired by functional programming | |
1265 | language ideas. They contribute to Rust’s capability to clearly express | |
1266 | high-level ideas at low-level performance. The implementations of closures and | |
1267 | iterators are such that runtime performance is not affected. This is part of | |
1268 | Rust’s goal to strive to provide zero-cost abstractions. | |
1269 | ||
1270 | Now that we’ve improved the expressiveness of our I/O project, let’s look at | |
1271 | some more features of `cargo` that will help us share the project with the | |
1272 | world. | |
2b03887a | 1273 |