]>
Commit | Line | Data |
---|---|---|
04454e1e FG |
1 | <!-- DO NOT EDIT THIS FILE. |
2 | ||
3 | This file is periodically generated from the content in the `/src/` | |
4 | directory, so all fixes need to be made in `/src/`. | |
5 | --> | |
6 | ||
7 | [TOC] | |
8 | ||
9 | # Functional Language Features: Iterators and Closures | |
10 | ||
11 | Rust’s design has taken inspiration from many existing languages and | |
12 | techniques, and one significant influence is *functional programming*. | |
13 | Programming in a functional style often includes using functions as values by | |
14 | passing them in arguments, returning them from other functions, assigning them | |
15 | to variables for later execution, and so forth. | |
16 | ||
17 | In this chapter, we won’t debate the issue of what functional programming is or | |
18 | isn’t but will instead discuss some features of Rust that are similar to | |
19 | features in many languages often referred to as functional. | |
20 | ||
21 | More specifically, we’ll cover: | |
22 | ||
23 | * *Closures*, a function-like construct you can store in a variable | |
24 | * *Iterators*, a way of processing a series of elements | |
923072b8 FG |
25 | * How to use closures and iterators to improve the I/O project in Chapter 12 |
26 | * The performance of closures and iterators (Spoiler alert: they’re faster than | |
27 | you might think!) | |
04454e1e | 28 | |
923072b8 FG |
29 | We’ve already covered some other Rust features, such as pattern matching and |
30 | enums, that are also influenced by the functional style. Because mastering | |
04454e1e | 31 | closures and iterators is an important part of writing idiomatic, fast Rust |
923072b8 | 32 | code, we’ll devote this entire chapter to them. |
04454e1e | 33 | |
923072b8 | 34 | ## Closures: Anonymous Functions that Capture Their Environment |
04454e1e FG |
35 | |
36 | Rust’s closures are anonymous functions you can save in a variable or pass as | |
37 | arguments to other functions. You can create the closure in one place and then | |
923072b8 FG |
38 | call the closure elsewhere to evaluate it in a different context. Unlike |
39 | functions, closures can capture values from the scope in which they’re defined. | |
40 | We’ll demonstrate how these closure features allow for code reuse and behavior | |
04454e1e FG |
41 | customization. |
42 | ||
43 | ### Capturing the Environment with Closures | |
44 | ||
923072b8 FG |
45 | We’ll first examine how we can use closures to capture values from the |
46 | environment they’re defined in for later use. Here’s the scenario: Every so | |
47 | often, our t-shirt company gives away an exclusive, limited-edition shirt to | |
48 | someone on our mailing list as a promotion. People on the mailing list can | |
49 | optionally add their favorite color to their profile. If the person chosen for | |
50 | a free shirt has their favorite color set, they get that color shirt. If the | |
51 | person hasn’t specified a favorite color, they get whatever color the company | |
52 | currently has the most of. | |
04454e1e FG |
53 | |
54 | There are many ways to implement this. For this example, we’re going to use an | |
923072b8 FG |
55 | enum called `ShirtColor` that has the variants `Red` and `Blue` (limiting the |
56 | number of colors available for simplicity). | |
57 | <!-- are we saying the company only offers shirts in red or blue, or are we just | |
58 | starting with these two colors? Likely not important for the code, but good to | |
59 | clarify for the narrative! /LC --> | |
60 | <!-- Only red and blue, I've clarified here that it's for the purposes of | |
61 | making this example simpler. In the previous paragraph, I specified that these | |
62 | t-shirts are exclusive, limited-edition promotional items, perhaps that'll make | |
63 | the details of this toy make enough sense that the readers aren't distracted | |
64 | from the closures! /Carol --> | |
65 | We represent the company’s inventory with an `Inventory` struct that has a | |
66 | field named `shirts` that contains a `Vec<ShirtColor>` representing the shirt | |
67 | colors currently in stock. The method `shirt_giveaway` defined on `Inventory` | |
68 | gets the optional shirt color preference of the free shirt winner, and returns | |
69 | the shirt color the person will get. This setup is shown in Listing 13-1: | |
04454e1e FG |
70 | |
71 | Filename: src/main.rs | |
72 | ||
73 | ``` | |
74 | #[derive(Debug, PartialEq, Copy, Clone)] | |
75 | enum ShirtColor { | |
76 | Red, | |
77 | Blue, | |
78 | } | |
79 | ||
80 | struct Inventory { | |
81 | shirts: Vec<ShirtColor>, | |
82 | } | |
83 | ||
84 | impl Inventory { | |
85 | fn giveaway(&self, user_preference: Option<ShirtColor>) -> ShirtColor { | |
923072b8 | 86 | user_preference.unwrap_or_else(|| self.most_stocked()) [1] |
04454e1e FG |
87 | } |
88 | ||
89 | fn most_stocked(&self) -> ShirtColor { | |
90 | let mut num_red = 0; | |
91 | let mut num_blue = 0; | |
92 | ||
93 | for color in &self.shirts { | |
94 | match color { | |
95 | ShirtColor::Red => num_red += 1, | |
96 | ShirtColor::Blue => num_blue += 1, | |
97 | } | |
98 | } | |
99 | if num_red > num_blue { | |
100 | ShirtColor::Red | |
101 | } else { | |
102 | ShirtColor::Blue | |
103 | } | |
104 | } | |
105 | } | |
106 | ||
107 | fn main() { | |
108 | let store = Inventory { | |
923072b8 | 109 | shirts: vec![ShirtColor::Blue, ShirtColor::Red, ShirtColor::Blue], [2] |
04454e1e FG |
110 | }; |
111 | ||
112 | let user_pref1 = Some(ShirtColor::Red); | |
923072b8 | 113 | let giveaway1 = store.giveaway(user_pref1); [3] |
04454e1e FG |
114 | println!( |
115 | "The user with preference {:?} gets {:?}", | |
116 | user_pref1, giveaway1 | |
117 | ); | |
118 | ||
119 | let user_pref2 = None; | |
923072b8 | 120 | let giveaway2 = store.giveaway(user_pref2); [4] |
04454e1e FG |
121 | println!( |
122 | "The user with preference {:?} gets {:?}", | |
123 | user_pref2, giveaway2 | |
124 | ); | |
125 | } | |
126 | ``` | |
127 | ||
923072b8 FG |
128 | Listing 13-1: Shirt company giveaway situation |
129 | ||
130 | The `store` defined in `main` has two blue shirts and one red shirt remaining | |
131 | to distribute for this limited-edition promotion [2]. We call the `giveaway` | |
132 | method for a user with a preference for a red shirt [3] and a user without any | |
133 | preference [4]. | |
134 | <!-- Again... I know this is just a toy example, but it seems jarring for a | |
135 | tshirt company to only have three shirts in stock. I think it's fine if we add | |
136 | a line earlier that says something like "for the sake of simplicity, we'll deal | |
137 | with just two shirt colors, and a small volume of stock" /LC --> | |
138 | <!-- I've repeated the detail here that this is a limited-edition promotion, | |
139 | hope that helps remove the jarringness! /Carol --> | |
140 | ||
141 | Again, this code could be implemented in many ways, and here, to focus on | |
142 | closures, we’ve stuck to concepts you’ve already learned except for the body of | |
143 | the `giveaway` method that uses a closure. In the `giveaway` method, we get the | |
144 | user preference as a parameter of type `Option<ShirtColor>` and call the | |
145 | `unwrap_or_else` method on `user_preference` [1]. The `unwrap_or_else` method on | |
146 | `Option<T>` is defined by the standard library. It takes one argument: a | |
147 | closure without any arguments that returns a value `T` (the same type stored in | |
148 | the `Some` variant of the `Option<T>`, in this case `ShirtColor`). If the | |
149 | `Option<T>` is the `Some` variant, `unwrap_or_else` returns the value from | |
150 | within the `Some`. If the `Option<T>` is the `None` variant, `unwrap_or_else` | |
151 | calls the closure and returns the value returned by the closure. | |
152 | ||
153 | We specify the closure expression `|| self.most_stocked()` as the argument to | |
154 | `unwrap_or_else`. This is a closure that takes no parameters itself (if the | |
155 | closure had parameters, they would appear between the two vertical bars). The | |
156 | body of the closure calls `self.most_stocked()`. We’re defining the closure | |
157 | here, and the implementation of `unwrap_or_else` will evaluate the closure | |
158 | later if the result is needed. | |
159 | ||
160 | <!-- can you show us the code that here counts as they closure? is it, for | |
161 | example, this whole section: (&self, user_preference: Option<ShirtColor>) -> | |
162 | ShirtColor { user_preference.unwrap_or_else(|| self.most_stocked()) ? And what | |
163 | indicates to Rust that it's a closure? Or do we not need to indicate that, Rust | |
164 | doesn't care? I'm thinking about the earlier closure definition "You can create | |
165 | the closure in one place and then call the closure elsewhere to evaluate it in | |
166 | a different context" -- are we using this aspect of the closure here? Can you | |
167 | highligt that in the text? /LC --> | |
168 | <!-- I've tried to clarify, and moved the clarification before showing the | |
169 | result of running the code. Is this better? /Carol --> | |
170 | ||
171 | Running this code prints: | |
04454e1e FG |
172 | |
173 | ``` | |
174 | $ cargo run | |
175 | Compiling shirt-company v0.1.0 (file:///projects/shirt-company) | |
176 | Finished dev [unoptimized + debuginfo] target(s) in 0.27s | |
177 | Running `target/debug/shirt-company` | |
178 | The user with preference Some(Red) gets Red | |
179 | The user with preference None gets Blue | |
180 | ``` | |
181 | ||
923072b8 | 182 | One interesting aspect here is that we’ve passed a closure that calls |
04454e1e FG |
183 | `self.most_stocked()` on the current `Inventory` instance. The standard library |
184 | didn’t need to know anything about the `Inventory` or `ShirtColor` types we | |
923072b8 FG |
185 | defined, or the logic we want to use in this scenario. The closure captures an |
186 | immutable reference to the `self` `Inventory` instance and passes it with the | |
187 | code we specify to the `unwrap_or_else` method. Functions, on the other hand, | |
188 | are not able to capture their environment in this way. | |
04454e1e FG |
189 | |
190 | ### Closure Type Inference and Annotation | |
191 | ||
192 | There are more differences between functions and closures. Closures don’t | |
193 | usually require you to annotate the types of the parameters or the return value | |
194 | like `fn` functions do. Type annotations are required on functions because | |
923072b8 FG |
195 | the types are part of an explicit interface exposed to your users. Defining this |
196 | <!-- functions are part of the explicit interface, or type annotations are? /LC --> | |
197 | <!-- I've clarified! /Carol --> | |
04454e1e | 198 | interface rigidly is important for ensuring that everyone agrees on what types |
923072b8 FG |
199 | of values a function uses and returns. Closures, on the other hand, aren’t used |
200 | in an exposed interface like this: they’re stored in variables and used without | |
201 | naming them and exposing them to users of our library. | |
04454e1e FG |
202 | |
203 | Closures are typically short and relevant only within a narrow context rather | |
204 | than in any arbitrary scenario. Within these limited contexts, the compiler can | |
205 | infer the types of the parameters and the return type, similar to how it’s able | |
206 | to infer the types of most variables (there are rare cases where the compiler | |
207 | needs closure type annotations too). | |
208 | ||
209 | As with variables, we can add type annotations if we want to increase | |
210 | explicitness and clarity at the cost of being more verbose than is strictly | |
211 | necessary. Annotating the types for a closure would look like the definition | |
923072b8 FG |
212 | shown in Listing 13-x. In this example, we’re defining a closure and storing it |
213 | in a variable rather than defining the closure in the spot we pass it as an | |
214 | argument as we did in Listing 13-1. | |
04454e1e FG |
215 | |
216 | Filename: src/main.rs | |
217 | ||
218 | ``` | |
219 | let expensive_closure = |num: u32| -> u32 { | |
220 | println!("calculating slowly..."); | |
221 | thread::sleep(Duration::from_secs(2)); | |
222 | num | |
223 | }; | |
224 | ``` | |
225 | ||
923072b8 | 226 | Listing 13-x: Adding optional type annotations of the parameter and return |
04454e1e | 227 | value types in the closure |
923072b8 FG |
228 | <!-- Interestng, so is this another way to define a closure: with the let |
229 | keywork like a variable? Earlier we defined in (I think!) in the function | |
230 | definition: fn giveaway(&self,... Is it worth pointing out different ways they | |
231 | can be defined, or should that be obvious to the reader? /LC --> | |
232 | <!-- I've tried to clarify the paragraph before the listing, does this clear it | |
233 | up? This code is storing the closure in a variable rather than as an argument. | |
234 | Closure definitions are expressions and are always defined the same way, it's | |
235 | just that they can be used in different contexts like any other expression can. | |
236 | I think the reader will understand that? /Carol --> | |
04454e1e FG |
237 | |
238 | With type annotations added, the syntax of closures looks more similar to the | |
923072b8 FG |
239 | syntax of functions. Here we define a function that adds 1 to its parameter and |
240 | a closure that has the same behavior, for comparison. We’ve added some spaces | |
241 | to line up the relevant parts. This illustrates how closure syntax is similar | |
242 | to function syntax except for the use of pipes and the amount of syntax that is | |
243 | optional: | |
04454e1e FG |
244 | |
245 | ``` | |
246 | fn add_one_v1 (x: u32) -> u32 { x + 1 } | |
247 | let add_one_v2 = |x: u32| -> u32 { x + 1 }; | |
248 | let add_one_v3 = |x| { x + 1 }; | |
249 | let add_one_v4 = |x| x + 1 ; | |
250 | ``` | |
251 | ||
252 | The first line shows a function definition, and the second line shows a fully | |
923072b8 FG |
253 | annotated closure definition. In the third line, we remove the type annotations |
254 | from the closure definition. In the fourth line, we remove the brackets, which | |
255 | are optional because the closure body has only one expression. These are all | |
256 | valid definitions that will produce the same behavior when they’re called. | |
257 | Evaluating the closures is required for `add_one_v3` and `add_one_v4` to be | |
258 | able to compile because the types will be inferred from their usage. This is | |
259 | similar to `let v = Vec::new();` needing either type annotations or values of | |
260 | some type to be inserted into the `Vec` for Rust to be able to infer the type. | |
261 | <!-- I wasn't clear what was meant by "Calling the closures is required for | |
262 | `add_one_v3` and `add_one_v4`..." -- I thought these were closures? /LC --> | |
263 | <!-- I've changed "Calling" to "Evaluating", does that clear it up? I've also | |
264 | added a sentence to have the reader recall how `Vec` works with type | |
265 | annotations; it's similar here /Carol --> | |
266 | ||
267 | For closure definitions, the compiler will infer one concrete type for each of | |
268 | their parameters and for their return value. For instance, Listing 13-x shows | |
269 | the definition of a short closure that just returns the value it receives as a | |
04454e1e | 270 | parameter. This closure isn’t very useful except for the purposes of this |
923072b8 FG |
271 | example. Note that we haven’t added any type annotations to the definition. |
272 | Because there are no type annotations, we can call the closure with any type, | |
273 | which we’ve done here with `String` the first time. If we then try to call | |
274 | `example_closure` with an integer, we’ll get an error. | |
275 | <!-- if we did add type annotations, you mean? Or because we haven't? /LC --> | |
276 | <!-- Because we haven't. I've tried to clarify? /Carol --> | |
04454e1e FG |
277 | |
278 | Filename: src/main.rs | |
279 | ||
280 | ``` | |
281 | let example_closure = |x| x; | |
282 | ||
283 | let s = example_closure(String::from("hello")); | |
284 | let n = example_closure(5); | |
285 | ``` | |
286 | ||
923072b8 | 287 | Listing 13-x: Attempting to call a closure whose types are inferred with two |
04454e1e FG |
288 | different types |
289 | ||
290 | The compiler gives us this error: | |
291 | ||
292 | ``` | |
293 | error[E0308]: mismatched types | |
294 | --> src/main.rs:5:29 | |
295 | | | |
296 | 5 | let n = example_closure(5); | |
297 | | ^- help: try using a conversion method: `.to_string()` | |
298 | | | | |
299 | | expected struct `String`, found integer | |
300 | ``` | |
301 | ||
302 | The first time we call `example_closure` with the `String` value, the compiler | |
303 | infers the type of `x` and the return type of the closure to be `String`. Those | |
304 | types are then locked into the closure in `example_closure`, and we get a type | |
923072b8 | 305 | error when we next try to use a different type with the same closure. |
04454e1e FG |
306 | |
307 | ### Capturing References or Moving Ownership | |
308 | ||
309 | Closures can capture values from their environment in three ways, which | |
310 | directly map to the three ways a function can take a parameter: borrowing | |
311 | immutably, borrowing mutably, and taking ownership. The closure will decide | |
312 | which of these to use based on what the body of the function does with the | |
313 | captured values. | |
314 | ||
923072b8 FG |
315 | In Listing 13-x, we define a closure that captures an immutable reference to the |
316 | vector named `list` because it only needs an immutable reference to print the | |
317 | value: | |
04454e1e FG |
318 | |
319 | Filename: src/main.rs | |
320 | ||
321 | ``` | |
322 | fn main() { | |
323 | let list = vec![1, 2, 3]; | |
324 | println!("Before defining closure: {:?}", list); | |
325 | ||
923072b8 | 326 | [1] let only_borrows = || println!("From closure: {:?}", list); |
04454e1e FG |
327 | |
328 | println!("Before calling closure: {:?}", list); | |
923072b8 | 329 | only_borrows(); [2] |
04454e1e FG |
330 | println!("After calling closure: {:?}", list); |
331 | } | |
332 | ``` | |
333 | ||
923072b8 FG |
334 | Listing 13-x: Defining and calling a closure that captures an immutable |
335 | reference | |
336 | ||
337 | This example also illustrates that a variable can bind to a closure definition | |
338 | [1], and we can later call the closure by using the variable name and | |
339 | parentheses as if the variable name were a function name [2]. | |
340 | <!-- That's cool. I'm changing to the active voice here, but I wanted to make | |
341 | sure I'm not changing meaning: it is us calling the closure later | |
342 | intentionally, right? It's not happening automatically behind the scenes? /LC | |
343 | --> | |
344 | <!-- Yes, beacuse we've typed `only_borrows()`. I've moved this text after the | |
345 | listing and added some wingdings to hopefully be clearer /Carol --> | |
04454e1e | 346 | |
923072b8 FG |
347 | Because we can have multiple immutable references to `list` at the same time, |
348 | `list` is still accessible from the code before the closure definition, after | |
04454e1e | 349 | the closure definition but before the closure is called, and after the closure |
923072b8 | 350 | is called. This code compiles, runs, and prints: |
04454e1e FG |
351 | |
352 | ``` | |
353 | Before defining closure: [1, 2, 3] | |
354 | Before calling closure: [1, 2, 3] | |
355 | From closure: [1, 2, 3] | |
356 | After calling closure: [1, 2, 3] | |
357 | ``` | |
358 | ||
923072b8 FG |
359 | Next, in Listing 13-x, we change the closure body so that it adds an element to |
360 | the `list` vector. The closure now captures a mutable reference: | |
04454e1e FG |
361 | |
362 | Filename: src/main.rs | |
363 | ||
364 | ``` | |
365 | fn main() { | |
366 | let mut list = vec![1, 2, 3]; | |
367 | println!("Before defining closure: {:?}", list); | |
368 | ||
369 | let mut borrows_mutably = || list.push(7); | |
370 | ||
371 | borrows_mutably(); | |
372 | println!("After calling closure: {:?}", list); | |
373 | } | |
374 | ``` | |
375 | ||
923072b8 | 376 | Listing 13-x: Defining and calling a closure that captures a mutable reference |
04454e1e FG |
377 | |
378 | This code compiles, runs, and prints: | |
379 | ||
380 | ``` | |
381 | Before defining closure: [1, 2, 3] | |
382 | After calling closure: [1, 2, 3, 7] | |
383 | ``` | |
384 | ||
385 | Note that there’s no longer a `println!` between the definition and the call of | |
386 | the `borrows_mutably` closure: when `borrows_mutably` is defined, it captures a | |
923072b8 FG |
387 | mutable reference to `list`. We don’t use the closure again after the closure |
388 | is called, so the mutable borrow ends. Between the closure definition and the | |
389 | closure call, an immutable borrow to print isn’t allowed because no other | |
390 | borrows are allowed when there’s a mutable borrow. Try adding a `println!` | |
391 | there to see what error message you get! | |
04454e1e FG |
392 | |
393 | If you want to force the closure to take ownership of the values it uses in the | |
394 | environment even though the body of the closure doesn’t strictly need | |
395 | ownership, you can use the `move` keyword before the parameter list. This | |
396 | technique is mostly useful when passing a closure to a new thread to move the | |
923072b8 | 397 | data so that it’s owned by the new thread. We’ll have more examples of `move` |
04454e1e FG |
398 | closures in Chapter 16 when we talk about concurrency. |
399 | ||
923072b8 FG |
400 | ### Moving Captured Values Out of Closures and the `Fn` Traits |
401 | ||
402 | Once a closure has captured a reference or captured ownership of a value where | |
403 | the closure is defined (thus affecting what, if anything, is moved *into* the | |
404 | closure), the code in the body of the closure defines what happens to the | |
405 | references or | |
406 | <!-- which function does this refer to -- is a closure always tied to a | |
407 | function? I'm thinking of the let closure created earlier. Is "function" here | |
408 | just a way to refer to the functionality of the closure? I'm wary of mixing the | |
409 | two terms /LC --> | |
410 | <!-- This was my mistake, this should say "closure" throughout! Great catch! | |
411 | /Carol --> | |
412 | values when the closure is evaluated later (thus affecting what, if anything, | |
413 | is moved *out of* the closure). | |
414 | <!-- do we mean "the references and values that are a result of calling the | |
415 | function"? This line confused me a little. Surely it's self-evident that the | |
416 | code in the function body affects the value or reference it's called on; I | |
417 | think I'm missing something! /LC --> | |
418 | <!-- This is an important part of closures that is confusing that I'm trying to | |
419 | clear up with this revision. Closure definitions can move references or values | |
420 | *in*, then the closure body can move references or values *out*, and we can | |
421 | vary these two aspects independently. I'm not sure the edit I've made here | |
422 | makes it better or worse? --> | |
423 | A closure body can do any of the following: move a captured value out of the | |
424 | closure, mutate the captured value, neither move nor mutate the value, or | |
425 | capture nothing from the environment to begin with. | |
426 | ||
427 | The way a closure captures and handles values from the environment affects | |
428 | which traits | |
429 | <!-- so the closure will automatically implement the traits depending on how we | |
430 | set it to handle the values? /LC --> | |
431 | <!-- Yup! /Carol--> | |
432 | the closure implements, and traits are how functions and structs can specify | |
433 | what kinds of closures they can use. Closures will automatically implement one, | |
434 | two, or all three of these `Fn` traits, in an additive fashion: | |
04454e1e FG |
435 | |
436 | 1. `FnOnce` applies to closures that can be called at least once. All closures | |
923072b8 FG |
437 | implement at least this trait, because all closures can be called. A closure |
438 | that moves captured values out of its body will only implement `FnOnce` | |
439 | and none of the other `Fn` traits, because it can only be called once. | |
04454e1e FG |
440 | 2. `FnMut` applies to closures that don’t move captured values out of their |
441 | body, but that might mutate the captured values. These closures can be | |
442 | called more than once. | |
443 | 3. `Fn` applies to closures that don’t move captured values out of their body | |
923072b8 FG |
444 | and that don’t mutate captured values, as well as closures that capture |
445 | nothing from their environment. These closures can be called more than once | |
446 | without mutating their environment, which is important in cases such as | |
447 | calling a closure multiple times concurrently. | |
448 | ||
449 | <!-- so there isn't a trait for the first action listed above: moving a | |
450 | capture value out of the closure? /LC --> | |
451 | <!-- There is-- it's `FnOnce`, as stated there: "A closure that moves captured | |
452 | values out of its body will only implement `FnOnce`". I think the confusion is | |
453 | that the 3rd trait here, `Fn`, applies to the last *two* actions listed above, | |
454 | which I tried to express but perhaps wasn't clear enough. I rearranged the text | |
455 | in #3 to maybe make it clearer? /Carol --> | |
04454e1e FG |
456 | |
457 | Let’s look at the definition of the `unwrap_or_else` method on `Option<T>` that | |
923072b8 | 458 | we used in Listing 13-x: |
04454e1e FG |
459 | |
460 | ``` | |
461 | impl<T> Option<T> { | |
462 | pub fn unwrap_or_else<F>(self, f: F) -> T | |
463 | where | |
464 | F: FnOnce() -> T | |
465 | { | |
466 | match self { | |
467 | Some(x) => x, | |
468 | None => f(), | |
469 | } | |
470 | } | |
471 | } | |
472 | ``` | |
473 | ||
474 | Recall that `T` is the generic type representing the type of the value in the | |
475 | `Some` variant of an `Option`. That type `T` is also the return type of the | |
476 | `unwrap_or_else` function: code that calls `unwrap_or_else` on an | |
477 | `Option<String>`, for example, will get a `String`. | |
478 | ||
923072b8 FG |
479 | Next, notice that the `unwrap_or_else` function has the additional generic type |
480 | parameter `F`. The `F` type is the type of the parameter named `f`, which is | |
04454e1e FG |
481 | the closure we provide when calling `unwrap_or_else`. |
482 | ||
483 | The trait bound specified on the generic type `F` is `FnOnce() -> T`, which | |
484 | means `F` must be able to be called at least once, take no arguments, and | |
485 | return a `T`. Using `FnOnce` in the trait bound expresses the constraint that | |
486 | `unwrap_or_else` is only going to call `f` at most one time. In the body of | |
487 | `unwrap_or_else`, we can see that if the `Option` is `Some`, `f` won’t be | |
488 | called. If the `Option` is `None`, `f` will be called once. Because all | |
489 | closures implement `FnOnce`, `unwrap_or_else` accepts the most different kinds | |
490 | of closures and is as flexible as it can be. | |
491 | ||
492 | > Note: Functions can implement all three of the `Fn` traits too. If what we | |
493 | > want to do doesn’t require capturing a value from the environment, we can use | |
494 | > the name of a function rather than a closure where we need something that | |
495 | > implements one of the `Fn` traits. For example, on an `Option<Vec<T>>` value, | |
496 | > we could call `unwrap_or_else(Vec::new)` to get a new, empty vector if the | |
497 | > value is `None`. | |
498 | ||
499 | Now let’s look at the standard library method `sort_by_key` defined on slices, | |
923072b8 FG |
500 | to see how that differs from `unwrap_or_else` and why `sort_by_key` uses |
501 | `FnMut` instead of `FnOnce` for the trait bound. | |
502 | <!-- can you tell us what about this method makes it a good comparison: what's | |
503 | the difference we're focusing on? /LC --> | |
504 | <!-- Done! /Carol --> | |
505 | The closure gets one argument, a reference to the current item in the slice | |
506 | being considered, and returns a value of type `K` that can be ordered. This | |
507 | function is useful when you want to sort a slice by a particular attribute of | |
508 | each item. In Listing 13-x, we have a list of `Rectangle` instances and we use | |
04454e1e FG |
509 | `sort_by_key` to order them by their `width` attribute from low to high: |
510 | ||
511 | Filename: src/main.rs | |
512 | ||
513 | ``` | |
514 | #[derive(Debug)] | |
515 | struct Rectangle { | |
516 | width: u32, | |
517 | height: u32, | |
518 | } | |
519 | ||
520 | fn main() { | |
521 | let mut list = [ | |
923072b8 FG |
522 | Rectangle { width: 10, height: 1 }, |
523 | Rectangle { width: 3, height: 5 }, | |
524 | Rectangle { width: 7, height: 12 }, | |
04454e1e FG |
525 | ]; |
526 | ||
527 | list.sort_by_key(|r| r.width); | |
528 | println!("{:#?}", list); | |
529 | } | |
530 | ``` | |
923072b8 | 531 | Listing 13-x: Using `sort_by_key` to order rectangles by width |
04454e1e FG |
532 | |
533 | This code prints: | |
534 | ||
535 | ``` | |
536 | [ | |
537 | Rectangle { | |
538 | width: 3, | |
539 | height: 5, | |
540 | }, | |
541 | Rectangle { | |
542 | width: 7, | |
543 | height: 12, | |
544 | }, | |
545 | Rectangle { | |
546 | width: 10, | |
547 | height: 1, | |
548 | }, | |
549 | ] | |
550 | ``` | |
551 | ||
552 | The reason `sort_by_key` is defined to take an `FnMut` closure is that it calls | |
553 | the closure multiple times: once for each item in the slice. The closure `|r| | |
554 | r.width` doesn’t capture, mutate, or move out anything from its environment, so | |
555 | it meets the trait bound requirements. | |
556 | ||
923072b8 FG |
557 | In contrast, Listing 13-x shows an example of a closure that implements just |
558 | the `FnOnce` trait, because it moves a value out of the environment. The | |
559 | compiler won’t let us use this closure with `sort_by_key`: | |
04454e1e FG |
560 | |
561 | Filename: src/main.rs | |
562 | ||
563 | ``` | |
564 | #[derive(Debug)] | |
565 | struct Rectangle { | |
566 | width: u32, | |
567 | height: u32, | |
568 | } | |
569 | ||
570 | fn main() { | |
571 | let mut list = [ | |
923072b8 FG |
572 | Rectangle { width: 10, height: 1 }, |
573 | Rectangle { width: 3, height: 5 }, | |
574 | Rectangle { width: 7, height: 12 }, | |
04454e1e FG |
575 | ]; |
576 | ||
577 | let mut sort_operations = vec![]; | |
578 | let value = String::from("by key called"); | |
579 | ||
580 | list.sort_by_key(|r| { | |
581 | sort_operations.push(value); | |
582 | r.width | |
583 | }); | |
584 | println!("{:#?}", list); | |
585 | } | |
586 | ``` | |
587 | ||
923072b8 | 588 | Listing 13-x: Attempting to use an `FnOnce` closure with `sort_by_key` |
04454e1e FG |
589 | |
590 | This is a contrived, convoluted way (that doesn’t work) to try and count the | |
591 | number of times `sort_by_key` gets called when sorting `list`. This code | |
923072b8 FG |
592 | attempts to do this counting by pushing `value`—a `String` from the closure’s |
593 | environment—into the `sort_operations` vector. The closure captures `value` | |
04454e1e FG |
594 | then moves `value` out of the closure by transferring ownership of `value` to |
595 | the `sort_operations` vector. This closure can be called once; trying to call | |
596 | it a second time wouldn’t work because `value` would no longer be in the | |
597 | environment to be pushed into `sort_operations` again! Therefore, this closure | |
598 | only implements `FnOnce`. When we try to compile this code, we get this error | |
599 | that `value` can’t be moved out of the closure because the closure must | |
600 | implement `FnMut`: | |
601 | ||
602 | ``` | |
603 | error[E0507]: cannot move out of `value`, a captured variable in an `FnMut` closure | |
923072b8 | 604 | --> src/main.rs:18:30 |
04454e1e | 605 | | |
923072b8 | 606 | 15 | let value = String::from("by key called"); |
04454e1e | 607 | | ----- captured outer variable |
923072b8 FG |
608 | 16 | |
609 | 17 | list.sort_by_key(|r| { | |
04454e1e | 610 | | ______________________- |
923072b8 | 611 | 18 | | sort_operations.push(value); |
04454e1e | 612 | | | ^^^^^ move occurs because `value` has type `String`, which does not implement the `Copy` trait |
923072b8 FG |
613 | 19 | | r.width |
614 | 20 | | }); | |
04454e1e FG |
615 | | |_____- captured by this `FnMut` closure |
616 | ``` | |
617 | ||
618 | The error points to the line in the closure body that moves `value` out of the | |
619 | environment. To fix this, we need to change the closure body so that it doesn’t | |
923072b8 FG |
620 | move values out of the environment. To count the number of times `sort_by_key` |
621 | is called, keeping a counter in the environment and incrementing its value in | |
622 | the closure body is a more straightforward way to calculate that. | |
623 | <!-- are we interested in this? Are we saying this is basically what we're | |
624 | doing, but we're going about it in a roundabout way? /LC --> | |
625 | <!-- I'm trying to point out how to fix the example if you wanted this | |
626 | behavior, I guess the hypothetical isn't necessary /Carol --> | |
627 | The closure in Listing 13-x works with `sort_by_key` because it is only | |
04454e1e FG |
628 | capturing a mutable reference to the `num_sort_operations` counter and can |
629 | therefore be called more than once: | |
630 | ||
631 | Filename: src/main.rs | |
632 | ||
633 | ``` | |
634 | #[derive(Debug)] | |
635 | struct Rectangle { | |
636 | width: u32, | |
637 | height: u32, | |
638 | } | |
639 | ||
640 | fn main() { | |
641 | let mut list = [ | |
923072b8 FG |
642 | Rectangle { width: 10, height: 1 }, |
643 | Rectangle { width: 3, height: 5 }, | |
644 | Rectangle { width: 7, height: 12 }, | |
04454e1e FG |
645 | ]; |
646 | ||
647 | let mut num_sort_operations = 0; | |
648 | list.sort_by_key(|r| { | |
649 | num_sort_operations += 1; | |
650 | r.width | |
651 | }); | |
652 | println!("{:#?}, sorted in {num_sort_operations} operations", list); | |
653 | } | |
654 | ``` | |
655 | ||
923072b8 | 656 | Listing 13-x: Using an `FnMut` closure with `sort_by_key` is allowed |
04454e1e FG |
657 | |
658 | The `Fn` traits are important when defining or using functions or types that | |
923072b8 FG |
659 | make use of closures. In the next section, we’ll discuss iterators. Many |
660 | iterator methods take closure arguments, so keep these closure details in mind | |
661 | as we continue! | |
04454e1e FG |
662 | |
663 | ## Processing a Series of Items with Iterators | |
664 | ||
665 | The iterator pattern allows you to perform some task on a sequence of items in | |
666 | turn. An iterator is responsible for the logic of iterating over each item and | |
667 | determining when the sequence has finished. When you use iterators, you don’t | |
668 | have to reimplement that logic yourself. | |
669 | ||
670 | In Rust, iterators are *lazy*, meaning they have no effect until you call | |
671 | methods that consume the iterator to use it up. For example, the code in | |
923072b8 | 672 | Listing 13-13 creates an iterator over the items in the vector `v1` by calling |
04454e1e FG |
673 | the `iter` method defined on `Vec<T>`. This code by itself doesn’t do anything |
674 | useful. | |
675 | ||
676 | ``` | |
677 | let v1 = vec![1, 2, 3]; | |
678 | ||
679 | let v1_iter = v1.iter(); | |
680 | ``` | |
681 | ||
923072b8 | 682 | Listing 13-13: Creating an iterator |
04454e1e | 683 | |
923072b8 FG |
684 | The iterator is stored in the `v1_iter` variable. Once we’ve created an |
685 | iterator, we can use it in a variety of ways. In Listing 3-5 in Chapter 3, we | |
686 | iterated over an array using a `for` loop to execute some code on each of its | |
687 | items. Under the hood this implicitly created and then consumed an iterator, | |
688 | but we glossed over how exactly that works until now. | |
04454e1e | 689 | |
923072b8 FG |
690 | In the example in Listing 13-14, we separate the creation of the iterator from |
691 | the use of the iterator in the `for` loop. When the `for` loop is called using | |
692 | the iterator in `v1_iter`, each element in the iterator is used in one | |
693 | iteration of the loop, which prints out each value. | |
04454e1e FG |
694 | |
695 | ``` | |
696 | let v1 = vec![1, 2, 3]; | |
697 | ||
698 | let v1_iter = v1.iter(); | |
699 | ||
700 | for val in v1_iter { | |
701 | println!("Got: {}", val); | |
702 | } | |
703 | ``` | |
704 | ||
923072b8 | 705 | Listing 13-14: Using an iterator in a `for` loop |
04454e1e FG |
706 | |
707 | In languages that don’t have iterators provided by their standard libraries, | |
708 | you would likely write this same functionality by starting a variable at index | |
709 | 0, using that variable to index into the vector to get a value, and | |
710 | incrementing the variable value in a loop until it reached the total number of | |
711 | items in the vector. | |
712 | ||
713 | Iterators handle all that logic for you, cutting down on repetitive code you | |
714 | could potentially mess up. Iterators give you more flexibility to use the same | |
715 | logic with many different kinds of sequences, not just data structures you can | |
716 | index into, like vectors. Let’s examine how iterators do that. | |
717 | ||
718 | ### The `Iterator` Trait and the `next` Method | |
719 | ||
720 | All iterators implement a trait named `Iterator` that is defined in the | |
721 | standard library. The definition of the trait looks like this: | |
722 | ||
723 | ``` | |
724 | pub trait Iterator { | |
725 | type Item; | |
726 | ||
727 | fn next(&mut self) -> Option<Self::Item>; | |
728 | ||
729 | // methods with default implementations elided | |
730 | } | |
731 | ``` | |
732 | ||
733 | Notice this definition uses some new syntax: `type Item` and `Self::Item`, | |
734 | which are defining an *associated type* with this trait. We’ll talk about | |
735 | associated types in depth in Chapter 19. For now, all you need to know is that | |
736 | this code says implementing the `Iterator` trait requires that you also define | |
737 | an `Item` type, and this `Item` type is used in the return type of the `next` | |
738 | method. In other words, the `Item` type will be the type returned from the | |
739 | iterator. | |
740 | ||
741 | The `Iterator` trait only requires implementors to define one method: the | |
742 | `next` method, which returns one item of the iterator at a time wrapped in | |
743 | `Some` and, when iteration is over, returns `None`. | |
744 | ||
923072b8 | 745 | We can call the `next` method on iterators directly; Listing 13-15 demonstrates |
04454e1e FG |
746 | what values are returned from repeated calls to `next` on the iterator created |
747 | from the vector. | |
748 | ||
749 | Filename: src/lib.rs | |
750 | ||
751 | ``` | |
752 | #[test] | |
753 | fn iterator_demonstration() { | |
754 | let v1 = vec![1, 2, 3]; | |
755 | ||
756 | let mut v1_iter = v1.iter(); | |
757 | ||
758 | assert_eq!(v1_iter.next(), Some(&1)); | |
759 | assert_eq!(v1_iter.next(), Some(&2)); | |
760 | assert_eq!(v1_iter.next(), Some(&3)); | |
761 | assert_eq!(v1_iter.next(), None); | |
762 | } | |
763 | ``` | |
764 | ||
923072b8 | 765 | Listing 13-15: Calling the `next` method on an iterator |
04454e1e FG |
766 | |
767 | Note that we needed to make `v1_iter` mutable: calling the `next` method on an | |
768 | iterator changes internal state that the iterator uses to keep track of where | |
769 | it is in the sequence. In other words, this code *consumes*, or uses up, the | |
770 | iterator. Each call to `next` eats up an item from the iterator. We didn’t need | |
771 | to make `v1_iter` mutable when we used a `for` loop because the loop took | |
772 | ownership of `v1_iter` and made it mutable behind the scenes. | |
773 | ||
774 | Also note that the values we get from the calls to `next` are immutable | |
775 | references to the values in the vector. The `iter` method produces an iterator | |
776 | over immutable references. If we want to create an iterator that takes | |
777 | ownership of `v1` and returns owned values, we can call `into_iter` instead of | |
778 | `iter`. Similarly, if we want to iterate over mutable references, we can call | |
779 | `iter_mut` instead of `iter`. | |
780 | ||
781 | ### Methods that Consume the Iterator | |
782 | ||
783 | The `Iterator` trait has a number of different methods with default | |
784 | implementations provided by the standard library; you can find out about these | |
785 | methods by looking in the standard library API documentation for the `Iterator` | |
786 | trait. Some of these methods call the `next` method in their definition, which | |
787 | is why you’re required to implement the `next` method when implementing the | |
788 | `Iterator` trait. | |
789 | ||
790 | Methods that call `next` are called *consuming adaptors*, because calling them | |
791 | uses up the iterator. One example is the `sum` method, which takes ownership of | |
792 | the iterator and iterates through the items by repeatedly calling `next`, thus | |
793 | consuming the iterator. As it iterates through, it adds each item to a running | |
923072b8 | 794 | total and returns the total when iteration is complete. Listing 13-16 has a |
04454e1e FG |
795 | test illustrating a use of the `sum` method: |
796 | ||
797 | Filename: src/lib.rs | |
798 | ||
799 | ``` | |
800 | #[test] | |
801 | fn iterator_sum() { | |
802 | let v1 = vec![1, 2, 3]; | |
803 | ||
804 | let v1_iter = v1.iter(); | |
805 | ||
806 | let total: i32 = v1_iter.sum(); | |
807 | ||
808 | assert_eq!(total, 6); | |
809 | } | |
810 | ``` | |
811 | ||
923072b8 | 812 | Listing 13-16: Calling the `sum` method to get the total of all items in the |
04454e1e FG |
813 | iterator |
814 | ||
815 | We aren’t allowed to use `v1_iter` after the call to `sum` because `sum` takes | |
816 | ownership of the iterator we call it on. | |
817 | ||
818 | ### Methods that Produce Other Iterators | |
819 | ||
923072b8 FG |
820 | *Iterator adaptors* are methods defined on the `Iterator` trait that don’t |
821 | consume the iterator. Instead, they produce different iterators by changing | |
822 | some aspect of the original iterator. | |
823 | ||
824 | <!-- are all methods defined on Iterator know as iterator adaptors? If so, | |
825 | should that term be introduced in the previous section? /LC --> | |
826 | <!-- No, only methods on iterator that produce other iterators are adaptors. | |
827 | I've tried to make that distinction clearer? /Carol --> | |
828 | <!-- is there a quick example of a different kind of iterator you could give? I | |
829 | wasn't aware there were different kinds, unless you just mean change them into | |
830 | iterators that act on something else? /LC --> | |
831 | <!-- Adaptors typically do change an iterator into an iterator with a different | |
832 | type, but usually that doesn't matter because the important part is that the | |
833 | new type also implents the `Iterator` trait. The iterator typically acts on the | |
834 | same items but changes them in different ways. I've tried to rearrange this so | |
835 | that the example comes sooner, does this help? /Carol --> | |
836 | ||
837 | Listing 13-17 shows an example of calling the iterator adaptor method `map`, | |
838 | which takes a closure to call on each item as the items are iterated through. | |
839 | The `map` method returns a new iterator that produces the modified items. The | |
840 | closure here creates a new iterator in which each item from the vector will be | |
841 | incremented by 1: | |
04454e1e FG |
842 | |
843 | Filename: src/main.rs | |
844 | ||
845 | ``` | |
846 | let v1: Vec<i32> = vec![1, 2, 3]; | |
847 | ||
848 | v1.iter().map(|x| x + 1); | |
849 | ``` | |
850 | ||
923072b8 | 851 | Listing 13-17: Calling the iterator adaptor `map` to create a new iterator |
04454e1e | 852 | |
923072b8 | 853 | However, this code produces a warning: |
04454e1e FG |
854 | |
855 | ``` | |
856 | warning: unused `Map` that must be used | |
857 | --> src/main.rs:4:5 | |
858 | | | |
859 | 4 | v1.iter().map(|x| x + 1); | |
860 | | ^^^^^^^^^^^^^^^^^^^^^^^^^ | |
861 | | | |
862 | = note: `#[warn(unused_must_use)]` on by default | |
863 | = note: iterators are lazy and do nothing unless consumed | |
864 | ``` | |
865 | ||
923072b8 | 866 | The code in Listing 13-17 doesn’t do anything; the closure we’ve specified |
04454e1e FG |
867 | never gets called. The warning reminds us why: iterator adaptors are lazy, and |
868 | we need to consume the iterator here. | |
869 | ||
923072b8 FG |
870 | To fix this warning and consume the iterator, we’ll use the `collect` method, |
871 | which we used in Chapter 12 with `env::args` in Listing 12-1. This method | |
872 | consumes the iterator and collects the resulting values into a collection data | |
873 | type. | |
04454e1e | 874 | |
923072b8 | 875 | In Listing 13-18, we collect the results of iterating over the iterator that’s |
04454e1e FG |
876 | returned from the call to `map` into a vector. This vector will end up |
877 | containing each item from the original vector incremented by 1. | |
878 | ||
879 | Filename: src/main.rs | |
880 | ||
881 | ``` | |
882 | let v1: Vec<i32> = vec![1, 2, 3]; | |
883 | ||
884 | let v2: Vec<_> = v1.iter().map(|x| x + 1).collect(); | |
885 | ||
886 | assert_eq!(v2, vec![2, 3, 4]); | |
887 | ``` | |
888 | ||
923072b8 | 889 | Listing 13-18: Calling the `map` method to create a new iterator and then |
04454e1e FG |
890 | calling the `collect` method to consume the new iterator and create a vector |
891 | ||
892 | Because `map` takes a closure, we can specify any operation we want to perform | |
893 | on each item. This is a great example of how closures let you customize some | |
894 | behavior while reusing the iteration behavior that the `Iterator` trait | |
895 | provides. | |
896 | ||
923072b8 FG |
897 | You can chain multiple calls to iterator adaptors to perform complex actions in |
898 | a readable way. But because all iterators are lazy, you have to call one of the | |
899 | consuming adaptor methods to get results from calls to iterator adaptors. | |
04454e1e | 900 | |
923072b8 | 901 | ### Using Closures that Capture Their Environment |
04454e1e | 902 | |
923072b8 FG |
903 | Many iterator adapters take closures as arguments, and commonly the closures |
904 | we’ll specify as arguments to iterator adapters will be closures that capture | |
905 | their environment. | |
906 | <!-- are we saying we use filter to demonstrate some common use, or that using | |
907 | filter is the common use? If the former, can you specify what the common usage | |
908 | is? /LC --> | |
909 | <!-- Iterator adapters commonly use closures that capture their environment, | |
910 | I've tried to reword to avoid the ambiguity? /Carol --> | |
911 | For this example, we’ll use the `filter` method that takes a closure. The | |
912 | closure gets an item from the iterator and returns a Boolean. If the closure | |
913 | returns `true`, the value will be included in the iteration produced by | |
914 | `filter`. If the closure returns `false`, the value won’t be included. | |
915 | ||
916 | In Listing 13-19, we use `filter` with a closure that captures the `shoe_size` | |
04454e1e FG |
917 | variable from its environment to iterate over a collection of `Shoe` struct |
918 | instances. It will return only shoes that are the specified size. | |
919 | ||
920 | Filename: src/lib.rs | |
921 | ||
922 | ``` | |
923 | #[derive(PartialEq, Debug)] | |
924 | struct Shoe { | |
925 | size: u32, | |
926 | style: String, | |
927 | } | |
928 | ||
929 | fn shoes_in_size(shoes: Vec<Shoe>, shoe_size: u32) -> Vec<Shoe> { | |
930 | shoes.into_iter().filter(|s| s.size == shoe_size).collect() | |
931 | } | |
932 | ||
933 | #[cfg(test)] | |
934 | mod tests { | |
935 | use super::*; | |
936 | ||
937 | #[test] | |
938 | fn filters_by_size() { | |
939 | let shoes = vec![ | |
940 | Shoe { | |
941 | size: 10, | |
942 | style: String::from("sneaker"), | |
943 | }, | |
944 | Shoe { | |
945 | size: 13, | |
946 | style: String::from("sandal"), | |
947 | }, | |
948 | Shoe { | |
949 | size: 10, | |
950 | style: String::from("boot"), | |
951 | }, | |
952 | ]; | |
953 | ||
954 | let in_my_size = shoes_in_size(shoes, 10); | |
955 | ||
956 | assert_eq!( | |
957 | in_my_size, | |
958 | vec![ | |
959 | Shoe { | |
960 | size: 10, | |
961 | style: String::from("sneaker") | |
962 | }, | |
963 | Shoe { | |
964 | size: 10, | |
965 | style: String::from("boot") | |
966 | }, | |
967 | ] | |
968 | ); | |
969 | } | |
970 | } | |
971 | ``` | |
972 | ||
923072b8 | 973 | Listing 13-19: Using the `filter` method with a closure that captures |
04454e1e FG |
974 | `shoe_size` |
975 | ||
976 | The `shoes_in_size` function takes ownership of a vector of shoes and a shoe | |
977 | size as parameters. It returns a vector containing only shoes of the specified | |
978 | size. | |
979 | ||
980 | In the body of `shoes_in_size`, we call `into_iter` to create an iterator | |
981 | that takes ownership of the vector. Then we call `filter` to adapt that | |
982 | iterator into a new iterator that only contains elements for which the closure | |
983 | returns `true`. | |
984 | ||
985 | The closure captures the `shoe_size` parameter from the environment and | |
986 | compares the value with each shoe’s size, keeping only shoes of the size | |
987 | specified. Finally, calling `collect` gathers the values returned by the | |
988 | adapted iterator into a vector that’s returned by the function. | |
989 | ||
990 | The test shows that when we call `shoes_in_size`, we get back only shoes | |
991 | that have the same size as the value we specified. | |
992 | ||
993 | ## Improving Our I/O Project | |
994 | ||
995 | With this new knowledge about iterators, we can improve the I/O project in | |
996 | Chapter 12 by using iterators to make places in the code clearer and more | |
997 | concise. Let’s look at how iterators can improve our implementation of the | |
923072b8 | 998 | `Config::build` function and the `search` function. |
04454e1e FG |
999 | |
1000 | ### Removing a `clone` Using an Iterator | |
1001 | ||
1002 | In Listing 12-6, we added code that took a slice of `String` values and created | |
1003 | an instance of the `Config` struct by indexing into the slice and cloning the | |
923072b8 FG |
1004 | values, allowing the `Config` struct to own those values. In Listing 13-24, |
1005 | we’ve reproduced the implementation of the `Config::build` function as it was | |
1006 | in Listing 12-23: | |
04454e1e FG |
1007 | |
1008 | Filename: src/lib.rs | |
1009 | ||
1010 | ``` | |
1011 | impl Config { | |
923072b8 | 1012 | pub fn build(args: &[String]) -> Result<Config, &'static str> { |
04454e1e FG |
1013 | if args.len() < 3 { |
1014 | return Err("not enough arguments"); | |
1015 | } | |
1016 | ||
1017 | let query = args[1].clone(); | |
923072b8 | 1018 | let file_path = args[2].clone(); |
04454e1e | 1019 | |
923072b8 | 1020 | let ignore_case = env::var("IGNORE_CASE").is_ok(); |
04454e1e FG |
1021 | |
1022 | Ok(Config { | |
1023 | query, | |
923072b8 FG |
1024 | file_path, |
1025 | ignore_case, | |
04454e1e FG |
1026 | }) |
1027 | } | |
1028 | } | |
1029 | ``` | |
1030 | ||
923072b8 | 1031 | Listing 13-24: Reproduction of the `Config::build` function from Listing 12-23 |
04454e1e FG |
1032 | |
1033 | At the time, we said not to worry about the inefficient `clone` calls because | |
1034 | we would remove them in the future. Well, that time is now! | |
1035 | ||
1036 | We needed `clone` here because we have a slice with `String` elements in the | |
923072b8 | 1037 | parameter `args`, but the `build` function doesn’t own `args`. To return |
04454e1e FG |
1038 | ownership of a `Config` instance, we had to clone the values from the `query` |
1039 | and `filename` fields of `Config` so the `Config` instance can own its values. | |
1040 | ||
923072b8 | 1041 | With our new knowledge about iterators, we can change the `build` function to |
04454e1e FG |
1042 | take ownership of an iterator as its argument instead of borrowing a slice. |
1043 | We’ll use the iterator functionality instead of the code that checks the length | |
1044 | of the slice and indexes into specific locations. This will clarify what the | |
923072b8 | 1045 | `Config::build` function is doing because the iterator will access the values. |
04454e1e | 1046 | |
923072b8 | 1047 | Once `Config::build` takes ownership of the iterator and stops using indexing |
04454e1e FG |
1048 | operations that borrow, we can move the `String` values from the iterator into |
1049 | `Config` rather than calling `clone` and making a new allocation. | |
1050 | ||
1051 | #### Using the Returned Iterator Directly | |
1052 | ||
1053 | Open your I/O project’s *src/main.rs* file, which should look like this: | |
1054 | ||
1055 | Filename: src/main.rs | |
1056 | ||
1057 | ``` | |
1058 | fn main() { | |
1059 | let args: Vec<String> = env::args().collect(); | |
1060 | ||
923072b8 FG |
1061 | let config = Config::build(&args).unwrap_or_else(|err| { |
1062 | eprintln!("Problem parsing arguments: {err}"); | |
04454e1e FG |
1063 | process::exit(1); |
1064 | }); | |
1065 | ||
1066 | // --snip-- | |
1067 | } | |
1068 | ``` | |
1069 | ||
923072b8 FG |
1070 | We’ll first change the start of the `main` function that we had in Listing |
1071 | 12-24 to the code in Listing 13-25, which this time uses an iterator. This | |
1072 | won’t compile until we update `Config::build` as well. | |
04454e1e FG |
1073 | |
1074 | Filename: src/main.rs | |
1075 | ||
1076 | ``` | |
1077 | fn main() { | |
923072b8 FG |
1078 | let config = Config::build(env::args()).unwrap_or_else(|err| { |
1079 | eprintln!("Problem parsing arguments: {err}"); | |
04454e1e FG |
1080 | process::exit(1); |
1081 | }); | |
1082 | ||
1083 | // --snip-- | |
1084 | } | |
1085 | ``` | |
1086 | ||
923072b8 | 1087 | Listing 13-25: Passing the return value of `env::args` to `Config::build` |
04454e1e FG |
1088 | |
1089 | The `env::args` function returns an iterator! Rather than collecting the | |
923072b8 | 1090 | iterator values into a vector and then passing a slice to `Config::build`, now |
04454e1e | 1091 | we’re passing ownership of the iterator returned from `env::args` to |
923072b8 | 1092 | `Config::build` directly. |
04454e1e | 1093 | |
923072b8 FG |
1094 | Next, we need to update the definition of `Config::build`. In your I/O |
1095 | project’s *src/lib.rs* file, let’s change the signature of `Config::build` to | |
1096 | look like Listing 13-26. This still won’t compile because we need to update the | |
1097 | function body. | |
04454e1e FG |
1098 | |
1099 | Filename: src/lib.rs | |
1100 | ||
1101 | ``` | |
1102 | impl Config { | |
923072b8 | 1103 | pub fn build( |
04454e1e FG |
1104 | mut args: impl Iterator<Item = String>, |
1105 | ) -> Result<Config, &'static str> { | |
1106 | // --snip-- | |
1107 | ``` | |
1108 | ||
923072b8 | 1109 | Listing 13-26: Updating the signature of `Config::build` to expect an iterator |
04454e1e FG |
1110 | |
1111 | The standard library documentation for the `env::args` function shows that the | |
1112 | type of the iterator it returns is `std::env::Args`, and that type implements | |
1113 | the `Iterator` trait and returns `String` values. | |
1114 | ||
923072b8 | 1115 | We’ve updated the signature of the `Config::build` function so the parameter |
04454e1e FG |
1116 | `args` has a generic type with the trait bounds `impl Iterator<Item = String>` |
1117 | instead of `&[String]`. This usage of the `impl Trait` syntax we discussed in | |
923072b8 FG |
1118 | the “Traits as Parameters” section of Chapter 10 means that `args` can be any |
1119 | type that implements the `Iterator` type and returns `String` items. | |
04454e1e FG |
1120 | |
1121 | Because we’re taking ownership of `args` and we’ll be mutating `args` by | |
1122 | iterating over it, we can add the `mut` keyword into the specification of the | |
1123 | `args` parameter to make it mutable. | |
1124 | ||
1125 | #### Using `Iterator` Trait Methods Instead of Indexing | |
1126 | ||
923072b8 FG |
1127 | Next, we’ll fix the body of `Config::build`. Because `args` implements the |
1128 | `Iterator` trait, we know we can call the `next` method on it! Listing 13-27 | |
04454e1e FG |
1129 | updates the code from Listing 12-23 to use the `next` method: |
1130 | ||
1131 | Filename: src/lib.rs | |
1132 | ||
1133 | ``` | |
1134 | impl Config { | |
923072b8 | 1135 | pub fn build( |
04454e1e FG |
1136 | mut args: impl Iterator<Item = String>, |
1137 | ) -> Result<Config, &'static str> { | |
1138 | args.next(); | |
1139 | ||
1140 | let query = match args.next() { | |
1141 | Some(arg) => arg, | |
1142 | None => return Err("Didn't get a query string"), | |
1143 | }; | |
1144 | ||
923072b8 | 1145 | let file_path = match args.next() { |
04454e1e | 1146 | Some(arg) => arg, |
923072b8 | 1147 | None => return Err("Didn't get a file path"), |
04454e1e FG |
1148 | }; |
1149 | ||
923072b8 | 1150 | let ignore_case = env::var("IGNORE_CASE").is_ok(); |
04454e1e FG |
1151 | |
1152 | Ok(Config { | |
1153 | query, | |
923072b8 FG |
1154 | file_path, |
1155 | ignore_case, | |
04454e1e FG |
1156 | }) |
1157 | } | |
1158 | } | |
1159 | ``` | |
1160 | ||
923072b8 | 1161 | Listing 13-27: Changing the body of `Config::build` to use iterator methods |
04454e1e FG |
1162 | |
1163 | Remember that the first value in the return value of `env::args` is the name of | |
1164 | the program. We want to ignore that and get to the next value, so first we call | |
1165 | `next` and do nothing with the return value. Second, we call `next` to get the | |
1166 | value we want to put in the `query` field of `Config`. If `next` returns a | |
1167 | `Some`, we use a `match` to extract the value. If it returns `None`, it means | |
1168 | not enough arguments were given and we return early with an `Err` value. We do | |
1169 | the same thing for the `filename` value. | |
1170 | ||
1171 | ### Making Code Clearer with Iterator Adaptors | |
1172 | ||
1173 | We can also take advantage of iterators in the `search` function in our I/O | |
923072b8 | 1174 | project, which is reproduced here in Listing 13-28 as it was in Listing 12-19: |
04454e1e FG |
1175 | |
1176 | Filename: src/lib.rs | |
1177 | ||
1178 | ``` | |
1179 | pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> { | |
1180 | let mut results = Vec::new(); | |
1181 | ||
1182 | for line in contents.lines() { | |
1183 | if line.contains(query) { | |
1184 | results.push(line); | |
1185 | } | |
1186 | } | |
1187 | ||
1188 | results | |
1189 | } | |
1190 | ``` | |
1191 | ||
923072b8 | 1192 | Listing 13-28: The implementation of the `search` function from Listing 12-19 |
04454e1e FG |
1193 | |
1194 | We can write this code in a more concise way using iterator adaptor methods. | |
1195 | Doing so also lets us avoid having a mutable intermediate `results` vector. The | |
1196 | functional programming style prefers to minimize the amount of mutable state to | |
1197 | make code clearer. Removing the mutable state might enable a future enhancement | |
1198 | to make searching happen in parallel, because we wouldn’t have to manage | |
923072b8 | 1199 | concurrent access to the `results` vector. Listing 13-29 shows this change: |
04454e1e FG |
1200 | |
1201 | Filename: src/lib.rs | |
1202 | ||
1203 | ``` | |
1204 | pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> { | |
1205 | contents | |
1206 | .lines() | |
1207 | .filter(|line| line.contains(query)) | |
1208 | .collect() | |
1209 | } | |
1210 | ``` | |
1211 | ||
923072b8 | 1212 | Listing 13-29: Using iterator adaptor methods in the implementation of the |
04454e1e FG |
1213 | `search` function |
1214 | ||
1215 | Recall that the purpose of the `search` function is to return all lines in | |
1216 | `contents` that contain the `query`. Similar to the `filter` example in Listing | |
923072b8 | 1217 | 13-19, this code uses the `filter` adaptor to keep only the lines that |
04454e1e FG |
1218 | `line.contains(query)` returns `true` for. We then collect the matching lines |
1219 | into another vector with `collect`. Much simpler! Feel free to make the same | |
1220 | change to use iterator methods in the `search_case_insensitive` function as | |
1221 | well. | |
1222 | ||
923072b8 FG |
1223 | ### Choosing Between Loops or Iterators |
1224 | ||
04454e1e | 1225 | The next logical question is which style you should choose in your own code and |
923072b8 FG |
1226 | why: the original implementation in Listing 13-28 or the version using |
1227 | iterators in Listing 13-29. Most Rust programmers prefer to use the iterator | |
04454e1e FG |
1228 | style. It’s a bit tougher to get the hang of at first, but once you get a feel |
1229 | for the various iterator adaptors and what they do, iterators can be easier to | |
1230 | understand. Instead of fiddling with the various bits of looping and building | |
1231 | new vectors, the code focuses on the high-level objective of the loop. This | |
1232 | abstracts away some of the commonplace code so it’s easier to see the concepts | |
1233 | that are unique to this code, such as the filtering condition each element in | |
1234 | the iterator must pass. | |
1235 | ||
1236 | But are the two implementations truly equivalent? The intuitive assumption | |
1237 | might be that the more low-level loop will be faster. Let’s talk about | |
1238 | performance. | |
1239 | ||
1240 | ## Comparing Performance: Loops vs. Iterators | |
1241 | ||
1242 | To determine whether to use loops or iterators, you need to know which | |
1243 | implementation is faster: the version of the `search` function with an explicit | |
1244 | `for` loop or the version with iterators. | |
1245 | ||
1246 | We ran a benchmark by loading the entire contents of *The Adventures of | |
1247 | Sherlock Holmes* by Sir Arthur Conan Doyle into a `String` and looking for the | |
1248 | word *the* in the contents. Here are the results of the benchmark on the | |
1249 | version of `search` using the `for` loop and the version using iterators: | |
1250 | ||
1251 | ``` | |
1252 | test bench_search_for ... bench: 19,620,300 ns/iter (+/- 915,700) | |
1253 | test bench_search_iter ... bench: 19,234,900 ns/iter (+/- 657,200) | |
1254 | ``` | |
1255 | ||
1256 | The iterator version was slightly faster! We won’t explain the benchmark code | |
1257 | here, because the point is not to prove that the two versions are equivalent | |
1258 | but to get a general sense of how these two implementations compare | |
1259 | performance-wise. | |
1260 | ||
1261 | For a more comprehensive benchmark, you should check using various texts of | |
1262 | various sizes as the `contents`, different words and words of different lengths | |
1263 | as the `query`, and all kinds of other variations. The point is this: | |
1264 | iterators, although a high-level abstraction, get compiled down to roughly the | |
1265 | same code as if you’d written the lower-level code yourself. Iterators are one | |
1266 | of Rust’s *zero-cost abstractions*, by which we mean using the abstraction | |
1267 | imposes no additional runtime overhead. This is analogous to how Bjarne | |
1268 | Stroustrup, the original designer and implementor of C++, defines | |
1269 | *zero-overhead* in “Foundations of C++” (2012): | |
1270 | ||
1271 | > In general, C++ implementations obey the zero-overhead principle: What you | |
1272 | > don’t use, you don’t pay for. And further: What you do use, you couldn’t hand | |
1273 | > code any better. | |
1274 | ||
1275 | As another example, the following code is taken from an audio decoder. The | |
1276 | decoding algorithm uses the linear prediction mathematical operation to | |
1277 | estimate future values based on a linear function of the previous samples. This | |
1278 | code uses an iterator chain to do some math on three variables in scope: a | |
1279 | `buffer` slice of data, an array of 12 `coefficients`, and an amount by which | |
1280 | to shift data in `qlp_shift`. We’ve declared the variables within this example | |
1281 | but not given them any values; although this code doesn’t have much meaning | |
1282 | outside of its context, it’s still a concise, real-world example of how Rust | |
1283 | translates high-level ideas to low-level code. | |
1284 | ||
1285 | ``` | |
1286 | let buffer: &mut [i32]; | |
1287 | let coefficients: [i64; 12]; | |
1288 | let qlp_shift: i16; | |
1289 | ||
1290 | for i in 12..buffer.len() { | |
1291 | let prediction = coefficients.iter() | |
1292 | .zip(&buffer[i - 12..i]) | |
1293 | .map(|(&c, &s)| c * s as i64) | |
1294 | .sum::<i64>() >> qlp_shift; | |
1295 | let delta = buffer[i]; | |
1296 | buffer[i] = prediction as i32 + delta; | |
1297 | } | |
1298 | ``` | |
1299 | ||
1300 | To calculate the value of `prediction`, this code iterates through each of the | |
1301 | 12 values in `coefficients` and uses the `zip` method to pair the coefficient | |
1302 | values with the previous 12 values in `buffer`. Then, for each pair, we | |
1303 | multiply the values together, sum all the results, and shift the bits in the | |
1304 | sum `qlp_shift` bits to the right. | |
1305 | ||
1306 | Calculations in applications like audio decoders often prioritize performance | |
1307 | most highly. Here, we’re creating an iterator, using two adaptors, and then | |
1308 | consuming the value. What assembly code would this Rust code compile to? Well, | |
1309 | as of this writing, it compiles down to the same assembly you’d write by hand. | |
1310 | There’s no loop at all corresponding to the iteration over the values in | |
1311 | `coefficients`: Rust knows that there are 12 iterations, so it “unrolls” the | |
1312 | loop. *Unrolling* is an optimization that removes the overhead of the loop | |
1313 | controlling code and instead generates repetitive code for each iteration of | |
1314 | the loop. | |
1315 | ||
1316 | All of the coefficients get stored in registers, which means accessing the | |
1317 | values is very fast. There are no bounds checks on the array access at runtime. | |
1318 | All these optimizations that Rust is able to apply make the resulting code | |
1319 | extremely efficient. Now that you know this, you can use iterators and closures | |
1320 | without fear! They make code seem like it’s higher level but don’t impose a | |
1321 | runtime performance penalty for doing so. | |
1322 | ||
1323 | ## Summary | |
1324 | ||
1325 | Closures and iterators are Rust features inspired by functional programming | |
1326 | language ideas. They contribute to Rust’s capability to clearly express | |
1327 | high-level ideas at low-level performance. The implementations of closures and | |
1328 | iterators are such that runtime performance is not affected. This is part of | |
1329 | Rust’s goal to strive to provide zero-cost abstractions. | |
1330 | ||
1331 | Now that we’ve improved the expressiveness of our I/O project, let’s look at | |
1332 | some more features of `cargo` that will help us share the project with the | |
1333 | world. |