## Improving our I/O Project

We can improve our implementation of the I/O project in Chapter 12 by using
iterators to make places in the code clearer and more concise. Let’s take a
look at how iterators can improve our implementation of both the `Config::new`
function and the `search` function.

### Removing a `clone` Using an Iterator

In Listing 12-6, we added code that took a slice of `String` values and created
an instance of the `Config` struct by indexing into the slice and cloning the
values so that the `Config` struct could own those values. We’ve reproduced the
implementation of the `Config::new` function as it was at the end of Chapter 12
in Listing 13-24:

<span class="filename">Filename: src/lib.rs</span>

```rust,ignore
impl Config {
    pub fn new(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }

        let query = args[1].clone();
        let filename = args[2].clone();

        let case_sensitive = env::var("CASE_INSENSITIVE").is_err();

        Ok(Config { query, filename, case_sensitive })
    }
}
```

<span class="caption">Listing 13-24: Reproduction of the `Config::new` function
from the end of Chapter 12</span>

At the time, we said not to worry about the inefficient `clone` calls here
because we would remove them in the future. Well, that time is now!

The reason we needed `clone` here in the first place is that we have a slice
with `String` elements in the parameter `args`, but the `new` function does not
own `args`. In order to be able to return ownership of a `Config` instance, we
need to clone the values that we put in the `query` and `filename` fields of
`Config`, so that the `Config` instance can own its values.
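
To make the ownership problem concrete, here is a minimal sketch. The helper
`owned_copy` is a hypothetical name used only for illustration and is not part
of the I/O project. It shows that cloning out of a borrowed slice yields a
separate `String` with its own allocation, while the caller’s vector is left
untouched:

```rust
// Hypothetical helper for illustration only; not part of the project code.
// Returns an owned copy of the element at `index`, or `None` if the slice
// is too short.
fn owned_copy(args: &[String], index: usize) -> Option<String> {
    // Writing `let s: String = args[index];` here would not compile: it would
    // move a `String` out of a slice that this function only borrows.
    // Cloning allocates a new `String` that the caller can own.
    args.get(index).cloned()
}
```

The copy and the original compare equal but live in different heap buffers,
which is the extra allocation we would like to avoid.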

With our new knowledge about iterators, we can change the `new` function to
take ownership of an iterator as its argument instead of borrowing a slice.
We’ll use the iterator functionality instead of the code we had that checks the
length of the slice and indexes into specific locations. This will clear up
what the `Config::new` function is doing since the iterator will take care of
accessing the values.

Once `Config::new` takes ownership of the iterator and stops using indexing
operations that borrow, we can move the `String` values from the iterator into
`Config` rather than calling `clone` and making a new allocation.
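
The following sketch illustrates that difference. It uses a hypothetical
helper, `take_second`, that is not part of the project: the function takes
ownership of any iterator over `String` values and moves one item out of it
instead of cloning it.

```rust
// Hypothetical helper for illustration only; not part of the project code.
// Takes ownership of an iterator of `String`s and returns the second item.
fn take_second<I: Iterator<Item = String>>(mut items: I) -> Option<String> {
    // Discard the first item.
    items.next();
    // Move the second item out to the caller; no `clone` is involved.
    items.next()
}
```

If you pass in `some_vec.into_iter()`, the returned `String` reuses the heap
buffer that the vector’s element already owned, so no new allocation is made
for the text itself.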

#### Using the Iterator Returned by `env::args` Directly

In your I/O project’s *src/main.rs*, let’s change the start of the `main`
function from this code that we had at the end of Chapter 12:

```rust,ignore
fn main() {
    let args: Vec<String> = env::args().collect();

    let config = Config::new(&args).unwrap_or_else(|err| {
        eprintln!("Problem parsing arguments: {}", err);
        process::exit(1);
    });

    // ...snip...
}
```

To the code in Listing 13-25:

<span class="filename">Filename: src/main.rs</span>

```rust,ignore
fn main() {
    let config = Config::new(env::args()).unwrap_or_else(|err| {
        eprintln!("Problem parsing arguments: {}", err);
        process::exit(1);
    });

    // ...snip...
}
```

<span class="caption">Listing 13-25: Passing the return value of `env::args` to
`Config::new`</span>

The `env::args` function returns an iterator! Rather than collecting the
iterator values into a vector and then passing a slice to `Config::new`, now
we’re passing ownership of the iterator returned from `env::args` to
`Config::new` directly.

Next, we need to update the definition of `Config::new`. In your I/O project’s
*src/lib.rs*, let’s change the signature of `Config::new` to look like Listing
13-26:

<span class="filename">Filename: src/lib.rs</span>

```rust,ignore
impl Config {
    pub fn new(args: std::env::Args) -> Result<Config, &'static str> {
        // ...snip...
```

<span class="caption">Listing 13-26: Updating the signature of `Config::new` to
expect an iterator</span>

The standard library documentation for the `env::args` function shows that the
type of the iterator it returns is `std::env::Args`. We’ve updated the
signature of the `Config::new` function so that the parameter `args` has the
type `std::env::Args` instead of `&[String]`.

#### Using `Iterator` Trait Methods Instead of Indexing

Next, we’ll fix the body of `Config::new`. The standard library documentation
also mentions that `std::env::Args` implements the `Iterator` trait, so we know
we can call the `next` method on it! Listing 13-27 has updated the code
from Listing 12-23 to use the `next` method:

<span class="filename">Filename: src/lib.rs</span>

```rust
# use std::env;
#
# struct Config {
#     query: String,
#     filename: String,
#     case_sensitive: bool,
# }
#
impl Config {
    pub fn new(mut args: std::env::Args) -> Result<Config, &'static str> {
        args.next();

        let query = match args.next() {
            Some(arg) => arg,
            None => return Err("Didn't get a query string"),
        };

        let filename = match args.next() {
            Some(arg) => arg,
            None => return Err("Didn't get a file name"),
        };

        let case_sensitive = env::var("CASE_INSENSITIVE").is_err();

        Ok(Config {
            query, filename, case_sensitive
        })
    }
}
```

<span class="caption">Listing 13-27: Changing the body of `Config::new` to use
iterator methods</span>

Remember that the first value in the return value of `env::args` is the name of
the program. We want to ignore that and get to the next value, so first we call
`next` and do nothing with the return value. Second, we call `next` to get the
value we want to put in the `query` field of `Config`. If `next` returns a
`Some`, we use a `match` to extract the value. If it returns `None`, it means
not enough arguments were given and we return early with an `Err` value. We do
the same thing for the `filename` value.
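
One way to experiment with this logic without launching the program from a
command line is to write the same steps against any iterator of `String`
values. The sketch below is a variation built on stated assumptions, not the
listing above: `parse_args` is a hypothetical free function, the `Config` here
omits `case_sensitive` so the example doesn’t depend on the environment, and
the generic parameter `I` stands in for `std::env::Args`.

```rust
// Simplified stand-in for the project's `Config`; `case_sensitive` is left
// out on purpose so the sketch does not read environment variables.
#[derive(Debug, PartialEq)]
struct Config {
    query: String,
    filename: String,
}

// Hypothetical function mirroring the steps of `Config::new`, but generic
// over any iterator that yields `String` values.
fn parse_args<I>(mut args: I) -> Result<Config, &'static str>
where
    I: Iterator<Item = String>,
{
    // The first value is the program name; skip it.
    args.next();

    // Each later `next` call either moves a `String` out or signals that
    // the arguments ran out.
    let query = match args.next() {
        Some(arg) => arg,
        None => return Err("Didn't get a query string"),
    };

    let filename = match args.next() {
        Some(arg) => arg,
        None => return Err("Didn't get a file name"),
    };

    Ok(Config { query, filename })
}
```

Because `std::env::Args` implements `Iterator` with `String` items, calling
`parse_args(std::env::args())` would follow the same steps, and a vector of
strings turned into an iterator can exercise the error paths directly.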

### Making Code Clearer with Iterator Adaptors

The other place in our I/O project we could take advantage of iterators is in
the `search` function, reproduced here in Listing 13-28 as it was at the end of
Chapter 12:

<span class="filename">Filename: src/lib.rs</span>

```rust,ignore
pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    let mut results = Vec::new();

    for line in contents.lines() {
        if line.contains(query) {
            results.push(line);
        }
    }

    results
}
```

<span class="caption">Listing 13-28: The implementation of the `search`
function from Chapter 12</span>

We can write this code in a much shorter way by using iterator adaptor methods
instead. This also lets us avoid having a mutable intermediate `results`
vector. The functional programming style prefers to minimize the amount of
mutable state to make code clearer. Removing the mutable state might make it
easier for us to make a future enhancement to make searching happen in
parallel, since we wouldn’t have to manage concurrent access to the `results`
vector. Listing 13-29 shows this change:

<span class="filename">Filename: src/lib.rs</span>

```rust,ignore
pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    contents.lines()
        .filter(|line| line.contains(query))
        .collect()
}
```

<span class="caption">Listing 13-29: Using iterator adaptor methods in the
implementation of the `search` function</span>

Recall that the purpose of the `search` function is to return all lines in
`contents` that contain the `query`. Similarly to the `filter` example in
Listing 13-19, we can use the `filter` adaptor to keep only the lines for which
`line.contains(query)` returns `true`. We then collect the matching lines up
into another vector with `collect`. Much simpler! Feel free to make the same
change to use iterator methods in the `search_case_insensitive` function as
well.
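
As one possible answer to that exercise, here is a sketch of
`search_case_insensitive` written with the same adaptors. It assumes the
Chapter 12 behavior of lowercasing both the query and each line before
comparing; your own version from Chapter 12 may differ in its details.

```rust
pub fn search_case_insensitive<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    // Lowercase the query once, outside the closure.
    let query = query.to_lowercase();

    contents.lines()
        // Keep a line only if its lowercased form contains the lowercased
        // query; the returned slices still point into `contents`.
        .filter(|line| line.to_lowercase().contains(&query))
        .collect()
}
```

Note that the closure lowercases a temporary copy of each line for the
comparison only, so the lines returned keep their original casing.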

The next logical question is which style you should choose in your own code:
the original implementation in Listing 13-28, or the version using iterators in
Listing 13-29. Most Rust programmers prefer to use the iterator style. It’s a
bit tougher to get the hang of at first, but once you get a feel for the
various iterator adaptors and what they do, iterators can be easier to
understand. Instead of fiddling with the various bits of looping and building
new vectors, the code focuses on the high-level objective of the loop. This
abstracts away some of the commonplace code so that it’s easier to see the
concepts that are unique to this code, like the filtering condition each
element in the iterator must pass.

But are the two implementations truly equivalent? The intuitive assumption
might be that the more low-level loop will be faster. Let’s talk about
performance.