## Refactoring to Improve Modularity and Error Handling
2
3 To improve our program, we’ll fix four problems that have to do with the
4 program’s structure and how it’s handling potential errors. First, our `main`
5 function now performs two tasks: it parses arguments and reads files. As our
6 program grows, the number of separate tasks the `main` function handles will
7 increase. As a function gains responsibilities, it becomes more difficult to
8 reason about, harder to test, and harder to change without breaking one of its
9 parts. It’s best to separate functionality so each function is responsible for
10 one task.
11
12 This issue also ties into the second problem: although `query` and `file_path`
13 are configuration variables to our program, variables like `contents` are used
14 to perform the program’s logic. The longer `main` becomes, the more variables
15 we’ll need to bring into scope; the more variables we have in scope, the harder
16 it will be to keep track of the purpose of each. It’s best to group the
17 configuration variables into one structure to make their purpose clear.
18
19 The third problem is that we’ve used `expect` to print an error message when
20 reading the file fails, but the error message just prints `Should have been
21 able to read the file`. Reading a file can fail in a number of ways: for
22 example, the file could be missing, or we might not have permission to open it.
23 Right now, regardless of the situation, we’d print the same error message for
24 everything, which wouldn’t give the user any information!
25
Fourth, we use `expect` to handle an error, and if the user runs our program
without specifying enough arguments, they’ll get an `index out of bounds` error
from Rust that doesn’t clearly explain the problem. It would be best if all the
error-handling code were in one place so that future maintainers have only one
place to consult if the error-handling logic needs to change. Having all the
error-handling code in one place will also help ensure that we print messages
that are meaningful to our end users.
33
34 Let’s address these four problems by refactoring our project.
35
### Separation of Concerns for Binary Projects
37
38 The organizational problem of allocating responsibility for multiple tasks to
39 the `main` function is common to many binary projects. As a result, the Rust
40 community has developed guidelines for splitting the separate concerns of a
41 binary program when `main` starts getting large. This process has the following
42 steps:
43
44 * Split your program into a *main.rs* and a *lib.rs* and move your program’s
45 logic to *lib.rs*.
46 * As long as your command line parsing logic is small, it can remain in
47 *main.rs*.
48 * When the command line parsing logic starts getting complicated, extract it
49 from *main.rs* and move it to *lib.rs*.
50
51 The responsibilities that remain in the `main` function after this process
52 should be limited to the following:
53
54 * Calling the command line parsing logic with the argument values
55 * Setting up any other configuration
56 * Calling a `run` function in *lib.rs*
57 * Handling the error if `run` returns an error
58
59 This pattern is about separating concerns: *main.rs* handles running the
60 program, and *lib.rs* handles all the logic of the task at hand. Because you
61 can’t test the `main` function directly, this structure lets you test all of
62 your program’s logic by moving it into functions in *lib.rs*. The code that
63 remains in *main.rs* will be small enough to verify its correctness by reading
64 it. Let’s rework our program by following this process.
65
#### Extracting the Argument Parser
67
68 We’ll extract the functionality for parsing arguments into a function that
69 `main` will call to prepare for moving the command line parsing logic to
70 *src/lib.rs*. Listing 12-5 shows the new start of `main` that calls a new
71 function `parse_config`, which we’ll define in *src/main.rs* for the moment.
72
73 <span class="filename">Filename: src/main.rs</span>
74
75 ```rust,ignore
76 {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-05/src/main.rs:here}}
77 ```
78
79 <span class="caption">Listing 12-5: Extracting a `parse_config` function from
80 `main`</span>
81
82 We’re still collecting the command line arguments into a vector, but instead of
83 assigning the argument value at index 1 to the variable `query` and the
84 argument value at index 2 to the variable `file_path` within the `main`
85 function, we pass the whole vector to the `parse_config` function. The
86 `parse_config` function then holds the logic that determines which argument
87 goes in which variable and passes the values back to `main`. We still create
88 the `query` and `file_path` variables in `main`, but `main` no longer has the
89 responsibility of determining how the command line arguments and variables
90 correspond.
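
As a rough sketch of this step (not necessarily the exact listing), a `parse_config` function along these lines takes the whole argument slice and hands the two values back as a tuple of string slices. Index 0 is skipped because it holds the program name.

```rust
/// Picks the query and file path out of the raw command line arguments.
/// `args[0]` is the program name, so the user-supplied values start at index 1.
/// Note: indexing still panics if fewer than three arguments are present.
fn parse_config(args: &[String]) -> (&str, &str) {
    let query = &args[1];
    let file_path = &args[2];

    (query, file_path)
}
```

In `main`, the result would then be destructured with something like `let (query, file_path) = parse_config(&args);`.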
91
92 This rework may seem like overkill for our small program, but we’re refactoring
93 in small, incremental steps. After making this change, run the program again to
94 verify that the argument parsing still works. It’s good to check your progress
95 often, to help identify the cause of problems when they occur.
96
#### Grouping Configuration Values
98
99 We can take another small step to improve the `parse_config` function further.
100 At the moment, we’re returning a tuple, but then we immediately break that
101 tuple into individual parts again. This is a sign that perhaps we don’t have
102 the right abstraction yet.
103
104 Another indicator that shows there’s room for improvement is the `config` part
105 of `parse_config`, which implies that the two values we return are related and
106 are both part of one configuration value. We’re not currently conveying this
107 meaning in the structure of the data other than by grouping the two values into
108 a tuple; we’ll instead put the two values into one struct and give each of the
109 struct fields a meaningful name. Doing so will make it easier for future
110 maintainers of this code to understand how the different values relate to each
111 other and what their purpose is.
112
113 Listing 12-6 shows the improvements to the `parse_config` function.
114
115 <span class="filename">Filename: src/main.rs</span>
116
117 ```rust,should_panic,noplayground
118 {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-06/src/main.rs:here}}
119 ```
120
121 <span class="caption">Listing 12-6: Refactoring `parse_config` to return an
122 instance of a `Config` struct</span>
123
We’ve added a struct named `Config` with fields named `query` and `file_path`.
The signature of `parse_config` now indicates that it returns a `Config` value.
In the body of `parse_config`, where we used to return string slices that
reference `String` values in `args`, we now build a `Config` that contains
owned `String` values. The `args` variable in `main` is the owner of the
argument values and is only letting the `parse_config` function borrow them,
which means we’d violate Rust’s borrowing rules if `Config` tried to take
ownership of the values in `args`.
132
133 There are a number of ways we could manage the `String` data; the easiest,
134 though somewhat inefficient, route is to call the `clone` method on the values.
135 This will make a full copy of the data for the `Config` instance to own, which
136 takes more time and memory than storing a reference to the string data.
137 However, cloning the data also makes our code very straightforward because we
138 don’t have to manage the lifetimes of the references; in this circumstance,
139 giving up a little performance to gain simplicity is a worthwhile trade-off.
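
A minimal sketch of this version, assuming the field names described above, might look like the following. The `clone` calls are what give `Config` its own copies of the strings while `args` stays owned by the caller.

```rust
/// Groups the two configuration values under meaningful names.
struct Config {
    query: String,
    file_path: String,
}

fn parse_config(args: &[String]) -> Config {
    // `args` is only borrowed, so we clone to give `Config` owned copies.
    let query = args[1].clone();
    let file_path = args[2].clone();

    Config { query, file_path }
}
```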
140
141 > ### The Trade-Offs of Using `clone`
142 >
143 > There’s a tendency among many Rustaceans to avoid using `clone` to fix
144 > ownership problems because of its runtime cost. In
145 > [Chapter 13][ch13]<!-- ignore -->, you’ll learn how to use more efficient
146 > methods in this type of situation. But for now, it’s okay to copy a few
147 > strings to continue making progress because you’ll make these copies only
148 > once and your file path and query string are very small. It’s better to have
149 > a working program that’s a bit inefficient than to try to hyperoptimize code
150 > on your first pass. As you become more experienced with Rust, it’ll be
151 > easier to start with the most efficient solution, but for now, it’s
152 > perfectly acceptable to call `clone`.
153
154 We’ve updated `main` so it places the instance of `Config` returned by
155 `parse_config` into a variable named `config`, and we updated the code that
156 previously used the separate `query` and `file_path` variables so it now uses
157 the fields on the `Config` struct instead.
158
159 Now our code more clearly conveys that `query` and `file_path` are related and
160 that their purpose is to configure how the program will work. Any code that
161 uses these values knows to find them in the `config` instance in the fields
162 named for their purpose.
163
#### Creating a Constructor for `Config`
165
So far, we’ve extracted the logic responsible for parsing the command line
arguments from `main` and placed it in the `parse_config` function. Doing so
helped us to see that the `query` and `file_path` values were related and that
relationship should be conveyed in our code. We then added a `Config` struct to
name the shared purpose of `query` and `file_path` and to return the values
from `parse_config` under meaningful struct field names.
172
173 So now that the purpose of the `parse_config` function is to create a `Config`
174 instance, we can change `parse_config` from a plain function to a function
175 named `new` that is associated with the `Config` struct. Making this change
176 will make the code more idiomatic. We can create instances of types in the
177 standard library, such as `String`, by calling `String::new`. Similarly, by
178 changing `parse_config` into a `new` function associated with `Config`, we’ll
179 be able to create instances of `Config` by calling `Config::new`. Listing 12-7
180 shows the changes we need to make.
181
182 <span class="filename">Filename: src/main.rs</span>
183
184 ```rust,should_panic,noplayground
185 {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-07/src/main.rs:here}}
186 ```
187
188 <span class="caption">Listing 12-7: Changing `parse_config` into
189 `Config::new`</span>
190
191 We’ve updated `main` where we were calling `parse_config` to instead call
192 `Config::new`. We’ve changed the name of `parse_config` to `new` and moved it
193 within an `impl` block, which associates the `new` function with `Config`. Try
194 compiling this code again to make sure it works.
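
Sketched out under the same assumptions as before, the change amounts to moving the function body into an `impl` block and renaming it:

```rust
struct Config {
    query: String,
    file_path: String,
}

impl Config {
    /// Associated function: called as `Config::new(&args)`, with no `self`.
    fn new(args: &[String]) -> Config {
        let query = args[1].clone();
        let file_path = args[2].clone();

        Config { query, file_path }
    }
}
```

The call site in `main` would change accordingly, from `parse_config(&args)` to `Config::new(&args)`.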
195
### Fixing the Error Handling
197
198 Now we’ll work on fixing our error handling. Recall that attempting to access
199 the values in the `args` vector at index 1 or index 2 will cause the program to
200 panic if the vector contains fewer than three items. Try running the program
201 without any arguments; it will look like this:
202
203 ```console
204 {{#include ../listings/ch12-an-io-project/listing-12-07/output.txt}}
205 ```
206
207 The line `index out of bounds: the len is 1 but the index is 1` is an error
208 message intended for programmers. It won’t help our end users understand what
209 they should do instead. Let’s fix that now.
210
#### Improving the Error Message
212
In Listing 12-8, we add a check in the `new` function that will verify that the
slice is long enough before accessing indexes 1 and 2. If the slice isn’t long
enough, the program panics and displays a better error message.
216
217 <span class="filename">Filename: src/main.rs</span>
218
219 ```rust,ignore
220 {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-08/src/main.rs:here}}
221 ```
222
223 <span class="caption">Listing 12-8: Adding a check for the number of
224 arguments</span>
225
This code is similar to [the `Guess::new` function we wrote in Listing
9-13][ch9-custom-types]<!-- ignore -->, where we called `panic!` when the
`value` argument was out of the range of valid values. Instead of checking for
a range of values here, we’re checking that `args` contains at least three
items, so the rest of the function can operate under the assumption that this
requirement has been met. If `args` has fewer than three items, the `if`
condition will be true, and we call the `panic!` macro to end the program
immediately.
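
A sketch of such a guard is shown below. The panic message text here is an assumption chosen for illustration; the point is that the length check runs before any indexing.

```rust
struct Config {
    query: String,
    file_path: String,
}

impl Config {
    fn new(args: &[String]) -> Config {
        // Guard first: program name + query + file path = three items.
        if args.len() < 3 {
            panic!("not enough arguments");
        }

        // Safe to index now that the length has been checked.
        let query = args[1].clone();
        let file_path = args[2].clone();

        Config { query, file_path }
    }
}
```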
233
234 With these extra few lines of code in `new`, let’s run the program without any
235 arguments again to see what the error looks like now:
236
237 ```console
238 {{#include ../listings/ch12-an-io-project/listing-12-08/output.txt}}
239 ```
240
This output is better: we now have a reasonable error message. However, we also
have extraneous information we don’t want to give to our users. Perhaps the
technique we used in Listing 9-13 isn’t the best one to use here: a call to
`panic!` is more appropriate for a programming problem than a usage problem,
[as discussed in Chapter 9][ch9-error-guidelines]<!-- ignore -->. Instead,
we’ll use the other technique you learned about in Chapter 9: [returning a
`Result`][ch9-result]<!-- ignore --> that indicates either success or an error.
248
249 <!-- Old headings. Do not remove or links may break. -->
250 <a id="returning-a-result-from-new-instead-of-calling-panic"></a>
251
#### Returning a `Result` Instead of Calling `panic!`
253
254 We can instead return a `Result` value that will contain a `Config` instance in
255 the successful case and will describe the problem in the error case. We’re also
256 going to change the function name from `new` to `build` because many
257 programmers expect `new` functions to never fail. When `Config::build` is
258 communicating to `main`, we can use the `Result` type to signal there was a
259 problem. Then we can change `main` to convert an `Err` variant into a more
260 practical error for our users without the surrounding text about `thread
261 'main'` and `RUST_BACKTRACE` that a call to `panic!` causes.
262
263 Listing 12-9 shows the changes we need to make to the return value of the
264 function we’re now calling `Config::build` and the body of the function needed
265 to return a `Result`. Note that this won’t compile until we update `main` as
266 well, which we’ll do in the next listing.
267
268 <span class="filename">Filename: src/main.rs</span>
269
270 ```rust,ignore,does_not_compile
271 {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-09/src/main.rs:here}}
272 ```
273
274 <span class="caption">Listing 12-9: Returning a `Result` from
275 `Config::build`</span>
276
277 Our `build` function now returns a `Result` with a `Config` instance in the
278 success case and a `&'static str` in the error case. Our error values will
279 always be string literals that have the `'static` lifetime.
280
281 We’ve made two changes in the body of the function: instead of calling `panic!`
282 when the user doesn’t pass enough arguments, we now return an `Err` value, and
283 we’ve wrapped the `Config` return value in an `Ok`. These changes make the
284 function conform to its new type signature.
285
286 Returning an `Err` value from `Config::build` allows the `main` function to
287 handle the `Result` value returned from the `build` function and exit the
288 process more cleanly in the error case.
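
Put together, a `build` function matching this description could be sketched as follows. The early `return Err(...)` replaces the `panic!` call, and the final expression wraps the `Config` in `Ok`.

```rust
struct Config {
    query: String,
    file_path: String,
}

impl Config {
    /// Returns the parsed configuration, or a static message describing the problem.
    fn build(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }

        let query = args[1].clone();
        let file_path = args[2].clone();

        Ok(Config { query, file_path })
    }
}
```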
289
290 <!-- Old headings. Do not remove or links may break. -->
291 <a id="calling-confignew-and-handling-errors"></a>
292
#### Calling `Config::build` and Handling Errors
294
295 To handle the error case and print a user-friendly message, we need to update
296 `main` to handle the `Result` being returned by `Config::build`, as shown in
297 Listing 12-10. We’ll also take the responsibility of exiting the command line
298 tool with a nonzero error code away from `panic!` and instead implement it by
299 hand. A nonzero exit status is a convention to signal to the process that
300 called our program that the program exited with an error state.
301
302 <span class="filename">Filename: src/main.rs</span>
303
304 ```rust,ignore
305 {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-10/src/main.rs:here}}
306 ```
307
308 <span class="caption">Listing 12-10: Exiting with an error code if building a
309 `Config` fails</span>
310
311 In this listing, we’ve used a method we haven’t covered in detail yet:
312 `unwrap_or_else`, which is defined on `Result<T, E>` by the standard library.
313 Using `unwrap_or_else` allows us to define some custom, non-`panic!` error
314 handling. If the `Result` is an `Ok` value, this method’s behavior is similar
315 to `unwrap`: it returns the inner value `Ok` is wrapping. However, if the value
316 is an `Err` value, this method calls the code in the *closure*, which is an
317 anonymous function we define and pass as an argument to `unwrap_or_else`. We’ll
318 cover closures in more detail in [Chapter 13][ch13]<!-- ignore -->. For now,
319 you just need to know that `unwrap_or_else` will pass the inner value of the
320 `Err`, which in this case is the static string `"not enough arguments"` that we
321 added in Listing 12-9, to our closure in the argument `err` that appears
322 between the vertical pipes. The code in the closure can then use the `err`
323 value when it runs.
324
325 We’ve added a new `use` line to bring `process` from the standard library into
326 scope. The code in the closure that will be run in the error case is only two
327 lines: we print the `err` value and then call `process::exit`. The
328 `process::exit` function will stop the program immediately and return the
329 number that was passed as the exit status code. This is similar to the
330 `panic!`-based handling we used in Listing 12-8, but we no longer get all the
331 extra output. Let’s try it:
332
333 ```console
334 {{#include ../listings/ch12-an-io-project/listing-12-10/output.txt}}
335 ```
336
337 Great! This output is much friendlier for our users.
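
To make the shape of this error handling concrete, here is a small sketch. The `config_or_exit` helper is hypothetical and exists only so the pattern can be shown outside `main`; in the program itself the same `unwrap_or_else` call would sit directly in `main`. The exact wording of the printed message is also an assumption.

```rust
use std::process;

struct Config {
    query: String,
    file_path: String,
}

impl Config {
    fn build(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }
        Ok(Config {
            query: args[1].clone(),
            file_path: args[2].clone(),
        })
    }
}

/// Hypothetical helper: returns the `Config` on success; on failure it
/// prints the error and ends the process with a nonzero status.
fn config_or_exit(args: &[String]) -> Config {
    Config::build(args).unwrap_or_else(|err| {
        // `err` is the value that was inside `Err`.
        println!("Problem parsing arguments: {err}");
        process::exit(1);
    })
}
```

Because `process::exit` never returns, the closure does not need to produce a `Config` on the error path.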
338
### Extracting Logic from `main`
340
341 Now that we’ve finished refactoring the configuration parsing, let’s turn to
342 the program’s logic. As we stated in [“Separation of Concerns for Binary
343 Projects”](#separation-of-concerns-for-binary-projects)<!-- ignore -->, we’ll
344 extract a function named `run` that will hold all the logic currently in the
345 `main` function that isn’t involved with setting up configuration or handling
346 errors. When we’re done, `main` will be concise and easy to verify by
347 inspection, and we’ll be able to write tests for all the other logic.
348
349 Listing 12-11 shows the extracted `run` function. For now, we’re just making
350 the small, incremental improvement of extracting the function. We’re still
351 defining the function in *src/main.rs*.
352
353 <span class="filename">Filename: src/main.rs</span>
354
355 ```rust,ignore
356 {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-11/src/main.rs:here}}
357 ```
358
359 <span class="caption">Listing 12-11: Extracting a `run` function containing the
360 rest of the program logic</span>
361
362 The `run` function now contains all the remaining logic from `main`, starting
363 from reading the file. The `run` function takes the `Config` instance as an
364 argument.
365
#### Returning Errors from the `run` Function
367
368 With the remaining program logic separated into the `run` function, we can
369 improve the error handling, as we did with `Config::build` in Listing 12-9.
370 Instead of allowing the program to panic by calling `expect`, the `run`
371 function will return a `Result<T, E>` when something goes wrong. This will let
372 us further consolidate the logic around handling errors into `main` in a
373 user-friendly way. Listing 12-12 shows the changes we need to make to the
374 signature and body of `run`.
375
376 <span class="filename">Filename: src/main.rs</span>
377
378 ```rust,ignore
379 {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-12/src/main.rs:here}}
380 ```
381
382 <span class="caption">Listing 12-12: Changing the `run` function to return
383 `Result`</span>
384
385 We’ve made three significant changes here. First, we changed the return type of
386 the `run` function to `Result<(), Box<dyn Error>>`. This function previously
387 returned the unit type, `()`, and we keep that as the value returned in the
388 `Ok` case.
389
390 For the error type, we used the *trait object* `Box<dyn Error>` (and we’ve
391 brought `std::error::Error` into scope with a `use` statement at the top).
392 We’ll cover trait objects in [Chapter 17][ch17]<!-- ignore -->. For now, just
393 know that `Box<dyn Error>` means the function will return a type that
394 implements the `Error` trait, but we don’t have to specify what particular type
395 the return value will be. This gives us flexibility to return error values that
396 may be of different types in different error cases. The `dyn` keyword is short
397 for “dynamic.”
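
As a small illustration of that flexibility, separate from `minigrep` itself, the hypothetical function below can fail with two unrelated error types, and both are returned through the same `Box<dyn Error>` by the `?` operator covered in Chapter 9.

```rust
use std::error::Error;

/// Hypothetical example: parses an integer and a float, then adds them.
/// The first `parse` can fail with `ParseIntError`, the second with
/// `ParseFloatError`; `?` boxes either one into `Box<dyn Error>`.
fn add_parsed(whole: &str, fractional: &str) -> Result<f64, Box<dyn Error>> {
    let n: i32 = whole.parse()?;
    let x: f64 = fractional.parse()?;

    Ok(f64::from(n) + x)
}
```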
398
399 Second, we’ve removed the call to `expect` in favor of the `?` operator, as we
400 talked about in [Chapter 9][ch9-question-mark]<!-- ignore -->. Rather than
401 `panic!` on an error, `?` will return the error value from the current function
402 for the caller to handle.
403
404 Third, the `run` function now returns an `Ok` value in the success case.
405 We’ve declared the `run` function’s success type as `()` in the signature,
406 which means we need to wrap the unit type value in the `Ok` value. This
407 `Ok(())` syntax might look a bit strange at first, but using `()` like this is
408 the idiomatic way to indicate that we’re calling `run` for its side effects
409 only; it doesn’t return a value we need.
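
Taken together, the three changes might look like this sketch. The printed text is an assumption; what matters is the return type, the `?` after `read_to_string`, and the trailing `Ok(())`.

```rust
use std::error::Error;
use std::fs;

struct Config {
    query: String,
    file_path: String,
}

fn run(config: Config) -> Result<(), Box<dyn Error>> {
    // On failure, `?` returns the I/O error to the caller instead of panicking.
    let contents = fs::read_to_string(config.file_path)?;

    println!("With text:\n{contents}");

    // Nothing useful to hand back, so the success value is the unit type.
    Ok(())
}
```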
410
411 When you run this code, it will compile but will display a warning:
412
413 ```console
414 {{#include ../listings/ch12-an-io-project/listing-12-12/output.txt}}
415 ```
416
417 Rust tells us that our code ignored the `Result` value and the `Result` value
418 might indicate that an error occurred. But we’re not checking to see whether or
419 not there was an error, and the compiler reminds us that we probably meant to
420 have some error-handling code here! Let’s rectify that problem now.
421
#### Handling Errors Returned from `run` in `main`
423
424 We’ll check for errors and handle them using a technique similar to one we used
425 with `Config::build` in Listing 12-10, but with a slight difference:
426
427 <span class="filename">Filename: src/main.rs</span>
428
429 ```rust,ignore
430 {{#rustdoc_include ../listings/ch12-an-io-project/no-listing-01-handling-errors-in-main/src/main.rs:here}}
431 ```
432
433 We use `if let` rather than `unwrap_or_else` to check whether `run` returns an
434 `Err` value and call `process::exit(1)` if it does. The `run` function doesn’t
435 return a value that we want to `unwrap` in the same way that `Config::build`
436 returns the `Config` instance. Because `run` returns `()` in the success case,
437 we only care about detecting an error, so we don’t need `unwrap_or_else` to
438 return the unwrapped value, which would only be `()`.
439
The bodies of the `if let` block and the `unwrap_or_else` closure are the same:
in both cases, we print the error and exit.
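
The check can be sketched as below. The `run_or_exit` wrapper is hypothetical and only stands in for the tail end of `main`; the message text is likewise an assumption.

```rust
use std::error::Error;
use std::fs;
use std::process;

struct Config {
    query: String,
    file_path: String,
}

fn run(config: Config) -> Result<(), Box<dyn Error>> {
    let contents = fs::read_to_string(config.file_path)?;
    println!("With text:\n{contents}");
    Ok(())
}

/// Hypothetical wrapper around the check that would sit at the end of `main`.
fn run_or_exit(config: Config) {
    // Only the error case matters: success carries `()`, so there is nothing to unwrap.
    if let Err(e) = run(config) {
        println!("Application error: {e}");
        process::exit(1);
    }
}
```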
442
### Splitting Code into a Library Crate
444
445 Our `minigrep` project is looking good so far! Now we’ll split the
446 *src/main.rs* file and put some code into the *src/lib.rs* file. That way we
447 can test the code and have a *src/main.rs* file with fewer responsibilities.
448
449 Let’s move all the code that isn’t the `main` function from *src/main.rs* to
450 *src/lib.rs*:
451
452 * The `run` function definition
453 * The relevant `use` statements
454 * The definition of `Config`
455 * The `Config::build` function definition
456
457 The contents of *src/lib.rs* should have the signatures shown in Listing 12-13
458 (we’ve omitted the bodies of the functions for brevity). Note that this won’t
459 compile until we modify *src/main.rs* in Listing 12-14.
460
461 <span class="filename">Filename: src/lib.rs</span>
462
463 ```rust,ignore,does_not_compile
464 {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-13/src/lib.rs:here}}
465 ```
466
467 <span class="caption">Listing 12-13: Moving `Config` and `run` into
468 *src/lib.rs*</span>
469
We’ve made liberal use of the `pub` keyword: on `Config`, on its fields and its
`build` function, and on the `run` function. We now have a library crate that
has a public API we can test!
473
474 Now we need to bring the code we moved to *src/lib.rs* into the scope of the
475 binary crate in *src/main.rs*, as shown in Listing 12-14.
476
477 <span class="filename">Filename: src/main.rs</span>
478
479 ```rust,ignore
480 {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-14/src/main.rs:here}}
481 ```
482
483 <span class="caption">Listing 12-14: Using the `minigrep` library crate in
484 *src/main.rs*</span>
485
486 We add a `use minigrep::Config` line to bring the `Config` type from the
487 library crate into the binary crate’s scope, and we prefix the `run` function
488 with our crate name. Now all the functionality should be connected and should
489 work. Run the program with `cargo run` and make sure everything works
490 correctly.
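
The effect of the `pub` keywords can be tried out in a single file. In the sketch below, an inline module named `minigrep` stands in for the library crate, and the hypothetical `search_from_args` function stands in for code in the binary crate; in the real project these live in *src/lib.rs* and *src/main.rs*, and the binary would use `use minigrep::Config;` rather than full paths.

```rust
// Stand-in for src/lib.rs: the inline module plays the role of the library crate.
mod minigrep {
    use std::error::Error;
    use std::fs;

    pub struct Config {
        pub query: String,
        pub file_path: String,
    }

    impl Config {
        pub fn build(args: &[String]) -> Result<Config, &'static str> {
            if args.len() < 3 {
                return Err("not enough arguments");
            }
            Ok(Config {
                query: args[1].clone(),
                file_path: args[2].clone(),
            })
        }
    }

    pub fn run(config: Config) -> Result<(), Box<dyn Error>> {
        let contents = fs::read_to_string(config.file_path)?;
        println!("With text:\n{contents}");
        Ok(())
    }
}

// Stand-in for the binary side (hypothetical helper): it can reach only the
// items marked `pub`, and it names them through the `minigrep::` prefix.
fn search_from_args(args: &[String]) -> Result<(), String> {
    let config = minigrep::Config::build(args).map_err(String::from)?;
    minigrep::run(config).map_err(|e| e.to_string())
}
```

Removing any one of the `pub` keywords makes the corresponding use outside the module fail to compile, which is a quick way to see what the public API consists of.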
491
492 Whew! That was a lot of work, but we’ve set ourselves up for success in the
493 future. Now it’s much easier to handle errors, and we’ve made the code more
494 modular. Almost all of our work will be done in *src/lib.rs* from here on out.
495
496 Let’s take advantage of this newfound modularity by doing something that would
497 have been difficult with the old code but is easy with the new code: we’ll
498 write some tests!
499
500 [ch13]: ch13-00-functional-features.html
501 [ch9-custom-types]: ch09-03-to-panic-or-not-to-panic.html#creating-custom-types-for-validation
502 [ch9-error-guidelines]: ch09-03-to-panic-or-not-to-panic.html#guidelines-for-error-handling
503 [ch9-result]: ch09-02-recoverable-errors-with-result.html
504 [ch17]: ch17-00-oop.html
505 [ch9-question-mark]: ch09-02-recoverable-errors-with-result.html#a-shortcut-for-propagating-errors-the--operator