src/doc/book/src/ch12-03-improving-error-handling-and-modularity.md

   1 ## Refactoring to Improve Modularity and Error Handling
   2
   3 To improve our program, we’ll fix four problems that have to do with the
   4 program’s structure and how it’s handling potential errors. First, our `main`
   5 function now performs two tasks: it parses arguments and reads files. As our
   6 program grows, the number of separate tasks the `main` function handles will
   7 increase. As a function gains responsibilities, it becomes more difficult to
   8 reason about, harder to test, and harder to change without breaking one of its
   9 parts. It’s best to separate functionality so each function is responsible for
  10 one task.
  11
  12 This issue also ties into the second problem: although `query` and `file_path`
  13 are configuration variables for our program, variables like `contents` are used
  14 to perform the program’s logic. The longer `main` becomes, the more variables
  15 we’ll need to bring into scope; the more variables we have in scope, the harder
  16 it will be to keep track of the purpose of each. It’s best to group the
  17 configuration variables into one structure to make their purpose clear.
  18
  19 The third problem is that we’ve used `expect` to print an error message when
  20 reading the file fails, but the error message just prints `Should have been
  21 able to read the file`. Reading a file can fail in a number of ways: for
  22 example, the file could be missing, or we might not have permission to open it.
  23 Right now, regardless of the situation, we’d print the same error message for
  24 everything, which wouldn’t give the user any information!
  25
  26 Fourth, we use `expect` to handle an error, and if the user runs our program
  27 without specifying enough arguments, they’ll get an `index out of bounds` error
  28 from Rust that doesn’t clearly explain the problem. It would be best if all the
  29 error-handling code were in one place so future maintainers have only one place
  30 to look if the error-handling logic needs to change. Having all the
  31 error-handling code in one place will also ensure that we’re printing messages
  32 that are meaningful to our end users.
  33
  34 Let’s address these four problems by refactoring our project.
  35
  36 ### Separation of Concerns for Binary Projects
  37
  38 The organizational problem of allocating responsibility for multiple tasks to
  39 the `main` function is common to many binary projects. As a result, the Rust
  40 community has developed guidelines for splitting the separate concerns of a
  41 binary program when `main` starts getting large. This process has the following
  42 steps:
  43
  44 * Split your program into a *main.rs* and a *lib.rs* and move your program’s
  45   logic to *lib.rs*.
  46 * As long as your command line parsing logic is small, it can remain in
  47   *main.rs*.
  48 * When the command line parsing logic starts getting complicated, extract it
  49   from *main.rs* and move it to *lib.rs*.
  50
  51 The responsibilities that remain in the `main` function after this process
  52 should be limited to the following:
  53
  54 * Calling the command line parsing logic with the argument values
  55 * Setting up any other configuration
  56 * Calling a `run` function in *lib.rs*
  57 * Handling the error if `run` returns an error
  58
  59 This pattern is about separating concerns: *main.rs* handles running the
  60 program, and *lib.rs* handles all the logic of the task at hand. Because you
  61 can’t test the `main` function directly, this structure lets you test all of
  62 your program’s logic by moving it into functions in *lib.rs*. The code that
  63 remains in *main.rs* will be small enough to verify its correctness by reading
  64 it. Let’s rework our program by following this process.
  65
  66 #### Extracting the Argument Parser
  67
  68 We’ll extract the functionality for parsing arguments into a function that
  69 `main` will call to prepare for moving the command line parsing logic to
  70 *src/lib.rs*. Listing 12-5 shows the new start of `main` that calls a new
  71 function `parse_config`, which we’ll define in *src/main.rs* for the moment.
  72
  73 <span class="filename">Filename: src/main.rs</span>
  74
  75 ```rust,ignore
  76 {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-05/src/main.rs:here}}
  77 ```
  78
  79 <span class="caption">Listing 12-5: Extracting a `parse_config` function from
  80 `main`</span>
  81
  82 We’re still collecting the command line arguments into a vector, but instead of
  83 assigning the argument value at index 1 to the variable `query` and the
  84 argument value at index 2 to the variable `file_path` within the `main`
  85 function, we pass the whole vector to the `parse_config` function. The
  86 `parse_config` function then holds the logic that determines which argument
  87 goes in which variable and passes the values back to `main`. We still create
  88 the `query` and `file_path` variables in `main`, but `main` no longer has the
  89 responsibility of determining how the command line arguments and variables
  90 correspond.
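
As a rough illustration of this step, a function of this kind might look like the sketch below. The exact signature (a `&[String]` slice in, a tuple of `&str` out) is an assumption made for this example and is not taken from the listing.

```rust
// Illustrative sketch only; the signature is an assumption for this example.
fn parse_config(args: &[String]) -> (&str, &str) {
    // Index 0 conventionally holds the program name, so the values we want
    // are at index 1 (the query) and index 2 (the file path).
    let query: &str = &args[1];
    let file_path: &str = &args[2];

    // Hand both values back to the caller as a tuple.
    (query, file_path)
}
```

A caller such as `main` could then write `let (query, file_path) = parse_config(&args);` and would no longer need to know which index holds which value.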
  91
  92 This rework may seem like overkill for our small program, but we’re refactoring
  93 in small, incremental steps. After making this change, run the program again to
  94 verify that the argument parsing still works. It’s good to check your progress
  95 often, to help identify the cause of problems when they occur.
  96
  97 #### Grouping Configuration Values
  98
  99 We can take another small step to improve the `parse_config` function further.
 100 At the moment, we’re returning a tuple, but then we immediately break that
 101 tuple into individual parts again. This is a sign that perhaps we don’t have
 102 the right abstraction yet.
 103
 104 Another indicator that shows there’s room for improvement is the `config` part
 105 of `parse_config`, which implies that the two values we return are related and
 106 are both part of one configuration value. We’re not currently conveying this
 107 meaning in the structure of the data other than by grouping the two values into
 108 a tuple; we’ll instead put the two values into one struct and give each of the
 109 struct fields a meaningful name. Doing so will make it easier for future
 110 maintainers of this code to understand how the different values relate to each
 111 other and what their purpose is.
 112
 113 Listing 12-6 shows the improvements to the `parse_config` function.
 114
 115 <span class="filename">Filename: src/main.rs</span>
 116
 117 ```rust,should_panic,noplayground
 118 {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-06/src/main.rs:here}}
 119 ```
 120
 121 <span class="caption">Listing 12-6: Refactoring `parse_config` to return an
 122 instance of a `Config` struct</span>
 123
 124 We’ve added a struct named `Config` defined to have fields named `query` and
 125 `file_path`. The signature of `parse_config` now indicates that it returns a
 126 `Config` value. In the body of `parse_config`, where we used to return
 127 string slices that reference `String` values in `args`, we now define `Config`
 128 to contain owned `String` values. The `args` variable in `main` is the owner of
 129 the argument values and is only letting the `parse_config` function borrow
 130 them, which means we’d violate Rust’s borrowing rules if `Config` tried to take
 131 ownership of the values in `args`.
 132
 133 There are a number of ways we could manage the `String` data; the easiest,
 134 though somewhat inefficient, route is to call the `clone` method on the values.
 135 This will make a full copy of the data for the `Config` instance to own, which
 136 takes more time and memory than storing a reference to the string data.
 137 However, cloning the data also makes our code very straightforward because we
 138 don’t have to manage the lifetimes of the references; in this circumstance,
 139 giving up a little performance to gain simplicity is a worthwhile trade-off.
 140
 141 > ### The Trade-Offs of Using `clone`
 142 >
 143 > There’s a tendency among many Rustaceans to avoid using `clone` to fix
 144 > ownership problems because of its runtime cost. In
 145 > [Chapter 13][ch13]<!-- ignore -->, you’ll learn how to use more efficient
 146 > methods in this type of situation. But for now, it’s okay to copy a few
 147 > strings to continue making progress because you’ll make these copies only
 148 > once and your file path and query string are very small. It’s better to have
 149 > a working program that’s a bit inefficient than to try to hyperoptimize code
 150 > on your first pass. As you become more experienced with Rust, it’ll be
 151 > easier to start with the most efficient solution, but for now, it’s
 152 > perfectly acceptable to call `clone`.
 153
 154 We’ve updated `main` so it places the instance of `Config` returned by
 155 `parse_config` into a variable named `config`, and we updated the code that
 156 previously used the separate `query` and `file_path` variables so it now uses
 157 the fields on the `Config` struct instead.
 158
 159 Now our code more clearly conveys that `query` and `file_path` are related and
 160 that their purpose is to configure how the program will work. Any code that
 161 uses these values knows to find them in the `config` instance in the fields
 162 named for their purpose.
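
The grouping described in this section could be sketched roughly as follows. This is a minimal example under the assumption that `parse_config` takes a `&[String]` slice; it is not a copy of the listing.

```rust
// Hypothetical sketch: the two related values live in one named struct.
struct Config {
    query: String,
    file_path: String,
}

fn parse_config(args: &[String]) -> Config {
    // `clone` makes a full copy of each argument so that `Config` owns its
    // data and does not borrow from `args`.
    let query = args[1].clone();
    let file_path = args[2].clone();

    Config { query, file_path }
}
```

Because the values are cloned, the original `args` vector stays fully usable after the call, at the cost of two small string copies.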
 163
 164 #### Creating a Constructor for `Config`
 165
 166 So far, we’ve extracted the logic responsible for parsing the command line
 167 arguments from `main` and placed it in the `parse_config` function. Doing so
 168 helped us to see that the `query` and `file_path` values were related and that
 169 relationship should be conveyed in our code. We then added a `Config` struct to
 170 name the related purpose of `query` and `file_path` and to return the values
 171 from `parse_config` under meaningful struct field names.
 172
 173 So now that the purpose of the `parse_config` function is to create a `Config`
 174 instance, we can change `parse_config` from a plain function to a function
 175 named `new` that is associated with the `Config` struct. Making this change
 176 will make the code more idiomatic. We can create instances of types in the
 177 standard library, such as `String`, by calling `String::new`. Similarly, by
 178 changing `parse_config` into a `new` function associated with `Config`, we’ll
 179 be able to create instances of `Config` by calling `Config::new`. Listing 12-7
 180 shows the changes we need to make.
 181
 182 <span class="filename">Filename: src/main.rs</span>
 183
 184 ```rust,should_panic,noplayground
 185 {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-07/src/main.rs:here}}
 186 ```
 187
 188 <span class="caption">Listing 12-7: Changing `parse_config` into
 189 `Config::new`</span>
 190
 191 We’ve updated `main` where we were calling `parse_config` to instead call
 192 `Config::new`. We’ve changed the name of `parse_config` to `new` and moved it
 193 within an `impl` block, which associates the `new` function with `Config`. Try
 194 compiling this code again to make sure it works.
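
A minimal sketch of the same idea, assuming the `Config` fields and the `&[String]` parameter used earlier in this chapter’s prose, might look like this:

```rust
struct Config {
    query: String,
    file_path: String,
}

impl Config {
    // An associated function: it has no `self` parameter and is called as
    // `Config::new(...)`, much like `String::new()`.
    fn new(args: &[String]) -> Config {
        let query = args[1].clone();
        let file_path = args[2].clone();

        Config { query, file_path }
    }
}
```

The body is unchanged from the free function; only the name and the place where it is defined differ.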
 195
 196 ### Fixing the Error Handling
 197
 198 Now we’ll work on fixing our error handling. Recall that attempting to access
 199 the values in the `args` vector at index 1 or index 2 will cause the program to
 200 panic if the vector contains fewer than three items. Try running the program
 201 without any arguments; it will look like this:
 202
 203 ```console
 204 {{#include ../listings/ch12-an-io-project/listing-12-07/output.txt}}
 205 ```
 206
 207 The line `index out of bounds: the len is 1 but the index is 1` is an error
 208 message intended for programmers. It won’t help our end users understand what
 209 they should do instead. Let’s fix that now.
 210
 211 #### Improving the Error Message
 212
 213 In Listing 12-8, we add a check in the `new` function that will verify that the
 214 slice is long enough before accessing index 1 and 2. If the slice isn’t long
 215 enough, the program panics and displays a better error message.
 216
 217 <span class="filename">Filename: src/main.rs</span>
 218
 219 ```rust,ignore
 220 {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-08/src/main.rs:here}}
 221 ```
 222
 223 <span class="caption">Listing 12-8: Adding a check for the number of
 224 arguments</span>
 225
 226 This code is similar to [the `Guess::new` function we wrote in Listing
 227 9-13][ch9-custom-types]<!-- ignore -->, where we called `panic!` when the
 228 `value` argument was out of the range of valid values. Instead of checking for
 229 a range of values here, we’re checking that the length of `args` is at least 3,
 230 so the rest of the function can operate under the assumption that this
 231 condition has been met. If `args` has fewer than three items, the check fails,
 232 and we call the `panic!` macro to end the program immediately.
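
The length check described above could be sketched like this. The panic message text is an assumption chosen for the example.

```rust
struct Config {
    query: String,
    file_path: String,
}

impl Config {
    fn new(args: &[String]) -> Config {
        // Guard: indexing below is only safe when there are at least three
        // items (program name, query, file path).
        if args.len() < 3 {
            panic!("not enough arguments");
        }

        let query = args[1].clone();
        let file_path = args[2].clone();

        Config { query, file_path }
    }
}
```

With the guard in place, a short argument list still ends the program, but with a message we chose rather than an index-out-of-bounds report.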
 233
 234 With these extra few lines of code in `new`, let’s run the program without any
 235 arguments again to see what the error looks like now:
 236
 237 ```console
 238 {{#include ../listings/ch12-an-io-project/listing-12-08/output.txt}}
 239 ```
 240
 241 This output is better: we now have a reasonable error message. However, we also
 242 have extraneous information we don’t want to give to our users. Perhaps the
 243 technique we used in Listing 9-13 isn’t the best choice here: a call to
 244 `panic!` is more appropriate for a programming problem than a usage problem,
 245 [as discussed in Chapter 9][ch9-error-guidelines]<!-- ignore -->. Instead,
 246 we’ll use the other technique you learned about in Chapter 9—[returning a
 247 `Result`][ch9-result]<!-- ignore --> that indicates either success or an error.
 248
 249 <!-- Old headings. Do not remove or links may break. -->
 250 <a id="returning-a-result-from-new-instead-of-calling-panic"></a>
 251
 252 #### Returning a `Result` Instead of Calling `panic!`
 253
 254 We can instead return a `Result` value that will contain a `Config` instance in
 255 the successful case and will describe the problem in the error case. We’re also
 256 going to change the function name from `new` to `build` because many
 257 programmers expect `new` functions to never fail. When `Config::build` is
 258 communicating to `main`, we can use the `Result` type to signal there was a
 259 problem. Then we can change `main` to convert an `Err` variant into a more
 260 practical error for our users without the surrounding text about `thread
 261 'main'` and `RUST_BACKTRACE` that a call to `panic!` causes.
 262
 263 Listing 12-9 shows the changes we need to make to the return value of the
 264 function we’re now calling `Config::build` and the body of the function needed
 265 to return a `Result`. Note that this won’t compile until we update `main` as
 266 well, which we’ll do in the next listing.
 267
 268 <span class="filename">Filename: src/main.rs</span>
 269
 270 ```rust,ignore,does_not_compile
 271 {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-09/src/main.rs:here}}
 272 ```
 273
 274 <span class="caption">Listing 12-9: Returning a `Result` from
 275 `Config::build`</span>
 276
 277 Our `build` function now returns a `Result` with a `Config` instance in the
 278 success case and a `&'static str` in the error case. Our error values will
 279 always be string literals that have the `'static` lifetime.
 280
 281 We’ve made two changes in the body of the function: instead of calling `panic!`
 282 when the user doesn’t pass enough arguments, we now return an `Err` value, and
 283 we’ve wrapped the `Config` return value in an `Ok`. These changes make the
 284 function conform to its new type signature.
 285
 286 Returning an `Err` value from `Config::build` allows the `main` function to
 287 handle the `Result` value returned from the `build` function and exit the
 288 process more cleanly in the error case.
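
As a sketch of the two changes just described, a `build` function returning a `Result` might look like the following. The struct fields and the `&[String]` parameter are assumptions carried over from the earlier prose.

```rust
struct Config {
    query: String,
    file_path: String,
}

impl Config {
    // Success carries a `Config`; failure carries a string literal, which has
    // the `'static` lifetime.
    fn build(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            // Return an error value instead of panicking.
            return Err("not enough arguments");
        }

        let query = args[1].clone();
        let file_path = args[2].clone();

        // Wrap the successful value in `Ok` to match the return type.
        Ok(Config { query, file_path })
    }
}
```

The caller now decides what to do with the error, rather than the function ending the program on its own.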
 289
 290 <!-- Old headings. Do not remove or links may break. -->
 291 <a id="calling-confignew-and-handling-errors"></a>
 292
 293 #### Calling `Config::build` and Handling Errors
 294
 295 To handle the error case and print a user-friendly message, we need to update
 296 `main` to handle the `Result` being returned by `Config::build`, as shown in
 297 Listing 12-10. We’ll also take the responsibility of exiting the command line
 298 tool with a nonzero error code away from `panic!` and instead implement it by
 299 hand. A nonzero exit status is a convention to signal to the process that
 300 called our program that the program exited with an error state.
 301
 302 <span class="filename">Filename: src/main.rs</span>
 303
 304 ```rust,ignore
 305 {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-10/src/main.rs:here}}
 306 ```
 307
 308 <span class="caption">Listing 12-10: Exiting with an error code if building a
 309 `Config` fails</span>
 310
 311 In this listing, we’ve used a method we haven’t covered in detail yet:
 312 `unwrap_or_else`, which is defined on `Result<T, E>` by the standard library.
 313 Using `unwrap_or_else` allows us to define some custom, non-`panic!` error
 314 handling. If the `Result` is an `Ok` value, this method’s behavior is similar
 315 to `unwrap`: it returns the inner value `Ok` is wrapping. However, if the value
 316 is an `Err` value, this method calls the code in the *closure*, which is an
 317 anonymous function we define and pass as an argument to `unwrap_or_else`. We’ll
 318 cover closures in more detail in [Chapter 13][ch13]<!-- ignore -->. For now,
 319 you just need to know that `unwrap_or_else` will pass the inner value of the
 320 `Err`, which in this case is the static string `"not enough arguments"` that we
 321 added in Listing 12-9, to our closure in the argument `err` that appears
 322 between the vertical pipes. The code in the closure can then use the `err`
 323 value when it runs.
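
To make the flow concrete, here is a small sketch of `unwrap_or_else` with a closure. The helper name `config_or_exit` and the wording of the printed message are hypothetical; in the chapter this logic sits directly in `main`.

```rust
use std::process;

struct Config {
    query: String,
    file_path: String,
}

impl Config {
    fn build(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }
        Ok(Config {
            query: args[1].clone(),
            file_path: args[2].clone(),
        })
    }
}

// Hypothetical helper standing in for the start of `main`.
fn config_or_exit(args: &[String]) -> Config {
    Config::build(args).unwrap_or_else(|err| {
        // Runs only for `Err`; `err` is the inner value of the `Err` variant.
        println!("Problem parsing arguments: {err}");
        // A nonzero status tells the calling process that we failed.
        process::exit(1);
    })
}
```

In the `Ok` case the closure never runs and the inner `Config` is returned unchanged.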
 324
 325 We’ve added a new `use` line to bring `process` from the standard library into
 326 scope. The code in the closure that will be run in the error case is only two
 327 lines: we print the `err` value and then call `process::exit`. The
 328 `process::exit` function will stop the program immediately and return the
 329 number that was passed as the exit status code. This is similar to the
 330 `panic!`-based handling we used in Listing 12-8, but we no longer get all the
 331 extra output. Let’s try it:
 332
 333 ```console
 334 {{#include ../listings/ch12-an-io-project/listing-12-10/output.txt}}
 335 ```
 336
 337 Great! This output is much friendlier for our users.
 338
 339 ### Extracting Logic from `main`
 340
 341 Now that we’ve finished refactoring the configuration parsing, let’s turn to
 342 the program’s logic. As we stated in [“Separation of Concerns for Binary
 343 Projects”](#separation-of-concerns-for-binary-projects)<!-- ignore -->, we’ll
 344 extract a function named `run` that will hold all the logic currently in the
 345 `main` function that isn’t involved with setting up configuration or handling
 346 errors. When we’re done, `main` will be concise and easy to verify by
 347 inspection, and we’ll be able to write tests for all the other logic.
 348
 349 Listing 12-11 shows the extracted `run` function. For now, we’re just making
 350 the small, incremental improvement of extracting the function. We’re still
 351 defining the function in *src/main.rs*.
 352
 353 <span class="filename">Filename: src/main.rs</span>
 354
 355 ```rust,ignore
 356 {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-11/src/main.rs:here}}
 357 ```
 358
 359 <span class="caption">Listing 12-11: Extracting a `run` function containing the
 360 rest of the program logic</span>
 361
 362 The `run` function now contains all the remaining logic from `main`, starting
 363 from reading the file. The `run` function takes the `Config` instance as an
 364 argument.
 365
 366 #### Returning Errors from the `run` Function
 367
 368 With the remaining program logic separated into the `run` function, we can
 369 improve the error handling, as we did with `Config::build` in Listing 12-9.
 370 Instead of allowing the program to panic by calling `expect`, the `run`
 371 function will return a `Result<T, E>` when something goes wrong. This will let
 372 us further consolidate the logic around handling errors into `main` in a
 373 user-friendly way. Listing 12-12 shows the changes we need to make to the
 374 signature and body of `run`.
 375
 376 <span class="filename">Filename: src/main.rs</span>
 377
 378 ```rust,ignore
 379 {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-12/src/main.rs:here}}
 380 ```
 381
 382 <span class="caption">Listing 12-12: Changing the `run` function to return
 383 `Result`</span>
 384
 385 We’ve made three significant changes here. First, we changed the return type of
 386 the `run` function to `Result<(), Box<dyn Error>>`. This function previously
 387 returned the unit type, `()`, and we keep that as the value returned in the
 388 `Ok` case.
 389
 390 For the error type, we used the *trait object* `Box<dyn Error>` (and we’ve
 391 brought `std::error::Error` into scope with a `use` statement at the top).
 392 We’ll cover trait objects in [Chapter 17][ch17]<!-- ignore -->. For now, just
 393 know that `Box<dyn Error>` means the function will return a type that
 394 implements the `Error` trait, but we don’t have to specify what particular type
 395 the return value will be. This gives us flexibility to return error values that
 396 may be of different types in different error cases. The `dyn` keyword is short
 397 for “dynamic.”
 398
 399 Second, we’ve removed the call to `expect` in favor of the `?` operator, as we
 400 talked about in [Chapter 9][ch9-question-mark]<!-- ignore -->. Rather than
 401 `panic!` on an error, `?` will return the error value from the current function
 402 for the caller to handle.
 403
 404 Third, the `run` function now returns an `Ok` value in the success case.
 405 We’ve declared the `run` function’s success type as `()` in the signature,
 406 which means we need to wrap the unit type value in the `Ok` value. This
 407 `Ok(())` syntax might look a bit strange at first, but using `()` like this is
 408 the idiomatic way to indicate that we’re calling `run` for its side effects
 409 only; it doesn’t return a value we need.
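
The three changes can be seen together in the sketch below. It is an approximation rather than the listing itself: the `Config` fields and the text printed by `run` are assumptions made for this example.

```rust
use std::error::Error;
use std::fs;

struct Config {
    query: String,
    file_path: String,
}

// Change 1: the return type is `Result<(), Box<dyn Error>>`.
fn run(config: Config) -> Result<(), Box<dyn Error>> {
    // Change 2: `?` returns the error to the caller instead of panicking.
    // The I/O error is converted into a `Box<dyn Error>` automatically.
    let contents = fs::read_to_string(&config.file_path)?;

    println!("Searching for {} in:\n{contents}", config.query);

    // Change 3: the success case wraps the unit value in `Ok`.
    Ok(())
}
```

Calling `run` with a path that does not exist yields an `Err` that the caller can inspect, rather than a panic.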
 410
 411 When you run the code in Listing 12-12, it will compile but will display a warning:
 412
 413 ```console
 414 {{#include ../listings/ch12-an-io-project/listing-12-12/output.txt}}
 415 ```
 416
 417 Rust tells us that our code ignored the `Result` value and the `Result` value
 418 might indicate that an error occurred. But we’re not checking to see whether or
 419 not there was an error, and the compiler reminds us that we probably meant to
 420 have some error-handling code here! Let’s rectify that problem now.
 421
 422 #### Handling Errors Returned from `run` in `main`
 423
 424 We’ll check for errors and handle them using a technique similar to one we used
 425 with `Config::build` in Listing 12-10, but with a slight difference:
 426
 427 <span class="filename">Filename: src/main.rs</span>
 428
 429 ```rust,ignore
 430 {{#rustdoc_include ../listings/ch12-an-io-project/no-listing-01-handling-errors-in-main/src/main.rs:here}}
 431 ```
 432
 433 We use `if let` rather than `unwrap_or_else` to check whether `run` returns an
 434 `Err` value and call `process::exit(1)` if it does. The `run` function doesn’t
 435 return a value that we want to `unwrap` in the same way that `Config::build`
 436 returns the `Config` instance. Because `run` returns `()` in the success case,
 437 we only care about detecting an error, so we don’t need `unwrap_or_else` to
 438 return the unwrapped value, which would only be `()`.
 439
 440 The body of the `if let` and the closure passed to `unwrap_or_else` do the same
 441 thing in both cases: we print the error and exit.
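
A sketch of the `if let` form follows. The helper name `run_or_exit`, the message text, and the body of `run` are assumptions for illustration; in the chapter the `if let` sits at the end of `main`.

```rust
use std::error::Error;
use std::fs;
use std::process;

struct Config {
    query: String,
    file_path: String,
}

fn run(config: Config) -> Result<(), Box<dyn Error>> {
    let contents = fs::read_to_string(&config.file_path)?;
    println!("Searching for {} in:\n{contents}", config.query);
    Ok(())
}

// Hypothetical helper standing in for the end of `main`.
fn run_or_exit(config: Config) {
    // There is no useful success value to unwrap, so we only match on `Err`.
    if let Err(e) = run(config) {
        println!("Application error: {e}");
        process::exit(1);
    }
}
```

When `run` succeeds, the `if let` pattern does not match and the function simply returns.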
 442
 443 ### Splitting Code into a Library Crate
 444
 445 Our `minigrep` project is looking good so far! Now we’ll split the
 446 *src/main.rs* file and put some code into the *src/lib.rs* file. That way we
 447 can test the code and have a *src/main.rs* file with fewer responsibilities.
 448
 449 Let’s move all the code that isn’t the `main` function from *src/main.rs* to
 450 *src/lib.rs*:
 451
 452 * The `run` function definition
 453 * The relevant `use` statements
 454 * The definition of `Config`
 455 * The `Config::build` function definition
 456
 457 The contents of *src/lib.rs* should have the signatures shown in Listing 12-13
 458 (we’ve omitted the bodies of the functions for brevity). Note that this won’t
 459 compile until we modify *src/main.rs* in Listing 12-14.
 460
 461 <span class="filename">Filename: src/lib.rs</span>
 462
 463 ```rust,ignore,does_not_compile
 464 {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-13/src/lib.rs:here}}
 465 ```
 466
 467 <span class="caption">Listing 12-13: Moving `Config` and `run` into
 468 *src/lib.rs*</span>
 469
 470 We’ve made liberal use of the `pub` keyword: on `Config`, on its fields and its
 471 `build` function, and on the `run` function. We now have a library crate that has a
 472 public API we can test!
 473
 474 Now we need to bring the code we moved to *src/lib.rs* into the scope of the
 475 binary crate in *src/main.rs*, as shown in Listing 12-14.
 476
 477 <span class="filename">Filename: src/main.rs</span>
 478
 479 ```rust,ignore
 480 {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-14/src/main.rs:here}}
 481 ```
 482
 483 <span class="caption">Listing 12-14: Using the `minigrep` library crate in
 484 *src/main.rs*</span>
 485
 486 We add a `use minigrep::Config` line to bring the `Config` type from the
 487 library crate into the binary crate’s scope, and we prefix the `run` function
 488 with our crate name. Now all the functionality should be connected and should
 489 work. Run the program with `cargo run` and make sure everything works
 490 correctly.
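
A real split needs two files, but the role of `pub` can be approximated in a single file. In the sketch below, an inline module named `minigrep_lib` (a hypothetical name) stands in for the library crate, and the code after it stands in for *src/main.rs*. The helper `try_main` is also hypothetical and returns errors as strings instead of exiting, so the example is easy to exercise.

```rust
// Stand-in for the library crate (src/lib.rs in a real project).
mod minigrep_lib {
    use std::error::Error;
    use std::fs;

    pub struct Config {
        pub query: String,
        pub file_path: String,
    }

    impl Config {
        pub fn build(args: &[String]) -> Result<Config, &'static str> {
            if args.len() < 3 {
                return Err("not enough arguments");
            }
            Ok(Config {
                query: args[1].clone(),
                file_path: args[2].clone(),
            })
        }
    }

    pub fn run(config: Config) -> Result<(), Box<dyn Error>> {
        let contents = fs::read_to_string(&config.file_path)?;
        println!("Searching for {} in:\n{contents}", config.query);
        Ok(())
    }
}

// Stand-in for the binary crate (src/main.rs): only `pub` items are reachable.
use minigrep_lib::Config;

// Hypothetical helper mirroring what `main` does, minus the exit calls.
fn try_main(args: &[String]) -> Result<(), String> {
    let config = Config::build(args).map_err(|err| err.to_string())?;
    // `run` is referred to through its path, like `minigrep::run` in a crate.
    minigrep_lib::run(config).map_err(|err| err.to_string())
}
```

Removing any of the `pub` keywords makes the corresponding item unreachable from outside the module, which is the same effect the keyword has across the crate boundary.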
 491
 492 Whew! That was a lot of work, but we’ve set ourselves up for success in the
 493 future. Now it’s much easier to handle errors, and we’ve made the code more
 494 modular. Almost all of our work will be done in *src/lib.rs* from here on out.
 495
 496 Let’s take advantage of this newfound modularity by doing something that would
 497 have been difficult with the old code but is easy with the new code: we’ll
 498 write some tests!
 499
 500 [ch13]: ch13-00-functional-features.html
 501 [ch9-custom-types]: ch09-03-to-panic-or-not-to-panic.html#creating-custom-types-for-validation
 502 [ch9-error-guidelines]: ch09-03-to-panic-or-not-to-panic.html#guidelines-for-error-handling
 503 [ch9-result]: ch09-02-recoverable-errors-with-result.html
 504 [ch17]: ch17-00-oop.html
 505 [ch9-question-mark]: ch09-02-recoverable-errors-with-result.html#a-shortcut-for-propagating-errors-the--operator