% Error Handling

Like most programming languages, Rust encourages the programmer to handle
errors in a particular way. Generally speaking, error handling is divided into
two broad categories: exceptions and return values. Rust opts for return
values.

In this section, we intend to provide a comprehensive treatment of how to deal
with errors in Rust. More than that, we will attempt to introduce error handling
one piece at a time so that you'll come away with a solid working knowledge of
how everything fits together.

When done naïvely, error handling in Rust can be verbose and annoying. This
section will explore those stumbling blocks and demonstrate how to use the
standard library to make error handling concise and ergonomic.

# Table of Contents

This section is very long, mostly because we start at the very beginning with
sum types and combinators, and try to motivate the way Rust does error handling
incrementally. As such, programmers with experience in other expressive type
systems may want to jump around.

* [The Basics](#the-basics)
* [Unwrapping explained](#unwrapping-explained)
* [The `Option` type](#the-option-type)
* [Composing `Option<T>` values](#composing-optiont-values)
* [The `Result` type](#the-result-type)
* [Parsing integers](#parsing-integers)
* [The `Result` type alias idiom](#the-result-type-alias-idiom)
* [A brief interlude: unwrapping isn't evil](#a-brief-interlude-unwrapping-isnt-evil)
* [Working with multiple error types](#working-with-multiple-error-types)
* [Composing `Option` and `Result`](#composing-option-and-result)
* [The limits of combinators](#the-limits-of-combinators)
* [Early returns](#early-returns)
* [The `try!` macro](#the-try-macro)
* [Defining your own error type](#defining-your-own-error-type)
* [Standard library traits used for error handling](#standard-library-traits-used-for-error-handling)
* [The `Error` trait](#the-error-trait)
* [The `From` trait](#the-from-trait)
* [The real `try!` macro](#the-real-try-macro)
* [Composing custom error types](#composing-custom-error-types)
* [Advice for library writers](#advice-for-library-writers)
* [Case study: A program to read population data](#case-study-a-program-to-read-population-data)
* [Initial setup](#initial-setup)
* [Argument parsing](#argument-parsing)
* [Writing the logic](#writing-the-logic)
* [Error handling with `Box<Error>`](#error-handling-with-boxerror)
* [Reading from stdin](#reading-from-stdin)
* [Error handling with a custom type](#error-handling-with-a-custom-type)
* [Adding functionality](#adding-functionality)
* [The short story](#the-short-story)

# The Basics

You can think of error handling as using *case analysis* to determine whether
a computation was successful or not. As you will see, the key to ergonomic error
handling is reducing the amount of explicit case analysis the programmer has to
do while keeping code composable.

Keeping code composable is important, because without that requirement, we
could [`panic`](../std/macro.panic.html) whenever we
come across something unexpected. (`panic` causes the current thread to unwind,
and in most cases, the entire program aborts.) Here's an example:

```rust,should_panic
// Guess a number between 1 and 10.
// If it matches the number we had in mind, return true. Else, return false.
fn guess(n: i32) -> bool {
    if n < 1 || n > 10 {
        panic!("Invalid number: {}", n);
    }
    n == 5
}

fn main() {
    guess(11);
}
```

If you try running this code, the program will crash with a message like this:

```text
thread 'main' panicked at 'Invalid number: 11', src/bin/panic-simple.rs:5
```

Here's another example that is slightly less contrived: a program that accepts
an integer as an argument, doubles it, and prints it.

<span id="code-unwrap-double"></span>

```rust,should_panic
use std::env;

fn main() {
    let mut argv = env::args();
    let arg: String = argv.nth(1).unwrap(); // error 1
    let n: i32 = arg.parse().unwrap(); // error 2
    println!("{}", 2 * n);
}
```

If you give this program zero arguments (error 1) or if the first argument
isn't an integer (error 2), the program will panic just like in the first
example.

You can think of this style of error handling as similar to a bull running
through a china shop. The bull will get to where it wants to go, but it will
trample everything in the process.

## Unwrapping explained

In the previous example, we claimed that the program would simply panic if it
reached one of the two error conditions, yet the program does not include an
explicit call to `panic` like the first example. This is because the panic is
embedded in the calls to `unwrap`.

To “unwrap” something in Rust is to say, “Give me the result of the
computation, and if there was an error, panic and stop the program.”
It would be better if we showed the code for unwrapping because it is so
simple, but to do that, we will first need to explore the `Option` and `Result`
types. Both of these types have a method called `unwrap` defined on them.

### The `Option` type

The `Option` type is [defined in the standard library][5]:

```rust
enum Option<T> {
    None,
    Some(T),
}
```

The `Option` type is a way to use Rust's type system to express the
*possibility of absence*. Encoding the possibility of absence into the type
system is an important concept because it will cause the compiler to force the
programmer to handle that absence. Let's take a look at an example that tries
to find a character in a string:

<span id="code-option-ex-string-find"></span>

```rust
// Searches `haystack` for the Unicode character `needle`. If one is found, the
// byte offset of the character is returned. Otherwise, `None` is returned.
fn find(haystack: &str, needle: char) -> Option<usize> {
    for (offset, c) in haystack.char_indices() {
        if c == needle {
            return Some(offset);
        }
    }
    None
}
```

Notice that when this function finds a matching character, it doesn't only
return the `offset`. Instead, it returns `Some(offset)`. `Some` is a variant or
a *value constructor* for the `Option` type. You can think of it as a function
with the type `fn<T>(value: T) -> Option<T>`. Correspondingly, `None` is also a
value constructor, except it has no arguments. You can think of `None` as a
function with the type `fn<T>() -> Option<T>`.

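To make the "constructor as a function" idea concrete, here is a small sketch
that passes `Some` to an iterator adaptor exactly as one would pass any other
function. The helper name `wrap_all` is made up for this illustration and does
not appear in the standard library.

```rust
// Hypothetical helper: wraps every element of a vector in `Some`.
// `Some` is passed by name, like an ordinary function of type
// `fn(i32) -> Option<i32>`.
fn wrap_all(values: Vec<i32>) -> Vec<Option<i32>> {
    values.into_iter().map(Some).collect()
}
```

Because `Some` can stand in wherever a one-argument function is expected, no
closure such as `|v| Some(v)` is needed.
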
This might seem like much ado about nothing, but this is only half of the
story. The other half is *using* the `find` function we've written. Let's try
to use it to find the extension in a file name.

```rust
# fn find(haystack: &str, needle: char) -> Option<usize> { haystack.find(needle) }
fn main() {
    let file_name = "foobar.rs";
    match find(file_name, '.') {
        None => println!("No file extension found."),
        Some(i) => println!("File extension: {}", &file_name[i+1..]),
    }
}
```

This code uses [pattern matching][1] to do *case
analysis* on the `Option<usize>` returned by the `find` function. In fact, case
analysis is the only way to get at the value stored inside an `Option<T>`. This
means that you, as the programmer, must handle the case when an `Option<T>` is
`None` instead of `Some(t)`.

But wait, what about `unwrap`, which we used [previously](#code-unwrap-double)?
There was no case analysis there! Instead, the case analysis was put inside the
`unwrap` method for you. You could define it yourself if you want:

<span id="code-option-def-unwrap"></span>

```rust
enum Option<T> {
    None,
    Some(T),
}

impl<T> Option<T> {
    fn unwrap(self) -> T {
        match self {
            Option::Some(val) => val,
            Option::None =>
              panic!("called `Option::unwrap()` on a `None` value"),
        }
    }
}
```

The `unwrap` method *abstracts away the case analysis*. This is precisely the thing
that makes `unwrap` ergonomic to use. Unfortunately, that `panic!` means that
`unwrap` is not composable: it is the bull in the china shop.

### Composing `Option<T>` values

In an [example from before](#code-option-ex-string-find),
we saw how to use `find` to discover the extension in a file name. Of course,
not all file names have a `.` in them, so it's possible that the file name has
no extension. This *possibility of absence* is encoded into the types using
`Option<T>`. In other words, the compiler will force us to address the
possibility that an extension does not exist. In our case, we only print out a
message saying so.

Getting the extension of a file name is a pretty common operation, so it makes
sense to put it into a function:

```rust
# fn find(haystack: &str, needle: char) -> Option<usize> { haystack.find(needle) }
// Returns the extension of the given file name, where the extension is defined
// as all characters following the first `.`.
// If `file_name` has no `.`, then `None` is returned.
fn extension_explicit(file_name: &str) -> Option<&str> {
    match find(file_name, '.') {
        None => None,
        Some(i) => Some(&file_name[i+1..]),
    }
}
```

(Pro-tip: don't use this code. Use the
[`extension`](../std/path/struct.Path.html#method.extension)
method in the standard library instead.)

The code stays simple, but the important thing to notice is that the type of
`find` forces us to consider the possibility of absence. This is a good thing
because it means the compiler won't let us accidentally forget about the case
where a file name doesn't have an extension. On the other hand, doing explicit
case analysis like we've done in `extension_explicit` every time can get a bit
tiresome.

In fact, the case analysis in `extension_explicit` follows a very common
pattern: *map* a function on to the value inside of an `Option<T>`, unless the
option is `None`, in which case, return `None`.

Rust has parametric polymorphism, so it is very easy to define a combinator
that abstracts this pattern:

<span id="code-option-map"></span>

```rust
fn map<F, T, A>(option: Option<T>, f: F) -> Option<A> where F: FnOnce(T) -> A {
    match option {
        None => None,
        Some(value) => Some(f(value)),
    }
}
```

Indeed, `map` is [defined as a method][2] on `Option<T>` in the standard library.
As a method, it has a slightly different signature: methods take `self`, `&self`,
or `&mut self` as their first argument.

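As a brief illustration of that difference, the sketch below calls the standard
library's `Option::map` in both method form and fully qualified form. The
helper names `shout` and `shout_qualified` are hypothetical and exist only for
this example.

```rust
// Method-call form: the `Option` is the receiver (`self`).
fn shout(ext: Option<&str>) -> Option<String> {
    ext.map(|s| s.to_uppercase())
}

// Fully qualified form: the same method, with `self` passed explicitly as the
// first argument, which mirrors the free function defined above.
fn shout_qualified(ext: Option<&str>) -> Option<String> {
    Option::map(ext, |s| s.to_uppercase())
}
```

Both functions behave identically; the method form is simply the more common
spelling.
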
Armed with our new combinator, we can rewrite our `extension_explicit` function
to get rid of the case analysis:

```rust
# fn find(haystack: &str, needle: char) -> Option<usize> { haystack.find(needle) }
// Returns the extension of the given file name, where the extension is defined
// as all characters following the first `.`.
// If `file_name` has no `.`, then `None` is returned.
fn extension(file_name: &str) -> Option<&str> {
    find(file_name, '.').map(|i| &file_name[i+1..])
}
```

One other pattern we commonly find is assigning a default value to the case
when an `Option` value is `None`. For example, maybe your program assumes that
the extension of a file is `rs` even if none is present. As you might imagine,
the case analysis for this is not specific to file extensions; it can work
with any `Option<T>`:

```rust
fn unwrap_or<T>(option: Option<T>, default: T) -> T {
    match option {
        None => default,
        Some(value) => value,
    }
}
```

As with `map` above, the standard library implementation is a method instead
of a free function.

The trick here is that the default value must have the same type as the value
that might be inside the `Option<T>`. Using it is dead simple in our case:

```rust
# fn find(haystack: &str, needle: char) -> Option<usize> {
#     for (offset, c) in haystack.char_indices() {
#         if c == needle {
#             return Some(offset);
#         }
#     }
#     None
# }
#
# fn extension(file_name: &str) -> Option<&str> {
#     find(file_name, '.').map(|i| &file_name[i+1..])
# }
fn main() {
    assert_eq!(extension("foobar.csv").unwrap_or("rs"), "csv");
    assert_eq!(extension("foobar").unwrap_or("rs"), "rs");
}
```

(Note that `unwrap_or` is [defined as a method][3] on `Option<T>` in the
standard library, so we use that here instead of the free-standing function we
defined above. Don't forget to check out the more general [`unwrap_or_else`][4]
method.)

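To show what "more general" means here, the following sketch uses
`unwrap_or_else`, which takes a closure and only runs it when the option is
`None`. The helper `extension_or_default` is a hypothetical name, and returning
an owned `String` is a choice made for this example so that building the
default has a visible cost.

```rust
// Hypothetical helper: returns the extension as an owned `String`, falling
// back to a default that is only constructed when no `.` is present.
fn extension_or_default(file_name: &str) -> String {
    file_name
        .find('.')
        .map(|i| file_name[i + 1..].to_string())
        // The closure runs only in the `None` case, so the default `String`
        // is not allocated when an extension exists.
        .unwrap_or_else(|| String::from("rs"))
}
```

With plain `unwrap_or`, the default value would be computed before the call
regardless of whether it ends up being used.
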
There is one more combinator that we think is worth paying special attention to:
`and_then`. It makes it easy to compose distinct computations that admit the
*possibility of absence*. For example, much of the code in this section is
about finding an extension given a file name. In order to do this, you first
need the file name which is typically extracted from a file *path*. While most
file paths have a file name, not *all* of them do. Examples of paths without
one are `.`, `..` and `/`.

So, we are tasked with the challenge of finding an extension given a file
*path*. Let's start with explicit case analysis:

```rust
# fn extension(file_name: &str) -> Option<&str> { None }
fn file_path_ext_explicit(file_path: &str) -> Option<&str> {
    match file_name(file_path) {
        None => None,
        Some(name) => match extension(name) {
            None => None,
            Some(ext) => Some(ext),
        }
    }
}

fn file_name(file_path: &str) -> Option<&str> {
  // implementation elided
  unimplemented!()
}
```

You might think that we could use the `map` combinator to reduce the case
analysis, but its type doesn't quite fit...

```rust,ignore
fn file_path_ext(file_path: &str) -> Option<&str> {
    file_name(file_path).map(|x| extension(x)) // Compilation error
}
```

The `map` function here wraps the value returned by the `extension` function
inside an `Option<_>` and since the `extension` function itself returns an
`Option<&str>` the expression `file_name(file_path).map(|x| extension(x))`
actually returns an `Option<Option<&str>>`.

But since `file_path_ext` just returns `Option<&str>` (and not
`Option<Option<&str>>`) we get a compilation error.

The result of the function taken by `map` as input is *always* [rewrapped with
`Some`](#code-option-map). Instead, we need something like `map`, but which
allows the caller to return an `Option<_>` directly without wrapping it in
another `Option<_>`.

Its generic implementation is even simpler than `map`:

```rust
fn and_then<F, T, A>(option: Option<T>, f: F) -> Option<A>
        where F: FnOnce(T) -> Option<A> {
    match option {
        None => None,
        Some(value) => f(value),
    }
}
```

Now we can rewrite our `file_path_ext` function without explicit case analysis:

```rust
# fn extension(file_name: &str) -> Option<&str> { None }
# fn file_name(file_path: &str) -> Option<&str> { None }
fn file_path_ext(file_path: &str) -> Option<&str> {
    file_name(file_path).and_then(extension)
}
```

Side note: Since `and_then` essentially works like `map` but returns an
`Option<_>` instead of an `Option<Option<_>>` it is known as `flatmap` in some
other languages.

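The examples above leave `file_name` unimplemented. For readers who want
something runnable, here is one possible sketch. It assumes that delegating to
`std::path::Path::file_name` is an acceptable stand-in for the elided helper;
that choice is ours and is not necessarily what the original implementation
does.

```rust
use std::path::Path;

// Assumed stand-in for the elided helper: the final path component, if the
// path has one and it is valid UTF-8.
fn file_name(file_path: &str) -> Option<&str> {
    Path::new(file_path).file_name().and_then(|name| name.to_str())
}

// Same definition as earlier in this section, using `str::find` directly.
fn extension(file_name: &str) -> Option<&str> {
    file_name.find('.').map(|i| &file_name[i + 1..])
}

// Two computations that can each be absent, chained with `and_then`.
fn file_path_ext(file_path: &str) -> Option<&str> {
    file_name(file_path).and_then(extension)
}
```

Note that `file_name` itself uses `and_then`, because `OsStr::to_str` also
returns an `Option`.
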
The `Option` type has many other combinators [defined in the standard
library][5]. It is a good idea to skim this list and familiarize
yourself with what's available; they can often reduce case analysis
for you. Familiarizing yourself with these combinators will pay
dividends because many of them are also defined (with similar
semantics) for `Result`, which we will talk about next.

Combinators make using types like `Option` ergonomic because they reduce
explicit case analysis. They are also composable because they permit the caller
to handle the possibility of absence in their own way. Methods like `unwrap`
remove choices because they will panic if `Option<T>` is `None`.

## The `Result` type

The `Result` type is also
[defined in the standard library][6]:

<span id="code-result-def"></span>

```rust
enum Result<T, E> {
    Ok(T),
    Err(E),
}
```

The `Result` type is a richer version of `Option`. Instead of expressing the
possibility of *absence* like `Option` does, `Result` expresses the possibility
of *error*. Usually, the *error* is used to explain why the execution of some
computation failed. This is a strictly more general form of `Option`. Consider
the following type alias, which is semantically equivalent to the real
`Option<T>` in every way:

```rust
type Option<T> = Result<T, ()>;
```

This fixes the second type parameter of `Result` to always be `()` (pronounced
“unit” or “empty tuple”). Exactly one value inhabits the `()` type: `()`. (Yup,
the type and value level terms have the same notation!)

The `Result` type is a way of representing one of two possible outcomes in a
computation. By convention, one outcome is meant to be expected or “`Ok`” while
the other outcome is meant to be unexpected or “`Err`”.

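One way to see the correspondence between `Option<T>` and `Result<T, ()>` is to
convert between them. The sketch below uses the standard library methods
`Option::ok_or` and `Result::ok`; the wrapper names `to_result` and `to_option`
are hypothetical and serve only to name the two directions.

```rust
// `Some(v)` becomes `Ok(v)`; `None` becomes `Err(())`.
fn to_result<T>(option: Option<T>) -> Result<T, ()> {
    option.ok_or(())
}

// `Ok(v)` becomes `Some(v)`; `Err(())` becomes `None`.
fn to_option<T>(result: Result<T, ()>) -> Option<T> {
    result.ok()
}
```

Since `()` has exactly one value, no information is lost in either direction.
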
Just like `Option`, the `Result` type also has an
[`unwrap` method
defined][7]
in the standard library. Let's define it:

```rust
# enum Result<T, E> { Ok(T), Err(E) }
impl<T, E: ::std::fmt::Debug> Result<T, E> {
    fn unwrap(self) -> T {
        match self {
            Result::Ok(val) => val,
            Result::Err(err) =>
              panic!("called `Result::unwrap()` on an `Err` value: {:?}", err),
        }
    }
}
```

This is effectively the same as our [definition for
`Option::unwrap`](#code-option-def-unwrap), except it includes the
error value in the `panic!` message. This makes debugging easier, but
it also requires us to add a [`Debug`][8] constraint on the `E` type
parameter (which represents our error type). Since the vast majority
of types should satisfy the `Debug` constraint, this tends to work out
in practice. (`Debug` on a type simply means that there's a reasonable
way to print a human readable description of values with that type.)

OK, let's move on to an example.

### Parsing integers

The Rust standard library makes converting strings to integers dead simple.
It's so easy in fact, that it is very tempting to write something like the
following:

```rust
fn double_number(number_str: &str) -> i32 {
    2 * number_str.parse::<i32>().unwrap()
}

fn main() {
    let n: i32 = double_number("10");
    assert_eq!(n, 20);
}
```

At this point, you should be skeptical of calling `unwrap`. For example, if
the string doesn't parse as a number, you'll get a panic:

```text
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: ParseIntError { kind: InvalidDigit }', /home/rustbuild/src/rust-buildbot/slave/beta-dist-rustc-linux/build/src/libcore/result.rs:729
```

This is rather unsightly, and if this happened inside a library you're
using, you might be understandably annoyed. Instead, we should try to
handle the error in our function and let the caller decide what to
do. This means changing the return type of `double_number`. But to
what? Well, that requires looking at the signature of the [`parse`
method][9] in the standard library:

```rust,ignore
impl str {
    fn parse<F: FromStr>(&self) -> Result<F, F::Err>;
}
```

Hmm. So we at least know that we need to use a `Result`. Certainly, it's
possible that this could have returned an `Option`. After all, a string either
parses as a number or it doesn't, right? That's certainly a reasonable way to
go, but the implementation internally distinguishes *why* the string didn't
parse as an integer. (Whether it's an empty string, an invalid digit, too big
or too small.) Therefore, using a `Result` makes sense because we want to
provide more information than simply “absence.” We want to say *why* the
parsing failed. You should try to emulate this line of reasoning when faced
with a choice between `Option` and `Result`. If you can provide detailed error
information, then you probably should. (We'll see more on this later.)

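To see that extra information in practice, the sketch below inspects the error
returned by `parse`. It assumes a toolchain of Rust 1.55 or later, where
`ParseIntError::kind` and `std::num::IntErrorKind` are stable; older toolchains
do not expose this API. The function name `describe_parse_failure` and the
labels it returns are made up for this example.

```rust
use std::num::IntErrorKind;

// Hypothetical helper: names the reason an `i32` parse failed.
fn describe_parse_failure(s: &str) -> &'static str {
    match s.parse::<i32>() {
        Ok(_) => "ok",
        Err(err) => match err.kind() {
            IntErrorKind::Empty => "empty string",
            IntErrorKind::InvalidDigit => "invalid digit",
            IntErrorKind::PosOverflow => "too big",
            IntErrorKind::NegOverflow => "too small",
            // `IntErrorKind` is non-exhaustive, so a catch-all arm is required.
            _ => "other",
        },
    }
}
```

An `Option<i32>` return type would collapse all four failure cases into a
single `None`.
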
OK, but how do we write our return type? The `parse` method as defined
above is generic over all the different number types defined in the
standard library. We could (and probably should) also make our
function generic, but let's favor explicitness for the moment. We only
care about `i32`, so we need to [find its implementation of
`FromStr`](../std/primitive.i32.html) (do a `CTRL-F` in your browser
for “FromStr”) and look at its [associated type][10] `Err`. We did
this so we can find the concrete error type. In this case, it's
[`std::num::ParseIntError`](../std/num/struct.ParseIntError.html).
Finally, we can rewrite our function:

```rust
use std::num::ParseIntError;

fn double_number(number_str: &str) -> Result<i32, ParseIntError> {
    match number_str.parse::<i32>() {
        Ok(n) => Ok(2 * n),
        Err(err) => Err(err),
    }
}

fn main() {
    match double_number("10") {
        Ok(n) => assert_eq!(n, 20),
        Err(err) => println!("Error: {:?}", err),
    }
}
```

This is a little better, but now we've written a lot more code! The case
analysis has once again bitten us.

Combinators to the rescue! Just like `Option`, `Result` has lots of combinators
defined as methods. There is a large intersection of common combinators between
`Result` and `Option`. In particular, `map` is part of that intersection:

```rust
use std::num::ParseIntError;

fn double_number(number_str: &str) -> Result<i32, ParseIntError> {
    number_str.parse::<i32>().map(|n| 2 * n)
}

fn main() {
    match double_number("10") {
        Ok(n) => assert_eq!(n, 20),
        Err(err) => println!("Error: {:?}", err),
    }
}
```

The usual suspects are all there for `Result`, including
[`unwrap_or`](../std/result/enum.Result.html#method.unwrap_or) and
[`and_then`](../std/result/enum.Result.html#method.and_then).
Additionally, since `Result` has a second type parameter, there are
combinators that affect only the error type, such as
[`map_err`](../std/result/enum.Result.html#method.map_err) (instead of
`map`) and [`or_else`](../std/result/enum.Result.html#method.or_else)
(instead of `and_then`).

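As a rough sketch of the two error-side combinators, the code below uses
`map_err` to turn a `ParseIntError` into a `String` message and `or_else` to
retry a failed parse on trimmed input. Both functions, their messages, and the
idea of retrying after trimming are illustrative assumptions rather than
anything prescribed by the standard library.

```rust
use std::num::ParseIntError;

// `map` transforms the success value; `map_err` transforms only the error.
fn double_number(number_str: &str) -> Result<i32, String> {
    number_str
        .parse::<i32>()
        .map(|n| 2 * n)
        .map_err(|err| format!("cannot double {:?}: {}", number_str, err))
}

// `or_else` runs only on `Err` and may recover by producing a new `Result`.
fn parse_lenient(number_str: &str) -> Result<i32, ParseIntError> {
    number_str
        .parse::<i32>()
        .or_else(|_| number_str.trim().parse::<i32>())
}
```

In both cases an `Ok` value passes through the error-side combinator untouched.
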
### The `Result` type alias idiom

In the standard library, you may frequently see types like
`Result<i32>`. But wait, [we defined `Result`](#code-result-def) to
have two type parameters. How can we get away with only specifying
one? The key is to define a `Result` type alias that *fixes* one of
the type parameters to a particular type. Usually the fixed type is
the error type. For example, our previous example parsing integers
could be rewritten like this:

```rust
use std::num::ParseIntError;
use std::result;

type Result<T> = result::Result<T, ParseIntError>;

fn double_number(number_str: &str) -> Result<i32> {
    unimplemented!();
}
```

Why would we do this? Well, if we have a lot of functions that could return
`ParseIntError`, then it's much more convenient to define an alias that always
uses `ParseIntError` so that we don't have to write it out all the time.

The most prominent place this idiom is used in the standard library is
with [`io::Result`](../std/io/type.Result.html). Typically, one writes
`io::Result<T>`, which makes it clear that you're using the `io`
module's type alias instead of the plain definition from
`std::result`. (This idiom is also used for
[`fmt::Result`](../std/fmt/type.Result.html).)

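For completeness, here is a filled-in sketch of the alias in use across more
than one function. The body of `double_number` reuses the `map` version from
the previous section, and `sum_numbers` is a hypothetical second function added
only to show the alias paying off.

```rust
use std::num::ParseIntError;
use std::result;

// Every function in this module that parses integers can share this alias.
type Result<T> = result::Result<T, ParseIntError>;

fn double_number(number_str: &str) -> Result<i32> {
    number_str.parse::<i32>().map(|n| 2 * n)
}

// Hypothetical second user of the alias: parses two strings and adds them.
fn sum_numbers(a: &str, b: &str) -> Result<i32> {
    a.parse::<i32>()
        .and_then(|x| b.parse::<i32>().map(|y| x + y))
}
```

The signatures stay short, and changing the error type later means editing a
single line.
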
## A brief interlude: unwrapping isn't evil

If you've been following along, you might have noticed that I've taken a pretty
hard line against calling methods like `unwrap` that could `panic` and abort
your program. *Generally speaking*, this is good advice.

However, `unwrap` can still be used judiciously. What exactly justifies use of
`unwrap` is somewhat of a grey area and reasonable people can disagree. I'll
summarize some of my *opinions* on the matter.

* **In examples and quick 'n' dirty code.** Sometimes you're writing examples
  or a quick program, and error handling simply isn't important. Beating the
  convenience of `unwrap` can be hard in such scenarios, so it is very
  appealing.
* **When panicking indicates a bug in the program.** When the invariants of
  your code should prevent a certain case from happening (like, say, popping
  from an empty stack), then panicking can be permissible. This is because it
  exposes a bug in your program. This can be explicit, like from an `assert!`
  failing, or it could be because your index into an array was out of bounds.

This is probably not an exhaustive list. Moreover, when using an
`Option`, it is often better to use its
[`expect`](../std/option/enum.Option.html#method.expect)
method. `expect` does exactly the same thing as `unwrap`, except it
prints a message you give to `expect`. This makes the resulting panic
a bit nicer to deal with, since it will show your message instead of
“called unwrap on a `None` value.”

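As a small sketch of the "panicking indicates a bug" case, the function below
uses `expect` where an invariant guarantees a value is present. The function
name `last_digit` and the message text are hypothetical; the invariant relied
on is that formatting a `u32` always yields at least one character.

```rust
// Hypothetical helper: the final decimal digit of `n`, as a character.
fn last_digit(n: u32) -> char {
    n.to_string()
        .chars()
        .last()
        // If this ever fires, the invariant above was wrong, which is a bug
        // in this code rather than a condition the caller should handle.
        .expect("a formatted u32 always has at least one digit")
}
```

The message documents the invariant at the call site, which is the main
advantage over a bare `unwrap`.
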
648 | My advice boils down to this: use good judgment. There's a reason why the words | |
649 | “never do X” or “Y is considered harmful” don't appear in my writing. There are | |
650 | trade offs to all things, and it is up to you as the programmer to determine | |
651 | what is acceptable for your use cases. My goal is only to help you evaluate | |
652 | trade offs as accurately as possible. | |
653 | ||
654 | Now that we've covered the basics of error handling in Rust, and | |
655 | explained unwrapping, let's start exploring more of the standard | |
656 | library. | |
657 | ||
658 | # Working with multiple error types | |
659 | ||
660 | Thus far, we've looked at error handling where everything was either an | |
661 | `Option<T>` or a `Result<T, SomeError>`. But what happens when you have both an | |
662 | `Option` and a `Result`? Or what if you have a `Result<T, Error1>` and a | |
663 | `Result<T, Error2>`? Handling *composition of distinct error types* is the next | |
664 | challenge in front of us, and it will be the major theme throughout the rest of | |
9cc50fc6 | 665 | this section. |
e9174d1e SL |
666 | |
667 | ## Composing `Option` and `Result` | |
668 | ||
669 | So far, I've talked about combinators defined for `Option` and combinators | |
670 | defined for `Result`. We can use these combinators to compose results of | |
671 | different computations without doing explicit case analysis. | |
672 | ||
673 | Of course, in real code, things aren't always as clean. Sometimes you have a | |
674 | mix of `Option` and `Result` types. Must we resort to explicit case analysis, | |
675 | or can we continue using combinators? | |
676 | ||
9cc50fc6 | 677 | For now, let's revisit one of the first examples in this section: |
e9174d1e SL |
678 | |
679 | ```rust,should_panic | |
680 | use std::env; | |
1a4d82fc JJ |
681 | |
682 | fn main() { | |
e9174d1e SL |
683 | let mut argv = env::args(); |
684 | let arg: String = argv.nth(1).unwrap(); // error 1 | |
685 | let n: i32 = arg.parse().unwrap(); // error 2 | |
686 | println!("{}", 2 * n); | |
1a4d82fc JJ |
687 | } |
688 | ``` | |

Given our newfound knowledge of `Option`, `Result` and their various
combinators, we should try to rewrite this so that errors are handled properly
and the program doesn't panic if there's an error.

The tricky aspect here is that `argv.nth(1)` produces an `Option` while
`arg.parse()` produces a `Result`. These aren't directly composable. When faced
with both an `Option` and a `Result`, the solution is *usually* to convert the
`Option` to a `Result`. In our case, the absence of a command line parameter
(from `env::args()`) means the user didn't invoke the program correctly. We
could use a `String` to describe the error. Let's try:

<span id="code-error-double-string"></span>

```rust
use std::env;

fn double_arg(mut argv: env::Args) -> Result<i32, String> {
    argv.nth(1)
        .ok_or("Please give at least one argument".to_owned())
        .and_then(|arg| arg.parse::<i32>().map_err(|err| err.to_string()))
        .map(|n| 2 * n)
}

fn main() {
    match double_arg(env::args()) {
        Ok(n) => println!("{}", n),
        Err(err) => println!("Error: {}", err),
    }
}
```

There are a couple of new things in this example. The first is the use of the
[`Option::ok_or`](../std/option/enum.Option.html#method.ok_or)
combinator. This is one way to convert an `Option` into a `Result`. The
conversion requires you to specify what error to use if `Option` is `None`.
Like the other combinators we've seen, its definition is very simple:

```rust
fn ok_or<T, E>(option: Option<T>, err: E) -> Result<T, E> {
    match option {
        Some(val) => Ok(val),
        None => Err(err),
    }
}
```

The other new combinator used here is
[`Result::map_err`](../std/result/enum.Result.html#method.map_err).
This is like `Result::map`, except it maps a function onto the *error*
portion of a `Result` value. If the `Result` is an `Ok(...)` value, then it is
returned unmodified.

We use `map_err` here because it is necessary for the error types to remain
the same (because of our use of `and_then`). Since we chose to convert the
`Option<String>` (from `argv.nth(1)`) to a `Result<String, String>`, we must
also convert the `ParseIntError` from `arg.parse()` to a `String`.
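
In the same spirit as the `ok_or` definition, `map_err` can be sketched as a
free function. This is a simplified stand-in written for illustration, not the
standard library's source, and the type parameter names are our own:

```rust
// A simplified, free-function sketch of `Result::map_err`.
fn map_err<T, E, F, O>(result: Result<T, E>, op: O) -> Result<T, F>
    where O: FnOnce(E) -> F
{
    match result {
        // Success values pass through untouched.
        Ok(val) => Ok(val),
        // Only the error is transformed.
        Err(err) => Err(op(err)),
    }
}
```

With this sketch, `map_err("7".parse::<i32>(), |e| e.to_string())` has type
`Result<i32, String>`, which is exactly the conversion `double_arg` relies on.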

## The limits of combinators

Doing IO and parsing input is a very common task, and it's one that I
personally have done a lot of in Rust. Therefore, we will use (and continue to
use) IO and various parsing routines to exemplify error handling.

Let's start simple. We are tasked with opening a file, reading all of its
contents and converting its contents to a number. Then we multiply it by `2`
and print the output.

Although I've tried to convince you not to use `unwrap`, it can be useful
to first write your code using `unwrap`. It allows you to focus on your problem
instead of the error handling, and it exposes the points where proper error
handling needs to occur. Let's start there so we can get a handle on the code,
and then refactor it to use better error handling.

```rust,should_panic
use std::fs::File;
use std::io::Read;
use std::path::Path;

fn file_double<P: AsRef<Path>>(file_path: P) -> i32 {
    let mut file = File::open(file_path).unwrap(); // error 1
    let mut contents = String::new();
    file.read_to_string(&mut contents).unwrap(); // error 2
    let n: i32 = contents.trim().parse().unwrap(); // error 3
    2 * n
}

fn main() {
    let doubled = file_double("foobar");
    println!("{}", doubled);
}
```

(N.B. The `AsRef<Path>` is used because those are the
[same bounds used on
`std::fs::File::open`](../std/fs/struct.File.html#method.open).
This makes it ergonomic to use any kind of string as a file path.)

There are three different errors that can occur here:

1. A problem opening the file.
2. A problem reading data from the file.
3. A problem parsing the data as a number.

The first two problems are described via the
[`std::io::Error`](../std/io/struct.Error.html) type. We know this
because of the return types of
[`std::fs::File::open`](../std/fs/struct.File.html#method.open) and
[`std::io::Read::read_to_string`](../std/io/trait.Read.html#method.read_to_string).
(Note that they both use the [`Result` type alias
idiom](#the-result-type-alias-idiom) described previously. If you
click on the `Result` type, you'll [see the type
alias](../std/io/type.Result.html), and consequently, the underlying
`io::Error` type.) The third problem is described by the
[`std::num::ParseIntError`](../std/num/struct.ParseIntError.html)
type. The `io::Error` type in particular is *pervasive* throughout the
standard library. You will see it again and again.

Let's start the process of refactoring the `file_double` function. To make this
function composable with other components of the program, it should *not* panic
if any of the above error conditions are met. Effectively, this means that the
function should *return an error* if any of its operations fail. Our problem is
that the return type of `file_double` is `i32`, which does not give us any
useful way of reporting an error. Thus, we must start by changing the return
type from `i32` to something else.

The first thing we need to decide: should we use `Option` or `Result`? We
certainly could use `Option` very easily. If any of the three errors occur, we
could simply return `None`. This will work *and it is better than panicking*,
but we can do a lot better. Instead, we should pass some detail about the error
that occurred. Since we want to express the *possibility of error*, we should
use `Result<i32, E>`. But what should `E` be? Since two *different* types of
errors can occur, we need to convert them to a common type. One such type is
`String`. Let's see how that impacts our code:

```rust
use std::fs::File;
use std::io::Read;
use std::path::Path;

fn file_double<P: AsRef<Path>>(file_path: P) -> Result<i32, String> {
    File::open(file_path)
        .map_err(|err| err.to_string())
        .and_then(|mut file| {
            let mut contents = String::new();
            file.read_to_string(&mut contents)
                .map_err(|err| err.to_string())
                .map(|_| contents)
        })
        .and_then(|contents| {
            contents.trim().parse::<i32>()
                .map_err(|err| err.to_string())
        })
        .map(|n| 2 * n)
}

fn main() {
    match file_double("foobar") {
        Ok(n) => println!("{}", n),
        Err(err) => println!("Error: {}", err),
    }
}
```

This code looks a bit hairy. It can take quite a bit of practice before code
like this becomes easy to write. The way we write it is by *following the
types*. As soon as we changed the return type of `file_double` to
`Result<i32, String>`, we had to start looking for the right combinators. In
this case, we only used three different combinators: `and_then`, `map` and
`map_err`.

`and_then` is used to chain multiple computations where each computation could
return an error. After opening the file, there are two more computations that
could fail: reading from the file and parsing the contents as a number.
Correspondingly, there are two calls to `and_then`.

`map` is used to apply a function to the `Ok(...)` value of a `Result`. For
example, the very last call to `map` multiplies the `Ok(...)` value (which is
an `i32`) by `2`. If an error had occurred before that point, this operation
would have been skipped because of how `map` is defined.

`map_err` is the trick that makes all of this work. `map_err` is like
`map`, except it applies a function to the `Err(...)` value of a `Result`. In
this case, we want to convert all of our errors to one type: `String`. Since
both `io::Error` and `num::ParseIntError` implement `ToString`, we can call the
`to_string()` method to convert them.
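
To watch the three combinators cooperate without touching the file system,
here is a sketch of the same chain over any `Read` value. The `reader_double`
name and the choice to be generic over `Read` are assumptions made for this
example. Byte slices implement `Read`, so they make convenient test input:

```rust
use std::io::Read;

// Same shape as `file_double`, but reading from any `Read` value.
fn reader_double<R: Read>(mut reader: R) -> Result<i32, String> {
    let mut contents = String::new();
    reader.read_to_string(&mut contents)
        // Convert an `io::Error` into a `String`.
        .map_err(|err| err.to_string())
        // Only runs if reading succeeded; parsing may fail in turn.
        .and_then(|_| {
            contents.trim().parse::<i32>()
                .map_err(|err| err.to_string())
        })
        // Only runs if both previous steps succeeded.
        .map(|n| 2 * n)
}
```

Here `reader_double(&b"21\n"[..])` yields `Ok(42)`, while input that isn't a
number yields an `Err` holding the parse error's message, and the final `map`
is skipped.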

With all of that said, the code is still hairy. Mastering use of combinators is
important, but they have their limits. Let's try a different approach: early
returns.

## Early returns

I'd like to take the code from the previous section and rewrite it using *early
returns*. Early returns let you exit the function early. We can't return early
from `file_double` while inside a closure, so we'll need to go back to
explicit case analysis.

```rust
use std::fs::File;
use std::io::Read;
use std::path::Path;

fn file_double<P: AsRef<Path>>(file_path: P) -> Result<i32, String> {
    let mut file = match File::open(file_path) {
        Ok(file) => file,
        Err(err) => return Err(err.to_string()),
    };
    let mut contents = String::new();
    if let Err(err) = file.read_to_string(&mut contents) {
        return Err(err.to_string());
    }
    let n: i32 = match contents.trim().parse() {
        Ok(n) => n,
        Err(err) => return Err(err.to_string()),
    };
    Ok(2 * n)
}

fn main() {
    match file_double("foobar") {
        Ok(n) => println!("{}", n),
        Err(err) => println!("Error: {}", err),
    }
}
```

Reasonable people can disagree over whether this code is better than the code
that uses combinators, but if you aren't familiar with the combinator approach,
this version is likely simpler to read. It uses explicit case analysis with
`match` and `if let`. If an error occurs, it simply stops executing the
function and returns the error (by converting it to a string).

Isn't this a step backwards though? Previously, we said that the key to
ergonomic error handling is reducing explicit case analysis, yet we've gone
back to explicit case analysis here. It turns out, there are *multiple* ways to
reduce explicit case analysis. Combinators aren't the only way.

## The `try!` macro

A cornerstone of error handling in Rust is the `try!` macro. The `try!` macro
abstracts case analysis like combinators, but unlike combinators, it also
abstracts *control flow*. Namely, it can abstract the *early return* pattern
seen above.

Here is a simplified definition of a `try!` macro:

<span id="code-try-def-simple"></span>

```rust
macro_rules! try {
    ($e:expr) => (match $e {
        Ok(val) => val,
        Err(err) => return Err(err),
    });
}
```

(The [real definition](../std/macro.try.html) is a bit more
sophisticated. We will address that later.)
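
To make the expansion concrete, here is a sketch of what a single use of the
simplified macro amounts to when written out by hand. The `str_double` function
is hypothetical and exists only for this illustration:

```rust
use std::num::ParseIntError;

// Hypothetical example: the `match` below is what
// `let n = try!(s.trim().parse::<i32>());` amounts to under the simplified
// definition of the macro.
fn str_double(s: &str) -> Result<i32, ParseIntError> {
    let n = match s.trim().parse::<i32>() {
        Ok(val) => val,
        // The early return: the rest of the function is skipped.
        Err(err) => return Err(err),
    };
    Ok(2 * n)
}
```

Note that the error is returned unchanged, so the function's error type has to
match the error type of the expression exactly.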

Using the `try!` macro makes it very easy to simplify our last example. Since
it does the case analysis and the early return for us, we get tighter code that
is easier to read:

```rust
use std::fs::File;
use std::io::Read;
use std::path::Path;

fn file_double<P: AsRef<Path>>(file_path: P) -> Result<i32, String> {
    let mut file = try!(File::open(file_path).map_err(|e| e.to_string()));
    let mut contents = String::new();
    try!(file.read_to_string(&mut contents).map_err(|e| e.to_string()));
    let n = try!(contents.trim().parse::<i32>().map_err(|e| e.to_string()));
    Ok(2 * n)
}

fn main() {
    match file_double("foobar") {
        Ok(n) => println!("{}", n),
        Err(err) => println!("Error: {}", err),
    }
}
```

The `map_err` calls are still necessary given
[our definition of `try!`](#code-try-def-simple).
This is because the error types still need to be converted to `String`.
The good news is that we will soon learn how to remove those `map_err` calls!
The bad news is that we will need to learn a bit more about a couple of
important traits in the standard library before we can remove the `map_err`
calls.

## Defining your own error type

Before we dive into some of the standard library error traits, I'd like to wrap
up this section by removing the use of `String` as our error type in the
previous examples.

Using `String` as we did in our previous examples is convenient because it's
easy to convert errors to strings, or even make up your own errors as strings
on the spot. However, using `String` for your errors has some downsides.

The first downside is that the error messages tend to clutter your
code. It's possible to define the error messages elsewhere, but unless
you're unusually disciplined, it is very tempting to embed the error
message into your code. Indeed, we did exactly this in a [previous
example](#code-error-double-string).

The second and more important downside is that `String`s are *lossy*. That is,
if all errors are converted to strings, then the errors we pass to the caller
become completely opaque. The only reasonable thing the caller can do with a
`String` error is show it to the user. Certainly, inspecting the string to
determine the type of error is not robust. (Admittedly, this downside is far
more important inside of a library as opposed to, say, an application.)

For example, the `io::Error` type embeds an
[`io::ErrorKind`](../std/io/enum.ErrorKind.html),
which is *structured data* that represents what went wrong during an IO
operation. This is important because you might want to react differently
depending on the error. (For example, a `BrokenPipe` error might mean quitting
your program gracefully while a `NotFound` error might mean exiting with an
error code and showing an error to the user.) With `io::ErrorKind`, the caller
can examine the type of an error with case analysis, which is strictly superior
to trying to tease out the details of an error inside of a `String`.
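
Here is a minimal sketch of that kind of case analysis. The `exit_code_for`
function and the specific exit codes are assumptions made up for this example;
only `io::Error::kind` and the `ErrorKind` variants come from the standard
library:

```rust
use std::io;

// Hypothetical policy function: pick an exit code based on the kind of error.
fn exit_code_for(err: &io::Error) -> i32 {
    match err.kind() {
        // The reader of our output went away; quit gracefully.
        io::ErrorKind::BrokenPipe => 0,
        // The file doesn't exist; report a distinct failure code.
        io::ErrorKind::NotFound => 2,
        // Anything else is a generic failure.
        _ => 1,
    }
}
```

None of this would be possible if the error had already been flattened into a
`String`.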

Instead of using a `String` as an error type in our previous example of reading
an integer from a file, we can define our own error type that represents errors
with *structured data*. We endeavor to not drop information from underlying
errors in case the caller wants to inspect the details.

The ideal way to represent *one of many possibilities* is to define our own
sum type using `enum`. In our case, an error is either an `io::Error` or a
`num::ParseIntError`, so a natural definition arises:

```rust
use std::io;
use std::num;

// We derive `Debug` because all types should probably derive `Debug`.
// This gives us a reasonable human readable description of `CliError` values.
#[derive(Debug)]
enum CliError {
    Io(io::Error),
    Parse(num::ParseIntError),
}
```

Tweaking our code is very easy. Instead of converting errors to strings, we
simply convert them to our `CliError` type using the corresponding value
constructor:

```rust
# #[derive(Debug)]
# enum CliError { Io(::std::io::Error), Parse(::std::num::ParseIntError) }
use std::fs::File;
use std::io::Read;
use std::path::Path;

fn file_double<P: AsRef<Path>>(file_path: P) -> Result<i32, CliError> {
    let mut file = try!(File::open(file_path).map_err(CliError::Io));
    let mut contents = String::new();
    try!(file.read_to_string(&mut contents).map_err(CliError::Io));
    let n: i32 = try!(contents.trim().parse().map_err(CliError::Parse));
    Ok(2 * n)
}

fn main() {
    match file_double("foobar") {
        Ok(n) => println!("{}", n),
        Err(err) => println!("Error: {:?}", err),
    }
}
```

The only change here is switching `map_err(|e| e.to_string())` (which converts
errors to strings) to `map_err(CliError::Io)` or `map_err(CliError::Parse)`.
The *caller* gets to decide the level of detail to report to the user. In
effect, using a `String` as an error type removes choices from the caller while
using a custom `enum` error type like `CliError` gives the caller all of the
same conveniences as before, in addition to *structured data* describing the
error.
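
As a sketch of what "the caller gets to decide" can look like, here is a
hypothetical `user_message` function that inspects a `CliError` and picks the
wording itself. The function and its messages are invented for this example;
the `CliError` definition is repeated so the snippet stands alone:

```rust
use std::io;
use std::num;

#[derive(Debug)]
enum CliError {
    Io(io::Error),
    Parse(num::ParseIntError),
}

// Hypothetical caller-side policy: choose a message per kind of failure.
fn user_message(err: &CliError) -> String {
    match *err {
        // The structured `io::Error` is still available for inspection.
        CliError::Io(ref e) if e.kind() == io::ErrorKind::NotFound => {
            "the input file does not exist".to_owned()
        }
        CliError::Io(ref e) => format!("could not read the input file: {}", e),
        CliError::Parse(_) => {
            "the input file must contain a single integer".to_owned()
        }
    }
}
```

A caller that only wants something printable can still fall back on the
`Debug` output, as `main` does above.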

A rule of thumb is to define your own error type, but a `String` error type
will do in a pinch, particularly if you're writing an application. If you're
writing a library, defining your own error type should be strongly preferred so
that you don't remove choices from the caller unnecessarily.

# Standard library traits used for error handling

The standard library defines two integral traits for error handling:
[`std::error::Error`](../std/error/trait.Error.html) and
[`std::convert::From`](../std/convert/trait.From.html). While `Error`
is designed specifically for generically describing errors, the `From`
trait serves a more general role for converting values between two
distinct types.

## The `Error` trait

The `Error` trait is [defined in the standard
library](../std/error/trait.Error.html):

```rust
use std::fmt::{Debug, Display};

trait Error: Debug + Display {
    /// A short description of the error.
    fn description(&self) -> &str;

    /// The lower level cause of this error, if any.
    fn cause(&self) -> Option<&Error> { None }
}
```

This trait is super generic because it is meant to be implemented for *all*
types that represent errors. This will prove useful for writing composable code
as we'll see later. Otherwise, the trait allows you to do at least the
following things:

* Obtain a `Debug` representation of the error.
* Obtain a user-facing `Display` representation of the error.
* Obtain a short description of the error (via the `description` method).
* Inspect the causal chain of an error, if one exists (via the `cause` method).

The first two are a result of `Error` requiring impls for both `Debug` and
`Display`. The latter two are from the two methods defined on `Error`. The
power of `Error` comes from the fact that all error types impl `Error`, which
means errors can be existentially quantified as a
[trait object](../book/trait-objects.html).
This manifests as either `Box<Error>` or `&Error`. Indeed, the `cause` method
returns an `&Error`, which is itself a trait object. We'll revisit the
`Error` trait's utility as a trait object later.
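
As a small preview, here is a sketch of a function that accepts *any* error
through a trait object and relies only on the `Display` impl that `Error`
guarantees. The `report` name is made up for this example, and the snippet
spells the trait object `&dyn Error`, which is what current compilers expect
where the prose above writes `&Error`:

```rust
use std::error::Error;

// Works for every type that impls `Error`, because the argument is a trait
// object rather than a concrete error type.
fn report(err: &dyn Error) -> String {
    // `Error` requires `Display`, so `{}` is always available.
    format!("error: {}", err)
}
```

Both a `&io::Error` and a `&num::ParseIntError` can be passed to `report`
without any explicit conversion.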

For now, it suffices to show an example implementing the `Error` trait. Let's
use the error type we defined in the
[previous section](#defining-your-own-error-type):

```rust
use std::io;
use std::num;

// We derive `Debug` because all types should probably derive `Debug`.
// This gives us a reasonable human readable description of `CliError` values.
#[derive(Debug)]
enum CliError {
    Io(io::Error),
    Parse(num::ParseIntError),
}
```

This particular error type represents the possibility of two types of errors
occurring: an error dealing with I/O or an error converting a string to a
number. The error could represent as many error types as you want by adding new
variants to the `enum` definition.

Implementing `Error` is pretty straightforward. It's mostly going to be a lot
of explicit case analysis.

```rust,ignore
use std::error;
use std::fmt;

impl fmt::Display for CliError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match *self {
            // Both underlying errors already impl `Display`, so we defer to
            // their implementations.
            CliError::Io(ref err) => write!(f, "IO error: {}", err),
            CliError::Parse(ref err) => write!(f, "Parse error: {}", err),
        }
    }
}

impl error::Error for CliError {
    fn description(&self) -> &str {
        // Both underlying errors already impl `Error`, so we defer to their
        // implementations.
        match *self {
            CliError::Io(ref err) => err.description(),
            CliError::Parse(ref err) => err.description(),
        }
    }

    fn cause(&self) -> Option<&error::Error> {
        match *self {
            // N.B. Both of these implicitly cast `err` from their concrete
            // types (either `&io::Error` or `&num::ParseIntError`)
            // to a trait object `&Error`. This works because both error types
            // implement `Error`.
            CliError::Io(ref err) => Some(err),
            CliError::Parse(ref err) => Some(err),
        }
    }
}
```

We note that this is a very typical implementation of `Error`: match on your
different error types and satisfy the contracts defined for `description` and
`cause`.

## The `From` trait

The `std::convert::From` trait is
[defined in the standard
library](../std/convert/trait.From.html):

<span id="code-from-def"></span>

```rust
trait From<T> {
    fn from(T) -> Self;
}
```

Deliciously simple, yes? `From` is very useful because it gives us a generic
way to talk about conversion *from* a particular type `T` to some other type
(in this case, “some other type” is the subject of the impl, or `Self`).
The crux of `From` is the
[set of implementations provided by the standard
library](../std/convert/trait.From.html).

Here are a few simple examples demonstrating how `From` works:

```rust
let string: String = From::from("foo");
let bytes: Vec<u8> = From::from("foo");
let cow: ::std::borrow::Cow<str> = From::from("foo");
```

OK, so `From` is useful for converting between strings. But what about errors?
It turns out, there is one critical impl:

```rust,ignore
impl<'a, E: Error + 'a> From<E> for Box<Error + 'a>
```

This impl says that for *any* type that impls `Error`, we can convert it to a
trait object `Box<Error>`. This may not seem terribly surprising, but it is
useful in a generic context.

Remember the two errors we were dealing with previously? Specifically,
`io::Error` and `num::ParseIntError`. Since both impl `Error`, they work with
`From`:

```rust
use std::error::Error;
use std::fs;
use std::io;
use std::num;

// We have to jump through some hoops to actually get error values.
let io_err: io::Error = io::Error::last_os_error();
let parse_err: num::ParseIntError = "not a number".parse::<i32>().unwrap_err();

// OK, here are the conversions.
let err1: Box<Error> = From::from(io_err);
let err2: Box<Error> = From::from(parse_err);
```

There is a really important pattern to recognize here. Both `err1` and `err2`
have the *same type*. This is because they are existentially quantified types,
or trait objects. In particular, their underlying type is *erased* from the
compiler's knowledge, so it truly sees `err1` and `err2` as exactly the same.
Additionally, we constructed `err1` and `err2` using precisely the same
function call: `From::from`. This is because `From::from` is overloaded on both
its argument and its return type.

This pattern is important because it solves a problem we had earlier: it gives
us a way to reliably convert errors to the same type using the same function.
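
One way to convince yourself that the two boxed errors really share a type is
to put them in the same `Vec`. The sketch below does that in a hypothetical
`mixed_errors` function; the error message is invented, and the trait object is
written `Box<dyn Error>`, which is the spelling current compilers expect for
the `Box<Error>` used in the prose:

```rust
use std::error::Error;
use std::io;

// Hypothetical example: two different concrete errors, one element type.
fn mixed_errors() -> Vec<Box<dyn Error>> {
    let io_err = io::Error::new(io::ErrorKind::Other, "disk on fire");
    let parse_err = "not a number".parse::<i32>().unwrap_err();

    // The same function call performs both conversions.
    let first: Box<dyn Error> = From::from(io_err);
    let second: Box<dyn Error> = From::from(parse_err);

    // This only type checks because `first` and `second` have the same type.
    vec![first, second]
}
```

A `Vec` holds values of exactly one type, so this compiles only because the
concrete error types have been erased.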

Time to revisit an old friend: the `try!` macro.

## The real `try!` macro

Previously, we presented this definition of `try!`:

```rust
macro_rules! try {
    ($e:expr) => (match $e {
        Ok(val) => val,
        Err(err) => return Err(err),
    });
}
```

This is not its real definition. Its real definition is
[in the standard library](../std/macro.try.html):

<span id="code-try-def"></span>

```rust
macro_rules! try {
    ($e:expr) => (match $e {
        Ok(val) => val,
        Err(err) => return Err(::std::convert::From::from(err)),
    });
}
```

There's one tiny but powerful change: the error value is passed through
`From::from`. This makes the `try!` macro a lot more powerful because it gives
you automatic type conversion for free.
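
Written out by hand, one use of the real macro looks like the sketch below.
The `str_double_boxed` function is hypothetical, and `Box<dyn Error>` is the
current spelling of the `Box<Error>` trait object discussed earlier:

```rust
use std::error::Error;

// Hypothetical example: a hand-expanded `try!(s.trim().parse::<i32>())`.
fn str_double_boxed(s: &str) -> Result<i32, Box<dyn Error>> {
    let n = match s.trim().parse::<i32>() {
        Ok(val) => val,
        // `From::from` turns the `ParseIntError` into the `Box<dyn Error>`
        // demanded by the function's return type.
        Err(err) => return Err(From::from(err)),
    };
    Ok(2 * n)
}
```

The target of the conversion is never named in the body. It is inferred from
the function's return type, which is why changing the signature is enough to
change what errors get converted to.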

Armed with our more powerful `try!` macro, let's take a look at code we wrote
previously to read a file and convert its contents to an integer:

```rust
use std::fs::File;
use std::io::Read;
use std::path::Path;

fn file_double<P: AsRef<Path>>(file_path: P) -> Result<i32, String> {
    let mut file = try!(File::open(file_path).map_err(|e| e.to_string()));
    let mut contents = String::new();
    try!(file.read_to_string(&mut contents).map_err(|e| e.to_string()));
    let n = try!(contents.trim().parse::<i32>().map_err(|e| e.to_string()));
    Ok(2 * n)
}
```

Earlier, we promised that we could get rid of the `map_err` calls. Indeed, all
we have to do is pick a type that `From` works with. As we saw in the previous
section, `From` has an impl that lets it convert any error type into a
`Box<Error>`:

```rust
use std::error::Error;
use std::fs::File;
use std::io::Read;
use std::path::Path;

fn file_double<P: AsRef<Path>>(file_path: P) -> Result<i32, Box<Error>> {
    let mut file = try!(File::open(file_path));
    let mut contents = String::new();
    try!(file.read_to_string(&mut contents));
    let n = try!(contents.trim().parse::<i32>());
    Ok(2 * n)
}
```
c34b1796 | 1327 | |
e9174d1e SL |
1328 | We are getting very close to ideal error handling. Our code has very little |
1329 | overhead as a result from error handling because the `try!` macro encapsulates | |
1330 | three things simultaneously: | |
c34b1796 | 1331 | |
e9174d1e SL |
1332 | 1. Case analysis. |
1333 | 2. Control flow. | |
1334 | 3. Error type conversion. | |
1335 | ||
1336 | When all three things are combined, we get code that is unencumbered by | |
1337 | combinators, calls to `unwrap` or case analysis. | |
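
As a rough illustration of those three pieces, here is what one `try!` call amounts to when written out by hand. This is a sketch, not the macro's literal expansion: `double_str` is a hypothetical helper, and `Box<dyn Error>` is how newer compilers spell `Box<Error>`.

```rust
use std::error::Error;

// Hypothetical helper: parses a string as an integer and doubles it.
fn double_str(s: &str) -> Result<i32, Box<dyn Error>> {
    // 1. Case analysis: look at both variants of the `Result`.
    let n = match s.trim().parse::<i32>() {
        Ok(n) => n,
        // 2. Control flow: return early from the function on error.
        // 3. Error type conversion: `From::from` turns the
        //    `ParseIntError` into a `Box<dyn Error>`.
        Err(err) => return Err(From::from(err)),
    };
    Ok(2 * n)
}
```

Every fallible call in a function would need its own copy of that `match`, which is exactly the repetition the macro removes.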
1338 | ||
1339 | There's one little nit left: the `Box<Error>` type is *opaque*. If we | |
1340 | return a `Box<Error>` to the caller, the caller can't (easily) inspect the | |
1341 | underlying error type. The situation is certainly better than `String` | |
1342 | because the caller can call methods like | |
1343 | [`description`](../std/error/trait.Error.html#tymethod.description) | |
1344 | and [`cause`](../std/error/trait.Error.html#method.cause), but the | |
1345 | limitation remains: `Box<Error>` is opaque. (N.B. This isn't entirely | |
1346 | true because Rust does have runtime reflection, which is useful in | |
1347 | some scenarios that are [beyond the scope of this | |
9cc50fc6 | 1348 | section](https://crates.io/crates/error).) |
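
For the curious, the reflection mentioned in the parenthetical can look something like the following. This is only a sketch under a few assumptions: `parse_boxed` and `is_parse_error` are hypothetical helpers, and the code uses `Box<dyn Error>` and the `?` operator, which newer compilers provide in place of `Box<Error>` and `try!`.

```rust
use std::error::Error;
use std::num::ParseIntError;

// Hypothetical helper that erases the concrete error type.
fn parse_boxed(s: &str) -> Result<i32, Box<dyn Error>> {
    let n = s.parse::<i32>()?;
    Ok(n)
}

// Asks at runtime whether the erased error is really a `ParseIntError`.
fn is_parse_error(err: &(dyn Error + 'static)) -> bool {
    err.downcast_ref::<ParseIntError>().is_some()
}
```

The caller has to guess which concrete types to try, which is why we still call the type opaque.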
e9174d1e SL |
1349 | |
1350 | It's time to revisit our custom `CliError` type and tie everything together. | |
1351 | ||
1352 | ## Composing custom error types | |
1353 | ||
1354 | In the last section, we looked at the real `try!` macro and how it does | |
1355 | automatic type conversion for us by calling `From::from` on the error value. | |
1356 | In particular, we converted errors to `Box<Error>`, which works, but the type | |
1357 | is opaque to callers. | |
1358 | ||
1359 | To fix this, we use the same remedy that we're already familiar with: a custom | |
1360 | error type. Once again, here is the code that reads the contents of a file and | |
1361 | converts it to an integer: | |
c34b1796 AL |
1362 | |
1363 | ```rust | |
1364 | use std::fs::File; | |
e9174d1e SL |
1365 | use std::io::{self, Read}; |
1366 | use std::num; | |
1367 | use std::path::Path; | |
c34b1796 | 1368 | |
e9174d1e SL |
1369 | // We derive `Debug` because all types should probably derive `Debug`. |
1370 | // This gives us a reasonable human readable description of `CliError` values. | |
1371 | #[derive(Debug)] | |
1372 | enum CliError { | |
1373 | Io(io::Error), | |
1374 | Parse(num::ParseIntError), | |
c34b1796 AL |
1375 | } |
1376 | ||
e9174d1e SL |
1377 | fn file_double_verbose<P: AsRef<Path>>(file_path: P) -> Result<i32, CliError> { |
1378 | let mut file = try!(File::open(file_path).map_err(CliError::Io)); | |
1379 | let mut contents = String::new(); | |
1380 | try!(file.read_to_string(&mut contents).map_err(CliError::Io)); | |
1381 | let n: i32 = try!(contents.trim().parse().map_err(CliError::Parse)); | |
1382 | Ok(2 * n) | |
1383 | } | |
1384 | ``` | |
c34b1796 | 1385 | |
e9174d1e SL |
1386 | Notice that we still have the calls to `map_err`. Why? Well, recall the |
1387 | definitions of [`try!`](#code-try-def) and [`From`](#code-from-def). The | |
1388 | problem is that there is no `From` impl that allows us to convert from error | |
1389 | types like `io::Error` and `num::ParseIntError` to our own custom `CliError`. | |
1390 | Of course, it is easy to fix this! Since we defined `CliError`, we can impl | |
1391 | `From` with it: | |
1392 | ||
1393 | ```rust | |
1394 | # #[derive(Debug)] | |
1395 | # enum CliError { Io(io::Error), Parse(num::ParseIntError) } | |
1396 | use std::io; | |
1397 | use std::num; | |
1398 | ||
1399 | impl From<io::Error> for CliError { | |
1400 | fn from(err: io::Error) -> CliError { | |
1401 | CliError::Io(err) | |
c34b1796 | 1402 | } |
e9174d1e | 1403 | } |
c34b1796 | 1404 | |
e9174d1e SL |
1405 | impl From<num::ParseIntError> for CliError { |
1406 | fn from(err: num::ParseIntError) -> CliError { | |
1407 | CliError::Parse(err) | |
1408 | } | |
c34b1796 AL |
1409 | } |
1410 | ``` | |
1411 | ||
e9174d1e SL |
1412 | All these impls are doing is teaching `From` how to create a `CliError` from |
1413 | other error types. In our case, construction is as simple as invoking the | |
1414 | corresponding value constructor. Indeed, it is *typically* this easy. | |
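
To see that there is nothing magical going on, we can call the conversions ourselves. The sketch below repeats the `CliError` definition so that it stands alone; `variant_name` is a hypothetical helper added only to make the result observable.

```rust
use std::io;
use std::num;

#[derive(Debug)]
enum CliError {
    Io(io::Error),
    Parse(num::ParseIntError),
}

impl From<io::Error> for CliError {
    fn from(err: io::Error) -> CliError {
        CliError::Io(err)
    }
}

impl From<num::ParseIntError> for CliError {
    fn from(err: num::ParseIntError) -> CliError {
        CliError::Parse(err)
    }
}

// Hypothetical helper: reports which variant a `CliError` holds.
fn variant_name(err: &CliError) -> &'static str {
    match *err {
        CliError::Io(_) => "Io",
        CliError::Parse(_) => "Parse",
    }
}
```

Both `CliError::from(err)` and `err.into()` go through these impls, and the second form works because every `From` impl also provides the matching `Into`.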
1415 | ||
1416 | We can finally rewrite `file_double`: | |
c34b1796 AL |
1417 | |
1418 | ```rust | |
e9174d1e SL |
1419 | # use std::io; |
1420 | # use std::num; | |
1421 | # enum CliError { Io(::std::io::Error), Parse(::std::num::ParseIntError) } | |
1422 | # impl From<io::Error> for CliError { | |
1423 | # fn from(err: io::Error) -> CliError { CliError::Io(err) } | |
1424 | # } | |
1425 | # impl From<num::ParseIntError> for CliError { | |
1426 | # fn from(err: num::ParseIntError) -> CliError { CliError::Parse(err) } | |
1427 | # } | |
1428 | ||
c34b1796 | 1429 | use std::fs::File; |
e9174d1e SL |
1430 | use std::io::Read; |
1431 | use std::path::Path; | |
1432 | ||
1433 | fn file_double<P: AsRef<Path>>(file_path: P) -> Result<i32, CliError> { | |
1434 | let mut file = try!(File::open(file_path)); | |
1435 | let mut contents = String::new(); | |
1436 | try!(file.read_to_string(&mut contents)); | |
1437 | let n: i32 = try!(contents.trim().parse()); | |
1438 | Ok(2 * n) | |
1439 | } | |
1440 | ``` | |
1441 | ||
1442 | The only thing we did here was remove the calls to `map_err`. They are no | |
1443 | longer needed because the `try!` macro invokes `From::from` on the error value. | |
1444 | This works because we've provided `From` impls for all the error types that | |
1445 | could appear. | |
1446 | ||
1447 | If we modified our `file_double` function to perform some other operation, say, | |
1448 | convert a string to a float, then we'd need to add a new variant to our error | |
1449 | type: | |
1450 | ||
1451 | ```rust | |
c34b1796 | 1452 | use std::io; |
e9174d1e SL |
1453 | use std::num; |
1454 | ||
1455 | enum CliError { | |
1456 | Io(io::Error), | |
1457 | ParseInt(num::ParseIntError), | |
1458 | ParseFloat(num::ParseFloatError), | |
1459 | } | |
1460 | ``` | |
1461 | ||
1462 | And add a new `From` impl: | |
1463 | ||
1464 | ```rust | |
1465 | # enum CliError { | |
1466 | # Io(::std::io::Error), | |
1467 | # ParseInt(num::ParseIntError), | |
1468 | # ParseFloat(num::ParseFloatError), | |
1469 | # } | |
1470 | ||
1471 | use std::num; | |
1472 | ||
1473 | impl From<num::ParseFloatError> for CliError { | |
1474 | fn from(err: num::ParseFloatError) -> CliError { | |
1475 | CliError::ParseFloat(err) | |
1476 | } | |
1477 | } | |
1478 | ``` | |
1479 | ||
1480 | And that's it! | |
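
Here is a small sketch of a function that exercises both parse variants. It is not part of the running example: `scale` is a hypothetical function, the `Io` variant is left out to keep the block short, and the `?` operator is what newer compilers provide in place of `try!`.

```rust
use std::num;

#[derive(Debug)]
enum CliError {
    ParseInt(num::ParseIntError),
    ParseFloat(num::ParseFloatError),
}

impl From<num::ParseIntError> for CliError {
    fn from(err: num::ParseIntError) -> CliError {
        CliError::ParseInt(err)
    }
}

impl From<num::ParseFloatError> for CliError {
    fn from(err: num::ParseFloatError) -> CliError {
        CliError::ParseFloat(err)
    }
}

// Hypothetical function: multiplies an integer count by a float factor,
// both given as strings. Each `?` picks the matching `From` impl.
fn scale(count: &str, factor: &str) -> Result<f64, CliError> {
    let count: i32 = count.trim().parse()?;
    let factor: f64 = factor.trim().parse()?;
    Ok(f64::from(count) * factor)
}
```

The body of `scale` never mentions `CliError` variants; the conversion is chosen entirely by the type of the error that occurs.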
1481 | ||
1482 | ## Advice for library writers | |
1483 | ||
1484 | If your library needs to report custom errors, then you should | |
1485 | probably define your own error type. It's up to you whether or not to | |
1486 | expose its representation (like | |
1487 | [`ErrorKind`](../std/io/enum.ErrorKind.html)) or keep it hidden (like | |
1488 | [`ParseIntError`](../std/num/struct.ParseIntError.html)). Regardless | |
1489 | of how you do it, it's usually good practice to at least provide some | |
9cc50fc6 | 1490 | information about the error beyond its `String` |
e9174d1e SL |
1491 | representation. But certainly, this will vary depending on use cases. |
1492 | ||
1493 | At a minimum, you should probably implement the | |
1494 | [`Error`](../std/error/trait.Error.html) | |
1495 | trait. This will give users of your library some minimum flexibility for | |
1496 | [composing errors](#the-real-try-macro). Implementing the `Error` trait also | |
1497 | means that users are guaranteed the ability to obtain a string representation | |
1498 | of an error (because it requires impls for both `fmt::Debug` and | |
1499 | `fmt::Display`). | |
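
As a sketch of what that minimum might look like, here is an error type that keeps its representation hidden, in the style of `ParseIntError`. Everything in it is hypothetical (`ConfigError`, `ConfigErrorKind` and `lookup` are made-up names), and it assumes a compiler recent enough that `Error` needs no required methods.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical library error. The private field keeps the representation
// hidden, so variants can change without breaking users.
#[derive(Debug)]
pub struct ConfigError {
    kind: ConfigErrorKind,
}

#[derive(Debug)]
enum ConfigErrorKind {
    MissingKey(String),
    Empty,
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self.kind {
            ConfigErrorKind::MissingKey(ref key) => write!(f, "missing key: {}", key),
            ConfigErrorKind::Empty => write!(f, "configuration is empty"),
        }
    }
}

// `Error` requires `Debug` and `Display`; the provided methods cover the rest.
impl Error for ConfigError {}

// Hypothetical library function that reports failures with `ConfigError`.
pub fn lookup<'a>(config: &[(&'a str, &'a str)], key: &str) -> Result<&'a str, ConfigError> {
    if config.is_empty() {
        return Err(ConfigError { kind: ConfigErrorKind::Empty });
    }
    config
        .iter()
        .find(|&&(k, _)| k == key)
        .map(|&(_, v)| v)
        .ok_or_else(|| ConfigError { kind: ConfigErrorKind::MissingKey(key.to_string()) })
}
```

Users can print the error or box it, but they can't match on its variants. Whether that trade-off is right depends on how much your callers need to react to specific failures.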
1500 | ||
1501 | Beyond that, it can also be useful to provide implementations of `From` on your | |
1502 | error types. This allows you (the library author) and your users to | |
1503 | [compose more detailed errors](#composing-custom-error-types). For example, | |
1504 | [`csv::Error`](http://burntsushi.net/rustdoc/csv/enum.Error.html) | |
1505 | provides `From` impls for both `io::Error` and `byteorder::Error`. | |
1506 | ||
1507 | Finally, depending on your tastes, you may also want to define a | |
1508 | [`Result` type alias](#the-result-type-alias-idiom), particularly if your | |
1509 | library defines a single error type. This is used in the standard library | |
1510 | for [`io::Result`](../std/io/type.Result.html) | |
1511 | and [`fmt::Result`](../std/fmt/type.Result.html). | |
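
A library following that idiom could be laid out roughly like this. The module and its contents (`linenum`, `Error`, `parse`) are hypothetical names chosen for the sketch.

```rust
// Hypothetical library module; all names here are made up for illustration.
pub mod linenum {
    use std::fmt;
    use std::result;

    #[derive(Debug)]
    pub struct Error {
        input: String,
    }

    impl fmt::Display for Error {
        fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
            write!(f, "invalid number: {:?}", self.input)
        }
    }

    impl std::error::Error for Error {}

    // The alias fixes the error type, so signatures only name the success type.
    pub type Result<T> = result::Result<T, Error>;

    pub fn parse(input: &str) -> Result<u32> {
        input
            .trim()
            .parse()
            .map_err(|_| Error { input: input.to_string() })
    }
}
```

Keeping the alias inside the module means it only shadows the prelude's `Result` there, and callers refer to it as `linenum::Result`.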
1512 | ||
1513 | # Case study: A program to read population data | |
1514 | ||
9cc50fc6 | 1515 | This section was long, and depending on your background, it might be |
e9174d1e SL |
1516 | rather dense. While there is plenty of example code to go along with |
1517 | the prose, most of it was specifically designed to be pedagogical. So, | |
1518 | we're going to do something new: a case study. | |
1519 | ||
1520 | For this, we're going to build up a command line program that lets you | |
1521 | query world population data. The objective is simple: you give it a location | |
1522 | and it will tell you the population. Despite the simplicity, there is a lot | |
1523 | that can go wrong! | |
1524 | ||
1525 | The data we'll be using comes from the [Data Science | |
1526 | Toolkit][11]. I've prepared some data from it for this exercise. You | |
1527 | can either grab the [world population data][12] (41MB gzip compressed, | |
9cc50fc6 | 1528 | 145MB uncompressed) or only the [US population data][13] (2.2MB gzip |
e9174d1e SL |
1529 | compressed, 7.2MB uncompressed). |
1530 | ||
1531 | Up until now, we've kept the code limited to Rust's standard library. For a real | |
1532 | task like this though, we'll want to at least use something to parse CSV data, | |
1533 | parse the program arguments and decode that stuff into Rust types automatically. For that, we'll use the | |
1534 | [`csv`](https://crates.io/crates/csv), [`getopts`](https://crates.io/crates/getopts) | |
1535 | and [`rustc-serialize`](https://crates.io/crates/rustc-serialize) crates. | |
1536 | ||
1537 | ## Initial setup | |
1538 | ||
1539 | We're not going to spend a lot of time on setting up a project with | |
1540 | Cargo because it is already covered well in [the Cargo | |
7453a54e | 1541 | section](getting-started.html#hello-cargo) and [Cargo's documentation][14]. |
e9174d1e SL |
1542 | |
1543 | To get started from scratch, run `cargo new --bin city-pop` and make sure your | |
1544 | `Cargo.toml` looks something like this: | |
1545 | ||
1546 | ```text | |
1547 | [package] | |
1548 | name = "city-pop" | |
1549 | version = "0.1.0" | |
1550 | authors = ["Andrew Gallant <jamslam@gmail.com>"] | |
1551 | ||
1552 | [[bin]] | |
1553 | name = "city-pop" | |
1554 | ||
1555 | [dependencies] | |
1556 | csv = "0.*" | |
1557 | rustc-serialize = "0.*" | |
1558 | getopts = "0.*" | |
1559 | ``` | |
1560 | ||
1561 | You should already be able to run: | |
1562 | ||
1563 | ```text | |
1564 | cargo build --release | |
1565 | ./target/release/city-pop | |
1566 | # Outputs: Hello, world! | |
1567 | ``` | |
1568 | ||
1569 | ## Argument parsing | |
1570 | ||
b039eaaf | 1571 | Let's get argument parsing out of the way. We won't go into too much |
e9174d1e SL |
1572 | detail on Getopts, but there is [some good documentation][15] |
1573 | describing it. The short story is that Getopts generates an argument | |
1574 | parser and a help message from a vector of options (the fact that it | |
1575 | is a vector is hidden behind a struct and a set of methods). Once the | |
a7813a04 XL |
1576 | parsing is done, the parser returns a struct that records matches |
1577 | for defined options, and remaining "free" arguments. | |
1578 | From there, we can get information about the flags, for | |
b039eaaf | 1579 | instance, whether they were passed in, and what arguments they |
e9174d1e SL |
1580 | had. Here's our program with the appropriate `extern crate` |
1581 | statements, and the basic argument setup for Getopts: | |
1582 | ||
1583 | ```rust,ignore | |
1584 | extern crate getopts; | |
1585 | extern crate rustc_serialize; | |
1586 | ||
1587 | use getopts::Options; | |
1588 | use std::env; | |
1589 | ||
1590 | fn print_usage(program: &str, opts: Options) { | |
1591 | println!("{}", opts.usage(&format!("Usage: {} [options] <data-path> <city>", program))); | |
1592 | } | |
1593 | ||
1594 | fn main() { | |
1595 | let args: Vec<String> = env::args().collect(); | |
7453a54e | 1596 | let program = &args[0]; |
e9174d1e SL |
1597 | |
1598 | let mut opts = Options::new(); | |
1599 | opts.optflag("h", "help", "Show this usage message."); | |
b039eaaf | 1600 | |
e9174d1e SL |
1601 | let matches = match opts.parse(&args[1..]) { |
1602 | Ok(m) => { m } | |
9cc50fc6 | 1603 | Err(e) => { panic!("{}", e) } |
e9174d1e SL |
1604 | }; |
1605 | if matches.opt_present("h") { | |
1606 | print_usage(&program, opts); | |
9cc50fc6 | 1607 | return; |
e9174d1e | 1608 | } |
a7813a04 XL |
1609 | let data_path = &matches.free[0]; |
1610 | let city: &str = &matches.free[1]; | |
b039eaaf | 1611 | |
7453a54e | 1612 | // Do stuff with information |
e9174d1e SL |
1613 | } |
1614 | ``` | |
1615 | ||
1616 | First, we get a vector of the arguments passed into our program. We | |
1617 | then store the first one, knowing that it is our program's name. Once | |
1618 | that's done, we set up our argument flags, in this case a simplistic | |
1619 | help message flag. Once we have the argument flags set up, we use | |
1620 | `Options.parse` to parse the argument vector (starting from index one, | |
b039eaaf | 1621 | because index 0 is the program name). If this was successful, we |
e9174d1e SL |
1622 | assign the parsed object to `matches`; if not, we panic. Once past that, |
1623 | we test if the user passed in the help flag, and if so print the usage | |
1624 | message. The option help messages are constructed by Getopts, so all | |
1625 | we have to do to print the usage message is tell it what we want it to | |
1626 | print for the program name and template. If the user has not passed in | |
1627 | the help flag, we assign the proper variables to their corresponding | |
1628 | arguments. | |
1629 | ||
1630 | ## Writing the logic | |
1631 | ||
92a42be0 SL |
1632 | We all write code differently, but error handling is usually the last thing we |
1633 | want to think about. This isn't great for the overall design of a program, but | |
1634 | it can be useful for rapid prototyping. Because Rust forces us to be explicit | |
1635 | about error handling (by making us call `unwrap`), it is easy to see which | |
1636 | parts of our program can cause errors. | |
e9174d1e SL |
1637 | |
1638 | In this case study, the logic is really simple. All we need to do is parse the | |
1639 | CSV data given to us and print out a field in matching rows. Let's do it. (Make | |
1640 | sure to add `extern crate csv;` to the top of your file.) | |
1641 | ||
1642 | ```rust,ignore | |
9cc50fc6 | 1643 | use std::fs::File; |
9cc50fc6 | 1644 | |
e9174d1e SL |
1645 | // This struct represents the data in each row of the CSV file. |
1646 | // Type based decoding absolves us of a lot of the nitty gritty error | |
1647 | // handling, like parsing strings as integers or floats. | |
1648 | #[derive(Debug, RustcDecodable)] | |
1649 | struct Row { | |
1650 | country: String, | |
1651 | city: String, | |
1652 | accent_city: String, | |
1653 | region: String, | |
1654 | ||
1655 | // Not every row has data for the population, latitude or longitude! | |
1656 | // So we express them as `Option` types, which admits the possibility of | |
1657 | // absence. The CSV parser will fill in the correct value for us. | |
1658 | population: Option<u64>, | |
1659 | latitude: Option<f64>, | |
1660 | longitude: Option<f64>, | |
1661 | } | |
1662 | ||
1663 | fn print_usage(program: &str, opts: Options) { | |
1664 | println!("{}", opts.usage(&format!("Usage: {} [options] <data-path> <city>", program))); | |
1665 | } | |
1666 | ||
1667 | fn main() { | |
1668 | let args: Vec<String> = env::args().collect(); | |
7453a54e | 1669 | let program = &args[0]; |
e9174d1e SL |
1670 | |
1671 | let mut opts = Options::new(); | |
1672 | opts.optflag("h", "help", "Show this usage message."); | |
b039eaaf | 1673 | |
e9174d1e SL |
1674 | let matches = match opts.parse(&args[1..]) { |
1675 | Ok(m) => { m } | |
92a42be0 | 1676 | Err(e) => { panic!("{}", e) } |
e9174d1e | 1677 | }; |
b039eaaf | 1678 | |
e9174d1e SL |
1679 | if matches.opt_present("h") { |
1680 | print_usage(&program, opts); | |
7453a54e SL |
1681 | return; |
1682 | } | |
b039eaaf | 1683 | |
a7813a04 XL |
1684 | let data_path = &matches.free[0]; |
1685 | let city: &str = &matches.free[1]; | |
b039eaaf | 1686 | |
7453a54e SL |
1687 | let file = File::open(data_path).unwrap(); |
1688 | let mut rdr = csv::Reader::from_reader(file); | |
b039eaaf | 1689 | |
7453a54e SL |
1690 | for row in rdr.decode::<Row>() { |
1691 | let row = row.unwrap(); | |
b039eaaf | 1692 | |
7453a54e SL |
1693 | if row.city == city { |
1694 | println!("{}, {}: {:?}", | |
1695 | row.city, row.country, | |
1696 | row.population.expect("population count")); | |
1697 | } | |
1698 | } | |
e9174d1e SL |
1699 | } |
1700 | ``` | |
1701 | ||
1702 | Let's outline the errors. We can start with the obvious: the three places that | |
1703 | `unwrap` or `expect` is called: | |
1704 | ||
9cc50fc6 | 1705 | 1. [`File::open`](../std/fs/struct.File.html#method.open) |
e9174d1e SL |
1706 | can return an |
1707 | [`io::Error`](../std/io/struct.Error.html). | |
1708 | 2. [`csv::Reader::decode`](http://burntsushi.net/rustdoc/csv/struct.Reader.html#method.decode) | |
1709 | decodes one record at a time, and | |
1710 | [decoding a | |
1711 | record](http://burntsushi.net/rustdoc/csv/struct.DecodedRecords.html) | |
1712 | (look at the `Item` associated type on the `Iterator` impl) | |
1713 | can produce a | |
1714 | [`csv::Error`](http://burntsushi.net/rustdoc/csv/enum.Error.html). | |
1715 | 3. If `row.population` is `None`, then calling `expect` will panic. | |
1716 | ||
1717 | Are there any others? What if we can't find a matching city? Tools like `grep` | |
1718 | will return an error code, so we probably should too. So we have logic errors | |
1719 | specific to our problem, IO errors and CSV parsing errors. We're going to | |
1720 | explore two different ways to approach handling these errors. | |
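
Before looking at either approach, it may help to picture how those three kinds of outcome could map onto exit codes. The sketch below is only an illustration: `SearchError` and `exit_code` are hypothetical names, and the numbering follows `grep`'s convention of 0 for a match, 1 for no match and a value greater than 1 for other errors.

```rust
// Hypothetical error type, used only for this illustration.
#[derive(Debug)]
enum SearchError {
    // A logic error specific to our problem: no city matched.
    NotFound,
    // Anything else, such as an IO or CSV parsing failure.
    Other(String),
}

// Maps the outcome of a search onto a process exit code.
fn exit_code(result: &Result<Vec<String>, SearchError>) -> i32 {
    match *result {
        Ok(_) => 0,
        Err(SearchError::NotFound) => 1,
        Err(SearchError::Other(_)) => 2,
    }
}
```

A `main` function could pass the value to `std::process::exit` once it has printed whatever it wants to print.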
1721 | ||
1722 | I'd like to start with `Box<Error>`. Later, we'll see how defining our own | |
1723 | error type can be useful. | |
1724 | ||
1725 | ## Error handling with `Box<Error>` | |
1726 | ||
1727 | `Box<Error>` is nice because it *just works*. You don't need to define your own | |
1728 | error types and you don't need any `From` implementations. The downside is that | |
1729 | since `Box<Error>` is a trait object, it *erases the type*, which means the | |
1730 | compiler can no longer reason about its underlying type. | |
1731 | ||
1732 | [Previously](#the-limits-of-combinators) we started refactoring our code by | |
1733 | changing the type of our function from `T` to `Result<T, OurErrorType>`. In | |
9cc50fc6 | 1734 | this case, `OurErrorType` is simply `Box<Error>`. But what's `T`? And can we add |
e9174d1e SL |
1735 | a return type to `main`? |
1736 | ||
1737 | The answer to the second question is no, we can't. That means we'll need to | |
1738 | write a new function. But what is `T`? The simplest thing we can do is to | |
1739 | return a list of matching `Row` values as a `Vec<Row>`. (Better code would | |
1740 | return an iterator, but that is left as an exercise to the reader.) | |
1741 | ||
1742 | Let's refactor our code into its own function, but keep the calls to `unwrap`. | |
1743 | Note that we opt to handle the possibility of a missing population count by | |
1744 | simply ignoring that row. | |
1745 | ||
1746 | ```rust,ignore | |
7453a54e SL |
1747 | use std::path::Path; |
1748 | ||
e9174d1e SL |
1749 | struct Row { |
1750 | // unchanged | |
1751 | } | |
1752 | ||
1753 | struct PopulationCount { | |
1754 | city: String, | |
1755 | country: String, | |
1756 | // This is no longer an `Option` because values of this type are only | |
1757 | // constructed if they have a population count. | |
1758 | count: u64, | |
1759 | } | |
1760 | ||
1761 | fn print_usage(program: &str, opts: Options) { | |
1762 | println!("{}", opts.usage(&format!("Usage: {} [options] <data-path> <city>", program))); | |
1763 | } | |
1764 | ||
1765 | fn search<P: AsRef<Path>>(file_path: P, city: &str) -> Vec<PopulationCount> { | |
1766 | let mut found = vec![]; | |
9cc50fc6 | 1767 | let file = File::open(file_path).unwrap(); |
e9174d1e SL |
1768 | let mut rdr = csv::Reader::from_reader(file); |
1769 | for row in rdr.decode::<Row>() { | |
1770 | let row = row.unwrap(); | |
1771 | match row.population { | |
1772 | None => { } // skip it | |
1773 | Some(count) => if row.city == city { | |
1774 | found.push(PopulationCount { | |
1775 | city: row.city, | |
1776 | country: row.country, | |
1777 | count: count, | |
1778 | }); | |
1779 | }, | |
1780 | } | |
1781 | } | |
1782 | found | |
1783 | } | |
1784 | ||
1785 | fn main() { | |
7453a54e SL |
1786 | let args: Vec<String> = env::args().collect(); |
1787 | let program = &args[0]; | |
e9174d1e | 1788 | |
7453a54e SL |
1789 | let mut opts = Options::new(); |
1790 | opts.optflag("h", "help", "Show this usage message."); | |
e9174d1e | 1791 | |
7453a54e SL |
1792 | let matches = match opts.parse(&args[1..]) { |
1793 | Ok(m) => { m } | |
1794 | Err(e) => { panic!("{}", e) } | |
1795 | }; | |
a7813a04 | 1796 | |
7453a54e SL |
1797 | if matches.opt_present("h") { |
1798 | print_usage(&program, opts); | |
1799 | return; | |
1800 | } | |
b039eaaf | 1801 | |
a7813a04 XL |
1802 | let data_path = &matches.free[0]; |
1803 | let city: &str = &matches.free[1]; | |
1804 | ||
7453a54e SL |
1805 | for pop in search(data_path, city) { |
1806 | println!("{}, {}: {:?}", pop.city, pop.country, pop.count); | |
1807 | } | |
e9174d1e SL |
1808 | } |
1809 | ||
1810 | ``` | |
1811 | ||
1812 | While we got rid of one use of `expect` (which is a nicer variant of `unwrap`), | |
1813 | we still should handle the absence of any search results. | |
1814 | ||
1815 | To convert this to proper error handling, we need to do the following: | |
1816 | ||
1817 | 1. Change the return type of `search` to be `Result<Vec<PopulationCount>, | |
1818 | Box<Error>>`. | |
1819 | 2. Use the [`try!` macro](#code-try-def) so that errors are returned to the | |
1820 | caller instead of panicking the program. | |
1821 | 3. Handle the error in `main`. | |
1822 | ||
1823 | Let's try it: | |
c34b1796 | 1824 | |
e9174d1e | 1825 | ```rust,ignore |
9cc50fc6 SL |
1826 | use std::error::Error; |
1827 | ||
1828 | // The rest of the code before this is unchanged | |
1829 | ||
e9174d1e SL |
1830 | fn search<P: AsRef<Path>> |
1831 | (file_path: P, city: &str) | |
3157f602 | 1832 | -> Result<Vec<PopulationCount>, Box<Error>> { |
e9174d1e | 1833 | let mut found = vec![]; |
9cc50fc6 | 1834 | let file = try!(File::open(file_path)); |
e9174d1e SL |
1835 | let mut rdr = csv::Reader::from_reader(file); |
1836 | for row in rdr.decode::<Row>() { | |
1837 | let row = try!(row); | |
1838 | match row.population { | |
1839 | None => { } // skip it | |
1840 | Some(count) => if row.city == city { | |
1841 | found.push(PopulationCount { | |
1842 | city: row.city, | |
1843 | country: row.country, | |
1844 | count: count, | |
1845 | }); | |
1846 | }, | |
1847 | } | |
1848 | } | |
1849 | if found.is_empty() { | |
1850 | Err(From::from("No matching cities with a population were found.")) | |
1851 | } else { | |
1852 | Ok(found) | |
1853 | } | |
c34b1796 | 1854 | } |
e9174d1e | 1855 | ``` |
c34b1796 | 1856 | |
e9174d1e SL |
1857 | Instead of `x.unwrap()`, we now have `try!(x)`. Since our function returns a |
1858 | `Result<T, E>`, the `try!` macro will return early from the function if an | |
1859 | error occurs. | |
c34b1796 | 1860 | |
3157f602 XL |
1861 | At the end of `search` we also convert a plain string to an error type |
1862 | by using the [corresponding `From` impls](../std/convert/trait.From.html): | |
c34b1796 | 1863 | |
e9174d1e SL |
1864 | ```rust,ignore |
1865 | // We are making use of this impl in the code above, since we call `From::from` | |
1866 | // on a `&'static str`. | |
3157f602 | 1867 | impl<'a> From<&'a str> for Box<Error> |
e9174d1e SL |
1868 | |
1869 | // But this is also useful when you need to allocate a new string for an | |
1870 | // error message, usually with `format!`. | |
3157f602 | 1871 | impl From<String> for Box<Error> |
e9174d1e SL |
1872 | ``` |
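
Both impls can be seen at work in a short, standard-library-only sketch. `check_city` is a hypothetical helper and its length limit is arbitrary; `Box<dyn Error>` is how newer compilers spell `Box<Error>`.

```rust
use std::error::Error;

// Hypothetical validation helper, not part of the case study.
fn check_city(city: &str) -> Result<(), Box<dyn Error>> {
    if city.is_empty() {
        // A string literal goes through the `From<&str>` impl.
        return Err(From::from("city name must not be empty"));
    }
    if city.len() > 64 {
        // A freshly allocated message goes through the `From<String>` impl.
        return Err(From::from(format!("city name too long: {} bytes", city.len())));
    }
    Ok(())
}
```

In both cases the resulting error displays as the original message, which is all a caller holding a `Box<dyn Error>` can rely on.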
1873 | ||
92a42be0 SL |
1874 | Since `search` now returns a `Result<T, E>`, `main` should use case analysis |
1875 | when calling `search`: | |
1876 | ||
1877 | ```rust,ignore | |
1878 | ... | |
a7813a04 XL |
1879 | match search(data_path, city) { |
1880 | Ok(pops) => { | |
1881 | for pop in pops { | |
1882 | println!("{}, {}: {:?}", pop.city, pop.country, pop.count); | |
1883 | } | |
92a42be0 | 1884 | } |
a7813a04 | 1885 | Err(err) => println!("{}", err) |
92a42be0 | 1886 | } |
92a42be0 SL |
1887 | ... |
1888 | ``` | |
1889 | ||
e9174d1e SL |
1890 | Now that we've seen how to do proper error handling with `Box<Error>`, let's |
1891 | try a different approach with our own custom error type. But first, let's take | |
1892 | a quick break from error handling and add support for reading from `stdin`. | |
1893 | ||
1894 | ## Reading from stdin | |
1895 | ||
1896 | In our program, we accept a single file for input and do one pass over the | |
1897 | data. This means we probably should be able to accept input on stdin. But maybe | |
1898 | we like the current format too—so let's have both! | |
1899 | ||
b039eaaf | 1900 | Adding support for stdin is actually quite easy. There are only three things we |
e9174d1e SL |
1901 | have to do: |
1902 | ||
1903 | 1. Tweak the program arguments so that a single parameter—the | |
1904 | city—can be accepted while the population data is read from stdin. | |
1905 | 2. Modify the program so that an option `-f` can name the input file, for when | |
1906 | the data is not passed in on stdin. | |
1907 | 3. Modify the `search` function to take an *optional* file path. When `None`, | |
1908 | it should know to read from stdin. | |
1909 | ||
1910 | First, here's the new usage: | |
1911 | ||
1912 | ```rust,ignore | |
1913 | fn print_usage(program: &str, opts: Options) { | |
7453a54e | 1914 | println!("{}", opts.usage(&format!("Usage: {} [options] <city>", program))); |
e9174d1e SL |
1915 | } |
1916 | ``` | |
a7813a04 | 1917 | Of course we need to adapt the argument handling code: |
e9174d1e SL |
1918 | |
1919 | ```rust,ignore | |
1920 | ... | |
a7813a04 XL |
1921 | let mut opts = Options::new(); |
1922 | opts.optopt("f", "file", "Choose an input file, instead of using STDIN.", "NAME"); | |
1923 | opts.optflag("h", "help", "Show this usage message."); | |
1924 | ... | |
1925 | let data_path = matches.opt_str("f"); | |
1926 | ||
1927 | let city = if !matches.free.is_empty() { | |
1928 | &matches.free[0] | |
1929 | } else { | |
1930 | print_usage(&program, opts); | |
1931 | return; | |
1932 | }; | |
1933 | ||
1934 | match search(&data_path, city) { | |
1935 | Ok(pops) => { | |
1936 | for pop in pops { | |
1937 | println!("{}, {}: {:?}", pop.city, pop.country, pop.count); | |
1938 | } | |
9cc50fc6 | 1939 | } |
a7813a04 | 1940 | Err(err) => println!("{}", err) |
9cc50fc6 | 1941 | } |
e9174d1e | 1942 | ... |
c34b1796 AL |
1943 | ``` |
1944 | ||
a7813a04 XL |
1945 | We've made the user experience a bit nicer by showing the usage message, |
1946 | instead of a panic from an out-of-bounds index, when `city`, the | |
1947 | remaining free argument, is not present. | |
e9174d1e SL |
1948 | |
1949 | Modifying `search` is slightly trickier. The `csv` crate can build a | |
1950 | parser out of | |
1951 | [any type that implements `io::Read`](http://burntsushi.net/rustdoc/csv/struct.Reader.html#method.from_reader). | |
1952 | But how can we use the same code over both types? There are actually a | |
1953 | couple of ways we could go about this. One way is to write `search` such | |
1954 | that it is generic on some type parameter `R` that satisfies | |
9cc50fc6 | 1955 | `io::Read`. Another way is to use trait objects: |
c34b1796 | 1956 | |
e9174d1e | 1957 | ```rust,ignore |
9cc50fc6 SL |
1958 | use std::io; |
1959 | ||
1960 | // The rest of the code before this is unchanged | |
1961 | ||
e9174d1e SL |
1962 | fn search<P: AsRef<Path>> |
1963 | (file_path: &Option<P>, city: &str) | |
3157f602 | 1964 | -> Result<Vec<PopulationCount>, Box<Error>> { |
e9174d1e SL |
1965 | let mut found = vec![]; |
1966 | let input: Box<io::Read> = match *file_path { | |
1967 | None => Box::new(io::stdin()), | |
9cc50fc6 | 1968 | Some(ref file_path) => Box::new(try!(File::open(file_path))), |
e9174d1e SL |
1969 | }; |
1970 | let mut rdr = csv::Reader::from_reader(input); | |
1971 | // The rest remains unchanged! | |
1972 | } | |
1973 | ``` | |
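
For comparison, the generic alternative might be sketched as follows. To stay within the standard library, `count_matches` is a hypothetical stand-in for `search` that reads plain lines rather than CSV records, and it uses the `?` operator that newer compilers provide in place of `try!`.

```rust
use std::io::{self, BufRead, BufReader, Read};

// Hypothetical stand-in for `search`: counts the lines equal to `city`.
// It is generic over any reader, so a file, stdin or an in-memory
// buffer all work without boxing.
fn count_matches<R: Read>(input: R, city: &str) -> io::Result<usize> {
    let mut count = 0;
    for line in BufReader::new(input).lines() {
        // Each line is an `io::Result<String>`; `?` returns early on failure.
        if line?.trim() == city {
            count += 1;
        }
    }
    Ok(count)
}
```

The two approaches combine well: `Box<dyn Read>` itself implements `Read`, so a caller that picks its input at runtime can still hand a boxed reader to the generic function.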
1974 | ||
1975 | ## Error handling with a custom type | |
1976 | ||
1977 | Previously, we learned how to | |
1978 | [compose errors using a custom error type](#composing-custom-error-types). | |
1979 | We did this by defining our error type as an `enum` and implementing `Error` | |
1980 | and `From`. | |
1981 | ||
1982 | Since we have three distinct errors (IO, CSV parsing and not found), let's | |
1983 | define an `enum` with three variants: | |
1984 | ||
1985 | ```rust,ignore | |
1986 | #[derive(Debug)] | |
1987 | enum CliError { | |
1988 | Io(io::Error), | |
1989 | Csv(csv::Error), | |
1990 | NotFound, | |
1991 | } | |
1992 | ``` | |
1993 | ||
1994 | And now for impls on `Display` and `Error`: | |
1995 | ||
1996 | ```rust,ignore | |
a7813a04 XL |
1997 | use std::fmt; |
1998 | ||
e9174d1e SL |
1999 | impl fmt::Display for CliError { |
2000 | fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { | |
2001 | match *self { | |
2002 | CliError::Io(ref err) => err.fmt(f), | |
2003 | CliError::Csv(ref err) => err.fmt(f), | |
2004 | CliError::NotFound => write!(f, "No matching cities with a \ | |
2005 | population were found."), | |
2006 | } | |
2007 | } | |
2008 | } | |
2009 | ||
2010 | impl Error for CliError { | |
2011 | fn description(&self) -> &str { | |
2012 | match *self { | |
2013 | CliError::Io(ref err) => err.description(), | |
2014 | CliError::Csv(ref err) => err.description(), | |
2015 | CliError::NotFound => "not found", | |
2016 | } | |
2017 | } | |
54a0048b | 2018 | |
a7813a04 XL |
2019 | fn cause(&self) -> Option<&Error> { |
2020 | match *self { | |
54a0048b | 2021 | CliError::Io(ref err) => Some(err), |
a7813a04 XL |
2022 | CliError::Csv(ref err) => Some(err), |
2023 | // Our custom error doesn't have an underlying cause, | |
2024 | // but we could modify it so that it does. | |
2025 | CliError::NotFound => None, | |
54a0048b SL |
2026 | } |
2027 | } | |
e9174d1e SL |
2028 | } |
2029 | ``` | |
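
One thing a caller can do with the underlying cause is walk the whole chain of errors, for example to print each layer. The sketch below makes a few assumptions: `ConfigError` and `error_chain` are hypothetical names, and it uses `source`, the method newer standard libraries provide in the role of `cause`.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

// Hypothetical error that wraps a lower-level parse failure.
#[derive(Debug)]
struct ConfigError {
    source: ParseIntError,
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "invalid configuration value")
    }
}

impl Error for ConfigError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

// Collects the message of an error followed by those of its causes.
fn error_chain(err: &dyn Error) -> Vec<String> {
    let mut messages = vec![err.to_string()];
    let mut current = err.source();
    while let Some(cause) = current {
        messages.push(cause.to_string());
        current = cause.source();
    }
    messages
}
```

Because `error_chain` only relies on the `Error` trait, it works the same way for a custom `enum` like ours and for a boxed error.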
2030 | ||
2031 | Before we can use our `CliError` type in our `search` function, we need to | |
2032 | provide a couple of `From` impls. How do we know which impls to provide? Well, | |
2033 | we'll need to convert from both `io::Error` and `csv::Error` to `CliError`. | |
2034 | Those are the only external errors, so we'll only need two `From` impls for | |
2035 | now: | |
2036 | ||
2037 | ```rust,ignore | |
2038 | impl From<io::Error> for CliError { | |
2039 | fn from(err: io::Error) -> CliError { | |
2040 | CliError::Io(err) | |
2041 | } | |
2042 | } | |
2043 | ||
2044 | impl From<csv::Error> for CliError { | |
2045 | fn from(err: csv::Error) -> CliError { | |
2046 | CliError::Csv(err) | |
2047 | } | |
2048 | } | |
2049 | ``` | |
2050 | ||
2051 | The `From` impls are important because of how | |
2052 | [`try!` is defined](#code-try-def). In particular, if an error occurs, | |
2053 | `From::from` is called on the error, which in this case, will convert it to our | |
2054 | own error type `CliError`. | |

With the `From` impls done, we only need to make two small tweaks to our
`search` function: the return type and the “not found” error. Here it is in
full:

```rust,ignore
fn search<P: AsRef<Path>>
         (file_path: &Option<P>, city: &str)
         -> Result<Vec<PopulationCount>, CliError> {
    let mut found = vec![];
    let input: Box<io::Read> = match *file_path {
        None => Box::new(io::stdin()),
        Some(ref file_path) => Box::new(try!(File::open(file_path))),
    };
    let mut rdr = csv::Reader::from_reader(input);
    for row in rdr.decode::<Row>() {
        let row = try!(row);
        match row.population {
            None => { } // skip it
            Some(count) => if row.city == city {
                found.push(PopulationCount {
                    city: row.city,
                    country: row.country,
                    count: count,
                });
            },
        }
    }
    if found.is_empty() {
        Err(CliError::NotFound)
    } else {
        Ok(found)
    }
}
```

No other changes are necessary.

## Adding functionality

Writing generic code is great, because generalizing stuff is cool, and
it can then be useful later. But sometimes, the juice isn't worth the
squeeze. Look at what we just did in the previous step:

1. Defined a new error type.
2. Added impls for `Error`, `Display` and two for `From`.

The big downside here is that our program didn't improve a whole lot.
There is quite a bit of overhead to representing errors with `enum`s,
especially in short programs like this.

*One* useful aspect of using a custom error type like we've done here is that
the `main` function can now choose to handle errors differently. Previously,
with `Box<Error>`, it didn't have much of a choice: just print the message.
We're still doing that here, but what if we wanted to, say, add a `--quiet`
flag? The `--quiet` flag should silence any verbose output.

Right now, if the program doesn't find a match, it will output a message saying
so. This can be a little clumsy, especially if you intend for the program to
be used in shell scripts.

So let's start by adding the flag. Like before, we need to tweak the usage
string and add a flag to the `Options` variable. Once we've done that, Getopts
does the rest:

```rust,ignore
...
    let mut opts = Options::new();
    opts.optopt("f", "file", "Choose an input file, instead of using STDIN.", "NAME");
    opts.optflag("h", "help", "Show this usage message.");
    opts.optflag("q", "quiet", "Silences errors and warnings.");
...
```

Now we only need to implement our “quiet” functionality. This requires us to
tweak the case analysis in `main`:

```rust,ignore
use std::process;
...
    match search(&data_path, city) {
        // Stay silent only for "not found", and only when `--quiet` is set.
        Err(CliError::NotFound) if matches.opt_present("q") => process::exit(1),
        Err(err) => panic!("{}", err),
        Ok(pops) => for pop in pops {
            println!("{}, {}: {:?}", pop.city, pop.country, pop.count);
        }
    }
...
```

Certainly, we don't want to be quiet if there was an IO error or if the data
failed to parse. Therefore, we use case analysis to check if the error type is
`NotFound` *and* if `--quiet` has been enabled. If the search failed, we still
quit with a non-zero exit code (following `grep`'s convention).

If we had stuck with `Box<Error>`, then it would be pretty tricky to implement
the `--quiet` functionality.
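
The decision logic can be pulled out into a function that is easy to test on its own. The following is only a sketch under simplifying assumptions: the `CliError` here is a cut-down stand-in whose `Io` variant carries a `String`, and `Outcome` and `decide` are hypothetical names that do not appear in the case study.

```rust
// Cut-down stand-in for `CliError`; the `Io` payload is a `String` only to
// keep the sketch free of external crates.
#[derive(Debug)]
enum CliError {
    Io(String),
    NotFound,
}

// Hypothetical description of what `main` should do with a search result.
#[derive(Debug, PartialEq)]
enum Outcome {
    Print(Vec<u64>),
    ExitQuietly(i32),
    Report(String),
}

fn decide(result: Result<Vec<u64>, CliError>, quiet: bool) -> Outcome {
    match result {
        // Only "not found" is silenced, and only when `--quiet` is set.
        Err(CliError::NotFound) if quiet => Outcome::ExitQuietly(1),
        Err(CliError::NotFound) => Outcome::Report("not found".to_string()),
        // Every other error is reported, regardless of the flag.
        Err(CliError::Io(msg)) => Outcome::Report(msg),
        Ok(counts) => Outcome::Print(counts),
    }
}
```

Because the variant is part of the type, the guard `if quiet` can sit directly on the `NotFound` arm. With a boxed trait object, the caller would first have to recover the concrete error type before it could make the same distinction.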

This pretty much sums up our case study. From here, you should be ready to go
out into the world and write your own programs and libraries with proper error
handling.

# The Short Story

Since this section is long, it is useful to have a quick summary of error
handling in Rust. These are some good “rules of thumb.” They are emphatically
*not* commandments. There are probably good reasons to break every one of these
heuristics!

* If you're writing short example code that would be overburdened by error
  handling, it's probably fine to use `unwrap` (whether that's
  [`Result::unwrap`](../std/result/enum.Result.html#method.unwrap),
  [`Option::unwrap`](../std/option/enum.Option.html#method.unwrap)
  or preferably
  [`Option::expect`](../std/option/enum.Option.html#method.expect)).
  Consumers of your code should know to use proper error handling. (If they
  don't, send them here!)
* If you're writing a quick 'n' dirty program, don't feel ashamed if you use
  `unwrap`. Be warned: if it winds up in someone else's hands, don't be
  surprised if they are agitated by poor error messages!
* If you're writing a quick 'n' dirty program and feel ashamed about panicking
  anyway, then use either a `String` or a `Box<Error>` for your
  error type.
* Otherwise, in a program, define your own error types with appropriate
  [`From`](../std/convert/trait.From.html)
  and
  [`Error`](../std/error/trait.Error.html)
  impls to make the [`try!`](../std/macro.try.html)
  macro more ergonomic.
* If you're writing a library and your code can produce errors, define your own
  error type and implement the
  [`std::error::Error`](../std/error/trait.Error.html)
  trait. Where appropriate, implement
  [`From`](../std/convert/trait.From.html) to make both
  your library code and the caller's code easier to write. (Because of Rust's
  coherence rules, callers will not be able to impl `From` on your error type,
  so your library should do it.)
* Learn the combinators defined on
  [`Option`](../std/option/enum.Option.html)
  and
  [`Result`](../std/result/enum.Result.html).
  Using them exclusively can be a bit tiring at times, but I've personally
  found a healthy mix of `try!` and combinators to be quite appealing.
  `and_then`, `map` and `unwrap_or` are my favorites.
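
As a small illustration of that last rule, here is a sketch that chains `map`, `and_then` and `unwrap_or`, and then mixes early return with a combinator. The function names are hypothetical, and the second function assumes a current toolchain, where the `?` operator does the job of `try!`.

```rust
use std::num::ParseIntError;

// Hypothetical helper: read an optional port setting, falling back to a
// default when the value is missing or is not a valid `u16`.
fn port_or_default(raw: Option<&str>, default: u16) -> u16 {
    raw.map(|s| s.trim())
        .and_then(|s| s.parse::<u16>().ok())
        .unwrap_or(default)
}

// Hypothetical helper mixing early return (`?`) with a combinator (`map`).
fn sum_fields(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.trim().parse::<i32>()?;
    b.trim().parse::<i32>().map(|y| x + y)
}
```

The first function never needs explicit case analysis, because each combinator passes `None` straight through. The second returns as soon as the first field fails to parse, and lets `map` carry any failure from the second field.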

[1]: ../book/patterns.html
[2]: ../std/option/enum.Option.html#method.map
[3]: ../std/option/enum.Option.html#method.unwrap_or
[4]: ../std/option/enum.Option.html#method.unwrap_or_else
[5]: ../std/option/enum.Option.html
[6]: ../std/result/index.html
[7]: ../std/result/enum.Result.html#method.unwrap
[8]: ../std/fmt/trait.Debug.html
[9]: ../std/primitive.str.html#method.parse
[10]: ../book/associated-types.html
[11]: https://github.com/petewarden/dstkdata
[12]: http://burntsushi.net/stuff/worldcitiespop.csv.gz
[13]: http://burntsushi.net/stuff/uscitiespop.csv.gz
[14]: http://doc.crates.io/guide.html
[15]: http://doc.rust-lang.org/getopts/getopts/index.html