<!-- DO NOT EDIT THIS FILE.

This file is periodically generated from the content in the `/src/`
directory, so all fixes need to be made in `/src/`.
-->

[TOC]

# An I/O Project: Building a Command Line Program

This chapter is a recap of the many skills you’ve learned so far and an
exploration of a few more standard library features. We’ll build a command line
tool that interacts with file and command line input/output to practice some of
the Rust concepts you now have under your belt.

Rust’s speed, safety, single binary output, and cross-platform support make it
an ideal language for creating command line tools, so for our project, we’ll
make our own version of the classic command line search tool `grep`
(**g**lobally search a **r**egular **e**xpression and **p**rint). In the
simplest use case, `grep` searches a specified file for a specified string. To
do so, `grep` takes as its arguments a file path and a string. Then it reads
the file, finds lines in that file that contain the string argument, and prints
those lines.

Along the way, we’ll show how to make our command line tool use the terminal
features that many other command line tools use. We’ll read the value of an
environment variable to allow the user to configure the behavior of our tool.
We’ll also print error messages to the standard error console stream (`stderr`)
instead of standard output (`stdout`), so, for example, the user can redirect
successful output to a file while still seeing error messages onscreen.

One Rust community member, Andrew Gallant, has already created a fully
featured, very fast version of `grep`, called `ripgrep`. By comparison, our
version will be fairly simple, but this chapter will give you some of the
background knowledge you need to understand a real-world project such as
`ripgrep`.

Our `grep` project will combine a number of concepts you’ve learned so far:

* Organizing code (using what you learned about modules in Chapter 7)
* Using vectors and strings (collections, Chapter 8)
* Handling errors (Chapter 9)
* Using traits and lifetimes where appropriate (Chapter 10)
* Writing tests (Chapter 11)

We’ll also briefly introduce closures, iterators, and trait objects, which
Chapters 13 and 17 will cover in detail.

## Accepting Command Line Arguments

Let’s create a new project with, as always, `cargo new`. We’ll call our project
`minigrep` to distinguish it from the `grep` tool that you might already have
on your system.

```
$ cargo new minigrep
     Created binary (application) `minigrep` project
$ cd minigrep
```

The first task is to make `minigrep` accept its two command line arguments: the
file path and a string to search for. That is, we want to be able to run our
program with `cargo run`, two hyphens to indicate the following arguments are
for our program rather than for `cargo`, a string to search for, and a path to
a file to search in, like so:

```
$ cargo run -- searchstring example-filename.txt
```

Right now, the program generated by `cargo new` cannot process arguments we
give it. Some existing libraries on *https://crates.io/* can help with writing
a program that accepts command line arguments, but because you’re just learning
this concept, let’s implement this capability ourselves.

### Reading the Argument Values

To enable `minigrep` to read the values of command line arguments we pass to
it, we’ll need the `std::env::args` function provided in Rust’s standard
library. This function returns an iterator of the command line arguments passed
to `minigrep`. We’ll cover iterators fully in Chapter 13. For now, you only
need to know two details about iterators: iterators produce a series of values,
and we can call the `collect` method on an iterator to turn it into a
collection, such as a vector, that contains all the elements the iterator
produces.

The code in Listing 12-1 allows your `minigrep` program to read any command
line arguments passed to it and then collect the values into a vector.

Filename: src/main.rs

```
use std::env;

fn main() {
    let args: Vec<String> = env::args().collect();
    dbg!(args);
}
```

Listing 12-1: Collecting the command line arguments into a vector and printing
them

First, we bring the `std::env` module into scope with a `use` statement so we
can use its `args` function. Notice that the `std::env::args` function is
nested in two levels of modules. As we discussed in Chapter 7, in cases where
the desired function is nested in more than one module, we’ve chosen to bring
the parent module into scope rather than the function. By doing so, we can
easily use other functions from `std::env`. It’s also less ambiguous than
adding `use std::env::args` and then calling the function with just `args`,
because `args` might easily be mistaken for a function that’s defined in the
current module.

> ### The `args` Function and Invalid Unicode
>
> Note that `std::env::args` will panic if any argument contains invalid
> Unicode. If your program needs to accept arguments containing invalid
> Unicode, use `std::env::args_os` instead. That function returns an iterator
> that produces `OsString` values instead of `String` values. We’ve chosen to
> use `std::env::args` here for simplicity, because `OsString` values differ
> per platform and are more complex to work with than `String` values.

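As a rough sketch of the `args_os` alternative mentioned in the note, the
helper below accepts any iterator of `OsString` values, such as the one
returned by `std::env::args_os`, and converts each value to a `String`. Invalid
Unicode is replaced with the U+FFFD replacement character rather than causing a
panic. The name `lossy_args` is hypothetical and is not part of the `minigrep`
project; whether lossy conversion is acceptable depends on your program.

```rust
use std::ffi::OsString;

/// Hypothetical helper (not part of minigrep): converts OS-level arguments
/// to `String`s, substituting U+FFFD for invalid Unicode instead of panicking.
fn lossy_args<I>(raw: I) -> Vec<String>
where
    I: IntoIterator<Item = OsString>,
{
    raw.into_iter()
        // `to_string_lossy` returns a `Cow<str>`; `into_owned` yields a `String`.
        .map(|arg| arg.to_string_lossy().into_owned())
        .collect()
}
```

In a real program, this could be called as `lossy_args(std::env::args_os())`.
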
On the first line of `main`, we call `env::args`, and we immediately use
`collect` to turn the iterator into a vector containing all the values produced
by the iterator. We can use the `collect` function to create many kinds of
collections, so we explicitly annotate the type of `args` to specify that we
want a vector of strings. Although we very rarely need to annotate types in
Rust, `collect` is one function you do often need to annotate because Rust
isn’t able to infer the kind of collection you want.

Finally, we print the vector using the debug macro. Let’s try running the code
first with no arguments and then with two arguments:

```
$ cargo run
--snip--
["target/debug/minigrep"]
```

```
$ cargo run -- needle haystack
--snip--
["target/debug/minigrep", "needle", "haystack"]
```

Notice that the first value in the vector is `"target/debug/minigrep"`, which
is the name of our binary. This matches the behavior of the arguments list in
C, letting programs use the name by which they were invoked in their execution.
It’s often convenient to have access to the program name in case you want to
print it in messages or change behavior of the program based on what command
line alias was used to invoke the program. But for the purposes of this
chapter, we’ll ignore it and save only the two arguments we need.

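If the program name is never needed, one option is to drop it before
collecting. This is not the approach this chapter takes, since the listings
here keep the name at index 0. The sketch below uses `Iterator::skip` for that
purpose; the helper name `user_args` is hypothetical, and it accepts any
iterator of `String` values so that it can be exercised without real command
line input.

```rust
/// Hypothetical helper (not part of minigrep): drops the first item, which
/// is the program name, and collects the remaining arguments.
fn user_args<I>(args: I) -> Vec<String>
where
    I: IntoIterator<Item = String>,
{
    args.into_iter().skip(1).collect()
}
```

Calling `user_args(std::env::args())` would leave the search string at index 0
and the file path at index 1, so the indices used in the rest of this chapter
would each shift down by one.
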
### Saving the Argument Values in Variables

The program is currently able to access the values specified as command line
arguments. Now we need to save the values of the two arguments in variables so
we can use the values throughout the rest of the program. We do that in Listing
12-2.

Filename: src/main.rs

```
use std::env;

fn main() {
    let args: Vec<String> = env::args().collect();

    let query = &args[1];
    let file_path = &args[2];

    println!("Searching for {}", query);
    println!("In file {}", file_path);
}
```

Listing 12-2: Creating variables to hold the query argument and file path
argument

As we saw when we printed the vector, the program’s name takes up the first
value in the vector at `args[0]`, so we’re starting arguments at index `1`. The
first argument `minigrep` takes is the string we’re searching for, so we put a
reference to the first argument in the variable `query`. The second argument
will be the file path, so we put a reference to the second argument in the
variable `file_path`.

We temporarily print the values of these variables to prove that the code is
working as we intend. Let’s run this program again with the arguments `test`
and `sample.txt`:

```
$ cargo run -- test sample.txt
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished dev [unoptimized + debuginfo] target(s) in 0.0s
     Running `target/debug/minigrep test sample.txt`
Searching for test
In file sample.txt
```

Great, the program is working! The values of the arguments we need are being
saved into the right variables. Later we’ll add some error handling to deal
with certain potential erroneous situations, such as when the user provides no
arguments; for now, we’ll ignore that situation and work on adding file-reading
capabilities instead.

## Reading a File

Now we’ll add functionality to read the file specified in the `file_path`
argument. First, we need a sample file to test it with: we’ll use a file with a
small amount of text over multiple lines with some repeated words. Listing 12-3
has an Emily Dickinson poem that will work well! Create a file called
*poem.txt* at the root level of your project, and enter the poem “I’m Nobody!
Who are you?”

Filename: poem.txt

```
I'm nobody! Who are you?
Are you nobody, too?
Then there's a pair of us - don't tell!
They'd banish us, you know.

How dreary to be somebody!
How public, like a frog
To tell your name the livelong day
To an admiring bog!
```

Listing 12-3: A poem by Emily Dickinson makes a good test case

With the text in place, edit *src/main.rs* and add code to read the file, as
shown in Listing 12-4.

Filename: src/main.rs

```
use std::env;
use std::fs; // [1]

fn main() {
    // --snip--
    println!("In file {}", file_path);

    // [2]
    let contents = fs::read_to_string(file_path)
        .expect("Should have been able to read the file");

    println!("With text:\n{contents}"); // [3]
}
```

Listing 12-4: Reading the contents of the file specified by the second argument

First, we bring in a relevant part of the standard library with a `use`
statement: we need `std::fs` to handle files [1].

In `main`, the new statement calls `fs::read_to_string`, which takes the
`file_path`, opens that file, and returns a `std::io::Result<String>` of the
file’s contents [2].

After that, we again add a temporary `println!` statement that prints the value
of `contents` after the file is read, so we can check that the program is
working so far [3].

Let’s run this code with any string as the first command line argument (because
we haven’t implemented the searching part yet) and the *poem.txt* file as the
second argument:

```
$ cargo run -- the poem.txt
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished dev [unoptimized + debuginfo] target(s) in 0.0s
     Running `target/debug/minigrep the poem.txt`
Searching for the
In file poem.txt
With text:
I'm nobody! Who are you?
Are you nobody, too?
Then there's a pair of us - don't tell!
They'd banish us, you know.

How dreary to be somebody!
How public, like a frog
To tell your name the livelong day
To an admiring bog!
```

Great! The code read and then printed the contents of the file. But the code
has a few flaws. At the moment, the `main` function has multiple
responsibilities: generally, functions are clearer and easier to maintain if
each function is responsible for only one idea. The other problem is that we’re
not handling errors as well as we could. The program is still small, so these
flaws aren’t a big problem, but as the program grows, it will be harder to fix
them cleanly. It’s good practice to begin refactoring early on when developing
a program, because it’s much easier to refactor smaller amounts of code. We’ll
do that next.

## Refactoring to Improve Modularity and Error Handling

To improve our program, we’ll fix four problems that have to do with the
program’s structure and how it’s handling potential errors. First, our `main`
function now performs two tasks: it parses arguments and reads files. As our
program grows, the number of separate tasks the `main` function handles will
increase. As a function gains responsibilities, it becomes more difficult to
reason about, harder to test, and harder to change without breaking one of its
parts. It’s best to separate functionality so each function is responsible for
one task.

This issue also ties into the second problem: although `query` and `file_path`
are configuration variables to our program, variables like `contents` are used
to perform the program’s logic. The longer `main` becomes, the more variables
we’ll need to bring into scope; the more variables we have in scope, the harder
it will be to keep track of the purpose of each. It’s best to group the
configuration variables into one structure to make their purpose clear.

The third problem is that we’ve used `expect` to print an error message when
reading the file fails, but the error message just prints `Should have been
able to read the file`. Reading a file can fail in a number of ways: for
example, the file could be missing, or we might not have permission to open it.
Right now, regardless of the situation, we’d print the same error message for
everything, which wouldn’t give the user any information!

Fourth, we use `expect` to handle an error, and if the user runs our program
without specifying enough arguments, they’ll get an `index out of bounds` error
from Rust that doesn’t clearly explain the problem. It would be best if all the
error-handling code were in one place so future maintainers had only one place
to consult the code if the error-handling logic needed to change. Having all
the error-handling code in one place will also ensure that we’re printing
messages that will be meaningful to our end users.

Let’s address these four problems by refactoring our project.

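To illustrate the third problem, here is a minimal sketch of how a caller could
tell the failure cases apart by inspecting `std::io::Error::kind`. The helper
name `read_with_context` and the wording of the messages are assumptions made
for illustration; this is not the approach the chapter goes on to implement.

```rust
use std::fs;
use std::io::ErrorKind;

/// Hypothetical helper (not part of minigrep): reads a file and, on failure,
/// returns a message that depends on why the read failed.
fn read_with_context(file_path: &str) -> Result<String, String> {
    fs::read_to_string(file_path).map_err(|err| match err.kind() {
        ErrorKind::NotFound => format!("{file_path}: file not found"),
        ErrorKind::PermissionDenied => format!("{file_path}: permission denied"),
        // Any other failure falls back to the error's own description.
        _ => format!("{file_path}: {err}"),
    })
}
```

Unlike a single `expect` message, each branch here tells the user something
specific about what went wrong.
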
### Separation of Concerns for Binary Projects

The organizational problem of allocating responsibility for multiple tasks to
the `main` function is common to many binary projects. As a result, the Rust
community has developed guidelines for splitting the separate concerns of a
binary program when `main` starts getting large. This process has the following
steps:

* Split your program into a *main.rs* and a *lib.rs* and move your program’s
  logic to *lib.rs*.
* As long as your command line parsing logic is small, it can remain in
  *main.rs*.
* When the command line parsing logic starts getting complicated, extract it
  from *main.rs* and move it to *lib.rs*.

The responsibilities that remain in the `main` function after this process
should be limited to the following:

* Calling the command line parsing logic with the argument values
* Setting up any other configuration
* Calling a `run` function in *lib.rs*
* Handling the error if `run` returns an error

This pattern is about separating concerns: *main.rs* handles running the
program, and *lib.rs* handles all the logic of the task at hand. Because you
can’t test the `main` function directly, this structure lets you test all of
your program’s logic by moving it into functions in *lib.rs*. The code that
remains in *main.rs* will be small enough to verify its correctness by reading
it. Let’s rework our program by following this process.

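The division of labor described above can be sketched in a single file. In
this rough sketch, `run` stands in for the logic that would live in *lib.rs*.
The `Config` field names follow the ones used in this chapter, while the
line-counting body and the `usize` return value are placeholder assumptions
chosen only so there is something to test. The `main` function that would call
`run` and report its error is left out.

```rust
use std::error::Error;
use std::fs;

/// Stand-in for the configuration type; field names follow the chapter.
pub struct Config {
    pub query: String,
    pub file_path: String,
}

/// Logic that would live in lib.rs. The body is a placeholder assumption:
/// it counts the lines of the file that contain the query.
pub fn run(config: &Config) -> Result<usize, Box<dyn Error>> {
    // `?` passes any I/O error up to the caller instead of panicking.
    let contents = fs::read_to_string(&config.file_path)?;
    let matches = contents
        .lines()
        .filter(|line| line.contains(config.query.as_str()))
        .count();
    Ok(matches)
}
```

Because `run` returns a `Result`, a test can call it directly with a `Config`
and check either variant, which is not possible for code that lives only in
`main`.
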
#### Extracting the Argument Parser

We’ll extract the functionality for parsing arguments into a function that
`main` will call to prepare for moving the command line parsing logic to
*src/lib.rs*. Listing 12-5 shows the new start of `main` that calls a new
function `parse_config`, which we’ll define in *src/main.rs* for the moment.

Filename: src/main.rs

```
fn main() {
    let args: Vec<String> = env::args().collect();

    let (query, file_path) = parse_config(&args);

    // --snip--
}

fn parse_config(args: &[String]) -> (&str, &str) {
    let query = &args[1];
    let file_path = &args[2];

    (query, file_path)
}
```

Listing 12-5: Extracting a `parse_config` function from `main`

We’re still collecting the command line arguments into a vector, but instead of
assigning the argument value at index 1 to the variable `query` and the
argument value at index 2 to the variable `file_path` within the `main`
function, we pass the whole vector to the `parse_config` function. The
`parse_config` function then holds the logic that determines which argument
goes in which variable and passes the values back to `main`. We still create
the `query` and `file_path` variables in `main`, but `main` no longer has the
responsibility of determining how the command line arguments and variables
correspond.

This rework may seem like overkill for our small program, but we’re refactoring
in small, incremental steps. After making this change, run the program again to
verify that the argument parsing still works. It’s good to check your progress
often, to help identify the cause of problems when they occur.

#### Grouping Configuration Values

We can take another small step to improve the `parse_config` function further.
At the moment, we’re returning a tuple, but then we immediately break that
tuple into individual parts again. This is a sign that perhaps we don’t have
the right abstraction yet.

Another indicator that shows there’s room for improvement is the `config` part
of `parse_config`, which implies that the two values we return are related and
are both part of one configuration value. We’re not currently conveying this
meaning in the structure of the data other than by grouping the two values into
a tuple; we’ll instead put the two values into one struct and give each of the
struct fields a meaningful name. Doing so will make it easier for future
maintainers of this code to understand how the different values relate to each
other and what their purpose is.

Listing 12-6 shows the improvements to the `parse_config` function.

Filename: src/main.rs

```
fn main() {
    let args: Vec<String> = env::args().collect();

    let config = parse_config(&args); // [1]

    println!("Searching for {}", config.query); // [2]
    println!("In file {}", config.file_path); // [3]

    let contents = fs::read_to_string(config.file_path) // [4]
        .expect("Should have been able to read the file");

    // --snip--
}

struct Config { // [5]
    query: String,
    file_path: String,
}

fn parse_config(args: &[String]) -> Config { // [6]
    let query = args[1].clone(); // [7]
    let file_path = args[2].clone(); // [8]

    Config { query, file_path }
}
```

Listing 12-6: Refactoring `parse_config` to return an instance of a `Config`
struct

We’ve added a struct named `Config` defined to have fields named `query` and
`file_path` [5]. The signature of `parse_config` now indicates that it returns a
`Config` value [6]. In the body of `parse_config`, where we used to return
string slices that reference `String` values in `args`, we now define `Config`
to contain owned `String` values. The `args` variable in `main` is the owner of
the argument values and is only letting the `parse_config` function borrow
them, which means we’d violate Rust’s borrowing rules if `Config` tried to take
ownership of the values in `args`.

There are a number of ways we could manage the `String` data; the easiest,
though somewhat inefficient, route is to call the `clone` method on the values
[7][8]. This will make a full copy of the data for the `Config` instance to
own, which takes more time and memory than storing a reference to the string
data. However, cloning the data also makes our code very straightforward
because we don’t have to manage the lifetimes of the references; in this
circumstance, giving up a little performance to gain simplicity is a worthwhile
trade-off.

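One of the other ways to manage the data, shown here only as a sketch, is to
have the struct borrow from `args` instead of cloning. The names
`BorrowedConfig` and `parse_config_borrowed` are hypothetical and are not part
of this chapter’s project. The lifetime parameter `'a` ties the struct to the
slice it borrows from, which is the bookkeeping that `clone` lets us avoid.

```rust
/// Hypothetical alternative to `Config`: borrows from the argument slice
/// instead of owning copies, so it cannot outlive `args`.
struct BorrowedConfig<'a> {
    query: &'a str,
    file_path: &'a str,
}

/// Hypothetical counterpart to `parse_config` that copies no string data.
/// Like Listing 12-6, it panics if fewer than three items are present.
fn parse_config_borrowed(args: &[String]) -> BorrowedConfig<'_> {
    BorrowedConfig {
        query: &args[1],
        file_path: &args[2],
    }
}
```

This version avoids the copies, but every function that stores or returns a
`BorrowedConfig` now has to account for the lifetime of `args`.
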
> ### The Trade-Offs of Using `clone`
>
> There’s a tendency among many Rustaceans to avoid using `clone` to fix
> ownership problems because of its runtime cost. In Chapter 13, you’ll learn
> how to use more efficient methods in this type of situation. But for now,
> it’s okay to copy a few strings to continue making progress because you’ll
> make these copies only once and your file path and query string are very
> small. It’s better to have a working program that’s a bit inefficient than to
> try to hyperoptimize code on your first pass. As you become more experienced
> with Rust, it’ll be easier to start with the most efficient solution, but for
> now, it’s perfectly acceptable to call `clone`.

We’ve updated `main` so it places the instance of `Config` returned by
`parse_config` into a variable named `config` [1], and we updated the code that
previously used the separate `query` and `file_path` variables so it now uses
the fields on the `Config` struct instead [2][3][4].

Now our code more clearly conveys that `query` and `file_path` are related and
that their purpose is to configure how the program will work. Any code that
uses these values knows to find them in the `config` instance in the fields
named for their purpose.

#### Creating a Constructor for `Config`

So far, we’ve extracted the logic responsible for parsing the command line
arguments from `main` and placed it in the `parse_config` function. Doing so
helped us to see that the `query` and `file_path` values were related and that
relationship should be conveyed in our code. We then added a `Config` struct to
name the related purpose of `query` and `file_path` and to be able to return the
values’ names as struct field names from the `parse_config` function.

So now that the purpose of the `parse_config` function is to create a `Config`
instance, we can change `parse_config` from a plain function to a function
named `new` that is associated with the `Config` struct. Making this change
will make the code more idiomatic. We can create instances of types in the
standard library, such as `String`, by calling `String::new`. Similarly, by
changing `parse_config` into a `new` function associated with `Config`, we’ll
be able to create instances of `Config` by calling `Config::new`. Listing 12-7
shows the changes we need to make.

Filename: src/main.rs

```
fn main() {
    let args: Vec<String> = env::args().collect();

    let config = Config::new(&args); // [1]

    // --snip--
}

// --snip--

impl Config { // [2]
    fn new(args: &[String]) -> Config { // [3]
        let query = args[1].clone();
        let file_path = args[2].clone();

        Config { query, file_path }
    }
}
```

Listing 12-7: Changing `parse_config` into `Config::new`

We’ve updated `main` where we were calling `parse_config` to instead call
`Config::new` [1]. We’ve changed the name of `parse_config` to `new` [3] and
moved it within an `impl` block [2], which associates the `new` function with
`Config`. Try compiling this code again to make sure it works.

### Fixing the Error Handling

Now we’ll work on fixing our error handling. Recall that attempting to access
the values in the `args` vector at index 1 or index 2 will cause the program to
panic if the vector contains fewer than three items. Try running the program
without any arguments; it will look like this:

```
$ cargo run
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished dev [unoptimized + debuginfo] target(s) in 0.0s
     Running `target/debug/minigrep`
thread 'main' panicked at 'index out of bounds: the len is 1 but the index is 1', src/main.rs:27:21
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```

The line `index out of bounds: the len is 1 but the index is 1` is an error
message intended for programmers. It won’t help our end users understand what
they should do instead. Let’s fix that now.

#### Improving the Error Message

In Listing 12-8, we add a check in the `new` function that will verify that the
slice is long enough before accessing index 1 and 2. If the slice isn’t long
enough, the program panics and displays a better error message.

Filename: src/main.rs

```
    // --snip--
    fn new(args: &[String]) -> Config {
        if args.len() < 3 {
            panic!("not enough arguments");
        }
        // --snip--
```

Listing 12-8: Adding a check for the number of arguments

This code is similar to the `Guess::new` function we wrote in Listing 9-13,
where we called `panic!` when the `value` argument was out of the range of
valid values. Instead of checking for a range of values here, we’re checking
that the length of `args` is at least 3 and the rest of the function can
operate under the assumption that this condition has been met. If `args` has
fewer than three items, this condition will be true, and we call the `panic!`
macro to end the program immediately.

With these extra few lines of code in `new`, let’s run the program without any
arguments again to see what the error looks like now:

```
$ cargo run
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished dev [unoptimized + debuginfo] target(s) in 0.0s
     Running `target/debug/minigrep`
thread 'main' panicked at 'not enough arguments', src/main.rs:26:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```

This output is better: we now have a reasonable error message. However, we also
have extraneous information we don’t want to give to our users. Perhaps the
technique we used in Listing 9-13 isn’t the best one to use here: a call to
`panic!` is more appropriate for a programming problem than a usage problem, as
discussed in Chapter 9. Instead, we’ll use the other technique you learned
about in Chapter 9: returning a `Result` that indicates either success or an
error.

#### Returning a `Result` Instead of Calling `panic!`

We can instead return a `Result` value that will contain a `Config` instance in
the successful case and will describe the problem in the error case. We’re also
going to change the function name from `new` to `build` because many
programmers expect `new` functions to never fail. When `Config::build` is
communicating to `main`, we can use the `Result` type to signal there was a
problem. Then we can change `main` to convert an `Err` variant into a more
practical error for our users without the surrounding text about `thread
'main'` and `RUST_BACKTRACE` that a call to `panic!` causes.

Listing 12-9 shows the changes we need to make to the return value of the
function we’re now calling `Config::build` and the body of the function needed
to return a `Result`. Note that this won’t compile until we update `main` as
well, which we’ll do in the next listing.

Filename: src/main.rs

```
impl Config {
    fn build(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }

        let query = args[1].clone();
        let file_path = args[2].clone();

        Ok(Config { query, file_path })
    }
}
```

923072b8 716Listing 12-9: Returning a `Result` from `Config::build`
a2a8927a 717
923072b8
FG
718Our `build` function returns a `Result` with a `Config` instance in the success
719case and a `&'static str` in the error case. Our error values will always be
720string literals that have the `'static` lifetime.
a2a8927a 721
923072b8
FG
722We’ve made two changes in the body of the function: instead of calling `panic!`
723when the user doesn’t pass enough arguments, we now return an `Err` value, and
724we’ve wrapped the `Config` return value in an `Ok`. These changes make the
725function conform to its new type signature.
726
727Returning an `Err` value from `Config::build` allows the `main` function to
728handle the `Result` value returned from the `build` function and exit the
729process more cleanly in the error case.
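To see both outcomes of `Config::build` side by side, here is a minimal, self-contained sketch. The struct and the `build` body mirror Listing 12-9; the derived traits, the `to_args` helper, and the sample argument values are assumptions added only so the example can run on its own.

```rust
// `Debug` and `PartialEq` are derived here only to make the sketch easy to
// inspect; the chapter's `Config` does not need them.
#[derive(Debug, PartialEq)]
struct Config {
    query: String,
    file_path: String,
}

impl Config {
    fn build(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }

        Ok(Config {
            query: args[1].clone(),
            file_path: args[2].clone(),
        })
    }
}

// Hypothetical helper: turns string literals into the `Vec<String>` shape
// that `env::args().collect()` would produce.
fn to_args(words: &[&str]) -> Vec<String> {
    words.iter().map(|word| word.to_string()).collect()
}

fn main() {
    let good = to_args(&["minigrep", "needle", "haystack.txt"]);
    let bad = to_args(&["minigrep"]);

    // The caller decides what to do with each variant.
    for args in [&good, &bad] {
        match Config::build(args) {
            Ok(config) => println!("query = {}, file = {}", config.query, config.file_path),
            Err(message) => println!("could not build config: {message}"),
        }
    }
}
```

Matching on the returned `Result` is the most explicit way to handle both cases; a caller is free to pick a more compact form once the error path is clear.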

#### Calling `Config::build` and Handling Errors

To handle the error case and print a user-friendly message, we need to update
`main` to handle the `Result` being returned by `Config::build`, as shown in
Listing 12-10. We’ll also take the responsibility of exiting the command line
tool with a nonzero error code away from `panic!` and instead implement it by
hand. A nonzero exit status is a convention to signal to the process that
called our program that the program exited with an error state.

Filename: src/main.rs

```
[1] use std::process;

fn main() {
    let args: Vec<String> = env::args().collect();

  [2] let config = Config::build(&args).unwrap_or_else([3]|err[4]| {
      [5] println!("Problem parsing arguments: {err}");
      [6] process::exit(1);
    });

    // --snip--
```

Listing 12-10: Exiting with an error code if building a `Config` fails

In this listing, we’ve used a method we haven’t covered in detail yet:
`unwrap_or_else`, which is defined on `Result<T, E>` by the standard library
[2]. Using `unwrap_or_else` allows us to define some custom, non-`panic!` error
handling. If the `Result` is an `Ok` value, this method’s behavior is similar
to `unwrap`: it returns the inner value `Ok` is wrapping. However, if the value
is an `Err` value, this method calls the code in the *closure*, which is an
anonymous function we define and pass as an argument to `unwrap_or_else` [3].
We’ll cover closures in more detail in Chapter 13. For now, you just need to
know that `unwrap_or_else` will pass the inner value of the `Err`, which in
this case is the static string `"not enough arguments"` that we added in
Listing 12-9, to our closure in the argument `err` that appears between the
vertical pipes [4]. The code in the closure can then use the `err` value when
it runs.

We’ve added a new `use` line to bring `process` from the standard library into
scope [1]. The code in the closure that will be run in the error case is only
two lines: we print the `err` value [5] and then call `process::exit` [6]. The
`process::exit` function will stop the program immediately and return the
number that was passed as the exit status code. This is similar to the
`panic!`-based handling we used in Listing 12-8, but we no longer get all the
extra output. Let’s try it:

```
$ cargo run
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished dev [unoptimized + debuginfo] target(s) in 0.48s
     Running `target/debug/minigrep`
Problem parsing arguments: not enough arguments
```

Great! This output is much friendlier for our users.
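If the shape of `unwrap_or_else` still feels unfamiliar, the following small sketch isolates it from the rest of `minigrep`. The `parse_or_zero` function and its inputs are hypothetical. Unlike the closure in Listing 12-10, which ends the process, this closure returns a fallback value so that both branches can be observed in one run.

```rust
// Hypothetical helper: parse a number, falling back to 0 on bad input.
fn parse_or_zero(input: &str) -> u32 {
    input.parse::<u32>().unwrap_or_else(|err| {
        // `err` is the value that was wrapped in the `Err` variant.
        println!("Problem parsing {input:?}: {err}");
        // The closure must produce a value of the `Ok` type (here `u32`),
        // unless it never returns, as with `process::exit`.
        0
    })
}

fn main() {
    // `Ok` case: the closure is never called.
    println!("{}", parse_or_zero("42"));
    // `Err` case: the closure runs and supplies the fallback.
    println!("{}", parse_or_zero("forty-two"));
}
```

Because `process::exit` never returns, the closure in Listing 12-10 does not need to supply a fallback `Config`.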

### Extracting Logic from `main`

Now that we’ve finished refactoring the configuration parsing, let’s turn to
the program’s logic. As we stated in “Separation of Concerns for Binary
Projects”, we’ll extract a function named `run` that will hold all the logic
currently in the `main` function that isn’t involved with setting up
configuration or handling errors. When we’re done, `main` will be concise and
easy to verify by inspection, and we’ll be able to write tests for all the
other logic.

Listing 12-11 shows the extracted `run` function. For now, we’re just making
the small, incremental improvement of extracting the function. We’re still
defining the function in *src/main.rs*.

Filename: src/main.rs

```
fn main() {
    // --snip--

    println!("Searching for {}", config.query);
    println!("In file {}", config.file_path);

    run(config);
}

fn run(config: Config) {
    let contents = fs::read_to_string(config.file_path)
        .expect("Should have been able to read the file");

    println!("With text:\n{contents}");
}

// --snip--
```

Listing 12-11: Extracting a `run` function containing the rest of the program
logic

The `run` function now contains all the remaining logic from `main`, starting
from reading the file. The `run` function takes the `Config` instance as an
argument.

#### Returning Errors from the `run` Function

With the remaining program logic separated into the `run` function, we can
improve the error handling, as we did with `Config::build` in Listing 12-9.
Instead of allowing the program to panic by calling `expect`, the `run`
function will return a `Result<T, E>` when something goes wrong. This will let
us further consolidate the logic around handling errors into `main` in a
user-friendly way. Listing 12-12 shows the changes we need to make to the
signature and body of `run`.

Filename: src/main.rs

```
[1] use std::error::Error;

// --snip--

[2] fn run(config: Config) -> Result<(), Box<dyn Error>> {
    let contents = fs::read_to_string(config.file_path)?[3];

    println!("With text:\n{contents}");

    [4] Ok(())
}
```

Listing 12-12: Changing the `run` function to return `Result`

We’ve made three significant changes here. First, we changed the return type of
the `run` function to `Result<(), Box<dyn Error>>` [2]. This function previously
returned the unit type, `()`, and we keep that as the value returned in the
`Ok` case.

For the error type, we used the *trait object* `Box<dyn Error>` (and we’ve
brought `std::error::Error` into scope with a `use` statement at the top [1]).
We’ll cover trait objects in Chapter 17. For now, just know that `Box<dyn
Error>` means the function will return a type that implements the `Error`
trait, but we don’t have to specify what particular type the return value will
be. This gives us flexibility to return error values that may be of different
types in different error cases. The `dyn` keyword is short for “dynamic.”

Second, we’ve removed the call to `expect` in favor of the `?` operator [3], as
we talked about in Chapter 9. Rather than `panic!` on an error, `?` will return
the error value from the current function for the caller to handle.

Third, the `run` function now returns an `Ok` value in the success case [4].
We’ve declared the `run` function’s success type as `()` in the signature,
which means we need to wrap the unit type value in the `Ok` value. This
`Ok(())` syntax might look a bit strange at first, but using `()` like this is
the idiomatic way to indicate that we’re calling `run` for its side effects
only; it doesn’t return a value we need.

When you run this code, it will compile but will display a warning:

```
warning: unused `Result` that must be used
  --> src/main.rs:19:5
   |
19 |     run(config);
   |     ^^^^^^^^^^^^
   |
   = note: `#[warn(unused_must_use)]` on by default
   = note: this `Result` may be an `Err` variant, which should be handled
```

Rust tells us that our code ignored the `Result` value and the `Result` value
might indicate that an error occurred. But we’re not checking to see whether or
not there was an error, and the compiler reminds us that we probably meant to
have some error-handling code here! Let’s rectify that problem now.
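Before moving on, here is a small sketch of why `Box<dyn Error>` pairs well with `?`. The `parse_number` function is hypothetical and unrelated to `minigrep`. It can fail in two ways, with two different standard library error types, and `?` converts either one into the boxed trait object.

```rust
use std::error::Error;

// Hypothetical function: decode bytes as UTF-8, then parse them as an integer.
fn parse_number(bytes: &[u8]) -> Result<i32, Box<dyn Error>> {
    // On failure this yields a `std::str::Utf8Error`.
    let text = std::str::from_utf8(bytes)?;
    // On failure this yields a `std::num::ParseIntError`.
    let number = text.trim().parse::<i32>()?;

    Ok(number)
}

fn main() {
    // Valid UTF-8 holding a valid number.
    println!("{:?}", parse_number(b"42\n"));
    // Valid UTF-8, but not a number.
    println!("{:?}", parse_number(b"forty-two"));
    // Not valid UTF-8 at all.
    println!("{:?}", parse_number(&[0xff, 0xfe]));
}
```

Both error types implement the `Error` trait, so the single return type covers them without the function having to name either one.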

#### Handling Errors Returned from `run` in `main`

We’ll check for errors and handle them using a technique similar to one we used
with `Config::build` in Listing 12-10, but with a slight difference:

Filename: src/main.rs

```
fn main() {
    // --snip--

    println!("Searching for {}", config.query);
    println!("In file {}", config.file_path);

    if let Err(e) = run(config) {
        println!("Application error: {e}");

        process::exit(1);
    }
}
```

We use `if let` rather than `unwrap_or_else` to check whether `run` returns an
`Err` value and call `process::exit(1)` if it does. The `run` function doesn’t
return a value that we want to `unwrap` in the same way that `Config::build`
returns the `Config` instance. Because `run` returns `()` in the success case,
we only care about detecting an error, so we don’t need `unwrap_or_else` to
return the unwrapped value, which would only be `()`.

The bodies of the `if let` and the `unwrap_or_else` functions are the same in
both cases: we print the error and exit.
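The `if let Err(e)` pattern can be tried in isolation with the sketch below. Everything in it is a stand-in: this `run` takes a hypothetical `should_fail` flag instead of a `Config`, and `exit_status` returns the status code rather than calling `process::exit`, so that both paths can be exercised in a single run.

```rust
use std::error::Error;

// Stand-in for `run`: succeeds with `()` or fails with a boxed error.
fn run(should_fail: bool) -> Result<(), Box<dyn Error>> {
    if should_fail {
        // A string slice can be converted into a `Box<dyn Error>`.
        return Err("simulated failure".into());
    }

    Ok(())
}

// Hypothetical wrapper: reports the error and returns the status code that
// `main` would otherwise pass to `process::exit`.
fn exit_status(should_fail: bool) -> i32 {
    if let Err(e) = run(should_fail) {
        println!("Application error: {e}");
        return 1;
    }

    // There is nothing to unwrap in the success case: the value is just `()`.
    0
}

fn main() {
    println!("status: {}", exit_status(false));
    println!("status: {}", exit_status(true));
}
```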

### Splitting Code into a Library Crate

Our `minigrep` project is looking good so far! Now we’ll split the
*src/main.rs* file and put some code into the *src/lib.rs* file. That way we
can test the code and have a *src/main.rs* file with fewer responsibilities.

Let’s move all the code that isn’t the `main` function from *src/main.rs* to
*src/lib.rs*:

* The `run` function definition
* The relevant `use` statements
* The definition of `Config`
* The `Config::build` function definition

The contents of *src/lib.rs* should have the signatures shown in Listing 12-13
(we’ve omitted the bodies of the functions for brevity). Note that this won’t
compile until we modify *src/main.rs* in Listing 12-14.

Filename: src/lib.rs

```
use std::error::Error;
use std::fs;

pub struct Config {
    pub query: String,
    pub file_path: String,
}

impl Config {
    pub fn build(args: &[String]) -> Result<Config, &'static str> {
        // --snip--
    }
}

pub fn run(config: Config) -> Result<(), Box<dyn Error>> {
    // --snip--
}
```

Listing 12-13: Moving `Config` and `run` into *src/lib.rs*

We’ve made liberal use of the `pub` keyword: on `Config`, on its fields and its
`build` method, and on the `run` function. We now have a library crate that has
a public API we can test!

Now we need to bring the code we moved to *src/lib.rs* into the scope of the
binary crate in *src/main.rs*, as shown in Listing 12-14.

Filename: src/main.rs

```
use std::env;
use std::process;

use minigrep::Config;

fn main() {
    // --snip--
    if let Err(e) = minigrep::run(config) {
        // --snip--
    }
}
```

Listing 12-14: Using the `minigrep` library crate in *src/main.rs*

We add a `use minigrep::Config` line to bring the `Config` type from the
library crate into the binary crate’s scope, and we prefix the `run` function
with our crate name. Now all the functionality should be connected and should
work. Run the program with `cargo run` and make sure everything works
correctly.

Whew! That was a lot of work, but we’ve set ourselves up for success in the
future. Now it’s much easier to handle errors, and we’ve made the code more
modular. Almost all of our work will be done in *src/lib.rs* from here on out.

Let’s take advantage of this newfound modularity by doing something that would
have been difficult with the old code but is easy with the new code: we’ll
write some tests!

## Developing the Library’s Functionality with Test-Driven Development

Now that we’ve extracted the logic into *src/lib.rs* and left the argument
collecting and error handling in *src/main.rs*, it’s much easier to write tests
for the core functionality of our code. We can call functions directly with
various arguments and check return values without having to call our binary
from the command line.

In this section, we’ll add the searching logic to the `minigrep` program
using the test-driven development (TDD) process with the following steps:

1. Write a test that fails and run it to make sure it fails for the reason you
   expect.
2. Write or modify just enough code to make the new test pass.
3. Refactor the code you just added or changed and make sure the tests
   continue to pass.
4. Repeat from step 1!

Though it’s just one of many ways to write software, TDD can help drive code
design. Writing the test before you write the code that makes the test pass
helps to maintain high test coverage throughout the process.

We’ll test drive the implementation of the functionality that will actually do
the searching for the query string in the file contents and produce a list of
lines that match the query. We’ll add this functionality in a function called
`search`.

### Writing a Failing Test

Because we don’t need them anymore, let’s remove the `println!` statements from
*src/lib.rs* and *src/main.rs* that we used to check the program’s behavior.
Then, in *src/lib.rs*, add a `tests` module with a test function, as we did in
Chapter 11. The test function specifies the behavior we want the `search`
function to have: it will take a query and the text to search, and it will
return only the lines from the text that contain the query. Listing 12-15 shows
this test, which won’t compile yet.

Filename: src/lib.rs

```
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn one_result() {
        let query = "duct";
        let contents = "\
Rust:
safe, fast, productive.
Pick three.";

        assert_eq!(vec!["safe, fast, productive."], search(query, contents));
    }
}
```

Listing 12-15: Creating a failing test for the `search` function we wish we had

This test searches for the string `"duct"`. The text we’re searching is three
lines, only one of which contains `"duct"`. (Note that the backslash after the
opening double quote tells Rust not to put a newline character at the beginning
of the contents of this string literal.) We assert that the value returned from
the `search` function contains only the line we expect.

We aren’t yet able to run this test and watch it fail because the test doesn’t
even compile: the `search` function doesn’t exist yet! In accordance with TDD
principles, we’ll add just enough code to get the test to compile and run by
adding a definition of the `search` function that always returns an empty
vector, as shown in Listing 12-16. Then the test should compile and fail
because an empty vector doesn’t match a vector containing the line `"safe,
fast, productive."`

Filename: src/lib.rs

```
pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    vec![]
}
```

Listing 12-16: Defining just enough of the `search` function so our test will
compile

Notice that we need to define an explicit lifetime `'a` in the signature of
`search` and use that lifetime with the `contents` argument and the return
value. Recall from Chapter 10 that the lifetime parameters specify which
argument lifetime is connected to the lifetime of the return value. In this
case, we indicate that the returned vector should contain string slices that
reference slices of the argument `contents` (rather than the argument `query`).

In other words, we tell Rust that the data returned by the `search` function
will live as long as the data passed into the `search` function in the
`contents` argument. This is important! The data referenced *by* a slice needs
to be valid for the reference to be valid; if the compiler assumes we’re making
string slices of `query` rather than `contents`, it will do its safety checking
incorrectly.
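A small sketch can make the consequence of this annotation concrete. The `first_word` function below is hypothetical; it only shares the lifetime shape of `search`, with the return value tied to the second argument and not the first. Because of that, the first argument can be dropped while the result is still in use.

```rust
// Hypothetical function with the same lifetime shape as `search`: the
// returned slice borrows from `text`, never from `label`.
fn first_word<'a>(label: &str, text: &'a str) -> &'a str {
    println!("looking at {label}");
    text.split_whitespace().next().unwrap_or("")
}

fn main() {
    let text = String::from("safe, fast, productive.");
    let word;

    {
        let label = String::from("a short-lived label");
        word = first_word(&label, &text);
    } // `label` is dropped here, but `word` borrows only from `text`.

    println!("{word}");
}
```

If the signature tied the return value to `label` instead, the compiler would reject `main`, because `word` would then outlive the data it refers to.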

If we forget the lifetime annotations and try to compile this function, we’ll
get this error:

```
error[E0106]: missing lifetime specifier
  --> src/lib.rs:28:51
   |
28 | pub fn search(query: &str, contents: &str) -> Vec<&str> {
   |                      ----            ----     ^ expected named lifetime parameter
   |
   = help: this function's return type contains a borrowed value, but the signature does not say whether it is borrowed from `query` or `contents`
help: consider introducing a named lifetime parameter
   |
28 | pub fn search<'a>(query: &'a str, contents: &'a str) -> Vec<&'a str> {
   |              ++++         ++                 ++              ++
```

Rust can’t possibly know which of the two arguments we need, so we need to tell
it explicitly. Because `contents` is the argument that contains all of our text
and we want to return the parts of that text that match, we know `contents` is
the argument that should be connected to the return value using the lifetime
syntax.

Other programming languages don’t require you to connect arguments to return
values in the signature, but this practice will get easier over time. You might
want to compare this example with the “Validating References with Lifetimes”
section in Chapter 10.

Now let’s run the test:

```
$ cargo test
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished test [unoptimized + debuginfo] target(s) in 0.97s
     Running unittests (target/debug/deps/minigrep-9cd200e5fac0fc94)

running 1 test
test tests::one_result ... FAILED

failures:

---- tests::one_result stdout ----
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `["safe, fast, productive."]`,
 right: `[]`', src/lib.rs:44:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    tests::one_result

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

error: test failed, to rerun pass '--lib'
```

Great, the test fails, exactly as we expected. Let’s get the test to pass!

### Writing Code to Pass the Test

Currently, our test is failing because we always return an empty vector. To fix
that and implement `search`, our program needs to follow these steps:

* Iterate through each line of the contents.
* Check whether the line contains our query string.
* If it does, add it to the list of values we’re returning.
* If it doesn’t, do nothing.
* Return the list of results that match.

Let’s work through each step, starting with iterating through lines.

#### Iterating Through Lines with the `lines` Method

Rust has a helpful method to handle line-by-line iteration of strings,
conveniently named `lines`, that works as shown in Listing 12-17. Note this
won’t compile yet.

Filename: src/lib.rs

```
pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    for line in contents.lines() {
        // do something with line
    }
}
```

Listing 12-17: Iterating through each line in `contents`

The `lines` method returns an iterator. We’ll talk about iterators in depth in
Chapter 13, but recall that you saw this way of using an iterator in Listing
3-5, where we used a `for` loop with an iterator to run some code on each item
in a collection.
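To get a feel for what `lines` yields, the sketch below collects its output into a vector. The `split_lines` helper and the sample text are assumptions made for illustration; the sample deliberately mixes `\n` and `\r\n` line endings and ends with a trailing newline.

```rust
// Hypothetical helper: gather every line of `text` as a string slice.
fn split_lines(text: &str) -> Vec<&str> {
    text.lines().collect()
}

fn main() {
    let contents = "Rust:\nsafe, fast, productive.\r\nPick three.\n";

    // Each item is a slice of `contents` with its line ending removed.
    for line in split_lines(contents) {
        println!("[{line}]");
    }
}
```

The slices returned by `lines` borrow from the original string, which is why `search` can later store them in a `Vec<&'a str>` without copying any text.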

#### Searching Each Line for the Query

Next, we’ll check whether the current line contains our query string.
Fortunately, strings have a helpful method named `contains` that does this for
us! Add a call to the `contains` method in the `search` function, as shown in
Listing 12-18. Note this still won’t compile yet.

Filename: src/lib.rs

```
pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    for line in contents.lines() {
        if line.contains(query) {
            // do something with line
        }
    }
}
```

Listing 12-18: Adding functionality to see whether the line contains the string
in `query`

At the moment, we’re building up functionality. To get it to compile, we need
to return a value from the body as we indicated we would in the function
signature.

#### Storing Matching Lines

To finish this function, we need a way to store the matching lines that we want
to return. For that, we can make a mutable vector before the `for` loop and
call the `push` method to store a `line` in the vector. After the `for` loop,
we return the vector, as shown in Listing 12-19.

Filename: src/lib.rs

```
pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    let mut results = Vec::new();

    for line in contents.lines() {
        if line.contains(query) {
            results.push(line);
        }
    }

    results
}
```

Listing 12-19: Storing the lines that match so we can return them

Now the `search` function should return only the lines that contain `query`,
and our test should pass. Let’s run the test:

```
$ cargo test
--snip--
running 1 test
test tests::one_result ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
```

Our test passed, so we know it works!

At this point, we could consider opportunities for refactoring the
implementation of the search function while keeping the tests passing to
maintain the same functionality. The code in the search function isn’t too bad,
but it doesn’t take advantage of some useful features of iterators. We’ll
return to this example in Chapter 13, where we’ll explore iterators in detail,
and look at how to improve it.
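As a preview, and only as one possible shape such a refactoring could take, the sketch below replaces the explicit loop and mutable vector with iterator adaptors. It relies on closures and on `filter` and `collect`, which this chapter has not introduced, so treat it as an assumption about where the refactoring might lead.

```rust
// One possible iterator-based version of `search`; the signature is the one
// from Listing 12-19, the body is an illustrative alternative.
pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    contents
        .lines()
        // Keep only the lines that contain the query.
        .filter(|line| line.contains(query))
        // Gather the surviving lines into the vector we return.
        .collect()
}

fn main() {
    let contents = "Rust:\nsafe, fast, productive.\nPick three.";

    for line in search("duct", contents) {
        println!("{line}");
    }
}
```

Because the signature is unchanged, the existing `one_result` test would serve as the safety net for a change like this, which is the refactoring step of the TDD cycle described earlier.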

#### Using the `search` Function in the `run` Function

Now that the `search` function is working and tested, we need to call `search`
from our `run` function. We need to pass the `config.query` value and the
`contents` that `run` reads from the file to the `search` function. Then `run`
will print each line returned from `search`:

Filename: src/lib.rs

```
pub fn run(config: Config) -> Result<(), Box<dyn Error>> {
    let contents = fs::read_to_string(config.file_path)?;

    for line in search(&config.query, &contents) {
        println!("{line}");
    }

    Ok(())
}
```

We’re still using a `for` loop to return each line from `search` and print it.

Now the entire program should work! Let’s try it out, first with a word that
should return exactly one line from the Emily Dickinson poem, “frog”:

```
$ cargo run -- frog poem.txt
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished dev [unoptimized + debuginfo] target(s) in 0.38s
     Running `target/debug/minigrep frog poem.txt`
How public, like a frog
```

Cool! Now let’s try a word that will match multiple lines, like “body”:

```
$ cargo run -- body poem.txt
    Finished dev [unoptimized + debuginfo] target(s) in 0.0s
     Running `target/debug/minigrep body poem.txt`
I'm nobody! Who are you?
Are you nobody, too?
How dreary to be somebody!
```

And finally, let’s make sure that we don’t get any lines when we search for a
word that isn’t anywhere in the poem, such as “monomorphization”:

```
$ cargo run -- monomorphization poem.txt
    Finished dev [unoptimized + debuginfo] target(s) in 0.0s
     Running `target/debug/minigrep monomorphization poem.txt`
```

Excellent! We’ve built our own mini version of a classic tool and learned a lot
about how to structure applications. We’ve also learned a bit about file input
and output, lifetimes, testing, and command line parsing.

To round out this project, we’ll briefly demonstrate how to work with
environment variables and how to print to standard error, both of which are
useful when you’re writing command line programs.

## Working with Environment Variables

We’ll improve `minigrep` by adding an extra feature: an option for
case-insensitive searching that the user can turn on via an environment
variable. We could make this feature a command line option and require that
users enter it each time they want it to apply, but by instead making it an
environment variable, we allow our users to set the environment variable once
and have all their searches be case insensitive in that terminal session.

### Writing a Failing Test for the Case-Insensitive `search` Function

We first add a new `search_case_insensitive` function that will be called when
the environment variable has a value. We’ll continue to follow the TDD process,
so the first step is again to write a failing test. We’ll add a new test for
the new `search_case_insensitive` function and rename our old test from
`one_result` to `case_sensitive` to clarify the differences between the two
tests, as shown in Listing 12-20.

Filename: src/lib.rs

```
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn case_sensitive() {
        let query = "duct";
        let contents = "\
Rust:
safe, fast, productive.
Pick three.
Duct tape.";

        assert_eq!(vec!["safe, fast, productive."], search(query, contents));
    }

    #[test]
    fn case_insensitive() {
        let query = "rUsT";
        let contents = "\
Rust:
safe, fast, productive.
Pick three.
Trust me.";

        assert_eq!(
            vec!["Rust:", "Trust me."],
            search_case_insensitive(query, contents)
        );
    }
}
```

Listing 12-20: Adding a new failing test for the case-insensitive function
we’re about to add

Note that we’ve edited the old test’s `contents` too. We’ve added a new line
with the text `"Duct tape."` using a capital D that shouldn’t match the query
`"duct"` when we’re searching in a case-sensitive manner. Changing the old test
in this way helps ensure that we don’t accidentally break the case-sensitive
search functionality that we’ve already implemented. This test should pass now
and should continue to pass as we work on the case-insensitive search.

The new test for the case-*insensitive* search uses `"rUsT"` as its query. In
the `search_case_insensitive` function we’re about to add, the query `"rUsT"`
should match the line containing `"Rust:"` with a capital R and match the line
`"Trust me."` even though both have different casing from the query. This is
our failing test, and it will fail to compile because we haven’t yet defined
the `search_case_insensitive` function. Feel free to add a skeleton
implementation that always returns an empty vector, similar to the way we did
for the `search` function in Listing 12-16 to see the test compile and fail.

### Implementing the `search_case_insensitive` Function

The `search_case_insensitive` function, shown in Listing 12-21, will be almost
the same as the `search` function. The only difference is that we’ll lowercase
the `query` and each `line` so whatever the case of the input arguments,
they’ll be the same case when we check whether the line contains the query.

Filename: src/lib.rs

```
pub fn search_case_insensitive<'a>(
    query: &str,
    contents: &'a str,
) -> Vec<&'a str> {
  [1] let query = query.to_lowercase();
    let mut results = Vec::new();

    for line in contents.lines() {
        if line.to_lowercase()[2].contains(&query[3]) {
            results.push(line);
        }
    }

    results
}
```

Listing 12-21: Defining the `search_case_insensitive` function to lowercase the
query and the line before comparing them

First, we lowercase the `query` string and store it in a shadowed variable with
the same name [1]. Calling `to_lowercase` on the query is necessary so no
matter whether the user’s query is `"rust"`, `"RUST"`, `"Rust"`, or `"rUsT"`,
we’ll treat the query as if it were `"rust"` and be insensitive to the case.
While `to_lowercase` will handle basic Unicode, it won’t be 100% accurate. If
we were writing a real application, we’d want to do a bit more work here, but
this section is about environment variables, not Unicode, so we’ll leave it at
that here.

Note that `query` is now a `String` rather than a string slice, because calling
`to_lowercase` creates new data rather than referencing existing data. Say the
query is `"rUsT"`, as an example: that string slice doesn’t contain a lowercase
`u` or `t` for us to use, so we have to allocate a new `String` containing
`"rust"`. When we pass `query` as an argument to the `contains` method now, we
need to add an ampersand [3] because the signature of `contains` is defined to
take a string slice.

Next, we add a call to `to_lowercase` on each `line` to lowercase all
characters [2]. Now that we’ve converted `line` and `query` to lowercase, we’ll
find matches no matter what the case of the query is.
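The two details discussed above, the shadowed `String` and the ampersand passed to `contains`, can be isolated in a short sketch. The `contains_ignoring_case` helper is hypothetical; it performs the per-line check from Listing 12-21 on a single line.

```rust
// Hypothetical helper: the per-line comparison from Listing 12-21.
fn contains_ignoring_case(line: &str, query: &str) -> bool {
    // `to_lowercase` allocates a new `String`, so this shadowed `query`
    // owns its data instead of borrowing from the argument.
    let query = query.to_lowercase();

    // Borrow the `String` with `&` so `contains` receives a reference
    // rather than taking ownership of `query`.
    line.to_lowercase().contains(&query)
}

fn main() {
    for line in ["Rust:", "Pick three.", "Trust me."] {
        println!("{line:?} -> {}", contains_ignoring_case(line, "rUsT"));
    }
}
```

Note that the lowercased copy of each line is used only for the comparison; Listing 12-21 still pushes the original `line` into the results, so the output preserves the text's original casing.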

Let’s see if this implementation passes the tests:

```
running 2 tests
test tests::case_insensitive ... ok
test tests::case_sensitive ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
```
1475
1476Great! They passed. Now, let’s call the new `search_case_insensitive` function
1477from the `run` function. First, we’ll add a configuration option to the
1478`Config` struct to switch between case-sensitive and case-insensitive search.
1479Adding this field will cause compiler errors because we aren’t initializing
1480this field anywhere yet:
1481
<!-- JT: I decided to change the field name and the environment variable to be
called `ignore_case` to avoid some double-negative confusion detailed in this
issue: https://github.com/rust-lang/book/issues/1898 I'd love your thoughts
especially on the names and logic throughout this section! Thank you!!
/Carol -->
<!---

I've left a few comments. I think the name reads okay.

I think my recurring thought here (which you'll see in a few places), is if
we want to make a constructor, to do more work outside of the constructor so
that the constructor itself is infallible.

Or, we could rename the function to something like `build_config` or something.
Then it feels a bit more like "well sure, it's possible it could fail to build
if I give it something wrong".

Not sure if we have space here, but in a real-world version of your example we'd
probably use the builder pattern, so you could optionally grab the value
from the environment or not. When you're writing tests, things that find their
settings from the env tend to be a pain as you have to incorporate that into
the testing. But if you can use the builder pattern, you can use another way
of configuring the value that is more test-friendly.

/JT --->
<!-- While I do love the builder pattern, that would be a bigger change than I
want to make right now to explain it thoroughly. /Carol -->

Filename: src/lib.rs

```
pub struct Config {
    pub query: String,
    pub file_path: String,
    pub ignore_case: bool,
}
```

We added the `ignore_case` field that holds a Boolean. Next, we need the
`run` function to check the `ignore_case` field’s value and use that to
decide whether to call the `search` function or the `search_case_insensitive`
function, as shown in Listing 12-22. This code still won’t compile.

Filename: src/lib.rs

```
pub fn run(config: Config) -> Result<(), Box<dyn Error>> {
    let contents = fs::read_to_string(config.file_path)?;

    let results = if config.ignore_case {
        search_case_insensitive(&config.query, &contents)
    } else {
        search(&config.query, &contents)
    };

    for line in results {
        println!("{line}");
    }

    Ok(())
}
```

Listing 12-22: Calling either `search` or `search_case_insensitive` based on
the value in `config.ignore_case`

Finally, we need to check for the environment variable. The functions for
working with environment variables are in the `env` module in the standard
library, so we bring that module into scope at the top of *src/lib.rs*. Then
we’ll use the `var` function from the `env` module to check to see if any value
has been set for an environment variable named `IGNORE_CASE`, as shown in
Listing 12-23.

Filename: src/lib.rs

```
use std::env;
// --snip--

impl Config {
    pub fn build(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }

        let query = args[1].clone();
        let file_path = args[2].clone();

        let ignore_case = env::var("IGNORE_CASE").is_ok();

        Ok(Config {
            query,
            file_path,
            ignore_case,
        })
    }
}
```

<!---

Same comment on this one, too. We can largely avoid confusion here I think by
just not naming it `new`.

Taking in the args as a slice of Strings also feels less Rust-y than having
names for the two parameters that will be part of the Config.

/JT --->
<!-- I've changed the name from `new` to `build`, and we change the way this
gets arguments somewhat in chapter 13. The *real* Rusty way would be to use an
argument parser crate like clap... and I don't want to use external crates in
the book any more than the one usage of rand in chapter 2 :) /Carol -->

Listing 12-23: Checking for any value in an environment variable named
`IGNORE_CASE`

Here, we create a new variable `ignore_case`. To set its value, we call the
`env::var` function and pass it the name of the `IGNORE_CASE` environment
variable. The `env::var` function returns a `Result`: the successful `Ok`
variant, containing the value of the environment variable, if the variable is
set, or the `Err` variant if the variable is not set (or if its value is not
valid Unicode).

We’re using the `is_ok` method on the `Result` to check whether the environment
variable is set, which means the program should do a case-insensitive search.
If the `IGNORE_CASE` environment variable isn’t set to anything, `is_ok` will
return `false` and the program will perform a case-sensitive search. We don’t
care about the *value* of the environment variable, just whether it’s set or
unset, so we’re checking `is_ok` rather than using `unwrap`, `expect`, or any
of the other methods we’ve seen on `Result`.

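To make the `is_ok` check concrete, here is a minimal sketch that separates the
lookup from the decision. The helper names `flag_from_lookup` and
`ignore_case_from_env` are hypothetical and not part of the chapter’s code;
only `env::var`, `VarError`, and `is_ok` come from the standard library.

```rust
use std::env::{self, VarError};

/// Hypothetical helper: turns the `Result` returned by `env::var` into the
/// Boolean that would be stored in `ignore_case`.
pub fn flag_from_lookup(lookup: &Result<String, VarError>) -> bool {
    // Only the variant matters; the contained value is never inspected.
    lookup.is_ok()
}

/// Hypothetical wrapper that reads the real process environment, using the
/// same `env::var("IGNORE_CASE")` call as Listing 12-23.
pub fn ignore_case_from_env() -> bool {
    flag_from_lookup(&env::var("IGNORE_CASE"))
}
```

Because only the variant is inspected, a value such as `0` would also turn on
case-insensitive searching under this approach.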
We pass the value in the `ignore_case` variable to the `Config` instance so the
`run` function can read that value and decide whether to call
`search_case_insensitive` or `search`, as we implemented in Listing 12-22.

Let’s give it a try! First, we’ll run our program without the environment
variable set and with the query `to`, which should match any line that contains
the word “to” in all lowercase:

```
$ cargo run -- to poem.txt
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished dev [unoptimized + debuginfo] target(s) in 0.0s
     Running `target/debug/minigrep to poem.txt`
Are you nobody, too?
How dreary to be somebody!
```

Looks like that still works! Now, let’s run the program with `IGNORE_CASE`
set to `1` but with the same query `to`:

```
$ IGNORE_CASE=1 cargo run -- to poem.txt
```

If you’re using PowerShell, you will need to set the environment variable and
run the program as separate commands:

```
PS> $Env:IGNORE_CASE=1; cargo run -- to poem.txt
```

This will make `IGNORE_CASE` persist for the remainder of your shell
session. It can be unset with the `Remove-Item` cmdlet:

```
PS> Remove-Item Env:IGNORE_CASE
```

We should get lines that contain “to” in any combination of uppercase and
lowercase letters:

```
Are you nobody, too?
How dreary to be somebody!
To tell your name the livelong day
To an admiring bog!
```

Excellent, we also got lines containing “To”! Our `minigrep` program can now do
case-insensitive searching controlled by an environment variable. Now you know
how to manage options set using either command line arguments or environment
variables.

Some programs allow arguments *and* environment variables for the same
configuration. In those cases, the programs decide that one or the other takes
precedence. For another exercise on your own, try controlling case sensitivity
through either a command line argument or an environment variable. Decide
whether the command line argument or the environment variable should take
precedence if the program is run with one set to case sensitive and one set to
ignore case.

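As one possible starting point for this exercise, the sketch below lets an
explicit command line choice override the environment variable. The flag names
`--ignore-case` and `--case-sensitive` and the functions `cli_choice` and
`resolve_ignore_case` are assumptions made for illustration; the chapter does
not define them, and the opposite precedence would be just as valid.

```rust
/// Hypothetical: scans the arguments for an explicit case-sensitivity flag.
/// Returns `None` when the user passed neither flag.
pub fn cli_choice(args: &[String]) -> Option<bool> {
    // Search from the end so that the last matching flag wins.
    args.iter().rev().find_map(|arg| match arg.as_str() {
        "--ignore-case" => Some(true),
        "--case-sensitive" => Some(false),
        _ => None,
    })
}

/// Hypothetical: combines both sources. `env_is_set` stands for the result of
/// `env::var("IGNORE_CASE").is_ok()`.
pub fn resolve_ignore_case(cli_choice: Option<bool>, env_is_set: bool) -> bool {
    // An explicit command line choice takes precedence; otherwise fall back
    // to the environment variable.
    cli_choice.unwrap_or(env_is_set)
}
```

Keeping the decision in a function that takes plain values, rather than
reading the environment directly, also makes the precedence rule easy to test.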
The `std::env` module contains many more useful features for dealing with
environment variables: check out its documentation to see what is available.

## Writing Error Messages to Standard Error Instead of Standard Output

At the moment, we’re writing all of our output to the terminal using the
`println!` macro. In most terminals, there are two kinds of output: *standard
output* (`stdout`) for general information and *standard error* (`stderr`) for
error messages. This distinction enables users to choose to direct the
successful output of a program to a file but still print error messages to the
screen.

The `println!` macro is only capable of printing to standard output, so we
have to use something else to print to standard error.

### Checking Where Errors Are Written

First, let’s observe how the content printed by `minigrep` is currently being
written to standard output, including any error messages we want to write to
standard error instead. We’ll do that by redirecting the standard output stream
to a file while intentionally causing an error. We won’t redirect the standard
error stream, so any content sent to standard error will continue to display on
the screen.

Command line programs are expected to send error messages to the standard error
stream so we can still see error messages on the screen even if we redirect the
standard output stream to a file. Our program is not currently well-behaved:
we’re about to see that it saves the error message output to a file instead!

To demonstrate this behavior, we’ll run the program with `>` and the file path,
*output.txt*, that we want to redirect the standard output stream to. We won’t
pass any arguments, which should cause an error:

```
$ cargo run > output.txt
```

The `>` syntax tells the shell to write the contents of standard output to
*output.txt* instead of the screen. We didn’t see the error message we were
expecting printed to the screen, so that means it must have ended up in the
file. This is what *output.txt* contains:

```
Problem parsing arguments: not enough arguments
```

Yup, our error message is being printed to standard output. It’s much more
useful for error messages like this to be printed to standard error so only
data from a successful run ends up in the file. We’ll change that.

### Printing Errors to Standard Error

We’ll use the code in Listing 12-24 to change how error messages are printed.
Because of the refactoring we did earlier in this chapter, all the code that
prints error messages is in one function, `main`. The standard library provides
the `eprintln!` macro that prints to the standard error stream, so let’s
replace the two `println!` calls that print errors with `eprintln!`.

Filename: src/main.rs

```
fn main() {
    let args: Vec<String> = env::args().collect();

    let config = Config::build(&args).unwrap_or_else(|err| {
        eprintln!("Problem parsing arguments: {err}");
        process::exit(1);
    });

    if let Err(e) = minigrep::run(config) {
        eprintln!("Application error: {e}");

        process::exit(1);
    }
}
```

Listing 12-24: Writing error messages to standard error instead of standard
output using `eprintln!`

Let’s now run the program again in the same way, without any arguments and
redirecting standard output with `>`:

```
$ cargo run > output.txt
Problem parsing arguments: not enough arguments
```

Now we see the error onscreen and *output.txt* contains nothing, which is the
behavior we expect of command line programs.

Let’s run the program again with arguments that don’t cause an error but still
redirect standard output to a file, like so:

```
$ cargo run -- to poem.txt > output.txt
```

We won’t see any output to the terminal, and *output.txt* will contain our
results:

Filename: output.txt

```
Are you nobody, too?
How dreary to be somebody!
```

This demonstrates that we’re now using standard output for successful output
and standard error for error output as appropriate.

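One way to check this stream separation without redirecting output by hand is
to write against any `Write` implementor, so a test can substitute in-memory
buffers for the real streams. This is a sketch under that assumption, not the
chapter’s code; the function name `report` and its parameters are hypothetical.

```rust
use std::io::{self, Write};

/// Hypothetical helper: sends matching lines to `out` and an error message to
/// `err`, mirroring the split between `println!` and `eprintln!`.
pub fn report<O: Write, E: Write>(
    outcome: Result<Vec<&str>, String>,
    out: &mut O,
    err: &mut E,
) -> io::Result<()> {
    match outcome {
        Ok(lines) => {
            // Successful output goes only to the first writer.
            for line in lines {
                writeln!(out, "{line}")?;
            }
        }
        // Error output goes only to the second writer.
        Err(message) => writeln!(err, "Application error: {message}")?,
    }
    Ok(())
}
```

In a real program the same function could be called with `&mut io::stdout()`
and `&mut io::stderr()`, while tests pass two `Vec<u8>` buffers and inspect
each one separately.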
## Summary

This chapter recapped some of the major concepts you’ve learned so far and
covered how to perform common I/O operations in Rust. By using command line
arguments, files, environment variables, and the `eprintln!` macro for printing
errors, you’re now prepared to write command line applications. Combined with
the concepts in previous chapters, your code will be well organized, store data
effectively in the appropriate data structures, handle errors nicely, and be
well tested.

Next, we’ll explore some Rust features that were influenced by functional
languages: closures and iterators.