src/doc/book/second-edition/src/ch16-01-threads.md

   1 ## Using Threads to Run Code Simultaneously
   2
   3 In most current operating systems, an executed program’s code is run in a
   4 *process*, and the operating system manages multiple processes at once. Within
   5 your program, you can also have independent parts that run simultaneously. The
   6 features that run these independent parts are called *threads*.
   7
   8 Splitting the computation in your program into multiple threads can improve
   9 performance because the program does multiple tasks at the same time, but it
  10 also adds complexity. Because threads can run simultaneously, there’s no
  11 inherent guarantee about the order in which parts of your code on different
  12 threads will run. This can lead to problems, such as:
  13
  14 * Race conditions, where threads are accessing data or resources in an
  15   inconsistent order
  16 * Deadlocks, where two threads are waiting for each other to finish using a
  17   resource the other thread has, preventing both threads from continuing
  18 * Bugs that happen only in certain situations and are hard to reproduce and fix
  19   reliably
  20
  21 Rust attempts to mitigate the negative effects of using threads, but
  22 programming in a multithreaded context still takes careful thought and requires
  23 a code structure that is different from that in programs running in a single
  24 thread.
  25
  26 Programming languages implement threads in a few different ways. Many operating
  27 systems provide an API for creating new threads. This model where a language
  28 calls the operating system APIs to create threads is sometimes called *1:1*,
  29 meaning one operating system thread per language thread.
  30
  31 Many programming languages provide their own special implementation of threads.
  32 Programming language-provided threads are known as *green* threads, and
  33 languages that use these green threads will execute them in the context of a
  34 different number of operating system threads. For this reason, the
  35 green-threaded model is called the *M:N* model: there are `M` green threads per
  36 `N` operating system threads, where `M` and `N` are not necessarily the same
  37 number.
  38
  39 Each model has its own advantages and trade-offs, and the trade-off most
  40 important to Rust is runtime support. *Runtime* is a confusing term and can
  41 have different meanings in different contexts.
  42
  43 In this context, by *runtime* we mean code that is included by the language in
  44 every binary. This code can be large or small depending on the language, but
  45 every non-assembly language will have some amount of runtime code. For that
  46 reason, colloquially when people say a language has “no runtime,” they often
  47 mean “small runtime.” Smaller runtimes have fewer features but have the
  48 advantage of resulting in smaller binaries, which make it easier to combine the
  49 language with other languages in more contexts. Although many languages are
  50 okay with increasing the runtime size in exchange for more features, Rust needs
  51 to have nearly no runtime and cannot compromise on being able to call into C to
  52 maintain performance.
  53
  54 The green-threading M:N model requires a larger language runtime to manage
  55 threads. As such, the Rust standard library only provides an implementation of
  56 1:1 threading. Because Rust is such a low-level language, there are crates that
  57 implement M:N threading if you would rather trade overhead for aspects such as
  58 more control over which threads run when and lower costs of context
  59 switching.
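
As a small sketch of what the standard library's threads look like in practice, the following code spawns two threads and compares the identifiers they report for themselves. The helper name `spawned_thread_id` is ours, chosen for illustration. Note that a `ThreadId` is Rust's own identifier for a thread, not the operating system's thread ID, so this only shows that each call to `thread::spawn` produces a distinct thread; it doesn't by itself demonstrate the 1:1 mapping.

```rust
use std::thread;
use std::thread::ThreadId;

/// Spawns one thread with the standard library and returns the identifier
/// that thread observed for itself.
fn spawned_thread_id() -> ThreadId {
    thread::spawn(|| thread::current().id())
        .join()
        .expect("the spawned thread panicked")
}

fn main() {
    let main_id = thread::current().id();
    let first = spawned_thread_id();
    let second = spawned_thread_id();

    // Every thread gets its own identifier, and identifiers are not reused,
    // so all three values differ.
    println!("main: {:?}, first: {:?}, second: {:?}", main_id, first, second);
    assert_ne!(main_id, first);
    assert_ne!(first, second);
}
```

The exact values printed depend on the Rust version and on how many threads the process has already created, so we only check that they differ.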
  60
  61 Now that we’ve defined threads in Rust, let’s explore how to use the
  62 thread-related API provided by the standard library.
  63
  64 ### Creating a New Thread with `spawn`
  65
  66 To create a new thread, we call the `thread::spawn` function and pass it a
  67 closure (we talked about closures in Chapter 13) containing the code we want to
  68 run in the new thread. The example in Listing 16-1 prints some text from a main
  69 thread and other text from a new thread:
  70
  71 <span class="filename">Filename: src/main.rs</span>
  72
  73 ```rust
  74 use std::thread;
  75 use std::time::Duration;
  76
  77 fn main() {
  78     thread::spawn(|| {
  79         for i in 1..10 {
  80             println!("hi number {} from the spawned thread!", i);
  81             thread::sleep(Duration::from_millis(1));
  82         }
  83     });
  84
  85     for i in 1..5 {
  86         println!("hi number {} from the main thread!", i);
  87         thread::sleep(Duration::from_millis(1));
  88     }
  89 }
  90 ```
  91
  92 <span class="caption">Listing 16-1: Creating a new thread to print one thing
  93 while the main thread prints something else</span>
  94
  95 Note that the new thread will be stopped when the main thread ends, whether
  96 or not it has finished running. The output from this program might be a
  97 little different every time, but it will look similar to the
  98 following:
  99
 100 ```text
 101 hi number 1 from the main thread!
 102 hi number 1 from the spawned thread!
 103 hi number 2 from the main thread!
 104 hi number 2 from the spawned thread!
 105 hi number 3 from the main thread!
 106 hi number 3 from the spawned thread!
 107 hi number 4 from the main thread!
 108 hi number 4 from the spawned thread!
 109 hi number 5 from the spawned thread!
 110 ```
 111
 112 The calls to `thread::sleep` force a thread to stop its execution for a short
 113 duration, allowing a different thread to run. The threads will probably take
 114 turns, but that isn’t guaranteed: it depends on how your operating system
 115 schedules the threads. In this run, the main thread printed first, even though
 116 the print statement from the spawned thread appears first in the code. And even
 117 though we told the spawned thread to print until `i` is 9, it only got to 5
 118 before the main thread shut down.
 119
 120 If you run this code and only see output from the main thread, or don’t see any
 121 overlap, try increasing the numbers in the ranges to create more opportunities
 122 for the operating system to switch between the threads.
 123
 124 ### Waiting for All Threads to Finish Using `join` Handles
 125
 126 The code in Listing 16-1 not only stops the spawned thread prematurely most of
 127 the time due to the main thread ending, but also can’t guarantee that the
 128 spawned thread will get to run at all. The reason is that there is no guarantee
 129 on the order in which threads run!
 130
 131 We can fix the problem of the spawned thread not getting to run, or not getting
 132 to run completely, by saving the return value of `thread::spawn` in a variable.
 133 The return type of `thread::spawn` is `JoinHandle`. A `JoinHandle` is an owned
 134 value that, when we call the `join` method on it, will wait for its thread to
 135 finish. Listing 16-2 shows how to use the `JoinHandle` of the thread we created
 136 in Listing 16-1 and call `join` to make sure the spawned thread finishes before
 137 `main` exits:
 138
 139 <span class="filename">Filename: src/main.rs</span>
 140
 141 ```rust
 142 use std::thread;
 143 use std::time::Duration;
 144
 145 fn main() {
 146     let handle = thread::spawn(|| {
 147         for i in 1..10 {
 148             println!("hi number {} from the spawned thread!", i);
 149             thread::sleep(Duration::from_millis(1));
 150         }
 151     });
 152
 153     for i in 1..5 {
 154         println!("hi number {} from the main thread!", i);
 155         thread::sleep(Duration::from_millis(1));
 156     }
 157
 158     handle.join().unwrap();
 159 }
 160 ```
 161
 162 <span class="caption">Listing 16-2: Saving a `JoinHandle` from `thread::spawn`
 163 to guarantee the thread is run to completion</span>
 164
 165 Calling `join` on the handle blocks the thread currently running until the
 166 thread represented by the handle terminates. *Blocking* a thread means that
 167 thread is prevented from performing work or exiting. Because we’ve put the call
 168 to `join` after the main thread’s `for` loop, running Listing 16-2 should
 169 produce output similar to this:
 170
 171 ```text
 172 hi number 1 from the main thread!
 173 hi number 2 from the main thread!
 174 hi number 1 from the spawned thread!
 175 hi number 3 from the main thread!
 176 hi number 2 from the spawned thread!
 177 hi number 4 from the main thread!
 178 hi number 3 from the spawned thread!
 179 hi number 4 from the spawned thread!
 180 hi number 5 from the spawned thread!
 181 hi number 6 from the spawned thread!
 182 hi number 7 from the spawned thread!
 183 hi number 8 from the spawned thread!
 184 hi number 9 from the spawned thread!
 185 ```
 186
 187 The two threads continue alternating, but the main thread waits because of the
 188 call to `handle.join()` and does not end until the spawned thread is finished.
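
The `unwrap` after `join` is there because `join` returns a `Result`: `Ok` holds whatever the thread's closure returned, and `Err` means the thread panicked. Here is a minimal sketch of both cases. The function names `sum_on_thread` and `panicked_on_thread` are ours, made up for this illustration.

```rust
use std::thread;

/// Sums the numbers 1 through 10 on a spawned thread and hands the result
/// back to the caller through `join`.
fn sum_on_thread() -> u64 {
    let handle = thread::spawn(|| (1..=10u64).sum::<u64>());
    // `join` blocks until the thread finishes, then yields `Ok(value)`.
    handle.join().unwrap()
}

/// Returns `true` if the spawned thread panicked instead of finishing.
fn panicked_on_thread() -> bool {
    let handle = thread::spawn(|| {
        panic!("something went wrong in the spawned thread");
    });
    // A panic in the spawned thread shows up here as `Err`; it does not
    // panic the thread that calls `join`.
    handle.join().is_err()
}

fn main() {
    println!("sum = {}", sum_on_thread()); // prints "sum = 55"
    println!("panicked = {}", panicked_on_thread()); // prints "panicked = true"
}
```

The panic message from the second thread is still printed to standard error, but the main thread keeps running and can decide what to do with the `Err`.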
 189
 190 But let’s see what happens when we instead move `handle.join()` before the
 191 `for` loop in `main`, like this:
 192
 193 <span class="filename">Filename: src/main.rs</span>
 194
 195 ```rust
 196 use std::thread;
 197 use std::time::Duration;
 198
 199 fn main() {
 200     let handle = thread::spawn(|| {
 201         for i in 1..10 {
 202             println!("hi number {} from the spawned thread!", i);
 203             thread::sleep(Duration::from_millis(1));
 204         }
 205     });
 206
 207     handle.join().unwrap();
 208
 209     for i in 1..5 {
 210         println!("hi number {} from the main thread!", i);
 211         thread::sleep(Duration::from_millis(1));
 212     }
 213 }
 214 ```
 215
 216 The main thread will wait for the spawned thread to finish and then run its
 217 `for` loop, so the output won’t be interleaved anymore, as shown here:
 218
 219 ```text
 220 hi number 1 from the spawned thread!
 221 hi number 2 from the spawned thread!
 222 hi number 3 from the spawned thread!
 223 hi number 4 from the spawned thread!
 224 hi number 5 from the spawned thread!
 225 hi number 6 from the spawned thread!
 226 hi number 7 from the spawned thread!
 227 hi number 8 from the spawned thread!
 228 hi number 9 from the spawned thread!
 229 hi number 1 from the main thread!
 230 hi number 2 from the main thread!
 231 hi number 3 from the main thread!
 232 hi number 4 from the main thread!
 233 ```
 234
 235 Small details, such as where `join` is called, can affect whether or not your
 236 threads run at the same time.
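
The same consideration applies when we have more than one spawned thread. The sketch below, with a hypothetical helper named `spawn_all_then_join`, starts every thread first and only then joins them. It uses `move` so that each closure gets its own copy of the loop index `i`.

```rust
use std::thread;
use std::time::Duration;

/// Spawns `count` threads up front, then joins them in spawn order.
/// Each thread returns its own index multiplied by 10.
fn spawn_all_then_join(count: u64) -> Vec<u64> {
    // Start every thread before waiting on any of them, so they can overlap.
    // `collect` forces all the `spawn` calls to happen here.
    let handles: Vec<thread::JoinHandle<u64>> = (0..count)
        .map(|i| {
            thread::spawn(move || {
                thread::sleep(Duration::from_millis(1));
                i * 10
            })
        })
        .collect();

    // Joining in spawn order makes the order of the results deterministic,
    // even though the threads themselves may finish in any order.
    handles
        .into_iter()
        .map(|handle| handle.join().unwrap())
        .collect()
}

fn main() {
    println!("{:?}", spawn_all_then_join(4)); // prints "[0, 10, 20, 30]"
}
```

If we instead called `join` inside the first `map`, right after each `spawn`, every thread would have to finish before the next one started, and the threads would no longer run at the same time.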
 237
 238 ### Using `move` Closures with Threads
 239
 240 The `move` closure is often used alongside `thread::spawn` because it allows
 241 you to use data from one thread in another thread.
 242
 243 In Chapter 13, we mentioned we can use the `move` keyword before the parameter
 244 list of a closure to force the closure to take ownership of the values it uses
 245 in the environment. This technique is especially useful when creating new
 246 threads in order to transfer ownership of values from one thread to another.
 247
 248 Notice in Listing 16-1 that the closure we pass to `thread::spawn` takes no
 249 arguments and captures nothing: we’re not using any data from the main thread
 250 in the spawned thread’s code. To use data from the main thread in the spawned
 251 thread, the spawned thread’s closure must capture the values it needs. Listing
 252 16-3 shows an attempt to create a vector in the main thread and use it in the
 253 spawned thread. However, this won’t yet work, as you’ll see in a moment.
 254
 255 <span class="filename">Filename: src/main.rs</span>
 256
 257 ```rust,ignore
 258 use std::thread;
 259
 260 fn main() {
 261     let v = vec![1, 2, 3];
 262
 263     let handle = thread::spawn(|| {
 264         println!("Here's a vector: {:?}", v);
 265     });
 266
 267     handle.join().unwrap();
 268 }
 269 ```
 270
 271 <span class="caption">Listing 16-3: Attempting to use a vector created by the
 272 main thread in another thread</span>
 273
 274 The closure uses `v`, so it will capture `v` and make it part of the closure’s
 275 environment. Because `thread::spawn` runs this closure in a new thread, we
 276 should be able to access `v` inside that new thread. But when we compile this
 277 example, we get the following error:
 278
 279 ```text
 280 error[E0373]: closure may outlive the current function, but it borrows `v`,
 281 which is owned by the current function
 282  --> src/main.rs:6:32
 283   |
 284 6 |     let handle = thread::spawn(|| {
 285   |                                ^^ may outlive borrowed value `v`
 286 7 |         println!("Here's a vector: {:?}", v);
 287   |                                           - `v` is borrowed here
 288   |
 289 help: to force the closure to take ownership of `v` (and any other referenced
 290 variables), use the `move` keyword
 291   |
 292 6 |     let handle = thread::spawn(move || {
 293   |                                ^^^^^^^
 294 ```
 295
 296 Rust *infers* how to capture `v`, and because `println!` only needs a reference
 297 to `v`, the closure tries to borrow `v`. However, there’s a problem: Rust can’t
 298 tell how long the spawned thread will run, so it doesn’t know if the reference
 299 to `v` will always be valid.
 300
 301 Listing 16-4 provides a scenario that’s more likely to have a reference to `v`
 302 that won’t be valid:
 303
 304 <span class="filename">Filename: src/main.rs</span>
 305
 306 ```rust,ignore
 307 use std::thread;
 308
 309 fn main() {
 310     let v = vec![1, 2, 3];
 311
 312     let handle = thread::spawn(|| {
 313         println!("Here's a vector: {:?}", v);
 314     });
 315
 316     drop(v); // oh no!
 317
 318     handle.join().unwrap();
 319 }
 320 ```
 321
 322 <span class="caption">Listing 16-4: A thread with a closure that attempts to
 323 capture a reference to `v` from a main thread that drops `v`</span>
 324
 325 If we were allowed to run this code, there’s a possibility the spawned thread
 326 would be immediately put in the background without running at all. The spawned
 327 thread has a reference to `v` inside, but the main thread immediately drops
 328 `v`, using the `drop` function we discussed in Chapter 15. Then, when the
 329 spawned thread starts to execute, `v` is no longer valid, so a reference to it
 330 is also invalid. Oh no!
 331
 332 To fix the compiler error in Listing 16-3, we can use the error message’s
 333 advice:
 334
 335 ```text
 336 help: to force the closure to take ownership of `v` (and any other referenced
 337 variables), use the `move` keyword
 338   |
 339 6 |     let handle = thread::spawn(move || {
 340   |                                ^^^^^^^
 341 ```
 342
 343 By adding the `move` keyword before the closure, we force the closure to take
 344 ownership of the values it’s using rather than allowing Rust to infer that it
 345 should borrow the values. The modification to Listing 16-3 shown in Listing
 346 16-5 will compile and run as we intend:
 347
 348 <span class="filename">Filename: src/main.rs</span>
 349
 350 ```rust
 351 use std::thread;
 352
 353 fn main() {
 354     let v = vec![1, 2, 3];
 355
 356     let handle = thread::spawn(move || {
 357         println!("Here's a vector: {:?}", v);
 358     });
 359
 360     handle.join().unwrap();
 361 }
 362 ```
 363
 364 <span class="caption">Listing 16-5: Using the `move` keyword to force a closure
 365 to take ownership of the values it uses</span>
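
Moving a value into a thread doesn't have to be a one-way trip. Because `join` hands back whatever the closure returns, the spawned thread can return the value it was given. The following is a minimal sketch of that round trip; the function `push_on_thread` and its parameters are hypothetical names used only for this example.

```rust
use std::thread;

/// Moves `v` into a spawned thread, lets that thread append `extra` to it,
/// and gets ownership back through the value returned from `join`.
fn push_on_thread(v: Vec<i32>, extra: i32) -> Vec<i32> {
    let handle = thread::spawn(move || {
        // The closure owns `v` now, so it can rebind it as mutable.
        let mut v = v;
        v.push(extra);
        v
    });

    handle.join().unwrap()
}

fn main() {
    let v = vec![1, 2, 3];
    // The original `v` is moved into the function; we get a vector back.
    let v = push_on_thread(v, 4);
    println!("Here's the vector again: {:?}", v); // prints "[1, 2, 3, 4]" after the label
}
```

At every point exactly one thread owns the vector, so the ownership rules are satisfied without any borrowing between threads.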
 366
 367 What would happen to the code in Listing 16-4 where the main thread called
 368 `drop` if we use a `move` closure? Would `move` fix that case? Unfortunately,
 369 no; we would get a different error because what Listing 16-4 is trying to do
 370 isn’t allowed for a different reason. If we added `move` to the closure, we
 371 would move `v` into the closure’s environment, and we could no longer call
 372 `drop` on it in the main thread. We would get this compiler error instead:
 373
 374 ```text
 375 error[E0382]: use of moved value: `v`
 376   --> src/main.rs:10:10
 377    |
 378 6  |     let handle = thread::spawn(move || {
 379    |                                ------- value moved (into closure) here
 380 ...
 381 10 |     drop(v); // oh no!
 382    |          ^ value used here after move
 383    |
 384    = note: move occurs because `v` has type `std::vec::Vec<i32>`, which does
 385    not implement the `Copy` trait
 386 ```
 387
 388 Rust’s ownership rules have saved us again! We got an error from the code in
 389 Listing 16-3 because Rust was being conservative and only borrowing `v` for the
 390 thread, which meant the main thread could theoretically invalidate the spawned
 391 thread’s reference. By telling Rust to move ownership of `v` to the spawned
 392 thread, we’re guaranteeing to Rust that the main thread won’t use `v` anymore. If
 393 we change Listing 16-4 in the same way, we’re then violating the ownership
 394 rules when we try to use `v` in the main thread. The `move` keyword overrides
 395 Rust’s conservative default of borrowing; it doesn’t let us violate the
 396 ownership rules.
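
If the main thread really does need to keep using `v`, one option that stays within the ownership rules is to give the spawned thread its own copy. This isn't one of the numbered listings; it's a sketch of a common pattern, with a hypothetical helper named `sum_of_clone`.

```rust
use std::thread;

/// Sums a copy of `v` on a spawned thread while the caller keeps `v`.
fn sum_of_clone(v: &[i32]) -> i32 {
    // The copy is a separate vector; only it is moved into the closure.
    let for_thread = v.to_vec();
    let handle = thread::spawn(move || for_thread.iter().sum::<i32>());
    handle.join().unwrap()
}

fn main() {
    let v = vec![1, 2, 3];
    let total = sum_of_clone(&v);
    // `v` was never moved, so the main thread can still use and drop it.
    println!("sum of {:?} is {}", v, total); // prints "sum of [1, 2, 3] is 6"
    drop(v);
}
```

The trade-off is the cost of copying the data, which may or may not matter depending on how large the vector is.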
 397
 398 With a basic understanding of threads and the thread API, let’s look at what we
 399 can *do* with threads.