## Using Threads to Run Code Simultaneously

In most current operating systems, an executed program’s code is run in a
*process*, and the operating system manages multiple processes at once. Within
your program, you can also have independent parts that run simultaneously. The
feature that runs these independent parts is called *threads*.

Splitting the computation in your program into multiple threads can improve
performance because the program does multiple tasks at the same time, but it
also adds complexity. Because threads can run simultaneously, there’s no
inherent guarantee about the order in which parts of your code on different
threads will run. This can lead to problems, such as:

* Race conditions, where threads are accessing data or resources in an
  inconsistent order
* Deadlocks, where two threads are waiting for each other to finish using a
  resource the other thread has, preventing both threads from continuing
* Bugs that only happen in certain situations and are hard to reproduce and fix
  reliably

Rust attempts to mitigate the negative effects of using threads. Programming in
a multithreaded context still takes careful thought and requires a code
structure that is different from programs that run in a single thread.

Programming languages implement threads in a few different ways. Many operating
systems provide an API for creating new threads. This model where a language
calls the operating system APIs to create threads is sometimes called *1:1*,
one operating system thread per one language thread.

Many programming languages provide their own special implementation of threads.
Programming language-provided threads are known as *green* threads, and
languages that use these green threads will execute them in the context of a
different number of operating system threads. For this reason, the green
threaded model is called the *M:N* model: `M` green threads per `N` operating
system threads, where `M` and `N` are not necessarily the same number.

Each model has its own advantages and trade-offs, and the trade-off most
important to Rust is runtime support. Runtime is a confusing term and can have
different meanings in different contexts.

In this context, by *runtime* we mean code that is included by the language in
every binary. This code can be large or small depending on the language, but
every non-assembly language will have some amount of runtime code. For that
reason, colloquially when people say a language has “no runtime,” they often
mean “small runtime.” Smaller runtimes have fewer features but have the
advantage of resulting in smaller binaries, which make it easier to combine the
language with other languages in more contexts. Although many languages are
okay with increasing the runtime size in exchange for more features, Rust needs
to have nearly no runtime and cannot compromise on being able to call into C to
maintain performance.

The green threading M:N model requires a larger language runtime to manage
threads. As such, the Rust standard library only provides an implementation of
1:1 threading. Because Rust is such a low-level language, there are crates that
implement M:N threading if you would rather trade overhead for aspects such as
more control over which threads run when and lower costs of context switching,
for example.

Now that we’ve defined threads in Rust, let’s explore how to use the
thread-related API provided by the standard library.

### Creating a New Thread with `spawn`

To create a new thread, we call the `thread::spawn` function and pass it a
closure (we talked about closures in Chapter 13) containing the code we want to
run in the new thread. The example in Listing 16-1 prints some text from a main
thread and other text from a new thread:

<span class="filename">Filename: src/main.rs</span>

```rust
use std::thread;
use std::time::Duration;

fn main() {
    thread::spawn(|| {
        for i in 1..10 {
            println!("hi number {} from the spawned thread!", i);
            thread::sleep(Duration::from_millis(1));
        }
    });

    for i in 1..5 {
        println!("hi number {} from the main thread!", i);
        thread::sleep(Duration::from_millis(1));
    }
}
```

<span class="caption">Listing 16-1: Creating a new thread to print one thing
while the main thread prints something else</span>

Note that with this function, the new thread will be stopped when the main
thread ends, whether or not it has finished running. The output from this
program might be a little different every time, but it will look similar to the
following:

```text
hi number 1 from the main thread!
hi number 1 from the spawned thread!
hi number 2 from the main thread!
hi number 2 from the spawned thread!
hi number 3 from the main thread!
hi number 3 from the spawned thread!
hi number 4 from the main thread!
hi number 4 from the spawned thread!
hi number 5 from the spawned thread!
```

The calls to `thread::sleep` force a thread to stop its execution for a short
duration, which allows a different thread to run. The threads will probably
take turns, but that isn’t guaranteed: it depends on how your operating system
schedules the threads. In this run, the main thread printed first, even though
the print statement from the spawned thread appears first in the code. And even
though we told the spawned thread to print until `i` is 9, it only got to 5
before the main thread shut down.

If you run this code and only see output from the main thread, or don’t see any
overlap, try increasing the numbers in the ranges to create more opportunities
for the operating system to switch between the threads.

### Waiting for All Threads to Finish Using `join` Handles

The code in Listing 16-1 not only stops the spawned thread prematurely most of
the time due to the main thread ending, but there is no guarantee that the
spawned thread will get to run at all. The reason is that there is no guarantee
on the order in which threads run!

We can fix the problem of the spawned thread not getting to run, or not getting
to run completely, by saving the return value of `thread::spawn` in a variable.
The return type of `thread::spawn` is `JoinHandle`. A `JoinHandle` is an owned
value that, when we call the `join` method on it, will wait for its thread to
finish. Listing 16-2 shows how to use the `JoinHandle` of the thread we created
in Listing 16-1 and call `join` to make sure the spawned thread finishes before
`main` exits:

<span class="filename">Filename: src/main.rs</span>

```rust
use std::thread;
use std::time::Duration;

fn main() {
    let handle = thread::spawn(|| {
        for i in 1..10 {
            println!("hi number {} from the spawned thread!", i);
            thread::sleep(Duration::from_millis(1));
        }
    });

    for i in 1..5 {
        println!("hi number {} from the main thread!", i);
        thread::sleep(Duration::from_millis(1));
    }

    handle.join().unwrap();
}
```

<span class="caption">Listing 16-2: Saving a `JoinHandle` from `thread::spawn`
to guarantee the thread is run to completion</span>

Calling `join` on the handle blocks the thread currently running until the
thread represented by the handle terminates. *Blocking* a thread means that
thread is prevented from performing work or exiting. Because we’ve put the call
to `join` after the main thread’s `for` loop, running Listing 16-2 should
produce output similar to this:

```text
hi number 1 from the main thread!
hi number 2 from the main thread!
hi number 1 from the spawned thread!
hi number 3 from the main thread!
hi number 2 from the spawned thread!
hi number 4 from the main thread!
hi number 3 from the spawned thread!
hi number 4 from the spawned thread!
hi number 5 from the spawned thread!
hi number 6 from the spawned thread!
hi number 7 from the spawned thread!
hi number 8 from the spawned thread!
hi number 9 from the spawned thread!
```

The two threads continue alternating, but the main thread waits because of the
call to `handle.join()` and does not end until the spawned thread is finished.

But let’s see what happens when we instead move `handle.join()` before the
`for` loop in `main`, like this:

<span class="filename">Filename: src/main.rs</span>

```rust
use std::thread;
use std::time::Duration;

fn main() {
    let handle = thread::spawn(|| {
        for i in 1..10 {
            println!("hi number {} from the spawned thread!", i);
            thread::sleep(Duration::from_millis(1));
        }
    });

    handle.join().unwrap();

    for i in 1..5 {
        println!("hi number {} from the main thread!", i);
        thread::sleep(Duration::from_millis(1));
    }
}
```

The main thread will wait for the spawned thread to finish and then run its
`for` loop, so the output won’t be interleaved anymore, as shown here:

```text
hi number 1 from the spawned thread!
hi number 2 from the spawned thread!
hi number 3 from the spawned thread!
hi number 4 from the spawned thread!
hi number 5 from the spawned thread!
hi number 6 from the spawned thread!
hi number 7 from the spawned thread!
hi number 8 from the spawned thread!
hi number 9 from the spawned thread!
hi number 1 from the main thread!
hi number 2 from the main thread!
hi number 3 from the main thread!
hi number 4 from the main thread!
```
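
One way to see the effect of moving `join` is to time both arrangements. The
following is a minimal sketch, not one of the chapter’s listings: the function
names `join_after_main_work` and `join_before_main_work` and the 50 ms `WORK`
constant are made up for illustration, and `thread::sleep` stands in for real
work. Exact timings depend on how your operating system schedules the threads,
but when `join` comes first the total can never be less than two full sleeps,
because the main thread doesn’t start its own work until the spawned thread has
finished:

```rust
use std::thread;
use std::time::{Duration, Instant};

// Hypothetical amount of "work" per thread; sleeping stands in for real
// computation.
const WORK: Duration = Duration::from_millis(50);

// Calls `join` only after the main thread has done its own work, so the two
// sleeps are allowed to overlap.
fn join_after_main_work() -> Duration {
    let start = Instant::now();
    let handle = thread::spawn(|| thread::sleep(WORK));
    thread::sleep(WORK); // the main thread's own work
    handle.join().unwrap();
    start.elapsed()
}

// Calls `join` right away, so the main thread's work starts only once the
// spawned thread is done and the two sleeps run back to back.
fn join_before_main_work() -> Duration {
    let start = Instant::now();
    let handle = thread::spawn(|| thread::sleep(WORK));
    handle.join().unwrap();
    thread::sleep(WORK); // the main thread's own work
    start.elapsed()
}

fn main() {
    println!("join after main's work:  {:?}", join_after_main_work());
    println!("join before main's work: {:?}", join_before_main_work());
}
```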

Small details, such as where `join` is called, can affect whether or not your
threads run at the same time.
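
The listings in this section call `unwrap` on the value returned by `join`
without saying why. In the standard library, `join` returns a `Result`: `Ok`
holding whatever the thread’s closure returned, or `Err` if the thread
panicked. Here is a small sketch of both cases. The helper function names are
hypothetical and exist only for this illustration, and running it also prints
the second thread’s panic message to standard error, which is expected:

```rust
use std::thread;

// Spawns a thread that finishes normally; `join` hands back the closure's
// return value wrapped in `Ok`.
fn join_successful_thread() -> thread::Result<i32> {
    let handle = thread::spawn(|| 1 + 2);
    handle.join()
}

// Spawns a thread that panics; `join` reports the panic as an `Err` instead
// of taking down the thread that called `join`.
fn join_panicking_thread() -> thread::Result<i32> {
    let handle = thread::spawn(|| -> i32 {
        panic!("the spawned thread hit a problem")
    });
    handle.join()
}

fn main() {
    match join_successful_thread() {
        Ok(value) => println!("spawned thread returned {}", value),
        Err(_) => println!("spawned thread panicked"),
    }

    if join_panicking_thread().is_err() {
        println!("second spawned thread panicked, but main kept running");
    }
}
```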

### Using `move` Closures with Threads

The `move` closure, which we mentioned briefly in Chapter 13, is often used
alongside `thread::spawn` because it allows us to use data from one thread in
another thread.

In Chapter 13, we said that “If we want to force the closure to take ownership
of the values it uses in the environment, we can use the `move` keyword before
the parameter list. This technique is mostly useful when passing a closure to a
new thread to move the data so it’s owned by the new thread.”

Now that we’re creating new threads, we’ll talk about capturing values in
closures.

Notice in Listing 16-1 that the closure we pass to `thread::spawn` takes no
arguments: we’re not using any data from the main thread in the spawned
thread’s code. To do so, the spawned thread’s closure must capture the values
it needs. Listing 16-3 shows an attempt to create a vector in the main thread
and use it in the spawned thread. However, this won’t yet work, as you’ll see
in a moment:

<span class="filename">Filename: src/main.rs</span>

```rust,ignore
use std::thread;

fn main() {
    let v = vec![1, 2, 3];

    let handle = thread::spawn(|| {
        println!("Here's a vector: {:?}", v);
    });

    handle.join().unwrap();
}
```

<span class="caption">Listing 16-3: Attempting to use a vector created by the
main thread in another thread</span>

The closure uses `v`, so it will capture `v` and make it part of the closure’s
environment. Because `thread::spawn` runs this closure in a new thread, we
should be able to access `v` inside that new thread. But when we compile this
example, we get the following error:

```text
error[E0373]: closure may outlive the current function, but it borrows `v`,
which is owned by the current function
 --> src/main.rs:6:32
  |
6 |     let handle = thread::spawn(|| {
  |                                ^^ may outlive borrowed value `v`
7 |         println!("Here's a vector: {:?}", v);
  |                                           - `v` is borrowed here
  |
help: to force the closure to take ownership of `v` (and any other referenced
variables), use the `move` keyword
  |
6 |     let handle = thread::spawn(move || {
  |                                ^^^^^^^
```

Rust *infers* how to capture `v`, and because `println!` only needs a reference
to `v`, the closure tries to borrow `v`. However, there’s a problem: Rust can’t
tell how long the spawned thread will run, so it doesn’t know if the reference
to `v` will always be valid.

Listing 16-4 provides a scenario that’s more likely to have a reference to `v`
that won’t be valid:

<span class="filename">Filename: src/main.rs</span>

```rust,ignore
use std::thread;

fn main() {
    let v = vec![1, 2, 3];

    let handle = thread::spawn(|| {
        println!("Here's a vector: {:?}", v);
    });

    drop(v); // oh no!

    handle.join().unwrap();
}
```

<span class="caption">Listing 16-4: A thread with a closure that attempts to
capture a reference to `v` from a main thread that drops `v`</span>

If we were allowed to run this code, there’s a possibility the spawned thread
would be immediately put in the background without running at all. The spawned
thread has a reference to `v` inside, but the main thread immediately drops
`v`, using the `drop` function we discussed in Chapter 15. Then, when the
spawned thread starts to execute, `v` is no longer valid, so a reference to it
is also invalid. Oh no!
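
As an aside, the borrow is rejected because nothing ties the lifetime of a
thread created with `thread::spawn` to `v`. Assuming a toolchain of Rust 1.63
or later, the standard library also has `std::thread::scope`, which joins every
thread started inside the scope before it returns, so a plain borrow is
accepted there. The sketch below is only for contrast: the function name
`sum_on_scoped_thread` is hypothetical, and the rest of this section continues
with `thread::spawn`:

```rust
use std::thread;

// Hypothetical helper: sums the slice on a scoped thread that only borrows
// it. Requires Rust 1.63 or later for `std::thread::scope`.
fn sum_on_scoped_thread(v: &[i32]) -> i32 {
    thread::scope(|s| {
        // The closure borrows `v`; no `move` is needed because the scope
        // guarantees this thread is finished before `scope` returns.
        let handle = s.spawn(|| v.iter().sum::<i32>());
        handle.join().unwrap()
    })
}

fn main() {
    let v = vec![1, 2, 3];
    let total = sum_on_scoped_thread(&v);

    // `v` was only borrowed, so the main thread can still use it here.
    println!("sum of {:?} is {}", v, total);
}
```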

To fix the compiler error in Listing 16-3, we can use the error message’s
advice:

```text
help: to force the closure to take ownership of `v` (and any other referenced
variables), use the `move` keyword
  |
6 |     let handle = thread::spawn(move || {
  |                                ^^^^^^^
```

By adding the `move` keyword before the closure, we force the closure to take
ownership of the values it’s using rather than allowing Rust to infer that it
should borrow the values. The modification to Listing 16-3 shown in Listing
16-5 will compile and run as we intend:

<span class="filename">Filename: src/main.rs</span>

```rust
use std::thread;

fn main() {
    let v = vec![1, 2, 3];

    let handle = thread::spawn(move || {
        println!("Here's a vector: {:?}", v);
    });

    handle.join().unwrap();
}
```

<span class="caption">Listing 16-5: Using the `move` keyword to force a closure
to take ownership of the values it uses</span>

What would happen to the code in Listing 16-4 where the main thread called
`drop` if we use a `move` closure? Would `move` fix that case? Unfortunately,
no; we would get a different error because what Listing 16-4 is trying to do
isn’t allowed for a different reason. If we add `move` to the closure, we would
move `v` into the closure’s environment, and we could no longer call `drop` on
it in the main thread. We would get this compiler error instead:

```text
error[E0382]: use of moved value: `v`
  --> src/main.rs:10:10
   |
6  |     let handle = thread::spawn(move || {
   |                                ------- value moved (into closure) here
...
10 |     drop(v); // oh no!
   |          ^ value used here after move
   |
   = note: move occurs because `v` has type `std::vec::Vec<i32>`, which does
   not implement the `Copy` trait
```

Rust’s ownership rules have saved us again! We got an error from the code in
Listing 16-3 because Rust was being conservative and only borrowing `v` for the
thread, which meant the main thread could theoretically invalidate the spawned
thread’s reference. By telling Rust to move ownership of `v` to the spawned
thread, we’re guaranteeing Rust that the main thread won’t use `v` anymore. If
we change Listing 16-4 in the same way, we’re then violating the ownership
rules when we try to use `v` in the main thread. The `move` keyword overrides
Rust’s conservative default of borrowing; it doesn’t let us violate the
ownership rules.
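
If the main thread does need to keep using its data, one common option is to
give each spawned thread its own copy and move that copy instead. The following
is a minimal sketch under that assumption. The function name `scaled_sums` and
its `workers` parameter are hypothetical; each thread takes ownership of a
cloned vector and its own `id` through a `move` closure, and the handles are
joined in the order the threads were spawned:

```rust
use std::thread;

// Hypothetical helper: spawns `workers` threads. Each thread owns its own
// copy of `data` and returns the sum of that copy multiplied by its id.
fn scaled_sums(data: &[i32], workers: i32) -> Vec<i32> {
    let mut handles = Vec::new();

    for id in 1..=workers {
        // Make a copy for this thread; the caller's data is left untouched.
        let owned = data.to_vec();

        // `move` transfers ownership of `owned` (and a copy of `id`) into
        // the closure, so nothing borrowed from this function crosses over.
        handles.push(thread::spawn(move || id * owned.iter().sum::<i32>()));
    }

    // Join in spawn order so the results line up with the worker ids.
    handles.into_iter().map(|handle| handle.join().unwrap()).collect()
}

fn main() {
    let v = vec![1, 2, 3];
    let results = scaled_sums(&v, 3);

    // The main thread never gave up `v`, so it is still usable here.
    println!("v = {:?}, results = {:?}", v, results);
}
```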

With a basic understanding of threads and the thread API, let’s look at what we
can *do* with threads.