Commit | Line | Data |
---|---|---|
cc61c64b XL |
1 | ## Using Threads to Run Code Simultaneously |
2 | ||
ea8adc8c XL |
3 | In most operating systems today, an executed program’s code is run in a |
4 | *process*, and the operating system manages multiple processes at once. Within | |
5 | your program, you can also have independent parts that run simultaneously. The | |
6 | feature that runs these independent parts is called *threads*. | |
7 | ||
8 | <!-- I've tried to simplify the text above, can you check that I haven't | |
9 | changed meaning? --> | |
10 | <!-- Made some small tweaks, overall seems fine /Carol --> | |
11 | ||
12 | Splitting the computation in your program up into multiple threads can improve | |
13 | performance, since the program will be doing multiple things at the same time, | |
14 | but it also adds complexity. Because threads may run simultaneously, there’s no | |
15 | inherent guarantee about the order in which parts of your code on different | |
16 | threads will run. This can lead to problems such as: | |
17 | ||
18 | - Race conditions, where threads are accessing data or resources in an | |
19 | inconsistent order | |
20 | - Deadlocks, where two threads are waiting for each other to finish using a | |
21 | resource the other thread has, which prevents both threads from continuing | |
22 | - Bugs that only happen in certain situations and are hard to reproduce and | |
23 | fix reliably | |
24 | ||
25 | <!-- How do threads prevent each other from continuing? Or is that something | |
26 | we'll cover later?--> | |
27 | <!-- We don't really get into that later, so I've expanded a bit here /Carol --> | |
28 | ||
29 | Rust attempts to mitigate the negative effects of using threads. Programming in a | |
30 | multithreaded context still takes careful thought and requires a code structure | |
31 | that’s different from programs that run in a single thread. | |
32 | ||
33 | Programming languages implement threads in a few different ways. Many operating | |
34 | systems provide an API for creating new threads. This model where a language | |
35 | calls the operating system APIs to create threads is sometimes called *1:1*, | |
36 | meaning one operating system thread per language thread. | |
37 | ||
38 | Many programming languages provide their own special implementation of threads. | |
39 | Programming language-provided threads are known as *green* threads, and | |
40 | languages that use these green threads will execute them in the context of a | |
41 | different number of operating system threads. For this reason, the green | |
42 | threaded model is called the *M:N* model, `M` green threads per `N` OS threads, | |
43 | where `M` and `N` are not necessarily the same number. | |
44 | ||
45 | Each model has its own advantages and tradeoffs, and the tradeoff most | |
46 | important to Rust is runtime support. *Runtime* is a confusing term and can | |
47 | have different meanings in different contexts. | |
48 | ||
49 | <!-- Below - you mean this is the cause of runtime? Or "runtime" literally | |
50 | means the code included by Rust in every binary? --> | |
51 | <!-- Runtime literally means the code included by Rust in every binary. | |
52 | Wikipedia calls this "runtime system": | |
53 | https://en.wikipedia.org/wiki/Runtime_system but most people colloquially just | |
54 | say "the runtime". I've tried to clarify. /Carol --> | |
55 | ||
56 | In this context, by runtime we mean code that’s included by the language in | |
57 | every binary. This code can be large or small depending on the language, but | |
58 | every non-assembly language will have some amount of runtime code. For that | |
59 | reason, colloquially when people say a language has “no runtime” they often | |
60 | mean “small runtime.” Smaller runtimes have fewer features but have the | |
61 | advantage of resulting in smaller binaries, which make it easier to combine the | |
62 | language with other languages in more contexts. While many languages are okay | |
63 | with increasing the runtime size in exchange for more features, Rust needs to | |
64 | have nearly no runtime, and cannot compromise on being able to call into C in | |
65 | order to maintain performance. | |
66 | ||
67 | The green threading M:N model requires a larger language runtime to manage | |
68 | threads. As such, the Rust standard library only provides an implementation of | |
69 | 1:1 threading. Because Rust is such a low-level language, there are crates that | |
70 | implement M:N threading if you would rather trade overhead for aspects such as | |
71 | more control over which threads run when and lower costs of context switching, | |
72 | for example. | |
73 | ||
74 | Now that we’ve defined threads in Rust, let’s explore how to use the | |
75 | thread-related API provided by the standard library. | |
cc61c64b XL |
76 | |
77 | ### Creating a New Thread with `spawn` | |
78 | ||
ea8adc8c XL |
79 | To create a new thread, we call the `thread::spawn` function, and pass it a |
80 | closure (we talked about closures in Chapter 13) containing the code we want to | |
81 | run in the new thread. The example in Listing 16-1 prints some text from a main | |
82 | thread and other text from a new thread: | |
cc61c64b XL |
83 | |
84 | <span class="filename">Filename: src/main.rs</span> | |
85 | ||
86 | ```rust | |
87 | use std::thread; | |
88 | ||
89 | fn main() { | |
90 | thread::spawn(|| { | |
91 | for i in 1..10 { | |
92 | println!("hi number {} from the spawned thread!", i); | |
93 | } | |
94 | }); | |
95 | ||
96 | for i in 1..5 { | |
97 | println!("hi number {} from the main thread!", i); | |
98 | } | |
99 | } | |
100 | ``` | |
101 | ||
102 | <span class="caption">Listing 16-1: Creating a new thread to print one thing | |
ea8adc8c | 103 | while the main thread prints something else</span> |
cc61c64b | 104 | |
ea8adc8c XL |
105 | Note that with this function, the new thread will be stopped when the main |
106 | thread ends, whether it has finished running or not. The output from this | |
107 | program might be a little different every time, but it will look similar to | |
108 | this: | |
cc61c64b XL |
109 | |
110 | ```text | |
111 | hi number 1 from the main thread! | |
112 | hi number 1 from the spawned thread! | |
113 | hi number 2 from the main thread! | |
114 | hi number 2 from the spawned thread! | |
115 | hi number 3 from the main thread! | |
116 | hi number 3 from the spawned thread! | |
117 | hi number 4 from the main thread! | |
118 | hi number 4 from the spawned thread! | |
119 | hi number 5 from the spawned thread! | |
120 | ``` | |
121 | ||
ea8adc8c XL |
122 | <!-- This seems interesting, how come the threads often take turns, but not |
123 | always? --> | |
124 | <!-- I've added a bit of clarification /Carol --> | |
125 | ||
126 | The threads will probably take turns, but that’s not guaranteed: it depends on | |
127 | how your operating system schedules the threads. In this run, the main thread | |
128 | printed first, even though the print statement from the spawned thread appears | |
129 | first in the code. And even though we told the spawned thread to print until | |
130 | `i` is 9, it only got to 5 before the main thread shut down. | |
131 | ||
132 | If you run this code and only see one thread, or don’t see any overlap, try | |
cc61c64b XL |
133 | increasing the numbers in the ranges to create more opportunities for a thread |
134 | to take a break and give the other thread a turn. | |
135 | ||
136 | #### Waiting for All Threads to Finish Using `join` Handles | |
137 | ||
ea8adc8c XL |
138 | The code in Listing 16-1 not only stops the spawned thread prematurely most of |
139 | the time, because the main thread ends before the spawned thread is done, but | |
140 | there’s also no guarantee that the spawned thread will get to run at all, | |
141 | because there’s no guarantee on the order in which threads run! | |
142 | ||
143 | <!-- Above -- why is this the case, because there are no guarantees over which | |
144 | order the threads run in? --> | |
145 | <!-- Yep! /Carol --> | |
146 | ||
147 | We can fix this by saving the return value of `thread::spawn` in a variable. | |
148 | The return type of `thread::spawn` is `JoinHandle`. A `JoinHandle` is an owned | |
149 | value that, when we call the `join` method on it, will wait for its thread to | |
150 | finish. Listing 16-2 shows how to use the `JoinHandle` of the thread we created | |
151 | in Listing 16-1 and call `join` in order to make sure the spawned thread | |
152 | finishes before `main` exits: | |
153 | ||
154 | <!-- Saving the return value where? I think this explanation of join handle | |
155 | needs expanding, this feels cut short --> | |
156 | <!-- In a variable. I've expanded a bit, but I'm not sure what information | |
157 | seems missing, so I'm not sure if this is sufficient /Carol --> | |
cc61c64b XL |
158 | |
159 | <span class="filename">Filename: src/main.rs</span> | |
160 | ||
161 | ```rust | |
162 | use std::thread; | |
163 | ||
164 | fn main() { | |
165 | let handle = thread::spawn(|| { | |
166 | for i in 1..10 { | |
167 | println!("hi number {} from the spawned thread!", i); | |
168 | } | |
169 | }); | |
170 | ||
171 | for i in 1..5 { | |
172 | println!("hi number {} from the main thread!", i); | |
173 | } | |
174 | ||
175 | handle.join().unwrap(); | |
176 | } | |
177 | ``` | |
178 | ||
179 | <span class="caption">Listing 16-2: Saving a `JoinHandle` from `thread::spawn` | |
180 | to guarantee the thread is run to completion</span> | |
181 | ||
ea8adc8c XL |
182 | Calling `join` on the handle blocks the thread currently running until the |
183 | thread represented by the handle terminates. *Blocking* a thread means that | |
184 | thread is prevented from performing work or exiting. Because we’ve put the call | |
185 | to `join` after the main thread’s `for` loop, running this example should | |
186 | produce output that looks something like this: | |
187 | ||
188 | <!-- Liz: I've added a definition of "block" in the context of threads here, | |
189 | which is the first time we used the term-- it seemed to cause some confusion | |
190 | later on. /Carol --> | |
cc61c64b XL |
191 | |
192 | ```text | |
193 | hi number 1 from the main thread! | |
194 | hi number 2 from the main thread! | |
195 | hi number 1 from the spawned thread! | |
196 | hi number 3 from the main thread! | |
197 | hi number 2 from the spawned thread! | |
198 | hi number 4 from the main thread! | |
199 | hi number 3 from the spawned thread! | |
200 | hi number 4 from the spawned thread! | |
201 | hi number 5 from the spawned thread! | |
202 | hi number 6 from the spawned thread! | |
203 | hi number 7 from the spawned thread! | |
204 | hi number 8 from the spawned thread! | |
205 | hi number 9 from the spawned thread! | |
206 | ``` | |
207 | ||
208 | The two threads are still alternating, but the main thread waits because of the | |
209 | call to `handle.join()` and does not end until the spawned thread is finished. | |
210 | ||
211 | If we instead move `handle.join()` before the `for` loop in main, like this: | |
212 | ||
213 | <span class="filename">Filename: src/main.rs</span> | |
214 | ||
215 | ```rust | |
216 | use std::thread; | |
217 | ||
218 | fn main() { | |
219 | let handle = thread::spawn(|| { | |
220 | for i in 1..10 { | |
221 | println!("hi number {} from the spawned thread!", i); | |
222 | } | |
223 | }); | |
224 | ||
225 | handle.join().unwrap(); | |
226 | ||
227 | for i in 1..5 { | |
228 | println!("hi number {} from the main thread!", i); | |
229 | } | |
230 | } | |
231 | ``` | |
232 | ||
ea8adc8c XL |
233 | The main thread will wait for the spawned thread to finish and then run its |
234 | `for` loop, so the output won’t be interleaved anymore: | |
cc61c64b XL |
235 | |
236 | ```text | |
237 | hi number 1 from the spawned thread! | |
238 | hi number 2 from the spawned thread! | |
239 | hi number 3 from the spawned thread! | |
240 | hi number 4 from the spawned thread! | |
241 | hi number 5 from the spawned thread! | |
242 | hi number 6 from the spawned thread! | |
243 | hi number 7 from the spawned thread! | |
244 | hi number 8 from the spawned thread! | |
245 | hi number 9 from the spawned thread! | |
246 | hi number 1 from the main thread! | |
247 | hi number 2 from the main thread! | |
248 | hi number 3 from the main thread! | |
249 | hi number 4 from the main thread! | |
250 | ``` | |
251 | ||
252 | Small details such as where you call `join` can affect whether your threads | |
253 | actually run at the same time. | |
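
Besides waiting, `join` also hands back whatever value the spawned closure returned, wrapped in a `Result` that is an `Err` only if the thread panicked. The following is a minimal sketch of that behavior; the helper name `sum_on_spawned_thread` is hypothetical and not part of the listings above:

```rust
use std::thread;

// Hypothetical helper for illustration: adds up 1 through 9 on a spawned
// thread and returns the total to the caller.
fn sum_on_spawned_thread() -> i32 {
    let handle = thread::spawn(|| {
        let mut total = 0;
        for i in 1..10 {
            total += i;
        }
        // The closure's final expression becomes the thread's result.
        total
    });

    // `join` blocks the calling thread until the spawned thread finishes,
    // then returns `Ok(value)`; `unwrap` panics if the thread panicked.
    handle.join().unwrap()
}

fn main() {
    let total = sum_on_spawned_thread();
    println!("total computed on the spawned thread: {}", total);
}
```

Because the result only becomes available through `join`, the caller decides when to block: right away, or after doing some of its own work first.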
254 | ||
255 | ### Using `move` Closures with Threads | |
256 | ||
ea8adc8c XL |
257 | The `move` closure, which we didn’t cover in Chapter 13, is often used |
258 | alongside `thread::spawn`, as it allows us to use data from one thread in | |
259 | another thread. | |
cc61c64b | 260 | |
ea8adc8c XL |
261 | In Chapter 13, we said that “Creating closures that capture values from their |
262 | environment is mostly used in the context of starting new threads.” | |
263 | ||
264 | <!-- PROD: DE to check this quote, see if it has changed --> | |
cc61c64b | 265 | |
3b2f2976 | 266 | Now we’re creating new threads, so let’s talk about capturing values in |
cc61c64b XL |
267 | closures! |
268 | ||
ea8adc8c | 269 | Notice in Listing 16-1 that the closure we pass to `thread::spawn` takes no |
3b2f2976 | 270 | arguments: we’re not using any data from the main thread in the spawned |
ea8adc8c XL |
271 | thread’s code. In order to do so, the spawned thread’s closure must capture the |
272 | values it needs. Listing 16-3 shows an attempt to create a vector in the main | |
273 | thread and use it in the spawned thread. However, this won’t yet work, as | |
274 | you’ll see in a moment: | |
cc61c64b XL |
275 | |
276 | <span class="filename">Filename: src/main.rs</span> | |
277 | ||
278 | ```rust,ignore | |
279 | use std::thread; | |
280 | ||
281 | fn main() { | |
282 | let v = vec![1, 2, 3]; | |
283 | ||
284 | let handle = thread::spawn(|| { | |
285 | println!("Here's a vector: {:?}", v); | |
286 | }); | |
287 | ||
288 | handle.join().unwrap(); | |
289 | } | |
290 | ``` | |
291 | ||
292 | <span class="caption">Listing 16-3: Attempting to use a vector created by the | |
ea8adc8c | 293 | main thread in another thread</span> |
cc61c64b | 294 | |
ea8adc8c XL |
295 | The closure uses `v`, so it will capture `v` and make it part of the closure’s |
296 | environment. Because `thread::spawn` runs this closure in a new thread, we | |
297 | should be able to access `v` inside that new thread. | |
cc61c64b | 298 | |
3b2f2976 | 299 | When we compile this example, however, we’ll get the following error: |
cc61c64b XL |
300 | |
301 | ```text | |
302 | error[E0373]: closure may outlive the current function, but it borrows `v`, | |
303 | which is owned by the current function | |
304 | --> | |
305 | | | |
306 | 6 | let handle = thread::spawn(|| { | |
307 | | ^^ may outlive borrowed value `v` | |
308 | 7 | println!("Here's a vector: {:?}", v); | |
309 | | - `v` is borrowed here | |
310 | | | |
311 | help: to force the closure to take ownership of `v` (and any other referenced | |
312 | variables), use the `move` keyword, as shown: | |
313 | | let handle = thread::spawn(move || { | |
314 | ``` | |
315 | ||
ea8adc8c XL |
316 | Rust *infers* how to capture `v`, and since `println!` only needs a reference |
317 | to `v`, the closure tries to borrow `v`. There’s a problem, though: Rust can’t | |
318 | tell how long the spawned thread will run, so it doesn’t know whether the reference to | |
319 | `v` will always be valid. | |
cc61c64b | 320 | |
ea8adc8c XL |
321 | Let’s look at a scenario that’s more likely to have a reference to `v` that |
322 | won’t be valid, shown in Listing 16-4: | |
cc61c64b XL |
323 | |
324 | <span class="filename">Filename: src/main.rs</span> | |
325 | ||
326 | ```rust,ignore | |
327 | use std::thread; | |
328 | ||
329 | fn main() { | |
330 | let v = vec![1, 2, 3]; | |
331 | ||
332 | let handle = thread::spawn(|| { | |
333 | println!("Here's a vector: {:?}", v); | |
334 | }); | |
335 | ||
336 | drop(v); // oh no! | |
337 | ||
338 | handle.join().unwrap(); | |
339 | } | |
340 | ``` | |
341 | ||
342 | <span class="caption">Listing 16-4: A thread with a closure that attempts to | |
343 | capture a reference to `v` from a main thread that drops `v`</span> | |
344 | ||
ea8adc8c XL |
345 | If we were allowed to run this code, there’s a possibility the spawned thread will be |
346 | immediately put in the background without getting a chance to run at all. The | |
347 | spawned thread has a reference to `v` inside, but the main thread immediately | |
348 | drops `v`, using the `drop` function we discussed in Chapter 15. Then, when the | |
349 | spawned thread starts to execute, `v` is no longer valid, so a reference to it | |
350 | is also invalid. Oh no! | |
cc61c64b | 351 | |
ea8adc8c XL |
352 | To fix the problem in Listing 16-3, we can listen to the advice of the error |
353 | message: | |
cc61c64b XL |
354 | |
355 | ```text | |
356 | help: to force the closure to take ownership of `v` (and any other referenced | |
357 | variables), use the `move` keyword, as shown: | |
358 | | let handle = thread::spawn(move || { | |
359 | ``` | |
360 | ||
361 | By adding the `move` keyword before the closure, we force the closure to take | |
ea8adc8c XL |
362 | ownership of the values it’s using, rather than allowing Rust to infer that it |
363 | should borrow. The modification to Listing 16-3 shown in Listing 16-5 will | |
364 | compile and run as we intend: | |
cc61c64b XL |
365 | |
366 | <span class="filename">Filename: src/main.rs</span> | |
367 | ||
368 | ```rust | |
369 | use std::thread; | |
370 | ||
371 | fn main() { | |
372 | let v = vec![1, 2, 3]; | |
373 | ||
374 | let handle = thread::spawn(move || { | |
375 | println!("Here's a vector: {:?}", v); | |
376 | }); | |
377 | ||
378 | handle.join().unwrap(); | |
379 | } | |
380 | ``` | |
381 | ||
382 | <span class="caption">Listing 16-5: Using the `move` keyword to force a closure | |
383 | to take ownership of the values it uses</span> | |
384 | ||
ea8adc8c XL |
385 | <!-- Can you be more specific about the question we're asking about 16-4?--> |
386 | <!-- Done /Carol --> | |
387 | ||
388 | What would happen to the code in Listing 16-4 where the main thread called | |
389 | `drop` if we use a `move` closure? Would `move` fix that case? Nope, we get a | |
390 | different error, because what Listing 16-4 is trying to do isn’t allowed for a | |
391 | different reason! If we add `move` to the closure, we’d move `v` into the | |
392 | closure’s environment, and we could no longer call `drop` on it in the main | |
393 | thread. We would get this compiler error instead: | |
cc61c64b XL |
394 | |
395 | ```text | |
396 | error[E0382]: use of moved value: `v` | |
397 | --> | |
398 | | | |
399 | 6 | let handle = thread::spawn(move || { | |
400 | | ------- value moved (into closure) here | |
401 | ... | |
402 | 10 | drop(v); // oh no! | |
403 | | ^ value used here after move | |
404 | | | |
405 | = note: move occurs because `v` has type `std::vec::Vec<i32>`, which does | |
406 | not implement the `Copy` trait | |
407 | ``` | |
408 | ||
ea8adc8c XL |
409 | Rust’s ownership rules have saved us again! We got an error from the code in |
410 | Listing 16-3 because Rust was being conservative and only borrowing `v` for the | |
411 | thread, which meant the main thread could theoretically invalidate the spawned | |
412 | thread’s reference. By telling Rust to move ownership of `v` to the spawned | |
413 | thread, we’re guaranteeing to Rust that the main thread won’t use `v` anymore. | |
414 | If we change Listing 16-4 in the same way, we’re then violating the ownership | |
415 | rules when we try to use `v` in the main thread. The `move` keyword overrides | |
416 | Rust’s conservative default of borrowing; it doesn’t let us violate the | |
417 | ownership rules. | |
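
If the main thread does need to keep using the data, one option that stays within the ownership rules is to give the spawned thread its own copy. The sketch below assumes that copying the data is acceptable; the helper name `print_copy_on_thread` is hypothetical and only here for illustration:

```rust
use std::thread;

// Hypothetical helper for illustration: clones the data, moves the clone
// into a spawned thread, and returns how many items that thread saw.
fn print_copy_on_thread(v: &[i32]) -> usize {
    // The spawned thread gets its own `Vec`, so nothing is borrowed from
    // the caller once the thread starts.
    let copy = v.to_vec();

    let handle = thread::spawn(move || {
        println!("Here's a vector: {:?}", copy);
        copy.len()
    });

    handle.join().unwrap()
}

fn main() {
    let v = vec![1, 2, 3];
    let seen = print_copy_on_thread(&v);

    // `v` was never moved, so the main thread can still use and drop it.
    println!("main still owns {:?}; the thread saw {} items", v, seen);
    drop(v);
}
```

The `move` keyword is still required here: it moves `copy`, not `v`, into the closure, so the main thread’s later `drop(v)` is allowed.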
418 | ||
419 | <!-- Uh oh, I'm lost again, I thought we were trying to fix 16-4 with move, but | |
420 | we don't want it to work, is that right? Can you talk about this a little?--> | |
421 | <!-- I've tried to clarify a bit in the paragraph before this error and a bit | |
422 | after the error /Carol --> | |
cc61c64b | 423 | |
3b2f2976 | 424 | Now that we have a basic understanding of threads and the thread API, let’s |
cc61c64b | 425 | talk about what we can actually *do* with threads. |