]>
Commit | Line | Data |
---|---|---|
85aaf69f SL |
1 | % Concurrency |
2 | ||
3 | Concurrency and parallelism are incredibly important topics in computer | |
4 | science, and are also a hot topic in industry today. Computers are gaining more | |
5 | and more cores, yet many programmers aren't prepared to fully utilize them. | |
6 | ||
7 | Rust's memory safety features also apply to its concurrency story too. Even | |
8 | concurrent Rust programs must be memory safe, having no data races. Rust's type | |
d9579d0f | 9 | system is up to the task, and gives you powerful ways to reason about |
85aaf69f SL |
10 | concurrent code at compile time. |
11 | ||
12 | Before we talk about the concurrency features that come with Rust, it's important | |
c1a9b12d SL |
13 | to understand something: Rust is low-level enough that the vast majority of |
14 | this is provided by the standard library, not by the language. This means that | |
15 | if you don't like some aspect of the way Rust handles concurrency, you can | |
16 | implement an alternative way of doing things. | |
17 | [mio](https://github.com/carllerche/mio) is a real-world example of this | |
18 | principle in action. | |
85aaf69f SL |
19 | |
20 | ## Background: `Send` and `Sync` | |
21 | ||
22 | Concurrency is difficult to reason about. In Rust, we have a strong, static | |
23 | type system to help us reason about our code. As such, Rust gives us two traits | |
24 | to help us make sense of code that can possibly be concurrent. | |
25 | ||
26 | ### `Send` | |
27 | ||
28 | The first trait we're going to talk about is | |
b039eaaf SL |
29 | [`Send`](../std/marker/trait.Send.html). When a type `T` implements `Send`, it |
30 | indicates that something of this type is able to have ownership transferred | |
85aaf69f SL |
31 | safely between threads. |
32 | ||
33 | This is important to enforce certain restrictions. For example, if we have a | |
34 | channel connecting two threads, we would want to be able to send some data | |
35 | down the channel and to the other thread. Therefore, we'd ensure that `Send` was | |
36 | implemented for that type. | |
37 | ||
b039eaaf | 38 | In the opposite way, if we were wrapping a library with [FFI][ffi] that isn't |
85aaf69f SL |
39 | threadsafe, we wouldn't want to implement `Send`, and so the compiler will help |
40 | us enforce that it can't leave the current thread. | |
41 | ||
b039eaaf SL |
42 | [ffi]: ffi.html |
43 | ||
85aaf69f SL |
44 | ### `Sync` |
45 | ||
c34b1796 | 46 | The second of these traits is called [`Sync`](../std/marker/trait.Sync.html). |
b039eaaf | 47 | When a type `T` implements `Sync`, it indicates that something |
85aaf69f | 48 | of this type has no possibility of introducing memory unsafety when used from |
b039eaaf SL |
49 | multiple threads concurrently through shared references. This implies that |
50 | types which don't have [interior mutability](mutability.html) are inherently | |
51 | `Sync`, which includes simple primitive types (like `u8`) and aggregate types | |
52 | containing them. | |
53 | ||
54 | For sharing references across threads, Rust provides a wrapper type called | |
55 | `Arc<T>`. `Arc<T>` implements `Send` and `Sync` if and only if `T` implements | |
56 | both `Send` and `Sync`. For example, an object of type `Arc<RefCell<U>>` cannot | |
57 | be transferred across threads because | |
58 | [`RefCell`](choosing-your-guarantees.html#refcellt) does not implement | |
59 | `Sync`, consequently `Arc<RefCell<U>>` would not implement `Send`. | |
85aaf69f SL |
60 | |
61 | These two traits allow you to use the type system to make strong guarantees | |
62 | about the properties of your code under concurrency. Before we demonstrate | |
63 | why, we need to learn how to create a concurrent Rust program in the first | |
64 | place! | |
65 | ||
66 | ## Threads | |
67 | ||
9346a6ac | 68 | Rust's standard library provides a library for threads, which allow you to |
85aaf69f SL |
69 | run Rust code in parallel. Here's a basic example of using `std::thread`: |
70 | ||
62682a34 | 71 | ```rust |
85aaf69f SL |
72 | use std::thread; |
73 | ||
74 | fn main() { | |
9346a6ac | 75 | thread::spawn(|| { |
85aaf69f SL |
76 | println!("Hello from a thread!"); |
77 | }); | |
78 | } | |
79 | ``` | |
80 | ||
b039eaaf | 81 | The `thread::spawn()` method accepts a [closure](closures.html), which is executed in a |
9346a6ac AL |
82 | new thread. It returns a handle to the thread, that can be used to |
83 | wait for the child thread to finish and extract its result: | |
85aaf69f | 84 | |
62682a34 | 85 | ```rust |
85aaf69f SL |
86 | use std::thread; |
87 | ||
88 | fn main() { | |
9346a6ac AL |
89 | let handle = thread::spawn(|| { |
90 | "Hello from a thread!" | |
85aaf69f SL |
91 | }); |
92 | ||
9346a6ac | 93 | println!("{}", handle.join().unwrap()); |
85aaf69f SL |
94 | } |
95 | ``` | |
96 | ||
54a0048b SL |
97 | As closures can capture variables from their environment, we can also try to |
98 | bring some data into the other thread: | |
99 | ||
100 | ```rust,ignore | |
101 | use std::thread; | |
102 | ||
103 | fn main() { | |
104 | let x = 1; | |
105 | thread::spawn(|| { | |
106 | println!("x is {}", x); | |
107 | }); | |
108 | } | |
109 | ``` | |
110 | ||
111 | However, this gives us an error: | |
112 | ||
113 | ```text | |
114 | 5:19: 7:6 error: closure may outlive the current function, but it | |
115 | borrows `x`, which is owned by the current function | |
116 | ... | |
117 | 5:19: 7:6 help: to force the closure to take ownership of `x` (and any other referenced variables), | |
118 | use the `move` keyword, as shown: | |
119 | thread::spawn(move || { | |
120 | println!("x is {}", x); | |
121 | }); | |
122 | ``` | |
123 | ||
124 | This is because by default closures capture variables by reference, and thus the | |
125 | closure only captures a _reference to `x`_. This is a problem, because the | |
126 | thread may outlive the scope of `x`, leading to a dangling pointer. | |
127 | ||
128 | To fix this, we use a `move` closure as mentioned in the error message. `move` | |
129 | closures are explained in depth [here](closures.html#move-closures); basically | |
130 | they move variables from their environment into themselves. | |
131 | ||
132 | ```rust | |
133 | use std::thread; | |
134 | ||
135 | fn main() { | |
136 | let x = 1; | |
137 | thread::spawn(move || { | |
138 | println!("x is {}", x); | |
139 | }); | |
140 | } | |
141 | ``` | |
142 | ||
85aaf69f SL |
143 | Many languages have the ability to execute threads, but it's wildly unsafe. |
144 | There are entire books about how to prevent errors that occur from shared | |
145 | mutable state. Rust helps out with its type system here as well, by preventing | |
146 | data races at compile time. Let's talk about how you actually share things | |
147 | between threads. | |
148 | ||
149 | ## Safe Shared Mutable State | |
150 | ||
151 | Due to Rust's type system, we have a concept that sounds like a lie: "safe | |
152 | shared mutable state." Many programmers agree that shared mutable state is | |
153 | very, very bad. | |
154 | ||
155 | Someone once said this: | |
156 | ||
157 | > Shared mutable state is the root of all evil. Most languages attempt to deal | |
158 | > with this problem through the 'mutable' part, but Rust deals with it by | |
159 | > solving the 'shared' part. | |
160 | ||
161 | The same [ownership system](ownership.html) that helps prevent using pointers | |
162 | incorrectly also helps rule out data races, one of the worst kinds of | |
163 | concurrency bugs. | |
164 | ||
54a0048b | 165 | As an example, here is a Rust program that could have a data race in many |
85aaf69f SL |
166 | languages. It will not compile: |
167 | ||
168 | ```ignore | |
169 | use std::thread; | |
92a42be0 | 170 | use std::time::Duration; |
85aaf69f SL |
171 | |
172 | fn main() { | |
c1a9b12d | 173 | let mut data = vec![1, 2, 3]; |
85aaf69f | 174 | |
bd371182 | 175 | for i in 0..3 { |
85aaf69f SL |
176 | thread::spawn(move || { |
177 | data[i] += 1; | |
178 | }); | |
179 | } | |
180 | ||
92a42be0 | 181 | thread::sleep(Duration::from_millis(50)); |
85aaf69f SL |
182 | } |
183 | ``` | |
184 | ||
185 | This gives us an error: | |
186 | ||
187 | ```text | |
9346a6ac | 188 | 8:17 error: capture of moved value: `data` |
85aaf69f SL |
189 | data[i] += 1; |
190 | ^~~~ | |
191 | ``` | |
192 | ||
e9174d1e | 193 | Rust knows this wouldn't be safe! If we had a reference to `data` in each |
54a0048b SL |
194 | thread, and the thread takes ownership of the reference, we'd have three owners! |
195 | `data` gets moved out of `main` in the first call to `spawn()`, so subsequent | |
196 | calls in the loop cannot use this variable. | |
e9174d1e | 197 | |
54a0048b SL |
198 | Note that this specific example will not cause a data race since different array |
199 | indices are being accessed. But this can't be determined at compile time, and in | |
200 | a similar situation where `i` is a constant or is random, you would have a data | |
201 | race. | |
e9174d1e | 202 | |
54a0048b SL |
203 | So, we need some type that lets us have more than one owning reference to a |
204 | value. Usually, we'd use `Rc<T>` for this, which is a reference counted type | |
205 | that provides shared ownership. It has some runtime bookkeeping that keeps track | |
206 | of the number of references to it, hence the "reference count" part of its name. | |
e9174d1e | 207 | |
54a0048b SL |
208 | Calling `clone()` on an `Rc<T>` will return a new owned reference and bump the |
209 | internal reference count. We create one of these for each thread: | |
210 | ||
211 | ||
212 | ```ignore | |
213 | use std::thread; | |
214 | use std::time::Duration; | |
215 | use std::rc::Rc; | |
216 | ||
217 | fn main() { | |
218 | let mut data = Rc::new(vec![1, 2, 3]); | |
219 | ||
220 | for i in 0..3 { | |
221 | // create a new owned reference | |
222 | let data_ref = data.clone(); | |
223 | ||
224 | // use it in a thread | |
225 | thread::spawn(move || { | |
226 | data_ref[i] += 1; | |
227 | }); | |
228 | } | |
229 | ||
230 | thread::sleep(Duration::from_millis(50)); | |
231 | } | |
232 | ``` | |
233 | ||
234 | This won't work, however, and will give us the error: | |
235 | ||
236 | ```text | |
237 | 13:9: 13:22 error: the trait bound `alloc::rc::Rc<collections::vec::Vec<i32>> : core::marker::Send` | |
238 | is not satisfied | |
239 | ... | |
240 | 13:9: 13:22 note: `alloc::rc::Rc<collections::vec::Vec<i32>>` | |
241 | cannot be sent between threads safely | |
242 | ``` | |
243 | ||
244 | As the error message mentions, `Rc` cannot be sent between threads safely. This | |
245 | is because the internal reference count is not maintained in a thread safe | |
246 | matter and can have a data race. | |
247 | ||
248 | To solve this, we'll use `Arc<T>`, Rust's standard atomic reference count type. | |
e9174d1e SL |
249 | |
250 | The Atomic part means `Arc<T>` can safely be accessed from multiple threads. | |
251 | To do this the compiler guarantees that mutations of the internal count use | |
252 | indivisible operations which can't have data races. | |
85aaf69f | 253 | |
54a0048b SL |
254 | In essence, `Arc<T>` is a type that lets us share ownership of data _across |
255 | threads_. | |
256 | ||
85aaf69f SL |
257 | |
258 | ```ignore | |
259 | use std::thread; | |
e9174d1e | 260 | use std::sync::Arc; |
92a42be0 | 261 | use std::time::Duration; |
85aaf69f SL |
262 | |
263 | fn main() { | |
e9174d1e | 264 | let mut data = Arc::new(vec![1, 2, 3]); |
85aaf69f | 265 | |
bd371182 | 266 | for i in 0..3 { |
e9174d1e | 267 | let data = data.clone(); |
85aaf69f SL |
268 | thread::spawn(move || { |
269 | data[i] += 1; | |
270 | }); | |
271 | } | |
272 | ||
92a42be0 | 273 | thread::sleep(Duration::from_millis(50)); |
85aaf69f SL |
274 | } |
275 | ``` | |
276 | ||
54a0048b | 277 | Similarly to last time, we use `clone()` to create a new owned handle. |
e9174d1e SL |
278 | This handle is then moved into the new thread. |
279 | ||
280 | And... still gives us an error. | |
85aaf69f SL |
281 | |
282 | ```text | |
e9174d1e SL |
283 | <anon>:11:24 error: cannot borrow immutable borrowed content as mutable |
284 | <anon>:11 data[i] += 1; | |
285 | ^~~~ | |
85aaf69f SL |
286 | ``` |
287 | ||
54a0048b SL |
288 | `Arc<T>` by default has immutable contents. It allows the _sharing_ of data |
289 | between threads, but shared mutable data is unsafe and when threads are | |
290 | involved can cause data races! | |
291 | ||
85aaf69f | 292 | |
54a0048b SL |
293 | Usually when we wish to make something in an immutable position mutable, we use |
294 | `Cell<T>` or `RefCell<T>` which allow safe mutation via runtime checks or | |
295 | otherwise (see also: [Choosing Your Guarantees](choosing-your-guarantees.html)). | |
296 | However, similar to `Rc`, these are not thread safe. If we try using these, we | |
297 | will get an error about these types not being `Sync`, and the code will fail to | |
298 | compile. | |
299 | ||
300 | It looks like we need some type that allows us to safely mutate a shared value | |
301 | across threads, for example a type that can ensure only one thread at a time is | |
302 | able to mutate the value inside it at any one time. | |
85aaf69f | 303 | |
e9174d1e | 304 | For that, we can use the `Mutex<T>` type! |
85aaf69f | 305 | |
e9174d1e | 306 | Here's the working version: |
85aaf69f | 307 | |
62682a34 | 308 | ```rust |
85aaf69f SL |
309 | use std::sync::{Arc, Mutex}; |
310 | use std::thread; | |
92a42be0 | 311 | use std::time::Duration; |
85aaf69f SL |
312 | |
313 | fn main() { | |
c1a9b12d | 314 | let data = Arc::new(Mutex::new(vec![1, 2, 3])); |
85aaf69f | 315 | |
bd371182 | 316 | for i in 0..3 { |
85aaf69f SL |
317 | let data = data.clone(); |
318 | thread::spawn(move || { | |
319 | let mut data = data.lock().unwrap(); | |
320 | data[i] += 1; | |
321 | }); | |
322 | } | |
323 | ||
92a42be0 | 324 | thread::sleep(Duration::from_millis(50)); |
85aaf69f SL |
325 | } |
326 | ``` | |
327 | ||
b039eaaf SL |
328 | Note that the value of `i` is bound (copied) to the closure and not shared |
329 | among the threads. | |
e9174d1e | 330 | |
54a0048b SL |
331 | We're "locking" the mutex here. A mutex (short for "mutual exclusion"), as |
332 | mentioned, only allows one thread at a time to access a value. When we wish to | |
333 | access the value, we use `lock()` on it. This will "lock" the mutex, and no | |
334 | other thread will be able to lock it (and hence, do anything with the value) | |
335 | until we're done with it. If a thread attempts to lock a mutex which is already | |
336 | locked, it will wait until the other thread releases the lock. | |
337 | ||
338 | The lock "release" here is implicit; when the result of the lock (in this case, | |
339 | `data`) goes out of scope, the lock is automatically released. | |
340 | ||
341 | Note that [`lock`](../std/sync/struct.Mutex.html#method.lock) method of | |
b039eaaf | 342 | [`Mutex`](../std/sync/struct.Mutex.html) has this signature: |
e9174d1e SL |
343 | |
344 | ```ignore | |
345 | fn lock(&self) -> LockResult<MutexGuard<T>> | |
346 | ``` | |
347 | ||
b039eaaf SL |
348 | and because `Send` is not implemented for `MutexGuard<T>`, the guard cannot |
349 | cross thread boundaries, ensuring thread-locality of lock acquire and release. | |
e9174d1e SL |
350 | |
351 | Let's examine the body of the thread more closely: | |
85aaf69f | 352 | |
9346a6ac | 353 | ```rust |
85aaf69f SL |
354 | # use std::sync::{Arc, Mutex}; |
355 | # use std::thread; | |
92a42be0 | 356 | # use std::time::Duration; |
85aaf69f | 357 | # fn main() { |
c1a9b12d | 358 | # let data = Arc::new(Mutex::new(vec![1, 2, 3])); |
bd371182 | 359 | # for i in 0..3 { |
85aaf69f SL |
360 | # let data = data.clone(); |
361 | thread::spawn(move || { | |
362 | let mut data = data.lock().unwrap(); | |
363 | data[i] += 1; | |
364 | }); | |
365 | # } | |
92a42be0 | 366 | # thread::sleep(Duration::from_millis(50)); |
85aaf69f SL |
367 | # } |
368 | ``` | |
369 | ||
370 | First, we call `lock()`, which acquires the mutex's lock. Because this may fail, | |
7453a54e | 371 | it returns a `Result<T, E>`, and because this is just an example, we `unwrap()` |
85aaf69f SL |
372 | it to get a reference to the data. Real code would have more robust error handling |
373 | here. We're then free to mutate it, since we have the lock. | |
374 | ||
c34b1796 AL |
375 | Lastly, while the threads are running, we wait on a short timer. But |
376 | this is not ideal: we may have picked a reasonable amount of time to | |
377 | wait but it's more likely we'll either be waiting longer than | |
378 | necessary or not long enough, depending on just how much time the | |
379 | threads actually take to finish computing when the program runs. | |
85aaf69f | 380 | |
c34b1796 AL |
381 | A more precise alternative to the timer would be to use one of the |
382 | mechanisms provided by the Rust standard library for synchronizing | |
383 | threads with each other. Let's talk about one of them: channels. | |
85aaf69f SL |
384 | |
385 | ## Channels | |
386 | ||
387 | Here's a version of our code that uses channels for synchronization, rather | |
388 | than waiting for a specific time: | |
389 | ||
62682a34 | 390 | ```rust |
85aaf69f SL |
391 | use std::sync::{Arc, Mutex}; |
392 | use std::thread; | |
393 | use std::sync::mpsc; | |
394 | ||
395 | fn main() { | |
c1a9b12d | 396 | let data = Arc::new(Mutex::new(0)); |
85aaf69f | 397 | |
7453a54e SL |
398 | // `tx` is the "transmitter" or "sender" |
399 | // `rx` is the "receiver" | |
85aaf69f SL |
400 | let (tx, rx) = mpsc::channel(); |
401 | ||
402 | for _ in 0..10 { | |
403 | let (data, tx) = (data.clone(), tx.clone()); | |
404 | ||
405 | thread::spawn(move || { | |
406 | let mut data = data.lock().unwrap(); | |
407 | *data += 1; | |
408 | ||
92a42be0 | 409 | tx.send(()).unwrap(); |
85aaf69f SL |
410 | }); |
411 | } | |
412 | ||
413 | for _ in 0..10 { | |
92a42be0 | 414 | rx.recv().unwrap(); |
85aaf69f SL |
415 | } |
416 | } | |
417 | ``` | |
418 | ||
9cc50fc6 | 419 | We use the `mpsc::channel()` method to construct a new channel. We `send` |
85aaf69f SL |
420 | a simple `()` down the channel, and then wait for ten of them to come back. |
421 | ||
9cc50fc6 | 422 | While this channel is sending a generic signal, we can send any data that |
85aaf69f SL |
423 | is `Send` over the channel! |
424 | ||
62682a34 | 425 | ```rust |
85aaf69f SL |
426 | use std::thread; |
427 | use std::sync::mpsc; | |
428 | ||
429 | fn main() { | |
430 | let (tx, rx) = mpsc::channel(); | |
431 | ||
b039eaaf | 432 | for i in 0..10 { |
85aaf69f SL |
433 | let tx = tx.clone(); |
434 | ||
435 | thread::spawn(move || { | |
b039eaaf | 436 | let answer = i * i; |
85aaf69f | 437 | |
92a42be0 | 438 | tx.send(answer).unwrap(); |
85aaf69f SL |
439 | }); |
440 | } | |
441 | ||
b039eaaf SL |
442 | for _ in 0..10 { |
443 | println!("{}", rx.recv().unwrap()); | |
444 | } | |
85aaf69f SL |
445 | } |
446 | ``` | |
447 | ||
b039eaaf SL |
448 | Here we create 10 threads, asking each to calculate the square of a number (`i` |
449 | at the time of `spawn()`), and then `send()` back the answer over the channel. | |
85aaf69f SL |
450 | |
451 | ||
452 | ## Panics | |
453 | ||
454 | A `panic!` will crash the currently executing thread. You can use Rust's | |
455 | threads as a simple isolation mechanism: | |
456 | ||
62682a34 | 457 | ```rust |
85aaf69f SL |
458 | use std::thread; |
459 | ||
e9174d1e | 460 | let handle = thread::spawn(move || { |
85aaf69f | 461 | panic!("oops!"); |
e9174d1e SL |
462 | }); |
463 | ||
464 | let result = handle.join(); | |
85aaf69f SL |
465 | |
466 | assert!(result.is_err()); | |
467 | ``` | |
468 | ||
e9174d1e | 469 | `Thread.join()` gives us a `Result` back, which allows us to check if the thread |
85aaf69f | 470 | has panicked or not. |