]>
Commit | Line | Data |
---|---|---|
85aaf69f SL |
1 | % Concurrency |
2 | ||
3 | Concurrency and parallelism are incredibly important topics in computer | |
4 | science, and are also a hot topic in industry today. Computers are gaining more | |
5 | and more cores, yet many programmers aren't prepared to fully utilize them. | |
6 | ||
c30ab7b3 | 7 | Rust's memory safety features also apply to its concurrency story. Even |
85aaf69f | 8 | concurrent Rust programs must be memory safe, having no data races. Rust's type |
d9579d0f | 9 | system is up to the task, and gives you powerful ways to reason about |
85aaf69f SL |
10 | concurrent code at compile time. |
11 | ||
12 | Before we talk about the concurrency features that come with Rust, it's important | |
c1a9b12d SL |
13 | to understand something: Rust is low-level enough that the vast majority of |
14 | this is provided by the standard library, not by the language. This means that | |
15 | if you don't like some aspect of the way Rust handles concurrency, you can | |
16 | implement an alternative way of doing things. | |
17 | [mio](https://github.com/carllerche/mio) is a real-world example of this | |
18 | principle in action. | |
85aaf69f SL |
19 | |
20 | ## Background: `Send` and `Sync` | |
21 | ||
22 | Concurrency is difficult to reason about. In Rust, we have a strong, static | |
23 | type system to help us reason about our code. As such, Rust gives us two traits | |
24 | to help us make sense of code that can possibly be concurrent. | |
25 | ||
26 | ### `Send` | |
27 | ||
28 | The first trait we're going to talk about is | |
b039eaaf SL |
29 | [`Send`](../std/marker/trait.Send.html). When a type `T` implements `Send`, it |
30 | indicates that something of this type is able to have ownership transferred | |
85aaf69f SL |
31 | safely between threads. |
32 | ||
33 | This is important to enforce certain restrictions. For example, if we have a | |
34 | channel connecting two threads, we would want to be able to send some data | |
35 | down the channel and to the other thread. Therefore, we'd ensure that `Send` was | |
36 | implemented for that type. | |
37 | ||
b039eaaf | 38 | In the opposite way, if we were wrapping a library with [FFI][ffi] that isn't |
85aaf69f SL |
39 | threadsafe, we wouldn't want to implement `Send`, and so the compiler will help |
40 | us enforce that it can't leave the current thread. | |
41 | ||
b039eaaf SL |
42 | [ffi]: ffi.html |
43 | ||
85aaf69f SL |
44 | ### `Sync` |
45 | ||
c34b1796 | 46 | The second of these traits is called [`Sync`](../std/marker/trait.Sync.html). |
b039eaaf | 47 | When a type `T` implements `Sync`, it indicates that something |
85aaf69f | 48 | of this type has no possibility of introducing memory unsafety when used from |
b039eaaf SL |
49 | multiple threads concurrently through shared references. This implies that |
50 | types which don't have [interior mutability](mutability.html) are inherently | |
51 | `Sync`, which includes simple primitive types (like `u8`) and aggregate types | |
52 | containing them. | |
53 | ||
54 | For sharing references across threads, Rust provides a wrapper type called | |
55 | `Arc<T>`. `Arc<T>` implements `Send` and `Sync` if and only if `T` implements | |
56 | both `Send` and `Sync`. For example, an object of type `Arc<RefCell<U>>` cannot | |
57 | be transferred across threads because | |
58 | [`RefCell`](choosing-your-guarantees.html#refcellt) does not implement | |
59 | `Sync`, consequently `Arc<RefCell<U>>` would not implement `Send`. | |
85aaf69f SL |
60 | |
61 | These two traits allow you to use the type system to make strong guarantees | |
62 | about the properties of your code under concurrency. Before we demonstrate | |
63 | why, we need to learn how to create a concurrent Rust program in the first | |
64 | place! | |
65 | ||
66 | ## Threads | |
67 | ||
9346a6ac | 68 | Rust's standard library provides a library for threads, which allow you to |
85aaf69f SL |
69 | run Rust code in parallel. Here's a basic example of using `std::thread`: |
70 | ||
62682a34 | 71 | ```rust |
85aaf69f SL |
72 | use std::thread; |
73 | ||
74 | fn main() { | |
9346a6ac | 75 | thread::spawn(|| { |
85aaf69f SL |
76 | println!("Hello from a thread!"); |
77 | }); | |
78 | } | |
79 | ``` | |
80 | ||
b039eaaf | 81 | The `thread::spawn()` method accepts a [closure](closures.html), which is executed in a |
9346a6ac AL |
82 | new thread. It returns a handle to the thread, that can be used to |
83 | wait for the child thread to finish and extract its result: | |
85aaf69f | 84 | |
62682a34 | 85 | ```rust |
85aaf69f SL |
86 | use std::thread; |
87 | ||
88 | fn main() { | |
9346a6ac AL |
89 | let handle = thread::spawn(|| { |
90 | "Hello from a thread!" | |
85aaf69f SL |
91 | }); |
92 | ||
9346a6ac | 93 | println!("{}", handle.join().unwrap()); |
85aaf69f SL |
94 | } |
95 | ``` | |
96 | ||
54a0048b SL |
97 | As closures can capture variables from their environment, we can also try to |
98 | bring some data into the other thread: | |
99 | ||
100 | ```rust,ignore | |
101 | use std::thread; | |
102 | ||
103 | fn main() { | |
104 | let x = 1; | |
105 | thread::spawn(|| { | |
106 | println!("x is {}", x); | |
107 | }); | |
108 | } | |
109 | ``` | |
110 | ||
111 | However, this gives us an error: | |
112 | ||
113 | ```text | |
114 | 5:19: 7:6 error: closure may outlive the current function, but it | |
115 | borrows `x`, which is owned by the current function | |
116 | ... | |
117 | 5:19: 7:6 help: to force the closure to take ownership of `x` (and any other referenced variables), | |
118 | use the `move` keyword, as shown: | |
119 | thread::spawn(move || { | |
120 | println!("x is {}", x); | |
121 | }); | |
122 | ``` | |
123 | ||
124 | This is because by default closures capture variables by reference, and thus the | |
125 | closure only captures a _reference to `x`_. This is a problem, because the | |
126 | thread may outlive the scope of `x`, leading to a dangling pointer. | |
127 | ||
128 | To fix this, we use a `move` closure as mentioned in the error message. `move` | |
129 | closures are explained in depth [here](closures.html#move-closures); basically | |
130 | they move variables from their environment into themselves. | |
131 | ||
132 | ```rust | |
133 | use std::thread; | |
134 | ||
135 | fn main() { | |
136 | let x = 1; | |
137 | thread::spawn(move || { | |
138 | println!("x is {}", x); | |
139 | }); | |
140 | } | |
141 | ``` | |
142 | ||
85aaf69f SL |
143 | Many languages have the ability to execute threads, but it's wildly unsafe. |
144 | There are entire books about how to prevent errors that occur from shared | |
145 | mutable state. Rust helps out with its type system here as well, by preventing | |
146 | data races at compile time. Let's talk about how you actually share things | |
147 | between threads. | |
148 | ||
149 | ## Safe Shared Mutable State | |
150 | ||
151 | Due to Rust's type system, we have a concept that sounds like a lie: "safe | |
152 | shared mutable state." Many programmers agree that shared mutable state is | |
153 | very, very bad. | |
154 | ||
155 | Someone once said this: | |
156 | ||
157 | > Shared mutable state is the root of all evil. Most languages attempt to deal | |
158 | > with this problem through the 'mutable' part, but Rust deals with it by | |
159 | > solving the 'shared' part. | |
160 | ||
161 | The same [ownership system](ownership.html) that helps prevent using pointers | |
162 | incorrectly also helps rule out data races, one of the worst kinds of | |
163 | concurrency bugs. | |
164 | ||
a7813a04 | 165 | As an example, here is a Rust program that would have a data race in many |
85aaf69f SL |
166 | languages. It will not compile: |
167 | ||
a7813a04 | 168 | ```rust,ignore |
85aaf69f | 169 | use std::thread; |
92a42be0 | 170 | use std::time::Duration; |
85aaf69f SL |
171 | |
172 | fn main() { | |
c1a9b12d | 173 | let mut data = vec![1, 2, 3]; |
85aaf69f | 174 | |
bd371182 | 175 | for i in 0..3 { |
85aaf69f | 176 | thread::spawn(move || { |
a7813a04 | 177 | data[0] += i; |
85aaf69f SL |
178 | }); |
179 | } | |
180 | ||
92a42be0 | 181 | thread::sleep(Duration::from_millis(50)); |
85aaf69f SL |
182 | } |
183 | ``` | |
184 | ||
185 | This gives us an error: | |
186 | ||
187 | ```text | |
9346a6ac | 188 | 8:17 error: capture of moved value: `data` |
a7813a04 | 189 | data[0] += i; |
85aaf69f SL |
190 | ^~~~ |
191 | ``` | |
192 | ||
e9174d1e | 193 | Rust knows this wouldn't be safe! If we had a reference to `data` in each |
54a0048b SL |
194 | thread, and the thread takes ownership of the reference, we'd have three owners! |
195 | `data` gets moved out of `main` in the first call to `spawn()`, so subsequent | |
196 | calls in the loop cannot use this variable. | |
e9174d1e | 197 | |
54a0048b SL |
198 | So, we need some type that lets us have more than one owning reference to a |
199 | value. Usually, we'd use `Rc<T>` for this, which is a reference counted type | |
200 | that provides shared ownership. It has some runtime bookkeeping that keeps track | |
201 | of the number of references to it, hence the "reference count" part of its name. | |
e9174d1e | 202 | |
54a0048b SL |
203 | Calling `clone()` on an `Rc<T>` will return a new owned reference and bump the |
204 | internal reference count. We create one of these for each thread: | |
205 | ||
206 | ||
a7813a04 | 207 | ```rust,ignore |
54a0048b SL |
208 | use std::thread; |
209 | use std::time::Duration; | |
210 | use std::rc::Rc; | |
211 | ||
212 | fn main() { | |
213 | let mut data = Rc::new(vec![1, 2, 3]); | |
214 | ||
215 | for i in 0..3 { | |
216 | // create a new owned reference | |
217 | let data_ref = data.clone(); | |
218 | ||
219 | // use it in a thread | |
220 | thread::spawn(move || { | |
a7813a04 | 221 | data_ref[0] += i; |
54a0048b SL |
222 | }); |
223 | } | |
224 | ||
225 | thread::sleep(Duration::from_millis(50)); | |
226 | } | |
227 | ``` | |
228 | ||
229 | This won't work, however, and will give us the error: | |
230 | ||
231 | ```text | |
232 | 13:9: 13:22 error: the trait bound `alloc::rc::Rc<collections::vec::Vec<i32>> : core::marker::Send` | |
233 | is not satisfied | |
234 | ... | |
235 | 13:9: 13:22 note: `alloc::rc::Rc<collections::vec::Vec<i32>>` | |
236 | cannot be sent between threads safely | |
237 | ``` | |
238 | ||
239 | As the error message mentions, `Rc` cannot be sent between threads safely. This | |
240 | is because the internal reference count is not maintained in a thread safe | |
241 | matter and can have a data race. | |
242 | ||
243 | To solve this, we'll use `Arc<T>`, Rust's standard atomic reference count type. | |
e9174d1e SL |
244 | |
245 | The Atomic part means `Arc<T>` can safely be accessed from multiple threads. | |
246 | To do this the compiler guarantees that mutations of the internal count use | |
247 | indivisible operations which can't have data races. | |
85aaf69f | 248 | |
54a0048b SL |
249 | In essence, `Arc<T>` is a type that lets us share ownership of data _across |
250 | threads_. | |
251 | ||
85aaf69f | 252 | |
a7813a04 | 253 | ```rust,ignore |
85aaf69f | 254 | use std::thread; |
e9174d1e | 255 | use std::sync::Arc; |
92a42be0 | 256 | use std::time::Duration; |
85aaf69f SL |
257 | |
258 | fn main() { | |
e9174d1e | 259 | let mut data = Arc::new(vec![1, 2, 3]); |
85aaf69f | 260 | |
bd371182 | 261 | for i in 0..3 { |
e9174d1e | 262 | let data = data.clone(); |
85aaf69f | 263 | thread::spawn(move || { |
a7813a04 | 264 | data[0] += i; |
85aaf69f SL |
265 | }); |
266 | } | |
267 | ||
92a42be0 | 268 | thread::sleep(Duration::from_millis(50)); |
85aaf69f SL |
269 | } |
270 | ``` | |
271 | ||
54a0048b | 272 | Similarly to last time, we use `clone()` to create a new owned handle. |
e9174d1e SL |
273 | This handle is then moved into the new thread. |
274 | ||
275 | And... still gives us an error. | |
85aaf69f SL |
276 | |
277 | ```text | |
e9174d1e | 278 | <anon>:11:24 error: cannot borrow immutable borrowed content as mutable |
a7813a04 | 279 | <anon>:11 data[0] += i; |
e9174d1e | 280 | ^~~~ |
85aaf69f SL |
281 | ``` |
282 | ||
54a0048b | 283 | `Arc<T>` by default has immutable contents. It allows the _sharing_ of data |
c30ab7b3 SL |
284 | between threads, but shared mutable data is unsafe—and when threads are |
285 | involved—can cause data races! | |
54a0048b | 286 | |
85aaf69f | 287 | |
54a0048b SL |
288 | Usually when we wish to make something in an immutable position mutable, we use |
289 | `Cell<T>` or `RefCell<T>` which allow safe mutation via runtime checks or | |
290 | otherwise (see also: [Choosing Your Guarantees](choosing-your-guarantees.html)). | |
291 | However, similar to `Rc`, these are not thread safe. If we try using these, we | |
292 | will get an error about these types not being `Sync`, and the code will fail to | |
293 | compile. | |
294 | ||
295 | It looks like we need some type that allows us to safely mutate a shared value | |
296 | across threads, for example a type that can ensure only one thread at a time is | |
297 | able to mutate the value inside it at any one time. | |
85aaf69f | 298 | |
e9174d1e | 299 | For that, we can use the `Mutex<T>` type! |
85aaf69f | 300 | |
e9174d1e | 301 | Here's the working version: |
85aaf69f | 302 | |
62682a34 | 303 | ```rust |
85aaf69f SL |
304 | use std::sync::{Arc, Mutex}; |
305 | use std::thread; | |
92a42be0 | 306 | use std::time::Duration; |
85aaf69f SL |
307 | |
308 | fn main() { | |
c1a9b12d | 309 | let data = Arc::new(Mutex::new(vec![1, 2, 3])); |
85aaf69f | 310 | |
bd371182 | 311 | for i in 0..3 { |
85aaf69f SL |
312 | let data = data.clone(); |
313 | thread::spawn(move || { | |
314 | let mut data = data.lock().unwrap(); | |
a7813a04 | 315 | data[0] += i; |
85aaf69f SL |
316 | }); |
317 | } | |
318 | ||
92a42be0 | 319 | thread::sleep(Duration::from_millis(50)); |
85aaf69f SL |
320 | } |
321 | ``` | |
322 | ||
b039eaaf SL |
323 | Note that the value of `i` is bound (copied) to the closure and not shared |
324 | among the threads. | |
e9174d1e | 325 | |
54a0048b SL |
326 | We're "locking" the mutex here. A mutex (short for "mutual exclusion"), as |
327 | mentioned, only allows one thread at a time to access a value. When we wish to | |
328 | access the value, we use `lock()` on it. This will "lock" the mutex, and no | |
329 | other thread will be able to lock it (and hence, do anything with the value) | |
330 | until we're done with it. If a thread attempts to lock a mutex which is already | |
331 | locked, it will wait until the other thread releases the lock. | |
332 | ||
333 | The lock "release" here is implicit; when the result of the lock (in this case, | |
334 | `data`) goes out of scope, the lock is automatically released. | |
335 | ||
336 | Note that [`lock`](../std/sync/struct.Mutex.html#method.lock) method of | |
b039eaaf | 337 | [`Mutex`](../std/sync/struct.Mutex.html) has this signature: |
e9174d1e | 338 | |
a7813a04 | 339 | ```rust,ignore |
e9174d1e SL |
340 | fn lock(&self) -> LockResult<MutexGuard<T>> |
341 | ``` | |
342 | ||
b039eaaf SL |
343 | and because `Send` is not implemented for `MutexGuard<T>`, the guard cannot |
344 | cross thread boundaries, ensuring thread-locality of lock acquire and release. | |
e9174d1e SL |
345 | |
346 | Let's examine the body of the thread more closely: | |
85aaf69f | 347 | |
9346a6ac | 348 | ```rust |
85aaf69f SL |
349 | # use std::sync::{Arc, Mutex}; |
350 | # use std::thread; | |
92a42be0 | 351 | # use std::time::Duration; |
85aaf69f | 352 | # fn main() { |
c1a9b12d | 353 | # let data = Arc::new(Mutex::new(vec![1, 2, 3])); |
bd371182 | 354 | # for i in 0..3 { |
85aaf69f SL |
355 | # let data = data.clone(); |
356 | thread::spawn(move || { | |
357 | let mut data = data.lock().unwrap(); | |
a7813a04 | 358 | data[0] += i; |
85aaf69f SL |
359 | }); |
360 | # } | |
92a42be0 | 361 | # thread::sleep(Duration::from_millis(50)); |
85aaf69f SL |
362 | # } |
363 | ``` | |
364 | ||
365 | First, we call `lock()`, which acquires the mutex's lock. Because this may fail, | |
7453a54e | 366 | it returns a `Result<T, E>`, and because this is just an example, we `unwrap()` |
85aaf69f SL |
367 | it to get a reference to the data. Real code would have more robust error handling |
368 | here. We're then free to mutate it, since we have the lock. | |
369 | ||
c34b1796 AL |
370 | Lastly, while the threads are running, we wait on a short timer. But |
371 | this is not ideal: we may have picked a reasonable amount of time to | |
372 | wait but it's more likely we'll either be waiting longer than | |
373 | necessary or not long enough, depending on just how much time the | |
374 | threads actually take to finish computing when the program runs. | |
85aaf69f | 375 | |
c34b1796 AL |
376 | A more precise alternative to the timer would be to use one of the |
377 | mechanisms provided by the Rust standard library for synchronizing | |
378 | threads with each other. Let's talk about one of them: channels. | |
85aaf69f SL |
379 | |
380 | ## Channels | |
381 | ||
382 | Here's a version of our code that uses channels for synchronization, rather | |
383 | than waiting for a specific time: | |
384 | ||
62682a34 | 385 | ```rust |
85aaf69f SL |
386 | use std::sync::{Arc, Mutex}; |
387 | use std::thread; | |
388 | use std::sync::mpsc; | |
389 | ||
390 | fn main() { | |
c1a9b12d | 391 | let data = Arc::new(Mutex::new(0)); |
85aaf69f | 392 | |
7453a54e SL |
393 | // `tx` is the "transmitter" or "sender" |
394 | // `rx` is the "receiver" | |
85aaf69f SL |
395 | let (tx, rx) = mpsc::channel(); |
396 | ||
397 | for _ in 0..10 { | |
398 | let (data, tx) = (data.clone(), tx.clone()); | |
399 | ||
400 | thread::spawn(move || { | |
401 | let mut data = data.lock().unwrap(); | |
402 | *data += 1; | |
403 | ||
92a42be0 | 404 | tx.send(()).unwrap(); |
85aaf69f SL |
405 | }); |
406 | } | |
407 | ||
408 | for _ in 0..10 { | |
92a42be0 | 409 | rx.recv().unwrap(); |
85aaf69f SL |
410 | } |
411 | } | |
412 | ``` | |
413 | ||
9cc50fc6 | 414 | We use the `mpsc::channel()` method to construct a new channel. We `send` |
85aaf69f SL |
415 | a simple `()` down the channel, and then wait for ten of them to come back. |
416 | ||
9cc50fc6 | 417 | While this channel is sending a generic signal, we can send any data that |
85aaf69f SL |
418 | is `Send` over the channel! |
419 | ||
62682a34 | 420 | ```rust |
85aaf69f SL |
421 | use std::thread; |
422 | use std::sync::mpsc; | |
423 | ||
424 | fn main() { | |
425 | let (tx, rx) = mpsc::channel(); | |
426 | ||
b039eaaf | 427 | for i in 0..10 { |
85aaf69f SL |
428 | let tx = tx.clone(); |
429 | ||
430 | thread::spawn(move || { | |
b039eaaf | 431 | let answer = i * i; |
85aaf69f | 432 | |
92a42be0 | 433 | tx.send(answer).unwrap(); |
85aaf69f SL |
434 | }); |
435 | } | |
436 | ||
b039eaaf SL |
437 | for _ in 0..10 { |
438 | println!("{}", rx.recv().unwrap()); | |
439 | } | |
85aaf69f SL |
440 | } |
441 | ``` | |
442 | ||
b039eaaf SL |
443 | Here we create 10 threads, asking each to calculate the square of a number (`i` |
444 | at the time of `spawn()`), and then `send()` back the answer over the channel. | |
85aaf69f SL |
445 | |
446 | ||
447 | ## Panics | |
448 | ||
449 | A `panic!` will crash the currently executing thread. You can use Rust's | |
450 | threads as a simple isolation mechanism: | |
451 | ||
62682a34 | 452 | ```rust |
85aaf69f SL |
453 | use std::thread; |
454 | ||
e9174d1e | 455 | let handle = thread::spawn(move || { |
85aaf69f | 456 | panic!("oops!"); |
e9174d1e SL |
457 | }); |
458 | ||
459 | let result = handle.join(); | |
85aaf69f SL |
460 | |
461 | assert!(result.is_err()); | |
462 | ``` | |
463 | ||
e9174d1e | 464 | `Thread.join()` gives us a `Result` back, which allows us to check if the thread |
85aaf69f | 465 | has panicked or not. |