New upstream version 1.15.0+dfsg1
% Concurrency

Concurrency and parallelism are incredibly important topics in computer
science, and are also a hot topic in industry today. Computers are gaining more
and more cores, yet many programmers aren't prepared to fully utilize them.

Rust's memory safety features also apply to its concurrency story. Even
concurrent Rust programs must be memory safe, having no data races. Rust's type
system is up to the task, and gives you powerful ways to reason about
concurrent code at compile time.

Before we talk about the concurrency features that come with Rust, it's important
to understand something: Rust is low-level enough that the vast majority of
this is provided by the standard library, not by the language. This means that
if you don't like some aspect of the way Rust handles concurrency, you can
implement an alternative way of doing things.
[mio](https://github.com/carllerche/mio) is a real-world example of this
principle in action.

## Background: `Send` and `Sync`

Concurrency is difficult to reason about. In Rust, we have a strong, static
type system to help us reason about our code. As such, Rust gives us two traits
to help us make sense of code that can possibly be concurrent.

### `Send`

The first trait we're going to talk about is
[`Send`](../std/marker/trait.Send.html). When a type `T` implements `Send`, it
indicates that something of this type is able to have ownership transferred
safely between threads.

This is important to enforce certain restrictions. For example, if we have a
channel connecting two threads, we would want to be able to send some data
down the channel and to the other thread. Therefore, we'd ensure that `Send` was
implemented for that type.

In the opposite way, if we were wrapping a library with [FFI][ffi] that isn't
threadsafe, we wouldn't want to implement `Send`, and so the compiler will help
us enforce that it can't leave the current thread.

[ffi]: ffi.html

### `Sync`

The second of these traits is called [`Sync`](../std/marker/trait.Sync.html).
When a type `T` implements `Sync`, it indicates that something
of this type has no possibility of introducing memory unsafety when used from
multiple threads concurrently through shared references. This implies that
types which don't have [interior mutability](mutability.html) are inherently
`Sync`, which includes simple primitive types (like `u8`) and aggregate types
containing them.

For sharing references across threads, Rust provides a wrapper type called
`Arc<T>`. `Arc<T>` implements `Send` and `Sync` if and only if `T` implements
both `Send` and `Sync`. For example, an object of type `Arc<RefCell<U>>` cannot
be transferred across threads because
[`RefCell`](choosing-your-guarantees.html#refcellt) does not implement
`Sync`, consequently `Arc<RefCell<U>>` would not implement `Send`.
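
One way to see these rules in action is to ask the compiler directly. The
sketch below is only an illustration: `is_send` and `is_sync` are hypothetical
helpers (not part of the standard library) that compile only when the type
parameter satisfies the bound.

```rust
use std::cell::RefCell;
use std::sync::{Arc, Mutex};

// Hypothetical helpers: a call compiles only if `T` satisfies the bound.
fn is_send<T: Send>() -> bool { true }
fn is_sync<T: Sync>() -> bool { true }

fn main() {
    // Primitive types and aggregates of them are both `Send` and `Sync`.
    assert!(is_send::<u8>());
    assert!(is_sync::<Vec<u8>>());

    // `Arc<T>` is `Send` and `Sync` because `Mutex<Vec<u8>>` is both.
    assert!(is_send::<Arc<Mutex<Vec<u8>>>>());
    assert!(is_sync::<Arc<Mutex<Vec<u8>>>>());

    // `RefCell<u8>` can be moved to another thread, but not shared.
    assert!(is_send::<RefCell<u8>>());

    // Uncommenting any of these lines is expected to fail to compile:
    // is_sync::<RefCell<u8>>();
    // is_send::<Arc<RefCell<u8>>>();
    // is_send::<std::rc::Rc<u8>>();
}
```

Because the check happens at compile time, a violated bound is reported as a
type error and never reaches a running program.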

These two traits allow you to use the type system to make strong guarantees
about the properties of your code under concurrency. Before we demonstrate
why, we need to learn how to create a concurrent Rust program in the first
place!

## Threads

Rust's standard library provides a module for threads, which allows you to
run Rust code in parallel. Here's a basic example of using `std::thread`:

```rust
use std::thread;

fn main() {
    thread::spawn(|| {
        println!("Hello from a thread!");
    });
}
```

The `thread::spawn()` function accepts a [closure](closures.html), which is executed in a
new thread. It returns a handle to the thread, which can be used to
wait for the child thread to finish and extract its result:

```rust
use std::thread;

fn main() {
    let handle = thread::spawn(|| {
        "Hello from a thread!"
    });

    println!("{}", handle.join().unwrap());
}
```

As closures can capture variables from their environment, we can also try to
bring some data into the other thread:

```rust,ignore
use std::thread;

fn main() {
    let x = 1;
    thread::spawn(|| {
        println!("x is {}", x);
    });
}
```

However, this gives us an error:

```text
5:19: 7:6 error: closure may outlive the current function, but it
          borrows `x`, which is owned by the current function
...
5:19: 7:6 help: to force the closure to take ownership of `x` (and any other referenced variables),
          use the `move` keyword, as shown:
      thread::spawn(move || {
          println!("x is {}", x);
      });
```

This is because by default closures capture variables by reference, and thus the
closure only captures a _reference to `x`_. This is a problem, because the
thread may outlive the scope of `x`, leading to a dangling pointer.

To fix this, we use a `move` closure as mentioned in the error message. `move`
closures are explained in depth [here](closures.html#move-closures); basically
they move variables from their environment into themselves.

```rust
use std::thread;

fn main() {
    let x = 1;
    thread::spawn(move || {
        println!("x is {}", x);
    });
}
```

Many languages can run threads, but doing so is wildly unsafe.
There are entire books about how to prevent errors that occur from shared
mutable state. Rust helps out with its type system here as well, by preventing
data races at compile time. Let's talk about how you actually share things
between threads.

## Safe Shared Mutable State

Due to Rust's type system, we have a concept that sounds like a lie: "safe
shared mutable state." Many programmers agree that shared mutable state is
very, very bad.

Someone once said this:

> Shared mutable state is the root of all evil. Most languages attempt to deal
> with this problem through the 'mutable' part, but Rust deals with it by
> solving the 'shared' part.

The same [ownership system](ownership.html) that helps prevent using pointers
incorrectly also helps rule out data races, one of the worst kinds of
concurrency bugs.

As an example, here is a Rust program that would have a data race in many
languages. It will not compile:

```rust,ignore
use std::thread;
use std::time::Duration;

fn main() {
    let mut data = vec![1, 2, 3];

    for i in 0..3 {
        thread::spawn(move || {
            data[0] += i;
        });
    }

    thread::sleep(Duration::from_millis(50));
}
```

This gives us an error:

```text
8:17 error: capture of moved value: `data`
        data[0] += i;
        ^~~~
```

Rust knows this wouldn't be safe! If we had a reference to `data` in each
thread, and the thread takes ownership of the reference, we'd have three owners!
`data` gets moved out of `main` in the first call to `spawn()`, so subsequent
calls in the loop cannot use this variable.

So, we need some type that lets us have more than one owning reference to a
value. Usually, we'd use `Rc<T>` for this, which is a reference counted type
that provides shared ownership. It has some runtime bookkeeping that keeps track
of the number of references to it, hence the "reference count" part of its name.

Calling `clone()` on an `Rc<T>` will return a new owned reference and bump the
internal reference count. We create one of these for each thread:

```rust,ignore
use std::thread;
use std::time::Duration;
use std::rc::Rc;

fn main() {
    let mut data = Rc::new(vec![1, 2, 3]);

    for i in 0..3 {
        // Create a new owned reference:
        let data_ref = data.clone();

        // Use it in a thread:
        thread::spawn(move || {
            data_ref[0] += i;
        });
    }

    thread::sleep(Duration::from_millis(50));
}
```

This won't work, however, and will give us the error:

```text
13:9: 13:22 error: the trait bound `alloc::rc::Rc<collections::vec::Vec<i32>> : core::marker::Send`
            is not satisfied
...
13:9: 13:22 note: `alloc::rc::Rc<collections::vec::Vec<i32>>`
            cannot be sent between threads safely
```

As the error message mentions, `Rc` cannot be sent between threads safely. This
is because the internal reference count is not maintained in a thread-safe
manner and can have a data race.

To solve this, we'll use `Arc<T>`, Rust's standard atomic reference count type.

The Atomic part means `Arc<T>` can safely be accessed from multiple threads.
To do this, `Arc<T>` updates its internal count with atomic (indivisible)
operations, which can't have data races.

In essence, `Arc<T>` is a type that lets us share ownership of data _across
threads_.


```rust,ignore
use std::thread;
use std::sync::Arc;
use std::time::Duration;

fn main() {
    let mut data = Arc::new(vec![1, 2, 3]);

    for i in 0..3 {
        let data = data.clone();
        thread::spawn(move || {
            data[0] += i;
        });
    }

    thread::sleep(Duration::from_millis(50));
}
```

Similarly to last time, we use `clone()` to create a new owned handle.
This handle is then moved into the new thread.

And... still gives us an error.

```text
<anon>:11:24 error: cannot borrow immutable borrowed content as mutable
<anon>:11                    data[0] += i;
                             ^~~~
```

`Arc<T>` by default has immutable contents. It allows the _sharing_ of data
between threads, but shared mutable data is unsafe and, when threads are
involved, can cause data races!


Usually when we wish to make something in an immutable position mutable, we use
`Cell<T>` or `RefCell<T>` which allow safe mutation via runtime checks or
otherwise (see also: [Choosing Your Guarantees](choosing-your-guarantees.html)).
However, similar to `Rc`, these are not thread safe. If we try using these, we
will get an error about these types not being `Sync`, and the code will fail to
compile.

It looks like we need some type that allows us to safely mutate a shared value
across threads, for example a type that can ensure only one thread at a time is
able to mutate the value inside it.

For that, we can use the `Mutex<T>` type!

Here's the working version:

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

fn main() {
    let data = Arc::new(Mutex::new(vec![1, 2, 3]));

    for i in 0..3 {
        let data = data.clone();
        thread::spawn(move || {
            let mut data = data.lock().unwrap();
            data[0] += i;
        });
    }

    thread::sleep(Duration::from_millis(50));
}
```

Note that the value of `i` is bound (copied) to the closure and not shared
among the threads.

We're "locking" the mutex here. A mutex (short for "mutual exclusion"), as
mentioned, only allows one thread at a time to access a value. When we wish to
access the value, we use `lock()` on it. This will "lock" the mutex, and no
other thread will be able to lock it (and hence, do anything with the value)
until we're done with it. If a thread attempts to lock a mutex which is already
locked, it will wait until the other thread releases the lock.

The lock "release" here is implicit; when the result of the lock (in this case,
`data`) goes out of scope, the lock is automatically released.
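
To make the implicit release more concrete, here is a small single-threaded
sketch. The function name `push_twice` is invented for this example; it shows
two ways to end a guard's life early, an inner scope and an explicit `drop()`:

```rust
use std::sync::Mutex;

// Hypothetical helper: locks the same mutex several times in a row.
fn push_twice(m: &Mutex<Vec<i32>>) -> usize {
    {
        let mut guard = m.lock().unwrap();
        guard.push(1);
    } // `guard` goes out of scope here, so the lock is released.

    let mut guard = m.lock().unwrap();
    guard.push(2);
    drop(guard); // Dropping the guard explicitly also releases the lock.

    // The mutex is free again, so this third `lock()` does not block.
    let len = m.lock().unwrap().len();
    len
}

fn main() {
    let m = Mutex::new(Vec::new());
    println!("{}", push_twice(&m));
}
```

If the first guard were still alive at the second `lock()` call, the thread
would be trying to take a lock it already holds, which must be avoided.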

Note that the [`lock`](../std/sync/struct.Mutex.html#method.lock) method of
[`Mutex`](../std/sync/struct.Mutex.html) has this signature:

```rust,ignore
fn lock(&self) -> LockResult<MutexGuard<T>>
```

and because `Send` is not implemented for `MutexGuard<T>`, the guard cannot
cross thread boundaries, ensuring thread-locality of lock acquire and release.

Let's examine the body of the thread more closely:

```rust
# use std::sync::{Arc, Mutex};
# use std::thread;
# use std::time::Duration;
# fn main() {
#     let data = Arc::new(Mutex::new(vec![1, 2, 3]));
#     for i in 0..3 {
#         let data = data.clone();
thread::spawn(move || {
    let mut data = data.lock().unwrap();
    data[0] += i;
});
#     }
#     thread::sleep(Duration::from_millis(50));
# }
```

First, we call `lock()`, which acquires the mutex's lock. Because this may fail
(for example, if another thread panicked while holding the lock), it returns a
`Result<T, E>`, and because this is just an example, we `unwrap()` it to get a
guard that dereferences to the data. Real code would have more robust error
handling here. We're then free to mutate the data, since we have the lock.

Lastly, while the threads are running, we wait on a short timer. But
this is not ideal: we may have picked a reasonable amount of time to
wait but it's more likely we'll either be waiting longer than
necessary or not long enough, depending on just how much time the
threads actually take to finish computing when the program runs.

A more precise alternative to the timer would be to use one of the
mechanisms provided by the Rust standard library for synchronizing
threads with each other. Let's talk about one of them: channels.

## Channels

Here's a version of our code that uses channels for synchronization, rather
than waiting for a specific time:

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::sync::mpsc;

fn main() {
    let data = Arc::new(Mutex::new(0));

    // `tx` is the "transmitter" or "sender".
    // `rx` is the "receiver".
    let (tx, rx) = mpsc::channel();

    for _ in 0..10 {
        let (data, tx) = (data.clone(), tx.clone());

        thread::spawn(move || {
            let mut data = data.lock().unwrap();
            *data += 1;

            tx.send(()).unwrap();
        });
    }

    for _ in 0..10 {
        rx.recv().unwrap();
    }
}
```

We use the `mpsc::channel()` function to construct a new channel. Each thread
`send`s a simple `()` down the channel, and the main thread then waits for ten
of them to arrive.

While this channel is sending a generic signal, we can send any data that
is `Send` over the channel!

```rust
use std::thread;
use std::sync::mpsc;

fn main() {
    let (tx, rx) = mpsc::channel();

    for i in 0..10 {
        let tx = tx.clone();

        thread::spawn(move || {
            let answer = i * i;

            tx.send(answer).unwrap();
        });
    }

    for _ in 0..10 {
        println!("{}", rx.recv().unwrap());
    }
}
```

Here we create 10 threads, asking each to calculate the square of a number (`i`
at the time of `spawn()`), and then `send()` back the answer over the channel.
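
The receiving loop above has to know how many messages to expect. A variation,
sketched here with an invented helper named `squares`, drops the original
sender and iterates over the receiver, which stops once every sender is gone:

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical helper: collects the squares of 0..n computed in threads.
fn squares(n: u32) -> Vec<u32> {
    let (tx, rx) = mpsc::channel();

    for i in 0..n {
        let tx = tx.clone();
        thread::spawn(move || {
            tx.send(i * i).unwrap();
        });
    }

    // Drop the original sender. Only the clones held by the threads remain,
    // so the channel closes as soon as the last thread finishes.
    drop(tx);

    // `iter()` yields messages until all senders have been dropped.
    let mut answers: Vec<u32> = rx.iter().collect();

    // Threads may finish in any order, so sort for a stable result.
    answers.sort();
    answers
}

fn main() {
    println!("{:?}", squares(4)); // prints "[0, 1, 4, 9]"
}
```

Without the `drop(tx)` call, the receiver would wait forever, since the
original sender would still be alive in the same function.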


## Panics

A `panic!` will crash the currently executing thread. You can use Rust's
threads as a simple isolation mechanism:

```rust
use std::thread;

fn main() {
    let handle = thread::spawn(move || {
        panic!("oops!");
    });

    let result = handle.join();

    assert!(result.is_err());
}
```

`JoinHandle::join()` gives us a `Result` back, which allows us to check if the
thread has panicked or not.
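
The `Err` variant carries the value the thread panicked with. As a hedged
sketch (the helper name `panic_message` is made up here), a panic raised with a
plain string literal can be recovered by downcasting the payload to `&str`:

```rust
use std::thread;

// Hypothetical helper: runs a panicking thread and reports its message.
fn panic_message() -> String {
    let handle = thread::spawn(|| {
        panic!("oops!");
    });

    match handle.join() {
        Ok(()) => String::from("no panic"),
        // The payload is a `Box<Any + Send>`; a literal message is a `&str`.
        Err(payload) => match payload.downcast_ref::<&str>() {
            Some(msg) => msg.to_string(),
            None => String::from("unknown panic payload"),
        },
    }
}

fn main() {
    println!("{}", panic_message()); // prints "oops!"
}
```

Payloads of other types, such as a formatted `String`, would need their own
`downcast_ref` call, which is why the sketch keeps a fallback branch.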