]> git.proxmox.com Git - rustc.git/blob - src/doc/book/src/ch15-06-reference-cycles.md
New upstream version 1.35.0+dfsg1
[rustc.git] / src / doc / book / src / ch15-06-reference-cycles.md
1 ## Reference Cycles Can Leak Memory
2
3 Rust’s memory safety guarantees make it difficult, but not impossible, to
4 accidentally create memory that is never cleaned up (known as a *memory leak*).
5 Preventing memory leaks entirely is not one of Rust’s guarantees in the same
6 way that disallowing data races at compile time is, meaning memory leaks are
7 memory safe in Rust. We can see that Rust allows memory leaks by using `Rc<T>`
8 and `RefCell<T>`: it’s possible to create references where items refer to each
9 other in a cycle. This creates memory leaks because the reference count of each
10 item in the cycle will never reach 0, and the values will never be dropped.
11
12 ### Creating a Reference Cycle
13
14 Let’s look at how a reference cycle might happen and how to prevent it,
15 starting with the definition of the `List` enum and a `tail` method in Listing
16 15-25:
17
18 <span class="filename">Filename: src/main.rs</span>
19
20 <!-- Hidden fn main is here to disable the automatic wrapping in fn main that
21 doc tests do; the `use List` fails if this listing is put within a main -->
22
23 ```rust
24 # fn main() {}
25 use std::rc::Rc;
26 use std::cell::RefCell;
27 use crate::List::{Cons, Nil};
28
29 #[derive(Debug)]
30 enum List {
31 Cons(i32, RefCell<Rc<List>>),
32 Nil,
33 }
34
35 impl List {
36 fn tail(&self) -> Option<&RefCell<Rc<List>>> {
37 match self {
38 Cons(_, item) => Some(item),
39 Nil => None,
40 }
41 }
42 }
43 ```
44
45 <span class="caption">Listing 15-25: A cons list definition that holds a
46 `RefCell<T>` so we can modify what a `Cons` variant is referring to</span>
47
48 We’re using another variation of the `List` definition from Listing 15-5. The
49 second element in the `Cons` variant is now `RefCell<Rc<List>>`, meaning that
50 instead of having the ability to modify the `i32` value as we did in Listing
51 15-24, we want to modify which `List` value a `Cons` variant is pointing to.
52 We’re also adding a `tail` method to make it convenient for us to access the
53 second item if we have a `Cons` variant.
54
55 In Listing 15-26, we’re adding a `main` function that uses the definitions in
56 Listing 15-25. This code creates a list in `a` and a list in `b` that points to
57 the list in `a`. Then it modifies the list in `a` to point to `b`, creating a
58 reference cycle. There are `println!` statements along the way to show what the
59 reference counts are at various points in this process.
60
61 <span class="filename">Filename: src/main.rs</span>
62
63 ```rust
64 # use crate::List::{Cons, Nil};
65 # use std::rc::Rc;
66 # use std::cell::RefCell;
67 # #[derive(Debug)]
68 # enum List {
69 # Cons(i32, RefCell<Rc<List>>),
70 # Nil,
71 # }
72 #
73 # impl List {
74 # fn tail(&self) -> Option<&RefCell<Rc<List>>> {
75 # match self {
76 # Cons(_, item) => Some(item),
77 # Nil => None,
78 # }
79 # }
80 # }
81 #
82 fn main() {
83 let a = Rc::new(Cons(5, RefCell::new(Rc::new(Nil))));
84
85 println!("a initial rc count = {}", Rc::strong_count(&a));
86 println!("a next item = {:?}", a.tail());
87
88 let b = Rc::new(Cons(10, RefCell::new(Rc::clone(&a))));
89
90 println!("a rc count after b creation = {}", Rc::strong_count(&a));
91 println!("b initial rc count = {}", Rc::strong_count(&b));
92 println!("b next item = {:?}", b.tail());
93
94 if let Some(link) = a.tail() {
95 *link.borrow_mut() = Rc::clone(&b);
96 }
97
98 println!("b rc count after changing a = {}", Rc::strong_count(&b));
99 println!("a rc count after changing a = {}", Rc::strong_count(&a));
100
101 // Uncomment the next line to see that we have a cycle;
102 // it will overflow the stack
103 // println!("a next item = {:?}", a.tail());
104 }
105 ```
106
107 <span class="caption">Listing 15-26: Creating a reference cycle of two `List`
108 values pointing to each other</span>
109
110 We create an `Rc<List>` instance holding a `List` value in the variable `a`
111 with an initial list of `5, Nil`. We then create an `Rc<List>` instance
112 holding another `List` value in the variable `b` that contains the value 10 and
113 points to the list in `a`.
114
115 We modify `a` so it points to `b` instead of `Nil`, creating a cycle. We
116 do that by using the `tail` method to get a reference to the
117 `RefCell<Rc<List>>` in `a`, which we put in the variable `link`. Then we use
118 the `borrow_mut` method on the `RefCell<Rc<List>>` to change the value inside
119 from an `Rc<List>` that holds a `Nil` value to the `Rc<List>` in `b`.
120
121 When we run this code, keeping the last `println!` commented out for the
122 moment, we’ll get this output:
123
124 ```text
125 a initial rc count = 1
126 a next item = Some(RefCell { value: Nil })
127 a rc count after b creation = 2
128 b initial rc count = 1
129 b next item = Some(RefCell { value: Cons(5, RefCell { value: Nil }) })
130 b rc count after changing a = 2
131 a rc count after changing a = 2
132 ```
133
134 The reference count of the `Rc<List>` instances in both `a` and `b` are 2
135 after we change the list in `a` to point to `b`. At the end of `main`, Rust
136 will try to drop `b` first, which will decrease the count of the `Rc<List>`
137 instance in `b` by 1.
138
139 However, because `a` is still referencing the `Rc<List>` that was in `b`, that
140 `Rc<List>` has a count of 1 rather than 0, so the memory the `Rc<List>` has on
141 the heap won’t be dropped. The memory will just sit there with a count of 1,
142 forever. To visualize this reference cycle, we’ve created a diagram in Figure
143 15-4.
144
145 <img alt="Reference cycle of lists" src="img/trpl15-04.svg" class="center" />
146
147 <span class="caption">Figure 15-4: A reference cycle of lists `a` and `b`
148 pointing to each other</span>
149
150 If you uncomment the last `println!` and run the program, Rust will try to
151 print this cycle with `a` pointing to `b` pointing to `a` and so forth until it
152 overflows the stack.
153
154 In this case, right after we create the reference cycle, the program ends. The
155 consequences of this cycle aren’t very dire. However, if a more complex program
156 allocated lots of memory in a cycle and held onto it for a long time, the
157 program would use more memory than it needed and might overwhelm the system,
158 causing it to run out of available memory.
159
160 Creating reference cycles is not easily done, but it’s not impossible either.
161 If you have `RefCell<T>` values that contain `Rc<T>` values or similar nested
162 combinations of types with interior mutability and reference counting, you must
163 ensure that you don’t create cycles; you can’t rely on Rust to catch them.
164 Creating a reference cycle would be a logic bug in your program that you should
165 use automated tests, code reviews, and other software development practices to
166 minimize.
167
168 Another solution for avoiding reference cycles is reorganizing your data
169 structures so that some references express ownership and some references don’t.
170 As a result, you can have cycles made up of some ownership relationships and
171 some non-ownership relationships, and only the ownership relationships affect
172 whether or not a value can be dropped. In Listing 15-25, we always want `Cons`
173 variants to own their list, so reorganizing the data structure isn’t possible.
174 Let’s look at an example using graphs made up of parent nodes and child nodes
175 to see when non-ownership relationships are an appropriate way to prevent
176 reference cycles.
177
178 ### Preventing Reference Cycles: Turning an `Rc<T>` into a `Weak<T>`
179
180 So far, we’ve demonstrated that calling `Rc::clone` increases the
181 `strong_count` of an `Rc<T>` instance, and an `Rc<T>` instance is only cleaned
182 up if its `strong_count` is 0. You can also create a *weak reference* to the
183 value within an `Rc<T>` instance by calling `Rc::downgrade` and passing a
184 reference to the `Rc<T>`. When you call `Rc::downgrade`, you get a smart
185 pointer of type `Weak<T>`. Instead of increasing the `strong_count` in the
186 `Rc<T>` instance by 1, calling `Rc::downgrade` increases the `weak_count` by 1.
187 The `Rc<T>` type uses `weak_count` to keep track of how many `Weak<T>`
188 references exist, similar to `strong_count`. The difference is the `weak_count`
189 doesn’t need to be 0 for the `Rc<T>` instance to be cleaned up.
190
191 Strong references are how you can share ownership of an `Rc<T>` instance. Weak
192 references don’t express an ownership relationship. They won’t cause a
193 reference cycle because any cycle involving some weak references will be broken
194 once the strong reference count of values involved is 0.
195
196 Because the value that `Weak<T>` references might have been dropped, to do
197 anything with the value that a `Weak<T>` is pointing to, you must make sure the
198 value still exists. Do this by calling the `upgrade` method on a `Weak<T>`
199 instance, which will return an `Option<Rc<T>>`. You’ll get a result of `Some`
200 if the `Rc<T>` value has not been dropped yet and a result of `None` if the
201 `Rc<T>` value has been dropped. Because `upgrade` returns an `Option<T>`, Rust
202 will ensure that the `Some` case and the `None` case are handled, and there
203 won’t be an invalid pointer.
204
205 As an example, rather than using a list whose items know only about the next
206 item, we’ll create a tree whose items know about their children items *and*
207 their parent items.
208
209 #### Creating a Tree Data Structure: a `Node` with Child Nodes
210
211 To start, we’ll build a tree with nodes that know about their child nodes.
212 We’ll create a struct named `Node` that holds its own `i32` value as well as
213 references to its children `Node` values:
214
215 <span class="filename">Filename: src/main.rs</span>
216
217 ```rust
218 use std::rc::Rc;
219 use std::cell::RefCell;
220
221 #[derive(Debug)]
222 struct Node {
223 value: i32,
224 children: RefCell<Vec<Rc<Node>>>,
225 }
226 ```
227
228 We want a `Node` to own its children, and we want to share that ownership with
229 variables so we can access each `Node` in the tree directly. To do this, we
230 define the `Vec<T>` items to be values of type `Rc<Node>`. We also want to
231 modify which nodes are children of another node, so we have a `RefCell<T>` in
232 `children` around the `Vec<Rc<Node>>`.
233
234 Next, we’ll use our struct definition and create one `Node` instance named
235 `leaf` with the value 3 and no children, and another instance named `branch`
236 with the value 5 and `leaf` as one of its children, as shown in Listing 15-27:
237
238 <span class="filename">Filename: src/main.rs</span>
239
240 ```rust
241 # use std::rc::Rc;
242 # use std::cell::RefCell;
243 #
244 # #[derive(Debug)]
245 # struct Node {
246 # value: i32,
247 # children: RefCell<Vec<Rc<Node>>>,
248 # }
249 #
250 fn main() {
251 let leaf = Rc::new(Node {
252 value: 3,
253 children: RefCell::new(vec![]),
254 });
255
256 let branch = Rc::new(Node {
257 value: 5,
258 children: RefCell::new(vec![Rc::clone(&leaf)]),
259 });
260 }
261 ```
262
263 <span class="caption">Listing 15-27: Creating a `leaf` node with no children
264 and a `branch` node with `leaf` as one of its children</span>
265
266 We clone the `Rc<Node>` in `leaf` and store that in `branch`, meaning the
267 `Node` in `leaf` now has two owners: `leaf` and `branch`. We can get from
268 `branch` to `leaf` through `branch.children`, but there’s no way to get from
269 `leaf` to `branch`. The reason is that `leaf` has no reference to `branch` and
270 doesn’t know they’re related. We want `leaf` to know that `branch` is its
271 parent. We’ll do that next.
272
273 #### Adding a Reference from a Child to Its Parent
274
275 To make the child node aware of its parent, we need to add a `parent` field to
276 our `Node` struct definition. The trouble is in deciding what the type of
277 `parent` should be. We know it can’t contain an `Rc<T>`, because that would
278 create a reference cycle with `leaf.parent` pointing to `branch` and
279 `branch.children` pointing to `leaf`, which would cause their `strong_count`
280 values to never be 0.
281
282 Thinking about the relationships another way, a parent node should own its
283 children: if a parent node is dropped, its child nodes should be dropped as
284 well. However, a child should not own its parent: if we drop a child node, the
285 parent should still exist. This is a case for weak references!
286
287 So instead of `Rc<T>`, we’ll make the type of `parent` use `Weak<T>`,
288 specifically a `RefCell<Weak<Node>>`. Now our `Node` struct definition looks
289 like this:
290
291 <span class="filename">Filename: src/main.rs</span>
292
293 ```rust
294 use std::rc::{Rc, Weak};
295 use std::cell::RefCell;
296
297 #[derive(Debug)]
298 struct Node {
299 value: i32,
300 parent: RefCell<Weak<Node>>,
301 children: RefCell<Vec<Rc<Node>>>,
302 }
303 ```
304
305 A node will be able to refer to its parent node but doesn’t own its parent.
306 In Listing 15-28, we update `main` to use this new definition so the `leaf`
307 node will have a way to refer to its parent, `branch`:
308
309 <span class="filename">Filename: src/main.rs</span>
310
311 ```rust
312 # use std::rc::{Rc, Weak};
313 # use std::cell::RefCell;
314 #
315 # #[derive(Debug)]
316 # struct Node {
317 # value: i32,
318 # parent: RefCell<Weak<Node>>,
319 # children: RefCell<Vec<Rc<Node>>>,
320 # }
321 #
322 fn main() {
323 let leaf = Rc::new(Node {
324 value: 3,
325 parent: RefCell::new(Weak::new()),
326 children: RefCell::new(vec![]),
327 });
328
329 println!("leaf parent = {:?}", leaf.parent.borrow().upgrade());
330
331 let branch = Rc::new(Node {
332 value: 5,
333 parent: RefCell::new(Weak::new()),
334 children: RefCell::new(vec![Rc::clone(&leaf)]),
335 });
336
337 *leaf.parent.borrow_mut() = Rc::downgrade(&branch);
338
339 println!("leaf parent = {:?}", leaf.parent.borrow().upgrade());
340 }
341 ```
342
343 <span class="caption">Listing 15-28: A `leaf` node with a weak reference to its
344 parent node `branch`</span>
345
346 Creating the `leaf` node looks similar to how creating the `leaf` node looked
347 in Listing 15-27 with the exception of the `parent` field: `leaf` starts out
348 without a parent, so we create a new, empty `Weak<Node>` reference instance.
349
350 At this point, when we try to get a reference to the parent of `leaf` by using
351 the `upgrade` method, we get a `None` value. We see this in the output from the
352 first `println!` statement:
353
354 ```text
355 leaf parent = None
356 ```
357
358 When we create the `branch` node, it will also have a new `Weak<Node>`
359 reference in the `parent` field, because `branch` doesn’t have a parent node.
360 We still have `leaf` as one of the children of `branch`. Once we have the
361 `Node` instance in `branch`, we can modify `leaf` to give it a `Weak<Node>`
362 reference to its parent. We use the `borrow_mut` method on the
363 `RefCell<Weak<Node>>` in the `parent` field of `leaf`, and then we use the
364 `Rc::downgrade` function to create a `Weak<Node>` reference to `branch` from
365 the `Rc<Node>` in `branch.`
366
367 When we print the parent of `leaf` again, this time we’ll get a `Some` variant
368 holding `branch`: now `leaf` can access its parent! When we print `leaf`, we
369 also avoid the cycle that eventually ended in a stack overflow like we had in
370 Listing 15-26; the `Weak<Node>` references are printed as `(Weak)`:
371
372 ```text
373 leaf parent = Some(Node { value: 5, parent: RefCell { value: (Weak) },
374 children: RefCell { value: [Node { value: 3, parent: RefCell { value: (Weak) },
375 children: RefCell { value: [] } }] } })
376 ```
377
378 The lack of infinite output indicates that this code didn’t create a reference
379 cycle. We can also tell this by looking at the values we get from calling
380 `Rc::strong_count` and `Rc::weak_count`.
381
382 #### Visualizing Changes to `strong_count` and `weak_count`
383
384 Let’s look at how the `strong_count` and `weak_count` values of the `Rc<Node>`
385 instances change by creating a new inner scope and moving the creation of
386 `branch` into that scope. By doing so, we can see what happens when `branch` is
387 created and then dropped when it goes out of scope. The modifications are shown
388 in Listing 15-29:
389
390 <span class="filename">Filename: src/main.rs</span>
391
392 ```rust
393 # use std::rc::{Rc, Weak};
394 # use std::cell::RefCell;
395 #
396 # #[derive(Debug)]
397 # struct Node {
398 # value: i32,
399 # parent: RefCell<Weak<Node>>,
400 # children: RefCell<Vec<Rc<Node>>>,
401 # }
402 #
403 fn main() {
404 let leaf = Rc::new(Node {
405 value: 3,
406 parent: RefCell::new(Weak::new()),
407 children: RefCell::new(vec![]),
408 });
409
410 println!(
411 "leaf strong = {}, weak = {}",
412 Rc::strong_count(&leaf),
413 Rc::weak_count(&leaf),
414 );
415
416 {
417 let branch = Rc::new(Node {
418 value: 5,
419 parent: RefCell::new(Weak::new()),
420 children: RefCell::new(vec![Rc::clone(&leaf)]),
421 });
422
423 *leaf.parent.borrow_mut() = Rc::downgrade(&branch);
424
425 println!(
426 "branch strong = {}, weak = {}",
427 Rc::strong_count(&branch),
428 Rc::weak_count(&branch),
429 );
430
431 println!(
432 "leaf strong = {}, weak = {}",
433 Rc::strong_count(&leaf),
434 Rc::weak_count(&leaf),
435 );
436 }
437
438 println!("leaf parent = {:?}", leaf.parent.borrow().upgrade());
439 println!(
440 "leaf strong = {}, weak = {}",
441 Rc::strong_count(&leaf),
442 Rc::weak_count(&leaf),
443 );
444 }
445 ```
446
447 <span class="caption">Listing 15-29: Creating `branch` in an inner scope and
448 examining strong and weak reference counts</span>
449
450 After `leaf` is created, its `Rc<Node>` has a strong count of 1 and a weak
451 count of 0. In the inner scope, we create `branch` and associate it with
452 `leaf`, at which point when we print the counts, the `Rc<Node>` in `branch`
453 will have a strong count of 1 and a weak count of 1 (for `leaf.parent` pointing
454 to `branch` with a `Weak<Node>`). When we print the counts in `leaf`, we’ll see
455 it will have a strong count of 2, because `branch` now has a clone of the
456 `Rc<Node>` of `leaf` stored in `branch.children`, but will still have a weak
457 count of 0.
458
459 When the inner scope ends, `branch` goes out of scope and the strong count of
460 the `Rc<Node>` decreases to 0, so its `Node` is dropped. The weak count of 1
461 from `leaf.parent` has no bearing on whether or not `Node` is dropped, so we
462 don’t get any memory leaks!
463
464 If we try to access the parent of `leaf` after the end of the scope, we’ll get
465 `None` again. At the end of the program, the `Rc<Node>` in `leaf` has a strong
466 count of 1 and a weak count of 0, because the variable `leaf` is now the only
467 reference to the `Rc<Node>` again.
468
469 All of the logic that manages the counts and value dropping is built into
470 `Rc<T>` and `Weak<T>` and their implementations of the `Drop` trait. By
471 specifying that the relationship from a child to its parent should be a
472 `Weak<T>` reference in the definition of `Node`, you’re able to have parent
473 nodes point to child nodes and vice versa without creating a reference cycle
474 and memory leaks.
475
476 ## Summary
477
478 This chapter covered how to use smart pointers to make different guarantees and
479 trade-offs from those Rust makes by default with regular references. The
480 `Box<T>` type has a known size and points to data allocated on the heap. The
481 `Rc<T>` type keeps track of the number of references to data on the heap so
482 that data can have multiple owners. The `RefCell<T>` type with its interior
483 mutability gives us a type that we can use when we need an immutable type but
484 need to change an inner value of that type; it also enforces the borrowing
485 rules at runtime instead of at compile time.
486
487 Also discussed were the `Deref` and `Drop` traits, which enable a lot of the
488 functionality of smart pointers. We explored reference cycles that can cause
489 memory leaks and how to prevent them using `Weak<T>`.
490
491 If this chapter has piqued your interest and you want to implement your own
492 smart pointers, check out [“The Rustonomicon”][nomicon] for more useful
493 information.
494
495 [nomicon]: https://doc.rust-lang.org/stable/nomicon/
496
497 Next, we’ll talk about concurrency in Rust. You’ll even learn about a few new
498 smart pointers.