]> git.proxmox.com Git - rustc.git/blame - src/doc/book/nostarch/chapter15.md
New upstream version 1.62.1+dfsg1
[rustc.git] / src / doc / book / nostarch / chapter15.md
CommitLineData
5099ac24
FG
1<!-- DO NOT EDIT THIS FILE.
2
3This file is periodically generated from the content in the `/src/`
4directory, so all fixes need to be made in `/src/`.
5-->
6
7[TOC]
8
9# Smart Pointers
10
11A *pointer* is a general concept for a variable that contains an address in
12memory. This address refers to, or “points at,” some other data. The most
13common kind of pointer in Rust is a reference, which you learned about in
14Chapter 4. References are indicated by the `&` symbol and borrow the value they
15point to. They don’t have any special capabilities other than referring to
04454e1e 16data, and have no overhead.
5099ac24 17
04454e1e
FG
18*Smart pointers*, on the other hand, are data structures that act like a
19pointer but also have additional metadata and capabilities. The concept of
5099ac24 20smart pointers isn’t unique to Rust: smart pointers originated in C++ and exist
04454e1e
FG
21in other languages as well. Rust has a variety of smart pointers defined in the
22standard library that provide functionality beyond that provided by references.
23To explore the general concept, we'll look at a couple of different examples of
24smart pointers, including a *reference counting* smart pointer type. This
25pointer enables you to allow data to have multiple owners by keeping track of
26the number of owners and, when no owners remain, cleaning up the data.
27
28Rust, with its concept of ownership and borrowing, has an additional difference
29between references and smart pointers: while references only borrow data, in
30many cases, smart pointers *own* the data they point to.
31
32Though we didn't call them as much at the time, we’ve already encountered a few
33smart pointers in this book, including `String` and `Vec<T>` in Chapter 8. Both
34these types count as smart pointers because they own some memory and allow you
35to manipulate it. They also have metadata and extra capabilities or guarantees.
36`String`, for example, stores its capacity as metadata and has the extra
37ability to ensure its data will always be valid UTF-8.
38
39Smart pointers are usually implemented using structs. Unlike an ordinary
40struct, smart pointers implement the `Deref` and `Drop` traits. The `Deref`
41trait allows an instance of the smart pointer struct to behave like a reference
42so you can write your code to work with either references or smart pointers.
43The `Drop` trait allows you to customize the code that's run when an instance
44of the smart pointer goes out of scope. In this chapter, we’ll discuss both
45traits and demonstrate why they’re important to smart pointers.
5099ac24
FG
46
47Given that the smart pointer pattern is a general design pattern used
48frequently in Rust, this chapter won’t cover every existing smart pointer. Many
49libraries have their own smart pointers, and you can even write your own. We’ll
50cover the most common smart pointers in the standard library:
51
52* `Box<T>` for allocating values on the heap
53* `Rc<T>`, a reference counting type that enables multiple ownership
54* `Ref<T>` and `RefMut<T>`, accessed through `RefCell<T>`, a type that enforces
55 the borrowing rules at runtime instead of compile time
56
57In addition, we’ll cover the *interior mutability* pattern where an immutable
58type exposes an API for mutating an interior value. We’ll also discuss
59*reference cycles*: how they can leak memory and how to prevent them.
60
61Let’s dive in!
62
63## Using `Box<T>` to Point to Data on the Heap
64
65The most straightforward smart pointer is a *box*, whose type is written
66`Box<T>`. Boxes allow you to store data on the heap rather than the stack. What
67remains on the stack is the pointer to the heap data. Refer to Chapter 4 to
68review the difference between the stack and the heap.
69
70Boxes don’t have performance overhead, other than storing their data on the
71heap instead of on the stack. But they don’t have many extra capabilities
72either. You’ll use them most often in these situations:
73
74* When you have a type whose size can’t be known at compile time and you want
75 to use a value of that type in a context that requires an exact size
76* When you have a large amount of data and you want to transfer ownership but
77 ensure the data won’t be copied when you do so
78* When you want to own a value and you care only that it’s a type that
79 implements a particular trait rather than being of a specific type
80
81We’ll demonstrate the first situation in the “Enabling Recursive Types with
82Boxes” section. In the second case, transferring ownership of a large amount of
83data can take a long time because the data is copied around on the stack. To
84improve performance in this situation, we can store the large amount of data on
85the heap in a box. Then, only the small amount of pointer data is copied around
86on the stack, while the data it references stays in one place on the heap. The
87third case is known as a *trait object*, and Chapter 17 devotes an entire
88section, “Using Trait Objects That Allow for Values of Different Types,” just
89to that topic. So what you learn here you’ll apply again in Chapter 17!
90
91### Using a `Box<T>` to Store Data on the Heap
92
04454e1e
FG
93Before we discuss the heap storage use case for `Box<T>`, we’ll cover the
94syntax and how to interact with values stored within a `Box<T>`.
5099ac24
FG
95
96Listing 15-1 shows how to use a box to store an `i32` value on the heap:
97
98Filename: src/main.rs
99
100```
101fn main() {
102 let b = Box::new(5);
103 println!("b = {}", b);
104}
105```
106
107Listing 15-1: Storing an `i32` value on the heap using a box
108
109We define the variable `b` to have the value of a `Box` that points to the
110value `5`, which is allocated on the heap. This program will print `b = 5`; in
111this case, we can access the data in the box similar to how we would if this
112data were on the stack. Just like any owned value, when a box goes out of
113scope, as `b` does at the end of `main`, it will be deallocated. The
04454e1e
FG
114deallocation happens both for the box (stored on the stack) and the data it
115points to (stored on the heap).
5099ac24
FG
116
117Putting a single value on the heap isn’t very useful, so you won’t use boxes by
118themselves in this way very often. Having values like a single `i32` on the
119stack, where they’re stored by default, is more appropriate in the majority of
120situations. Let’s look at a case where boxes allow us to define types that we
121wouldn’t be allowed to if we didn’t have boxes.
122
123### Enabling Recursive Types with Boxes
124
04454e1e
FG
125A value of *recursive type* can have another value of the same type as part of
126itself. Recursive types pose an issue because at compile time Rust needs to
127know how much space a type takes up. However, the nesting of values of
128recursive types could theoretically continue infinitely, so Rust can’t know how
129much space the value needs. Because boxes have a known size, we can enable
130recursive types by inserting a box in the recursive type definition.
5099ac24 131
04454e1e
FG
132As an example of a recursive type, let’s explore the *cons list*. This is a data
133type commonly found in functional programming languages. The cons list type
5099ac24
FG
134we’ll define is straightforward except for the recursion; therefore, the
135concepts in the example we’ll work with will be useful any time you get into
136more complex situations involving recursive types.
137
138#### More Information About the Cons List
139
140A *cons list* is a data structure that comes from the Lisp programming language
04454e1e
FG
141and its dialects and is made up of nested pairs. Its name comes from the `cons`
142function (short for “construct function”) in Lisp that constructs a new pair
143from its two arguments. By calling `cons` on a pair consisting of a value and
144another pair, we can construct cons lists made up of recursive pairs.
5099ac24 145
04454e1e
FG
146<!-- can you add a direct definition of the cons list -- we
147have a definition for the cons function, but not a direct one for the list. I
148assume the list is the result of the function, so is a recursive list of nested
149pairs? /LC -->
150<!-- There was a definition here, "These pairs containing pairs form a list",
151but I guess it wasn't clear enough. Is this better? /Carol -->
152
153For example, here's a pseudocode representation of a cons list containing the
154list 1, 2, 3 with each pair in parentheses:
155
156```
157(1, (2, (3, Nil)))
158```
159
160<!-- maybe here also show us a cons list -- not necessarily how to make one,
161but the actual list that's the result /LC -->
162<!-- Done! /Carol -->
5099ac24
FG
163
164Each item in a cons list contains two elements: the value of the current item
165and the next item. The last item in the list contains only a value called `Nil`
166without a next item. A cons list is produced by recursively calling the `cons`
167function. The canonical name to denote the base case of the recursion is `Nil`.
168Note that this is not the same as the “null” or “nil” concept in Chapter 6,
169which is an invalid or absent value.
170
04454e1e
FG
171The cons list isn’t a commonly used data structure in Rust. Most of the time
172when you have a list of items in Rust, `Vec<T>` is a better choice to use.
173Other, more complex recursive data types *are* useful in various situations,
174but by starting with the cons list in this chapter, we can explore how boxes
175let us define a recursive data type without much distraction.
5099ac24
FG
176
177Listing 15-2 contains an enum definition for a cons list. Note that this code
178won’t compile yet because the `List` type doesn’t have a known size, which
179we’ll demonstrate.
180
181Filename: src/main.rs
182
183```
184enum List {
185 Cons(i32, List),
186 Nil,
187}
188```
189
190Listing 15-2: The first attempt at defining an enum to represent a cons list
191data structure of `i32` values
192
193> Note: We’re implementing a cons list that holds only `i32` values for the
194> purposes of this example. We could have implemented it using generics, as we
195> discussed in Chapter 10, to define a cons list type that could store values of
196> any type.
197
198Using the `List` type to store the list `1, 2, 3` would look like the code in
199Listing 15-3:
200
201Filename: src/main.rs
202
203```
204use crate::List::{Cons, Nil};
205
206fn main() {
207 let list = Cons(1, Cons(2, Cons(3, Nil)));
208}
209```
210
211Listing 15-3: Using the `List` enum to store the list `1, 2, 3`
212
213The first `Cons` value holds `1` and another `List` value. This `List` value is
214another `Cons` value that holds `2` and another `List` value. This `List` value
215is one more `Cons` value that holds `3` and a `List` value, which is finally
216`Nil`, the non-recursive variant that signals the end of the list.
217
04454e1e
FG
218<!-- I really like the use of the cons list as the example here /LC -->
219
5099ac24
FG
220If we try to compile the code in Listing 15-3, we get the error shown in
221Listing 15-4:
222
223```
224error[E0072]: recursive type `List` has infinite size
225 --> src/main.rs:1:1
226 |
2271 | enum List {
228 | ^^^^^^^^^ recursive type has infinite size
2292 | Cons(i32, List),
230 | ---- recursive without indirection
231 |
232help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to make `List` representable
233```
234
235Listing 15-4: The error we get when attempting to define a recursive enum
236
237The error shows this type “has infinite size.” The reason is that we’ve defined
238`List` with a variant that is recursive: it holds another value of itself
239directly. As a result, Rust can’t figure out how much space it needs to store a
04454e1e
FG
240`List` value. Let’s break down why we get this error. First, we'll look at how
241Rust decides how much space it needs to store a value of a non-recursive type.
5099ac24
FG
242
243#### Computing the Size of a Non-Recursive Type
244
245Recall the `Message` enum we defined in Listing 6-2 when we discussed enum
246definitions in Chapter 6:
247
248```
249enum Message {
250 Quit,
251 Move { x: i32, y: i32 },
252 Write(String),
253 ChangeColor(i32, i32, i32),
254}
255```
256
257To determine how much space to allocate for a `Message` value, Rust goes
258through each of the variants to see which variant needs the most space. Rust
259sees that `Message::Quit` doesn’t need any space, `Message::Move` needs enough
260space to store two `i32` values, and so forth. Because only one variant will be
261used, the most space a `Message` value will need is the space it would take to
262store the largest of its variants.
263
264Contrast this with what happens when Rust tries to determine how much space a
265recursive type like the `List` enum in Listing 15-2 needs. The compiler starts
266by looking at the `Cons` variant, which holds a value of type `i32` and a value
267of type `List`. Therefore, `Cons` needs an amount of space equal to the size of
268an `i32` plus the size of a `List`. To figure out how much memory the `List`
269type needs, the compiler looks at the variants, starting with the `Cons`
270variant. The `Cons` variant holds a value of type `i32` and a value of type
271`List`, and this process continues infinitely, as shown in Figure 15-1.
272
273<img alt="An infinite Cons list" src="img/trpl15-01.svg" class="center" style="width: 50%;" />
274
275Figure 15-1: An infinite `List` consisting of infinite `Cons` variants
276
277#### Using `Box<T>` to Get a Recursive Type with a Known Size
278
04454e1e
FG
279Because Rust can’t figure out how much space to allocate for recursively
280defined types, the compiler gives an error with this helpful suggestion:
5099ac24
FG
281
282```
283help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to make `List` representable
284```
285
286In this suggestion, “indirection” means that instead of storing a value
04454e1e 287directly, we should change the data structure to store the value indirectly by
5099ac24
FG
288storing a pointer to the value instead.
289
290Because a `Box<T>` is a pointer, Rust always knows how much space a `Box<T>`
291needs: a pointer’s size doesn’t change based on the amount of data it’s
292pointing to. This means we can put a `Box<T>` inside the `Cons` variant instead
293of another `List` value directly. The `Box<T>` will point to the next `List`
294value that will be on the heap rather than inside the `Cons` variant.
04454e1e
FG
295Conceptually, we still have a list, created with lists holding other lists, but
296this implementation is now more like placing the items next to one another
5099ac24
FG
297rather than inside one another.
298
299We can change the definition of the `List` enum in Listing 15-2 and the usage
300of the `List` in Listing 15-3 to the code in Listing 15-5, which will compile:
301
302Filename: src/main.rs
303
304```
305enum List {
306 Cons(i32, Box<List>),
307 Nil,
308}
309
310use crate::List::{Cons, Nil};
311
312fn main() {
313 let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
314}
315```
316
317Listing 15-5: Definition of `List` that uses `Box<T>` in order to have a known
318size
319
04454e1e 320The `Cons` variant needs the size of an `i32` plus the space to store the
5099ac24
FG
321box’s pointer data. The `Nil` variant stores no values, so it needs less space
322than the `Cons` variant. We now know that any `List` value will take up the
323size of an `i32` plus the size of a box’s pointer data. By using a box, we’ve
324broken the infinite, recursive chain, so the compiler can figure out the size
325it needs to store a `List` value. Figure 15-2 shows what the `Cons` variant
326looks like now.
327
328<img alt="A finite Cons list" src="img/trpl15-02.svg" class="center" />
329
330Figure 15-2: A `List` that is not infinitely sized because `Cons` holds a `Box`
331
332Boxes provide only the indirection and heap allocation; they don’t have any
333other special capabilities, like those we’ll see with the other smart pointer
04454e1e 334types. They also don’t have the performance overhead that these special
5099ac24
FG
335capabilities incur, so they can be useful in cases like the cons list where the
336indirection is the only feature we need. We’ll look at more use cases for boxes
337in Chapter 17, too.
338
339The `Box<T>` type is a smart pointer because it implements the `Deref` trait,
340which allows `Box<T>` values to be treated like references. When a `Box<T>`
341value goes out of scope, the heap data that the box is pointing to is cleaned
04454e1e
FG
342up as well because of the `Drop` trait implementation. These two traits will be
343even more important to the functionality provided by the other smart pointer
344types we’ll discuss in the rest of this chapter. Let’s explore these two traits
345in more detail.
5099ac24
FG
346
347## Treating Smart Pointers Like Regular References with the `Deref` Trait
348
349Implementing the `Deref` trait allows you to customize the behavior of the
04454e1e 350*dereference operator* `*` (not to be confused with the multiplication or glob
5099ac24
FG
351operator). By implementing `Deref` in such a way that a smart pointer can be
352treated like a regular reference, you can write code that operates on
353references and use that code with smart pointers too.
354
355Let’s first look at how the dereference operator works with regular references.
356Then we’ll try to define a custom type that behaves like `Box<T>`, and see why
357the dereference operator doesn’t work like a reference on our newly defined
358type. We’ll explore how implementing the `Deref` trait makes it possible for
359smart pointers to work in ways similar to references. Then we’ll look at
360Rust’s *deref coercion* feature and how it lets us work with either references
361or smart pointers.
362
363> Note: there’s one big difference between the `MyBox<T>` type we’re about to
364> build and the real `Box<T>`: our version will not store its data on the heap.
365> We are focusing this example on `Deref`, so where the data is actually stored
366> is less important than the pointer-like behavior.
367
04454e1e 368### Following the Pointer to the Value
5099ac24
FG
369
370A regular reference is a type of pointer, and one way to think of a pointer is
371as an arrow to a value stored somewhere else. In Listing 15-6, we create a
372reference to an `i32` value and then use the dereference operator to follow the
04454e1e
FG
373reference to the value:
374<!-- is the data different to the value? I wasn't sure if the two things above
375were supposed to be doing different things? /LC -->
376<!-- I'm not sure what you mean by "two things above", but generally "value"
377and "data" are interchangeable. I switched "data" to "value" in this paragraph,
378does that fix the issue? /Carol -->
5099ac24
FG
379
380Filename: src/main.rs
381
382```
383fn main() {
384 [1] let x = 5;
385 [2] let y = &x;
386
387 [3] assert_eq!(5, x);
388 [4] assert_eq!(5, *y);
389}
390```
391
392Listing 15-6: Using the dereference operator to follow a reference to an `i32`
393value
394
04454e1e 395The variable `x` holds an `i32` value `5` [1]. We set `y` equal to a reference
5099ac24
FG
396to `x` [2]. We can assert that `x` is equal to `5` [3]. However, if we want to
397make an assertion about the value in `y`, we have to use `*y` to follow the
04454e1e
FG
398reference to the value it’s pointing to (hence *dereference*) so the compiler
399can compare the actual value [4]. Once we dereference `y`, we have access to
400the integer value `y` is pointing to that we can compare with `5`.
5099ac24
FG
401
402If we tried to write `assert_eq!(5, y);` instead, we would get this compilation
403error:
404
405```
406error[E0277]: can't compare `{integer}` with `&{integer}`
407 --> src/main.rs:6:5
408 |
4096 | assert_eq!(5, y);
410 | ^^^^^^^^^^^^^^^^ no implementation for `{integer} == &{integer}`
411 |
412 = help: the trait `PartialEq<&{integer}>` is not implemented for `{integer}`
413```
414
415Comparing a number and a reference to a number isn’t allowed because they’re
416different types. We must use the dereference operator to follow the reference
417to the value it’s pointing to.
418
419### Using `Box<T>` Like a Reference
420
421We can rewrite the code in Listing 15-6 to use a `Box<T>` instead of a
04454e1e
FG
422reference; the dereference operator used on the `Box<T>` in Listing 15-7
423functions in the same way as the dereference operator used on the reference in
424Listing 15-6:
5099ac24
FG
425
426Filename: src/main.rs
427
428```
429fn main() {
430 let x = 5;
431 [1] let y = Box::new(x);
432
433 assert_eq!(5, x);
434 [2] assert_eq!(5, *y);
435}
436```
437
438Listing 15-7: Using the dereference operator on a `Box<i32>`
439
440The main difference between Listing 15-7 and Listing 15-6 is that here we set
441`y` to be an instance of a box pointing to a copied value of `x` rather than a
442reference pointing to the value of `x` [1]. In the last assertion [2], we can
443use the dereference operator to follow the box’s pointer in the same way that
444we did when `y` was a reference. Next, we’ll explore what is special about
445`Box<T>` that enables us to use the dereference operator by defining our own
446box type.
04454e1e
FG
447<!-- is there a benefit to using this method -- a box rather than a reference
448-- or is this just pure example? they seemed to be doing pretty much the same
449thing, I wasn't clear why you'd choose one over the other /LC -->
450<!-- This example is illustrating that the box and the reference behave the
451same in regards to the dereference operator. Differences and why to use `Box`
452are discussed before and after this section. I tried to clarify the purpose of
453this listing a bit in the paragraph just before the listing, does that help?
454/Carol -->
5099ac24
FG
455
456### Defining Our Own Smart Pointer
457
458Let’s build a smart pointer similar to the `Box<T>` type provided by the
459standard library to experience how smart pointers behave differently from
460references by default. Then we’ll look at how to add the ability to use the
461dereference operator.
462
463The `Box<T>` type is ultimately defined as a tuple struct with one element, so
464Listing 15-8 defines a `MyBox<T>` type in the same way. We’ll also define a
465`new` function to match the `new` function defined on `Box<T>`.
466
467Filename: src/main.rs
468
469```
470[1] struct MyBox<T>(T);
471
472impl<T> MyBox<T> {
473 [2] fn new(x: T) -> MyBox<T> {
474 [3] MyBox(x)
475 }
476}
477```
478
479Listing 15-8: Defining a `MyBox<T>` type
480
481We define a struct named `MyBox` and declare a generic parameter `T` [1],
482because we want our type to hold values of any type. The `MyBox` type is a
483tuple struct with one element of type `T`. The `MyBox::new` function takes one
484parameter of type `T` [2] and returns a `MyBox` instance that holds the value
485passed in [3].
486
487Let’s try adding the `main` function in Listing 15-7 to Listing 15-8 and
488changing it to use the `MyBox<T>` type we’ve defined instead of `Box<T>`. The
489code in Listing 15-9 won’t compile because Rust doesn’t know how to dereference
490`MyBox`.
491
492Filename: src/main.rs
493
494```
495fn main() {
496 let x = 5;
497 let y = MyBox::new(x);
498
499 assert_eq!(5, x);
500 assert_eq!(5, *y);
501}
502```
503
504Listing 15-9: Attempting to use `MyBox<T>` in the same way we used references
505and `Box<T>`
506
507Here’s the resulting compilation error:
508
509```
510error[E0614]: type `MyBox<{integer}>` cannot be dereferenced
511 --> src/main.rs:14:19
512 |
51314 | assert_eq!(5, *y);
514 | ^^
515```
516
517Our `MyBox<T>` type can’t be dereferenced because we haven’t implemented that
518ability on our type. To enable dereferencing with the `*` operator, we
519implement the `Deref` trait.
520
521### Treating a Type Like a Reference by Implementing the `Deref` Trait
522
523As discussed in the “Implementing a Trait on a Type” section of Chapter 10, to
524implement a trait, we need to provide implementations for the trait’s required
525methods. The `Deref` trait, provided by the standard library, requires us to
526implement one method named `deref` that borrows `self` and returns a reference
527to the inner data. Listing 15-10 contains an implementation of `Deref` to add
528to the definition of `MyBox`:
529
530Filename: src/main.rs
531
532```
533use std::ops::Deref;
534
535impl<T> Deref for MyBox<T> {
536 [1] type Target = T;
537
538 fn deref(&self) -> &Self::Target {
539 [2] &self.0
540 }
541}
542```
543
544Listing 15-10: Implementing `Deref` on `MyBox<T>`
545
546The `type Target = T;` syntax [1] defines an associated type for the `Deref`
547trait to use. Associated types are a slightly different way of declaring a
548generic parameter, but you don’t need to worry about them for now; we’ll cover
549them in more detail in Chapter 19.
550
551We fill in the body of the `deref` method with `&self.0` so `deref` returns a
04454e1e 552reference to the value we want to access with the `*` operator [2]; recall from
5099ac24
FG
553the “Using Tuple Structs without Named Fields to Create Different Types”
554section of Chapter 5 that `.0` accesses the first value in a tuple struct. The
555`main` function in Listing 15-9 that calls `*` on the `MyBox<T>` value now
556compiles, and the assertions pass!
557
558Without the `Deref` trait, the compiler can only dereference `&` references.
559The `deref` method gives the compiler the ability to take a value of any type
560that implements `Deref` and call the `deref` method to get a `&` reference that
561it knows how to dereference.
562
563When we entered `*y` in Listing 15-9, behind the scenes Rust actually ran this
564code:
565
566```
567*(y.deref())
568```
569
570Rust substitutes the `*` operator with a call to the `deref` method and then a
571plain dereference so we don’t have to think about whether or not we need to
572call the `deref` method. This Rust feature lets us write code that functions
573identically whether we have a regular reference or a type that implements
574`Deref`.
575
04454e1e
FG
576The reason the `deref` method returns a reference to a value, and that the
577plain dereference outside the parentheses in `*(y.deref())` is still necessary,
578is to do with the ownership system. If the `deref` method returned the value
579directly instead of a reference to the value, the value would be moved out of
580`self`. We don’t want to take ownership of the inner value inside `MyBox<T>` in
581this case or in most cases where we use the dereference operator.
5099ac24
FG
582
583Note that the `*` operator is replaced with a call to the `deref` method and
584then a call to the `*` operator just once, each time we use a `*` in our code.
585Because the substitution of the `*` operator does not recurse infinitely, we
586end up with data of type `i32`, which matches the `5` in `assert_eq!` in
587Listing 15-9.
588
589### Implicit Deref Coercions with Functions and Methods
590
04454e1e
FG
591*Deref coercion* converts a reference to a type that implements the `Deref`
592trait into a reference to another type. For example, deref coercion can convert
593`&String` to `&str` because `String` implements the `Deref` trait such that it
594returns `&str`. Deref conversion is a convenience Rust performs on arguments to
595functions and methods, and works only on types that implement the `Deref`
596trait. It happens automatically when we pass a reference to a particular type’s
597value as an argument to a function or method that doesn’t match the parameter
598type in the function or method definition. A sequence of calls to the `deref`
599method converts the type we provided into the type the parameter needs.
5099ac24
FG
600
601Deref coercion was added to Rust so that programmers writing function and
602method calls don’t need to add as many explicit references and dereferences
603with `&` and `*`. The deref coercion feature also lets us write more code that
604can work for either references or smart pointers.
605
606To see deref coercion in action, let’s use the `MyBox<T>` type we defined in
607Listing 15-8 as well as the implementation of `Deref` that we added in Listing
60815-10. Listing 15-11 shows the definition of a function that has a string slice
609parameter:
610
611Filename: src/main.rs
612
613```
614fn hello(name: &str) {
615 println!("Hello, {}!", name);
616}
617```
618
619Listing 15-11: A `hello` function that has the parameter `name` of type `&str`
620
621We can call the `hello` function with a string slice as an argument, such as
622`hello("Rust");` for example. Deref coercion makes it possible to call `hello`
623with a reference to a value of type `MyBox<String>`, as shown in Listing 15-12:
624
625Filename: src/main.rs
626
627```
628fn main() {
629 let m = MyBox::new(String::from("Rust"));
630 hello(&m);
631}
632```
633
634Listing 15-12: Calling `hello` with a reference to a `MyBox<String>` value,
635which works because of deref coercion
636
637Here we’re calling the `hello` function with the argument `&m`, which is a
638reference to a `MyBox<String>` value. Because we implemented the `Deref` trait
639on `MyBox<T>` in Listing 15-10, Rust can turn `&MyBox<String>` into `&String`
640by calling `deref`. The standard library provides an implementation of `Deref`
641on `String` that returns a string slice, and this is in the API documentation
642for `Deref`. Rust calls `deref` again to turn the `&String` into `&str`, which
643matches the `hello` function’s definition.
644
645If Rust didn’t implement deref coercion, we would have to write the code in
646Listing 15-13 instead of the code in Listing 15-12 to call `hello` with a value
647of type `&MyBox<String>`.
648
649Filename: src/main.rs
650
651```
652fn main() {
653 let m = MyBox::new(String::from("Rust"));
654 hello(&(*m)[..]);
655}
656```
657
658Listing 15-13: The code we would have to write if Rust didn’t have deref
659coercion
660
661The `(*m)` dereferences the `MyBox<String>` into a `String`. Then the `&` and
662`[..]` take a string slice of the `String` that is equal to the whole string to
04454e1e 663match the signature of `hello`. This code without deref coercions is harder to
5099ac24
FG
664read, write, and understand with all of these symbols involved. Deref coercion
665allows Rust to handle these conversions for us automatically.
666
667When the `Deref` trait is defined for the types involved, Rust will analyze the
668types and use `Deref::deref` as many times as necessary to get a reference to
669match the parameter’s type. The number of times that `Deref::deref` needs to be
670inserted is resolved at compile time, so there is no runtime penalty for taking
671advantage of deref coercion!
672
673### How Deref Coercion Interacts with Mutability
674
675Similar to how you use the `Deref` trait to override the `*` operator on
676immutable references, you can use the `DerefMut` trait to override the `*`
677operator on mutable references.
678
679Rust does deref coercion when it finds types and trait implementations in three
680cases:
681
682* From `&T` to `&U` when `T: Deref<Target=U>`
683* From `&mut T` to `&mut U` when `T: DerefMut<Target=U>`
684* From `&mut T` to `&U` when `T: Deref<Target=U>`
685
04454e1e
FG
686The first two cases are the same as each other except that the second
687implements mutability. The first case states that if you have a `&T`, and `T`
688implements `Deref` to some type `U`, you can get a `&U` transparently. The
689second case states that the same deref coercion happens for mutable references.
690
691<!-- the first two cases are the same as what? each other? I think so,
692so have edited as much, but wasn't sure /LC -->
693<!-- Yep! /Carol -->
5099ac24
FG
694
695The third case is trickier: Rust will also coerce a mutable reference to an
696immutable one. But the reverse is *not* possible: immutable references will
697never coerce to mutable references. Because of the borrowing rules, if you have
698a mutable reference, that mutable reference must be the only reference to that
699data (otherwise, the program wouldn’t compile). Converting one mutable
700reference to one immutable reference will never break the borrowing rules.
701Converting an immutable reference to a mutable reference would require that the
702initial immutable reference is the only immutable reference to that data, but
703the borrowing rules don’t guarantee that. Therefore, Rust can’t make the
704assumption that converting an immutable reference to a mutable reference is
705possible.
706
707## Running Code on Cleanup with the `Drop` Trait
708
709The second trait important to the smart pointer pattern is `Drop`, which lets
710you customize what happens when a value is about to go out of scope. You can
04454e1e
FG
711provide an implementation for the `Drop` trait on any type, and that code
712can be used to release resources like files or network connections.
713<!-- so we're saying Drop can be used in other contexts, this is just a
714convenient place to introduce it? /LC -->
715<!-- Yep! /Carol -->
5099ac24
FG
716We’re introducing `Drop` in the context of smart pointers because the
717functionality of the `Drop` trait is almost always used when implementing a
718smart pointer. For example, when a `Box<T>` is dropped it will deallocate the
719space on the heap that the box points to.
720
04454e1e
FG
721In some languages, for some types, the programmer must call code to free memory
722or resources every time they finish using an instance those types. Examples
723include file handles, sockets, or locks. If they forget, the system might
724become overloaded and crash. In Rust, you can specify that a particular bit of
725code be run whenever a value goes out of scope, and the compiler will insert
726this code automatically. As a result, you don’t need to be careful about
727placing cleanup code everywhere in a program that an instance of a particular
728type is finished with—you still won’t leak resources!
5099ac24 729
04454e1e 730You specify the code to run when a value goes out of scope by implementing the
5099ac24
FG
731`Drop` trait. The `Drop` trait requires you to implement one method named
732`drop` that takes a mutable reference to `self`. To see when Rust calls `drop`,
733let’s implement `drop` with `println!` statements for now.
734
735Listing 15-14 shows a `CustomSmartPointer` struct whose only custom
736functionality is that it will print `Dropping CustomSmartPointer!` when the
04454e1e 737instance goes out of scope, to show when Rust runs the `drop` function.
5099ac24
FG
738
739Filename: src/main.rs
740
741```
742struct CustomSmartPointer {
743 data: String,
744}
745
746[1] impl Drop for CustomSmartPointer {
747 fn drop(&mut self) {
748 [2] println!("Dropping CustomSmartPointer with data `{}`!", self.data);
749 }
750}
751
752fn main() {
753 [3] let c = CustomSmartPointer {
754 data: String::from("my stuff"),
755 };
756 [4] let d = CustomSmartPointer {
757 data: String::from("other stuff"),
758 };
759 [5] println!("CustomSmartPointers created.");
760[6] }
761```
762
763Listing 15-14: A `CustomSmartPointer` struct that implements the `Drop` trait
764where we would put our cleanup code
765
766The `Drop` trait is included in the prelude, so we don’t need to bring it into
767scope. We implement the `Drop` trait on `CustomSmartPointer` [1] and provide an
768implementation for the `drop` method that calls `println!` [2]. The body of the
769`drop` function is where you would place any logic that you wanted to run when
770an instance of your type goes out of scope. We’re printing some text here to
04454e1e 771demonstrate visually when Rust will call `drop`.
5099ac24
FG
772
773In `main`, we create two instances of `CustomSmartPointer` [3][4] and then
774print `CustomSmartPointers created` [5]. At the end of `main` [6], our
775instances of `CustomSmartPointer` will go out of scope, and Rust will call the
776code we put in the `drop` method [2], printing our final message. Note that we
777didn’t need to call the `drop` method explicitly.
778
779When we run this program, we’ll see the following output:
780
781```
782CustomSmartPointers created.
783Dropping CustomSmartPointer with data `other stuff`!
784Dropping CustomSmartPointer with data `my stuff`!
785```
786
787Rust automatically called `drop` for us when our instances went out of scope,
788calling the code we specified. Variables are dropped in the reverse order of
04454e1e
FG
789their creation, so `d` was dropped before `c`. This example's purpose is to
790give you a visual guide to how the `drop` method works; usually you would
791specify the cleanup code that your type needs to run rather than a print
792message.
5099ac24
FG
793
794### Dropping a Value Early with `std::mem::drop`
795
796Unfortunately, it’s not straightforward to disable the automatic `drop`
797functionality. Disabling `drop` isn’t usually necessary; the whole point of the
798`Drop` trait is that it’s taken care of automatically. Occasionally, however,
799you might want to clean up a value early. One example is when using smart
800pointers that manage locks: you might want to force the `drop` method that
801releases the lock so that other code in the same scope can acquire the lock.
802Rust doesn’t let you call the `Drop` trait’s `drop` method manually; instead
803you have to call the `std::mem::drop` function provided by the standard library
804if you want to force a value to be dropped before the end of its scope.
805
806If we try to call the `Drop` trait’s `drop` method manually by modifying the
807`main` function from Listing 15-14, as shown in Listing 15-15, we’ll get a
808compiler error:
809
810Filename: src/main.rs
811
812```
813fn main() {
814 let c = CustomSmartPointer {
815 data: String::from("some data"),
816 };
817 println!("CustomSmartPointer created.");
818 c.drop();
819 println!("CustomSmartPointer dropped before the end of main.");
820}
821```
822
823Listing 15-15: Attempting to call the `drop` method from the `Drop` trait
824manually to clean up early
825
826When we try to compile this code, we’ll get this error:
827
828```
829error[E0040]: explicit use of destructor method
830 --> src/main.rs:16:7
831 |
83216 | c.drop();
833 | --^^^^--
834 | | |
835 | | explicit destructor calls not allowed
836```
837
838This error message states that we’re not allowed to explicitly call `drop`. The
839error message uses the term *destructor*, which is the general programming term
840for a function that cleans up an instance. A *destructor* is analogous to a
841*constructor*, which creates an instance. The `drop` function in Rust is one
842particular destructor.
843
844Rust doesn’t let us call `drop` explicitly because Rust would still
04454e1e 845automatically call `drop` on the value at the end of `main`. This would cause a
5099ac24
FG
846*double free* error because Rust would be trying to clean up the same value
847twice.
848
849We can’t disable the automatic insertion of `drop` when a value goes out of
850scope, and we can’t call the `drop` method explicitly. So, if we need to force
04454e1e 851a value to be cleaned up early, we use the `std::mem::drop` function.
5099ac24
FG
852
853The `std::mem::drop` function is different from the `drop` method in the `Drop`
04454e1e
FG
854trait. We call it by passing as an argument the value we want to force drop.
855The function is in the prelude, so we can modify `main` in Listing 15-15 to
856call the `drop` function, as shown in Listing 15-16:
5099ac24
FG
857
858Filename: src/main.rs
859
860```
861fn main() {
862 let c = CustomSmartPointer {
863 data: String::from("some data"),
864 };
865 println!("CustomSmartPointer created.");
866 drop(c);
867 println!("CustomSmartPointer dropped before the end of main.");
868}
869```
870
871Listing 15-16: Calling `std::mem::drop` to explicitly drop a value before it
872goes out of scope
873
874Running this code will print the following:
875
876```
877CustomSmartPointer created.
878Dropping CustomSmartPointer with data `some data`!
879CustomSmartPointer dropped before the end of main.
880```
881
882The text ```Dropping CustomSmartPointer with data `some data`!``` is printed
883between the `CustomSmartPointer created.` and `CustomSmartPointer dropped
884before the end of main.` text, showing that the `drop` method code is called to
885drop `c` at that point.
886
887You can use code specified in a `Drop` trait implementation in many ways to
888make cleanup convenient and safe: for instance, you could use it to create your
889own memory allocator! With the `Drop` trait and Rust’s ownership system, you
890don’t have to remember to clean up because Rust does it automatically.
891
892You also don’t have to worry about problems resulting from accidentally
893cleaning up values still in use: the ownership system that makes sure
894references are always valid also ensures that `drop` gets called only once when
895the value is no longer being used.
896
897Now that we’ve examined `Box<T>` and some of the characteristics of smart
898pointers, let’s look at a few other smart pointers defined in the standard
899library.
900
901## `Rc<T>`, the Reference Counted Smart Pointer
902
903In the majority of cases, ownership is clear: you know exactly which variable
904owns a given value. However, there are cases when a single value might have
905multiple owners. For example, in graph data structures, multiple edges might
906point to the same node, and that node is conceptually owned by all of the edges
907that point to it. A node shouldn’t be cleaned up unless it doesn’t have any
04454e1e 908edges pointing to it and so has no owners.
5099ac24 909
04454e1e
FG
910<!-- Is this right, we always have to enable it explicitly? /LC -->
911<!-- Yep! /Carol -->
912You have to enable multiple ownership explicitly by using the Rust type
913`Rc<T>`, which is an abbreviation for *reference counting*. The `Rc<T>` type
914keeps track of the number of references to a value to determine whether or not
915the value is still in use. If there are zero references to a value, the value
916can be cleaned up without any references becoming invalid.
5099ac24
FG
917
918Imagine `Rc<T>` as a TV in a family room. When one person enters to watch TV,
919they turn it on. Others can come into the room and watch the TV. When the last
920person leaves the room, they turn off the TV because it’s no longer being used.
921If someone turns off the TV while others are still watching it, there would be
922uproar from the remaining TV watchers!
04454e1e 923<!-- ha, perfect analogy /LC -->
5099ac24
FG
924
925We use the `Rc<T>` type when we want to allocate some data on the heap for
926multiple parts of our program to read and we can’t determine at compile time
927which part will finish using the data last. If we knew which part would finish
928last, we could just make that part the data’s owner, and the normal ownership
929rules enforced at compile time would take effect.
930
931Note that `Rc<T>` is only for use in single-threaded scenarios. When we discuss
932concurrency in Chapter 16, we’ll cover how to do reference counting in
933multithreaded programs.
934
935### Using `Rc<T>` to Share Data
936
937Let’s return to our cons list example in Listing 15-5. Recall that we defined
938it using `Box<T>`. This time, we’ll create two lists that both share ownership
939of a third list. Conceptually, this looks similar to Figure 15-3:
940
941<img alt="Two lists that share ownership of a third list" src="img/trpl15-03.svg" class="center" />
942
943Figure 15-3: Two lists, `b` and `c`, sharing ownership of a third list, `a`
944
945We’ll create list `a` that contains 5 and then 10. Then we’ll make two more
946lists: `b` that starts with 3 and `c` that starts with 4. Both `b` and `c`
947lists will then continue on to the first `a` list containing 5 and 10. In other
948words, both lists will share the first list containing 5 and 10.
949
950Trying to implement this scenario using our definition of `List` with `Box<T>`
951won’t work, as shown in Listing 15-17:
952
953Filename: src/main.rs
954
955```
956enum List {
957 Cons(i32, Box<List>),
958 Nil,
959}
960
961use crate::List::{Cons, Nil};
962
963fn main() {
964 let a = Cons(5, Box::new(Cons(10, Box::new(Nil))));
965[1] let b = Cons(3, Box::new(a));
966[2] let c = Cons(4, Box::new(a));
967}
968```
969
970Listing 15-17: Demonstrating we’re not allowed to have two lists using `Box<T>`
971that try to share ownership of a third list
972
973When we compile this code, we get this error:
974
975```
976error[E0382]: use of moved value: `a`
977 --> src/main.rs:11:30
978 |
9799 | let a = Cons(5, Box::new(Cons(10, Box::new(Nil))));
980 | - move occurs because `a` has type `List`, which does not implement the `Copy` trait
98110 | let b = Cons(3, Box::new(a));
982 | - value moved here
98311 | let c = Cons(4, Box::new(a));
984 | ^ value used here after move
985```
986
987The `Cons` variants own the data they hold, so when we create the `b` list [1],
988`a` is moved into `b` and `b` owns `a`. Then, when we try to use `a` again when
989creating `c` [2], we’re not allowed to because `a` has been moved.
990
991We could change the definition of `Cons` to hold references instead, but then
992we would have to specify lifetime parameters. By specifying lifetime
993parameters, we would be specifying that every element in the list will live at
994least as long as the entire list. This is the case for the elements and lists
995in Listing 15-17, but not in every scenario.
996
997Instead, we’ll change our definition of `List` to use `Rc<T>` in place of
998`Box<T>`, as shown in Listing 15-18. Each `Cons` variant will now hold a value
999and an `Rc<T>` pointing to a `List`. When we create `b`, instead of taking
1000ownership of `a`, we’ll clone the `Rc<List>` that `a` is holding, thereby
1001increasing the number of references from one to two and letting `a` and `b`
1002share ownership of the data in that `Rc<List>`. We’ll also clone `a` when
1003creating `c`, increasing the number of references from two to three. Every time
1004we call `Rc::clone`, the reference count to the data within the `Rc<List>` will
1005increase, and the data won’t be cleaned up unless there are zero references to
1006it.
1007
1008Filename: src/main.rs
1009
1010```
1011enum List {
1012 Cons(i32, Rc<List>),
1013 Nil,
1014}
1015
1016use crate::List::{Cons, Nil};
1017[1] use std::rc::Rc;
1018
1019fn main() {
1020[2] let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil)))));
1021[3] let b = Cons(3, Rc::clone(&a));
1022[4] let c = Cons(4, Rc::clone(&a));
1023}
1024```
1025
1026Listing 15-18: A definition of `List` that uses `Rc<T>`
1027
1028We need to add a `use` statement to bring `Rc<T>` into scope [1] because it’s
1029not in the prelude. In `main`, we create the list holding 5 and 10 and store it
1030in a new `Rc<List>` in `a` [2]. Then when we create `b` [3] and `c` [4], we
1031call the `Rc::clone` function and pass a reference to the `Rc<List>` in `a` as
1032an argument.
1033
1034We could have called `a.clone()` rather than `Rc::clone(&a)`, but Rust’s
1035convention is to use `Rc::clone` in this case. The implementation of
1036`Rc::clone` doesn’t make a deep copy of all the data like most types’
1037implementations of `clone` do. The call to `Rc::clone` only increments the
1038reference count, which doesn’t take much time. Deep copies of data can take a
1039lot of time. By using `Rc::clone` for reference counting, we can visually
1040distinguish between the deep-copy kinds of clones and the kinds of clones that
1041increase the reference count. When looking for performance problems in the
1042code, we only need to consider the deep-copy clones and can disregard calls to
1043`Rc::clone`.
1044
1045### Cloning an `Rc<T>` Increases the Reference Count
1046
1047Let’s change our working example in Listing 15-18 so we can see the reference
1048counts changing as we create and drop references to the `Rc<List>` in `a`.
1049
1050In Listing 15-19, we’ll change `main` so it has an inner scope around list `c`;
1051then we can see how the reference count changes when `c` goes out of scope.
1052
1053Filename: src/main.rs
1054
1055```
1056fn main() {
1057 let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil)))));
1058 println!("count after creating a = {}", Rc::strong_count(&a));
1059 let b = Cons(3, Rc::clone(&a));
1060 println!("count after creating b = {}", Rc::strong_count(&a));
1061 {
1062 let c = Cons(4, Rc::clone(&a));
1063 println!("count after creating c = {}", Rc::strong_count(&a));
1064 }
1065 println!("count after c goes out of scope = {}", Rc::strong_count(&a));
1066}
1067```
1068
1069Listing 15-19: Printing the reference count
1070
1071At each point in the program where the reference count changes, we print the
04454e1e
FG
1072reference count, which we get by calling the `Rc::strong_count` function. This
1073function is named `strong_count` rather than `count` because the `Rc<T>` type
1074also has a `weak_count`; we’ll see what `weak_count` is used for in the
5099ac24
FG
1075“Preventing Reference Cycles: Turning an `Rc<T>` into a `Weak<T>`” section.
1076
1077This code prints the following:
1078
1079```
1080count after creating a = 1
1081count after creating b = 2
1082count after creating c = 3
1083count after c goes out of scope = 2
1084```
1085
1086We can see that the `Rc<List>` in `a` has an initial reference count of 1; then
1087each time we call `clone`, the count goes up by 1. When `c` goes out of scope,
1088the count goes down by 1. We don’t have to call a function to decrease the
1089reference count like we have to call `Rc::clone` to increase the reference
1090count: the implementation of the `Drop` trait decreases the reference count
1091automatically when an `Rc<T>` value goes out of scope.
1092
1093What we can’t see in this example is that when `b` and then `a` go out of scope
1094at the end of `main`, the count is then 0, and the `Rc<List>` is cleaned up
04454e1e
FG
1095completely. Using `Rc<T>` allows a single value to have multiple owners, and
1096the count ensures that the value remains valid as long as any of the owners
1097still exist.
5099ac24
FG
1098
1099Via immutable references, `Rc<T>` allows you to share data between multiple
1100parts of your program for reading only. If `Rc<T>` allowed you to have multiple
1101mutable references too, you might violate one of the borrowing rules discussed
1102in Chapter 4: multiple mutable borrows to the same place can cause data races
1103and inconsistencies. But being able to mutate data is very useful! In the next
1104section, we’ll discuss the interior mutability pattern and the `RefCell<T>`
1105type that you can use in conjunction with an `Rc<T>` to work with this
1106immutability restriction.
1107
1108## `RefCell<T>` and the Interior Mutability Pattern
1109
1110*Interior mutability* is a design pattern in Rust that allows you to mutate
1111data even when there are immutable references to that data; normally, this
1112action is disallowed by the borrowing rules. To mutate data, the pattern uses
1113`unsafe` code inside a data structure to bend Rust’s usual rules that govern
04454e1e
FG
1114mutation and borrowing. We haven’t yet covered unsafe code that indicates we're
1115checking the rules manually instead of the compiler checking them for us; we
1116will discuss unsafe code more in Chapter 19.
1117<!-- maybe quickly give a quick idea of what unsafe code means, at a really
1118high level, to let them know how it helps in this situation /LC -->
1119<!-- Done! /Carol -->
1120We can use types that use the interior mutability pattern only when we can
5099ac24
FG
1121ensure that the borrowing rules will be followed at runtime, even though the
1122compiler can’t guarantee that. The `unsafe` code involved is then wrapped in a
1123safe API, and the outer type is still immutable.
1124
1125Let’s explore this concept by looking at the `RefCell<T>` type that follows the
1126interior mutability pattern.
1127
1128### Enforcing Borrowing Rules at Runtime with `RefCell<T>`
1129
1130Unlike `Rc<T>`, the `RefCell<T>` type represents single ownership over the data
1131it holds. So, what makes `RefCell<T>` different from a type like `Box<T>`?
1132Recall the borrowing rules you learned in Chapter 4:
1133
04454e1e
FG
1134* At any given time, you can have *either* (but not both) one mutable reference
1135 or any number of immutable references.
5099ac24
FG
1136* References must always be valid.
1137
1138With references and `Box<T>`, the borrowing rules’ invariants are enforced at
1139compile time. With `RefCell<T>`, these invariants are enforced *at runtime*.
1140With references, if you break these rules, you’ll get a compiler error. With
1141`RefCell<T>`, if you break these rules, your program will panic and exit.
1142
1143The advantages of checking the borrowing rules at compile time are that errors
1144will be caught sooner in the development process, and there is no impact on
1145runtime performance because all the analysis is completed beforehand. For those
1146reasons, checking the borrowing rules at compile time is the best choice in the
1147majority of cases, which is why this is Rust’s default.
1148
1149The advantage of checking the borrowing rules at runtime instead is that
04454e1e
FG
1150certain memory-safe scenarios are then allowed, where they would’ve been
1151disallowed by the compile-time checks. Static analysis, like the Rust compiler,
1152is inherently conservative. Some properties of code are impossible to detect by
1153analyzing the code: the most famous example is the Halting Problem, which is
1154beyond the scope of this book but is an interesting topic to research.
5099ac24
FG
1155
1156Because some analysis is impossible, if the Rust compiler can’t be sure the
1157code complies with the ownership rules, it might reject a correct program; in
1158this way, it’s conservative. If Rust accepted an incorrect program, users
1159wouldn’t be able to trust in the guarantees Rust makes. However, if Rust
1160rejects a correct program, the programmer will be inconvenienced, but nothing
1161catastrophic can occur. The `RefCell<T>` type is useful when you’re sure your
1162code follows the borrowing rules but the compiler is unable to understand and
1163guarantee that.
1164
1165Similar to `Rc<T>`, `RefCell<T>` is only for use in single-threaded scenarios
1166and will give you a compile-time error if you try using it in a multithreaded
1167context. We’ll talk about how to get the functionality of `RefCell<T>` in a
1168multithreaded program in Chapter 16.
1169
1170Here is a recap of the reasons to choose `Box<T>`, `Rc<T>`, or `RefCell<T>`:
1171
1172* `Rc<T>` enables multiple owners of the same data; `Box<T>` and `RefCell<T>`
1173 have single owners.
1174* `Box<T>` allows immutable or mutable borrows checked at compile time; `Rc<T>`
1175 allows only immutable borrows checked at compile time; `RefCell<T>` allows
1176 immutable or mutable borrows checked at runtime.
1177* Because `RefCell<T>` allows mutable borrows checked at runtime, you can
1178 mutate the value inside the `RefCell<T>` even when the `RefCell<T>` is
1179 immutable.
1180
1181Mutating the value inside an immutable value is the *interior mutability*
1182pattern. Let’s look at a situation in which interior mutability is useful and
1183examine how it’s possible.
1184
1185### Interior Mutability: A Mutable Borrow to an Immutable Value
1186
1187A consequence of the borrowing rules is that when you have an immutable value,
1188you can’t borrow it mutably. For example, this code won’t compile:
1189
1190```
1191fn main() {
1192 let x = 5;
1193 let y = &mut x;
1194}
1195```
1196
1197If you tried to compile this code, you’d get the following error:
1198
1199```
1200error[E0596]: cannot borrow `x` as mutable, as it is not declared as mutable
1201 --> src/main.rs:3:13
1202 |
12032 | let x = 5;
1204 | - help: consider changing this to be mutable: `mut x`
12053 | let y = &mut x;
1206 | ^^^^^^ cannot borrow as mutable
1207```
1208
1209However, there are situations in which it would be useful for a value to mutate
1210itself in its methods but appear immutable to other code. Code outside the
1211value’s methods would not be able to mutate the value. Using `RefCell<T>` is
04454e1e 1212one way to get the ability to have interior mutability, but `RefCell<T>`
5099ac24
FG
1213doesn’t get around the borrowing rules completely: the borrow checker in the
1214compiler allows this interior mutability, and the borrowing rules are checked
1215at runtime instead. If you violate the rules, you’ll get a `panic!` instead of
1216a compiler error.
1217
1218Let’s work through a practical example where we can use `RefCell<T>` to mutate
1219an immutable value and see why that is useful.
1220
1221#### A Use Case for Interior Mutability: Mock Objects
1222
04454e1e
FG
1223Sometimes during testing a programmer will use a type in place of another type,
1224in order to observe particular behavior and assert it's implemented correctly.
1225<!-- why do we use other types -- to test different types with the code? Or you
1226mean they might use special types that offer particular functionality, like
1227mock objects do? /LC -->
1228<!-- I'm not sure I understand your question completely, but I've tried to
1229clarify by completing the sentence you left and adding a bit after this spot--
1230is it clearer now? /Carol -->
1231This placeholder type is called a *test double*. Think of it in the sense of a
1232"stunt double" in filmmaking, where a person steps in and substitutes for an
1233actor to do a particular tricky scene. Test doubles stand in for other types
1234when we're running tests. *Mock objects* are specific types of test doubles
5099ac24
FG
1235that record what happens during a test so you can assert that the correct
1236actions took place.
1237
1238Rust doesn’t have objects in the same sense as other languages have objects,
1239and Rust doesn’t have mock object functionality built into the standard library
1240as some other languages do. However, you can definitely create a struct that
1241will serve the same purposes as a mock object.
1242
1243Here’s the scenario we’ll test: we’ll create a library that tracks a value
1244against a maximum value and sends messages based on how close to the maximum
1245value the current value is. This library could be used to keep track of a
1246user’s quota for the number of API calls they’re allowed to make, for example.
1247
1248Our library will only provide the functionality of tracking how close to the
1249maximum a value is and what the messages should be at what times. Applications
1250that use our library will be expected to provide the mechanism for sending the
1251messages: the application could put a message in the application, send an
1252email, send a text message, or something else. The library doesn’t need to know
1253that detail. All it needs is something that implements a trait we’ll provide
1254called `Messenger`. Listing 15-20 shows the library code:
1255
1256Filename: src/lib.rs
1257
1258```
1259pub trait Messenger {
1260[1] fn send(&self, msg: &str);
1261}
1262
1263pub struct LimitTracker<'a, T: Messenger> {
1264 messenger: &'a T,
1265 value: usize,
1266 max: usize,
1267}
1268
1269impl<'a, T> LimitTracker<'a, T>
1270where
1271 T: Messenger,
1272{
04454e1e 1273 pub fn new(messenger: &'a T, max: usize) -> LimitTracker<'a, T> {
5099ac24
FG
1274 LimitTracker {
1275 messenger,
1276 value: 0,
1277 max,
1278 }
1279 }
1280
1281[2] pub fn set_value(&mut self, value: usize) {
1282 self.value = value;
1283
1284 let percentage_of_max = self.value as f64 / self.max as f64;
1285
1286 if percentage_of_max >= 1.0 {
1287 self.messenger.send("Error: You are over your quota!");
1288 } else if percentage_of_max >= 0.9 {
1289 self.messenger
1290 .send("Urgent warning: You've used up over 90% of your quota!");
1291 } else if percentage_of_max >= 0.75 {
1292 self.messenger
1293 .send("Warning: You've used up over 75% of your quota!");
1294 }
1295 }
1296}
1297```
1298
1299Listing 15-20: A library to keep track of how close a value is to a maximum
1300value and warn when the value is at certain levels
1301
1302One important part of this code is that the `Messenger` trait has one method
1303called `send` that takes an immutable reference to `self` and the text of the
1304message [1]. This trait is the interface our mock object needs to implement so
1305that the mock can be used in the same way a real object is. The other important
1306part is that we want to test the behavior of the `set_value` method on the
1307`LimitTracker` [2]. We can change what we pass in for the `value` parameter,
1308but `set_value` doesn’t return anything for us to make assertions on. We want
1309to be able to say that if we create a `LimitTracker` with something that
1310implements the `Messenger` trait and a particular value for `max`, when we pass
1311different numbers for `value`, the messenger is told to send the appropriate
1312messages.
1313
1314We need a mock object that, instead of sending an email or text message when we
1315call `send`, will only keep track of the messages it’s told to send. We can
1316create a new instance of the mock object, create a `LimitTracker` that uses the
1317mock object, call the `set_value` method on `LimitTracker`, and then check that
1318the mock object has the messages we expect. Listing 15-21 shows an attempt to
1319implement a mock object to do just that, but the borrow checker won’t allow it:
1320
1321Filename: src/lib.rs
1322
1323```
1324#[cfg(test)]
1325mod tests {
1326 use super::*;
1327
1328 [1] struct MockMessenger {
1329 [2] sent_messages: Vec<String>,
1330 }
1331
1332 impl MockMessenger {
1333 [3] fn new() -> MockMessenger {
1334 MockMessenger {
1335 sent_messages: vec![],
1336 }
1337 }
1338 }
1339
1340 [4] impl Messenger for MockMessenger {
1341 fn send(&self, message: &str) {
1342 [5] self.sent_messages.push(String::from(message));
1343 }
1344 }
1345
1346 #[test]
1347 [6] fn it_sends_an_over_75_percent_warning_message() {
1348 let mock_messenger = MockMessenger::new();
1349 let mut limit_tracker = LimitTracker::new(&mock_messenger, 100);
1350
1351 limit_tracker.set_value(80);
1352
1353 assert_eq!(mock_messenger.sent_messages.len(), 1);
1354 }
1355}
1356```
1357
1358Listing 15-21: An attempt to implement a `MockMessenger` that isn’t allowed by
1359the borrow checker
1360
1361This test code defines a `MockMessenger` struct [1] that has a `sent_messages`
1362field with a `Vec` of `String` values [2] to keep track of the messages it’s
1363told to send. We also define an associated function `new` [3] to make it
1364convenient to create new `MockMessenger` values that start with an empty list
1365of messages. We then implement the `Messenger` trait for `MockMessenger` [4] so
1366we can give a `MockMessenger` to a `LimitTracker`. In the definition of the
1367`send` method [5], we take the message passed in as a parameter and store it in
1368the `MockMessenger` list of `sent_messages`.
1369
1370In the test, we’re testing what happens when the `LimitTracker` is told to set
1371`value` to something that is more than 75 percent of the `max` value [6].
1372First, we create a new `MockMessenger`, which will start with an empty list of
1373messages. Then we create a new `LimitTracker` and give it a reference to the
1374new `MockMessenger` and a `max` value of 100. We call the `set_value` method on
1375the `LimitTracker` with a value of 80, which is more than 75 percent of 100.
1376Then we assert that the list of messages that the `MockMessenger` is keeping
1377track of should now have one message in it.
1378
1379However, there’s one problem with this test, as shown here:
1380
1381```
1382error[E0596]: cannot borrow `self.sent_messages` as mutable, as it is behind a `&` reference
1383 --> src/lib.rs:58:13
1384 |
13852 | fn send(&self, msg: &str);
1386 | ----- help: consider changing that to be a mutable reference: `&mut self`
1387...
138858 | self.sent_messages.push(String::from(message));
1389 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `self` is a `&` reference, so the data it refers to cannot be borrowed as mutable
1390```
1391
1392We can’t modify the `MockMessenger` to keep track of the messages, because the
1393`send` method takes an immutable reference to `self`. We also can’t take the
1394suggestion from the error text to use `&mut self` instead, because then the
1395signature of `send` wouldn’t match the signature in the `Messenger` trait
1396definition (feel free to try and see what error message you get).
1397
1398This is a situation in which interior mutability can help! We’ll store the
1399`sent_messages` within a `RefCell<T>`, and then the `send` method will be
1400able to modify `sent_messages` to store the messages we’ve seen. Listing 15-22
1401shows what that looks like:
1402
1403Filename: src/lib.rs
1404
1405```
1406#[cfg(test)]
1407mod tests {
1408 use super::*;
1409 use std::cell::RefCell;
1410
1411 struct MockMessenger {
1412 [1] sent_messages: RefCell<Vec<String>>,
1413 }
1414
1415 impl MockMessenger {
1416 fn new() -> MockMessenger {
1417 MockMessenger {
1418 sent_messages: RefCell::new(vec![]) [2],
1419 }
1420 }
1421 }
1422
1423 impl Messenger for MockMessenger {
1424 fn send(&self, message: &str) {
1425 [3] self.sent_messages.borrow_mut().push(String::from(message));
1426 }
1427 }
1428
1429 #[test]
1430 fn it_sends_an_over_75_percent_warning_message() {
1431 // --snip--
1432
1433 [4] assert_eq!(mock_messenger.sent_messages.borrow().len(), 1);
1434 }
1435}
1436```
1437
1438Listing 15-22: Using `RefCell<T>` to mutate an inner value while the outer
1439value is considered immutable
1440
1441The `sent_messages` field is now of type `RefCell<Vec<String>>` [1] instead of
1442`Vec<String>`. In the `new` function, we create a new `RefCell<Vec<String>>`
1443instance around the empty vector [2].
1444
1445For the implementation of the `send` method, the first parameter is still an
1446immutable borrow of `self`, which matches the trait definition. We call
1447`borrow_mut` on the `RefCell<Vec<String>>` in `self.sent_messages` [3] to get a
1448mutable reference to the value inside the `RefCell<Vec<String>>`, which is
1449the vector. Then we can call `push` on the mutable reference to the vector to
1450keep track of the messages sent during the test.
1451
1452The last change we have to make is in the assertion: to see how many items are
1453in the inner vector, we call `borrow` on the `RefCell<Vec<String>>` to get an
1454immutable reference to the vector [4].
1455
1456Now that you’ve seen how to use `RefCell<T>`, let’s dig into how it works!
1457
1458#### Keeping Track of Borrows at Runtime with `RefCell<T>`
1459
1460When creating immutable and mutable references, we use the `&` and `&mut`
1461syntax, respectively. With `RefCell<T>`, we use the `borrow` and `borrow_mut`
1462methods, which are part of the safe API that belongs to `RefCell<T>`. The
1463`borrow` method returns the smart pointer type `Ref<T>`, and `borrow_mut`
1464returns the smart pointer type `RefMut<T>`. Both types implement `Deref`, so we
1465can treat them like regular references.
1466
1467The `RefCell<T>` keeps track of how many `Ref<T>` and `RefMut<T>` smart
1468pointers are currently active. Every time we call `borrow`, the `RefCell<T>`
1469increases its count of how many immutable borrows are active. When a `Ref<T>`
1470value goes out of scope, the count of immutable borrows goes down by one. Just
1471like the compile-time borrowing rules, `RefCell<T>` lets us have many immutable
1472borrows or one mutable borrow at any point in time.
1473
1474If we try to violate these rules, rather than getting a compiler error as we
1475would with references, the implementation of `RefCell<T>` will panic at
1476runtime. Listing 15-23 shows a modification of the implementation of `send` in
1477Listing 15-22. We’re deliberately trying to create two mutable borrows active
1478for the same scope to illustrate that `RefCell<T>` prevents us from doing this
1479at runtime.
1480
1481Filename: src/lib.rs
1482
1483```
1484 impl Messenger for MockMessenger {
1485 fn send(&self, message: &str) {
1486 let mut one_borrow = self.sent_messages.borrow_mut();
1487 let mut two_borrow = self.sent_messages.borrow_mut();
1488
1489 one_borrow.push(String::from(message));
1490 two_borrow.push(String::from(message));
1491 }
1492 }
1493```
1494
1495Listing 15-23: Creating two mutable references in the same scope to see that
1496`RefCell<T>` will panic
1497
1498We create a variable `one_borrow` for the `RefMut<T>` smart pointer returned
1499from `borrow_mut`. Then we create another mutable borrow in the same way in the
1500variable `two_borrow`. This makes two mutable references in the same scope,
1501which isn’t allowed. When we run the tests for our library, the code in Listing
150215-23 will compile without any errors, but the test will fail:
1503
1504```
1505---- tests::it_sends_an_over_75_percent_warning_message stdout ----
1506thread 'main' panicked at 'already borrowed: BorrowMutError', src/lib.rs:60:53
1507note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
1508```
1509
1510Notice that the code panicked with the message `already borrowed:
1511BorrowMutError`. This is how `RefCell<T>` handles violations of the borrowing
1512rules at runtime.
1513
04454e1e
FG
1514Choosing to catch borrowing errors at runtime rather than compile time, as
1515we've done here, means you'd potentially be finding mistakes in your code later
1516in the development process: possibly not until your code was deployed to
1517production. Also, your code would incur a small runtime performance penalty as
1518a result of keeping track of the borrows at runtime rather than compile time.
1519However, using `RefCell<T>` makes it possible to write a mock object that can
1520modify itself to keep track of the messages it has seen while you’re using it
1521in a context where only immutable values are allowed. You can use `RefCell<T>`
1522despite its trade-offs to get more functionality than regular references
1523provide.
5099ac24
FG
1524
1525### Having Multiple Owners of Mutable Data by Combining `Rc<T>` and `RefCell<T>`
1526
1527A common way to use `RefCell<T>` is in combination with `Rc<T>`. Recall that
1528`Rc<T>` lets you have multiple owners of some data, but it only gives immutable
1529access to that data. If you have an `Rc<T>` that holds a `RefCell<T>`, you can
1530get a value that can have multiple owners *and* that you can mutate!
1531
1532For example, recall the cons list example in Listing 15-18 where we used
1533`Rc<T>` to allow multiple lists to share ownership of another list. Because
1534`Rc<T>` holds only immutable values, we can’t change any of the values in the
1535list once we’ve created them. Let’s add in `RefCell<T>` to gain the ability to
1536change the values in the lists. Listing 15-24 shows that by using a
1537`RefCell<T>` in the `Cons` definition, we can modify the value stored in all
1538the lists:
1539
1540Filename: src/main.rs
1541
1542```
1543#[derive(Debug)]
1544enum List {
1545 Cons(Rc<RefCell<i32>>, Rc<List>),
1546 Nil,
1547}
1548
1549use crate::List::{Cons, Nil};
1550use std::cell::RefCell;
1551use std::rc::Rc;
1552
1553fn main() {
1554 [1] let value = Rc::new(RefCell::new(5));
1555
1556 [2] let a = Rc::new(Cons(Rc::clone(&value), Rc::new(Nil)));
1557
1558 let b = Cons(Rc::new(RefCell::new(3)), Rc::clone(&a));
1559 let c = Cons(Rc::new(RefCell::new(4)), Rc::clone(&a));
1560
1561 [3] *value.borrow_mut() += 10;
1562
1563 println!("a after = {:?}", a);
1564 println!("b after = {:?}", b);
1565 println!("c after = {:?}", c);
1566}
1567```
1568
1569Listing 15-24: Using `Rc<RefCell<i32>>` to create a `List` that we can mutate
1570
1571We create a value that is an instance of `Rc<RefCell<i32>>` and store it in a
1572variable named `value` [1] so we can access it directly later. Then we create a
1573`List` in `a` with a `Cons` variant that holds `value` [2]. We need to clone
1574`value` so both `a` and `value` have ownership of the inner `5` value rather
1575than transferring ownership from `value` to `a` or having `a` borrow from
1576`value`.
1577
1578We wrap the list `a` in an `Rc<T>` so when we create lists `b` and `c`, they
1579can both refer to `a`, which is what we did in Listing 15-18.
1580
04454e1e
FG
1581After we’ve created the lists in `a`, `b`, and `c`, we want to add 10 to the
1582value in `value` [3]. We do this by calling `borrow_mut` on `value`, which uses
1583the automatic dereferencing feature we discussed in Chapter 5 (see the section
5099ac24
FG
1584“Where’s the `->` Operator?”) to dereference the `Rc<T>` to the inner
1585`RefCell<T>` value. The `borrow_mut` method returns a `RefMut<T>` smart
1586pointer, and we use the dereference operator on it and change the inner value.
1587
1588When we print `a`, `b`, and `c`, we can see that they all have the modified
1589value of 15 rather than 5:
1590
1591```
1592a after = Cons(RefCell { value: 15 }, Nil)
1593b after = Cons(RefCell { value: 3 }, Cons(RefCell { value: 15 }, Nil))
1594c after = Cons(RefCell { value: 4 }, Cons(RefCell { value: 15 }, Nil))
1595```
1596
1597This technique is pretty neat! By using `RefCell<T>`, we have an outwardly
1598immutable `List` value. But we can use the methods on `RefCell<T>` that provide
1599access to its interior mutability so we can modify our data when we need to.
1600The runtime checks of the borrowing rules protect us from data races, and it’s
1601sometimes worth trading a bit of speed for this flexibility in our data
1602structures.
1603
1604The standard library has other types that provide interior mutability, such as
1605`Cell<T>`, which is similar except that instead of giving references to the
1606inner value, the value is copied in and out of the `Cell<T>`. There’s also
1607`Mutex<T>`, which offers interior mutability that’s safe to use across threads;
1608we’ll discuss its use in Chapter 16. Check out the standard library docs for
1609more details on the differences between these types.
1610
1611## Reference Cycles Can Leak Memory
1612
1613Rust’s memory safety guarantees make it difficult, but not impossible, to
1614accidentally create memory that is never cleaned up (known as a *memory leak*).
1615Preventing memory leaks entirely is not one of Rust’s guarantees, meaning
1616memory leaks are memory safe in Rust. We can see that Rust allows memory leaks
1617by using `Rc<T>` and `RefCell<T>`: it’s possible to create references where
1618items refer to each other in a cycle. This creates memory leaks because the
1619reference count of each item in the cycle will never reach 0, and the values
1620will never be dropped.
1621
1622### Creating a Reference Cycle
1623
1624Let’s look at how a reference cycle might happen and how to prevent it,
1625starting with the definition of the `List` enum and a `tail` method in Listing
162615-25:
1627
1628Filename: src/main.rs
1629
1630```
1631use crate::List::{Cons, Nil};
1632use std::cell::RefCell;
1633use std::rc::Rc;
1634
1635#[derive(Debug)]
1636enum List {
1637 [1] Cons(i32, RefCell<Rc<List>>),
1638 Nil,
1639}
1640
1641impl List {
1642 [2] fn tail(&self) -> Option<&RefCell<Rc<List>>> {
1643 match self {
1644 Cons(_, item) => Some(item),
1645 Nil => None,
1646 }
1647 }
1648}
1649```
1650
1651Listing 15-25: A cons list definition that holds a `RefCell<T>` so we can
1652modify what a `Cons` variant is referring to
1653
1654We’re using another variation of the `List` definition from Listing 15-5. The
1655second element in the `Cons` variant is now `RefCell<Rc<List>>` [1], meaning
1656that instead of having the ability to modify the `i32` value as we did in
04454e1e 1657Listing 15-24, we want to modify the `List` value a `Cons` variant is
5099ac24
FG
1658pointing to. We’re also adding a `tail` method [2] to make it convenient for us
1659to access the second item if we have a `Cons` variant.
1660
1661In Listing 15-26, we’re adding a `main` function that uses the definitions in
1662Listing 15-25. This code creates a list in `a` and a list in `b` that points to
1663the list in `a`. Then it modifies the list in `a` to point to `b`, creating a
1664reference cycle. There are `println!` statements along the way to show what the
1665reference counts are at various points in this process.
1666
1667Filename: src/main.rs
1668
1669```
1670fn main() {
1671 [1] let a = Rc::new(Cons(5, RefCell::new(Rc::new(Nil))));
1672
1673 println!("a initial rc count = {}", Rc::strong_count(&a));
1674 println!("a next item = {:?}", a.tail());
1675
1676 [2] let b = Rc::new(Cons(10, RefCell::new(Rc::clone(&a))));
1677
1678 println!("a rc count after b creation = {}", Rc::strong_count(&a));
1679 println!("b initial rc count = {}", Rc::strong_count(&b));
1680 println!("b next item = {:?}", b.tail());
1681
1682 [3] if let Some(link) = a.tail() {
1683 [4] *link.borrow_mut() = Rc::clone(&b);
1684 }
1685
1686 println!("b rc count after changing a = {}", Rc::strong_count(&b));
1687 println!("a rc count after changing a = {}", Rc::strong_count(&a));
1688
1689 // Uncomment the next line to see that we have a cycle;
1690 // it will overflow the stack
1691 // println!("a next item = {:?}", a.tail());
1692}
1693```
1694
1695Listing 15-26: Creating a reference cycle of two `List` values pointing to each
1696other
1697
1698We create an `Rc<List>` instance holding a `List` value in the variable `a`
1699with an initial list of `5, Nil` [1]. We then create an `Rc<List>` instance
1700holding another `List` value in the variable `b` that contains the value 10 and
1701points to the list in `a` [2].
1702
1703We modify `a` so it points to `b` instead of `Nil`, creating a cycle. We do
1704that by using the `tail` method to get a reference to the `RefCell<Rc<List>>`
1705in `a`, which we put in the variable `link` [3]. Then we use the `borrow_mut`
1706method on the `RefCell<Rc<List>>` to change the value inside from an `Rc<List>`
1707that holds a `Nil` value to the `Rc<List>` in `b` [4].
1708
1709When we run this code, keeping the last `println!` commented out for the
1710moment, we’ll get this output:
1711
1712```
1713a initial rc count = 1
1714a next item = Some(RefCell { value: Nil })
1715a rc count after b creation = 2
1716b initial rc count = 1
1717b next item = Some(RefCell { value: Cons(5, RefCell { value: Nil }) })
1718b rc count after changing a = 2
1719a rc count after changing a = 2
1720```
1721
1722The reference count of the `Rc<List>` instances in both `a` and `b` are 2 after
1723we change the list in `a` to point to `b`. At the end of `main`, Rust drops the
1724variable `b`, which decreases the reference count of the `Rc<List>` instance
1725from 2 to 1. The memory that `Rc<List>` has on the heap won’t be dropped at
1726this point, because its reference count is 1, not 0. Then Rust drops `a`, which
1727decreases the reference count of the `a` `Rc<List>` instance from 2 to 1 as
1728well. This instance’s memory can’t be dropped either, because the other
1729`Rc<List>` instance still refers to it. The memory allocated to the list will
1730remain uncollected forever. To visualize this reference cycle, we’ve created a
1731diagram in Figure 15-4.
1732
1733<img alt="Reference cycle of lists" src="img/trpl15-04.svg" class="center" />
1734
1735Figure 15-4: A reference cycle of lists `a` and `b` pointing to each other
1736
1737If you uncomment the last `println!` and run the program, Rust will try to
1738print this cycle with `a` pointing to `b` pointing to `a` and so forth until it
1739overflows the stack.
1740
04454e1e
FG
1741<!-- "in this case" meaning with the last print statement commented out? /LC -->
1742<!-- Technically with both the println commented out and not. I've tried to
1743clarify the purpose of this paragraph earlier /Carol -->
1744Compared to a real-world program, the consequences creating a reference cycle
1745in this example aren’t very dire: right after we create the reference cycle,
1746the program ends. However, if a more complex program allocated lots of memory
1747in a cycle and held onto it for a long time, the program would use more memory
1748than it needed and might overwhelm the system, causing it to run out of
1749available memory.
5099ac24
FG
1750
1751Creating reference cycles is not easily done, but it’s not impossible either.
1752If you have `RefCell<T>` values that contain `Rc<T>` values or similar nested
1753combinations of types with interior mutability and reference counting, you must
1754ensure that you don’t create cycles; you can’t rely on Rust to catch them.
1755Creating a reference cycle would be a logic bug in your program that you should
1756use automated tests, code reviews, and other software development practices to
1757minimize.
1758
1759Another solution for avoiding reference cycles is reorganizing your data
1760structures so that some references express ownership and some references don’t.
1761As a result, you can have cycles made up of some ownership relationships and
1762some non-ownership relationships, and only the ownership relationships affect
1763whether or not a value can be dropped. In Listing 15-25, we always want `Cons`
1764variants to own their list, so reorganizing the data structure isn’t possible.
1765Let’s look at an example using graphs made up of parent nodes and child nodes
1766to see when non-ownership relationships are an appropriate way to prevent
1767reference cycles.
1768
1769### Preventing Reference Cycles: Turning an `Rc<T>` into a `Weak<T>`
1770
1771So far, we’ve demonstrated that calling `Rc::clone` increases the
1772`strong_count` of an `Rc<T>` instance, and an `Rc<T>` instance is only cleaned
1773up if its `strong_count` is 0. You can also create a *weak reference* to the
1774value within an `Rc<T>` instance by calling `Rc::downgrade` and passing a
04454e1e
FG
1775reference to the `Rc<T>`. Strong references are how you can share ownership of
1776an `Rc<T>` instance. Weak references don’t express an ownership relationship,
1777and their count doesn't affect when an `Rc<T>` instance is cleaned up. They
1778won’t cause a reference cycle because any cycle involving some weak references
1779will be broken once the strong reference count of values involved is 0.
1780
1781<!-- can you explain what a weak reference does/is for? By the end of this
1782paragraph I wasn't entirely sure. It would be good to have a high level
1783succinct definition /LC -->
1784<!-- I've moved the explanation that was in the next paragraph to this spot and
1785modified it a bit, does this work? /Carol -->
1786
1787When you call `Rc::downgrade`, you get a smart pointer of type `Weak<T>`.
1788Instead of increasing the `strong_count` in the `Rc<T>` instance by 1, calling
1789`Rc::downgrade` increases the `weak_count` by 1. The `Rc<T>` type uses
1790`weak_count` to keep track of how many `Weak<T>` references exist, similar to
1791`strong_count`. The difference is the `weak_count` doesn’t need to be 0 for the
1792`Rc<T>` instance to be cleaned up.
5099ac24
FG
1793
1794Because the value that `Weak<T>` references might have been dropped, to do
1795anything with the value that a `Weak<T>` is pointing to, you must make sure the
1796value still exists. Do this by calling the `upgrade` method on a `Weak<T>`
1797instance, which will return an `Option<Rc<T>>`. You’ll get a result of `Some`
1798if the `Rc<T>` value has not been dropped yet and a result of `None` if the
1799`Rc<T>` value has been dropped. Because `upgrade` returns an `Option<Rc<T>>`,
1800Rust will ensure that the `Some` case and the `None` case are handled, and
1801there won’t be an invalid pointer.
1802
1803As an example, rather than using a list whose items know only about the next
1804item, we’ll create a tree whose items know about their children items *and*
1805their parent items.
1806
1807#### Creating a Tree Data Structure: a `Node` with Child Nodes
1808
1809To start, we’ll build a tree with nodes that know about their child nodes.
1810We’ll create a struct named `Node` that holds its own `i32` value as well as
1811references to its children `Node` values:
1812
1813Filename: src/main.rs
1814
1815```
1816use std::cell::RefCell;
1817use std::rc::Rc;
1818
1819#[derive(Debug)]
1820struct Node {
1821 value: i32,
1822 children: RefCell<Vec<Rc<Node>>>,
1823}
1824```
1825
1826We want a `Node` to own its children, and we want to share that ownership with
1827variables so we can access each `Node` in the tree directly. To do this, we
1828define the `Vec<T>` items to be values of type `Rc<Node>`. We also want to
1829modify which nodes are children of another node, so we have a `RefCell<T>` in
1830`children` around the `Vec<Rc<Node>>`.
1831
1832Next, we’ll use our struct definition and create one `Node` instance named
1833`leaf` with the value 3 and no children, and another instance named `branch`
1834with the value 5 and `leaf` as one of its children, as shown in Listing 15-27:
1835
1836Filename: src/main.rs
1837
1838```
1839fn main() {
1840 let leaf = Rc::new(Node {
1841 value: 3,
1842 children: RefCell::new(vec![]),
1843 });
1844
1845 let branch = Rc::new(Node {
1846 value: 5,
1847 children: RefCell::new(vec![Rc::clone(&leaf)]),
1848 });
1849}
1850```
1851
1852Listing 15-27: Creating a `leaf` node with no children and a `branch` node with
1853`leaf` as one of its children
1854
1855We clone the `Rc<Node>` in `leaf` and store that in `branch`, meaning the
1856`Node` in `leaf` now has two owners: `leaf` and `branch`. We can get from
1857`branch` to `leaf` through `branch.children`, but there’s no way to get from
1858`leaf` to `branch`. The reason is that `leaf` has no reference to `branch` and
1859doesn’t know they’re related. We want `leaf` to know that `branch` is its
1860parent. We’ll do that next.
1861
1862#### Adding a Reference from a Child to Its Parent
1863
1864To make the child node aware of its parent, we need to add a `parent` field to
1865our `Node` struct definition. The trouble is in deciding what the type of
1866`parent` should be. We know it can’t contain an `Rc<T>`, because that would
1867create a reference cycle with `leaf.parent` pointing to `branch` and
1868`branch.children` pointing to `leaf`, which would cause their `strong_count`
1869values to never be 0.
1870
1871Thinking about the relationships another way, a parent node should own its
1872children: if a parent node is dropped, its child nodes should be dropped as
1873well. However, a child should not own its parent: if we drop a child node, the
1874parent should still exist. This is a case for weak references!
1875
1876So instead of `Rc<T>`, we’ll make the type of `parent` use `Weak<T>`,
1877specifically a `RefCell<Weak<Node>>`. Now our `Node` struct definition looks
1878like this:
1879
1880Filename: src/main.rs
1881
1882```
1883use std::cell::RefCell;
1884use std::rc::{Rc, Weak};
1885
1886#[derive(Debug)]
1887struct Node {
1888 value: i32,
1889 parent: RefCell<Weak<Node>>,
1890 children: RefCell<Vec<Rc<Node>>>,
1891}
1892```
1893
1894A node will be able to refer to its parent node but doesn’t own its parent.
1895In Listing 15-28, we update `main` to use this new definition so the `leaf`
1896node will have a way to refer to its parent, `branch`:
1897
1898Filename: src/main.rs
1899
1900```
1901fn main() {
1902 let leaf = Rc::new(Node {
1903 value: 3,
1904 [1] parent: RefCell::new(Weak::new()),
1905 children: RefCell::new(vec![]),
1906 });
1907
1908 [2] println!("leaf parent = {:?}", leaf.parent.borrow().upgrade());
1909
1910 let branch = Rc::new(Node {
1911 value: 5,
1912 [3] parent: RefCell::new(Weak::new()),
1913 children: RefCell::new(vec![Rc::clone(&leaf)]),
1914 });
1915
1916 [4] *leaf.parent.borrow_mut() = Rc::downgrade(&branch);
1917
1918 [5] println!("leaf parent = {:?}", leaf.parent.borrow().upgrade());
1919}
1920```
1921
1922Listing 15-28: A `leaf` node with a weak reference to its parent node `branch`
1923
04454e1e
FG
1924Creating the `leaf` node looks similar to Listing 15-27 with the exception of
1925the `parent` field: `leaf` starts out without a parent, so we create a new,
1926empty `Weak<Node>` reference instance [1].
5099ac24
FG
1927
1928At this point, when we try to get a reference to the parent of `leaf` by using
1929the `upgrade` method, we get a `None` value. We see this in the output from the
1930first `println!` statement [2]:
1931
1932```
1933leaf parent = None
1934```
1935
1936When we create the `branch` node, it will also have a new `Weak<Node>`
1937reference in the `parent` field [3], because `branch` doesn’t have a parent
1938node. We still have `leaf` as one of the children of `branch`. Once we have the
1939`Node` instance in `branch`, we can modify `leaf` to give it a `Weak<Node>`
1940reference to its parent [4]. We use the `borrow_mut` method on the
1941`RefCell<Weak<Node>>` in the `parent` field of `leaf`, and then we use the
1942`Rc::downgrade` function to create a `Weak<Node>` reference to `branch` from
1943the `Rc<Node>` in `branch.`
1944
1945When we print the parent of `leaf` again [5], this time we’ll get a `Some`
1946variant holding `branch`: now `leaf` can access its parent! When we print
1947`leaf`, we also avoid the cycle that eventually ended in a stack overflow like
1948we had in Listing 15-26; the `Weak<Node>` references are printed as `(Weak)`:
1949
1950```
1951leaf parent = Some(Node { value: 5, parent: RefCell { value: (Weak) },
1952children: RefCell { value: [Node { value: 3, parent: RefCell { value: (Weak) },
1953children: RefCell { value: [] } }] } })
1954```
1955
1956The lack of infinite output indicates that this code didn’t create a reference
1957cycle. We can also tell this by looking at the values we get from calling
1958`Rc::strong_count` and `Rc::weak_count`.
1959
1960#### Visualizing Changes to `strong_count` and `weak_count`
1961
1962Let’s look at how the `strong_count` and `weak_count` values of the `Rc<Node>`
1963instances change by creating a new inner scope and moving the creation of
1964`branch` into that scope. By doing so, we can see what happens when `branch` is
1965created and then dropped when it goes out of scope. The modifications are shown
1966in Listing 15-29:
1967
1968Filename: src/main.rs
1969
1970```
1971fn main() {
1972 let leaf = Rc::new(Node {
1973 value: 3,
1974 parent: RefCell::new(Weak::new()),
1975 children: RefCell::new(vec![]),
1976 });
1977
1978 [1] println!(
1979 "leaf strong = {}, weak = {}",
1980 Rc::strong_count(&leaf),
1981 Rc::weak_count(&leaf),
1982 );
1983
1984 [2] {
1985 let branch = Rc::new(Node {
1986 value: 5,
1987 parent: RefCell::new(Weak::new()),
1988 children: RefCell::new(vec![Rc::clone(&leaf)]),
1989 });
1990
1991 *leaf.parent.borrow_mut() = Rc::downgrade(&branch);
1992
1993 [3] println!(
1994 "branch strong = {}, weak = {}",
1995 Rc::strong_count(&branch),
1996 Rc::weak_count(&branch),
1997 );
1998
1999 [4] println!(
2000 "leaf strong = {}, weak = {}",
2001 Rc::strong_count(&leaf),
2002 Rc::weak_count(&leaf),
2003 );
2004 [5] }
2005
2006 [6] println!("leaf parent = {:?}", leaf.parent.borrow().upgrade());
2007 [7] println!(
2008 "leaf strong = {}, weak = {}",
2009 Rc::strong_count(&leaf),
2010 Rc::weak_count(&leaf),
2011 );
2012}
2013```
2014
2015Listing 15-29: Creating `branch` in an inner scope and examining strong and
2016weak reference counts
2017
2018After `leaf` is created, its `Rc<Node>` has a strong count of 1 and a weak
2019count of 0 [1]. In the inner scope [2], we create `branch` and associate it
2020with `leaf`, at which point when we print the counts [3], the `Rc<Node>` in
2021`branch` will have a strong count of 1 and a weak count of 1 (for `leaf.parent`
2022pointing to `branch` with a `Weak<Node>`). When we print the counts in `leaf`
2023[4], we’ll see it will have a strong count of 2, because `branch` now has a
2024clone of the `Rc<Node>` of `leaf` stored in `branch.children`, but will still
2025have a weak count of 0.
2026
2027When the inner scope ends [5], `branch` goes out of scope and the strong count
2028of the `Rc<Node>` decreases to 0, so its `Node` is dropped. The weak count of 1
2029from `leaf.parent` has no bearing on whether or not `Node` is dropped, so we
2030don’t get any memory leaks!
2031
2032If we try to access the parent of `leaf` after the end of the scope, we’ll get
2033`None` again [6]. At the end of the program [7], the `Rc<Node>` in `leaf` has a
2034strong count of 1 and a weak count of 0, because the variable `leaf` is now the
2035only reference to the `Rc<Node>` again.
2036
2037All of the logic that manages the counts and value dropping is built into
2038`Rc<T>` and `Weak<T>` and their implementations of the `Drop` trait. By
2039specifying that the relationship from a child to its parent should be a
2040`Weak<T>` reference in the definition of `Node`, you’re able to have parent
2041nodes point to child nodes and vice versa without creating a reference cycle
2042and memory leaks.
2043
2044## Summary
2045
2046This chapter covered how to use smart pointers to make different guarantees and
2047trade-offs from those Rust makes by default with regular references. The
2048`Box<T>` type has a known size and points to data allocated on the heap. The
2049`Rc<T>` type keeps track of the number of references to data on the heap so
2050that data can have multiple owners. The `RefCell<T>` type with its interior
2051mutability gives us a type that we can use when we need an immutable type but
2052need to change an inner value of that type; it also enforces the borrowing
2053rules at runtime instead of at compile time.
2054
2055Also discussed were the `Deref` and `Drop` traits, which enable a lot of the
2056functionality of smart pointers. We explored reference cycles that can cause
2057memory leaks and how to prevent them using `Weak<T>`.
2058
2059If this chapter has piqued your interest and you want to implement your own
2060smart pointers, check out “The Rustonomicon” at
2061*https://doc.rust-lang.org/stable/nomicon/* for more useful information.
2062
2063Next, we’ll talk about concurrency in Rust. You’ll even learn about a few new
2064smart pointers.