]>
Commit | Line | Data |
---|---|---|
cc61c64b XL |
1 | |
2 | [TOC] | |
3 | ||
4 | # Smart Pointers | |
5 | ||
ea8adc8c | 6 | A *pointer* is a general concept for a variable that contains an address in |
2c00a5a8 XL |
7 | memory. This address refers to, or “points at,” some other data. The most |
8 | common kind of pointer in Rust is a reference, which you learned about in | |
9 | Chapter 4. References are indicated by the `&` symbol and borrow the value they | |
10 | point to. They don’t have any special capabilities other than referring to | |
11 | data. Also, they don’t have any overhead and are the kind of pointer we use | |
12 | most often. | |
ea8adc8c XL |
13 | |
14 | *Smart pointers*, on the other hand, are data structures that act like a | |
2c00a5a8 XL |
15 | pointer but also have additional metadata and capabilities. The concept of |
16 | smart pointers isn’t unique to Rust: smart pointers originated in C++ and exist | |
17 | in other languages as well. In Rust, the different smart pointers defined in | |
18 | the standard library provide extra functionality beyond that provided by | |
19 | references. One example that we’ll explore in this chapter is the *reference | |
20 | counting* smart pointer type. This pointer enables you to have multiple owners | |
21 | of data by keeping track of the number of owners and, when no owners remain, | |
22 | taking care of cleaning up the data. | |
ea8adc8c XL |
23 | |
24 | In Rust, where we have the concept of ownership and borrowing, an additional | |
2c00a5a8 XL |
25 | difference between references and smart pointers is that references are |
26 | pointers that only borrow data; in contrast, in many cases, smart pointers | |
27 | *own* the data they point to. | |
28 | ||
29 | We’ve already encountered a few smart pointers in this book, such as `String` | |
30 | and `Vec<T>` in Chapter 8, although we didn’t call them smart pointers at the | |
31 | time. Both these types count as smart pointers because they own some memory and | |
32 | allow you to manipulate it. They also have metadata (such as their capacity) | |
33 | and extra capabilities or guarantees (such as with `String` ensuring its data | |
34 | will always be valid UTF-8). | |
35 | ||
36 | Smart pointers are usually implemented using structs. The characteristic that | |
37 | distinguishes a smart pointer from an ordinary struct is that smart pointers | |
ea8adc8c | 38 | implement the `Deref` and `Drop` traits. The `Deref` trait allows an instance |
2c00a5a8 XL |
39 | of the smart pointer struct to behave like a reference so we can write code |
40 | that works with either references or smart pointers. The `Drop` trait allows us | |
41 | to customize the code that is run when an instance of the smart pointer goes | |
42 | out of scope. In this chapter, we’ll discuss both traits and demonstrate why | |
43 | they’re important to smart pointers. | |
cc61c64b XL |
44 | |
45 | Given that the smart pointer pattern is a general design pattern used | |
2c00a5a8 XL |
46 | frequently in Rust, this chapter won’t cover every existing smart pointer. Many |
47 | libraries have their own smart pointers, and you can even write your own. We’ll | |
48 | cover the most common smart pointers in the standard library: | |
ea8adc8c XL |
49 | |
50 | * `Box<T>` for allocating values on the heap | |
51 | * `Rc<T>`, a reference counted type that enables multiple ownership | |
52 | * `Ref<T>` and `RefMut<T>`, accessed through `RefCell<T>`, a type that enforces | |
53 | the borrowing rules at runtime instead of compile time | |
54 | ||
2c00a5a8 | 55 | In addition, we’ll cover the *interior mutability* pattern where an immutable |
ea8adc8c | 56 | type exposes an API for mutating an interior value. We’ll also discuss |
2c00a5a8 | 57 | *reference cycles*: how they can leak memory and how to prevent them. |
cc61c64b | 58 | |
3b2f2976 | 59 | Let’s dive in! |
cc61c64b XL |
60 | |
61 | ## `Box<T>` Points to Data on the Heap and Has a Known Size | |
62 | ||
63 | The most straightforward smart pointer is a *box*, whose type is written | |
ea8adc8c | 64 | `Box<T>`. Boxes allow you to store data on the heap rather than the stack. What |
2c00a5a8 XL |
65 | remains on the stack is the pointer to the heap data. Refer to Chapter 4 to |
66 | review the difference between the stack and the heap. | |
ea8adc8c | 67 | |
2c00a5a8 XL |
68 | Boxes don’t have performance overhead, other than storing their data on the |
69 | heap instead of on the stack. But they don’t have many extra capabilities | |
70 | either. You’ll use them most often in these situations: | |
ea8adc8c | 71 | |
2c00a5a8 | 72 | * When you have a type whose size can’t be known at compile time, and you want |
ea8adc8c | 73 | to use a value of that type in a context that needs to know an exact size |
2c00a5a8 | 74 | * When you have a large amount of data and you want to transfer ownership but |
ea8adc8c | 75 | ensure the data won’t be copied when you do so |
2c00a5a8 XL |
76 | * When you want to own a value and only care that it’s a type that implements a |
77 | particular trait rather than knowing the concrete type | |
ea8adc8c | 78 | |
2c00a5a8 XL |
79 | We’ll demonstrate the first situation in this section. But before we do so, |
80 | we’ll elaborate on the other two situations a bit more: in the second case, | |
ea8adc8c | 81 | transferring ownership of a large amount of data can take a long time because |
2c00a5a8 | 82 | the data is copied around on the stack. To improve performance in this |
ea8adc8c XL |
83 | situation, we can store the large amount of data on the heap in a box. Then, |
84 | only the small amount of pointer data is copied around on the stack, and the | |
85 | data stays in one place on the heap. The third case is known as a *trait | |
2c00a5a8 XL |
86 | object*, and Chapter 17 devotes an entire section just to that topic. So what |
87 | you learn here you’ll apply again in Chapter 17! | |
ea8adc8c XL |
88 | |
89 | ### Using a `Box<T>` to Store Data on the Heap | |
90 | ||
2c00a5a8 XL |
91 | Before we discuss this use case for `Box<T>`, we’ll cover the syntax and how to |
92 | interact with values stored within a `Box<T>`. | |
ea8adc8c | 93 | |
2c00a5a8 | 94 | Listing 15-1 shows how to use a box to store an `i32` value on the heap: |
cc61c64b XL |
95 | |
96 | Filename: src/main.rs | |
97 | ||
98 | ``` | |
99 | fn main() { | |
100 | let b = Box::new(5); | |
101 | println!("b = {}", b); | |
102 | } | |
103 | ``` | |
104 | ||
105 | Listing 15-1: Storing an `i32` value on the heap using a box | |
106 | ||
ea8adc8c XL |
107 | We define the variable `b` to have the value of a `Box` that points to the |
108 | value `5`, which is allocated on the heap. This program will print `b = 5`; in | |
109 | this case, we can access the data in the box in a similar way as we would if | |
2c00a5a8 XL |
110 | this data was on the stack. Just like any owned value, when a box goes out of |
111 | scope like `b` does at the end of `main`, it will be deallocated. The | |
112 | deallocation happens for the box (stored on the stack) and the data it points | |
113 | to (stored on the heap). | |
cc61c64b | 114 | |
3b2f2976 | 115 | Putting a single value on the heap isn’t very useful, so you won’t use boxes by |
2c00a5a8 XL |
116 | themselves in this way very often. Having values like a single `i32` on the |
117 | stack, where they’re stored by default, is more appropriate in the majority of | |
118 | situations. Let’s look at a case where boxes allow us to define types that we | |
119 | wouldn’t be allowed to if we didn’t have boxes. | |
ea8adc8c XL |
120 | |
121 | ### Boxes Enable Recursive Types | |
122 | ||
2c00a5a8 XL |
123 | At compile time, Rust needs to know how much space a type takes up. One type |
124 | whose size can’t be known at compile time is a *recursive type*, where a value | |
125 | can have as part of itself another value of the same type. Because this nesting | |
126 | of values could theoretically continue infinitely, Rust doesn’t know how much | |
127 | space a value of a recursive type needs. However, boxes have a known size, so | |
128 | by inserting a box in a recursive type definition, we can have recursive types. | |
129 | ||
130 | Let’s explore the *cons list*, which is a data type common in functional | |
131 | programming languages, as an example of a recursive type. The cons list type | |
132 | we’ll define is straightforward except for the recursion; therefore, the | |
133 | concepts in the example we’ll work with will be useful any time you get into | |
134 | more complex situations involving recursive types. | |
135 | ||
136 | #### More Information About the Cons List | |
137 | ||
138 | A *cons list* is a data structure that comes from the Lisp programming language | |
139 | and its dialects. In Lisp, the `cons` function (short for “construct function”) | |
140 | constructs a new pair from its two arguments, which usually are a single value | |
141 | and another pair. These pairs containing pairs form a list. | |
142 | ||
143 | The cons function concept has made its way into more general functional | |
144 | programming jargon: “to cons x onto y” informally means to construct a new | |
145 | container instance by putting the element x at the start of this new container, | |
146 | followed by the container y. | |
147 | ||
148 | Each item in a cons list contains two elements: the value of the current item | |
149 | and the next item. The last item in the list contains only a value called `Nil` | |
150 | without a next item. A cons list is produced by recursively calling the `cons` | |
151 | function. The canonical name to denote the base case of the recursion is `Nil`. | |
152 | Note that this is not the same as the “null” or “nil” concept in Chapter 6, | |
153 | which is an invalid or absent value. | |
154 | ||
155 | Although functional programming languages use cons lists frequently, it isn’t a | |
156 | commonly used data structure in Rust. Most of the time when you have a list of | |
157 | items in Rust, `Vec<T>` is a better choice to use. Other, more complex | |
158 | recursive data types *are* useful in various situations, but by starting with | |
159 | the cons list, we can explore how boxes let us define a recursive data type | |
160 | without much distraction. | |
161 | ||
162 | Listing 15-2 contains an enum definition for a cons list. Note that this code | |
163 | won’t compile yet because the `List` type doesn’t have a known size, which | |
ea8adc8c XL |
164 | we’ll demonstrate: |
165 | ||
cc61c64b XL |
166 | Filename: src/main.rs |
167 | ||
168 | ``` | |
169 | enum List { | |
170 | Cons(i32, List), | |
171 | Nil, | |
172 | } | |
173 | ``` | |
174 | ||
2c00a5a8 | 175 | Listing 15-2: The first attempt at defining an enum to represent a cons list |
cc61c64b XL |
176 | data structure of `i32` values |
177 | ||
2c00a5a8 XL |
178 | > Note: We’re implementing a cons list that only holds `i32` values for the |
179 | > purposes of this example. We could have implemented it using generics, as we | |
180 | > discussed in Chapter 10, to define a cons list type that could store values of | |
181 | > any type. | |
ea8adc8c | 182 | |
2c00a5a8 XL |
183 | Using the `List` type to store the list `1, 2, 3` would look like the code in |
184 | Listing 15-3: | |
ea8adc8c XL |
185 | |
186 | Filename: src/main.rs | |
cc61c64b XL |
187 | |
188 | ``` | |
189 | use List::{Cons, Nil}; | |
190 | ||
191 | fn main() { | |
192 | let list = Cons(1, Cons(2, Cons(3, Nil))); | |
193 | } | |
194 | ``` | |
195 | ||
ea8adc8c XL |
196 | Listing 15-3: Using the `List` enum to store the list `1, 2, 3` |
197 | ||
2c00a5a8 XL |
198 | The first `Cons` value holds `1` and another `List` value. This `List` value is |
199 | another `Cons` value that holds `2` and another `List` value. This `List` value | |
cc61c64b XL |
200 | is one more `Cons` value that holds `3` and a `List` value, which is finally |
201 | `Nil`, the non-recursive variant that signals the end of the list. | |
202 | ||
2c00a5a8 XL |
203 | If we try to compile the code in Listing 15-3, we get the error shown in |
204 | Listing 15-4: | |
cc61c64b XL |
205 | |
206 | ``` | |
207 | error[E0072]: recursive type `List` has infinite size | |
2c00a5a8 | 208 | --> src/main.rs:1:1 |
cc61c64b | 209 | | |
ea8adc8c XL |
210 | 1 | enum List { |
211 | | ^^^^^^^^^ recursive type has infinite size | |
212 | 2 | Cons(i32, List), | |
2c00a5a8 | 213 | | ----- recursive without indirection |
cc61c64b XL |
214 | | |
215 | = help: insert indirection (e.g., a `Box`, `Rc`, or `&`) at some point to | |
216 | make `List` representable | |
217 | ``` | |
218 | ||
ea8adc8c XL |
219 | Listing 15-4: The error we get when attempting to define a recursive enum |
220 | ||
2c00a5a8 XL |
221 | The error shows this type “has infinite size.” The reason is that we’ve defined |
222 | `List` with a variant that is recursive: it holds another value of itself | |
223 | directly. As a result, Rust can’t figure out how much space it needs to store a | |
224 | `List` value. Let’s break down why we get this error a bit: first, let’s look | |
225 | at how Rust decides how much space it needs to store a value of a non-recursive | |
ea8adc8c XL |
226 | type. |
227 | ||
2c00a5a8 | 228 | #### Computing the Size of a Non-Recursive Type |
cc61c64b | 229 | |
cc61c64b XL |
230 | Recall the `Message` enum we defined in Listing 6-2 when we discussed enum |
231 | definitions in Chapter 6: | |
232 | ||
233 | ``` | |
234 | enum Message { | |
235 | Quit, | |
236 | Move { x: i32, y: i32 }, | |
237 | Write(String), | |
238 | ChangeColor(i32, i32, i32), | |
239 | } | |
240 | ``` | |
241 | ||
ea8adc8c XL |
242 | To determine how much space to allocate for a `Message` value, Rust goes |
243 | through each of the variants to see which variant needs the most space. Rust | |
244 | sees that `Message::Quit` doesn’t need any space, `Message::Move` needs enough | |
2c00a5a8 XL |
245 | space to store two `i32` values, and so forth. Because only one variant will be |
246 | used, the most space a `Message` value will need is the space it would take to | |
247 | store the largest of its variants. | |
cc61c64b | 248 | |
ea8adc8c XL |
249 | Contrast this to what happens when Rust tries to determine how much space a |
250 | recursive type like the `List` enum in Listing 15-2 needs. The compiler starts | |
251 | by looking at the `Cons` variant, which holds a value of type `i32` and a value | |
252 | of type `List`. Therefore, `Cons` needs an amount of space equal to the size of | |
253 | an `i32` plus the size of a `List`. To figure out how much memory the `List` | |
254 | type needs, the compiler looks at the variants, starting with the `Cons` | |
cc61c64b | 255 | variant. The `Cons` variant holds a value of type `i32` and a value of type |
2c00a5a8 | 256 | `List`, and this process continues infinitely, as shown in Figure 15-1: |
cc61c64b XL |
257 | |
258 | <img alt="An infinite Cons list" src="img/trpl15-01.svg" class="center" style="width: 50%;" /> | |
259 | ||
2c00a5a8 | 260 | Figure 15-1: An infinite `List` consisting of infinite `Cons` variants |
ea8adc8c | 261 | |
2c00a5a8 | 262 | #### Using `Box<T>` to Get a Recursive Type with a Known Size |
cc61c64b | 263 | |
3b2f2976 | 264 | Rust can’t figure out how much space to allocate for recursively defined types, |
2c00a5a8 XL |
265 | so the compiler gives the error in Listing 15-4. But the error does include |
266 | this helpful suggestion: | |
cc61c64b XL |
267 | |
268 | ``` | |
2c00a5a8 XL |
269 | = help: insert indirection (e.g., a `Box`, `Rc`, or `&`) at some point to |
270 | make `List` representable | |
cc61c64b XL |
271 | ``` |
272 | ||
ea8adc8c | 273 | In this suggestion, “indirection” means that instead of storing a value |
2c00a5a8 XL |
274 | directly, we’ll change the data structure to store the value indirectly by |
275 | storing a pointer to the value instead. | |
ea8adc8c XL |
276 | |
277 | Because a `Box<T>` is a pointer, Rust always knows how much space a `Box<T>` | |
278 | needs: a pointer’s size doesn’t change based on the amount of data it’s | |
2c00a5a8 XL |
279 | pointing to. This means we can put a `Box<T>` inside the `Cons` variant instead |
280 | of another `List` value directly. The `Box<T>` will point to the next `List` | |
281 | value that will be on the heap rather than inside the `Cons` variant. | |
282 | Conceptually, we still have a list, created with lists “holding” other lists, | |
283 | but this implementation is now more like the items being next to one another | |
284 | rather than inside one another. | |
ea8adc8c | 285 | |
2c00a5a8 XL |
286 | We can change the definition of the `List` enum in Listing 15-2 and the usage |
287 | of the `List` in Listing 15-3 to the code in Listing 15-5, which will compile: | |
cc61c64b XL |
288 | |
289 | Filename: src/main.rs | |
290 | ||
291 | ``` | |
292 | enum List { | |
293 | Cons(i32, Box<List>), | |
294 | Nil, | |
295 | } | |
296 | ||
297 | use List::{Cons, Nil}; | |
298 | ||
299 | fn main() { | |
300 | let list = Cons(1, | |
301 | Box::new(Cons(2, | |
302 | Box::new(Cons(3, | |
303 | Box::new(Nil)))))); | |
304 | } | |
305 | ``` | |
306 | ||
2c00a5a8 | 307 | Listing 15-5: Definition of `List` that uses `Box<T>` in order to have a known |
ea8adc8c | 308 | size |
cc61c64b | 309 | |
ea8adc8c XL |
310 | The `Cons` variant will need the size of an `i32` plus the space to store the |
311 | box’s pointer data. The `Nil` variant stores no values, so it needs less space | |
312 | than the `Cons` variant. We now know that any `List` value will take up the | |
313 | size of an `i32` plus the size of a box’s pointer data. By using a box, we’ve | |
2c00a5a8 XL |
314 | broken the infinite, recursive chain, so the compiler can figure out the size |
315 | it needs to store a `List` value. Figure 15-2 shows what the `Cons` variant | |
316 | looks like now: | |
cc61c64b XL |
317 | |
318 | <img alt="A finite Cons list" src="img/trpl15-02.svg" class="center" /> | |
319 | ||
2c00a5a8 | 320 | Figure 15-2: A `List` that is not infinitely sized because `Cons` holds a `Box` |
ea8adc8c XL |
321 | |
322 | Boxes only provide the indirection and heap allocation; they don’t have any | |
2c00a5a8 | 323 | other special capabilities, like those we’ll see with the other smart pointer |
ea8adc8c | 324 | types. They also don’t have any performance overhead that these special |
2c00a5a8 | 325 | capabilities incur, so they can be useful in cases like the cons list where the |
ea8adc8c XL |
326 | indirection is the only feature we need. We’ll look at more use cases for boxes |
327 | in Chapter 17, too. | |
328 | ||
329 | The `Box<T>` type is a smart pointer because it implements the `Deref` trait, | |
330 | which allows `Box<T>` values to be treated like references. When a `Box<T>` | |
331 | value goes out of scope, the heap data that the box is pointing to is cleaned | |
2c00a5a8 XL |
332 | up as well because of the `Drop` trait implementation. Let’s explore these two |
333 | traits in more detail. These two traits will be even more important to the | |
334 | functionality provided by the other smart pointer types we’ll discuss in the | |
335 | rest of this chapter. | |
336 | ||
337 | ## Treating Smart Pointers Like Regular References with the `Deref` Trait | |
338 | ||
339 | Implementing the `Deref` trait allows us to customize the behavior of the | |
340 | *dereference operator*, `*` (as opposed to the multiplication or glob | |
341 | operator). By implementing `Deref` in such a way that a smart pointer can be | |
342 | treated like a regular reference, we can write code that operates on references | |
343 | and use that code with smart pointers too. | |
344 | ||
345 | Let’s first look at how `*` works with regular references, and then try to | |
346 | define our own type like `Box<T>` and see why `*` doesn’t work like a reference | |
347 | on our newly defined type. We’ll explore how implementing the `Deref` trait | |
348 | makes it possible for smart pointers to work in a similar way as references. | |
349 | Then we’ll look at Rust’s *deref coercion* feature and how it lets us work with | |
350 | either references or smart pointers. | |
ea8adc8c XL |
351 | |
352 | ### Following the Pointer to the Value with `*` | |
353 | ||
ea8adc8c | 354 | A regular reference is a type of pointer, and one way to think of a pointer is |
2c00a5a8 XL |
355 | as an arrow to a value stored somewhere else. In Listing 15-6, we create a |
356 | reference to an `i32` value and then use the dereference operator to follow the | |
357 | reference to the data: | |
ea8adc8c | 358 | |
2c00a5a8 | 359 | Filename: src/main.rs |
cc61c64b | 360 | |
ea8adc8c XL |
361 | ``` |
362 | fn main() { | |
363 | let x = 5; | |
364 | let y = &x; | |
cc61c64b | 365 | |
ea8adc8c XL |
366 | assert_eq!(5, x); |
367 | assert_eq!(5, *y); | |
368 | } | |
369 | ``` | |
cc61c64b | 370 | |
2c00a5a8 | 371 | Listing 15-6: Using the dereference operator to follow a reference to an `i32` |
ea8adc8c | 372 | value |
cc61c64b | 373 | |
ea8adc8c XL |
374 | The variable `x` holds an `i32` value, `5`. We set `y` equal to a reference to |
375 | `x`. We can assert that `x` is equal to `5`. However, if we want to make an | |
376 | assertion about the value in `y`, we have to use `*y` to follow the reference | |
2c00a5a8 XL |
377 | to the value it’s pointing to (hence *dereference*). Once we dereference `y`, |
378 | we have access to the integer value `y` is pointing to that we can compare with | |
379 | `5`. | |
cc61c64b | 380 | |
2c00a5a8 | 381 | If we tried to write `assert_eq!(5, y);` instead, we would get this compilation |
ea8adc8c | 382 | error: |
cc61c64b XL |
383 | |
384 | ``` | |
ea8adc8c XL |
385 | error[E0277]: the trait bound `{integer}: std::cmp::PartialEq<&{integer}>` is |
386 | not satisfied | |
2c00a5a8 | 387 | --> src/main.rs:6:5 |
ea8adc8c | 388 | | |
2c00a5a8 XL |
389 | 6 | assert_eq!(5, y); |
390 | | ^^^^^^^^^^^^^^^^^ can't compare `{integer}` with `&{integer}` | |
ea8adc8c XL |
391 | | |
392 | = help: the trait `std::cmp::PartialEq<&{integer}>` is not implemented for | |
393 | `{integer}` | |
394 | ``` | |
cc61c64b | 395 | |
2c00a5a8 XL |
396 | Comparing a number and a reference to a number isn’t allowed because they’re |
397 | different types. We must use `*` to follow the reference to the value it’s | |
ea8adc8c XL |
398 | pointing to. |
399 | ||
400 | ### Using `Box<T>` Like a Reference | |
401 | ||
2c00a5a8 XL |
402 | We can rewrite the code in Listing 15-6 to use a `Box<T>` instead of a |
403 | reference, and the dereference operator will work the same way as shown in | |
404 | Listing 15-7: | |
ea8adc8c | 405 | |
2c00a5a8 | 406 | Filename: src/main.rs |
cc61c64b | 407 | |
cc61c64b | 408 | ``` |
ea8adc8c XL |
409 | fn main() { |
410 | let x = 5; | |
411 | let y = Box::new(x); | |
cc61c64b | 412 | |
ea8adc8c XL |
413 | assert_eq!(5, x); |
414 | assert_eq!(5, *y); | |
415 | } | |
416 | ``` | |
cc61c64b | 417 | |
2c00a5a8 | 418 | Listing 15-7: Using the dereference operator on a `Box<i32>` |
cc61c64b | 419 | |
2c00a5a8 XL |
420 | The only difference between Listing 15-7 and Listing 15-6 is that here we set |
421 | `y` to be an instance of a box pointing to the value in `x` rather than a | |
422 | reference pointing to the value of `x`. In the last assertion, we can use the | |
423 | dereference operator to follow the box’s pointer in the same way that we did | |
424 | when `y` was a reference. Next, we’ll explore what is special about `Box<T>` | |
425 | that enables us to use the dereference operator by defining our own box type. | |
ea8adc8c XL |
426 | |
427 | ### Defining Our Own Smart Pointer | |
428 | ||
2c00a5a8 XL |
429 | Let’s build a smart pointer similar to the `Box<T>` type provided by the |
430 | standard library to experience how smart pointers behave differently to | |
431 | references by default. Then we’ll look at how to add the ability to use the | |
432 | dereference operator. | |
ea8adc8c | 433 | |
2c00a5a8 XL |
434 | The `Box<T>` type is ultimately defined as a tuple struct with one element, so |
435 | Listing 15-8 defines a `MyBox<T>` type in the same way. We’ll also define a | |
436 | `new` function to match the `new` function defined on `Box<T>`: | |
cc61c64b XL |
437 | |
438 | Filename: src/main.rs | |
439 | ||
440 | ``` | |
ea8adc8c | 441 | struct MyBox<T>(T); |
cc61c64b | 442 | |
ea8adc8c XL |
443 | impl<T> MyBox<T> { |
444 | fn new(x: T) -> MyBox<T> { | |
445 | MyBox(x) | |
446 | } | |
cc61c64b | 447 | } |
ea8adc8c | 448 | ``` |
cc61c64b | 449 | |
2c00a5a8 | 450 | Listing 15-8: Defining a `MyBox<T>` type |
cc61c64b | 451 | |
2c00a5a8 XL |
452 | We define a struct named `MyBox` and declare a generic parameter `T`, because |
453 | we want our type to hold values of any type. The `MyBox` type is a tuple struct | |
ea8adc8c XL |
454 | with one element of type `T`. The `MyBox::new` function takes one parameter of |
455 | type `T` and returns a `MyBox` instance that holds the value passed in. | |
456 | ||
2c00a5a8 XL |
457 | Let’s try adding the `main` function in Listing 15-7 to Listing 15-8 and |
458 | changing it to use the `MyBox<T>` type we’ve defined instead of `Box<T>`. The | |
459 | code in Listing 15-9 won’t compile because Rust doesn’t know how to dereference | |
460 | `MyBox`: | |
ea8adc8c XL |
461 | |
462 | Filename: src/main.rs | |
cc61c64b | 463 | |
ea8adc8c | 464 | ``` |
cc61c64b | 465 | fn main() { |
ea8adc8c XL |
466 | let x = 5; |
467 | let y = MyBox::new(x); | |
468 | ||
469 | assert_eq!(5, x); | |
470 | assert_eq!(5, *y); | |
471 | } | |
472 | ``` | |
473 | ||
2c00a5a8 XL |
474 | Listing 15-9: Attempting to use `MyBox<T>` in the same way we used references |
475 | and `Box<T>` | |
ea8adc8c | 476 | |
2c00a5a8 | 477 | Here’s the resulting compilation error: |
ea8adc8c XL |
478 | |
479 | ``` | |
2c00a5a8 | 480 | error[E0614]: type `MyBox<{integer}>` cannot be dereferenced |
ea8adc8c XL |
481 | --> src/main.rs:14:19 |
482 | | | |
483 | 14 | assert_eq!(5, *y); | |
484 | | ^^ | |
485 | ``` | |
486 | ||
487 | Our `MyBox<T>` type can’t be dereferenced because we haven’t implemented that | |
2c00a5a8 | 488 | ability on our type. To enable dereferencing with the `*` operator, we |
ea8adc8c XL |
489 | implement the `Deref` trait. |
490 | ||
2c00a5a8 | 491 | ### Treating a Type Like a Reference by Implementing the `Deref` Trait |
ea8adc8c | 492 | |
2c00a5a8 XL |
493 | As discussed in Chapter 10, to implement a trait, we need to provide |
494 | implementations for the trait’s required methods. The `Deref` trait, provided | |
495 | by the standard library, requires us to implement one method named `deref` that | |
496 | borrows `self` and returns a reference to the inner data. Listing 15-10 | |
497 | contains an implementation of `Deref` to add to the definition of `MyBox`: | |
ea8adc8c XL |
498 | |
499 | Filename: src/main.rs | |
500 | ||
501 | ``` | |
502 | use std::ops::Deref; | |
503 | ||
ea8adc8c XL |
504 | impl<T> Deref for MyBox<T> { |
505 | type Target = T; | |
506 | ||
507 | fn deref(&self) -> &T { | |
508 | &self.0 | |
509 | } | |
cc61c64b XL |
510 | } |
511 | ``` | |
512 | ||
2c00a5a8 | 513 | Listing 15-10: Implementing `Deref` on `MyBox<T>` |
cc61c64b | 514 | |
2c00a5a8 XL |
515 | The `type Target = T;` syntax defines an associated type for the `Deref` trait |
516 | to use. Associated types are a slightly different way of declaring a generic | |
517 | parameter, but you don’t need to worry about them for now; we’ll cover them in | |
518 | more detail in Chapter 19. | |
cc61c64b | 519 | |
2c00a5a8 XL |
520 | We fill in the body of the `deref` method with `&self.0` so `deref` returns a |
521 | reference to the value we want to access with the `*` operator. The `main` | |
522 | function in Listing 15-9 that calls `*` on the `MyBox<T>` value now compiles | |
523 | and the assertions pass! | |
ea8adc8c XL |
524 | |
525 | Without the `Deref` trait, the compiler can only dereference `&` references. | |
2c00a5a8 XL |
526 | The `deref` method gives the compiler the ability to take a value of any type |
527 | that implements `Deref` and call the `deref` method to get a `&` reference that | |
528 | it knows how to dereference. | |
ea8adc8c | 529 | |
2c00a5a8 XL |
530 | When we entered `*y` in Listing 15-9, behind the scenes Rust actually ran this |
531 | code: | |
cc61c64b XL |
532 | |
533 | ``` | |
ea8adc8c | 534 | *(y.deref()) |
cc61c64b XL |
535 | ``` |
536 | ||
ea8adc8c | 537 | Rust substitutes the `*` operator with a call to the `deref` method and then a |
2c00a5a8 XL |
538 | plain dereference so as programmers we don’t have to think about whether or not |
539 | we need to call the `deref` method. This Rust feature lets us write code that | |
540 | functions identically whether we have a regular reference or a type that | |
541 | implements `Deref`. | |
542 | ||
543 | The reason the `deref` method returns a reference to a value and that the plain | |
544 | dereference outside the parentheses in `*(y.deref())` is still necessary is due | |
545 | to the ownership system. If the `deref` method returned the value directly | |
546 | instead of a reference to the value, the value would be moved out of `self`. We | |
547 | don’t want to take ownership of the inner value inside `MyBox<T>` in this case | |
548 | and in most cases where we use the dereference operator. | |
549 | ||
550 | Note that the `*` is replaced with a call to the `deref` method and then a call | |
551 | to `*` just once, each time we type a `*` in our code. Because the substitution | |
552 | of `*` does not recurse infinitely, we end up with data of type `i32`, which | |
553 | matches the `5` in `assert_eq!` in Listing 15-9. | |
cc61c64b XL |
554 | |
555 | ### Implicit Deref Coercions with Functions and Methods | |
556 | ||
ea8adc8c XL |
557 | *Deref coercion* is a convenience that Rust performs on arguments to functions |
558 | and methods. Deref coercion converts a reference to a type that implements | |
559 | `Deref` into a reference to a type that `Deref` can convert the original type | |
2c00a5a8 XL |
560 | into. Deref coercion happens automatically when we pass a reference to a |
561 | particular type’s value as an argument to a function or method that doesn’t | |
562 | match the parameter type in the function or method definition. A sequence of | |
563 | calls to the `deref` method converts the type we provided into the type the | |
564 | parameter needs. | |
ea8adc8c XL |
565 | |
566 | Deref coercion was added to Rust so that programmers writing function and | |
567 | method calls don’t need to add as many explicit references and dereferences | |
2c00a5a8 XL |
568 | with `&` and `*`. The deref coercion feature also lets us write more code that |
569 | can work for either references or smart pointers. | |
ea8adc8c | 570 | |
2c00a5a8 XL |
571 | To see deref coercion in action, let’s use the `MyBox<T>` type we defined in |
572 | Listing 15-8 as well as the implementation of `Deref` that we added in Listing | |
573 | 15-10. Listing 15-11 shows the definition of a function that has a string slice | |
574 | parameter: | |
cc61c64b | 575 | |
ea8adc8c | 576 | Filename: src/main.rs |
cc61c64b XL |
577 | |
578 | ``` | |
ea8adc8c XL |
579 | fn hello(name: &str) { |
580 | println!("Hello, {}!", name); | |
cc61c64b XL |
581 | } |
582 | ``` | |
583 | ||
2c00a5a8 | 584 | Listing 15-11: A `hello` function that has the parameter `name` of type `&str` |
ea8adc8c | 585 | |
2c00a5a8 XL |
586 | We can call the `hello` function with a string slice as an argument, such as |
587 | `hello("Rust");` for example. Deref coercion makes it possible to call `hello` | |
588 | with a reference to a value of type `MyBox<String>`, as shown in Listing 15-12: | |
ea8adc8c XL |
589 | |
590 | Filename: src/main.rs | |
cc61c64b XL |
591 | |
592 | ``` | |
ea8adc8c XL |
593 | fn main() { |
594 | let m = MyBox::new(String::from("Rust")); | |
595 | hello(&m); | |
596 | } | |
cc61c64b XL |
597 | ``` |
598 | ||
2c00a5a8 XL |
599 | Listing 15-12: Calling `hello` with a reference to a `MyBox<String>` value, |
600 | which works because of deref coercion | |
ea8adc8c XL |
601 | |
602 | Here we’re calling the `hello` function with the argument `&m`, which is a | |
603 | reference to a `MyBox<String>` value. Because we implemented the `Deref` trait | |
2c00a5a8 | 604 | on `MyBox<T>` in Listing 15-10, Rust can turn `&MyBox<String>` into `&String` |
ea8adc8c | 605 | by calling `deref`. The standard library provides an implementation of `Deref` |
2c00a5a8 XL |
606 | on `String` that returns a string slice, which is in the API documentation for |
607 | `Deref`. Rust calls `deref` again to turn the `&String` into `&str`, which | |
608 | matches the `hello` function’s definition. | |
cc61c64b | 609 | |
2c00a5a8 XL |
610 | If Rust didn’t implement deref coercion, we would have to write the code in |
611 | Listing 15-13 instead of the code in Listing 15-12 to call `hello` with a value | |
612 | of type `&MyBox<String>`: | |
ea8adc8c XL |
613 | |
614 | Filename: src/main.rs | |
cc61c64b XL |
615 | |
616 | ``` | |
ea8adc8c XL |
617 | fn main() { |
618 | let m = MyBox::new(String::from("Rust")); | |
619 | hello(&(*m)[..]); | |
620 | } | |
cc61c64b XL |
621 | ``` |
622 | ||
2c00a5a8 XL |
623 | Listing 15-13: The code we would have to write if Rust didn’t have deref |
624 | coercion | |
ea8adc8c | 625 | |
2c00a5a8 XL |
626 | The `(*m)` dereferences the `MyBox<String>` into a `String`. Then the `&` and |
627 | `[..]` take a string slice of the `String` that is equal to the whole string to | |
628 | match the signature of `hello`. The code without deref coercions is harder to | |
629 | read, write, and understand with all of these symbols involved. Deref coercion | |
630 | allows Rust to handle these conversions for us automatically. | |
ea8adc8c XL |
631 | |
632 | When the `Deref` trait is defined for the types involved, Rust will analyze the | |
2c00a5a8 XL |
633 | types and use `Deref::deref` as many times as necessary to get a reference to |
634 | match the parameter’s type. The number of times that `Deref::deref` needs to be | |
635 | inserted is resolved at compile time, so there is no runtime penalty for taking | |
636 | advantage of deref coercion! | |
ea8adc8c XL |
637 | |
638 | ### How Deref Coercion Interacts with Mutability | |
cc61c64b | 639 | |
ea8adc8c XL |
640 | Similar to how we use the `Deref` trait to override `*` on immutable |
641 | references, Rust provides a `DerefMut` trait for overriding `*` on mutable | |
642 | references. | |
cc61c64b XL |
643 | |
644 | Rust does deref coercion when it finds types and trait implementations in three | |
645 | cases: | |
646 | ||
2c00a5a8 XL |
647 | * From `&T` to `&U` when `T: Deref<Target=U>` |
648 | * From `&mut T` to `&mut U` when `T: DerefMut<Target=U>` | |
649 | * From `&mut T` to `&U` when `T: Deref<Target=U>` | |
cc61c64b | 650 | |
2c00a5a8 | 651 | The first two cases are the same except for mutability. The first case states |
ea8adc8c XL |
652 | that if you have a `&T`, and `T` implements `Deref` to some type `U`, you can |
653 | get a `&U` transparently. The second case states that the same deref coercion | |
654 | happens for mutable references. | |
655 | ||
2c00a5a8 XL |
656 | The third case is trickier: Rust will also coerce a mutable reference to an |
657 | immutable one. But the reverse is *not* possible: immutable references will | |
658 | never coerce to mutable references. Because of the borrowing rules, if you have | |
659 | a mutable reference, that mutable reference must be the only reference to that | |
ea8adc8c XL |
660 | data (otherwise, the program wouldn’t compile). Converting one mutable |
661 | reference to one immutable reference will never break the borrowing rules. | |
662 | Converting an immutable reference to a mutable reference would require that | |
2c00a5a8 | 663 | there is only one immutable reference to that data, and the borrowing rules |
ea8adc8c XL |
664 | don’t guarantee that. Therefore, Rust can’t make the assumption that converting |
665 | an immutable reference to a mutable reference is possible. | |
666 | ||
cc61c64b XL |
667 | ## The `Drop` Trait Runs Code on Cleanup |
668 | ||
ea8adc8c XL |
669 | The second trait important to the smart pointer pattern is `Drop`, which lets |
670 | us customize what happens when a value is about to go out of scope. We can | |
671 | provide an implementation for the `Drop` trait on any type, and the code we | |
672 | specify can be used to release resources like files or network connections. | |
673 | We’re introducing `Drop` in the context of smart pointers because the | |
674 | functionality of the `Drop` trait is almost always used when implementing a | |
2c00a5a8 XL |
675 | smart pointer. For example, `Box<T>` customizes `Drop` to deallocate the space |
676 | on the heap that the box points to. | |
ea8adc8c XL |
677 | |
678 | In some languages, the programmer must call code to free memory or resources | |
679 | every time they finish using an instance of a smart pointer. If they forget, | |
680 | the system might become overloaded and crash. In Rust, we can specify that a | |
681 | particular bit of code should be run whenever a value goes out of scope, and | |
2c00a5a8 XL |
682 | the compiler will insert this code automatically. As a result, we don’t need to |
683 | be careful about placing cleanup code everywhere in a program that an instance | |
684 | of a particular type is finished with, but we still won’t leak resources! | |
ea8adc8c XL |
685 | |
686 | We specify the code to run when a value goes out of scope by implementing the | |
687 | `Drop` trait. The `Drop` trait requires us to implement one method named `drop` | |
2c00a5a8 XL |
688 | that takes a mutable reference to `self`. To see when Rust calls `drop`, let’s |
689 | implement `drop` with `println!` statements for now. | |
ea8adc8c | 690 | |
2c00a5a8 XL |
691 | Listing 15-14 shows a `CustomSmartPointer` struct whose only custom |
692 | functionality is that it will print `Dropping CustomSmartPointer!` when the | |
693 | instance goes out of scope. This example demonstrates when Rust runs the `drop` | |
694 | function: | |
cc61c64b XL |
695 | |
696 | Filename: src/main.rs | |
697 | ||
698 | ``` | |
699 | struct CustomSmartPointer { | |
700 | data: String, | |
701 | } | |
702 | ||
703 | impl Drop for CustomSmartPointer { | |
704 | fn drop(&mut self) { | |
ea8adc8c | 705 | println!("Dropping CustomSmartPointer with data `{}`!", self.data); |
cc61c64b XL |
706 | } |
707 | } | |
708 | ||
709 | fn main() { | |
ea8adc8c XL |
710 | let c = CustomSmartPointer { data: String::from("my stuff") }; |
711 | let d = CustomSmartPointer { data: String::from("other stuff") }; | |
712 | println!("CustomSmartPointers created."); | |
cc61c64b XL |
713 | } |
714 | ``` | |
715 | ||
2c00a5a8 XL |
716 | Listing 15-14: A `CustomSmartPointer` struct that implements the `Drop` trait |
717 | where we would put our cleanup code | |
ea8adc8c XL |
718 | |
719 | The `Drop` trait is included in the prelude, so we don’t need to import it. We | |
2c00a5a8 | 720 | implement the `Drop` trait on `CustomSmartPointer` and provide an |
ea8adc8c | 721 | implementation for the `drop` method that calls `println!`. The body of the |
2c00a5a8 XL |
722 | `drop` function is where you would place any logic that you wanted to run when |
723 | an instance of your type goes out of scope. We’re printing some text here to | |
724 | demonstrate when Rust will call `drop`. | |
cc61c64b | 725 | |
2c00a5a8 XL |
726 | In `main`, we create two instances of `CustomSmartPointer` and then print |
727 | `CustomSmartPointers created.`. At the end of `main`, our instance of | |
ea8adc8c XL |
728 | `CustomSmartPointer` will go out of scope, and Rust will call the code we put |
729 | in the `drop` method, printing our final message. Note that we didn’t need to | |
730 | call the `drop` method explicitly. | |
cc61c64b | 731 | |
ea8adc8c | 732 | When we run this program, we’ll see the following output: |
cc61c64b XL |
733 | |
734 | ``` | |
ea8adc8c XL |
735 | CustomSmartPointers created. |
736 | Dropping CustomSmartPointer with data `other stuff`! | |
737 | Dropping CustomSmartPointer with data `my stuff`! | |
738 | ``` | |
739 | ||
740 | Rust automatically called `drop` for us when our instance went out of scope, | |
741 | calling the code we specified. Variables are dropped in the reverse order of | |
2c00a5a8 XL |
742 | the order in which they were created, so `d` was dropped before `c`. This |
743 | example just gives you a visual guide to how the `drop` method works, but | |
744 | usually you would specify the cleanup code that your type needs to run rather | |
745 | than a print message. | |
746 | ||
747 | ### Dropping a Value Early with `std::mem::drop` | |
748 | ||
749 | Unfortunately, it’s not straightforward to disable the automatic `drop` | |
750 | functionality. Disabling `drop` isn’t usually necessary; the whole point of the | |
751 | `Drop` trait is that it’s taken care of automatically. Occasionally, you might | |
752 | want to clean up a value early. One example is when using smart pointers that | |
753 | manage locks: you might want to force the `drop` method that releases the lock | |
754 | to run so other code in the same scope can acquire the lock. Rust doesn’t let | |
755 | us call the `Drop` trait’s `drop` method manually; instead we have to call the | |
756 | `std::mem::drop` function provided by the standard library if we want to force | |
757 | a value to be dropped before the end of its scope. | |
758 | ||
759 | Let’s see what happens when we try to call the `Drop` trait’s `drop` method | |
760 | manually by modifying the `main` function in Listing 15-14, as shown in Listing | |
761 | 15-15: | |
ea8adc8c XL |
762 | |
763 | Filename: src/main.rs | |
764 | ||
765 | ``` | |
766 | fn main() { | |
767 | let c = CustomSmartPointer { data: String::from("some data") }; | |
768 | println!("CustomSmartPointer created."); | |
769 | c.drop(); | |
770 | println!("CustomSmartPointer dropped before the end of main."); | |
771 | } | |
772 | ``` | |
773 | ||
2c00a5a8 | 774 | Listing 15-15: Attempting to call the `drop` method from the `Drop` trait |
ea8adc8c XL |
775 | manually to clean up early |
776 | ||
2c00a5a8 | 777 | When we try to compile this code, we’ll get this error: |
ea8adc8c | 778 | |
cc61c64b | 779 | ``` |
ea8adc8c | 780 | error[E0040]: explicit use of destructor method |
2c00a5a8 | 781 | --> src/main.rs:14:7 |
ea8adc8c | 782 | | |
2c00a5a8 | 783 | 14 | c.drop(); |
ea8adc8c XL |
784 | | ^^^^ explicit destructor calls not allowed |
785 | ``` | |
786 | ||
2c00a5a8 XL |
787 | This error message states that we’re not allowed to explicitly call `drop`. The |
788 | error message uses the term *destructor*, which is the general programming term | |
789 | for a function that cleans up an instance. A *destructor* is analogous to a | |
ea8adc8c XL |
790 | *constructor* that creates an instance. The `drop` function in Rust is one |
791 | particular destructor. | |
792 | ||
793 | Rust doesn’t let us call `drop` explicitly because Rust would still | |
2c00a5a8 XL |
794 | automatically call `drop` on the value at the end of `main`. This would be a |
795 | *double free* error because Rust would be trying to clean up the same value | |
ea8adc8c | 796 | twice. |
cc61c64b | 797 | |
2c00a5a8 XL |
798 | We can’t disable the automatic insertion of `drop` when a value goes out of |
799 | scope, and we can’t call the `drop` method explicitly. So, if we need to force | |
800 | a value to be cleaned up early, we can use the `std::mem::drop` function. | |
cc61c64b | 801 | |
ea8adc8c XL |
802 | The `std::mem::drop` function is different than the `drop` method in the `Drop` |
803 | trait. We call it by passing the value we want to force to be dropped early as | |
2c00a5a8 XL |
804 | an argument. The function is in the prelude, so we can modify `main` in Listing |
805 | 15-14 to call the `drop` function, as shown in Listing 15-16: | |
cc61c64b XL |
806 | |
807 | Filename: src/main.rs | |
808 | ||
809 | ``` | |
810 | fn main() { | |
811 | let c = CustomSmartPointer { data: String::from("some data") }; | |
812 | println!("CustomSmartPointer created."); | |
813 | drop(c); | |
ea8adc8c | 814 | println!("CustomSmartPointer dropped before the end of main."); |
cc61c64b XL |
815 | } |
816 | ``` | |
817 | ||
2c00a5a8 | 818 | Listing 15-16: Calling `std::mem::drop` to explicitly drop a value before it |
cc61c64b XL |
819 | goes out of scope |
820 | ||
ea8adc8c | 821 | Running this code will print the following: |
cc61c64b XL |
822 | |
823 | ``` | |
824 | CustomSmartPointer created. | |
2c00a5a8 | 825 | Dropping CustomSmartPointer with data `some data`! |
ea8adc8c | 826 | CustomSmartPointer dropped before the end of main. |
cc61c64b XL |
827 | ``` |
828 | ||
2c00a5a8 XL |
829 | The text ```Dropping CustomSmartPointer with data `some data`!``` is printed |
830 | between the `CustomSmartPointer created.` and `CustomSmartPointer dropped | |
831 | before the end of main.` text, showing that the `drop` method code is called to | |
832 | drop `c` at that point. | |
cc61c64b | 833 | |
2c00a5a8 XL |
834 | We can use code specified in a `Drop` trait implementation in many ways to make |
835 | cleanup convenient and safe: for instance, we could use it to create our own | |
836 | memory allocator! With the `Drop` trait and Rust’s ownership system, we don’t | |
837 | have to remember to clean up because Rust does it automatically. | |
ea8adc8c XL |
838 | |
839 | We also don’t have to worry about accidentally cleaning up values still in use | |
840 | because that would cause a compiler error: the ownership system that makes sure | |
2c00a5a8 XL |
841 | references are always valid also ensures that `drop` gets called only once when |
842 | the value is no longer being used. | |
cc61c64b | 843 | |
2c00a5a8 XL |
844 | Now that we’ve examined `Box<T>` and some of the characteristics of smart |
845 | pointers, let’s look at a few other smart pointers defined in the standard | |
ea8adc8c | 846 | library. |
cc61c64b XL |
847 | |
848 | ## `Rc<T>`, the Reference Counted Smart Pointer | |
849 | ||
ea8adc8c | 850 | In the majority of cases, ownership is clear: you know exactly which variable |
2c00a5a8 XL |
851 | owns a given value. However, there are cases when a single value might have |
852 | multiple owners. For example, in graph data structures, multiple edges might | |
ea8adc8c XL |
853 | point to the same node, and that node is conceptually owned by all of the edges |
854 | that point to it. A node shouldn’t be cleaned up unless it doesn’t have any | |
855 | edges pointing to it. | |
856 | ||
2c00a5a8 XL |
857 | To enable multiple ownership, Rust has a type called `Rc<T>`. Its name is an |
858 | abbreviation for *reference counting*, which keeps track of the number of | |
859 | references to a value to know whether or not a value is still in use. If there | |
860 | are zero references to a value, the value can be cleaned up without any | |
861 | references becoming invalid. | |
ea8adc8c | 862 | |
2c00a5a8 XL |
863 | Imagine `Rc<T>` as a TV in a family room. When one person enters to watch TV, |
864 | they turn it on. Others can come into the room and watch the TV. When the last | |
865 | person leaves the room, they turn off the TV because it’s no longer being used. | |
866 | If someone turns off the TV while others are still watching it, there would be | |
ea8adc8c XL |
867 | uproar from the remaining TV watchers! |
868 | ||
2c00a5a8 XL |
869 | We use the `Rc<T>` type when we want to allocate some data on the heap for |
870 | multiple parts of our program to read, and we can’t determine at compile time | |
871 | which part will finish using the data last. If we knew which part would finish | |
872 | last, we could just make that part the data’s owner and the normal ownership | |
873 | rules enforced at compile time would take effect. | |
cc61c64b | 874 | |
2c00a5a8 XL |
875 | Note that `Rc<T>` is only for use in single-threaded scenarios. When we discuss |
876 | concurrency in Chapter 16, we’ll cover how to do reference counting in | |
877 | multithreaded programs. | |
cc61c64b XL |
878 | |
879 | ### Using `Rc<T>` to Share Data | |
880 | ||
2c00a5a8 XL |
881 | Let’s return to our cons list example in Listing 15-5. Recall that we defined |
882 | it using `Box<T>`. This time, we’ll create two lists that both share ownership | |
883 | of a third list, which conceptually will look similar to Figure 15-3: | |
cc61c64b XL |
884 | |
885 | <img alt="Two lists that share ownership of a third list" src="img/trpl15-03.svg" class="center" /> | |
886 | ||
2c00a5a8 | 887 | Figure 15-3: Two lists, `b` and `c`, sharing ownership of a third list, `a` |
ea8adc8c | 888 | |
2c00a5a8 XL |
889 | We’ll create list `a` that contains 5 and then 10. Then we’ll make two more |
890 | lists: `b` that starts with 3 and `c` that starts with 4. Both `b` and `c` | |
891 | lists will then continue on to the first `a` list containing 5 and 10. In other | |
892 | words, both lists will share the first list containing 5 and 10. | |
cc61c64b | 893 | |
2c00a5a8 XL |
894 | Trying to implement this scenario using our definition of `List` with `Box<T>` |
895 | won’t work, as shown in Listing 15-17: | |
cc61c64b XL |
896 | |
897 | Filename: src/main.rs | |
898 | ||
899 | ``` | |
900 | enum List { | |
901 | Cons(i32, Box<List>), | |
902 | Nil, | |
903 | } | |
904 | ||
905 | use List::{Cons, Nil}; | |
906 | ||
907 | fn main() { | |
908 | let a = Cons(5, | |
909 | Box::new(Cons(10, | |
910 | Box::new(Nil)))); | |
911 | let b = Cons(3, Box::new(a)); | |
912 | let c = Cons(4, Box::new(a)); | |
913 | } | |
914 | ``` | |
915 | ||
2c00a5a8 | 916 | Listing 15-17: Demonstrating we’re not allowed to have two lists using `Box<T>` |
ea8adc8c | 917 | that try to share ownership of a third list |
cc61c64b | 918 | |
2c00a5a8 | 919 | When we compile this code, we get this error: |
cc61c64b XL |
920 | |
921 | ``` | |
922 | error[E0382]: use of moved value: `a` | |
923 | --> src/main.rs:13:30 | |
924 | | | |
925 | 12 | let b = Cons(3, Box::new(a)); | |
926 | | - value moved here | |
927 | 13 | let c = Cons(4, Box::new(a)); | |
928 | | ^ value used here after move | |
929 | | | |
2c00a5a8 XL |
930 | = note: move occurs because `a` has type `List`, which does not implement |
931 | the `Copy` trait | |
cc61c64b XL |
932 | ``` |
933 | ||
ea8adc8c XL |
934 | The `Cons` variants own the data they hold, so when we create the `b` list, `a` |
935 | is moved into `b` and `b` owns `a`. Then, when we try to use `a` again when | |
936 | creating `c`, we’re not allowed to because `a` has been moved. | |
cc61c64b XL |
937 | |
938 | We could change the definition of `Cons` to hold references instead, but then | |
2c00a5a8 XL |
939 | we would have to specify lifetime parameters. By specifying lifetime |
940 | parameters, we would be specifying that every element in the list will live at | |
941 | least as long as the entire list. The borrow checker wouldn’t let us compile | |
942 | `let a = Cons(10, &Nil);` for example, because the temporary `Nil` value would | |
943 | be dropped before `a` could take a reference to it. | |
ea8adc8c XL |
944 | |
945 | Instead, we’ll change our definition of `List` to use `Rc<T>` in place of | |
2c00a5a8 XL |
946 | `Box<T>`, as shown in Listing 15-18. Each `Cons` variant will now hold a value |
947 | and an `Rc<T>` pointing to a `List`. When we create `b`, instead of taking | |
948 | ownership of `a`, we’ll clone the `Rc<List>` that `a` is holding, which | |
949 | increases the number of references from one to two and lets `a` and `b` share | |
950 | ownership of the data in that `Rc<List>`. We’ll also clone `a` when creating | |
951 | `c`, which increases the number of references from two to three. Every time we | |
952 | call `Rc::clone`, the reference count to the data within the `Rc<List>` will | |
953 | increase, and the data won’t be cleaned up unless there are zero references to | |
954 | it: | |
cc61c64b XL |
955 | |
956 | Filename: src/main.rs | |
957 | ||
958 | ``` | |
959 | enum List { | |
960 | Cons(i32, Rc<List>), | |
961 | Nil, | |
962 | } | |
963 | ||
964 | use List::{Cons, Nil}; | |
965 | use std::rc::Rc; | |
966 | ||
967 | fn main() { | |
968 | let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil))))); | |
ea8adc8c XL |
969 | let b = Cons(3, Rc::clone(&a)); |
970 | let c = Cons(4, Rc::clone(&a)); | |
cc61c64b XL |
971 | } |
972 | ``` | |
973 | ||
2c00a5a8 | 974 | Listing 15-18: A definition of `List` that uses `Rc<T>` |
cc61c64b | 975 | |
2c00a5a8 XL |
976 | We need to add a `use` statement to bring `Rc<T>` into scope because it’s not |
977 | in the prelude. In `main`, we create the list holding 5 and 10 and store it in | |
978 | a new `Rc<List>` in `a`. Then when we create `b` and `c`, we call the | |
979 | `Rc::clone` function and pass a reference to the `Rc<List>` in `a` as an | |
980 | argument. | |
ea8adc8c | 981 | |
2c00a5a8 XL |
982 | We could have called `a.clone()` rather than `Rc::clone(&a)`, but Rust’s |
983 | convention is to use `Rc::clone` in this case. The implementation of | |
984 | `Rc::clone` doesn’t make a deep copy of all the data like most types’ | |
985 | implementations of `clone` do. The call to `Rc::clone` only increments the | |
986 | reference count, which doesn’t take much time. Deep copies of data can take a | |
987 | lot of time. By using `Rc::clone` for reference counting, we can visually | |
988 | distinguish between the deep copy kinds of clones and the kinds of clones that | |
989 | increase the reference count. When looking for performance problems in the | |
990 | code, we only need to consider the deep copy clones and can disregard calls to | |
991 | `Rc::clone`. | |
cc61c64b XL |
992 | |
993 | ### Cloning an `Rc<T>` Increases the Reference Count | |
994 | ||
2c00a5a8 XL |
995 | Let’s change our working example in Listing 15-18 so we can see the reference |
996 | counts changing as we create and drop references to the `Rc<List>` in `a`. | |
997 | ||
998 | In Listing 15-19, we’ll change `main` so it has an inner scope around list `c`; | |
999 | then we can see how the reference count changes when `c` goes out of scope. At | |
1000 | each point in the program where the reference count changes, we’ll print the | |
1001 | reference count, which we can get by calling the `Rc::strong_count` function. | |
1002 | This function is named `strong_count` rather than `count` because the `Rc<T>` | |
1003 | type also has a `weak_count`; we’ll see what `weak_count` is used for in the | |
1004 | “Preventing Reference Cycles” section. | |
cc61c64b XL |
1005 | |
1006 | Filename: src/main.rs | |
1007 | ||
1008 | ``` | |
1009 | fn main() { | |
1010 | let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil))))); | |
ea8adc8c XL |
1011 | println!("count after creating a = {}", Rc::strong_count(&a)); |
1012 | let b = Cons(3, Rc::clone(&a)); | |
1013 | println!("count after creating b = {}", Rc::strong_count(&a)); | |
cc61c64b | 1014 | { |
ea8adc8c XL |
1015 | let c = Cons(4, Rc::clone(&a)); |
1016 | println!("count after creating c = {}", Rc::strong_count(&a)); | |
cc61c64b | 1017 | } |
ea8adc8c | 1018 | println!("count after c goes out of scope = {}", Rc::strong_count(&a)); |
cc61c64b XL |
1019 | } |
1020 | ``` | |
1021 | ||
2c00a5a8 | 1022 | Listing 15-19: Printing the reference count |
cc61c64b | 1023 | |
2c00a5a8 | 1024 | This code prints the following: |
cc61c64b XL |
1025 | |
1026 | ``` | |
ea8adc8c XL |
1027 | count after creating a = 1 |
1028 | count after creating b = 2 | |
1029 | count after creating c = 3 | |
1030 | count after c goes out of scope = 2 | |
cc61c64b XL |
1031 | ``` |
1032 | ||
2c00a5a8 | 1033 | We can see that the `Rc<List>` in `a` has an initial reference count of one; |
ea8adc8c XL |
1034 | then each time we call `clone`, the count goes up by one. When `c` goes out of |
1035 | scope, the count goes down by one. We don’t have to call a function to decrease | |
1036 | the reference count like we have to call `Rc::clone` to increase the reference | |
2c00a5a8 XL |
1037 | count: the implementation of the `Drop` trait decreases the reference count |
1038 | automatically when an `Rc<T>` value goes out of scope. | |
1039 | ||
1040 | What we can’t see in this example is that when `b` and then `a` go out of scope | |
1041 | at the end of `main`, the count is then 0, and the `Rc<List>` is cleaned up | |
1042 | completely at that point. Using `Rc<T>` allows a single value to have | |
1043 | multiple owners, and the count ensures that the value remains valid as long as | |
1044 | any of the owners still exist. | |
1045 | ||
1046 | Via immutable references, `Rc<T>` allows us to share data between multiple | |
1047 | parts of our program for reading only. If `Rc<T>` allowed us to have multiple | |
1048 | mutable references too, we might violate one of the borrowing rules discussed | |
1049 | in Chapter 4: multiple mutable borrows to the same place can cause data races | |
1050 | and inconsistencies. But being able to mutate data is very useful! In the next | |
1051 | section, we’ll discuss the interior mutability pattern and the `RefCell<T>` | |
1052 | type that we can use in conjunction with an `Rc<T>` to work with this | |
1053 | immutability restriction. | |
cc61c64b XL |
1054 | |
1055 | ## `RefCell<T>` and the Interior Mutability Pattern | |
1056 | ||
2c00a5a8 XL |
1057 | *Interior mutability* is a design pattern in Rust that allows you to mutate |
1058 | data even when there are immutable references to that data: normally, this | |
1059 | action is disallowed by the borrowing rules. To do so, the pattern uses | |
1060 | `unsafe` code inside a data structure to bend Rust’s usual rules that govern | |
1061 | mutation and borrowing. We haven’t yet covered unsafe code; we will in Chapter | |
1062 | 19. We can use types that use the interior mutability pattern when we can | |
1063 | ensure that the borrowing rules will be followed at runtime, even though the | |
1064 | compiler can’t guarantee that. The `unsafe` code involved is then wrapped in a | |
1065 | safe API, and the outer type is still immutable. | |
1066 | ||
1067 | Let’s explore this concept by looking at the `RefCell<T>` type that follows the | |
cc61c64b XL |
1068 | interior mutability pattern. |
1069 | ||
ea8adc8c | 1070 | ### Enforcing Borrowing Rules at Runtime with `RefCell<T>` |
cc61c64b XL |
1071 | |
1072 | Unlike `Rc<T>`, the `RefCell<T>` type represents single ownership over the data | |
ea8adc8c | 1073 | it holds. So, what makes `RefCell<T>` different than a type like `Box<T>`? |
2c00a5a8 | 1074 | Recall the borrowing rules you learned in Chapter 4: |
cc61c64b | 1075 | |
2c00a5a8 XL |
1076 | * At any given time, you can have *either* but not both of the following: one |
1077 | mutable reference or any number of immutable references. | |
1078 | * References must always be valid. | |
cc61c64b | 1079 | |
3b2f2976 | 1080 | With references and `Box<T>`, the borrowing rules’ invariants are enforced at |
cc61c64b | 1081 | compile time. With `RefCell<T>`, these invariants are enforced *at runtime*. |
3b2f2976 | 1082 | With references, if you break these rules, you’ll get a compiler error. With |
2c00a5a8 | 1083 | `RefCell<T>`, if you break these rules, your program will `panic!` and exit. |
cc61c64b | 1084 | |
2c00a5a8 XL |
1085 | The advantages of checking the borrowing rules at compile time are that errors |
1086 | will be caught sooner in the development process, and there is no impact on | |
1087 | runtime performance because all the analysis is completed beforehand. For those | |
1088 | reasons, checking the borrowing rules at compile time is the best choice in the | |
1089 | majority of cases, which is why this is Rust’s default. | |
ea8adc8c | 1090 | |
2c00a5a8 | 1091 | The advantage of checking the borrowing rules at runtime instead is that |
ea8adc8c XL |
1092 | certain memory safe scenarios are then allowed, whereas they are disallowed by |
1093 | the compile time checks. Static analysis, like the Rust compiler, is inherently | |
1094 | conservative. Some properties of code are impossible to detect by analyzing the | |
2c00a5a8 XL |
1095 | code: the most famous example is the Halting Problem, which is beyond the scope |
1096 | of this book but is an interesting topic to research. | |
ea8adc8c XL |
1097 | |
1098 | Because some analysis is impossible, if the Rust compiler can’t be sure the | |
2c00a5a8 XL |
1099 | code complies with the ownership rules, it might reject a correct program; in |
1100 | this way, it’s conservative. If Rust accepted an incorrect program, users | |
1101 | wouldn’t be able to trust in the guarantees Rust makes. However, if Rust | |
ea8adc8c | 1102 | rejects a correct program, the programmer will be inconvenienced, but nothing |
2c00a5a8 XL |
1103 | catastrophic can occur. The `RefCell<T>` type is useful when you’re sure your |
1104 | code follows the borrowing rules, but the compiler is unable to understand and | |
1105 | guarantee that. | |
ea8adc8c | 1106 | |
2c00a5a8 XL |
1107 | Similar to `Rc<T>`, `RefCell<T>` is only for use in single-threaded scenarios |
1108 | and will give you a compile time error if you try using it in a multithreaded | |
1109 | context. We’ll talk about how to get the functionality of `RefCell<T>` in a | |
ea8adc8c XL |
1110 | multithreaded program in Chapter 16. |
1111 | ||
2c00a5a8 | 1112 | Here is a recap of the reasons to choose `Box<T>`, `Rc<T>`, or `RefCell<T>`: |
ea8adc8c | 1113 | |
2c00a5a8 | 1114 | * `Rc<T>` enables multiple owners of the same data; `Box<T>` and `RefCell<T>` |
ea8adc8c | 1115 | have single owners. |
2c00a5a8 | 1116 | * `Box<T>` allows immutable or mutable borrows checked at compile time; `Rc<T>` |
ea8adc8c XL |
1117 | only allows immutable borrows checked at compile time; `RefCell<T>` allows |
1118 | immutable or mutable borrows checked at runtime. | |
2c00a5a8 XL |
1119 | * Because `RefCell<T>` allows mutable borrows checked at runtime, we can mutate |
1120 | the value inside the `RefCell<T>` even when the `RefCell<T>` is immutable. | |
ea8adc8c | 1121 | |
2c00a5a8 XL |
1122 | Mutating the value inside an immutable value is the *interior mutability* |
1123 | pattern. Let’s look at a situation in which interior mutability is useful and | |
1124 | examine how it’s possible. | |
ea8adc8c XL |
1125 | |
1126 | ### Interior Mutability: A Mutable Borrow to an Immutable Value | |
1127 | ||
1128 | A consequence of the borrowing rules is that when we have an immutable value, | |
1129 | we can’t borrow it mutably. For example, this code won’t compile: | |
cc61c64b XL |
1130 | |
1131 | ``` | |
ea8adc8c XL |
1132 | fn main() { |
1133 | let x = 5; | |
1134 | let y = &mut x; | |
cc61c64b | 1135 | } |
ea8adc8c XL |
1136 | ``` |
1137 | ||
2c00a5a8 | 1138 | When we try to compile this code, we’ll get the following error: |
cc61c64b | 1139 | |
ea8adc8c XL |
1140 | ``` |
1141 | error[E0596]: cannot borrow immutable local variable `x` as mutable | |
1142 | --> src/main.rs:3:18 | |
1143 | | | |
1144 | 2 | let x = 5; | |
1145 | | - consider changing this to `mut x` | |
1146 | 3 | let y = &mut x; | |
1147 | | ^ cannot borrow mutably | |
1148 | ``` | |
1149 | ||
2c00a5a8 XL |
1150 | However, there are situations in which it would be useful for a value to mutate |
1151 | itself in its methods, but to other code, the value would appear immutable. | |
1152 | Code outside the value’s methods would not be able to mutate the value. Using | |
1153 | `RefCell<T>` is one way to get the ability to have interior mutability. But | |
1154 | `RefCell<T>` doesn’t get around the borrowing rules completely: the borrow | |
1155 | checker in the compiler allows this interior mutability, and the borrowing | |
1156 | rules are checked at runtime instead. If we violate the rules, we’ll get a | |
1157 | `panic!` instead of a compiler error. | |
ea8adc8c | 1158 | |
2c00a5a8 XL |
1159 | Let’s work through a practical example where we can use `RefCell<T>` to mutate |
1160 | an immutable value and see why that is useful. | |
ea8adc8c XL |
1161 | |
1162 | #### A Use Case for Interior Mutability: Mock Objects | |
1163 | ||
2c00a5a8 XL |
1164 | A *test double* is the general programming concept for a type used in place of |
1165 | another type during testing. *Mock objects* are specific types of test doubles | |
1166 | that record what happens during a test so we can assert that the correct | |
1167 | actions took place. | |
ea8adc8c | 1168 | |
2c00a5a8 XL |
1169 | Rust doesn’t have objects in the same sense as other languages have objects, |
1170 | and Rust doesn’t have mock object functionality built into the standard library | |
1171 | like some other languages do. However, we can definitely create a struct that | |
1172 | will serve the same purposes as a mock object. | |
ea8adc8c | 1173 | |
2c00a5a8 XL |
1174 | Here’s the scenario we’ll test: we’ll create a library that tracks a value |
1175 | against a maximum value and sends messages based on how close to the maximum | |
1176 | value the current value is. This library could be used for keeping track of a | |
ea8adc8c XL |
1177 | user’s quota for the number of API calls they’re allowed to make, for example. |
1178 | ||
2c00a5a8 XL |
1179 | Our library will only provide the functionality of tracking how close to the |
1180 | maximum a value is and what the messages should be at what times. Applications | |
1181 | that use our library will be expected to provide the mechanism for sending the | |
1182 | messages: the application could put a message in the application, send an | |
1183 | email, send a text message, or something else. The library doesn’t need to know | |
1184 | that detail. All it needs is something that implements a trait we’ll provide | |
1185 | called `Messenger`. Listing 15-20 shows the library code: | |
ea8adc8c XL |
1186 | |
1187 | Filename: src/lib.rs | |
1188 | ||
1189 | ``` | |
1190 | pub trait Messenger { | |
1191 | fn send(&self, msg: &str); | |
cc61c64b XL |
1192 | } |
1193 | ||
ea8adc8c XL |
1194 | pub struct LimitTracker<'a, T: 'a + Messenger> { |
1195 | messenger: &'a T, | |
1196 | value: usize, | |
1197 | max: usize, | |
cc61c64b XL |
1198 | } |
1199 | ||
ea8adc8c XL |
1200 | impl<'a, T> LimitTracker<'a, T> |
1201 | where T: Messenger { | |
1202 | pub fn new(messenger: &T, max: usize) -> LimitTracker<T> { | |
1203 | LimitTracker { | |
1204 | messenger, | |
1205 | value: 0, | |
1206 | max, | |
1207 | } | |
1208 | } | |
1209 | ||
1210 | pub fn set_value(&mut self, value: usize) { | |
1211 | self.value = value; | |
1212 | ||
1213 | let percentage_of_max = self.value as f64 / self.max as f64; | |
1214 | ||
1215 | if percentage_of_max >= 0.75 && percentage_of_max < 0.9 { | |
2c00a5a8 XL |
1216 | self.messenger.send("Warning: You've used up over 75% of your |
1217 | quota!"); | |
ea8adc8c | 1218 | } else if percentage_of_max >= 0.9 && percentage_of_max < 1.0 { |
2c00a5a8 XL |
1219 | self.messenger.send("Urgent warning: You've used up over 90% of |
1220 | your quota!"); | |
ea8adc8c XL |
1221 | } else if percentage_of_max >= 1.0 { |
1222 | self.messenger.send("Error: You are over your quota!"); | |
1223 | } | |
1224 | } | |
cc61c64b XL |
1225 | } |
1226 | ``` | |
1227 | ||
2c00a5a8 XL |
1228 | Listing 15-20: A library to keep track of how close to a maximum value a value |
1229 | is and warn when the value is at certain levels | |
cc61c64b | 1230 | |
2c00a5a8 XL |
1231 | One important part of this code is that the `Messenger` trait has one method |
1232 | called `send` that takes an immutable reference to `self` and text of the | |
1233 | message. This is the interface our mock object needs to have. The other | |
1234 | important part is that we want to test the behavior of the `set_value` method | |
1235 | on the `LimitTracker`. We can change what we pass in for the `value` parameter, | |
1236 | but `set_value` doesn’t return anything for us to make assertions on. We want | |
1237 | to be able to say that if we create a `LimitTracker` with something that | |
ea8adc8c | 1238 | implements the `Messenger` trait and a particular value for `max`, when we pass |
2c00a5a8 | 1239 | different numbers for `value`, the messenger is told to send the appropriate |
ea8adc8c XL |
1240 | messages. |
1241 | ||
2c00a5a8 XL |
1242 | We need a mock object that instead of sending an email or text message when we |
1243 | call `send` will only keep track of the messages it’s told to send. We can | |
1244 | create a new instance of the mock object, create a `LimitTracker` that uses the | |
1245 | mock object, call the `set_value` method on `LimitTracker`, and then check that | |
1246 | the mock object has the messages we expect. Listing 15-21 shows an attempt of | |
1247 | implementing a mock object to do just that but that the borrow checker won’t | |
1248 | allow: | |
ea8adc8c XL |
1249 | |
1250 | Filename: src/lib.rs | |
cc61c64b XL |
1251 | |
1252 | ``` | |
ea8adc8c XL |
1253 | #[cfg(test)] |
1254 | mod tests { | |
1255 | use super::*; | |
cc61c64b | 1256 | |
ea8adc8c XL |
1257 | struct MockMessenger { |
1258 | sent_messages: Vec<String>, | |
1259 | } | |
cc61c64b | 1260 | |
ea8adc8c XL |
1261 | impl MockMessenger { |
1262 | fn new() -> MockMessenger { | |
1263 | MockMessenger { sent_messages: vec![] } | |
1264 | } | |
1265 | } | |
cc61c64b | 1266 | |
ea8adc8c XL |
1267 | impl Messenger for MockMessenger { |
1268 | fn send(&self, message: &str) { | |
1269 | self.sent_messages.push(String::from(message)); | |
1270 | } | |
1271 | } | |
cc61c64b | 1272 | |
ea8adc8c XL |
1273 | #[test] |
1274 | fn it_sends_an_over_75_percent_warning_message() { | |
1275 | let mock_messenger = MockMessenger::new(); | |
1276 | let mut limit_tracker = LimitTracker::new(&mock_messenger, 100); | |
cc61c64b | 1277 | |
ea8adc8c | 1278 | limit_tracker.set_value(80); |
cc61c64b | 1279 | |
ea8adc8c XL |
1280 | assert_eq!(mock_messenger.sent_messages.len(), 1); |
1281 | } | |
1282 | } | |
cc61c64b XL |
1283 | ``` |
1284 | ||
2c00a5a8 | 1285 | Listing 15-21: An attempt to implement a `MockMessenger` that isn’t allowed by |
ea8adc8c XL |
1286 | the borrow checker |
1287 | ||
1288 | This test code defines a `MockMessenger` struct that has a `sent_messages` | |
1289 | field with a `Vec` of `String` values to keep track of the messages it’s told | |
2c00a5a8 | 1290 | to send. We also define an associated function `new` to make it convenient to |
ea8adc8c | 1291 | create new `MockMessenger` values that start with an empty list of messages. We |
2c00a5a8 | 1292 | then implement the `Messenger` trait for `MockMessenger` so we can give a |
ea8adc8c XL |
1293 | `MockMessenger` to a `LimitTracker`. In the definition of the `send` method, we |
1294 | take the message passed in as a parameter and store it in the `MockMessenger` | |
1295 | list of `sent_messages`. | |
1296 | ||
1297 | In the test, we’re testing what happens when the `LimitTracker` is told to set | |
2c00a5a8 XL |
1298 | `value` to something that is more than 75 percent of the `max` value. First, we |
1299 | create a new `MockMessenger`, which will start with an empty list of messages. | |
1300 | Then we create a new `LimitTracker` and give it a reference to the new | |
1301 | `MockMessenger` and a `max` value of 100. We call the `set_value` method on the | |
1302 | `LimitTracker` with a value of 80, which is more than 75 percent of 100. Then | |
1303 | we assert that the list of messages that the `MockMessenger` is keeping track | |
1304 | of should now have one message in it. | |
ea8adc8c | 1305 | |
2c00a5a8 | 1306 | However, there’s one problem with this test, as shown here: |
cc61c64b XL |
1307 | |
1308 | ``` | |
ea8adc8c | 1309 | error[E0596]: cannot borrow immutable field `self.sent_messages` as mutable |
2c00a5a8 | 1310 | --> src/lib.rs:52:13 |
ea8adc8c | 1311 | | |
2c00a5a8 | 1312 | 51 | fn send(&self, message: &str) { |
ea8adc8c | 1313 | | ----- use `&mut self` here to make mutable |
2c00a5a8 | 1314 | 52 | self.sent_messages.push(String::from(message)); |
ea8adc8c | 1315 | | ^^^^^^^^^^^^^^^^^^ cannot mutably borrow immutable field |
cc61c64b XL |
1316 | ``` |
1317 | ||
ea8adc8c XL |
1318 | We can’t modify the `MockMessenger` to keep track of the messages because the |
1319 | `send` method takes an immutable reference to `self`. We also can’t take the | |
1320 | suggestion from the error text to use `&mut self` instead because then the | |
1321 | signature of `send` wouldn’t match the signature in the `Messenger` trait | |
1322 | definition (feel free to try and see what error message you get). | |
1323 | ||
2c00a5a8 XL |
1324 | This is a situation in which interior mutability can help! We’ll store the |
1325 | `sent_messages` within a `RefCell<T>`, and then the `send` message will be | |
1326 | able to modify `sent_messages` to store the messages we’ve seen. Listing 15-22 | |
1327 | shows what that looks like: | |
ea8adc8c XL |
1328 | |
1329 | Filename: src/lib.rs | |
cc61c64b XL |
1330 | |
1331 | ``` | |
ea8adc8c XL |
1332 | #[cfg(test)] |
1333 | mod tests { | |
1334 | use super::*; | |
1335 | use std::cell::RefCell; | |
cc61c64b | 1336 | |
ea8adc8c XL |
1337 | struct MockMessenger { |
1338 | sent_messages: RefCell<Vec<String>>, | |
1339 | } | |
1340 | ||
1341 | impl MockMessenger { | |
1342 | fn new() -> MockMessenger { | |
1343 | MockMessenger { sent_messages: RefCell::new(vec![]) } | |
1344 | } | |
1345 | } | |
1346 | ||
1347 | impl Messenger for MockMessenger { | |
1348 | fn send(&self, message: &str) { | |
1349 | self.sent_messages.borrow_mut().push(String::from(message)); | |
1350 | } | |
1351 | } | |
1352 | ||
1353 | #[test] | |
1354 | fn it_sends_an_over_75_percent_warning_message() { | |
ff7c6d11 | 1355 | // --snip-- |
ea8adc8c XL |
1356 | |
1357 | assert_eq!(mock_messenger.sent_messages.borrow().len(), 1); | |
1358 | } | |
1359 | } | |
1360 | ``` | |
1361 | ||
2c00a5a8 XL |
1362 | Listing 15-22: Using `RefCell<T>` to mutate an inner value while the outer |
1363 | value is considered immutable | |
ea8adc8c XL |
1364 | |
1365 | The `sent_messages` field is now of type `RefCell<Vec<String>>` instead of | |
2c00a5a8 XL |
1366 | `Vec<String>`. In the `new` function, we create a new `RefCell<Vec<String>>` |
1367 | instance around the empty vector. | |
ea8adc8c XL |
1368 | |
1369 | For the implementation of the `send` method, the first parameter is still an | |
1370 | immutable borrow of `self`, which matches the trait definition. We call | |
2c00a5a8 XL |
1371 | `borrow_mut` on the `RefCell<Vec<String>>` in `self.sent_messages` to get a |
1372 | mutable reference to the value inside the `RefCell<Vec<String>>`, which is | |
1373 | the vector. Then we can call `push` on the mutable reference to the vector to | |
1374 | keep track of the messages sent during the test. | |
ea8adc8c | 1375 | |
2c00a5a8 XL |
1376 | The last change we have to make is in the assertion: to see how many items are |
1377 | in the inner vector, we call `borrow` on the `RefCell<Vec<String>>` to get an | |
ea8adc8c XL |
1378 | immutable reference to the vector. |
1379 | ||
2c00a5a8 | 1380 | Now that you’ve seen how to use `RefCell<T>`, let’s dig into how it works! |
cc61c64b | 1381 | |
ea8adc8c XL |
1382 | #### `RefCell<T>` Keeps Track of Borrows at Runtime |
1383 | ||
2c00a5a8 | 1384 | When creating immutable and mutable references, we use the `&` and `&mut` |
ea8adc8c XL |
1385 | syntax, respectively. With `RefCell<T>`, we use the `borrow` and `borrow_mut` |
1386 | methods, which are part of the safe API that belongs to `RefCell<T>`. The | |
2c00a5a8 XL |
1387 | `borrow` method returns the smart pointer type `Ref<T>`, and `borrow_mut` |
1388 | returns the smart pointer type `RefMut<T>`. Both types implement `Deref` so | |
1389 | we can treat them like regular references. | |
ea8adc8c | 1390 | |
2c00a5a8 XL |
1391 | The `RefCell<T>` keeps track of how many `Ref<T>` and `RefMut<T>` smart |
1392 | pointers are currently active. Every time we call `borrow`, the `RefCell<T>` | |
1393 | increases its count of how many immutable borrows are active. When a `Ref<T>` | |
1394 | value goes out of scope, the count of immutable borrows goes down by one. Just | |
1395 | like the compile time borrowing rules, `RefCell<T>` lets us have many immutable | |
1396 | borrows or one mutable borrow at any point in time. | |
ea8adc8c XL |
1397 | |
1398 | If we try to violate these rules, rather than getting a compiler error like we | |
1399 | would with references, the implementation of `RefCell<T>` will `panic!` at | |
2c00a5a8 XL |
1400 | runtime. Listing 15-23 shows a modification of the implementation of `send` in |
1401 | Listing 15-22. We’re deliberately trying to create two mutable borrows active | |
1402 | for the same scope to illustrate that `RefCell<T>` prevents us from doing this | |
1403 | at runtime: | |
ea8adc8c XL |
1404 | |
1405 | Filename: src/lib.rs | |
1406 | ||
1407 | ``` | |
1408 | impl Messenger for MockMessenger { | |
1409 | fn send(&self, message: &str) { | |
1410 | let mut one_borrow = self.sent_messages.borrow_mut(); | |
1411 | let mut two_borrow = self.sent_messages.borrow_mut(); | |
1412 | ||
1413 | one_borrow.push(String::from(message)); | |
1414 | two_borrow.push(String::from(message)); | |
1415 | } | |
cc61c64b XL |
1416 | } |
1417 | ``` | |
1418 | ||
2c00a5a8 | 1419 | Listing 15-23: Creating two mutable references in the same scope to see that |
ea8adc8c XL |
1420 | `RefCell<T>` will panic |
1421 | ||
2c00a5a8 XL |
1422 | We create a variable `one_borrow` for the `RefMut<T>` smart pointer returned |
1423 | from `borrow_mut`. Then we create another mutable borrow in the same way in the | |
ea8adc8c | 1424 | variable `two_borrow`. This makes two mutable references in the same scope, |
2c00a5a8 XL |
1425 | which isn’t allowed. When we run the tests for our library, the code in Listing |
1426 | 15-23 will compile without any errors, but the test will fail: | |
cc61c64b XL |
1427 | |
1428 | ``` | |
ea8adc8c XL |
1429 | ---- tests::it_sends_an_over_75_percent_warning_message stdout ---- |
1430 | thread 'tests::it_sends_an_over_75_percent_warning_message' panicked at | |
1431 | 'already borrowed: BorrowMutError', src/libcore/result.rs:906:4 | |
cc61c64b XL |
1432 | note: Run with `RUST_BACKTRACE=1` for a backtrace. |
1433 | ``` | |
1434 | ||
2c00a5a8 | 1435 | Notice that the code panicked with the message `already borrowed: |
ea8adc8c XL |
1436 | BorrowMutError`. This is how `RefCell<T>` handles violations of the borrowing |
1437 | rules at runtime. | |
1438 | ||
2c00a5a8 XL |
1439 | Catching borrowing errors at runtime rather than compile time means that we |
1440 | would find a mistake in our code later in the development process and possibly | |
1441 | not even until our code was deployed to production. Also, our code will incur a | |
1442 | small runtime performance penalty as a result of keeping track of the borrows | |
1443 | at runtime rather than compile time. However, using `RefCell<T>` makes it | |
1444 | possible for us to write a mock object that can modify itself to keep track of | |
1445 | the messages it has seen while we’re using it in a context where only immutable | |
1446 | values are allowed. We can use `RefCell<T>` despite its trade-offs to get more | |
1447 | functionality than regular references give us. | |
ea8adc8c XL |
1448 | |
1449 | ### Having Multiple Owners of Mutable Data by Combining `Rc<T>` and `RefCell<T>` | |
1450 | ||
1451 | A common way to use `RefCell<T>` is in combination with `Rc<T>`. Recall that | |
1452 | `Rc<T>` lets us have multiple owners of some data, but it only gives us | |
1453 | immutable access to that data. If we have an `Rc<T>` that holds a `RefCell<T>`, | |
2c00a5a8 | 1454 | we can get a value that can have multiple owners *and* that we can mutate! |
ea8adc8c | 1455 | |
2c00a5a8 | 1456 | For example, recall the cons list example in Listing 15-18 where we used |
ea8adc8c | 1457 | `Rc<T>` to let us have multiple lists share ownership of another list. Because |
2c00a5a8 XL |
1458 | `Rc<T>` only holds immutable values, we can’t change any of the values in the |
1459 | list once we’ve created them. Let’s add in `RefCell<T>` to gain the ability to | |
1460 | change the values in the lists. Listing 15-24 shows that by using a | |
1461 | `RefCell<T>` in the `Cons` definition, we can modify the value stored in all | |
1462 | the lists: | |
cc61c64b XL |
1463 | |
1464 | Filename: src/main.rs | |
1465 | ||
1466 | ``` | |
1467 | #[derive(Debug)] | |
1468 | enum List { | |
1469 | Cons(Rc<RefCell<i32>>, Rc<List>), | |
1470 | Nil, | |
1471 | } | |
1472 | ||
1473 | use List::{Cons, Nil}; | |
1474 | use std::rc::Rc; | |
1475 | use std::cell::RefCell; | |
1476 | ||
1477 | fn main() { | |
1478 | let value = Rc::new(RefCell::new(5)); | |
1479 | ||
ea8adc8c | 1480 | let a = Rc::new(Cons(Rc::clone(&value), Rc::new(Nil))); |
cc61c64b | 1481 | |
ea8adc8c XL |
1482 | let b = Cons(Rc::new(RefCell::new(6)), Rc::clone(&a)); |
1483 | let c = Cons(Rc::new(RefCell::new(10)), Rc::clone(&a)); | |
cc61c64b XL |
1484 | |
1485 | *value.borrow_mut() += 10; | |
1486 | ||
ea8adc8c | 1487 | println!("a after = {:?}", a); |
cc61c64b XL |
1488 | println!("b after = {:?}", b); |
1489 | println!("c after = {:?}", c); | |
1490 | } | |
1491 | ``` | |
1492 | ||
2c00a5a8 | 1493 | Listing 15-24: Using `Rc<RefCell<i32>>` to create a `List` that we can mutate |
ea8adc8c | 1494 | |
2c00a5a8 | 1495 | We create a value that is an instance of `Rc<RefCell<i32>` and store it in a |
ea8adc8c XL |
1496 | variable named `value` so we can access it directly later. Then we create a |
1497 | `List` in `a` with a `Cons` variant that holds `value`. We need to clone | |
2c00a5a8 XL |
1498 | `value` so both `a` and `value` have ownership of the inner `5` value rather |
1499 | than transferring ownership from `value` to `a` or having `a` borrow from | |
1500 | `value`. | |
cc61c64b | 1501 | |
2c00a5a8 XL |
1502 | We wrap the list `a` in an `Rc<T>` so when we create lists `b` and `c`, they |
1503 | can both refer to `a`, which is what we did in Listing 15-18. | |
cc61c64b | 1504 | |
2c00a5a8 | 1505 | After we’ve created the lists in `a`, `b`, and `c`, we add 10 to the value in |
ea8adc8c | 1506 | `value`. We do this by calling `borrow_mut` on `value`, which uses the |
2c00a5a8 XL |
1507 | automatic dereferencing feature we discussed in Chapter 5 (see the section |
1508 | “Where’s the `->` Operator?”) to dereference the `Rc<T>` to the inner | |
1509 | `RefCell<T>` value. The `borrow_mut` method returns a `RefMut<T>` smart | |
1510 | pointer, and we use the dereference operator on it and change the inner value. | |
ea8adc8c | 1511 | |
2c00a5a8 | 1512 | When we print `a`, `b`, and `c`, we can see that they all have the modified |
ea8adc8c | 1513 | value of 15 rather than 5: |
cc61c64b XL |
1514 | |
1515 | ``` | |
ea8adc8c | 1516 | a after = Cons(RefCell { value: 15 }, Nil) |
cc61c64b XL |
1517 | b after = Cons(RefCell { value: 6 }, Cons(RefCell { value: 15 }, Nil)) |
1518 | c after = Cons(RefCell { value: 10 }, Cons(RefCell { value: 15 }, Nil)) | |
1519 | ``` | |
1520 | ||
2c00a5a8 XL |
1521 | This technique is pretty neat! By using `RefCell<T>`, we have an outwardly |
1522 | immutable `List`. But we can use the methods on `RefCell<T>` that provide | |
1523 | access to its interior mutability so we can modify our data when we need to. | |
1524 | The runtime checks of the borrowing rules protect us from data races, and it’s | |
1525 | sometimes worth trading a bit of speed for this flexibility in our data | |
1526 | structures. | |
ea8adc8c | 1527 | |
2c00a5a8 XL |
1528 | The standard library has other types that provide interior mutability, such as |
1529 | `Cell<T>`, which is similar except that instead of giving references to the | |
1530 | inner value, the value is copied in and out of the `Cell<T>`. There’s also | |
1531 | `Mutex<T>`, which offers interior mutability that’s safe to use across threads; | |
1532 | we’ll discuss its use in Chapter 16. Check out the standard library docs for | |
1533 | more details on the differences between these types. | |
ea8adc8c XL |
1534 | |
1535 | ## Reference Cycles Can Leak Memory | |
1536 | ||
2c00a5a8 XL |
1537 | Rust’s memory safety guarantees make it *difficult* but not impossible to |
1538 | accidentally create memory that is never cleaned up (known as a *memory leak*). | |
1539 | Preventing memory leaks entirely is not one of Rust’s guarantees in the same | |
ea8adc8c | 1540 | way that disallowing data races at compile time is, meaning memory leaks are |
2c00a5a8 XL |
1541 | memory safe in Rust. We can see that Rust allows memory leaks by using `Rc<T>` |
1542 | and `RefCell<T>`: it’s possible to create references where items refer to each | |
1543 | other in a cycle. This creates memory leaks because the reference count of each | |
1544 | item in the cycle will never reach 0, and the values will never be dropped. | |
ea8adc8c XL |
1545 | |
1546 | ### Creating a Reference Cycle | |
1547 | ||
2c00a5a8 | 1548 | Let’s look at how a reference cycle might happen and how to prevent it, |
ea8adc8c | 1549 | starting with the definition of the `List` enum and a `tail` method in Listing |
2c00a5a8 | 1550 | 15-25: |
cc61c64b XL |
1551 | |
1552 | Filename: src/main.rs | |
1553 | ||
1554 | ``` | |
ea8adc8c XL |
1555 | use std::rc::Rc; |
1556 | use std::cell::RefCell; | |
1557 | use List::{Cons, Nil}; | |
1558 | ||
cc61c64b XL |
1559 | #[derive(Debug)] |
1560 | enum List { | |
1561 | Cons(i32, RefCell<Rc<List>>), | |
1562 | Nil, | |
1563 | } | |
1564 | ||
1565 | impl List { | |
1566 | fn tail(&self) -> Option<&RefCell<Rc<List>>> { | |
1567 | match *self { | |
1568 | Cons(_, ref item) => Some(item), | |
1569 | Nil => None, | |
1570 | } | |
1571 | } | |
1572 | } | |
1573 | ``` | |
1574 | ||
2c00a5a8 | 1575 | Listing 15-25: A cons list definition that holds a `RefCell<T>` so we can |
cc61c64b XL |
1576 | modify what a `Cons` variant is referring to |
1577 | ||
2c00a5a8 | 1578 | We’re using another variation of the `List` definition in Listing 15-5. The |
ea8adc8c XL |
1579 | second element in the `Cons` variant is now `RefCell<Rc<List>>`, meaning that |
1580 | instead of having the ability to modify the `i32` value like we did in Listing | |
2c00a5a8 XL |
1581 | 15-24, we want to modify which `List` a `Cons` variant is pointing to. We’re |
1582 | also adding a `tail` method to make it convenient for us to access the second | |
1583 | item if we have a `Cons` variant. | |
1584 | ||
1585 | In Listing 15-26, we’re adding a `main` function that uses the definitions in | |
1586 | Listing 15-25. This code creates a list in `a` and a list in `b` that points to | |
ea8adc8c XL |
1587 | the list in `a`, and then modifies the list in `a` to point to `b`, which |
1588 | creates a reference cycle. There are `println!` statements along the way to | |
2c00a5a8 | 1589 | show what the reference counts are at various points in this process: |
cc61c64b XL |
1590 | |
1591 | Filename: src/main.rs | |
1592 | ||
1593 | ``` | |
cc61c64b | 1594 | fn main() { |
cc61c64b XL |
1595 | let a = Rc::new(Cons(5, RefCell::new(Rc::new(Nil)))); |
1596 | ||
1597 | println!("a initial rc count = {}", Rc::strong_count(&a)); | |
1598 | println!("a next item = {:?}", a.tail()); | |
1599 | ||
ea8adc8c | 1600 | let b = Rc::new(Cons(10, RefCell::new(Rc::clone(&a)))); |
cc61c64b XL |
1601 | |
1602 | println!("a rc count after b creation = {}", Rc::strong_count(&a)); | |
1603 | println!("b initial rc count = {}", Rc::strong_count(&b)); | |
1604 | println!("b next item = {:?}", b.tail()); | |
1605 | ||
2c00a5a8 | 1606 | if let Some(link) = a.tail() { |
ea8adc8c | 1607 | *link.borrow_mut() = Rc::clone(&b); |
cc61c64b XL |
1608 | } |
1609 | ||
1610 | println!("b rc count after changing a = {}", Rc::strong_count(&b)); | |
1611 | println!("a rc count after changing a = {}", Rc::strong_count(&a)); | |
1612 | ||
1613 | // Uncomment the next line to see that we have a cycle; it will | |
1614 | // overflow the stack | |
1615 | // println!("a next item = {:?}", a.tail()); | |
1616 | } | |
1617 | ``` | |
1618 | ||
2c00a5a8 | 1619 | Listing 15-26: Creating a reference cycle of two `List` values pointing to each |
ea8adc8c XL |
1620 | other |
1621 | ||
2c00a5a8 XL |
1622 | We create an `Rc<List>` instance holding a `List` value in the variable `a` |
1623 | with an initial list of `5, Nil`. We then create an `Rc<List>` instance | |
1624 | holding another `List` value in the variable `b` that contains the value 10 and | |
1625 | then points to the list in `a`. | |
ea8adc8c | 1626 | |
2c00a5a8 XL |
1627 | We modify `a` so it points to `b` instead of `Nil`, which creates a cycle. We |
1628 | do that by using the `tail` method to get a reference to the | |
1629 | `RefCell<Rc<List>>` in `a`, which we put in the variable `link`. Then we use | |
1630 | the `borrow_mut` method on the `RefCell<Rc<List>>` to change the value inside | |
1631 | from an `Rc<List>` that holds a `Nil` value to the `Rc<List>` in `b`. | |
ea8adc8c | 1632 | |
2c00a5a8 XL |
1633 | When we run this code, keeping the last `println!` commented out for the |
1634 | moment, we’ll get this output: | |
ea8adc8c XL |
1635 | |
1636 | ``` | |
1637 | a initial rc count = 1 | |
1638 | a next item = Some(RefCell { value: Nil }) | |
1639 | a rc count after b creation = 2 | |
1640 | b initial rc count = 1 | |
1641 | b next item = Some(RefCell { value: Cons(5, RefCell { value: Nil }) }) | |
1642 | b rc count after changing a = 2 | |
1643 | a rc count after changing a = 2 | |
1644 | ``` | |
1645 | ||
2c00a5a8 XL |
1646 | The reference count of the `Rc<List>` instances in both `a` and `b` are 2 |
1647 | after we change the list in `a` to point to `b`. At the end of `main`, Rust | |
1648 | will try to drop `b` first, which will decrease the count in each of the | |
1649 | `Rc<List>` instances in `a` and `b` by one. | |
ea8adc8c | 1650 | |
2c00a5a8 XL |
1651 | However, because `a` is still referencing the `Rc<List>` that was in `b`, |
1652 | that `Rc<List>` has a count of 1 rather than 0, so the memory the | |
1653 | `Rc<List>` has on the heap won’t be dropped. The memory will just sit there | |
1654 | with a count of one, forever. To visualize this reference cycle, we’ve created | |
1655 | a diagram in Figure 15-4: | |
cc61c64b XL |
1656 | |
1657 | <img alt="Reference cycle of lists" src="img/trpl15-04.svg" class="center" /> | |
1658 | ||
2c00a5a8 | 1659 | Figure 15-4: A reference cycle of lists `a` and `b` pointing to each other |
ea8adc8c | 1660 | |
2c00a5a8 XL |
1661 | If you uncomment the last `println!` and run the program, Rust will try to |
1662 | print this cycle with `a` pointing to `b` pointing to `a` and so forth until it | |
1663 | overflows the stack. | |
ea8adc8c | 1664 | |
2c00a5a8 XL |
1665 | In this case, right after we create the reference cycle, the program ends. The |
1666 | consequences of this cycle aren’t very dire. If a more complex program | |
ea8adc8c | 1667 | allocates lots of memory in a cycle and holds onto it for a long time, the |
2c00a5a8 XL |
1668 | program would use more memory than it needs and might overwhelm the system, |
1669 | causing it to run out of available memory. | |
ea8adc8c XL |
1670 | |
1671 | Creating reference cycles is not easily done, but it’s not impossible either. | |
1672 | If you have `RefCell<T>` values that contain `Rc<T>` values or similar nested | |
2c00a5a8 XL |
1673 | combinations of types with interior mutability and reference counting, you must |
1674 | ensure that you don’t create cycles; you can’t rely on Rust to catch them. | |
1675 | Creating a reference cycle would be a logic bug in your program that you should | |
1676 | use automated tests, code reviews, and other software development practices to | |
1677 | minimize. | |
1678 | ||
1679 | Another solution for avoiding reference cycles is reorganizing your data | |
1680 | structures so that some references express ownership and some references don’t. | |
1681 | As a result, you can have cycles made up of some ownership relationships and | |
1682 | some non-ownership relationships, and only the ownership relationships affect | |
1683 | whether or not a value can be dropped. In Listing 15-25, we always want `Cons` | |
1684 | variants to own their list, so reorganizing the data structure isn’t possible. | |
1685 | Let’s look at an example using graphs made up of parent nodes and child nodes | |
1686 | to see when non-ownership relationships are an appropriate way to prevent | |
1687 | reference cycles. | |
ea8adc8c XL |
1688 | |
1689 | ### Preventing Reference Cycles: Turn an `Rc<T>` into a `Weak<T>` | |
1690 | ||
2c00a5a8 XL |
1691 | So far, we’ve demonstrated that calling `Rc::clone` increases the |
1692 | `strong_count` of an `Rc<T>` instance, and an `Rc<T>` instance is only | |
1693 | cleaned up if its `strong_count` is 0. We can also create a *weak reference* to | |
1694 | the value within an `Rc<T>` instance by calling `Rc::downgrade` and passing a | |
1695 | reference to the `Rc<T>`. When we call `Rc::downgrade`, we get a smart | |
1696 | pointer of type `Weak<T>`. Instead of increasing the `strong_count` in the | |
1697 | `Rc<T>` instance by one, calling `Rc::downgrade` increases the `weak_count` | |
1698 | by one. The `Rc<T>` type uses `weak_count` to keep track of how many | |
1699 | `Weak<T>` references exist, similar to `strong_count`. The difference is the | |
1700 | `weak_count` doesn’t need to be 0 for the `Rc<T>` instance to be cleaned up. | |
1701 | ||
1702 | Strong references are how we can share ownership of an `Rc<T>` instance. Weak | |
ea8adc8c | 1703 | references don’t express an ownership relationship. They won’t cause a |
2c00a5a8 | 1704 | reference cycle because any cycle involving some weak references will be broken |
ea8adc8c XL |
1705 | once the strong reference count of values involved is 0. |
1706 | ||
2c00a5a8 XL |
1707 | Because the value that `Weak<T>` references might have been dropped, to do |
1708 | anything with the value that a `Weak<T>` is pointing to, we must make sure the | |
1709 | value still exists. We do this by calling the `upgrade` method on a `Weak<T>` | |
1710 | instance, which will return an `Option<Rc<T>>`. We’ll get a result of `Some` if | |
1711 | the `Rc<T>` value has not been dropped yet and a result of `None` if the | |
1712 | `Rc<T>` value has been dropped. Because `upgrade` returns an `Option<T>`, Rust | |
1713 | will ensure that we handle the `Some` case and the `None` case, and there won’t | |
1714 | be an invalid pointer. | |
ea8adc8c XL |
1715 | |
1716 | As an example, rather than using a list whose items know only about the next | |
1717 | item, we’ll create a tree whose items know about their children items *and* | |
1718 | their parent items. | |
1719 | ||
1720 | #### Creating a Tree Data Structure: a `Node` with Child Nodes | |
1721 | ||
2c00a5a8 XL |
1722 | To start, we’ll build a tree with nodes that know about their child nodes. |
1723 | We’ll create a struct named `Node` that holds its own `i32` value as well as | |
1724 | references to its children `Node` values: | |
cc61c64b XL |
1725 | |
1726 | Filename: src/main.rs | |
1727 | ||
1728 | ``` | |
1729 | use std::rc::Rc; | |
1730 | use std::cell::RefCell; | |
1731 | ||
1732 | #[derive(Debug)] | |
1733 | struct Node { | |
1734 | value: i32, | |
1735 | children: RefCell<Vec<Rc<Node>>>, | |
1736 | } | |
1737 | ``` | |
1738 | ||
2c00a5a8 XL |
1739 | We want a `Node` to own its children, and we want to share that ownership with |
1740 | variables so we can access each `Node` in the tree directly. To do this, we | |
1741 | define the `Vec<T>` items to be values of type `Rc<Node>`. We also want to | |
1742 | modify which nodes are children of another node, so we have a `RefCell<T>` in | |
1743 | `children` around the `Vec<Rc<Node>>`. | |
ea8adc8c | 1744 | |
2c00a5a8 | 1745 | Next, we’ll use our struct definition and create one `Node` instance named |
ea8adc8c | 1746 | `leaf` with the value 3 and no children, and another instance named `branch` |
2c00a5a8 | 1747 | with the value 5 and `leaf` as one of its children, as shown in Listing 15-27: |
cc61c64b XL |
1748 | |
1749 | Filename: src/main.rs | |
1750 | ||
1751 | ``` | |
1752 | fn main() { | |
1753 | let leaf = Rc::new(Node { | |
1754 | value: 3, | |
1755 | children: RefCell::new(vec![]), | |
1756 | }); | |
1757 | ||
1758 | let branch = Rc::new(Node { | |
1759 | value: 5, | |
ea8adc8c | 1760 | children: RefCell::new(vec![Rc::clone(&leaf)]), |
cc61c64b XL |
1761 | }); |
1762 | } | |
1763 | ``` | |
1764 | ||
2c00a5a8 | 1765 | Listing 15-27: Creating a `leaf` node with no children and a `branch` node with |
ea8adc8c XL |
1766 | `leaf` as one of its children |
1767 | ||
2c00a5a8 XL |
1768 | We clone the `Rc<Node>` in `leaf` and store that in `branch`, meaning the |
1769 | `Node` in `leaf` now has two owners: `leaf` and `branch`. We can get from | |
1770 | `branch` to `leaf` through `branch.children`, but there’s no way to get from | |
1771 | `leaf` to `branch`. The reason is that `leaf` has no reference to `branch` and | |
1772 | doesn’t know they’re related. We want `leaf` to know that `branch` is its | |
1773 | parent. We’ll do that next. | |
ea8adc8c | 1774 | |
2c00a5a8 | 1775 | #### Adding a Reference from a Child to Its Parent |
cc61c64b | 1776 | |
ea8adc8c XL |
1777 | To make the child node aware of its parent, we need to add a `parent` field to |
1778 | our `Node` struct definition. The trouble is in deciding what the type of | |
1779 | `parent` should be. We know it can’t contain an `Rc<T>` because that would | |
2c00a5a8 | 1780 | create a reference cycle with `leaf.parent` pointing to `branch` and |
ea8adc8c | 1781 | `branch.children` pointing to `leaf`, which would cause their `strong_count` |
2c00a5a8 | 1782 | values to never be 0. |
cc61c64b | 1783 | |
ea8adc8c XL |
1784 | Thinking about the relationships another way, a parent node should own its |
1785 | children: if a parent node is dropped, its child nodes should be dropped as | |
1786 | well. However, a child should not own its parent: if we drop a child node, the | |
1787 | parent should still exist. This is a case for weak references! | |
cc61c64b | 1788 | |
2c00a5a8 XL |
1789 | So instead of `Rc<T>`, we’ll make the type of `parent` use `Weak<T>`, |
1790 | specifically a `RefCell<Weak<Node>>`. Now our `Node` struct definition looks | |
1791 | like this: | |
cc61c64b XL |
1792 | |
1793 | Filename: src/main.rs | |
1794 | ||
1795 | ``` | |
1796 | use std::rc::{Rc, Weak}; | |
1797 | use std::cell::RefCell; | |
1798 | ||
1799 | #[derive(Debug)] | |
1800 | struct Node { | |
1801 | value: i32, | |
1802 | parent: RefCell<Weak<Node>>, | |
1803 | children: RefCell<Vec<Rc<Node>>>, | |
1804 | } | |
1805 | ``` | |
1806 | ||
2c00a5a8 XL |
1807 | Now a node will be able to refer to its parent node but doesn’t own its parent. |
1808 | In Listing 15-28, we update `main` to use this new definition so the `leaf` | |
1809 | node will have a way to refer to its parent, `branch`: | |
cc61c64b XL |
1810 | |
1811 | Filename: src/main.rs | |
1812 | ||
1813 | ``` | |
1814 | fn main() { | |
1815 | let leaf = Rc::new(Node { | |
1816 | value: 3, | |
1817 | parent: RefCell::new(Weak::new()), | |
1818 | children: RefCell::new(vec![]), | |
1819 | }); | |
1820 | ||
1821 | println!("leaf parent = {:?}", leaf.parent.borrow().upgrade()); | |
1822 | ||
1823 | let branch = Rc::new(Node { | |
1824 | value: 5, | |
1825 | parent: RefCell::new(Weak::new()), | |
ea8adc8c | 1826 | children: RefCell::new(vec![Rc::clone(&leaf)]), |
cc61c64b XL |
1827 | }); |
1828 | ||
1829 | *leaf.parent.borrow_mut() = Rc::downgrade(&branch); | |
1830 | ||
1831 | println!("leaf parent = {:?}", leaf.parent.borrow().upgrade()); | |
1832 | } | |
1833 | ``` | |
1834 | ||
2c00a5a8 | 1835 | Listing 15-28: A `leaf` node with a `Weak` reference to its parent node `branch` |
ea8adc8c XL |
1836 | |
1837 | Creating the `leaf` node looks similar to how creating the `leaf` node looked | |
2c00a5a8 XL |
1838 | in Listing 15-27 with the exception of the `parent` field: `leaf` starts out |
1839 | without a parent, so we create a new, empty `Weak<Node>` reference instance. | |
ea8adc8c XL |
1840 | |
1841 | At this point, when we try to get a reference to the parent of `leaf` by using | |
1842 | the `upgrade` method, we get a `None` value. We see this in the output from the | |
2c00a5a8 | 1843 | first `println!` statement: |
cc61c64b XL |
1844 | |
1845 | ``` | |
1846 | leaf parent = None | |
1847 | ``` | |
1848 | ||
2c00a5a8 XL |
1849 | When we create the `branch` node, it will also have a new `Weak<Node>` |
1850 | reference in the `parent` field, because `branch` doesn’t have a parent node. | |
1851 | We still have `leaf` as one of the children of `branch`. Once we have the | |
1852 | `Node` instance in `branch`, we can modify `leaf` to give it a `Weak<Node>` | |
1853 | reference to its parent. We use the `borrow_mut` method on the | |
1854 | `RefCell<Weak<Node>>` in the `parent` field of `leaf`, and then we use the | |
1855 | `Rc::downgrade` function to create a `Weak<Node>` reference to `branch` from | |
1856 | the `Rc<Node>` in `branch.` | |
ea8adc8c | 1857 | |
2c00a5a8 XL |
1858 | When we print the parent of `leaf` again, this time we’ll get a `Some` variant |
1859 | holding `branch`: now `leaf` can access its parent! When we print `leaf`, we | |
1860 | also avoid the cycle that eventually ended in a stack overflow like we had in | |
1861 | Listing 15-26: the `Weak<Node>` references are printed as `(Weak)`: | |
cc61c64b XL |
1862 | |
1863 | ``` | |
1864 | leaf parent = Some(Node { value: 5, parent: RefCell { value: (Weak) }, | |
1865 | children: RefCell { value: [Node { value: 3, parent: RefCell { value: (Weak) }, | |
1866 | children: RefCell { value: [] } }] } }) | |
1867 | ``` | |
1868 | ||
ea8adc8c XL |
1869 | The lack of infinite output indicates that this code didn’t create a reference |
1870 | cycle. We can also tell this by looking at the values we get from calling | |
1871 | `Rc::strong_count` and `Rc::weak_count`. | |
1872 | ||
1873 | #### Visualizing Changes to `strong_count` and `weak_count` | |
1874 | ||
2c00a5a8 | 1875 | Let’s look at how the `strong_count` and `weak_count` values of the `Rc<Node>` |
ea8adc8c | 1876 | instances change by creating a new inner scope and moving the creation of |
2c00a5a8 | 1877 | `branch` into that scope. By doing so, we can see what happens when `branch` is |
ea8adc8c | 1878 | created and then dropped when it goes out of scope. The modifications are shown |
2c00a5a8 | 1879 | in Listing 15-29: |
cc61c64b XL |
1880 | |
1881 | Filename: src/main.rs | |
1882 | ||
1883 | ``` | |
1884 | fn main() { | |
1885 | let leaf = Rc::new(Node { | |
1886 | value: 3, | |
1887 | parent: RefCell::new(Weak::new()), | |
1888 | children: RefCell::new(vec![]), | |
1889 | }); | |
1890 | ||
1891 | println!( | |
1892 | "leaf strong = {}, weak = {}", | |
1893 | Rc::strong_count(&leaf), | |
1894 | Rc::weak_count(&leaf), | |
1895 | ); | |
1896 | ||
1897 | { | |
1898 | let branch = Rc::new(Node { | |
1899 | value: 5, | |
1900 | parent: RefCell::new(Weak::new()), | |
ea8adc8c | 1901 | children: RefCell::new(vec![Rc::clone(&leaf)]), |
cc61c64b | 1902 | }); |
2c00a5a8 | 1903 | |
cc61c64b XL |
1904 | *leaf.parent.borrow_mut() = Rc::downgrade(&branch); |
1905 | ||
1906 | println!( | |
1907 | "branch strong = {}, weak = {}", | |
1908 | Rc::strong_count(&branch), | |
1909 | Rc::weak_count(&branch), | |
1910 | ); | |
1911 | ||
1912 | println!( | |
1913 | "leaf strong = {}, weak = {}", | |
1914 | Rc::strong_count(&leaf), | |
1915 | Rc::weak_count(&leaf), | |
1916 | ); | |
1917 | } | |
1918 | ||
1919 | println!("leaf parent = {:?}", leaf.parent.borrow().upgrade()); | |
1920 | println!( | |
1921 | "leaf strong = {}, weak = {}", | |
1922 | Rc::strong_count(&leaf), | |
1923 | Rc::weak_count(&leaf), | |
1924 | ); | |
1925 | } | |
1926 | ``` | |
1927 | ||
2c00a5a8 | 1928 | Listing 15-29: Creating `branch` in an inner scope and examining strong and |
ea8adc8c | 1929 | weak reference counts |
cc61c64b | 1930 | |
2c00a5a8 XL |
1931 | After `leaf` is created, its `Rc<Node>` has a strong count of 1 and a weak |
1932 | count of 0. In the inner scope, we create `branch` and associate it with | |
1933 | `leaf`, at which point when we print the counts, the `Rc<Node>` in `branch` | |
1934 | will have a strong count of 1 and a weak count of 1 (for `leaf.parent` pointing | |
1935 | to `branch` with a `Weak<Node>`). When we print the counts in `leaf`, we’ll see | |
1936 | it will have a strong count of 2, because `branch` now has a clone of the | |
1937 | `Rc<Node>` of `leaf` stored in `branch.children` but will still have a weak | |
1938 | count of 0. | |
cc61c64b | 1939 | |
ea8adc8c | 1940 | When the inner scope ends, `branch` goes out of scope and the strong count of |
2c00a5a8 XL |
1941 | the `Rc<Node>` decreases to 0, so its `Node` is dropped. The weak count of 1 |
1942 | from `leaf.parent` has no bearing on whether or not `Node` is dropped, so we | |
1943 | don’t get any memory leaks! | |
cc61c64b | 1944 | |
3b2f2976 | 1945 | If we try to access the parent of `leaf` after the end of the scope, we’ll get |
2c00a5a8 XL |
1946 | `None` again. At the end of the program, the `Rc<Node>` in `leaf` has a strong |
1947 | count of 1 and a weak count of 0, because the variable `leaf` is now the only | |
1948 | reference to the `Rc<Node>` again. | |
1949 | ||
1950 | All of the logic that manages the counts and value dropping is built into | |
1951 | `Rc<T>` and `Weak<T>` and their implementations of the `Drop` trait. By | |
1952 | specifying that the relationship from a child to its parent should be a | |
1953 | `Weak<T>` reference in the definition of `Node`, we’re able to have parent | |
1954 | nodes point to child nodes and vice versa without creating a reference cycle | |
1955 | and memory leaks. | |
cc61c64b XL |
1956 | |
1957 | ## Summary | |
1958 | ||
2c00a5a8 XL |
1959 | This chapter covered how to use smart pointers to make different guarantees and |
1960 | trade-offs than those Rust makes by default with regular references. The | |
1961 | `Box<T>` type has a known size and points to data allocated on the heap. The | |
1962 | `Rc<T>` type keeps track of the number of references to data on the heap, so | |
1963 | that data can have multiple owners. The `RefCell<T>` type with its interior | |
1964 | mutability gives us a type that we can use when we need an immutable type but | |
1965 | need to change an inner value of that type; it also enforces the borrowing | |
1966 | rules at runtime instead of at compile time. | |
cc61c64b | 1967 | |
2c00a5a8 | 1968 | Also discussed were the `Deref` and `Drop` traits that enable a lot of the |
ea8adc8c | 1969 | functionality of smart pointers. We explored reference cycles that can cause |
2c00a5a8 | 1970 | memory leaks and how to prevent them using `Weak<T>`. |
cc61c64b | 1971 | |
ea8adc8c | 1972 | If this chapter has piqued your interest and you want to implement your own |
2c00a5a8 XL |
1973 | smart pointers, check out “The Rustonomicon” at |
1974 | *https://doc.rust-lang.org/stable/nomicon/* for more useful information. | |
cc61c64b | 1975 | |
2c00a5a8 | 1976 | Next, we’ll talk about concurrency in Rust. You’ll even learn about a few new |
ea8adc8c | 1977 | smart pointers. |