]>
Commit | Line | Data |
---|---|---|
5099ac24 FG |
1 | <!-- DO NOT EDIT THIS FILE. |
2 | ||
3 | This file is periodically generated from the content in the `/src/` | |
4 | directory, so all fixes need to be made in `/src/`. | |
5 | --> | |
6 | ||
7 | [TOC] | |
8 | ||
9 | # Smart Pointers | |
10 | ||
11 | A *pointer* is a general concept for a variable that contains an address in | |
12 | memory. This address refers to, or “points at,” some other data. The most | |
13 | common kind of pointer in Rust is a reference, which you learned about in | |
14 | Chapter 4. References are indicated by the `&` symbol and borrow the value they | |
15 | point to. They don’t have any special capabilities other than referring to | |
04454e1e | 16 | data, and have no overhead. |
5099ac24 | 17 | |
04454e1e FG |
18 | *Smart pointers*, on the other hand, are data structures that act like a |
19 | pointer but also have additional metadata and capabilities. The concept of | |
5099ac24 | 20 | smart pointers isn’t unique to Rust: smart pointers originated in C++ and exist |
04454e1e FG |
21 | in other languages as well. Rust has a variety of smart pointers defined in the |
22 | standard library that provide functionality beyond that provided by references. | |
23 | To explore the general concept, we'll look at a couple of different examples of | |
24 | smart pointers, including a *reference counting* smart pointer type. This | |
25 | pointer enables you to allow data to have multiple owners by keeping track of | |
26 | the number of owners and, when no owners remain, cleaning up the data. | |
27 | ||
28 | Rust, with its concept of ownership and borrowing, has an additional difference | |
29 | between references and smart pointers: while references only borrow data, in | |
30 | many cases, smart pointers *own* the data they point to. | |
31 | ||
32 | Though we didn't call them as much at the time, we’ve already encountered a few | |
33 | smart pointers in this book, including `String` and `Vec<T>` in Chapter 8. Both | |
34 | these types count as smart pointers because they own some memory and allow you | |
35 | to manipulate it. They also have metadata and extra capabilities or guarantees. | |
36 | `String`, for example, stores its capacity as metadata and has the extra | |
37 | ability to ensure its data will always be valid UTF-8. | |
38 | ||
39 | Smart pointers are usually implemented using structs. Unlike an ordinary | |
40 | struct, smart pointers implement the `Deref` and `Drop` traits. The `Deref` | |
41 | trait allows an instance of the smart pointer struct to behave like a reference | |
42 | so you can write your code to work with either references or smart pointers. | |
43 | The `Drop` trait allows you to customize the code that's run when an instance | |
44 | of the smart pointer goes out of scope. In this chapter, we’ll discuss both | |
45 | traits and demonstrate why they’re important to smart pointers. | |
5099ac24 FG |
46 | |
47 | Given that the smart pointer pattern is a general design pattern used | |
48 | frequently in Rust, this chapter won’t cover every existing smart pointer. Many | |
49 | libraries have their own smart pointers, and you can even write your own. We’ll | |
50 | cover the most common smart pointers in the standard library: | |
51 | ||
52 | * `Box<T>` for allocating values on the heap | |
53 | * `Rc<T>`, a reference counting type that enables multiple ownership | |
54 | * `Ref<T>` and `RefMut<T>`, accessed through `RefCell<T>`, a type that enforces | |
55 | the borrowing rules at runtime instead of compile time | |
56 | ||
57 | In addition, we’ll cover the *interior mutability* pattern where an immutable | |
58 | type exposes an API for mutating an interior value. We’ll also discuss | |
59 | *reference cycles*: how they can leak memory and how to prevent them. | |
60 | ||
61 | Let’s dive in! | |
62 | ||
63 | ## Using `Box<T>` to Point to Data on the Heap | |
64 | ||
65 | The most straightforward smart pointer is a *box*, whose type is written | |
66 | `Box<T>`. Boxes allow you to store data on the heap rather than the stack. What | |
67 | remains on the stack is the pointer to the heap data. Refer to Chapter 4 to | |
68 | review the difference between the stack and the heap. | |
69 | ||
70 | Boxes don’t have performance overhead, other than storing their data on the | |
71 | heap instead of on the stack. But they don’t have many extra capabilities | |
72 | either. You’ll use them most often in these situations: | |
73 | ||
74 | * When you have a type whose size can’t be known at compile time and you want | |
75 | to use a value of that type in a context that requires an exact size | |
76 | * When you have a large amount of data and you want to transfer ownership but | |
77 | ensure the data won’t be copied when you do so | |
78 | * When you want to own a value and you care only that it’s a type that | |
79 | implements a particular trait rather than being of a specific type | |
80 | ||
81 | We’ll demonstrate the first situation in the “Enabling Recursive Types with | |
82 | Boxes” section. In the second case, transferring ownership of a large amount of | |
83 | data can take a long time because the data is copied around on the stack. To | |
84 | improve performance in this situation, we can store the large amount of data on | |
85 | the heap in a box. Then, only the small amount of pointer data is copied around | |
86 | on the stack, while the data it references stays in one place on the heap. The | |
87 | third case is known as a *trait object*, and Chapter 17 devotes an entire | |
88 | section, “Using Trait Objects That Allow for Values of Different Types,” just | |
89 | to that topic. So what you learn here you’ll apply again in Chapter 17! | |
90 | ||
91 | ### Using a `Box<T>` to Store Data on the Heap | |
92 | ||
04454e1e FG |
93 | Before we discuss the heap storage use case for `Box<T>`, we’ll cover the |
94 | syntax and how to interact with values stored within a `Box<T>`. | |
5099ac24 FG |
95 | |
96 | Listing 15-1 shows how to use a box to store an `i32` value on the heap: | |
97 | ||
98 | Filename: src/main.rs | |
99 | ||
100 | ``` | |
101 | fn main() { | |
102 | let b = Box::new(5); | |
103 | println!("b = {}", b); | |
104 | } | |
105 | ``` | |
106 | ||
107 | Listing 15-1: Storing an `i32` value on the heap using a box | |
108 | ||
109 | We define the variable `b` to have the value of a `Box` that points to the | |
110 | value `5`, which is allocated on the heap. This program will print `b = 5`; in | |
111 | this case, we can access the data in the box similar to how we would if this | |
112 | data were on the stack. Just like any owned value, when a box goes out of | |
113 | scope, as `b` does at the end of `main`, it will be deallocated. The | |
04454e1e FG |
114 | deallocation happens both for the box (stored on the stack) and the data it |
115 | points to (stored on the heap). | |
5099ac24 FG |
116 | |
117 | Putting a single value on the heap isn’t very useful, so you won’t use boxes by | |
118 | themselves in this way very often. Having values like a single `i32` on the | |
119 | stack, where they’re stored by default, is more appropriate in the majority of | |
120 | situations. Let’s look at a case where boxes allow us to define types that we | |
121 | wouldn’t be allowed to if we didn’t have boxes. | |
122 | ||
123 | ### Enabling Recursive Types with Boxes | |
124 | ||
04454e1e FG |
125 | A value of *recursive type* can have another value of the same type as part of |
126 | itself. Recursive types pose an issue because at compile time Rust needs to | |
127 | know how much space a type takes up. However, the nesting of values of | |
128 | recursive types could theoretically continue infinitely, so Rust can’t know how | |
129 | much space the value needs. Because boxes have a known size, we can enable | |
130 | recursive types by inserting a box in the recursive type definition. | |
5099ac24 | 131 | |
04454e1e FG |
132 | As an example of a recursive type, let’s explore the *cons list*. This is a data |
133 | type commonly found in functional programming languages. The cons list type | |
5099ac24 FG |
134 | we’ll define is straightforward except for the recursion; therefore, the |
135 | concepts in the example we’ll work with will be useful any time you get into | |
136 | more complex situations involving recursive types. | |
137 | ||
138 | #### More Information About the Cons List | |
139 | ||
140 | A *cons list* is a data structure that comes from the Lisp programming language | |
04454e1e FG |
141 | and its dialects and is made up of nested pairs. Its name comes from the `cons` |
142 | function (short for “construct function”) in Lisp that constructs a new pair | |
143 | from its two arguments. By calling `cons` on a pair consisting of a value and | |
144 | another pair, we can construct cons lists made up of recursive pairs. | |
5099ac24 | 145 | |
04454e1e FG |
146 | <!-- can you add a direct definition of the cons list -- we |
147 | have a definition for the cons function, but not a direct one for the list. I | |
148 | assume the list is the result of the function, so is a recursive list of nested | |
149 | pairs? /LC --> | |
150 | <!-- There was a definition here, "These pairs containing pairs form a list", | |
151 | but I guess it wasn't clear enough. Is this better? /Carol --> | |
152 | ||
153 | For example, here's a pseudocode representation of a cons list containing the | |
154 | list 1, 2, 3 with each pair in parentheses: | |
155 | ||
156 | ``` | |
157 | (1, (2, (3, Nil))) | |
158 | ``` | |
159 | ||
160 | <!-- maybe here also show us a cons list -- not necessarily how to make one, | |
161 | but the actual list that's the result /LC --> | |
162 | <!-- Done! /Carol --> | |
5099ac24 FG |
163 | |
164 | Each item in a cons list contains two elements: the value of the current item | |
165 | and the next item. The last item in the list contains only a value called `Nil` | |
166 | without a next item. A cons list is produced by recursively calling the `cons` | |
167 | function. The canonical name to denote the base case of the recursion is `Nil`. | |
168 | Note that this is not the same as the “null” or “nil” concept in Chapter 6, | |
169 | which is an invalid or absent value. | |
170 | ||
04454e1e FG |
171 | The cons list isn’t a commonly used data structure in Rust. Most of the time |
172 | when you have a list of items in Rust, `Vec<T>` is a better choice to use. | |
173 | Other, more complex recursive data types *are* useful in various situations, | |
174 | but by starting with the cons list in this chapter, we can explore how boxes | |
175 | let us define a recursive data type without much distraction. | |
5099ac24 FG |
176 | |
177 | Listing 15-2 contains an enum definition for a cons list. Note that this code | |
178 | won’t compile yet because the `List` type doesn’t have a known size, which | |
179 | we’ll demonstrate. | |
180 | ||
181 | Filename: src/main.rs | |
182 | ||
183 | ``` | |
184 | enum List { | |
185 | Cons(i32, List), | |
186 | Nil, | |
187 | } | |
188 | ``` | |
189 | ||
190 | Listing 15-2: The first attempt at defining an enum to represent a cons list | |
191 | data structure of `i32` values | |
192 | ||
193 | > Note: We’re implementing a cons list that holds only `i32` values for the | |
194 | > purposes of this example. We could have implemented it using generics, as we | |
195 | > discussed in Chapter 10, to define a cons list type that could store values of | |
196 | > any type. | |
197 | ||
198 | Using the `List` type to store the list `1, 2, 3` would look like the code in | |
199 | Listing 15-3: | |
200 | ||
201 | Filename: src/main.rs | |
202 | ||
203 | ``` | |
204 | use crate::List::{Cons, Nil}; | |
205 | ||
206 | fn main() { | |
207 | let list = Cons(1, Cons(2, Cons(3, Nil))); | |
208 | } | |
209 | ``` | |
210 | ||
211 | Listing 15-3: Using the `List` enum to store the list `1, 2, 3` | |
212 | ||
213 | The first `Cons` value holds `1` and another `List` value. This `List` value is | |
214 | another `Cons` value that holds `2` and another `List` value. This `List` value | |
215 | is one more `Cons` value that holds `3` and a `List` value, which is finally | |
216 | `Nil`, the non-recursive variant that signals the end of the list. | |
217 | ||
04454e1e FG |
218 | <!-- I really like the use of the cons list as the example here /LC --> |
219 | ||
5099ac24 FG |
220 | If we try to compile the code in Listing 15-3, we get the error shown in |
221 | Listing 15-4: | |
222 | ||
223 | ``` | |
224 | error[E0072]: recursive type `List` has infinite size | |
225 | --> src/main.rs:1:1 | |
226 | | | |
227 | 1 | enum List { | |
228 | | ^^^^^^^^^ recursive type has infinite size | |
229 | 2 | Cons(i32, List), | |
230 | | ---- recursive without indirection | |
231 | | | |
232 | help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to make `List` representable | |
233 | ``` | |
234 | ||
235 | Listing 15-4: The error we get when attempting to define a recursive enum | |
236 | ||
237 | The error shows this type “has infinite size.” The reason is that we’ve defined | |
238 | `List` with a variant that is recursive: it holds another value of itself | |
239 | directly. As a result, Rust can’t figure out how much space it needs to store a | |
04454e1e FG |
240 | `List` value. Let’s break down why we get this error. First, we'll look at how |
241 | Rust decides how much space it needs to store a value of a non-recursive type. | |
5099ac24 FG |
242 | |
243 | #### Computing the Size of a Non-Recursive Type | |
244 | ||
245 | Recall the `Message` enum we defined in Listing 6-2 when we discussed enum | |
246 | definitions in Chapter 6: | |
247 | ||
248 | ``` | |
249 | enum Message { | |
250 | Quit, | |
251 | Move { x: i32, y: i32 }, | |
252 | Write(String), | |
253 | ChangeColor(i32, i32, i32), | |
254 | } | |
255 | ``` | |
256 | ||
257 | To determine how much space to allocate for a `Message` value, Rust goes | |
258 | through each of the variants to see which variant needs the most space. Rust | |
259 | sees that `Message::Quit` doesn’t need any space, `Message::Move` needs enough | |
260 | space to store two `i32` values, and so forth. Because only one variant will be | |
261 | used, the most space a `Message` value will need is the space it would take to | |
262 | store the largest of its variants. | |
263 | ||
264 | Contrast this with what happens when Rust tries to determine how much space a | |
265 | recursive type like the `List` enum in Listing 15-2 needs. The compiler starts | |
266 | by looking at the `Cons` variant, which holds a value of type `i32` and a value | |
267 | of type `List`. Therefore, `Cons` needs an amount of space equal to the size of | |
268 | an `i32` plus the size of a `List`. To figure out how much memory the `List` | |
269 | type needs, the compiler looks at the variants, starting with the `Cons` | |
270 | variant. The `Cons` variant holds a value of type `i32` and a value of type | |
271 | `List`, and this process continues infinitely, as shown in Figure 15-1. | |
272 | ||
273 | <img alt="An infinite Cons list" src="img/trpl15-01.svg" class="center" style="width: 50%;" /> | |
274 | ||
275 | Figure 15-1: An infinite `List` consisting of infinite `Cons` variants | |
276 | ||
277 | #### Using `Box<T>` to Get a Recursive Type with a Known Size | |
278 | ||
04454e1e FG |
279 | Because Rust can’t figure out how much space to allocate for recursively |
280 | defined types, the compiler gives an error with this helpful suggestion: | |
5099ac24 FG |
281 | |
282 | ``` | |
283 | help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to make `List` representable | |
284 | ``` | |
285 | ||
286 | In this suggestion, “indirection” means that instead of storing a value | |
04454e1e | 287 | directly, we should change the data structure to store the value indirectly by |
5099ac24 FG |
288 | storing a pointer to the value instead. |
289 | ||
290 | Because a `Box<T>` is a pointer, Rust always knows how much space a `Box<T>` | |
291 | needs: a pointer’s size doesn’t change based on the amount of data it’s | |
292 | pointing to. This means we can put a `Box<T>` inside the `Cons` variant instead | |
293 | of another `List` value directly. The `Box<T>` will point to the next `List` | |
294 | value that will be on the heap rather than inside the `Cons` variant. | |
04454e1e FG |
295 | Conceptually, we still have a list, created with lists holding other lists, but |
296 | this implementation is now more like placing the items next to one another | |
5099ac24 FG |
297 | rather than inside one another. |
298 | ||
299 | We can change the definition of the `List` enum in Listing 15-2 and the usage | |
300 | of the `List` in Listing 15-3 to the code in Listing 15-5, which will compile: | |
301 | ||
302 | Filename: src/main.rs | |
303 | ||
304 | ``` | |
305 | enum List { | |
306 | Cons(i32, Box<List>), | |
307 | Nil, | |
308 | } | |
309 | ||
310 | use crate::List::{Cons, Nil}; | |
311 | ||
312 | fn main() { | |
313 | let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil)))))); | |
314 | } | |
315 | ``` | |
316 | ||
317 | Listing 15-5: Definition of `List` that uses `Box<T>` in order to have a known | |
318 | size | |
319 | ||
04454e1e | 320 | The `Cons` variant needs the size of an `i32` plus the space to store the |
5099ac24 FG |
321 | box’s pointer data. The `Nil` variant stores no values, so it needs less space |
322 | than the `Cons` variant. We now know that any `List` value will take up the | |
323 | size of an `i32` plus the size of a box’s pointer data. By using a box, we’ve | |
324 | broken the infinite, recursive chain, so the compiler can figure out the size | |
325 | it needs to store a `List` value. Figure 15-2 shows what the `Cons` variant | |
326 | looks like now. | |
327 | ||
328 | <img alt="A finite Cons list" src="img/trpl15-02.svg" class="center" /> | |
329 | ||
330 | Figure 15-2: A `List` that is not infinitely sized because `Cons` holds a `Box` | |
331 | ||
332 | Boxes provide only the indirection and heap allocation; they don’t have any | |
333 | other special capabilities, like those we’ll see with the other smart pointer | |
04454e1e | 334 | types. They also don’t have the performance overhead that these special |
5099ac24 FG |
335 | capabilities incur, so they can be useful in cases like the cons list where the |
336 | indirection is the only feature we need. We’ll look at more use cases for boxes | |
337 | in Chapter 17, too. | |
338 | ||
339 | The `Box<T>` type is a smart pointer because it implements the `Deref` trait, | |
340 | which allows `Box<T>` values to be treated like references. When a `Box<T>` | |
341 | value goes out of scope, the heap data that the box is pointing to is cleaned | |
04454e1e FG |
342 | up as well because of the `Drop` trait implementation. These two traits will be |
343 | even more important to the functionality provided by the other smart pointer | |
344 | types we’ll discuss in the rest of this chapter. Let’s explore these two traits | |
345 | in more detail. | |
5099ac24 FG |
346 | |
347 | ## Treating Smart Pointers Like Regular References with the `Deref` Trait | |
348 | ||
349 | Implementing the `Deref` trait allows you to customize the behavior of the | |
04454e1e | 350 | *dereference operator* `*` (not to be confused with the multiplication or glob |
5099ac24 FG |
351 | operator). By implementing `Deref` in such a way that a smart pointer can be |
352 | treated like a regular reference, you can write code that operates on | |
353 | references and use that code with smart pointers too. | |
354 | ||
355 | Let’s first look at how the dereference operator works with regular references. | |
356 | Then we’ll try to define a custom type that behaves like `Box<T>`, and see why | |
357 | the dereference operator doesn’t work like a reference on our newly defined | |
358 | type. We’ll explore how implementing the `Deref` trait makes it possible for | |
359 | smart pointers to work in ways similar to references. Then we’ll look at | |
360 | Rust’s *deref coercion* feature and how it lets us work with either references | |
361 | or smart pointers. | |
362 | ||
363 | > Note: there’s one big difference between the `MyBox<T>` type we’re about to | |
364 | > build and the real `Box<T>`: our version will not store its data on the heap. | |
365 | > We are focusing this example on `Deref`, so where the data is actually stored | |
366 | > is less important than the pointer-like behavior. | |
367 | ||
04454e1e | 368 | ### Following the Pointer to the Value |
5099ac24 FG |
369 | |
370 | A regular reference is a type of pointer, and one way to think of a pointer is | |
371 | as an arrow to a value stored somewhere else. In Listing 15-6, we create a | |
372 | reference to an `i32` value and then use the dereference operator to follow the | |
04454e1e FG |
373 | reference to the value: |
374 | <!-- is the data different to the value? I wasn't sure if the two things above | |
375 | were supposed to be doing different things? /LC --> | |
376 | <!-- I'm not sure what you mean by "two things above", but generally "value" | |
377 | and "data" are interchangeable. I switched "data" to "value" in this paragraph, | |
378 | does that fix the issue? /Carol --> | |
5099ac24 FG |
379 | |
380 | Filename: src/main.rs | |
381 | ||
382 | ``` | |
383 | fn main() { | |
384 | [1] let x = 5; | |
385 | [2] let y = &x; | |
386 | ||
387 | [3] assert_eq!(5, x); | |
388 | [4] assert_eq!(5, *y); | |
389 | } | |
390 | ``` | |
391 | ||
392 | Listing 15-6: Using the dereference operator to follow a reference to an `i32` | |
393 | value | |
394 | ||
04454e1e | 395 | The variable `x` holds an `i32` value `5` [1]. We set `y` equal to a reference |
5099ac24 FG |
396 | to `x` [2]. We can assert that `x` is equal to `5` [3]. However, if we want to |
397 | make an assertion about the value in `y`, we have to use `*y` to follow the | |
04454e1e FG |
398 | reference to the value it’s pointing to (hence *dereference*) so the compiler |
399 | can compare the actual value [4]. Once we dereference `y`, we have access to | |
400 | the integer value `y` is pointing to that we can compare with `5`. | |
5099ac24 FG |
401 | |
402 | If we tried to write `assert_eq!(5, y);` instead, we would get this compilation | |
403 | error: | |
404 | ||
405 | ``` | |
406 | error[E0277]: can't compare `{integer}` with `&{integer}` | |
407 | --> src/main.rs:6:5 | |
408 | | | |
409 | 6 | assert_eq!(5, y); | |
410 | | ^^^^^^^^^^^^^^^^ no implementation for `{integer} == &{integer}` | |
411 | | | |
412 | = help: the trait `PartialEq<&{integer}>` is not implemented for `{integer}` | |
413 | ``` | |
414 | ||
415 | Comparing a number and a reference to a number isn’t allowed because they’re | |
416 | different types. We must use the dereference operator to follow the reference | |
417 | to the value it’s pointing to. | |
418 | ||
419 | ### Using `Box<T>` Like a Reference | |
420 | ||
421 | We can rewrite the code in Listing 15-6 to use a `Box<T>` instead of a | |
04454e1e FG |
422 | reference; the dereference operator used on the `Box<T>` in Listing 15-7 |
423 | functions in the same way as the dereference operator used on the reference in | |
424 | Listing 15-6: | |
5099ac24 FG |
425 | |
426 | Filename: src/main.rs | |
427 | ||
428 | ``` | |
429 | fn main() { | |
430 | let x = 5; | |
431 | [1] let y = Box::new(x); | |
432 | ||
433 | assert_eq!(5, x); | |
434 | [2] assert_eq!(5, *y); | |
435 | } | |
436 | ``` | |
437 | ||
438 | Listing 15-7: Using the dereference operator on a `Box<i32>` | |
439 | ||
440 | The main difference between Listing 15-7 and Listing 15-6 is that here we set | |
441 | `y` to be an instance of a box pointing to a copied value of `x` rather than a | |
442 | reference pointing to the value of `x` [1]. In the last assertion [2], we can | |
443 | use the dereference operator to follow the box’s pointer in the same way that | |
444 | we did when `y` was a reference. Next, we’ll explore what is special about | |
445 | `Box<T>` that enables us to use the dereference operator by defining our own | |
446 | box type. | |
04454e1e FG |
447 | <!-- is there a benefit to using this method -- a box rather than a reference |
448 | -- or is this just pure example? they seemed to be doing pretty much the same | |
449 | thing, I wasn't clear why you'd choose one over the other /LC --> | |
450 | <!-- This example is illustrating that the box and the reference behave the | |
451 | same in regards to the dereference operator. Differences and why to use `Box` | |
452 | are discussed before and after this section. I tried to clarify the purpose of | |
453 | this listing a bit in the paragraph just before the listing, does that help? | |
454 | /Carol --> | |
5099ac24 FG |
455 | |
456 | ### Defining Our Own Smart Pointer | |
457 | ||
458 | Let’s build a smart pointer similar to the `Box<T>` type provided by the | |
459 | standard library to experience how smart pointers behave differently from | |
460 | references by default. Then we’ll look at how to add the ability to use the | |
461 | dereference operator. | |
462 | ||
463 | The `Box<T>` type is ultimately defined as a tuple struct with one element, so | |
464 | Listing 15-8 defines a `MyBox<T>` type in the same way. We’ll also define a | |
465 | `new` function to match the `new` function defined on `Box<T>`. | |
466 | ||
467 | Filename: src/main.rs | |
468 | ||
469 | ``` | |
470 | [1] struct MyBox<T>(T); | |
471 | ||
472 | impl<T> MyBox<T> { | |
473 | [2] fn new(x: T) -> MyBox<T> { | |
474 | [3] MyBox(x) | |
475 | } | |
476 | } | |
477 | ``` | |
478 | ||
479 | Listing 15-8: Defining a `MyBox<T>` type | |
480 | ||
481 | We define a struct named `MyBox` and declare a generic parameter `T` [1], | |
482 | because we want our type to hold values of any type. The `MyBox` type is a | |
483 | tuple struct with one element of type `T`. The `MyBox::new` function takes one | |
484 | parameter of type `T` [2] and returns a `MyBox` instance that holds the value | |
485 | passed in [3]. | |
486 | ||
487 | Let’s try adding the `main` function in Listing 15-7 to Listing 15-8 and | |
488 | changing it to use the `MyBox<T>` type we’ve defined instead of `Box<T>`. The | |
489 | code in Listing 15-9 won’t compile because Rust doesn’t know how to dereference | |
490 | `MyBox`. | |
491 | ||
492 | Filename: src/main.rs | |
493 | ||
494 | ``` | |
495 | fn main() { | |
496 | let x = 5; | |
497 | let y = MyBox::new(x); | |
498 | ||
499 | assert_eq!(5, x); | |
500 | assert_eq!(5, *y); | |
501 | } | |
502 | ``` | |
503 | ||
504 | Listing 15-9: Attempting to use `MyBox<T>` in the same way we used references | |
505 | and `Box<T>` | |
506 | ||
507 | Here’s the resulting compilation error: | |
508 | ||
509 | ``` | |
510 | error[E0614]: type `MyBox<{integer}>` cannot be dereferenced | |
511 | --> src/main.rs:14:19 | |
512 | | | |
513 | 14 | assert_eq!(5, *y); | |
514 | | ^^ | |
515 | ``` | |
516 | ||
517 | Our `MyBox<T>` type can’t be dereferenced because we haven’t implemented that | |
518 | ability on our type. To enable dereferencing with the `*` operator, we | |
519 | implement the `Deref` trait. | |
520 | ||
521 | ### Treating a Type Like a Reference by Implementing the `Deref` Trait | |
522 | ||
523 | As discussed in the “Implementing a Trait on a Type” section of Chapter 10, to | |
524 | implement a trait, we need to provide implementations for the trait’s required | |
525 | methods. The `Deref` trait, provided by the standard library, requires us to | |
526 | implement one method named `deref` that borrows `self` and returns a reference | |
527 | to the inner data. Listing 15-10 contains an implementation of `Deref` to add | |
528 | to the definition of `MyBox`: | |
529 | ||
530 | Filename: src/main.rs | |
531 | ||
532 | ``` | |
533 | use std::ops::Deref; | |
534 | ||
535 | impl<T> Deref for MyBox<T> { | |
536 | [1] type Target = T; | |
537 | ||
538 | fn deref(&self) -> &Self::Target { | |
539 | [2] &self.0 | |
540 | } | |
541 | } | |
542 | ``` | |
543 | ||
544 | Listing 15-10: Implementing `Deref` on `MyBox<T>` | |
545 | ||
546 | The `type Target = T;` syntax [1] defines an associated type for the `Deref` | |
547 | trait to use. Associated types are a slightly different way of declaring a | |
548 | generic parameter, but you don’t need to worry about them for now; we’ll cover | |
549 | them in more detail in Chapter 19. | |
550 | ||
551 | We fill in the body of the `deref` method with `&self.0` so `deref` returns a | |
04454e1e | 552 | reference to the value we want to access with the `*` operator [2]; recall from |
5099ac24 FG |
553 | the “Using Tuple Structs without Named Fields to Create Different Types” |
554 | section of Chapter 5 that `.0` accesses the first value in a tuple struct. The | |
555 | `main` function in Listing 15-9 that calls `*` on the `MyBox<T>` value now | |
556 | compiles, and the assertions pass! | |
557 | ||
558 | Without the `Deref` trait, the compiler can only dereference `&` references. | |
559 | The `deref` method gives the compiler the ability to take a value of any type | |
560 | that implements `Deref` and call the `deref` method to get a `&` reference that | |
561 | it knows how to dereference. | |
562 | ||
563 | When we entered `*y` in Listing 15-9, behind the scenes Rust actually ran this | |
564 | code: | |
565 | ||
566 | ``` | |
567 | *(y.deref()) | |
568 | ``` | |
569 | ||
570 | Rust substitutes the `*` operator with a call to the `deref` method and then a | |
571 | plain dereference so we don’t have to think about whether or not we need to | |
572 | call the `deref` method. This Rust feature lets us write code that functions | |
573 | identically whether we have a regular reference or a type that implements | |
574 | `Deref`. | |
575 | ||
04454e1e FG |
576 | The reason the `deref` method returns a reference to a value, and that the |
577 | plain dereference outside the parentheses in `*(y.deref())` is still necessary, | |
578 | is to do with the ownership system. If the `deref` method returned the value | |
579 | directly instead of a reference to the value, the value would be moved out of | |
580 | `self`. We don’t want to take ownership of the inner value inside `MyBox<T>` in | |
581 | this case or in most cases where we use the dereference operator. | |
5099ac24 FG |
582 | |
583 | Note that the `*` operator is replaced with a call to the `deref` method and | |
584 | then a call to the `*` operator just once, each time we use a `*` in our code. | |
585 | Because the substitution of the `*` operator does not recurse infinitely, we | |
586 | end up with data of type `i32`, which matches the `5` in `assert_eq!` in | |
587 | Listing 15-9. | |
588 | ||
589 | ### Implicit Deref Coercions with Functions and Methods | |
590 | ||
04454e1e FG |
591 | *Deref coercion* converts a reference to a type that implements the `Deref` |
592 | trait into a reference to another type. For example, deref coercion can convert | |
593 | `&String` to `&str` because `String` implements the `Deref` trait such that it | |
594 | returns `&str`. Deref conversion is a convenience Rust performs on arguments to | |
595 | functions and methods, and works only on types that implement the `Deref` | |
596 | trait. It happens automatically when we pass a reference to a particular type’s | |
597 | value as an argument to a function or method that doesn’t match the parameter | |
598 | type in the function or method definition. A sequence of calls to the `deref` | |
599 | method converts the type we provided into the type the parameter needs. | |
5099ac24 FG |
600 | |
601 | Deref coercion was added to Rust so that programmers writing function and | |
602 | method calls don’t need to add as many explicit references and dereferences | |
603 | with `&` and `*`. The deref coercion feature also lets us write more code that | |
604 | can work for either references or smart pointers. | |
605 | ||
606 | To see deref coercion in action, let’s use the `MyBox<T>` type we defined in | |
607 | Listing 15-8 as well as the implementation of `Deref` that we added in Listing | |
608 | 15-10. Listing 15-11 shows the definition of a function that has a string slice | |
609 | parameter: | |
610 | ||
611 | Filename: src/main.rs | |
612 | ||
613 | ``` | |
614 | fn hello(name: &str) { | |
615 | println!("Hello, {}!", name); | |
616 | } | |
617 | ``` | |
618 | ||
619 | Listing 15-11: A `hello` function that has the parameter `name` of type `&str` | |
620 | ||
621 | We can call the `hello` function with a string slice as an argument, such as | |
622 | `hello("Rust");` for example. Deref coercion makes it possible to call `hello` | |
623 | with a reference to a value of type `MyBox<String>`, as shown in Listing 15-12: | |
624 | ||
625 | Filename: src/main.rs | |
626 | ||
627 | ``` | |
628 | fn main() { | |
629 | let m = MyBox::new(String::from("Rust")); | |
630 | hello(&m); | |
631 | } | |
632 | ``` | |
633 | ||
634 | Listing 15-12: Calling `hello` with a reference to a `MyBox<String>` value, | |
635 | which works because of deref coercion | |
636 | ||
637 | Here we’re calling the `hello` function with the argument `&m`, which is a | |
638 | reference to a `MyBox<String>` value. Because we implemented the `Deref` trait | |
639 | on `MyBox<T>` in Listing 15-10, Rust can turn `&MyBox<String>` into `&String` | |
640 | by calling `deref`. The standard library provides an implementation of `Deref` | |
641 | on `String` that returns a string slice, and this is in the API documentation | |
642 | for `Deref`. Rust calls `deref` again to turn the `&String` into `&str`, which | |
643 | matches the `hello` function’s definition. | |
644 | ||
645 | If Rust didn’t implement deref coercion, we would have to write the code in | |
646 | Listing 15-13 instead of the code in Listing 15-12 to call `hello` with a value | |
647 | of type `&MyBox<String>`. | |
648 | ||
649 | Filename: src/main.rs | |
650 | ||
651 | ``` | |
652 | fn main() { | |
653 | let m = MyBox::new(String::from("Rust")); | |
654 | hello(&(*m)[..]); | |
655 | } | |
656 | ``` | |
657 | ||
658 | Listing 15-13: The code we would have to write if Rust didn’t have deref | |
659 | coercion | |
660 | ||
661 | The `(*m)` dereferences the `MyBox<String>` into a `String`. Then the `&` and | |
662 | `[..]` take a string slice of the `String` that is equal to the whole string to | |
04454e1e | 663 | match the signature of `hello`. This code without deref coercions is harder to |
5099ac24 FG |
664 | read, write, and understand with all of these symbols involved. Deref coercion |
665 | allows Rust to handle these conversions for us automatically. | |
666 | ||
667 | When the `Deref` trait is defined for the types involved, Rust will analyze the | |
668 | types and use `Deref::deref` as many times as necessary to get a reference to | |
669 | match the parameter’s type. The number of times that `Deref::deref` needs to be | |
670 | inserted is resolved at compile time, so there is no runtime penalty for taking | |
671 | advantage of deref coercion! | |
672 | ||
673 | ### How Deref Coercion Interacts with Mutability | |
674 | ||
675 | Similar to how you use the `Deref` trait to override the `*` operator on | |
676 | immutable references, you can use the `DerefMut` trait to override the `*` | |
677 | operator on mutable references. | |
678 | ||
679 | Rust does deref coercion when it finds types and trait implementations in three | |
680 | cases: | |
681 | ||
682 | * From `&T` to `&U` when `T: Deref<Target=U>` | |
683 | * From `&mut T` to `&mut U` when `T: DerefMut<Target=U>` | |
684 | * From `&mut T` to `&U` when `T: Deref<Target=U>` | |
685 | ||
04454e1e FG |
686 | The first two cases are the same as each other except that the second |
687 | implements mutability. The first case states that if you have a `&T`, and `T` | |
688 | implements `Deref` to some type `U`, you can get a `&U` transparently. The | |
689 | second case states that the same deref coercion happens for mutable references. | |
690 | ||
691 | <!-- the first two cases are the same as what? each other? I think so, | |
692 | so have edited as much, but wasn't sure /LC --> | |
693 | <!-- Yep! /Carol --> | |
5099ac24 FG |
694 | |
695 | The third case is trickier: Rust will also coerce a mutable reference to an | |
696 | immutable one. But the reverse is *not* possible: immutable references will | |
697 | never coerce to mutable references. Because of the borrowing rules, if you have | |
698 | a mutable reference, that mutable reference must be the only reference to that | |
699 | data (otherwise, the program wouldn’t compile). Converting one mutable | |
700 | reference to one immutable reference will never break the borrowing rules. | |
701 | Converting an immutable reference to a mutable reference would require that the | |
702 | initial immutable reference is the only immutable reference to that data, but | |
703 | the borrowing rules don’t guarantee that. Therefore, Rust can’t make the | |
704 | assumption that converting an immutable reference to a mutable reference is | |
705 | possible. | |
706 | ||
707 | ## Running Code on Cleanup with the `Drop` Trait | |
708 | ||
709 | The second trait important to the smart pointer pattern is `Drop`, which lets | |
710 | you customize what happens when a value is about to go out of scope. You can | |
04454e1e FG |
711 | provide an implementation for the `Drop` trait on any type, and that code |
712 | can be used to release resources like files or network connections. | |
713 | <!-- so we're saying Drop can be used in other contexts, this is just a | |
714 | convenient place to introduce it? /LC --> | |
715 | <!-- Yep! /Carol --> | |
5099ac24 FG |
716 | We’re introducing `Drop` in the context of smart pointers because the |
717 | functionality of the `Drop` trait is almost always used when implementing a | |
718 | smart pointer. For example, when a `Box<T>` is dropped it will deallocate the | |
719 | space on the heap that the box points to. | |
720 | ||
04454e1e FG |
721 | In some languages, for some types, the programmer must call code to free memory |
722 | or resources every time they finish using an instance those types. Examples | |
723 | include file handles, sockets, or locks. If they forget, the system might | |
724 | become overloaded and crash. In Rust, you can specify that a particular bit of | |
725 | code be run whenever a value goes out of scope, and the compiler will insert | |
726 | this code automatically. As a result, you don’t need to be careful about | |
727 | placing cleanup code everywhere in a program that an instance of a particular | |
728 | type is finished with—you still won’t leak resources! | |
5099ac24 | 729 | |
04454e1e | 730 | You specify the code to run when a value goes out of scope by implementing the |
5099ac24 FG |
731 | `Drop` trait. The `Drop` trait requires you to implement one method named |
732 | `drop` that takes a mutable reference to `self`. To see when Rust calls `drop`, | |
733 | let’s implement `drop` with `println!` statements for now. | |
734 | ||
735 | Listing 15-14 shows a `CustomSmartPointer` struct whose only custom | |
736 | functionality is that it will print `Dropping CustomSmartPointer!` when the | |
04454e1e | 737 | instance goes out of scope, to show when Rust runs the `drop` function. |
5099ac24 FG |
738 | |
739 | Filename: src/main.rs | |
740 | ||
741 | ``` | |
742 | struct CustomSmartPointer { | |
743 | data: String, | |
744 | } | |
745 | ||
746 | [1] impl Drop for CustomSmartPointer { | |
747 | fn drop(&mut self) { | |
748 | [2] println!("Dropping CustomSmartPointer with data `{}`!", self.data); | |
749 | } | |
750 | } | |
751 | ||
752 | fn main() { | |
753 | [3] let c = CustomSmartPointer { | |
754 | data: String::from("my stuff"), | |
755 | }; | |
756 | [4] let d = CustomSmartPointer { | |
757 | data: String::from("other stuff"), | |
758 | }; | |
759 | [5] println!("CustomSmartPointers created."); | |
760 | [6] } | |
761 | ``` | |
762 | ||
763 | Listing 15-14: A `CustomSmartPointer` struct that implements the `Drop` trait | |
764 | where we would put our cleanup code | |
765 | ||
766 | The `Drop` trait is included in the prelude, so we don’t need to bring it into | |
767 | scope. We implement the `Drop` trait on `CustomSmartPointer` [1] and provide an | |
768 | implementation for the `drop` method that calls `println!` [2]. The body of the | |
769 | `drop` function is where you would place any logic that you wanted to run when | |
770 | an instance of your type goes out of scope. We’re printing some text here to | |
04454e1e | 771 | demonstrate visually when Rust will call `drop`. |
5099ac24 FG |
772 | |
773 | In `main`, we create two instances of `CustomSmartPointer` [3][4] and then | |
774 | print `CustomSmartPointers created` [5]. At the end of `main` [6], our | |
775 | instances of `CustomSmartPointer` will go out of scope, and Rust will call the | |
776 | code we put in the `drop` method [2], printing our final message. Note that we | |
777 | didn’t need to call the `drop` method explicitly. | |
778 | ||
779 | When we run this program, we’ll see the following output: | |
780 | ||
781 | ``` | |
782 | CustomSmartPointers created. | |
783 | Dropping CustomSmartPointer with data `other stuff`! | |
784 | Dropping CustomSmartPointer with data `my stuff`! | |
785 | ``` | |
786 | ||
787 | Rust automatically called `drop` for us when our instances went out of scope, | |
788 | calling the code we specified. Variables are dropped in the reverse order of | |
04454e1e FG |
789 | their creation, so `d` was dropped before `c`. This example's purpose is to |
790 | give you a visual guide to how the `drop` method works; usually you would | |
791 | specify the cleanup code that your type needs to run rather than a print | |
792 | message. | |
5099ac24 FG |
793 | |
794 | ### Dropping a Value Early with `std::mem::drop` | |
795 | ||
796 | Unfortunately, it’s not straightforward to disable the automatic `drop` | |
797 | functionality. Disabling `drop` isn’t usually necessary; the whole point of the | |
798 | `Drop` trait is that it’s taken care of automatically. Occasionally, however, | |
799 | you might want to clean up a value early. One example is when using smart | |
800 | pointers that manage locks: you might want to force the `drop` method that | |
801 | releases the lock so that other code in the same scope can acquire the lock. | |
802 | Rust doesn’t let you call the `Drop` trait’s `drop` method manually; instead | |
803 | you have to call the `std::mem::drop` function provided by the standard library | |
804 | if you want to force a value to be dropped before the end of its scope. | |
805 | ||
806 | If we try to call the `Drop` trait’s `drop` method manually by modifying the | |
807 | `main` function from Listing 15-14, as shown in Listing 15-15, we’ll get a | |
808 | compiler error: | |
809 | ||
810 | Filename: src/main.rs | |
811 | ||
812 | ``` | |
813 | fn main() { | |
814 | let c = CustomSmartPointer { | |
815 | data: String::from("some data"), | |
816 | }; | |
817 | println!("CustomSmartPointer created."); | |
818 | c.drop(); | |
819 | println!("CustomSmartPointer dropped before the end of main."); | |
820 | } | |
821 | ``` | |
822 | ||
823 | Listing 15-15: Attempting to call the `drop` method from the `Drop` trait | |
824 | manually to clean up early | |
825 | ||
826 | When we try to compile this code, we’ll get this error: | |
827 | ||
828 | ``` | |
829 | error[E0040]: explicit use of destructor method | |
830 | --> src/main.rs:16:7 | |
831 | | | |
832 | 16 | c.drop(); | |
833 | | --^^^^-- | |
834 | | | | | |
835 | | | explicit destructor calls not allowed | |
836 | ``` | |
837 | ||
838 | This error message states that we’re not allowed to explicitly call `drop`. The | |
839 | error message uses the term *destructor*, which is the general programming term | |
840 | for a function that cleans up an instance. A *destructor* is analogous to a | |
841 | *constructor*, which creates an instance. The `drop` function in Rust is one | |
842 | particular destructor. | |
843 | ||
844 | Rust doesn’t let us call `drop` explicitly because Rust would still | |
04454e1e | 845 | automatically call `drop` on the value at the end of `main`. This would cause a |
5099ac24 FG |
846 | *double free* error because Rust would be trying to clean up the same value |
847 | twice. | |
848 | ||
849 | We can’t disable the automatic insertion of `drop` when a value goes out of | |
850 | scope, and we can’t call the `drop` method explicitly. So, if we need to force | |
04454e1e | 851 | a value to be cleaned up early, we use the `std::mem::drop` function. |
5099ac24 FG |
852 | |
853 | The `std::mem::drop` function is different from the `drop` method in the `Drop` | |
04454e1e FG |
854 | trait. We call it by passing as an argument the value we want to force drop. |
855 | The function is in the prelude, so we can modify `main` in Listing 15-15 to | |
856 | call the `drop` function, as shown in Listing 15-16: | |
5099ac24 FG |
857 | |
858 | Filename: src/main.rs | |
859 | ||
860 | ``` | |
861 | fn main() { | |
862 | let c = CustomSmartPointer { | |
863 | data: String::from("some data"), | |
864 | }; | |
865 | println!("CustomSmartPointer created."); | |
866 | drop(c); | |
867 | println!("CustomSmartPointer dropped before the end of main."); | |
868 | } | |
869 | ``` | |
870 | ||
871 | Listing 15-16: Calling `std::mem::drop` to explicitly drop a value before it | |
872 | goes out of scope | |
873 | ||
874 | Running this code will print the following: | |
875 | ||
876 | ``` | |
877 | CustomSmartPointer created. | |
878 | Dropping CustomSmartPointer with data `some data`! | |
879 | CustomSmartPointer dropped before the end of main. | |
880 | ``` | |
881 | ||
882 | The text ```Dropping CustomSmartPointer with data `some data`!``` is printed | |
883 | between the `CustomSmartPointer created.` and `CustomSmartPointer dropped | |
884 | before the end of main.` text, showing that the `drop` method code is called to | |
885 | drop `c` at that point. | |
886 | ||
887 | You can use code specified in a `Drop` trait implementation in many ways to | |
888 | make cleanup convenient and safe: for instance, you could use it to create your | |
889 | own memory allocator! With the `Drop` trait and Rust’s ownership system, you | |
890 | don’t have to remember to clean up because Rust does it automatically. | |
891 | ||
892 | You also don’t have to worry about problems resulting from accidentally | |
893 | cleaning up values still in use: the ownership system that makes sure | |
894 | references are always valid also ensures that `drop` gets called only once when | |
895 | the value is no longer being used. | |
896 | ||
897 | Now that we’ve examined `Box<T>` and some of the characteristics of smart | |
898 | pointers, let’s look at a few other smart pointers defined in the standard | |
899 | library. | |
900 | ||
901 | ## `Rc<T>`, the Reference Counted Smart Pointer | |
902 | ||
903 | In the majority of cases, ownership is clear: you know exactly which variable | |
904 | owns a given value. However, there are cases when a single value might have | |
905 | multiple owners. For example, in graph data structures, multiple edges might | |
906 | point to the same node, and that node is conceptually owned by all of the edges | |
907 | that point to it. A node shouldn’t be cleaned up unless it doesn’t have any | |
04454e1e | 908 | edges pointing to it and so has no owners. |
5099ac24 | 909 | |
04454e1e FG |
910 | <!-- Is this right, we always have to enable it explicitly? /LC --> |
911 | <!-- Yep! /Carol --> | |
912 | You have to enable multiple ownership explicitly by using the Rust type | |
913 | `Rc<T>`, which is an abbreviation for *reference counting*. The `Rc<T>` type | |
914 | keeps track of the number of references to a value to determine whether or not | |
915 | the value is still in use. If there are zero references to a value, the value | |
916 | can be cleaned up without any references becoming invalid. | |
5099ac24 FG |
917 | |
918 | Imagine `Rc<T>` as a TV in a family room. When one person enters to watch TV, | |
919 | they turn it on. Others can come into the room and watch the TV. When the last | |
920 | person leaves the room, they turn off the TV because it’s no longer being used. | |
921 | If someone turns off the TV while others are still watching it, there would be | |
922 | uproar from the remaining TV watchers! | |
04454e1e | 923 | <!-- ha, perfect analogy /LC --> |
5099ac24 FG |
924 | |
925 | We use the `Rc<T>` type when we want to allocate some data on the heap for | |
926 | multiple parts of our program to read and we can’t determine at compile time | |
927 | which part will finish using the data last. If we knew which part would finish | |
928 | last, we could just make that part the data’s owner, and the normal ownership | |
929 | rules enforced at compile time would take effect. | |
930 | ||
931 | Note that `Rc<T>` is only for use in single-threaded scenarios. When we discuss | |
932 | concurrency in Chapter 16, we’ll cover how to do reference counting in | |
933 | multithreaded programs. | |
934 | ||
935 | ### Using `Rc<T>` to Share Data | |
936 | ||
937 | Let’s return to our cons list example in Listing 15-5. Recall that we defined | |
938 | it using `Box<T>`. This time, we’ll create two lists that both share ownership | |
939 | of a third list. Conceptually, this looks similar to Figure 15-3: | |
940 | ||
941 | <img alt="Two lists that share ownership of a third list" src="img/trpl15-03.svg" class="center" /> | |
942 | ||
943 | Figure 15-3: Two lists, `b` and `c`, sharing ownership of a third list, `a` | |
944 | ||
945 | We’ll create list `a` that contains 5 and then 10. Then we’ll make two more | |
946 | lists: `b` that starts with 3 and `c` that starts with 4. Both `b` and `c` | |
947 | lists will then continue on to the first `a` list containing 5 and 10. In other | |
948 | words, both lists will share the first list containing 5 and 10. | |
949 | ||
950 | Trying to implement this scenario using our definition of `List` with `Box<T>` | |
951 | won’t work, as shown in Listing 15-17: | |
952 | ||
953 | Filename: src/main.rs | |
954 | ||
955 | ``` | |
956 | enum List { | |
957 | Cons(i32, Box<List>), | |
958 | Nil, | |
959 | } | |
960 | ||
961 | use crate::List::{Cons, Nil}; | |
962 | ||
963 | fn main() { | |
964 | let a = Cons(5, Box::new(Cons(10, Box::new(Nil)))); | |
965 | [1] let b = Cons(3, Box::new(a)); | |
966 | [2] let c = Cons(4, Box::new(a)); | |
967 | } | |
968 | ``` | |
969 | ||
970 | Listing 15-17: Demonstrating we’re not allowed to have two lists using `Box<T>` | |
971 | that try to share ownership of a third list | |
972 | ||
973 | When we compile this code, we get this error: | |
974 | ||
975 | ``` | |
976 | error[E0382]: use of moved value: `a` | |
977 | --> src/main.rs:11:30 | |
978 | | | |
979 | 9 | let a = Cons(5, Box::new(Cons(10, Box::new(Nil)))); | |
980 | | - move occurs because `a` has type `List`, which does not implement the `Copy` trait | |
981 | 10 | let b = Cons(3, Box::new(a)); | |
982 | | - value moved here | |
983 | 11 | let c = Cons(4, Box::new(a)); | |
984 | | ^ value used here after move | |
985 | ``` | |
986 | ||
987 | The `Cons` variants own the data they hold, so when we create the `b` list [1], | |
988 | `a` is moved into `b` and `b` owns `a`. Then, when we try to use `a` again when | |
989 | creating `c` [2], we’re not allowed to because `a` has been moved. | |
990 | ||
991 | We could change the definition of `Cons` to hold references instead, but then | |
992 | we would have to specify lifetime parameters. By specifying lifetime | |
993 | parameters, we would be specifying that every element in the list will live at | |
994 | least as long as the entire list. This is the case for the elements and lists | |
995 | in Listing 15-17, but not in every scenario. | |
996 | ||
997 | Instead, we’ll change our definition of `List` to use `Rc<T>` in place of | |
998 | `Box<T>`, as shown in Listing 15-18. Each `Cons` variant will now hold a value | |
999 | and an `Rc<T>` pointing to a `List`. When we create `b`, instead of taking | |
1000 | ownership of `a`, we’ll clone the `Rc<List>` that `a` is holding, thereby | |
1001 | increasing the number of references from one to two and letting `a` and `b` | |
1002 | share ownership of the data in that `Rc<List>`. We’ll also clone `a` when | |
1003 | creating `c`, increasing the number of references from two to three. Every time | |
1004 | we call `Rc::clone`, the reference count to the data within the `Rc<List>` will | |
1005 | increase, and the data won’t be cleaned up unless there are zero references to | |
1006 | it. | |
1007 | ||
1008 | Filename: src/main.rs | |
1009 | ||
1010 | ``` | |
1011 | enum List { | |
1012 | Cons(i32, Rc<List>), | |
1013 | Nil, | |
1014 | } | |
1015 | ||
1016 | use crate::List::{Cons, Nil}; | |
1017 | [1] use std::rc::Rc; | |
1018 | ||
1019 | fn main() { | |
1020 | [2] let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil))))); | |
1021 | [3] let b = Cons(3, Rc::clone(&a)); | |
1022 | [4] let c = Cons(4, Rc::clone(&a)); | |
1023 | } | |
1024 | ``` | |
1025 | ||
1026 | Listing 15-18: A definition of `List` that uses `Rc<T>` | |
1027 | ||
1028 | We need to add a `use` statement to bring `Rc<T>` into scope [1] because it’s | |
1029 | not in the prelude. In `main`, we create the list holding 5 and 10 and store it | |
1030 | in a new `Rc<List>` in `a` [2]. Then when we create `b` [3] and `c` [4], we | |
1031 | call the `Rc::clone` function and pass a reference to the `Rc<List>` in `a` as | |
1032 | an argument. | |
1033 | ||
1034 | We could have called `a.clone()` rather than `Rc::clone(&a)`, but Rust’s | |
1035 | convention is to use `Rc::clone` in this case. The implementation of | |
1036 | `Rc::clone` doesn’t make a deep copy of all the data like most types’ | |
1037 | implementations of `clone` do. The call to `Rc::clone` only increments the | |
1038 | reference count, which doesn’t take much time. Deep copies of data can take a | |
1039 | lot of time. By using `Rc::clone` for reference counting, we can visually | |
1040 | distinguish between the deep-copy kinds of clones and the kinds of clones that | |
1041 | increase the reference count. When looking for performance problems in the | |
1042 | code, we only need to consider the deep-copy clones and can disregard calls to | |
1043 | `Rc::clone`. | |
1044 | ||
1045 | ### Cloning an `Rc<T>` Increases the Reference Count | |
1046 | ||
1047 | Let’s change our working example in Listing 15-18 so we can see the reference | |
1048 | counts changing as we create and drop references to the `Rc<List>` in `a`. | |
1049 | ||
1050 | In Listing 15-19, we’ll change `main` so it has an inner scope around list `c`; | |
1051 | then we can see how the reference count changes when `c` goes out of scope. | |
1052 | ||
1053 | Filename: src/main.rs | |
1054 | ||
1055 | ``` | |
1056 | fn main() { | |
1057 | let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil))))); | |
1058 | println!("count after creating a = {}", Rc::strong_count(&a)); | |
1059 | let b = Cons(3, Rc::clone(&a)); | |
1060 | println!("count after creating b = {}", Rc::strong_count(&a)); | |
1061 | { | |
1062 | let c = Cons(4, Rc::clone(&a)); | |
1063 | println!("count after creating c = {}", Rc::strong_count(&a)); | |
1064 | } | |
1065 | println!("count after c goes out of scope = {}", Rc::strong_count(&a)); | |
1066 | } | |
1067 | ``` | |
1068 | ||
1069 | Listing 15-19: Printing the reference count | |
1070 | ||
1071 | At each point in the program where the reference count changes, we print the | |
04454e1e FG |
1072 | reference count, which we get by calling the `Rc::strong_count` function. This |
1073 | function is named `strong_count` rather than `count` because the `Rc<T>` type | |
1074 | also has a `weak_count`; we’ll see what `weak_count` is used for in the | |
5099ac24 FG |
1075 | “Preventing Reference Cycles: Turning an `Rc<T>` into a `Weak<T>`” section. |
1076 | ||
1077 | This code prints the following: | |
1078 | ||
1079 | ``` | |
1080 | count after creating a = 1 | |
1081 | count after creating b = 2 | |
1082 | count after creating c = 3 | |
1083 | count after c goes out of scope = 2 | |
1084 | ``` | |
1085 | ||
1086 | We can see that the `Rc<List>` in `a` has an initial reference count of 1; then | |
1087 | each time we call `clone`, the count goes up by 1. When `c` goes out of scope, | |
1088 | the count goes down by 1. We don’t have to call a function to decrease the | |
1089 | reference count like we have to call `Rc::clone` to increase the reference | |
1090 | count: the implementation of the `Drop` trait decreases the reference count | |
1091 | automatically when an `Rc<T>` value goes out of scope. | |
1092 | ||
1093 | What we can’t see in this example is that when `b` and then `a` go out of scope | |
1094 | at the end of `main`, the count is then 0, and the `Rc<List>` is cleaned up | |
04454e1e FG |
1095 | completely. Using `Rc<T>` allows a single value to have multiple owners, and |
1096 | the count ensures that the value remains valid as long as any of the owners | |
1097 | still exist. | |
5099ac24 FG |
1098 | |
1099 | Via immutable references, `Rc<T>` allows you to share data between multiple | |
1100 | parts of your program for reading only. If `Rc<T>` allowed you to have multiple | |
1101 | mutable references too, you might violate one of the borrowing rules discussed | |
1102 | in Chapter 4: multiple mutable borrows to the same place can cause data races | |
1103 | and inconsistencies. But being able to mutate data is very useful! In the next | |
1104 | section, we’ll discuss the interior mutability pattern and the `RefCell<T>` | |
1105 | type that you can use in conjunction with an `Rc<T>` to work with this | |
1106 | immutability restriction. | |
1107 | ||
1108 | ## `RefCell<T>` and the Interior Mutability Pattern | |
1109 | ||
1110 | *Interior mutability* is a design pattern in Rust that allows you to mutate | |
1111 | data even when there are immutable references to that data; normally, this | |
1112 | action is disallowed by the borrowing rules. To mutate data, the pattern uses | |
1113 | `unsafe` code inside a data structure to bend Rust’s usual rules that govern | |
04454e1e FG |
1114 | mutation and borrowing. We haven’t yet covered unsafe code that indicates we're |
1115 | checking the rules manually instead of the compiler checking them for us; we | |
1116 | will discuss unsafe code more in Chapter 19. | |
1117 | <!-- maybe quickly give a quick idea of what unsafe code means, at a really | |
1118 | high level, to let them know how it helps in this situation /LC --> | |
1119 | <!-- Done! /Carol --> | |
1120 | We can use types that use the interior mutability pattern only when we can | |
5099ac24 FG |
1121 | ensure that the borrowing rules will be followed at runtime, even though the |
1122 | compiler can’t guarantee that. The `unsafe` code involved is then wrapped in a | |
1123 | safe API, and the outer type is still immutable. | |
1124 | ||
1125 | Let’s explore this concept by looking at the `RefCell<T>` type that follows the | |
1126 | interior mutability pattern. | |
1127 | ||
1128 | ### Enforcing Borrowing Rules at Runtime with `RefCell<T>` | |
1129 | ||
1130 | Unlike `Rc<T>`, the `RefCell<T>` type represents single ownership over the data | |
1131 | it holds. So, what makes `RefCell<T>` different from a type like `Box<T>`? | |
1132 | Recall the borrowing rules you learned in Chapter 4: | |
1133 | ||
04454e1e FG |
1134 | * At any given time, you can have *either* (but not both) one mutable reference |
1135 | or any number of immutable references. | |
5099ac24 FG |
1136 | * References must always be valid. |
1137 | ||
1138 | With references and `Box<T>`, the borrowing rules’ invariants are enforced at | |
1139 | compile time. With `RefCell<T>`, these invariants are enforced *at runtime*. | |
1140 | With references, if you break these rules, you’ll get a compiler error. With | |
1141 | `RefCell<T>`, if you break these rules, your program will panic and exit. | |
1142 | ||
1143 | The advantages of checking the borrowing rules at compile time are that errors | |
1144 | will be caught sooner in the development process, and there is no impact on | |
1145 | runtime performance because all the analysis is completed beforehand. For those | |
1146 | reasons, checking the borrowing rules at compile time is the best choice in the | |
1147 | majority of cases, which is why this is Rust’s default. | |
1148 | ||
1149 | The advantage of checking the borrowing rules at runtime instead is that | |
04454e1e FG |
1150 | certain memory-safe scenarios are then allowed, where they would’ve been |
1151 | disallowed by the compile-time checks. Static analysis, like the Rust compiler, | |
1152 | is inherently conservative. Some properties of code are impossible to detect by | |
1153 | analyzing the code: the most famous example is the Halting Problem, which is | |
1154 | beyond the scope of this book but is an interesting topic to research. | |
5099ac24 FG |
1155 | |
1156 | Because some analysis is impossible, if the Rust compiler can’t be sure the | |
1157 | code complies with the ownership rules, it might reject a correct program; in | |
1158 | this way, it’s conservative. If Rust accepted an incorrect program, users | |
1159 | wouldn’t be able to trust in the guarantees Rust makes. However, if Rust | |
1160 | rejects a correct program, the programmer will be inconvenienced, but nothing | |
1161 | catastrophic can occur. The `RefCell<T>` type is useful when you’re sure your | |
1162 | code follows the borrowing rules but the compiler is unable to understand and | |
1163 | guarantee that. | |
1164 | ||
1165 | Similar to `Rc<T>`, `RefCell<T>` is only for use in single-threaded scenarios | |
1166 | and will give you a compile-time error if you try using it in a multithreaded | |
1167 | context. We’ll talk about how to get the functionality of `RefCell<T>` in a | |
1168 | multithreaded program in Chapter 16. | |
1169 | ||
1170 | Here is a recap of the reasons to choose `Box<T>`, `Rc<T>`, or `RefCell<T>`: | |
1171 | ||
1172 | * `Rc<T>` enables multiple owners of the same data; `Box<T>` and `RefCell<T>` | |
1173 | have single owners. | |
1174 | * `Box<T>` allows immutable or mutable borrows checked at compile time; `Rc<T>` | |
1175 | allows only immutable borrows checked at compile time; `RefCell<T>` allows | |
1176 | immutable or mutable borrows checked at runtime. | |
1177 | * Because `RefCell<T>` allows mutable borrows checked at runtime, you can | |
1178 | mutate the value inside the `RefCell<T>` even when the `RefCell<T>` is | |
1179 | immutable. | |
1180 | ||
1181 | Mutating the value inside an immutable value is the *interior mutability* | |
1182 | pattern. Let’s look at a situation in which interior mutability is useful and | |
1183 | examine how it’s possible. | |
1184 | ||
1185 | ### Interior Mutability: A Mutable Borrow to an Immutable Value | |
1186 | ||
1187 | A consequence of the borrowing rules is that when you have an immutable value, | |
1188 | you can’t borrow it mutably. For example, this code won’t compile: | |
1189 | ||
1190 | ``` | |
1191 | fn main() { | |
1192 | let x = 5; | |
1193 | let y = &mut x; | |
1194 | } | |
1195 | ``` | |
1196 | ||
1197 | If you tried to compile this code, you’d get the following error: | |
1198 | ||
1199 | ``` | |
1200 | error[E0596]: cannot borrow `x` as mutable, as it is not declared as mutable | |
1201 | --> src/main.rs:3:13 | |
1202 | | | |
1203 | 2 | let x = 5; | |
1204 | | - help: consider changing this to be mutable: `mut x` | |
1205 | 3 | let y = &mut x; | |
1206 | | ^^^^^^ cannot borrow as mutable | |
1207 | ``` | |
1208 | ||
1209 | However, there are situations in which it would be useful for a value to mutate | |
1210 | itself in its methods but appear immutable to other code. Code outside the | |
1211 | value’s methods would not be able to mutate the value. Using `RefCell<T>` is | |
04454e1e | 1212 | one way to get the ability to have interior mutability, but `RefCell<T>` |
5099ac24 FG |
1213 | doesn’t get around the borrowing rules completely: the borrow checker in the |
1214 | compiler allows this interior mutability, and the borrowing rules are checked | |
1215 | at runtime instead. If you violate the rules, you’ll get a `panic!` instead of | |
1216 | a compiler error. | |
1217 | ||
1218 | Let’s work through a practical example where we can use `RefCell<T>` to mutate | |
1219 | an immutable value and see why that is useful. | |
1220 | ||
1221 | #### A Use Case for Interior Mutability: Mock Objects | |
1222 | ||
04454e1e FG |
1223 | Sometimes during testing a programmer will use a type in place of another type, |
1224 | in order to observe particular behavior and assert it's implemented correctly. | |
1225 | <!-- why do we use other types -- to test different types with the code? Or you | |
1226 | mean they might use special types that offer particular functionality, like | |
1227 | mock objects do? /LC --> | |
1228 | <!-- I'm not sure I understand your question completely, but I've tried to | |
1229 | clarify by completing the sentence you left and adding a bit after this spot-- | |
1230 | is it clearer now? /Carol --> | |
1231 | This placeholder type is called a *test double*. Think of it in the sense of a | |
1232 | "stunt double" in filmmaking, where a person steps in and substitutes for an | |
1233 | actor to do a particular tricky scene. Test doubles stand in for other types | |
1234 | when we're running tests. *Mock objects* are specific types of test doubles | |
5099ac24 FG |
1235 | that record what happens during a test so you can assert that the correct |
1236 | actions took place. | |
1237 | ||
1238 | Rust doesn’t have objects in the same sense as other languages have objects, | |
1239 | and Rust doesn’t have mock object functionality built into the standard library | |
1240 | as some other languages do. However, you can definitely create a struct that | |
1241 | will serve the same purposes as a mock object. | |
1242 | ||
1243 | Here’s the scenario we’ll test: we’ll create a library that tracks a value | |
1244 | against a maximum value and sends messages based on how close to the maximum | |
1245 | value the current value is. This library could be used to keep track of a | |
1246 | user’s quota for the number of API calls they’re allowed to make, for example. | |
1247 | ||
1248 | Our library will only provide the functionality of tracking how close to the | |
1249 | maximum a value is and what the messages should be at what times. Applications | |
1250 | that use our library will be expected to provide the mechanism for sending the | |
1251 | messages: the application could put a message in the application, send an | |
1252 | email, send a text message, or something else. The library doesn’t need to know | |
1253 | that detail. All it needs is something that implements a trait we’ll provide | |
1254 | called `Messenger`. Listing 15-20 shows the library code: | |
1255 | ||
1256 | Filename: src/lib.rs | |
1257 | ||
1258 | ``` | |
1259 | pub trait Messenger { | |
1260 | [1] fn send(&self, msg: &str); | |
1261 | } | |
1262 | ||
1263 | pub struct LimitTracker<'a, T: Messenger> { | |
1264 | messenger: &'a T, | |
1265 | value: usize, | |
1266 | max: usize, | |
1267 | } | |
1268 | ||
1269 | impl<'a, T> LimitTracker<'a, T> | |
1270 | where | |
1271 | T: Messenger, | |
1272 | { | |
04454e1e | 1273 | pub fn new(messenger: &'a T, max: usize) -> LimitTracker<'a, T> { |
5099ac24 FG |
1274 | LimitTracker { |
1275 | messenger, | |
1276 | value: 0, | |
1277 | max, | |
1278 | } | |
1279 | } | |
1280 | ||
1281 | [2] pub fn set_value(&mut self, value: usize) { | |
1282 | self.value = value; | |
1283 | ||
1284 | let percentage_of_max = self.value as f64 / self.max as f64; | |
1285 | ||
1286 | if percentage_of_max >= 1.0 { | |
1287 | self.messenger.send("Error: You are over your quota!"); | |
1288 | } else if percentage_of_max >= 0.9 { | |
1289 | self.messenger | |
1290 | .send("Urgent warning: You've used up over 90% of your quota!"); | |
1291 | } else if percentage_of_max >= 0.75 { | |
1292 | self.messenger | |
1293 | .send("Warning: You've used up over 75% of your quota!"); | |
1294 | } | |
1295 | } | |
1296 | } | |
1297 | ``` | |
1298 | ||
1299 | Listing 15-20: A library to keep track of how close a value is to a maximum | |
1300 | value and warn when the value is at certain levels | |
1301 | ||
1302 | One important part of this code is that the `Messenger` trait has one method | |
1303 | called `send` that takes an immutable reference to `self` and the text of the | |
1304 | message [1]. This trait is the interface our mock object needs to implement so | |
1305 | that the mock can be used in the same way a real object is. The other important | |
1306 | part is that we want to test the behavior of the `set_value` method on the | |
1307 | `LimitTracker` [2]. We can change what we pass in for the `value` parameter, | |
1308 | but `set_value` doesn’t return anything for us to make assertions on. We want | |
1309 | to be able to say that if we create a `LimitTracker` with something that | |
1310 | implements the `Messenger` trait and a particular value for `max`, when we pass | |
1311 | different numbers for `value`, the messenger is told to send the appropriate | |
1312 | messages. | |
1313 | ||
1314 | We need a mock object that, instead of sending an email or text message when we | |
1315 | call `send`, will only keep track of the messages it’s told to send. We can | |
1316 | create a new instance of the mock object, create a `LimitTracker` that uses the | |
1317 | mock object, call the `set_value` method on `LimitTracker`, and then check that | |
1318 | the mock object has the messages we expect. Listing 15-21 shows an attempt to | |
1319 | implement a mock object to do just that, but the borrow checker won’t allow it: | |
1320 | ||
1321 | Filename: src/lib.rs | |
1322 | ||
1323 | ``` | |
1324 | #[cfg(test)] | |
1325 | mod tests { | |
1326 | use super::*; | |
1327 | ||
1328 | [1] struct MockMessenger { | |
1329 | [2] sent_messages: Vec<String>, | |
1330 | } | |
1331 | ||
1332 | impl MockMessenger { | |
1333 | [3] fn new() -> MockMessenger { | |
1334 | MockMessenger { | |
1335 | sent_messages: vec![], | |
1336 | } | |
1337 | } | |
1338 | } | |
1339 | ||
1340 | [4] impl Messenger for MockMessenger { | |
1341 | fn send(&self, message: &str) { | |
1342 | [5] self.sent_messages.push(String::from(message)); | |
1343 | } | |
1344 | } | |
1345 | ||
1346 | #[test] | |
1347 | [6] fn it_sends_an_over_75_percent_warning_message() { | |
1348 | let mock_messenger = MockMessenger::new(); | |
1349 | let mut limit_tracker = LimitTracker::new(&mock_messenger, 100); | |
1350 | ||
1351 | limit_tracker.set_value(80); | |
1352 | ||
1353 | assert_eq!(mock_messenger.sent_messages.len(), 1); | |
1354 | } | |
1355 | } | |
1356 | ``` | |
1357 | ||
1358 | Listing 15-21: An attempt to implement a `MockMessenger` that isn’t allowed by | |
1359 | the borrow checker | |
1360 | ||
1361 | This test code defines a `MockMessenger` struct [1] that has a `sent_messages` | |
1362 | field with a `Vec` of `String` values [2] to keep track of the messages it’s | |
1363 | told to send. We also define an associated function `new` [3] to make it | |
1364 | convenient to create new `MockMessenger` values that start with an empty list | |
1365 | of messages. We then implement the `Messenger` trait for `MockMessenger` [4] so | |
1366 | we can give a `MockMessenger` to a `LimitTracker`. In the definition of the | |
1367 | `send` method [5], we take the message passed in as a parameter and store it in | |
1368 | the `MockMessenger` list of `sent_messages`. | |
1369 | ||
1370 | In the test, we’re testing what happens when the `LimitTracker` is told to set | |
1371 | `value` to something that is more than 75 percent of the `max` value [6]. | |
1372 | First, we create a new `MockMessenger`, which will start with an empty list of | |
1373 | messages. Then we create a new `LimitTracker` and give it a reference to the | |
1374 | new `MockMessenger` and a `max` value of 100. We call the `set_value` method on | |
1375 | the `LimitTracker` with a value of 80, which is more than 75 percent of 100. | |
1376 | Then we assert that the list of messages that the `MockMessenger` is keeping | |
1377 | track of should now have one message in it. | |
1378 | ||
1379 | However, there’s one problem with this test, as shown here: | |
1380 | ||
1381 | ``` | |
1382 | error[E0596]: cannot borrow `self.sent_messages` as mutable, as it is behind a `&` reference | |
1383 | --> src/lib.rs:58:13 | |
1384 | | | |
1385 | 2 | fn send(&self, msg: &str); | |
1386 | | ----- help: consider changing that to be a mutable reference: `&mut self` | |
1387 | ... | |
1388 | 58 | self.sent_messages.push(String::from(message)); | |
1389 | | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `self` is a `&` reference, so the data it refers to cannot be borrowed as mutable | |
1390 | ``` | |
1391 | ||
1392 | We can’t modify the `MockMessenger` to keep track of the messages, because the | |
1393 | `send` method takes an immutable reference to `self`. We also can’t take the | |
1394 | suggestion from the error text to use `&mut self` instead, because then the | |
1395 | signature of `send` wouldn’t match the signature in the `Messenger` trait | |
1396 | definition (feel free to try and see what error message you get). | |
1397 | ||
1398 | This is a situation in which interior mutability can help! We’ll store the | |
1399 | `sent_messages` within a `RefCell<T>`, and then the `send` method will be | |
1400 | able to modify `sent_messages` to store the messages we’ve seen. Listing 15-22 | |
1401 | shows what that looks like: | |
1402 | ||
1403 | Filename: src/lib.rs | |
1404 | ||
1405 | ``` | |
1406 | #[cfg(test)] | |
1407 | mod tests { | |
1408 | use super::*; | |
1409 | use std::cell::RefCell; | |
1410 | ||
1411 | struct MockMessenger { | |
1412 | [1] sent_messages: RefCell<Vec<String>>, | |
1413 | } | |
1414 | ||
1415 | impl MockMessenger { | |
1416 | fn new() -> MockMessenger { | |
1417 | MockMessenger { | |
1418 | sent_messages: RefCell::new(vec![]) [2], | |
1419 | } | |
1420 | } | |
1421 | } | |
1422 | ||
1423 | impl Messenger for MockMessenger { | |
1424 | fn send(&self, message: &str) { | |
1425 | [3] self.sent_messages.borrow_mut().push(String::from(message)); | |
1426 | } | |
1427 | } | |
1428 | ||
1429 | #[test] | |
1430 | fn it_sends_an_over_75_percent_warning_message() { | |
1431 | // --snip-- | |
1432 | ||
1433 | [4] assert_eq!(mock_messenger.sent_messages.borrow().len(), 1); | |
1434 | } | |
1435 | } | |
1436 | ``` | |
1437 | ||
1438 | Listing 15-22: Using `RefCell<T>` to mutate an inner value while the outer | |
1439 | value is considered immutable | |
1440 | ||
1441 | The `sent_messages` field is now of type `RefCell<Vec<String>>` [1] instead of | |
1442 | `Vec<String>`. In the `new` function, we create a new `RefCell<Vec<String>>` | |
1443 | instance around the empty vector [2]. | |
1444 | ||
1445 | For the implementation of the `send` method, the first parameter is still an | |
1446 | immutable borrow of `self`, which matches the trait definition. We call | |
1447 | `borrow_mut` on the `RefCell<Vec<String>>` in `self.sent_messages` [3] to get a | |
1448 | mutable reference to the value inside the `RefCell<Vec<String>>`, which is | |
1449 | the vector. Then we can call `push` on the mutable reference to the vector to | |
1450 | keep track of the messages sent during the test. | |
1451 | ||
1452 | The last change we have to make is in the assertion: to see how many items are | |
1453 | in the inner vector, we call `borrow` on the `RefCell<Vec<String>>` to get an | |
1454 | immutable reference to the vector [4]. | |
1455 | ||
1456 | Now that you’ve seen how to use `RefCell<T>`, let’s dig into how it works! | |
1457 | ||
1458 | #### Keeping Track of Borrows at Runtime with `RefCell<T>` | |
1459 | ||
1460 | When creating immutable and mutable references, we use the `&` and `&mut` | |
1461 | syntax, respectively. With `RefCell<T>`, we use the `borrow` and `borrow_mut` | |
1462 | methods, which are part of the safe API that belongs to `RefCell<T>`. The | |
1463 | `borrow` method returns the smart pointer type `Ref<T>`, and `borrow_mut` | |
1464 | returns the smart pointer type `RefMut<T>`. Both types implement `Deref`, so we | |
1465 | can treat them like regular references. | |
1466 | ||
1467 | The `RefCell<T>` keeps track of how many `Ref<T>` and `RefMut<T>` smart | |
1468 | pointers are currently active. Every time we call `borrow`, the `RefCell<T>` | |
1469 | increases its count of how many immutable borrows are active. When a `Ref<T>` | |
1470 | value goes out of scope, the count of immutable borrows goes down by one. Just | |
1471 | like the compile-time borrowing rules, `RefCell<T>` lets us have many immutable | |
1472 | borrows or one mutable borrow at any point in time. | |
1473 | ||
1474 | If we try to violate these rules, rather than getting a compiler error as we | |
1475 | would with references, the implementation of `RefCell<T>` will panic at | |
1476 | runtime. Listing 15-23 shows a modification of the implementation of `send` in | |
1477 | Listing 15-22. We’re deliberately trying to create two mutable borrows active | |
1478 | for the same scope to illustrate that `RefCell<T>` prevents us from doing this | |
1479 | at runtime. | |
1480 | ||
1481 | Filename: src/lib.rs | |
1482 | ||
1483 | ``` | |
1484 | impl Messenger for MockMessenger { | |
1485 | fn send(&self, message: &str) { | |
1486 | let mut one_borrow = self.sent_messages.borrow_mut(); | |
1487 | let mut two_borrow = self.sent_messages.borrow_mut(); | |
1488 | ||
1489 | one_borrow.push(String::from(message)); | |
1490 | two_borrow.push(String::from(message)); | |
1491 | } | |
1492 | } | |
1493 | ``` | |
1494 | ||
1495 | Listing 15-23: Creating two mutable references in the same scope to see that | |
1496 | `RefCell<T>` will panic | |
1497 | ||
1498 | We create a variable `one_borrow` for the `RefMut<T>` smart pointer returned | |
1499 | from `borrow_mut`. Then we create another mutable borrow in the same way in the | |
1500 | variable `two_borrow`. This makes two mutable references in the same scope, | |
1501 | which isn’t allowed. When we run the tests for our library, the code in Listing | |
1502 | 15-23 will compile without any errors, but the test will fail: | |
1503 | ||
1504 | ``` | |
1505 | ---- tests::it_sends_an_over_75_percent_warning_message stdout ---- | |
1506 | thread 'main' panicked at 'already borrowed: BorrowMutError', src/lib.rs:60:53 | |
1507 | note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace | |
1508 | ``` | |
1509 | ||
1510 | Notice that the code panicked with the message `already borrowed: | |
1511 | BorrowMutError`. This is how `RefCell<T>` handles violations of the borrowing | |
1512 | rules at runtime. | |
1513 | ||
04454e1e FG |
1514 | Choosing to catch borrowing errors at runtime rather than compile time, as |
1515 | we've done here, means you'd potentially be finding mistakes in your code later | |
1516 | in the development process: possibly not until your code was deployed to | |
1517 | production. Also, your code would incur a small runtime performance penalty as | |
1518 | a result of keeping track of the borrows at runtime rather than compile time. | |
1519 | However, using `RefCell<T>` makes it possible to write a mock object that can | |
1520 | modify itself to keep track of the messages it has seen while you’re using it | |
1521 | in a context where only immutable values are allowed. You can use `RefCell<T>` | |
1522 | despite its trade-offs to get more functionality than regular references | |
1523 | provide. | |
5099ac24 FG |
1524 | |
1525 | ### Having Multiple Owners of Mutable Data by Combining `Rc<T>` and `RefCell<T>` | |
1526 | ||
1527 | A common way to use `RefCell<T>` is in combination with `Rc<T>`. Recall that | |
1528 | `Rc<T>` lets you have multiple owners of some data, but it only gives immutable | |
1529 | access to that data. If you have an `Rc<T>` that holds a `RefCell<T>`, you can | |
1530 | get a value that can have multiple owners *and* that you can mutate! | |
1531 | ||
1532 | For example, recall the cons list example in Listing 15-18 where we used | |
1533 | `Rc<T>` to allow multiple lists to share ownership of another list. Because | |
1534 | `Rc<T>` holds only immutable values, we can’t change any of the values in the | |
1535 | list once we’ve created them. Let’s add in `RefCell<T>` to gain the ability to | |
1536 | change the values in the lists. Listing 15-24 shows that by using a | |
1537 | `RefCell<T>` in the `Cons` definition, we can modify the value stored in all | |
1538 | the lists: | |
1539 | ||
1540 | Filename: src/main.rs | |
1541 | ||
1542 | ``` | |
1543 | #[derive(Debug)] | |
1544 | enum List { | |
1545 | Cons(Rc<RefCell<i32>>, Rc<List>), | |
1546 | Nil, | |
1547 | } | |
1548 | ||
1549 | use crate::List::{Cons, Nil}; | |
1550 | use std::cell::RefCell; | |
1551 | use std::rc::Rc; | |
1552 | ||
1553 | fn main() { | |
1554 | [1] let value = Rc::new(RefCell::new(5)); | |
1555 | ||
1556 | [2] let a = Rc::new(Cons(Rc::clone(&value), Rc::new(Nil))); | |
1557 | ||
1558 | let b = Cons(Rc::new(RefCell::new(3)), Rc::clone(&a)); | |
1559 | let c = Cons(Rc::new(RefCell::new(4)), Rc::clone(&a)); | |
1560 | ||
1561 | [3] *value.borrow_mut() += 10; | |
1562 | ||
1563 | println!("a after = {:?}", a); | |
1564 | println!("b after = {:?}", b); | |
1565 | println!("c after = {:?}", c); | |
1566 | } | |
1567 | ``` | |
1568 | ||
1569 | Listing 15-24: Using `Rc<RefCell<i32>>` to create a `List` that we can mutate | |
1570 | ||
1571 | We create a value that is an instance of `Rc<RefCell<i32>>` and store it in a | |
1572 | variable named `value` [1] so we can access it directly later. Then we create a | |
1573 | `List` in `a` with a `Cons` variant that holds `value` [2]. We need to clone | |
1574 | `value` so both `a` and `value` have ownership of the inner `5` value rather | |
1575 | than transferring ownership from `value` to `a` or having `a` borrow from | |
1576 | `value`. | |
1577 | ||
1578 | We wrap the list `a` in an `Rc<T>` so when we create lists `b` and `c`, they | |
1579 | can both refer to `a`, which is what we did in Listing 15-18. | |
1580 | ||
04454e1e FG |
1581 | After we’ve created the lists in `a`, `b`, and `c`, we want to add 10 to the |
1582 | value in `value` [3]. We do this by calling `borrow_mut` on `value`, which uses | |
1583 | the automatic dereferencing feature we discussed in Chapter 5 (see the section | |
5099ac24 FG |
1584 | “Where’s the `->` Operator?”) to dereference the `Rc<T>` to the inner |
1585 | `RefCell<T>` value. The `borrow_mut` method returns a `RefMut<T>` smart | |
1586 | pointer, and we use the dereference operator on it and change the inner value. | |
1587 | ||
1588 | When we print `a`, `b`, and `c`, we can see that they all have the modified | |
1589 | value of 15 rather than 5: | |
1590 | ||
1591 | ``` | |
1592 | a after = Cons(RefCell { value: 15 }, Nil) | |
1593 | b after = Cons(RefCell { value: 3 }, Cons(RefCell { value: 15 }, Nil)) | |
1594 | c after = Cons(RefCell { value: 4 }, Cons(RefCell { value: 15 }, Nil)) | |
1595 | ``` | |
1596 | ||
1597 | This technique is pretty neat! By using `RefCell<T>`, we have an outwardly | |
1598 | immutable `List` value. But we can use the methods on `RefCell<T>` that provide | |
1599 | access to its interior mutability so we can modify our data when we need to. | |
1600 | The runtime checks of the borrowing rules protect us from data races, and it’s | |
1601 | sometimes worth trading a bit of speed for this flexibility in our data | |
1602 | structures. | |
1603 | ||
1604 | The standard library has other types that provide interior mutability, such as | |
1605 | `Cell<T>`, which is similar except that instead of giving references to the | |
1606 | inner value, the value is copied in and out of the `Cell<T>`. There’s also | |
1607 | `Mutex<T>`, which offers interior mutability that’s safe to use across threads; | |
1608 | we’ll discuss its use in Chapter 16. Check out the standard library docs for | |
1609 | more details on the differences between these types. | |
1610 | ||
1611 | ## Reference Cycles Can Leak Memory | |
1612 | ||
1613 | Rust’s memory safety guarantees make it difficult, but not impossible, to | |
1614 | accidentally create memory that is never cleaned up (known as a *memory leak*). | |
1615 | Preventing memory leaks entirely is not one of Rust’s guarantees, meaning | |
1616 | memory leaks are memory safe in Rust. We can see that Rust allows memory leaks | |
1617 | by using `Rc<T>` and `RefCell<T>`: it’s possible to create references where | |
1618 | items refer to each other in a cycle. This creates memory leaks because the | |
1619 | reference count of each item in the cycle will never reach 0, and the values | |
1620 | will never be dropped. | |
1621 | ||
1622 | ### Creating a Reference Cycle | |
1623 | ||
1624 | Let’s look at how a reference cycle might happen and how to prevent it, | |
1625 | starting with the definition of the `List` enum and a `tail` method in Listing | |
1626 | 15-25: | |
1627 | ||
1628 | Filename: src/main.rs | |
1629 | ||
1630 | ``` | |
1631 | use crate::List::{Cons, Nil}; | |
1632 | use std::cell::RefCell; | |
1633 | use std::rc::Rc; | |
1634 | ||
1635 | #[derive(Debug)] | |
1636 | enum List { | |
1637 | [1] Cons(i32, RefCell<Rc<List>>), | |
1638 | Nil, | |
1639 | } | |
1640 | ||
1641 | impl List { | |
1642 | [2] fn tail(&self) -> Option<&RefCell<Rc<List>>> { | |
1643 | match self { | |
1644 | Cons(_, item) => Some(item), | |
1645 | Nil => None, | |
1646 | } | |
1647 | } | |
1648 | } | |
1649 | ``` | |
1650 | ||
1651 | Listing 15-25: A cons list definition that holds a `RefCell<T>` so we can | |
1652 | modify what a `Cons` variant is referring to | |
1653 | ||
1654 | We’re using another variation of the `List` definition from Listing 15-5. The | |
1655 | second element in the `Cons` variant is now `RefCell<Rc<List>>` [1], meaning | |
1656 | that instead of having the ability to modify the `i32` value as we did in | |
04454e1e | 1657 | Listing 15-24, we want to modify the `List` value a `Cons` variant is |
5099ac24 FG |
1658 | pointing to. We’re also adding a `tail` method [2] to make it convenient for us |
1659 | to access the second item if we have a `Cons` variant. | |
1660 | ||
1661 | In Listing 15-26, we’re adding a `main` function that uses the definitions in | |
1662 | Listing 15-25. This code creates a list in `a` and a list in `b` that points to | |
1663 | the list in `a`. Then it modifies the list in `a` to point to `b`, creating a | |
1664 | reference cycle. There are `println!` statements along the way to show what the | |
1665 | reference counts are at various points in this process. | |
1666 | ||
1667 | Filename: src/main.rs | |
1668 | ||
1669 | ``` | |
1670 | fn main() { | |
1671 | [1] let a = Rc::new(Cons(5, RefCell::new(Rc::new(Nil)))); | |
1672 | ||
1673 | println!("a initial rc count = {}", Rc::strong_count(&a)); | |
1674 | println!("a next item = {:?}", a.tail()); | |
1675 | ||
1676 | [2] let b = Rc::new(Cons(10, RefCell::new(Rc::clone(&a)))); | |
1677 | ||
1678 | println!("a rc count after b creation = {}", Rc::strong_count(&a)); | |
1679 | println!("b initial rc count = {}", Rc::strong_count(&b)); | |
1680 | println!("b next item = {:?}", b.tail()); | |
1681 | ||
1682 | [3] if let Some(link) = a.tail() { | |
1683 | [4] *link.borrow_mut() = Rc::clone(&b); | |
1684 | } | |
1685 | ||
1686 | println!("b rc count after changing a = {}", Rc::strong_count(&b)); | |
1687 | println!("a rc count after changing a = {}", Rc::strong_count(&a)); | |
1688 | ||
1689 | // Uncomment the next line to see that we have a cycle; | |
1690 | // it will overflow the stack | |
1691 | // println!("a next item = {:?}", a.tail()); | |
1692 | } | |
1693 | ``` | |
1694 | ||
1695 | Listing 15-26: Creating a reference cycle of two `List` values pointing to each | |
1696 | other | |
1697 | ||
1698 | We create an `Rc<List>` instance holding a `List` value in the variable `a` | |
1699 | with an initial list of `5, Nil` [1]. We then create an `Rc<List>` instance | |
1700 | holding another `List` value in the variable `b` that contains the value 10 and | |
1701 | points to the list in `a` [2]. | |
1702 | ||
1703 | We modify `a` so it points to `b` instead of `Nil`, creating a cycle. We do | |
1704 | that by using the `tail` method to get a reference to the `RefCell<Rc<List>>` | |
1705 | in `a`, which we put in the variable `link` [3]. Then we use the `borrow_mut` | |
1706 | method on the `RefCell<Rc<List>>` to change the value inside from an `Rc<List>` | |
1707 | that holds a `Nil` value to the `Rc<List>` in `b` [4]. | |
1708 | ||
1709 | When we run this code, keeping the last `println!` commented out for the | |
1710 | moment, we’ll get this output: | |
1711 | ||
1712 | ``` | |
1713 | a initial rc count = 1 | |
1714 | a next item = Some(RefCell { value: Nil }) | |
1715 | a rc count after b creation = 2 | |
1716 | b initial rc count = 1 | |
1717 | b next item = Some(RefCell { value: Cons(5, RefCell { value: Nil }) }) | |
1718 | b rc count after changing a = 2 | |
1719 | a rc count after changing a = 2 | |
1720 | ``` | |
1721 | ||
1722 | The reference count of the `Rc<List>` instances in both `a` and `b` are 2 after | |
1723 | we change the list in `a` to point to `b`. At the end of `main`, Rust drops the | |
1724 | variable `b`, which decreases the reference count of the `Rc<List>` instance | |
1725 | from 2 to 1. The memory that `Rc<List>` has on the heap won’t be dropped at | |
1726 | this point, because its reference count is 1, not 0. Then Rust drops `a`, which | |
1727 | decreases the reference count of the `a` `Rc<List>` instance from 2 to 1 as | |
1728 | well. This instance’s memory can’t be dropped either, because the other | |
1729 | `Rc<List>` instance still refers to it. The memory allocated to the list will | |
1730 | remain uncollected forever. To visualize this reference cycle, we’ve created a | |
1731 | diagram in Figure 15-4. | |
1732 | ||
1733 | <img alt="Reference cycle of lists" src="img/trpl15-04.svg" class="center" /> | |
1734 | ||
1735 | Figure 15-4: A reference cycle of lists `a` and `b` pointing to each other | |
1736 | ||
1737 | If you uncomment the last `println!` and run the program, Rust will try to | |
1738 | print this cycle with `a` pointing to `b` pointing to `a` and so forth until it | |
1739 | overflows the stack. | |
1740 | ||
04454e1e FG |
1741 | <!-- "in this case" meaning with the last print statement commented out? /LC --> |
1742 | <!-- Technically with both the println commented out and not. I've tried to | |
1743 | clarify the purpose of this paragraph earlier /Carol --> | |
1744 | Compared to a real-world program, the consequences creating a reference cycle | |
1745 | in this example aren’t very dire: right after we create the reference cycle, | |
1746 | the program ends. However, if a more complex program allocated lots of memory | |
1747 | in a cycle and held onto it for a long time, the program would use more memory | |
1748 | than it needed and might overwhelm the system, causing it to run out of | |
1749 | available memory. | |
5099ac24 FG |
1750 | |
1751 | Creating reference cycles is not easily done, but it’s not impossible either. | |
1752 | If you have `RefCell<T>` values that contain `Rc<T>` values or similar nested | |
1753 | combinations of types with interior mutability and reference counting, you must | |
1754 | ensure that you don’t create cycles; you can’t rely on Rust to catch them. | |
1755 | Creating a reference cycle would be a logic bug in your program that you should | |
1756 | use automated tests, code reviews, and other software development practices to | |
1757 | minimize. | |
1758 | ||
1759 | Another solution for avoiding reference cycles is reorganizing your data | |
1760 | structures so that some references express ownership and some references don’t. | |
1761 | As a result, you can have cycles made up of some ownership relationships and | |
1762 | some non-ownership relationships, and only the ownership relationships affect | |
1763 | whether or not a value can be dropped. In Listing 15-25, we always want `Cons` | |
1764 | variants to own their list, so reorganizing the data structure isn’t possible. | |
1765 | Let’s look at an example using graphs made up of parent nodes and child nodes | |
1766 | to see when non-ownership relationships are an appropriate way to prevent | |
1767 | reference cycles. | |
1768 | ||
1769 | ### Preventing Reference Cycles: Turning an `Rc<T>` into a `Weak<T>` | |
1770 | ||
1771 | So far, we’ve demonstrated that calling `Rc::clone` increases the | |
1772 | `strong_count` of an `Rc<T>` instance, and an `Rc<T>` instance is only cleaned | |
1773 | up if its `strong_count` is 0. You can also create a *weak reference* to the | |
1774 | value within an `Rc<T>` instance by calling `Rc::downgrade` and passing a | |
04454e1e FG |
1775 | reference to the `Rc<T>`. Strong references are how you can share ownership of |
1776 | an `Rc<T>` instance. Weak references don’t express an ownership relationship, | |
1777 | and their count doesn't affect when an `Rc<T>` instance is cleaned up. They | |
1778 | won’t cause a reference cycle because any cycle involving some weak references | |
1779 | will be broken once the strong reference count of values involved is 0. | |
1780 | ||
1781 | <!-- can you explain what a weak reference does/is for? By the end of this | |
1782 | paragraph I wasn't entirely sure. It would be good to have a high level | |
1783 | succinct definition /LC --> | |
1784 | <!-- I've moved the explanation that was in the next paragraph to this spot and | |
1785 | modified it a bit, does this work? /Carol --> | |
1786 | ||
1787 | When you call `Rc::downgrade`, you get a smart pointer of type `Weak<T>`. | |
1788 | Instead of increasing the `strong_count` in the `Rc<T>` instance by 1, calling | |
1789 | `Rc::downgrade` increases the `weak_count` by 1. The `Rc<T>` type uses | |
1790 | `weak_count` to keep track of how many `Weak<T>` references exist, similar to | |
1791 | `strong_count`. The difference is the `weak_count` doesn’t need to be 0 for the | |
1792 | `Rc<T>` instance to be cleaned up. | |
5099ac24 FG |
1793 | |
1794 | Because the value that `Weak<T>` references might have been dropped, to do | |
1795 | anything with the value that a `Weak<T>` is pointing to, you must make sure the | |
1796 | value still exists. Do this by calling the `upgrade` method on a `Weak<T>` | |
1797 | instance, which will return an `Option<Rc<T>>`. You’ll get a result of `Some` | |
1798 | if the `Rc<T>` value has not been dropped yet and a result of `None` if the | |
1799 | `Rc<T>` value has been dropped. Because `upgrade` returns an `Option<Rc<T>>`, | |
1800 | Rust will ensure that the `Some` case and the `None` case are handled, and | |
1801 | there won’t be an invalid pointer. | |
1802 | ||
1803 | As an example, rather than using a list whose items know only about the next | |
1804 | item, we’ll create a tree whose items know about their children items *and* | |
1805 | their parent items. | |
1806 | ||
1807 | #### Creating a Tree Data Structure: a `Node` with Child Nodes | |
1808 | ||
1809 | To start, we’ll build a tree with nodes that know about their child nodes. | |
1810 | We’ll create a struct named `Node` that holds its own `i32` value as well as | |
1811 | references to its children `Node` values: | |
1812 | ||
1813 | Filename: src/main.rs | |
1814 | ||
1815 | ``` | |
1816 | use std::cell::RefCell; | |
1817 | use std::rc::Rc; | |
1818 | ||
1819 | #[derive(Debug)] | |
1820 | struct Node { | |
1821 | value: i32, | |
1822 | children: RefCell<Vec<Rc<Node>>>, | |
1823 | } | |
1824 | ``` | |
1825 | ||
1826 | We want a `Node` to own its children, and we want to share that ownership with | |
1827 | variables so we can access each `Node` in the tree directly. To do this, we | |
1828 | define the `Vec<T>` items to be values of type `Rc<Node>`. We also want to | |
1829 | modify which nodes are children of another node, so we have a `RefCell<T>` in | |
1830 | `children` around the `Vec<Rc<Node>>`. | |
1831 | ||
1832 | Next, we’ll use our struct definition and create one `Node` instance named | |
1833 | `leaf` with the value 3 and no children, and another instance named `branch` | |
1834 | with the value 5 and `leaf` as one of its children, as shown in Listing 15-27: | |
1835 | ||
1836 | Filename: src/main.rs | |
1837 | ||
1838 | ``` | |
1839 | fn main() { | |
1840 | let leaf = Rc::new(Node { | |
1841 | value: 3, | |
1842 | children: RefCell::new(vec![]), | |
1843 | }); | |
1844 | ||
1845 | let branch = Rc::new(Node { | |
1846 | value: 5, | |
1847 | children: RefCell::new(vec![Rc::clone(&leaf)]), | |
1848 | }); | |
1849 | } | |
1850 | ``` | |
1851 | ||
1852 | Listing 15-27: Creating a `leaf` node with no children and a `branch` node with | |
1853 | `leaf` as one of its children | |
1854 | ||
1855 | We clone the `Rc<Node>` in `leaf` and store that in `branch`, meaning the | |
1856 | `Node` in `leaf` now has two owners: `leaf` and `branch`. We can get from | |
1857 | `branch` to `leaf` through `branch.children`, but there’s no way to get from | |
1858 | `leaf` to `branch`. The reason is that `leaf` has no reference to `branch` and | |
1859 | doesn’t know they’re related. We want `leaf` to know that `branch` is its | |
1860 | parent. We’ll do that next. | |
1861 | ||
1862 | #### Adding a Reference from a Child to Its Parent | |
1863 | ||
1864 | To make the child node aware of its parent, we need to add a `parent` field to | |
1865 | our `Node` struct definition. The trouble is in deciding what the type of | |
1866 | `parent` should be. We know it can’t contain an `Rc<T>`, because that would | |
1867 | create a reference cycle with `leaf.parent` pointing to `branch` and | |
1868 | `branch.children` pointing to `leaf`, which would cause their `strong_count` | |
1869 | values to never be 0. | |
1870 | ||
1871 | Thinking about the relationships another way, a parent node should own its | |
1872 | children: if a parent node is dropped, its child nodes should be dropped as | |
1873 | well. However, a child should not own its parent: if we drop a child node, the | |
1874 | parent should still exist. This is a case for weak references! | |
1875 | ||
1876 | So instead of `Rc<T>`, we’ll make the type of `parent` use `Weak<T>`, | |
1877 | specifically a `RefCell<Weak<Node>>`. Now our `Node` struct definition looks | |
1878 | like this: | |
1879 | ||
1880 | Filename: src/main.rs | |
1881 | ||
1882 | ``` | |
1883 | use std::cell::RefCell; | |
1884 | use std::rc::{Rc, Weak}; | |
1885 | ||
1886 | #[derive(Debug)] | |
1887 | struct Node { | |
1888 | value: i32, | |
1889 | parent: RefCell<Weak<Node>>, | |
1890 | children: RefCell<Vec<Rc<Node>>>, | |
1891 | } | |
1892 | ``` | |
1893 | ||
1894 | A node will be able to refer to its parent node but doesn’t own its parent. | |
1895 | In Listing 15-28, we update `main` to use this new definition so the `leaf` | |
1896 | node will have a way to refer to its parent, `branch`: | |
1897 | ||
1898 | Filename: src/main.rs | |
1899 | ||
1900 | ``` | |
1901 | fn main() { | |
1902 | let leaf = Rc::new(Node { | |
1903 | value: 3, | |
1904 | [1] parent: RefCell::new(Weak::new()), | |
1905 | children: RefCell::new(vec![]), | |
1906 | }); | |
1907 | ||
1908 | [2] println!("leaf parent = {:?}", leaf.parent.borrow().upgrade()); | |
1909 | ||
1910 | let branch = Rc::new(Node { | |
1911 | value: 5, | |
1912 | [3] parent: RefCell::new(Weak::new()), | |
1913 | children: RefCell::new(vec![Rc::clone(&leaf)]), | |
1914 | }); | |
1915 | ||
1916 | [4] *leaf.parent.borrow_mut() = Rc::downgrade(&branch); | |
1917 | ||
1918 | [5] println!("leaf parent = {:?}", leaf.parent.borrow().upgrade()); | |
1919 | } | |
1920 | ``` | |
1921 | ||
1922 | Listing 15-28: A `leaf` node with a weak reference to its parent node `branch` | |
1923 | ||
04454e1e FG |
1924 | Creating the `leaf` node looks similar to Listing 15-27 with the exception of |
1925 | the `parent` field: `leaf` starts out without a parent, so we create a new, | |
1926 | empty `Weak<Node>` reference instance [1]. | |
5099ac24 FG |
1927 | |
1928 | At this point, when we try to get a reference to the parent of `leaf` by using | |
1929 | the `upgrade` method, we get a `None` value. We see this in the output from the | |
1930 | first `println!` statement [2]: | |
1931 | ||
1932 | ``` | |
1933 | leaf parent = None | |
1934 | ``` | |
1935 | ||
1936 | When we create the `branch` node, it will also have a new `Weak<Node>` | |
1937 | reference in the `parent` field [3], because `branch` doesn’t have a parent | |
1938 | node. We still have `leaf` as one of the children of `branch`. Once we have the | |
1939 | `Node` instance in `branch`, we can modify `leaf` to give it a `Weak<Node>` | |
1940 | reference to its parent [4]. We use the `borrow_mut` method on the | |
1941 | `RefCell<Weak<Node>>` in the `parent` field of `leaf`, and then we use the | |
1942 | `Rc::downgrade` function to create a `Weak<Node>` reference to `branch` from | |
1943 | the `Rc<Node>` in `branch.` | |
1944 | ||
1945 | When we print the parent of `leaf` again [5], this time we’ll get a `Some` | |
1946 | variant holding `branch`: now `leaf` can access its parent! When we print | |
1947 | `leaf`, we also avoid the cycle that eventually ended in a stack overflow like | |
1948 | we had in Listing 15-26; the `Weak<Node>` references are printed as `(Weak)`: | |
1949 | ||
1950 | ``` | |
1951 | leaf parent = Some(Node { value: 5, parent: RefCell { value: (Weak) }, | |
1952 | children: RefCell { value: [Node { value: 3, parent: RefCell { value: (Weak) }, | |
1953 | children: RefCell { value: [] } }] } }) | |
1954 | ``` | |
1955 | ||
1956 | The lack of infinite output indicates that this code didn’t create a reference | |
1957 | cycle. We can also tell this by looking at the values we get from calling | |
1958 | `Rc::strong_count` and `Rc::weak_count`. | |
1959 | ||
1960 | #### Visualizing Changes to `strong_count` and `weak_count` | |
1961 | ||
1962 | Let’s look at how the `strong_count` and `weak_count` values of the `Rc<Node>` | |
1963 | instances change by creating a new inner scope and moving the creation of | |
1964 | `branch` into that scope. By doing so, we can see what happens when `branch` is | |
1965 | created and then dropped when it goes out of scope. The modifications are shown | |
1966 | in Listing 15-29: | |
1967 | ||
1968 | Filename: src/main.rs | |
1969 | ||
1970 | ``` | |
1971 | fn main() { | |
1972 | let leaf = Rc::new(Node { | |
1973 | value: 3, | |
1974 | parent: RefCell::new(Weak::new()), | |
1975 | children: RefCell::new(vec![]), | |
1976 | }); | |
1977 | ||
1978 | [1] println!( | |
1979 | "leaf strong = {}, weak = {}", | |
1980 | Rc::strong_count(&leaf), | |
1981 | Rc::weak_count(&leaf), | |
1982 | ); | |
1983 | ||
1984 | [2] { | |
1985 | let branch = Rc::new(Node { | |
1986 | value: 5, | |
1987 | parent: RefCell::new(Weak::new()), | |
1988 | children: RefCell::new(vec![Rc::clone(&leaf)]), | |
1989 | }); | |
1990 | ||
1991 | *leaf.parent.borrow_mut() = Rc::downgrade(&branch); | |
1992 | ||
1993 | [3] println!( | |
1994 | "branch strong = {}, weak = {}", | |
1995 | Rc::strong_count(&branch), | |
1996 | Rc::weak_count(&branch), | |
1997 | ); | |
1998 | ||
1999 | [4] println!( | |
2000 | "leaf strong = {}, weak = {}", | |
2001 | Rc::strong_count(&leaf), | |
2002 | Rc::weak_count(&leaf), | |
2003 | ); | |
2004 | [5] } | |
2005 | ||
2006 | [6] println!("leaf parent = {:?}", leaf.parent.borrow().upgrade()); | |
2007 | [7] println!( | |
2008 | "leaf strong = {}, weak = {}", | |
2009 | Rc::strong_count(&leaf), | |
2010 | Rc::weak_count(&leaf), | |
2011 | ); | |
2012 | } | |
2013 | ``` | |
2014 | ||
2015 | Listing 15-29: Creating `branch` in an inner scope and examining strong and | |
2016 | weak reference counts | |
2017 | ||
2018 | After `leaf` is created, its `Rc<Node>` has a strong count of 1 and a weak | |
2019 | count of 0 [1]. In the inner scope [2], we create `branch` and associate it | |
2020 | with `leaf`, at which point when we print the counts [3], the `Rc<Node>` in | |
2021 | `branch` will have a strong count of 1 and a weak count of 1 (for `leaf.parent` | |
2022 | pointing to `branch` with a `Weak<Node>`). When we print the counts in `leaf` | |
2023 | [4], we’ll see it will have a strong count of 2, because `branch` now has a | |
2024 | clone of the `Rc<Node>` of `leaf` stored in `branch.children`, but will still | |
2025 | have a weak count of 0. | |
2026 | ||
2027 | When the inner scope ends [5], `branch` goes out of scope and the strong count | |
2028 | of the `Rc<Node>` decreases to 0, so its `Node` is dropped. The weak count of 1 | |
2029 | from `leaf.parent` has no bearing on whether or not `Node` is dropped, so we | |
2030 | don’t get any memory leaks! | |
2031 | ||
2032 | If we try to access the parent of `leaf` after the end of the scope, we’ll get | |
2033 | `None` again [6]. At the end of the program [7], the `Rc<Node>` in `leaf` has a | |
2034 | strong count of 1 and a weak count of 0, because the variable `leaf` is now the | |
2035 | only reference to the `Rc<Node>` again. | |
2036 | ||
2037 | All of the logic that manages the counts and value dropping is built into | |
2038 | `Rc<T>` and `Weak<T>` and their implementations of the `Drop` trait. By | |
2039 | specifying that the relationship from a child to its parent should be a | |
2040 | `Weak<T>` reference in the definition of `Node`, you’re able to have parent | |
2041 | nodes point to child nodes and vice versa without creating a reference cycle | |
2042 | and memory leaks. | |
2043 | ||
2044 | ## Summary | |
2045 | ||
2046 | This chapter covered how to use smart pointers to make different guarantees and | |
2047 | trade-offs from those Rust makes by default with regular references. The | |
2048 | `Box<T>` type has a known size and points to data allocated on the heap. The | |
2049 | `Rc<T>` type keeps track of the number of references to data on the heap so | |
2050 | that data can have multiple owners. The `RefCell<T>` type with its interior | |
2051 | mutability gives us a type that we can use when we need an immutable type but | |
2052 | need to change an inner value of that type; it also enforces the borrowing | |
2053 | rules at runtime instead of at compile time. | |
2054 | ||
2055 | Also discussed were the `Deref` and `Drop` traits, which enable a lot of the | |
2056 | functionality of smart pointers. We explored reference cycles that can cause | |
2057 | memory leaks and how to prevent them using `Weak<T>`. | |
2058 | ||
2059 | If this chapter has piqued your interest and you want to implement your own | |
2060 | smart pointers, check out “The Rustonomicon” at | |
2061 | *https://doc.rust-lang.org/stable/nomicon/* for more useful information. | |
2062 | ||
2063 | Next, we’ll talk about concurrency in Rust. You’ll even learn about a few new | |
2064 | smart pointers. |