1 ## What Is Ownership?
2
3 Rust’s central feature is *ownership*. Although the feature is straightforward
4 to explain, it has deep implications for the rest of the language.
5
6 All programs have to manage the way they use a computer’s memory while running.
7 Some languages have garbage collection that regularly looks for no-longer-used
8 memory as the program runs; in other languages, the programmer must explicitly
9 allocate and free the memory. Rust uses a third approach: memory is managed
10 through a system of ownership with a set of rules that the compiler checks at
11 compile time. None of the ownership features slow down your program while it’s
12 running.
13
14 Because ownership is a new concept for many programmers, it does take some time
15 to get used to. The good news is that the more experienced you become with Rust
16 and the rules of the ownership system, the more you’ll be able to naturally
17 develop code that is safe and efficient. Keep at it!
18
19 When you understand ownership, you’ll have a solid foundation for understanding
20 the features that make Rust unique. In this chapter, you’ll learn ownership by
21 working through some examples that focus on a very common data structure:
22 strings.
23
24 > ### The Stack and the Heap
25 >
26 > In many programming languages, you don’t have to think about the stack and
27 > the heap very often. But in a systems programming language like Rust, whether
28 > a value is on the stack or the heap has more of an effect on how the language
29 > behaves and why you have to make certain decisions. Parts of ownership will
30 > be described in relation to the stack and the heap later in this chapter, so
31 > here is a brief explanation in preparation.
32 >
33 > Both the stack and the heap are parts of memory that are available to your
34 > code to use at runtime, but they are structured in different ways. The stack
35 > stores values in the order it gets them and removes the values in the
36 > opposite order. This is referred to as *last in, first out*. Think of a stack
37 > of plates: when you add more plates, you put them on top of the pile, and
38 > when you need a plate, you take one off the top. Adding or removing plates
39 > from the middle or bottom wouldn’t work as well! Adding data is called
40 > *pushing onto the stack*, and removing data is called *popping off the stack*.
41 >
42 > All data stored on the stack must have a known, fixed size. Data with an
43 > unknown size at compile time or a size that might change must be stored on
44 > the heap instead. The heap is less organized: when you put data on the heap,
45 > you request a certain amount of space. The memory allocator finds an empty
46 > spot in the heap that is big enough, marks it as being in use, and returns a
47 > *pointer*, which is the address of that location. This process is called
48 > *allocating on the heap* and is sometimes abbreviated as just *allocating*.
49 > Pushing values onto the stack is not considered allocating. Because the
50 > pointer is a known, fixed size, you can store the pointer on the stack, but
51 > when you want the actual data, you must follow the pointer.
52 >
53 > Think of being seated at a restaurant. When you enter, you state the number of
54 > people in your group, and the staff finds an empty table that fits everyone
55 > and leads you there. If someone in your group comes late, they can ask where
56 > you’ve been seated to find you.
57 >
58 > Pushing to the stack is faster than allocating on the heap because the
59 > allocator never has to search for a place to store new data; that
60 > location is always at the top of the stack. Comparatively, allocating space
61 > on the heap requires more work, because the allocator must first find
62 > a big enough space to hold the data and then perform bookkeeping to prepare
63 > for the next allocation.
64 >
65 > Accessing data in the heap is slower than accessing data on the stack because
66 > you have to follow a pointer to get there. Contemporary processors are faster
67 > if they jump around less in memory. Continuing the analogy, consider a server
68 > at a restaurant taking orders from many tables. It’s most efficient to get
69 > all the orders at one table before moving on to the next table. Taking an
70 > order from table A, then an order from table B, then one from A again, and
71 > then one from B again would be a much slower process. By the same token, a
72 > processor can do its job better if it works on data that’s close to other
73 > data (as it is on the stack) rather than farther away (as it can be on the
74 > heap). Allocating a large amount of space on the heap can also take time.
75 >
76 > When your code calls a function, the values passed into the function
77 > (including, potentially, pointers to data on the heap) and the function’s
78 > local variables get pushed onto the stack. When the function is over, those
79 > values get popped off the stack.
80 >
81 > Keeping track of what parts of code are using what data on the heap,
82 > minimizing the amount of duplicate data on the heap, and cleaning up unused
83 > data on the heap so you don’t run out of space are all problems that ownership
84 > addresses. Once you understand ownership, you won’t need to think about the
85 > stack and the heap very often, but knowing that managing heap data is why
86 > ownership exists can help explain why it works the way it does.
87
88 ### Ownership Rules
89
90 First, let’s take a look at the ownership rules. Keep these rules in mind as we
91 work through the examples that illustrate them:
92
93 * Each value in Rust has a variable that’s called its *owner*.
94 * There can only be one owner at a time.
95 * When the owner goes out of scope, the value will be dropped.
96
97 ### Variable Scope
98
99 We’ve walked through an example of a Rust program already in Chapter 2. Now
100 that we’re past basic syntax, we won’t include all the `fn main() {` code in
101 examples, so if you’re following along, you’ll have to put the following
102 examples inside a `main` function manually. As a result, our examples will be a
103 bit more concise, letting us focus on the actual details rather than
104 boilerplate code.
105
106 As a first example of ownership, we’ll look at the *scope* of some variables. A
107 scope is the range within a program for which an item is valid. Let’s say we
108 have a variable that looks like this:
109
110 ```rust
111 let s = "hello";
112 ```
113
114 The variable `s` refers to a string literal, where the value of the string is
115 hardcoded into the text of our program. The variable is valid from the point at
116 which it’s declared until the end of the current *scope*. Listing 4-1 has
117 comments annotating where the variable `s` is valid.
118
119 ```rust
120 {{#rustdoc_include ../listings/ch04-understanding-ownership/listing-04-01/src/main.rs:here}}
121 ```
122
123 <span class="caption">Listing 4-1: A variable and the scope in which it is
124 valid</span>
125
126 In other words, there are two important points in time here:
127
128 * When `s` comes *into scope*, it is valid.
129 * It remains valid until it goes *out of scope*.
130
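As a minimal sketch of these two points, the block below uses an inner scope to show where `s` is and isn't valid. The function name `scope_demo` and the `outer_len` variable are made up for illustration and aren't part of the listing above.

```rust
// Hypothetical helper used only to illustrate scoping.
fn scope_demo() -> usize {
    let outer_len;
    {                        // `s` is not valid here; it's not yet declared
        let s = "hello";     // `s` is valid from this point forward
        outer_len = s.len(); // do stuff with `s`
    }                        // this scope is now over, and `s` is no longer valid
    // Referring to `s` here would be a compile-time error.
    outer_len
}

fn main() {
    println!("length seen inside the scope: {}", scope_demo());
}
```
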
131 At this point, the relationship between scopes and when variables are valid is
132 similar to that in other programming languages. Now we’ll build on top of this
133 understanding by introducing the `String` type.
134
135 ### The `String` Type
136
137 To illustrate the rules of ownership, we need a data type that is more complex
138 than the ones we covered in the [“Data Types”][data-types]<!-- ignore -->
139 section of Chapter 3. The types covered previously are all stored on the stack
140 and popped off the stack when their scope is over, but we want to look at data
141 that is stored on the heap and explore how Rust knows when to clean up that
142 data.
143
144 We’ll use `String` as the example here and concentrate on the parts of `String`
145 that relate to ownership. These aspects also apply to other complex data types,
146 whether they are provided by the standard library or created by you. We’ll
147 discuss `String` in more depth in Chapter 8.
148
149 We’ve already seen string literals, where a string value is hardcoded into our
150 program. String literals are convenient, but they aren’t suitable for every
151 situation in which we may want to use text. One reason is that they’re
152 immutable. Another is that not every string value can be known when we write
153 our code: for example, what if we want to take user input and store it? For
154 these situations, Rust has a second string type, `String`. This type is
155 allocated on the heap and as such is able to store an amount of text that is
156 unknown to us at compile time. You can create a `String` from a string literal
157 using the `from` function, like so:
158
159 ```rust
160 let s = String::from("hello");
161 ```
162
163 The double colon (`::`) is an operator that allows us to namespace this
164 particular `from` function under the `String` type rather than using some sort
165 of name like `string_from`. We’ll discuss this syntax more in the [“Method
166 Syntax”][method-syntax]<!-- ignore --> section of Chapter 5 and when we talk
167 about namespacing with modules in [“Paths for Referring to an Item in the
168 Module Tree”][paths-module-tree]<!-- ignore --> in Chapter 7.
169
170 This kind of string *can* be mutated:
171
172 ```rust
173 {{#rustdoc_include ../listings/ch04-understanding-ownership/no-listing-01-can-mutate-string/src/main.rs:here}}
174 ```
175
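A minimal sketch of mutating a `String`, assuming a helper function named `build_greeting` (a name invented here for illustration), might look like this:

```rust
// Hypothetical helper; the name is not from the text above.
fn build_greeting() -> String {
    let mut s = String::from("hello");
    s.push_str(", world!"); // push_str() appends a string slice to a String
    s
}

fn main() {
    println!("{}", build_greeting());
}
```
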
176 So, what’s the difference here? Why can `String` be mutated but literals
177 cannot? The difference is how these two types deal with memory.
178
179 ### Memory and Allocation
180
181 In the case of a string literal, we know the contents at compile time, so the
182 text is hardcoded directly into the final executable. This is why string
183 literals are fast and efficient. But these properties only come from the string
184 literal’s immutability. Unfortunately, we can’t put a blob of memory into the
185 binary for each piece of text whose size is unknown at compile time and whose
186 size might change while running the program.
187
188 With the `String` type, in order to support a mutable, growable piece of text,
189 we need to allocate an amount of memory on the heap, unknown at compile time,
190 to hold the contents. This means:
191
192 * The memory must be requested from the memory allocator at runtime.
193 * We need a way of returning this memory to the allocator when we’re
194 done with our `String`.
195
196 That first part is done by us: when we call `String::from`, its implementation
197 requests the memory it needs. This is pretty much universal in programming
198 languages.
199
200 However, the second part is different. In languages with a *garbage collector
201 (GC)*, the GC keeps track of and cleans up memory that isn’t being used anymore,
202 and we don’t need to think about it. Without a GC, it’s our responsibility to
203 identify when memory is no longer being used and call code to explicitly return
204 it, just as we did to request it. Doing this correctly has historically been a
205 difficult programming problem. If we forget, we’ll waste memory. If we do it
206 too early, we’ll have an invalid variable. If we do it twice, that’s a bug too.
207 We need to pair exactly one `allocate` with exactly one `free`.
208
209 Rust takes a different path: the memory is automatically returned once the
210 variable that owns it goes out of scope. Here’s a version of our scope example
211 from Listing 4-1 using a `String` instead of a string literal:
212
213 ```rust
214 {{#rustdoc_include ../listings/ch04-understanding-ownership/no-listing-02-string-scope/src/main.rs:here}}
215 ```
216
217 There is a natural point at which we can return the memory our `String` needs
218 to the allocator: when `s` goes out of scope. When a variable goes out
219 of scope, Rust calls a special function for us. This function is called `drop`,
220 and it’s where the author of `String` can put the code to return the memory.
221 Rust calls `drop` automatically at the closing curly bracket.
222
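To watch this happen, we can sketch a type of our own that records when `drop` runs. Everything here is an assumption made for illustration: the `Noisy` type, its `drops` counter, and the function name are hypothetical, and `Rc` and `Cell` appear only so the counter can be read after the value is gone (both are covered much later in the book).

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical type that counts how many times `drop` has run.
struct Noisy {
    drops: Rc<Cell<u32>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        // Rust calls this automatically when a `Noisy` goes out of scope.
        self.drops.set(self.drops.get() + 1);
    }
}

// Returns the counter as seen inside the scope and after it has ended.
fn drops_inside_and_after_scope() -> (u32, u32) {
    let drops = Rc::new(Cell::new(0));
    let inside;
    {
        let _guard = Noisy { drops: Rc::clone(&drops) };
        inside = drops.get(); // `_guard` is still valid, so nothing was dropped yet
    } // `_guard` goes out of scope here, and Rust calls `drop`
    (inside, drops.get())
}

fn main() {
    let (inside, after) = drops_inside_and_after_scope();
    println!("drops inside scope: {}, after scope: {}", inside, after);
}
```
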
223 > Note: In C++, this pattern of deallocating resources at the end of an item’s
224 > lifetime is sometimes called *Resource Acquisition Is Initialization (RAII)*.
225 > The `drop` function in Rust will be familiar to you if you’ve used RAII
226 > patterns.
227
228 This pattern has a profound impact on the way Rust code is written. It may seem
229 simple right now, but the behavior of code can be unexpected in more
230 complicated situations when we want to have multiple variables use the data
231 we’ve allocated on the heap. Let’s explore some of those situations now.
232
233 #### Ways Variables and Data Interact: Move
234
235 Multiple variables can interact with the same data in different ways in Rust.
236 Let’s look at an example using an integer in Listing 4-2.
237
238 ```rust
239 {{#rustdoc_include ../listings/ch04-understanding-ownership/listing-04-02/src/main.rs:here}}
240 ```
241
242 <span class="caption">Listing 4-2: Assigning the integer value of variable `x`
243 to `y`</span>
244
245 We can probably guess what this is doing: “bind the value `5` to `x`; then make
246 a copy of the value in `x` and bind it to `y`.” We now have two variables, `x`
247 and `y`, and both equal `5`. This is indeed what is happening, because integers
248 are simple values with a known, fixed size, and these two `5` values are pushed
249 onto the stack.
250
251 Now let’s look at the `String` version:
252
253 ```rust
254 {{#rustdoc_include ../listings/ch04-understanding-ownership/no-listing-03-string-move/src/main.rs:here}}
255 ```
256
257 This looks very similar to the previous code, so we might assume that the way
258 it works would be the same: that is, the second line would make a copy of the
259 value in `s1` and bind it to `s2`. But this isn’t quite what happens.
260
261 Take a look at Figure 4-1 to see what is happening to `String` under the
262 covers. A `String` is made up of three parts, shown on the left: a pointer to
263 the memory that holds the contents of the string, a length, and a capacity.
264 This group of data is stored on the stack. On the right is the memory on the
265 heap that holds the contents.
266
267 <img alt="String in memory" src="img/trpl04-01.svg" class="center" style="width: 50%;" />
268
269 <span class="caption">Figure 4-1: Representation in memory of a `String`
270 holding the value `"hello"` bound to `s1`</span>
271
272 The length is how much memory, in bytes, the contents of the `String` are
273 currently using. The capacity is the total amount of memory, in bytes, that the
274 `String` has received from the allocator. The difference between length
275 and capacity matters, but not in this context, so for now, it’s fine to ignore
276 the capacity.
277
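The length and capacity can be inspected with the `len` and `capacity` methods. The sketch below assumes a helper named `string_parts` (hypothetical) and asks for room for at least 10 bytes up front; the allocator may hand back more, so the code doesn't rely on an exact capacity.

```rust
// Hypothetical helper returning (length, capacity) of a small `String`.
fn string_parts() -> (usize, usize) {
    let mut s = String::with_capacity(10); // request room for at least 10 bytes
    s.push_str("hello");                   // only 5 bytes are in use
    (s.len(), s.capacity())
}

fn main() {
    let (len, capacity) = string_parts();
    println!("length = {}, capacity = {}", len, capacity);
}
```
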
278 When we assign `s1` to `s2`, the `String` data is copied, meaning we copy the
279 pointer, the length, and the capacity that are on the stack. We do not copy the
280 data on the heap that the pointer refers to. In other words, the data
281 representation in memory looks like Figure 4-2.
282
283 <img alt="s1 and s2 pointing to the same value" src="img/trpl04-02.svg" class="center" style="width: 50%;" />
284
285 <span class="caption">Figure 4-2: Representation in memory of the variable `s2`
286 that has a copy of the pointer, length, and capacity of `s1`</span>
287
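One way to check that the heap data isn't copied is to compare the address of the heap buffer before and after the assignment. This is only a sketch: the function name `pointer_survives_move` is invented for illustration, and `as_ptr` is used purely to observe the address.

```rust
// Hypothetical helper: does the heap buffer stay in place across `s2 = s1`?
fn pointer_survives_move() -> bool {
    let s1 = String::from("hello");
    let before = s1.as_ptr(); // address of the heap buffer, as a raw pointer

    let s2 = s1;              // copies the pointer, length, and capacity only
    let after = s2.as_ptr();

    before == after           // same address: the heap data was not copied
}

fn main() {
    println!("same heap buffer: {}", pointer_survives_move());
}
```
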
288 The representation does *not* look like Figure 4-3, which is what memory would
289 look like if Rust instead copied the heap data as well. If Rust did this, the
290 operation `s2 = s1` could be very expensive in terms of runtime performance if
291 the data on the heap were large.
292
293 <img alt="s1 and s2 to two places" src="img/trpl04-03.svg" class="center" style="width: 50%;" />
294
295 <span class="caption">Figure 4-3: Another possibility for what `s2 = s1` might
296 do if Rust copied the heap data as well</span>
297
298 Earlier, we said that when a variable goes out of scope, Rust automatically
299 calls the `drop` function and cleans up the heap memory for that variable. But
300 Figure 4-2 shows both data pointers pointing to the same location. This is a
301 problem: when `s2` and `s1` go out of scope, they will both try to free the
302 same memory. This is known as a *double free* error and is one of the memory
303 safety bugs we mentioned previously. Freeing memory twice can lead to memory
304 corruption, which can potentially lead to security vulnerabilities.
305
306 To ensure memory safety, there’s one more detail to what happens in this
307 situation in Rust. Instead of trying to copy the allocated memory, Rust
308 considers `s1` to no longer be valid and, therefore, Rust doesn’t need to free
309 anything when `s1` goes out of scope. Check out what happens when you try to
310 use `s1` after `s2` is created; it won’t work:
311
312 ```rust,ignore,does_not_compile
313 {{#rustdoc_include ../listings/ch04-understanding-ownership/no-listing-04-cant-use-after-move/src/main.rs:here}}
314 ```
315
316 You’ll get an error like this because Rust prevents you from using the
317 invalidated variable:
318
319 ```console
320 {{#include ../listings/ch04-understanding-ownership/no-listing-04-cant-use-after-move/output.txt}}
321 ```
322
323 If you’ve heard the terms *shallow copy* and *deep copy* while working with
324 other languages, the concept of copying the pointer, length, and capacity
325 without copying the data probably sounds like making a shallow copy. But
326 because Rust also invalidates the first variable, instead of being called a
327 shallow copy, it’s known as a *move*. In this example, we would say that
328 `s1` was *moved* into `s2`. So what actually happens is shown in Figure 4-4.
329
330 <img alt="s1 moved to s2" src="img/trpl04-04.svg" class="center" style="width: 50%;" />
331
332 <span class="caption">Figure 4-4: Representation in memory after `s1` has been
333 invalidated</span>
334
335 That solves our problem! With only `s2` valid, when it goes out of scope, it
336 alone will free the memory, and we’re done.
337
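A compact sketch of the move, with the offending line left as a comment so the code still compiles, could look like the following. The function name `move_then_use` is hypothetical.

```rust
// Hypothetical helper illustrating a move from `s1` to `s2`.
fn move_then_use() -> String {
    let s1 = String::from("hello");
    let s2 = s1; // `s1` is moved into `s2` and is no longer valid

    // println!("{}, world!", s1); // uncommenting this is a compile-time error

    format!("{}, world!", s2) // `s2` now owns the data and can be used freely
}

fn main() {
    println!("{}", move_then_use());
}
```
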
338 In addition, there’s a design choice that’s implied by this: Rust will never
339 automatically create “deep” copies of your data. Therefore, any *automatic*
340 copying can be assumed to be inexpensive in terms of runtime performance.
341
342 #### Ways Variables and Data Interact: Clone
343
344 If we *do* want to deeply copy the heap data of the `String`, not just the
345 stack data, we can use a common method called `clone`. We’ll discuss method
346 syntax in Chapter 5, but because methods are a common feature in many
347 programming languages, you’ve probably seen them before.
348
349 Here’s an example of the `clone` method in action:
350
351 ```rust
352 {{#rustdoc_include ../listings/ch04-understanding-ownership/no-listing-05-clone/src/main.rs:here}}
353 ```
354
355 This works just fine and explicitly produces the behavior shown in Figure 4-3,
356 where the heap data *does* get copied.
357
358 When you see a call to `clone`, you know that some arbitrary code is being
359 executed and that code may be expensive. It’s a visual indicator that something
360 different is going on.
361
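As a rough sketch of what a deep copy means in practice, the code below clones a `String`, changes the clone, and confirms that the original is untouched and lives in a different heap buffer. The function name `clone_is_independent` is an assumption made for this example.

```rust
// Hypothetical helper: returns the original, the modified clone, and whether
// the two strings shared a heap buffer right after cloning.
fn clone_is_independent() -> (String, String, bool) {
    let s1 = String::from("hello");
    let mut s2 = s1.clone(); // copies the heap data as well as the stack data

    let same_buffer = s1.as_ptr() == s2.as_ptr();
    s2.push_str(", world!"); // changing the clone does not affect `s1`

    (s1, s2, same_buffer)
}

fn main() {
    let (s1, s2, same_buffer) = clone_is_independent();
    println!("s1 = {}, s2 = {}, same buffer: {}", s1, s2, same_buffer);
}
```
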
362 #### Stack-Only Data: Copy
363
364 There’s another wrinkle we haven’t talked about yet. This code using integers –
365 part of which was shown in Listing 4-2 – works and is valid:
366
367 ```rust
368 {{#rustdoc_include ../listings/ch04-understanding-ownership/no-listing-06-copy/src/main.rs:here}}
369 ```
370
371 But this code seems to contradict what we just learned: we don’t have a call to
372 `clone`, but `x` is still valid and wasn’t moved into `y`.
373
374 The reason is that types such as integers that have a known size at compile
375 time are stored entirely on the stack, so copies of the actual values are quick
376 to make. That means there’s no reason we would want to prevent `x` from being
377 valid after we create the variable `y`. In other words, there’s no difference
378 between deep and shallow copying here, so calling `clone` wouldn’t do anything
379 different from the usual shallow copying, and we can leave it out.
380
381 Rust has a special annotation called the `Copy` trait that we can place on
382 types like integers that are stored on the stack (we’ll talk more about traits
383 in Chapter 10). If a type has the `Copy` trait, an older variable is still
384 usable after assignment. Rust won’t let us annotate a type with the `Copy`
385 trait if the type, or any of its parts, has implemented the `Drop` trait. If
386 the type needs something special to happen when the value goes out of scope and
387 we add the `Copy` annotation to that type, we’ll get a compile-time error. To
388 learn about how to add the `Copy` annotation to your type, see [“Derivable
389 Traits”][derivable-traits]<!-- ignore --> in Appendix C.
390
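For a type of our own, one common way to add the annotation is to derive it. The sketch below assumes a small `Point` struct and a `copy_point` function, both invented for illustration; `Clone` is derived alongside `Copy` because `Copy` requires it.

```rust
// Hypothetical type made only of `Copy` fields, so it can be `Copy` itself.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// A struct with a `String` field could not derive `Copy`; adding
// `#[derive(Copy)]` to such a type is a compile-time error.

fn copy_point() -> (Point, Point) {
    let p1 = Point { x: 1, y: 2 };
    let p2 = p1; // `Point` is `Copy`, so this copies instead of moving
    (p1, p2)     // `p1` is still usable after the assignment
}

fn main() {
    let (p1, p2) = copy_point();
    println!("p1 = {:?}, p2 = {:?}", p1, p2);
}
```
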
391 So what types are `Copy`? You can check the documentation for the given type to
392 be sure, but as a general rule, any group of simple scalar values can be
393 `Copy`, and nothing that requires allocation or is some form of resource is
394 `Copy`. Here are some of the types that are `Copy`:
395
396 * All the integer types, such as `u32`.
397 * The Boolean type, `bool`, with values `true` and `false`.
398 * All the floating point types, such as `f64`.
399 * The character type, `char`.
400 * Tuples, if they only contain types that are also `Copy`. For example,
401 `(i32, i32)` is `Copy`, but `(i32, String)` is not.
402
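The tuple rule can be sketched as follows. The function names `copy_tuple` and `move_tuple` are hypothetical and exist only to contrast the two cases.

```rust
// Hypothetical helper: a tuple of `Copy` types is itself `Copy`.
fn copy_tuple() -> ((i32, i32), (i32, i32)) {
    let a = (1, 2);
    let b = a; // both fields are `Copy`, so the whole tuple is copied
    (a, b)     // `a` is still usable
}

// Hypothetical helper: a tuple containing a `String` is moved instead.
fn move_tuple() -> (i32, String) {
    let a = (1, String::from("hello"));
    let b = a;      // the `String` field is not `Copy`, so the tuple is moved
    // let n = a.0; // would not compile: `a` was moved into `b`
    b
}

fn main() {
    println!("{:?}", copy_tuple());
    println!("{:?}", move_tuple());
}
```
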
403 ### Ownership and Functions
404
405 The semantics for passing a value to a function are similar to those for
406 assigning a value to a variable. Passing a variable to a function will move or
407 copy, just as assignment does. Listing 4-3 has an example with some annotations
408 showing where variables go into and out of scope.
409
410 <span class="filename">Filename: src/main.rs</span>
411
412 ```rust
413 {{#rustdoc_include ../listings/ch04-understanding-ownership/listing-04-03/src/main.rs}}
414 ```
415
416 <span class="caption">Listing 4-3: Functions with ownership and scope
417 annotated</span>
418
419 If we tried to use `s` after the call to `takes_ownership`, Rust would report a
420 compile-time error. These static checks protect us from mistakes. Try adding
421 code to `main` that uses `s` and `x` to see where you can use them and where
422 the ownership rules prevent you from doing so.
423
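As a sketch of the same experiment, the code below passes a `String` and an integer to two functions. The name `takes_ownership` comes from the discussion above, while `makes_copy` and the return values of both functions are assumptions added here so the effect is observable.

```rust
// Hypothetical variant: takes ownership of the `String` and reports its length.
fn takes_ownership(some_string: String) -> usize {
    some_string.len()
} // `some_string` goes out of scope here, and `drop` is called

// Hypothetical function: receives a copy of the integer.
fn makes_copy(some_integer: i32) -> i32 {
    some_integer + 1
} // `some_integer` goes out of scope here; nothing special happens

fn main() {
    let s = String::from("hello");
    let len = takes_ownership(s); // `s` moves into the function...
    // println!("{}", s);         // ...so using `s` here would not compile

    let x = 5;
    let y = makes_copy(x);        // `i32` is `Copy`, so `x` is copied, not moved
    println!("len = {}, x = {}, y = {}", len, x, y); // `x` is still valid
}
```
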
424 ### Return Values and Scope
425
426 Returning values can also transfer ownership. Listing 4-4 is an example with
427 similar annotations to those in Listing 4-3.
428
429 <span class="filename">Filename: src/main.rs</span>
430
431 ```rust
432 {{#rustdoc_include ../listings/ch04-understanding-ownership/listing-04-04/src/main.rs}}
433 ```
434
435 <span class="caption">Listing 4-4: Transferring ownership of return
436 values</span>
437
438 The ownership of a variable follows the same pattern every time: assigning a
439 value to another variable moves it. When a variable that includes data on the
440 heap goes out of scope, the value will be cleaned up by `drop` unless the data
441 has been moved to be owned by another variable.
442
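A minimal sketch of ownership moving out of and back through functions is shown below. The function names `gives_ownership` and `takes_and_gives_back`, and the string contents, are assumptions chosen for illustration.

```rust
// Hypothetical function: creates a `String` and moves it out to the caller.
fn gives_ownership() -> String {
    let some_string = String::from("yours");
    some_string // returned, so ownership moves to the calling function
}

// Hypothetical function: takes a `String` and hands the same one back.
fn takes_and_gives_back(a_string: String) -> String {
    a_string // ownership moves back out to the calling function
}

fn main() {
    let s1 = gives_ownership();         // the return value moves into `s1`
    let s2 = String::from("hello");     // `s2` comes into scope
    let s3 = takes_and_gives_back(s2);  // `s2` is moved in, the result moves into `s3`
    println!("s1 = {}, s3 = {}", s1, s3);
} // `s3` and `s1` are dropped here; `s2` was moved, so nothing happens for it
```
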
443 Taking ownership and then returning ownership with every function is a bit
444 tedious. What if we want to let a function use a value but not take ownership?
445 It’s quite annoying that anything we pass in also needs to be passed back if we
446 want to use it again, in addition to any data resulting from the body of the
447 function that we might want to return as well.
448
449 It’s possible to return multiple values using a tuple, as shown in Listing 4-5.
450
451 <span class="filename">Filename: src/main.rs</span>
452
453 ```rust
454 {{#rustdoc_include ../listings/ch04-understanding-ownership/listing-04-05/src/main.rs}}
455 ```
456
457 <span class="caption">Listing 4-5: Returning ownership of parameters</span>
458
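A sketch of this tuple-returning pattern, assuming a function named `calculate_length` (a name chosen here for illustration), might look like this:

```rust
// Hypothetical function: returns the `String` it was given along with its length.
fn calculate_length(s: String) -> (String, usize) {
    let length = s.len(); // len() returns the length of a String in bytes
    (s, length)           // hand ownership of `s` back to the caller
}

fn main() {
    let s1 = String::from("hello");
    let (s2, len) = calculate_length(s1); // `s1` is moved in; `s2` receives it back
    println!("The length of '{}' is {}.", s2, len);
}
```
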
459 But this is too much ceremony and a lot of work for a concept that should be
460 common. Luckily for us, Rust has a feature for this concept, called
461 *references*.
462
463 [data-types]: ch03-02-data-types.html#data-types
464 [derivable-traits]: appendix-03-derivable-traits.html
465 [method-syntax]: ch05-03-method-syntax.html#method-syntax
466 [paths-module-tree]: ch07-03-paths-for-referring-to-an-item-in-the-module-tree.html