src/doc/book/the-stack-and-the-heap.md

   1 % The Stack and the Heap
   2
   3 As a systems language, Rust operates at a low level. If you’re coming from a
   4 high-level language, there are some aspects of systems programming that you may
   5 not be familiar with. The most important one is how memory works, with a stack
   6 and a heap. If you’re familiar with how C-like languages use stack allocation,
   7 this chapter will be a refresher. If you’re not, you’ll learn about this more
   8 general concept, but with a Rust-y focus.
   9
  10 As with most things, when learning about them, we’ll use a simplified model to
  11 start. This lets you get a handle on the basics, without getting bogged down
  12 with details which are, for now, irrelevant. The examples we’ll use aren’t 100%
  13 accurate, but are representative for the level we’re trying to learn at right
  14 now. Once you have the basics down, learning more about how allocators are
  15 implemented, virtual memory, and other advanced topics will reveal the leaks in
  16 this particular abstraction.
  17
  18 # Memory management
  19
  20 These two terms are about memory management. The stack and the heap are
  21 abstractions that help you determine when to allocate and deallocate memory.
  22
  23 Here’s a high-level comparison:
  24
  25 The stack is very fast, and is where memory is allocated in Rust by default.
  26 But the allocation is local to a function call, and is limited in size. The
  27 heap, on the other hand, is slower, and is explicitly allocated by your
  28 program. But it’s effectively unlimited in size, and is globally accessible.
  29
  30 # The Stack
  31
  32 Let’s talk about this Rust program:
  33
  34 ```rust
  35 fn main() {
  36     let x = 42;
  37 }
  38 ```
  39
  40 This program has one variable binding, `x`. This memory needs to be allocated
  41 from somewhere. Rust ‘stack allocates’ by default, which means that basic
  42 values ‘go on the stack’. What does that mean?
  43
  44 Well, when a function gets called, some memory gets allocated for all of its
  45 local variables and some other information. This is called a ‘stack frame’, and
  46 for the purpose of this tutorial, we’re going to ignore the extra information
  47 and just consider the local variables we’re allocating. So in this case, when
  48 `main()` is run, we’ll allocate a single 32-bit integer for our stack frame.
  49 This is automatically handled for you, as you can see; we didn’t have to write
  50 any special Rust code or anything.
  51
  52 When the function exits, its stack frame gets deallocated. This happens
  53 automatically as well.
  54
  55 That’s all there is for this simple program. The key thing to understand here
  56 is that stack allocation is very, very fast. Since we know all the local
  57 variables we have ahead of time, we can grab the memory all at once. And since
  58 we’ll throw them all away at the same time as well, we can get rid of it very
  59 fast too.
  60
  61 The downside is that we can’t keep values around if we need them for longer
  62 than a single function. We also haven’t talked about what the word, ‘stack’,
  63 means. To do that, we need a slightly more complicated example:
  64
  65 ```rust
  66 fn foo() {
  67     let y = 5;
  68     let z = 100;
  69 }
  70
  71 fn main() {
  72     let x = 42;
  73
  74     foo();
  75 }
  76 ```
  77
  78 This program has three variables total: two in `foo()`, one in `main()`. Just
  79 as before, when `main()` is called, a single integer is allocated for its stack
  80 frame. But before we can show what happens when `foo()` is called, we need to
  81 visualize what’s going on with memory. Your operating system presents a view of
  82 memory to your program that’s pretty simple: a huge list of addresses, from 0
  83 to a large number, representing how much RAM your computer has. For example, if
  84 you have a gigabyte of RAM, your addresses go from `0` to `1,073,741,823`. That
  85 number comes from 2<sup>30</sup>, the number of bytes in a gigabyte. [^gigabyte]
  86
  87 [^gigabyte]: ‘Gigabyte’ can mean two things: 10^9, or 2^30. The SI standard resolved this by stating that ‘gigabyte’ is 10^9, and ‘gibibyte’ is 2^30. However, very few people use this terminology, and rely on context to differentiate. We follow in that tradition here.
  88
  89 This memory is kind of like a giant array: addresses start at zero and go
  90 up to the final number. So here’s a diagram of our first stack frame:
  91
  92 | Address | Name | Value |
  93 |---------|------|-------|
  94 | 0       | x    | 42    |
  95
  96 We’ve got `x` located at address `0`, with the value `42`.
  97
  98 When `foo()` is called, a new stack frame is allocated:
  99
 100 | Address | Name | Value |
 101 |---------|------|-------|
 102 | 2       | z    | 100   |
 103 | 1       | y    | 5     |
 104 | 0       | x    | 42    |
 105
 106 Because `0` was taken by the first frame, `1` and `2` are used for `foo()`’s
 107 stack frame. It grows upward, the more functions we call.
 108
 109
 110 There are some important things we have to take note of here. The numbers 0, 1,
 111 and 2 are all solely for illustrative purposes, and bear no relationship to the
 112 address values the computer will use in reality. In particular, the series of
 113 addresses are in reality going to be separated by some number of bytes that
 114 separate each address, and that separation may even exceed the size of the
 115 value being stored.
 116
 117 After `foo()` is over, its frame is deallocated:
 118
 119 | Address | Name | Value |
 120 |---------|------|-------|
 121 | 0       | x    | 42    |
 122
 123 And then, after `main()`, even this last value goes away. Easy!
 124
 125 It’s called a ‘stack’ because it works like a stack of dinner plates: the first
 126 plate you put down is the last plate to pick back up. Stacks are sometimes
 127 called ‘last in, first out queues’ for this reason, as the last value you put
 128 on the stack is the first one you retrieve from it.
 129
 130 Let’s try a three-deep example:
 131
 132 ```rust
 133 fn bar() {
 134     let i = 6;
 135 }
 136
 137 fn foo() {
 138     let a = 5;
 139     let b = 100;
 140     let c = 1;
 141
 142     bar();
 143 }
 144
 145 fn main() {
 146     let x = 42;
 147
 148     foo();
 149 }
 150 ```
 151
 152 Okay, first, we call `main()`:
 153
 154 | Address | Name | Value |
 155 |---------|------|-------|
 156 | 0       | x    | 42    |
 157
 158 Next up, `main()` calls `foo()`:
 159
 160 | Address | Name | Value |
 161 |---------|------|-------|
 162 | 3       | c    | 1     |
 163 | 2       | b    | 100   |
 164 | 1       | a    | 5     |
 165 | 0       | x    | 42    |
 166
 167 And then `foo()` calls `bar()`:
 168
 169 | Address | Name | Value |
 170 |---------|------|-------|
 171 | 4       | i    | 6     |
 172 | 3       | c    | 1     |
 173 | 2       | b    | 100   |
 174 | 1       | a    | 5     |
 175 | 0       | x    | 42    |
 176
 177 Whew! Our stack is growing tall.
 178
 179 After `bar()` is over, its frame is deallocated, leaving just `foo()` and
 180 `main()`:
 181
 182 | Address | Name | Value |
 183 |---------|------|-------|
 184 | 3       | c    | 1     |
 185 | 2       | b    | 100   |
 186 | 1       | a    | 5     |
 187 | 0       | x    | 42    |
 188
 189 And then `foo()` ends, leaving just `main()`:
 190
 191 | Address | Name | Value |
 192 |---------|------|-------|
 193 | 0       | x    | 42    |
 194
 195 And then we’re done. Getting the hang of it? It’s like piling up dishes: you
 196 add to the top, you take away from the top.
 197
 198 # The Heap
 199
 200 Now, this works pretty well, but not everything can work like this. Sometimes,
 201 you need to pass some memory between different functions, or keep it alive for
 202 longer than a single function’s execution. For this, we can use the heap.
 203
 204 In Rust, you can allocate memory on the heap with the [`Box<T>` type][box].
 205 Here’s an example:
 206
 207 ```rust
 208 fn main() {
 209     let x = Box::new(5);
 210     let y = 42;
 211 }
 212 ```
 213
 214 [box]: ../std/boxed/index.html
 215
 216 Here’s what happens in memory when `main()` is called:
 217
 218 | Address | Name | Value  |
 219 |---------|------|--------|
 220 | 1       | y    | 42     |
 221 | 0       | x    | ?????? |
 222
 223 We allocate space for two variables on the stack. `y` is `42`, as it always has
 224 been, but what about `x`? Well, `x` is a `Box<i32>`, and boxes allocate memory
 225 on the heap. The actual value of the box is a structure which has a pointer to
 226 ‘the heap’. When we start executing the function, and `Box::new()` is called,
 227 it allocates some memory for the heap, and puts `5` there. The memory now looks
 228 like this:
 229
 230 | Address              | Name | Value                  |
 231 |----------------------|------|------------------------|
 232 | (2<sup>30</sup>) - 1 |      | 5                      |
 233 | ...                  | ...  | ...                    |
 234 | 1                    | y    | 42                     |
 235 | 0                    | x    | → (2<sup>30</sup>) - 1 |
 236
 237 We have (2<sup>30</sup>) - 1 addresses in our hypothetical computer with 1GB of RAM. And since
 238 our stack grows from zero, the easiest place to allocate memory is from the
 239 other end. So our first value is at the highest place in memory. And the value
 240 of the struct at `x` has a [raw pointer][rawpointer] to the place we’ve
 241 allocated on the heap, so the value of `x` is (2<sup>30</sup>) - 1, the memory
 242 location we’ve asked for.
 243
 244 [rawpointer]: raw-pointers.html
 245
 246 We haven’t really talked too much about what it actually means to allocate and
 247 deallocate memory in these contexts. Getting into very deep detail is out of
 248 the scope of this tutorial, but what’s important to point out here is that
 249 the heap isn’t just a stack that grows from the opposite end. We’ll have an
 250 example of this later in the book, but because the heap can be allocated and
 251 freed in any order, it can end up with ‘holes’. Here’s a diagram of the memory
 252 layout of a program which has been running for a while now:
 253
 254
 255 | Address              | Name | Value                  |
 256 |----------------------|------|------------------------|
 257 | (2<sup>30</sup>) - 1 |      | 5                      |
 258 | (2<sup>30</sup>) - 2 |      |                        |
 259 | (2<sup>30</sup>) - 3 |      |                        |
 260 | (2<sup>30</sup>) - 4 |      | 42                     |
 261 | ...                  | ...  | ...                    |
 262 | 3                    | y    | → (2<sup>30</sup>) - 4 |
 263 | 2                    | y    | 42                     |
 264 | 1                    | y    | 42                     |
 265 | 0                    | x    | → (2<sup>30</sup>) - 1 |
 266
 267 In this case, we’ve allocated four things on the heap, but deallocated two of
 268 them. There’s a gap between (2<sup>30</sup>) - 1 and (2<sup>30</sup>) - 4 which isn’t
 269 currently being used. The specific details of how and why this happens depends
 270 on what kind of strategy you use to manage the heap. Different programs can use
 271 different ‘memory allocators’, which are libraries that manage this for you.
 272 Rust programs use [jemalloc][jemalloc] for this purpose.
 273
 274 [jemalloc]: http://www.canonware.com/jemalloc/
 275
 276 Anyway, back to our example. Since this memory is on the heap, it can stay
 277 alive longer than the function which allocates the box. In this case, however,
 278 it doesn’t.[^moving] When the function is over, we need to free the stack frame
 279 for `main()`. `Box<T>`, though, has a trick up its sleeve: [Drop][drop]. The
 280 implementation of `Drop` for `Box` deallocates the memory that was allocated
 281 when it was created. Great! So when `x` goes away, it first frees the memory
 282 allocated on the heap:
 283
 284 | Address | Name | Value  |
 285 |---------|------|--------|
 286 | 1       | y    | 42     |
 287 | 0       | x    | ?????? |
 288
 289 [drop]: drop.html
 290 [^moving]: We can make the memory live longer by transferring ownership,
 291            sometimes called ‘moving out of the box’. More complex examples will
 292            be covered later.
 293
 294
 295 And then the stack frame goes away, freeing all of our memory.
 296
 297 # Arguments and borrowing
 298
 299 We’ve got some basic examples with the stack and the heap going, but what about
 300 function arguments and borrowing? Here’s a small Rust program:
 301
 302 ```rust
 303 fn foo(i: &i32) {
 304     let z = 42;
 305 }
 306
 307 fn main() {
 308     let x = 5;
 309     let y = &x;
 310
 311     foo(y);
 312 }
 313 ```
 314
 315 When we enter `main()`, memory looks like this:
 316
 317 | Address | Name | Value  |
 318 |---------|------|--------|
 319 | 1       | y    | → 0    |
 320 | 0       | x    | 5      |
 321
 322 `x` is a plain old `5`, and `y` is a reference to `x`. So its value is the
 323 memory location that `x` lives at, which in this case is `0`.
 324
 325 What about when we call `foo()`, passing `y` as an argument?
 326
 327 | Address | Name | Value  |
 328 |---------|------|--------|
 329 | 3       | z    | 42     |
 330 | 2       | i    | → 0    |
 331 | 1       | y    | → 0    |
 332 | 0       | x    | 5      |
 333
 334 Stack frames aren’t just for local bindings, they’re for arguments too. So in
 335 this case, we need to have both `i`, our argument, and `z`, our local variable
 336 binding. `i` is a copy of the argument, `y`. Since `y`’s value is `0`, so is
 337 `i`’s.
 338
 339 This is one reason why borrowing a variable doesn’t deallocate any memory: the
 340 value of a reference is just a pointer to a memory location. If we got rid of
 341 the underlying memory, things wouldn’t work very well.
 342
 343 # A complex example
 344
 345 Okay, let’s go through this complex program step-by-step:
 346
 347 ```rust
 348 fn foo(x: &i32) {
 349     let y = 10;
 350     let z = &y;
 351
 352     baz(z);
 353     bar(x, z);
 354 }
 355
 356 fn bar(a: &i32, b: &i32) {
 357     let c = 5;
 358     let d = Box::new(5);
 359     let e = &d;
 360
 361     baz(e);
 362 }
 363
 364 fn baz(f: &i32) {
 365     let g = 100;
 366 }
 367
 368 fn main() {
 369     let h = 3;
 370     let i = Box::new(20);
 371     let j = &h;
 372
 373     foo(j);
 374 }
 375 ```
 376
 377 First, we call `main()`:
 378
 379 | Address              | Name | Value                  |
 380 |----------------------|------|------------------------|
 381 | (2<sup>30</sup>) - 1 |      | 20                     |
 382 | ...                  | ...  | ...                    |
 383 | 2                    | j    | → 0                    |
 384 | 1                    | i    | → (2<sup>30</sup>) - 1 |
 385 | 0                    | h    | 3                      |
 386
 387 We allocate memory for `j`, `i`, and `h`. `i` is on the heap, and so has a
 388 value pointing there.
 389
 390 Next, at the end of `main()`, `foo()` gets called:
 391
 392 | Address              | Name | Value                  |
 393 |----------------------|------|------------------------|
 394 | (2<sup>30</sup>) - 1 |      | 20                     |
 395 | ...                  | ...  | ...                    |
 396 | 5                    | z    | → 4                    |
 397 | 4                    | y    | 10                     |
 398 | 3                    | x    | → 0                    |
 399 | 2                    | j    | → 0                    |
 400 | 1                    | i    | → (2<sup>30</sup>) - 1 |
 401 | 0                    | h    | 3                      |
 402
 403 Space gets allocated for `x`, `y`, and `z`. The argument `x` has the same value
 404 as `j`, since that’s what we passed it in. It’s a pointer to the `0` address,
 405 since `j` points at `h`.
 406
 407 Next, `foo()` calls `baz()`, passing `z`:
 408
 409 | Address              | Name | Value                  |
 410 |----------------------|------|------------------------|
 411 | (2<sup>30</sup>) - 1 |      | 20                     |
 412 | ...                  | ...  | ...                    |
 413 | 7                    | g    | 100                    |
 414 | 6                    | f    | → 4                    |
 415 | 5                    | z    | → 4                    |
 416 | 4                    | y    | 10                     |
 417 | 3                    | x    | → 0                    |
 418 | 2                    | j    | → 0                    |
 419 | 1                    | i    | → (2<sup>30</sup>) - 1 |
 420 | 0                    | h    | 3                      |
 421
 422 We’ve allocated memory for `f` and `g`. `baz()` is very short, so when it’s
 423 over, we get rid of its stack frame:
 424
 425 | Address              | Name | Value                  |
 426 |----------------------|------|------------------------|
 427 | (2<sup>30</sup>) - 1 |      | 20                     |
 428 | ...                  | ...  | ...                    |
 429 | 5                    | z    | → 4                    |
 430 | 4                    | y    | 10                     |
 431 | 3                    | x    | → 0                    |
 432 | 2                    | j    | → 0                    |
 433 | 1                    | i    | → (2<sup>30</sup>) - 1 |
 434 | 0                    | h    | 3                      |
 435
 436 Next, `foo()` calls `bar()` with `x` and `z`:
 437
 438 | Address              | Name | Value                  |
 439 |----------------------|------|------------------------|
 440 | (2<sup>30</sup>) - 1 |      | 20                     |
 441 | (2<sup>30</sup>) - 2 |      | 5                      |
 442 | ...                  | ...  | ...                    |
 443 | 10                   | e    | → 9                    |
 444 | 9                    | d    | → (2<sup>30</sup>) - 2 |
 445 | 8                    | c    | 5                      |
 446 | 7                    | b    | → 4                    |
 447 | 6                    | a    | → 0                    |
 448 | 5                    | z    | → 4                    |
 449 | 4                    | y    | 10                     |
 450 | 3                    | x    | → 0                    |
 451 | 2                    | j    | → 0                    |
 452 | 1                    | i    | → (2<sup>30</sup>) - 1 |
 453 | 0                    | h    | 3                      |
 454
 455 We end up allocating another value on the heap, and so we have to subtract one
 456 from (2<sup>30</sup>) - 1. It’s easier to just write that than `1,073,741,822`. In any
 457 case, we set up the variables as usual.
 458
 459 At the end of `bar()`, it calls `baz()`:
 460
 461 | Address              | Name | Value                  |
 462 |----------------------|------|------------------------|
 463 | (2<sup>30</sup>) - 1 |      | 20                     |
 464 | (2<sup>30</sup>) - 2 |      | 5                      |
 465 | ...                  | ...  | ...                    |
 466 | 12                   | g    | 100                    |
 467 | 11                   | f    | → (2<sup>30</sup>) - 2 |
 468 | 10                   | e    | → 9                    |
 469 | 9                    | d    | → (2<sup>30</sup>) - 2 |
 470 | 8                    | c    | 5                      |
 471 | 7                    | b    | → 4                    |
 472 | 6                    | a    | → 0                    |
 473 | 5                    | z    | → 4                    |
 474 | 4                    | y    | 10                     |
 475 | 3                    | x    | → 0                    |
 476 | 2                    | j    | → 0                    |
 477 | 1                    | i    | → (2<sup>30</sup>) - 1 |
 478 | 0                    | h    | 3                      |
 479
 480 With this, we’re at our deepest point! Whew! Congrats for following along this
 481 far.
 482
 483 After `baz()` is over, we get rid of `f` and `g`:
 484
 485 | Address              | Name | Value                  |
 486 |----------------------|------|------------------------|
 487 | (2<sup>30</sup>) - 1 |      | 20                     |
 488 | (2<sup>30</sup>) - 2 |      | 5                      |
 489 | ...                  | ...  | ...                    |
 490 | 10                   | e    | → 9                    |
 491 | 9                    | d    | → (2<sup>30</sup>) - 2 |
 492 | 8                    | c    | 5                      |
 493 | 7                    | b    | → 4                    |
 494 | 6                    | a    | → 0                    |
 495 | 5                    | z    | → 4                    |
 496 | 4                    | y    | 10                     |
 497 | 3                    | x    | → 0                    |
 498 | 2                    | j    | → 0                    |
 499 | 1                    | i    | → (2<sup>30</sup>) - 1 |
 500 | 0                    | h    | 3                      |
 501
 502 Next, we return from `bar()`. `d` in this case is a `Box<T>`, so it also frees
 503 what it points to: (2<sup>30</sup>) - 2.
 504
 505 | Address              | Name | Value                  |
 506 |----------------------|------|------------------------|
 507 | (2<sup>30</sup>) - 1 |      | 20                     |
 508 | ...                  | ...  | ...                    |
 509 | 5                    | z    | → 4                    |
 510 | 4                    | y    | 10                     |
 511 | 3                    | x    | → 0                    |
 512 | 2                    | j    | → 0                    |
 513 | 1                    | i    | → (2<sup>30</sup>) - 1 |
 514 | 0                    | h    | 3                      |
 515
 516 And after that, `foo()` returns:
 517
 518 | Address              | Name | Value                  |
 519 |----------------------|------|------------------------|
 520 | (2<sup>30</sup>) - 1 |      | 20                     |
 521 | ...                  | ...  | ...                    |
 522 | 2                    | j    | → 0                    |
 523 | 1                    | i    | → (2<sup>30</sup>) - 1 |
 524 | 0                    | h    | 3                      |
 525
 526 And then, finally, `main()`, which cleans the rest up. When `i` is `Drop`ped,
 527 it will clean up the last of the heap too.
 528
 529 # What do other languages do?
 530
 531 Most languages with a garbage collector heap-allocate by default. This means
 532 that every value is boxed. There are a number of reasons why this is done, but
 533 they’re out of scope for this tutorial. There are some possible optimizations
 534 that don’t make it true 100% of the time, too. Rather than relying on the stack
 535 and `Drop` to clean up memory, the garbage collector deals with the heap
 536 instead.
 537
 538 # Which to use?
 539
 540 So if the stack is faster and easier to manage, why do we need the heap? A big
 541 reason is that Stack-allocation alone means you only have LIFO semantics for
 542 reclaiming storage. Heap-allocation is strictly more general, allowing storage
 543 to be taken from and returned to the pool in arbitrary order, but at a
 544 complexity cost.
 545
 546 Generally, you should prefer stack allocation, and so, Rust stack-allocates by
 547 default. The LIFO model of the stack is simpler, at a fundamental level. This
 548 has two big impacts: runtime efficiency and semantic impact.
 549
 550 ## Runtime Efficiency
 551
 552 Managing the memory for the stack is trivial: The machine just
 553 increments or decrements a single value, the so-called “stack pointer”.
 554 Managing memory for the heap is non-trivial: heap-allocated memory is freed at
 555 arbitrary points, and each block of heap-allocated memory can be of arbitrary
 556 size, the memory manager must generally work much harder to identify memory for
 557 reuse.
 558
 559 If you’d like to dive into this topic in greater detail, [this paper][wilson]
 560 is a great introduction.
 561
 562 [wilson]: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.143.4688
 563
 564 ## Semantic impact
 565
 566 Stack-allocation impacts the Rust language itself, and thus the developer’s
 567 mental model. The LIFO semantics is what drives how the Rust language handles
 568 automatic memory management. Even the deallocation of a uniquely-owned
 569 heap-allocated box can be driven by the stack-based LIFO semantics, as
 570 discussed throughout this chapter. The flexibility (i.e. expressiveness) of non
 571 LIFO-semantics means that in general the compiler cannot automatically infer at
 572 compile-time where memory should be freed; it has to rely on dynamic protocols,
 573 potentially from outside the language itself, to drive deallocation (reference
 574 counting, as used by `Rc<T>` and `Arc<T>`, is one example of this).
 575
 576 When taken to the extreme, the increased expressive power of heap allocation
 577 comes at the cost of either significant runtime support (e.g. in the form of a
 578 garbage collector) or significant programmer effort (in the form of explicit
 579 memory management calls that require verification not provided by the Rust
 580 compiler).