src/doc/book/the-stack-and-the-heap.md

   1 % The Stack and the Heap
   2
   3 As a systems language, Rust operates at a low level. If you’re coming from a
   4 high-level language, there are some aspects of systems programming that you may
   5 not be familiar with. The most important one is how memory works, with a stack
   6 and a heap. If you’re familiar with how C-like languages use stack allocation,
   7 this chapter will be a refresher. If you’re not, you’ll learn about this more
   8 general concept, but with a Rust-y focus.
   9
  10 As with most things, when learning about them, we’ll use a simplified model to
  11 start. This lets you get a handle on the basics, without getting bogged down
  12 with details which are, for now, irrelevant. The examples we’ll use aren’t 100%
  13 accurate, but are representative for the level we’re trying to learn at right
  14 now. Once you have the basics down, learning more about how allocators are
  15 implemented, virtual memory, and other advanced topics will reveal the leaks in
  16 this particular abstraction.
  17
  18 # Memory management
  19
  20 These two terms are about memory management. The stack and the heap are
  21 abstractions that help you determine when to allocate and deallocate memory.
  22
  23 Here’s a high-level comparison:
  24
  25 The stack is very fast, and is where memory is allocated in Rust by default.
  26 But the allocation is local to a function call, and is limited in size. The
  27 heap, on the other hand, is slower, and is explicitly allocated by your
  28 program. But it’s effectively unlimited in size, and is globally accessible.
  29 Note this meaning of heap, which allocates arbitrary-sized blocks of memory in arbitrary
  30 order, is quite different from the heap data structure.
  31
  32 # The Stack
  33
  34 Let’s talk about this Rust program:
  35
  36 ```rust
  37 fn main() {
  38     let x = 42;
  39 }
  40 ```
  41
  42 This program has one variable binding, `x`. This memory needs to be allocated
  43 from somewhere. Rust ‘stack allocates’ by default, which means that basic
  44 values ‘go on the stack’. What does that mean?
  45
  46 Well, when a function gets called, some memory gets allocated for all of its
  47 local variables and some other information. This is called a ‘stack frame’, and
  48 for the purpose of this tutorial, we’re going to ignore the extra information
  49 and only consider the local variables we’re allocating. So in this case, when
  50 `main()` is run, we’ll allocate a single 32-bit integer for our stack frame.
  51 This is automatically handled for you, as you can see; we didn’t have to write
  52 any special Rust code or anything.
  53
  54 When the function exits, its stack frame gets deallocated. This happens
  55 automatically as well.
  56
  57 That’s all there is for this simple program. The key thing to understand here
  58 is that stack allocation is very, very fast. Since we know all the local
  59 variables we have ahead of time, we can grab the memory all at once. And since
  60 we’ll throw them all away at the same time as well, we can get rid of it very
  61 fast too.
  62
  63 The downside is that we can’t keep values around if we need them for longer
  64 than a single function. We also haven’t talked about what the word, ‘stack’,
  65 means. To do that, we need a slightly more complicated example:
  66
  67 ```rust
  68 fn foo() {
  69     let y = 5;
  70     let z = 100;
  71 }
  72
  73 fn main() {
  74     let x = 42;
  75
  76     foo();
  77 }
  78 ```
  79
  80 This program has three variables total: two in `foo()`, one in `main()`. Just
  81 as before, when `main()` is called, a single integer is allocated for its stack
  82 frame. But before we can show what happens when `foo()` is called, we need to
  83 visualize what’s going on with memory. Your operating system presents a view of
  84 memory to your program that’s pretty simple: a huge list of addresses, from 0
  85 to a large number, representing how much RAM your computer has. For example, if
  86 you have a gigabyte of RAM, your addresses go from `0` to `1,073,741,823`. That
  87 number comes from 2<sup>30</sup>, the number of bytes in a gigabyte. [^gigabyte]
  88
  89 [^gigabyte]: ‘Gigabyte’ can mean two things: 10^9, or 2^30. The SI standard resolved this by stating that ‘gigabyte’ is 10^9, and ‘gibibyte’ is 2^30. However, very few people use this terminology, and rely on context to differentiate. We follow in that tradition here.
  90
  91 This memory is kind of like a giant array: addresses start at zero and go
  92 up to the final number. So here’s a diagram of our first stack frame:
  93
  94 | Address | Name | Value |
  95 |---------|------|-------|
  96 | 0       | x    | 42    |
  97
  98 We’ve got `x` located at address `0`, with the value `42`.
  99
 100 When `foo()` is called, a new stack frame is allocated:
 101
 102 | Address | Name | Value |
 103 |---------|------|-------|
 104 | 2       | z    | 100   |
 105 | 1       | y    | 5     |
 106 | 0       | x    | 42    |
 107
 108 Because `0` was taken by the first frame, `1` and `2` are used for `foo()`’s
 109 stack frame. It grows upward, the more functions we call.
 110
 111
 112 There are some important things we have to take note of here. The numbers 0, 1,
 113 and 2 are all solely for illustrative purposes, and bear no relationship to the
 114 address values the computer will use in reality. In particular, the series of
 115 addresses are in reality going to be separated by some number of bytes that
 116 separate each address, and that separation may even exceed the size of the
 117 value being stored.
 118
 119 After `foo()` is over, its frame is deallocated:
 120
 121 | Address | Name | Value |
 122 |---------|------|-------|
 123 | 0       | x    | 42    |
 124
 125 And then, after `main()`, even this last value goes away. Easy!
 126
 127 It’s called a ‘stack’ because it works like a stack of dinner plates: the first
 128 plate you put down is the last plate to pick back up. Stacks are sometimes
 129 called ‘last in, first out queues’ for this reason, as the last value you put
 130 on the stack is the first one you retrieve from it.
 131
 132 Let’s try a three-deep example:
 133
 134 ```rust
 135 fn italic() {
 136     let i = 6;
 137 }
 138
 139 fn bold() {
 140     let a = 5;
 141     let b = 100;
 142     let c = 1;
 143
 144     italic();
 145 }
 146
 147 fn main() {
 148     let x = 42;
 149
 150     bold();
 151 }
 152 ```
 153
 154 We have some kooky function names to make the diagrams clearer.
 155
 156 Okay, first, we call `main()`:
 157
 158 | Address | Name | Value |
 159 |---------|------|-------|
 160 | 0       | x    | 42    |
 161
 162 Next up, `main()` calls `bold()`:
 163
 164 | Address | Name | Value |
 165 |---------|------|-------|
 166 | **3**   | **c**|**1**  |
 167 | **2**   | **b**|**100**|
 168 | **1**   | **a**| **5** |
 169 | 0       | x    | 42    |
 170
 171 And then `bold()` calls `italic()`:
 172
 173 | Address | Name | Value |
 174 |---------|------|-------|
 175 | *4*     | *i*  | *6*   |
 176 | **3**   | **c**|**1**  |
 177 | **2**   | **b**|**100**|
 178 | **1**   | **a**| **5** |
 179 | 0       | x    | 42    |
 180
 181 Whew! Our stack is growing tall.
 182
 183 After `italic()` is over, its frame is deallocated, leaving only `bold()` and
 184 `main()`:
 185
 186 | Address | Name | Value |
 187 |---------|------|-------|
 188 | **3**   | **c**|**1**  |
 189 | **2**   | **b**|**100**|
 190 | **1**   | **a**| **5** |
 191 | 0       | x    | 42    |
 192
 193 And then `bold()` ends, leaving only `main()`:
 194
 195 | Address | Name | Value |
 196 |---------|------|-------|
 197 | 0       | x    | 42    |
 198
 199 And then we’re done. Getting the hang of it? It’s like piling up dishes: you
 200 add to the top, you take away from the top.
 201
 202 # The Heap
 203
 204 Now, this works pretty well, but not everything can work like this. Sometimes,
 205 you need to pass some memory between different functions, or keep it alive for
 206 longer than a single function’s execution. For this, we can use the heap.
 207
 208 In Rust, you can allocate memory on the heap with the [`Box<T>` type][box].
 209 Here’s an example:
 210
 211 ```rust
 212 fn main() {
 213     let x = Box::new(5);
 214     let y = 42;
 215 }
 216 ```
 217
 218 [box]: ../std/boxed/index.html
 219
 220 Here’s what happens in memory when `main()` is called:
 221
 222 | Address | Name | Value  |
 223 |---------|------|--------|
 224 | 1       | y    | 42     |
 225 | 0       | x    | ?????? |
 226
 227 We allocate space for two variables on the stack. `y` is `42`, as it always has
 228 been, but what about `x`? Well, `x` is a `Box<i32>`, and boxes allocate memory
 229 on the heap. The actual value of the box is a structure which has a pointer to
 230 ‘the heap’. When we start executing the function, and `Box::new()` is called,
 231 it allocates some memory for the heap, and puts `5` there. The memory now looks
 232 like this:
 233
 234 | Address              | Name | Value                  |
 235 |----------------------|------|------------------------|
 236 | (2<sup>30</sup>) - 1 |      | 5                      |
 237 | ...                  | ...  | ...                    |
 238 | 1                    | y    | 42                     |
 239 | 0                    | x    | → (2<sup>30</sup>) - 1 |
 240
 241 We have (2<sup>30</sup>) - 1 addresses in our hypothetical computer with 1GB of RAM. And since
 242 our stack grows from zero, the easiest place to allocate memory is from the
 243 other end. So our first value is at the highest place in memory. And the value
 244 of the struct at `x` has a [raw pointer][rawpointer] to the place we’ve
 245 allocated on the heap, so the value of `x` is (2<sup>30</sup>) - 1, the memory
 246 location we’ve asked for.
 247
 248 [rawpointer]: raw-pointers.html
 249
 250 We haven’t really talked too much about what it actually means to allocate and
 251 deallocate memory in these contexts. Getting into very deep detail is out of
 252 the scope of this tutorial, but what’s important to point out here is that
 253 the heap isn’t a stack that grows from the opposite end. We’ll have an
 254 example of this later in the book, but because the heap can be allocated and
 255 freed in any order, it can end up with ‘holes’. Here’s a diagram of the memory
 256 layout of a program which has been running for a while now:
 257
 258
 259 | Address              | Name | Value                  |
 260 |----------------------|------|------------------------|
 261 | (2<sup>30</sup>) - 1 |      | 5                      |
 262 | (2<sup>30</sup>) - 2 |      |                        |
 263 | (2<sup>30</sup>) - 3 |      |                        |
 264 | (2<sup>30</sup>) - 4 |      | 42                     |
 265 | ...                  | ...  | ...                    |
 266 | 2                    | z    | → (2<sup>30</sup>) - 4 |
 267 | 1                    | y    | 42                     |
 268 | 0                    | x    | → (2<sup>30</sup>) - 1 |
 269
 270 In this case, we’ve allocated four things on the heap, but deallocated two of
 271 them. There’s a gap between (2<sup>30</sup>) - 1 and (2<sup>30</sup>) - 4 which isn’t
 272 currently being used. The specific details of how and why this happens depends
 273 on what kind of strategy you use to manage the heap. Different programs can use
 274 different ‘memory allocators’, which are libraries that manage this for you.
 275 Rust programs use [jemalloc][jemalloc] for this purpose.
 276
 277 [jemalloc]: http://www.canonware.com/jemalloc/
 278
 279 Anyway, back to our example. Since this memory is on the heap, it can stay
 280 alive longer than the function which allocates the box. In this case, however,
 281 it doesn’t.[^moving] When the function is over, we need to free the stack frame
 282 for `main()`. `Box<T>`, though, has a trick up its sleeve: [Drop][drop]. The
 283 implementation of `Drop` for `Box` deallocates the memory that was allocated
 284 when it was created. Great! So when `x` goes away, it first frees the memory
 285 allocated on the heap:
 286
 287 | Address | Name | Value  |
 288 |---------|------|--------|
 289 | 1       | y    | 42     |
 290 | 0       | x    | ?????? |
 291
 292 [drop]: drop.html
 293 [^moving]: We can make the memory live longer by transferring ownership,
 294            sometimes called ‘moving out of the box’. More complex examples will
 295            be covered later.
 296
 297
 298 And then the stack frame goes away, freeing all of our memory.
 299
 300 # Arguments and borrowing
 301
 302 We’ve got some basic examples with the stack and the heap going, but what about
 303 function arguments and borrowing? Here’s a small Rust program:
 304
 305 ```rust
 306 fn foo(i: &i32) {
 307     let z = 42;
 308 }
 309
 310 fn main() {
 311     let x = 5;
 312     let y = &x;
 313
 314     foo(y);
 315 }
 316 ```
 317
 318 When we enter `main()`, memory looks like this:
 319
 320 | Address | Name | Value  |
 321 |---------|------|--------|
 322 | 1       | y    | → 0    |
 323 | 0       | x    | 5      |
 324
 325 `x` is a plain old `5`, and `y` is a reference to `x`. So its value is the
 326 memory location that `x` lives at, which in this case is `0`.
 327
 328 What about when we call `foo()`, passing `y` as an argument?
 329
 330 | Address | Name | Value  |
 331 |---------|------|--------|
 332 | 3       | z    | 42     |
 333 | 2       | i    | → 0    |
 334 | 1       | y    | → 0    |
 335 | 0       | x    | 5      |
 336
 337 Stack frames aren’t only for local bindings, they’re for arguments too. So in
 338 this case, we need to have both `i`, our argument, and `z`, our local variable
 339 binding. `i` is a copy of the argument, `y`. Since `y`’s value is `0`, so is
 340 `i`’s.
 341
 342 This is one reason why borrowing a variable doesn’t deallocate any memory: the
 343 value of a reference is a pointer to a memory location. If we got rid of
 344 the underlying memory, things wouldn’t work very well.
 345
 346 # A complex example
 347
 348 Okay, let’s go through this complex program step-by-step:
 349
 350 ```rust
 351 fn foo(x: &i32) {
 352     let y = 10;
 353     let z = &y;
 354
 355     baz(z);
 356     bar(x, z);
 357 }
 358
 359 fn bar(a: &i32, b: &i32) {
 360     let c = 5;
 361     let d = Box::new(5);
 362     let e = &d;
 363
 364     baz(e);
 365 }
 366
 367 fn baz(f: &i32) {
 368     let g = 100;
 369 }
 370
 371 fn main() {
 372     let h = 3;
 373     let i = Box::new(20);
 374     let j = &h;
 375
 376     foo(j);
 377 }
 378 ```
 379
 380 First, we call `main()`:
 381
 382 | Address              | Name | Value                  |
 383 |----------------------|------|------------------------|
 384 | (2<sup>30</sup>) - 1 |      | 20                     |
 385 | ...                  | ...  | ...                    |
 386 | 2                    | j    | → 0                    |
 387 | 1                    | i    | → (2<sup>30</sup>) - 1 |
 388 | 0                    | h    | 3                      |
 389
 390 We allocate memory for `j`, `i`, and `h`. `i` is on the heap, and so has a
 391 value pointing there.
 392
 393 Next, at the end of `main()`, `foo()` gets called:
 394
 395 | Address              | Name | Value                  |
 396 |----------------------|------|------------------------|
 397 | (2<sup>30</sup>) - 1 |      | 20                     |
 398 | ...                  | ...  | ...                    |
 399 | 5                    | z    | → 4                    |
 400 | 4                    | y    | 10                     |
 401 | 3                    | x    | → 0                    |
 402 | 2                    | j    | → 0                    |
 403 | 1                    | i    | → (2<sup>30</sup>) - 1 |
 404 | 0                    | h    | 3                      |
 405
 406 Space gets allocated for `x`, `y`, and `z`. The argument `x` has the same value
 407 as `j`, since that’s what we passed it in. It’s a pointer to the `0` address,
 408 since `j` points at `h`.
 409
 410 Next, `foo()` calls `baz()`, passing `z`:
 411
 412 | Address              | Name | Value                  |
 413 |----------------------|------|------------------------|
 414 | (2<sup>30</sup>) - 1 |      | 20                     |
 415 | ...                  | ...  | ...                    |
 416 | 7                    | g    | 100                    |
 417 | 6                    | f    | → 4                    |
 418 | 5                    | z    | → 4                    |
 419 | 4                    | y    | 10                     |
 420 | 3                    | x    | → 0                    |
 421 | 2                    | j    | → 0                    |
 422 | 1                    | i    | → (2<sup>30</sup>) - 1 |
 423 | 0                    | h    | 3                      |
 424
 425 We’ve allocated memory for `f` and `g`. `baz()` is very short, so when it’s
 426 over, we get rid of its stack frame:
 427
 428 | Address              | Name | Value                  |
 429 |----------------------|------|------------------------|
 430 | (2<sup>30</sup>) - 1 |      | 20                     |
 431 | ...                  | ...  | ...                    |
 432 | 5                    | z    | → 4                    |
 433 | 4                    | y    | 10                     |
 434 | 3                    | x    | → 0                    |
 435 | 2                    | j    | → 0                    |
 436 | 1                    | i    | → (2<sup>30</sup>) - 1 |
 437 | 0                    | h    | 3                      |
 438
 439 Next, `foo()` calls `bar()` with `x` and `z`:
 440
 441 | Address              | Name | Value                  |
 442 |----------------------|------|------------------------|
 443 | (2<sup>30</sup>) - 1 |      | 20                     |
 444 | (2<sup>30</sup>) - 2 |      | 5                      |
 445 | ...                  | ...  | ...                    |
 446 | 10                   | e    | → 9                    |
 447 | 9                    | d    | → (2<sup>30</sup>) - 2 |
 448 | 8                    | c    | 5                      |
 449 | 7                    | b    | → 4                    |
 450 | 6                    | a    | → 0                    |
 451 | 5                    | z    | → 4                    |
 452 | 4                    | y    | 10                     |
 453 | 3                    | x    | → 0                    |
 454 | 2                    | j    | → 0                    |
 455 | 1                    | i    | → (2<sup>30</sup>) - 1 |
 456 | 0                    | h    | 3                      |
 457
 458 We end up allocating another value on the heap, and so we have to subtract one
 459 from (2<sup>30</sup>) - 1. It’s easier to write that than `1,073,741,822`. In any
 460 case, we set up the variables as usual.
 461
 462 At the end of `bar()`, it calls `baz()`:
 463
 464 | Address              | Name | Value                  |
 465 |----------------------|------|------------------------|
 466 | (2<sup>30</sup>) - 1 |      | 20                     |
 467 | (2<sup>30</sup>) - 2 |      | 5                      |
 468 | ...                  | ...  | ...                    |
 469 | 12                   | g    | 100                    |
 470 | 11                   | f    | → (2<sup>30</sup>) - 2 |
 471 | 10                   | e    | → 9                    |
 472 | 9                    | d    | → (2<sup>30</sup>) - 2 |
 473 | 8                    | c    | 5                      |
 474 | 7                    | b    | → 4                    |
 475 | 6                    | a    | → 0                    |
 476 | 5                    | z    | → 4                    |
 477 | 4                    | y    | 10                     |
 478 | 3                    | x    | → 0                    |
 479 | 2                    | j    | → 0                    |
 480 | 1                    | i    | → (2<sup>30</sup>) - 1 |
 481 | 0                    | h    | 3                      |
 482
 483 With this, we’re at our deepest point! Whew! Congrats for following along this
 484 far.
 485
 486 After `baz()` is over, we get rid of `f` and `g`:
 487
 488 | Address              | Name | Value                  |
 489 |----------------------|------|------------------------|
 490 | (2<sup>30</sup>) - 1 |      | 20                     |
 491 | (2<sup>30</sup>) - 2 |      | 5                      |
 492 | ...                  | ...  | ...                    |
 493 | 10                   | e    | → 9                    |
 494 | 9                    | d    | → (2<sup>30</sup>) - 2 |
 495 | 8                    | c    | 5                      |
 496 | 7                    | b    | → 4                    |
 497 | 6                    | a    | → 0                    |
 498 | 5                    | z    | → 4                    |
 499 | 4                    | y    | 10                     |
 500 | 3                    | x    | → 0                    |
 501 | 2                    | j    | → 0                    |
 502 | 1                    | i    | → (2<sup>30</sup>) - 1 |
 503 | 0                    | h    | 3                      |
 504
 505 Next, we return from `bar()`. `d` in this case is a `Box<T>`, so it also frees
 506 what it points to: (2<sup>30</sup>) - 2.
 507
 508 | Address              | Name | Value                  |
 509 |----------------------|------|------------------------|
 510 | (2<sup>30</sup>) - 1 |      | 20                     |
 511 | ...                  | ...  | ...                    |
 512 | 5                    | z    | → 4                    |
 513 | 4                    | y    | 10                     |
 514 | 3                    | x    | → 0                    |
 515 | 2                    | j    | → 0                    |
 516 | 1                    | i    | → (2<sup>30</sup>) - 1 |
 517 | 0                    | h    | 3                      |
 518
 519 And after that, `foo()` returns:
 520
 521 | Address              | Name | Value                  |
 522 |----------------------|------|------------------------|
 523 | (2<sup>30</sup>) - 1 |      | 20                     |
 524 | ...                  | ...  | ...                    |
 525 | 2                    | j    | → 0                    |
 526 | 1                    | i    | → (2<sup>30</sup>) - 1 |
 527 | 0                    | h    | 3                      |
 528
 529 And then, finally, `main()`, which cleans the rest up. When `i` is `Drop`ped,
 530 it will clean up the last of the heap too.
 531
 532 # What do other languages do?
 533
 534 Most languages with a garbage collector heap-allocate by default. This means
 535 that every value is boxed. There are a number of reasons why this is done, but
 536 they’re out of scope for this tutorial. There are some possible optimizations
 537 that don’t make it true 100% of the time, too. Rather than relying on the stack
 538 and `Drop` to clean up memory, the garbage collector deals with the heap
 539 instead.
 540
 541 # Which to use?
 542
 543 So if the stack is faster and easier to manage, why do we need the heap? A big
 544 reason is that Stack-allocation alone means you only have 'Last In First Out (LIFO)' semantics for
 545 reclaiming storage. Heap-allocation is strictly more general, allowing storage
 546 to be taken from and returned to the pool in arbitrary order, but at a
 547 complexity cost.
 548
 549 Generally, you should prefer stack allocation, and so, Rust stack-allocates by
 550 default. The LIFO model of the stack is simpler, at a fundamental level. This
 551 has two big impacts: runtime efficiency and semantic impact.
 552
 553 ## Runtime Efficiency
 554
 555 Managing the memory for the stack is trivial: The machine
 556 increments or decrements a single value, the so-called “stack pointer”.
 557 Managing memory for the heap is non-trivial: heap-allocated memory is freed at
 558 arbitrary points, and each block of heap-allocated memory can be of arbitrary
 559 size, so the memory manager must generally work much harder to
 560 identify memory for reuse.
 561
 562 If you’d like to dive into this topic in greater detail, [this paper][wilson]
 563 is a great introduction.
 564
 565 [wilson]: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.143.4688
 566
 567 ## Semantic impact
 568
 569 Stack-allocation impacts the Rust language itself, and thus the developer’s
 570 mental model. The LIFO semantics is what drives how the Rust language handles
 571 automatic memory management. Even the deallocation of a uniquely-owned
 572 heap-allocated box can be driven by the stack-based LIFO semantics, as
 573 discussed throughout this chapter. The flexibility (i.e. expressiveness) of non
 574 LIFO-semantics means that in general the compiler cannot automatically infer at
 575 compile-time where memory should be freed; it has to rely on dynamic protocols,
 576 potentially from outside the language itself, to drive deallocation (reference
 577 counting, as used by `Rc<T>` and `Arc<T>`, is one example of this).
 578
 579 When taken to the extreme, the increased expressive power of heap allocation
 580 comes at the cost of either significant runtime support (e.g. in the form of a
 581 garbage collector) or significant programmer effort (in the form of explicit
 582 memory management calls that require verification not provided by the Rust
 583 compiler).