src/doc/book/2018-edition/src/ch13-04-performance.md

   1 ## Comparing Performance: Loops vs. Iterators
   2
   3 To determine whether to use loops or iterators, you need to know which version
   4 of our `search` functions is faster: the version with an explicit `for` loop or
   5 the version with iterators.
   6
   7 We ran a benchmark by loading the entire contents of *The Adventures of
   8 Sherlock Holmes* by Sir Arthur Conan Doyle into a `String` and looking for the
   9 word *the* in the contents. Here are the results of the benchmark on the
  10 version of `search` using the `for` loop and the version using iterators:
  11
  12 ```text
  13 test bench_search_for  ... bench:  19,620,300 ns/iter (+/- 915,700)
  14 test bench_search_iter ... bench:  19,234,900 ns/iter (+/- 657,200)
  15 ```
  16
  17 The iterator version was slightly faster! We won’t explain the benchmark code
  18 here, because the point is not to prove that the two versions are equivalent
  19 but to get a general sense of how these two implementations compare
  20 performance-wise.
  21
  22 For a more comprehensive benchmark, you should check using various texts of
  23 various sizes as the `contents`, different words and words of different lengths
  24 as the `query`, and all kinds of other variations. The point is this:
  25 iterators, although a high-level abstraction, get compiled down to roughly the
  26 same code as if you’d written the lower-level code yourself. Iterators are one
  27 of Rust’s *zero-cost abstractions*, by which we mean using the abstraction
  28 imposes no additional runtime overhead. This is analogous to how Bjarne
  29 Stroustrup, the original designer and implementor of C++, defines
  30 *zero-overhead* in “Foundations of C++” (2012):
  31
  32 > In general, C++ implementations obey the zero-overhead principle: What you
  33 > don’t use, you don’t pay for. And further: What you do use, you couldn’t hand
  34 > code any better.
  35
  36 As another example, the following code is taken from an audio decoder. The
  37 decoding algorithm uses the linear prediction mathematical operation to
  38 estimate future values based on a linear function of the previous samples. This
  39 code uses an iterator chain to do some math on three variables in scope: a
  40 `buffer` slice of data, an array of 12 `coefficients`, and an amount by which
  41 to shift data in `qlp_shift`. We’ve declared the variables within this example
  42 but not given them any values; although this code doesn’t have much meaning
  43 outside of its context, it’s still a concise, real-world example of how Rust
  44 translates high-level ideas to low-level code.
  45
  46 ```rust,ignore
  47 let buffer: &mut [i32];
  48 let coefficients: [i64; 12];
  49 let qlp_shift: i16;
  50
  51 for i in 12..buffer.len() {
  52     let prediction = coefficients.iter()
  53                                  .zip(&buffer[i - 12..i])
  54                                  .map(|(&c, &s)| c * s as i64)
  55                                  .sum::<i64>() >> qlp_shift;
  56     let delta = buffer[i];
  57     buffer[i] = prediction as i32 + delta;
  58 }
  59 ```
  60
  61 To calculate the value of `prediction`, this code iterates through each of the
  62 12 values in `coefficients` and uses the `zip` method to pair the coefficient
  63 values with the previous 12 values in `buffer`. Then, for each pair, we
  64 multiply the values together, sum all the results, and shift the bits in the
  65 sum `qlp_shift` bits to the right.
  66
  67 Calculations in applications like audio decoders often prioritize performance
  68 most highly. Here, we’re creating an iterator, using two adaptors, and then
  69 consuming the value. What assembly code would this Rust code compile to? Well,
  70 as of this writing, it compiles down to the same assembly you’d write by hand.
  71 There’s no loop at all corresponding to the iteration over the values in
  72 `coefficients`: Rust knows that there are 12 iterations, so it “unrolls” the
  73 loop. *Unrolling* is an optimization that removes the overhead of the loop
  74 controlling code and instead generates repetitive code for each iteration of
  75 the loop.
  76
  77 All of the coefficients get stored in registers, which means accessing the
  78 values is very fast. There are no bounds checks on the array access at runtime.
  79 All these optimizations that Rust is able to apply make the resulting code
  80 extremely efficient. Now that you know this, you can use iterators and closures
  81 without fear! They make code seem like it’s higher level but don’t impose a
  82 runtime performance penalty for doing so.
  83
  84 ## Summary
  85
  86 Closures and iterators are Rust features inspired by functional programming
  87 language ideas. They contribute to Rust’s capability to clearly express
  88 high-level ideas at low-level performance. The implementations of closures and
  89 iterators are such that runtime performance is not affected. This is part of
  90 Rust’s goal to strive to provide zero-cost abstractions.
  91
  92 Now that we’ve improved the expressiveness of our I/O project, let’s look at
  93 some more features of `cargo` that will help us share the project with the
  94 world.