Commit | Line | Data |
---|---|---|
ba9703b0 XL |
1 | # From MIR to Binaries |
2 | ||
f2b60f7d FG |
3 | All of the preceding chapters of this guide have one thing in common: |
4 | we never generated any executable machine code at all! | |
5 | With this chapter, all of that changes. | |
ba9703b0 | 6 | |
f2b60f7d FG |
7 | So far, |
8 | we've shown how the compiler can take raw source code in text format | |
9 | and transform it into [MIR]. | |
10 | We have also shown how the compiler does various | |
11 | analyses on the code to detect things like type or lifetime errors. | |
12 | Now, we will finally take the MIR and produce some executable machine code. | |
ba9703b0 | 13 | |
6a06907d XL |
14 | [MIR]: ./mir/index.md |
15 | ||
f2b60f7d FG |
16 | > NOTE: This part of a compiler is often called the _backend_. |
17 | > The term is a bit overloaded because in the compiler source, | |
18 | > it usually refers to the "codegen backend" (i.e. LLVM, Cranelift, or GCC). | |
19 | > Unless stated otherwise, when you see the word "backend" in this part, | |
20 | > we are referring to the "codegen backend". | |
ba9703b0 XL |
21 | |
22 | So what do we need to do? | |
23 | ||
487cf647 | 24 | 1. First, we need to collect the set of things to generate code for. |
f2b60f7d FG |
25 | In particular, |
26 | we need to find out which concrete types to substitute for generic ones, | |
27 | since we need to generate code for the concrete types. | |
28 | Generating code for the concrete types | |
29 | (i.e. emitting a copy of the code for each concrete type) is called _monomorphization_, | |
30 | so the process of collecting all the concrete types is called _monomorphization collection_. | |
487cf647 | 31 | 2. Next, we need to actually lower the MIR to a codegen IR |
ba9703b0 | 32 | (usually LLVM IR) for each concrete type we collected. |
487cf647 | 33 | 3. Finally, we need to invoke the codegen backend, |
f2b60f7d FG |
34 | which runs a bunch of optimization passes, |
35 | generates executable code, | |
36 | and links together an executable binary. | |
ba9703b0 XL |
37 | |
38 | [codegen1]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/base/fn.codegen_crate.html | |
39 | ||
40 | The code for codegen is actually a bit complex due to a few factors: | |
41 | ||
f2b60f7d FG |
42 | - Support for multiple codegen backends (LLVM, Cranelift, and GCC). |
43 | We try to share as much backend code between them as possible, | |
44 | so a lot of it is generic over the codegen implementation. | |
45 | This means that there are often a lot of layers of abstraction. | |
ba9703b0 | 46 | - Codegen happens asynchronously in another thread for performance. |
f2b60f7d | 47 | - The actual codegen is done by a third-party library (one of the three backends). |
ba9703b0 | 48 | |
f2b60f7d FG |
49 | Generally, the [`rustc_codegen_ssa`][ssa] crate contains backend-agnostic code, |
50 | while the [`rustc_codegen_llvm`][llvm] crate contains code specific to LLVM codegen. | |
ba9703b0 XL |
51 | |
52 | [ssa]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/index.html | |
53 | [llvm]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_llvm/index.html | |
54 | ||
55 | At a very high level, the entry point is | |
f2b60f7d FG |
56 | [`rustc_codegen_ssa::base::codegen_crate`][codegen1]. |
57 | This function starts the process discussed in the rest of this chapter. |
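
To make the first step above more concrete, here is a minimal sketch of what monomorphization means from the point of view of ordinary Rust code. The function names `largest` and `call_sites` are hypothetical and exist only for this illustration; nothing here is taken from the compiler's own source, and the comments describe the idea rather than the exact behavior of the collector.

```rust
// Hypothetical generic function, used only for illustration.
// No machine code is generated for `largest` in its generic form;
// code is generated once per concrete `T` that is actually used.
fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    let mut iter = items.iter().copied();
    let first = iter.next()?; // an empty slice has no largest element
    Some(iter.fold(first, |best, x| if x > best { x } else { best }))
}

// Two call sites with different concrete types.
// Conceptually, a collection pass that starts from this function would
// find that both `largest::<i32>` and `largest::<f64>` need code,
// and each of them would then be lowered to the codegen IR separately.
fn call_sites() -> (Option<i32>, Option<f64>) {
    (largest(&[3, 7, 2]), largest(&[1.5, 0.5]))
}
```

Calling `call_sites()` exercises both instantiations. An instantiation that is never used, such as `largest::<u8>` in this sketch, would not need to be collected at all, which is why the set of concrete types has to be gathered before any code is emitted.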