Commit | Line | Data |
---|---|---|
ba9703b0 XL |
1 | # Overview of the Compiler |
2 | ||
6a06907d XL |
3 | <!-- toc --> |
4 | ||
5 | This chapter is about the overall process of compiling a program -- how | |
6 | everything fits together. | |
7 | ||
8 | The Rust compiler is special in two ways: it does things to your code that | |
9 | other compilers don't do (e.g. borrow checking) and it has a lot of | |
10 | unconventional implementation choices (e.g. queries). We will talk about these | |
11 | in turn in this chapter, and in the rest of the guide, we will look at all the | |
12 | individual pieces in more detail. | |
13 | ||
14 | ## What the compiler does to your code | |
15 | ||
16 | So first, let's look at what the compiler does to your code. For now, we will | |
17 | avoid mentioning how the compiler implements these steps except as needed; | |
18 | we'll talk about that later. | |
19 | ||
20 | - The compile process begins when a user writes a Rust source program in text | |
21 | and invokes the `rustc` compiler on it. The work that the compiler needs to | |
22 | perform is defined by command-line options. For example, it is possible to | |
23 | enable nightly features (`-Z` flags), perform `check`-only builds, or emit | |
24 | LLVM-IR rather than executable machine code. The `rustc` executable call may | |
25 | be indirect through the use of `cargo`. | |
26 | - Command line argument parsing occurs in the [`rustc_driver`]. This crate | |
27 | defines the compile configuration that is requested by the user and passes it | |
28 | to the rest of the compilation process as a [`rustc_interface::Config`]. | |
29 | - The raw Rust source text is analyzed by a low-level lexer located in | |
30 | [`rustc_lexer`]. At this stage, the source text is turned into a stream of | |
31 | atomic source code units known as _tokens_. The lexer supports the | |
32 | Unicode character encoding. | |
33 | - The token stream passes through a higher-level lexer located in | |
34 | [`rustc_parse`] to prepare for the next stage of the compile process. The | |
35 | [`StringReader`] struct is used at this stage to perform a set of validations | |
36 | and turn strings into interned symbols (_interning_ is discussed later). | |
37 | [String interning] is a way of storing only one immutable | |
38 | copy of each distinct string value. | |
39 | ||
40 | - The lexer has a small interface and doesn't depend directly on the | |
41 | diagnostic infrastructure in `rustc`. Instead it provides diagnostics as plain | |
42 | data which are emitted in `rustc_parse::lexer::mod` as real diagnostics. | |
43 | - The lexer preserves full fidelity information for both IDEs and proc macros. | |
44 | - The parser [translates the token stream from the lexer into an Abstract Syntax | |
45 | Tree (AST)][parser]. It uses a recursive descent (top-down) approach to syntax | |
46 | analysis. The crate entry points for the parser are the `Parser::parse_crate_mod()` and | |
47 | `Parser::parse_mod()` methods found in `rustc_parse::parser::item`. The external | |
48 | module parsing entry point is `rustc_expand::module::parse_external_mod`. And | |
49 | the macro parser entry point is [`Parser::parse_nonterminal()`][parse_nonterminal]. | |
50 | - Parsing is performed with a set of `Parser` utility methods including `fn bump`, | |
51 | `fn check`, `fn eat`, `fn expect`, `fn look_ahead`. | |
52 | - Parsing is organized by the semantic construct that is being parsed. Separate | |
53 | `parse_*` methods can be found in `rustc_parse` `parser` directory. The source | |
54 | file name follows the construct name. For example, the following files are found | |
55 | in the parser: | |
56 | - `expr.rs` | |
57 | - `pat.rs` | |
58 | - `ty.rs` | |
59 | - `stmt.rs` | |
60 | - This naming scheme is used across many compiler stages. You will find | |
61 | either a file or directory with the same name across the parsing, lowering, | |
62 | type checking, THIR lowering, and MIR building sources. | |
63 | - Macro expansion, AST validation, name resolution, and early linting take place | |
64 | during this stage of the compile process. | |
65 | - The parser uses the standard `DiagnosticBuilder` API for error handling, but we | |
66 | try to recover, parsing a superset of Rust's grammar, while also emitting an error. | |
67 | - `rustc_ast::ast::{Crate, Mod, Expr, Pat, ...}` AST nodes are returned from the parser. | |
68 | - We then take the AST and [convert it to High-Level Intermediate | |
69 | Representation (HIR)][hir]. This is a compiler-friendly representation of the | |
70 | AST. This involves a lot of desugaring of things like loops and `async fn`. | |
136023e0 XL |
71 | - We use the HIR to do [type inference] (the process of automatic |
72 | detection of the type of an expression), [trait solving] (the process | |
73 | of pairing up an impl with each reference to a trait), and [type | |
74 | checking] (the process of converting the types found in the HIR | |
75 | (`hir::Ty`), which represent the syntactic things that the user wrote, | |
76 | into the internal representation used by the compiler (`Ty<'tcx>`), | |
77 | and using that information to verify the type safety, correctness and | |
78 | coherence of the types used in the program). | |
6a06907d XL |
79 | - The HIR is then [lowered to Mid-Level Intermediate Representation (MIR)][mir]. |
80 | - Along the way, we construct the THIR, which is an even more desugared HIR. | |
81 | THIR is used for pattern and exhaustiveness checking. It is also more | |
82 | convenient to convert into MIR than HIR is. | |
83 | - The MIR is used for [borrow checking]. | |
84 | - We (want to) do [many optimizations on the MIR][mir-opt] because it is still | |
85 | generic and that improves the code we generate later, improving compilation | |
86 | speed too. | |
87 | - MIR is a higher level (and generic) representation, so it is easier to do | |
88 | some optimizations at MIR level than at LLVM-IR level. For example LLVM | |
89 | doesn't seem to be able to optimize the pattern the [`simplify_try`] MIR | |
90 | optimization looks for. | |
91 | - Rust code is _monomorphized_, which means making copies of all the generic | |
92 | code with the type parameters replaced by concrete types. To do | |
93 | this, we need to collect a list of what concrete types to generate code for. | |
94 | This is called _monomorphization collection_. | |
95 | - We then begin what is vaguely called _code generation_ or _codegen_. | |
96 | - The [code generation stage (codegen)][codegen] is when higher level | |
97 | representations of source are turned into an executable binary. `rustc` | |
98 | uses LLVM for code generation. The first step is to convert the MIR | |
99 | to LLVM Intermediate Representation (LLVM IR). This is where the MIR | |
100 | is actually monomorphized, according to the list we created in the | |
101 | previous step. | |
102 | - The LLVM IR is passed to LLVM, which does a lot more optimizations on it. | |
103 | It then emits machine code, which is basically assembly code with additional | |
104 | low-level types and annotations added (e.g. an ELF object or wasm). | |
105 | - The different libraries/binaries are linked together to produce the final | |
106 | binary. | |
107 | ||
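To make the monomorphization step above more concrete, here is a minimal sketch of what "making copies of all the generic code" means. The function names `describe_u32` and `describe_str` are hypothetical stand-ins chosen for illustration; the real compiler generates mangled symbol names and works on MIR rather than on source text.

```rust
use std::fmt::Display;

// What the user writes: a single generic function.
fn describe<T: Display>(value: T) -> String {
    format!("value = {value}")
}

// Roughly what exists after monomorphization, assuming the program only
// ever calls `describe::<u32>` and `describe::<&str>`: one copy per
// concrete type. These names are hypothetical.
fn describe_u32(value: u32) -> String {
    format!("value = {value}")
}

fn describe_str(value: &str) -> String {
    format!("value = {value}")
}
```

Calling `describe(7u32)` behaves like calling `describe_u32(7)`. Deciding which copies are needed (here, `u32` and `&str`) corresponds to the monomorphization collection step described above.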
108 | [String interning]: https://en.wikipedia.org/wiki/String_interning | |
109 | [`rustc_lexer`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_lexer/index.html | |
110 | [`rustc_driver`]: https://rustc-dev-guide.rust-lang.org/rustc-driver.html | |
111 | [`rustc_interface::Config`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_interface/interface/struct.Config.html | |
112 | [lex]: https://rustc-dev-guide.rust-lang.org/the-parser.html | |
113 | [`StringReader`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/lexer/struct.StringReader.html | |
114 | [`rustc_parse`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/index.html | |
115 | [parser]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/index.html | |
116 | [hir]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/index.html | |
117 | [type inference]: https://rustc-dev-guide.rust-lang.org/type-inference.html | |
136023e0 XL |
118 | [trait solving]: https://rustc-dev-guide.rust-lang.org/traits/resolution.html |
119 | [type checking]: https://rustc-dev-guide.rust-lang.org/type-checking.html | |
6a06907d XL |
120 | [mir]: https://rustc-dev-guide.rust-lang.org/mir/index.html |
121 | [borrow checking]: https://rustc-dev-guide.rust-lang.org/borrow_check.html | |
122 | [mir-opt]: https://rustc-dev-guide.rust-lang.org/mir/optimizations.html | |
123 | [`simplify_try`]: https://github.com/rust-lang/rust/pull/66282 | |
124 | [codegen]: https://rustc-dev-guide.rust-lang.org/backend/codegen.html | |
125 | [parse_nonterminal]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html#method.parse_nonterminal | |
126 | ||
127 | ## How it does it | |
128 | ||
129 | Ok, so now that we have a high-level view of what the compiler does to your | |
130 | code, let's take a high-level view of _how_ it does all that stuff. There are a | |
131 | lot of constraints and conflicting goals that the compiler needs to | |
132 | satisfy/optimize for. For example, | |
133 | ||
134 | - Compilation speed: how fast is it to compile a program? More/better | |
135 | compile-time analyses often mean compilation is slower. | |
136 | - Also, we want to support incremental compilation, so we need to take that | |
137 | into account. How can we keep track of what work needs to be redone and | |
138 | what can be reused if the user modifies their program? | |
139 | - Also we can't store too much stuff in the incremental cache because | |
140 | it would take a long time to load from disk and it could take a lot | |
141 | of space on the user's system... | |
142 | - Compiler memory usage: while compiling a program, we don't want to use more | |
143 | memory than we need. | |
144 | - Program speed: how fast is your compiled program? More/better compile-time | |
145 | analyses often mean the compiler can do better optimizations. | |
146 | - Program size: how large is the compiled binary? Similar to the previous | |
147 | point. | |
148 | - Compiler compilation speed: how long does it take to compile the compiler? | |
149 | This impacts contributors and compiler maintenance. | |
150 | - Implementation complexity: building a compiler is one of the hardest | |
151 | things a person/group can do, and Rust is not a very simple language, so how | |
152 | do we make the compiler's code base manageable? | |
153 | - Compiler correctness: the binaries produced by the compiler should do what | |
154 | the input programs say they do, and should continue to do so despite the | |
155 | tremendous amount of change constantly going on. | |
156 | - Integration: a number of other tools need to use the compiler in | |
157 | various ways (e.g. cargo, clippy, miri, RLS) that must be supported. | |
158 | - Compiler stability: the compiler should not crash or fail ungracefully on the | |
159 | stable channel. | |
160 | - Rust stability: the compiler must respect Rust's stability guarantees by not | |
161 | breaking programs that previously compiled despite the many changes that are | |
162 | always going on to its implementation. | |
163 | - Limitations of other tools: rustc uses LLVM in its backend, and LLVM has some | |
164 | strengths we leverage and some limitations/weaknesses we need to work around. | |
165 | ||
166 | So, as you read through the rest of the guide, keep these things in mind. They | |
167 | will often inform decisions that we make. | |
168 | ||
169 | ### Intermediate representations | |
170 | ||
171 | As with most compilers, `rustc` uses some intermediate representations (IRs) to | |
172 | facilitate computations. In general, working directly with the source code is | |
173 | extremely inconvenient and error-prone. Source code is designed to be human-friendly while at | |
174 | the same time being unambiguous, but it's less convenient for doing something | |
175 | like, say, type checking. | |
176 | ||
177 | Instead most compilers, including `rustc`, build some sort of IR out of the | |
178 | source code which is easier to analyze. `rustc` has a few IRs, each optimized | |
179 | for different purposes: | |
180 | ||
181 | - Token stream: the lexer produces a stream of tokens directly from the source | |
182 | code. This stream of tokens is easier for the parser to deal with than raw | |
183 | text. | |
184 | - Abstract Syntax Tree (AST): the abstract syntax tree is built from the stream | |
185 | of tokens produced by the lexer. It represents | |
186 | pretty much exactly what the user wrote. It helps to do some syntactic sanity | |
187 | checking (e.g. checking that a type is expected where the user wrote one). | |
188 | - High-level IR (HIR): This is a sort of desugared AST. It's still close | |
189 | to what the user wrote syntactically, but it includes some implicit things | |
190 | such as some elided lifetimes, etc. This IR is amenable to type checking. | |
191 | - Typed HIR (THIR): This is an intermediate between HIR and MIR, and used to be called | |
192 | High-level Abstract IR (HAIR). It is like the HIR but it is fully typed and a bit | |
193 | more desugared (e.g. method calls and implicit dereferences are made fully explicit). | |
194 | Moreover, it is easier to lower to MIR from THIR than from HIR. | |
195 | - Middle-level IR (MIR): This IR is basically a Control-Flow Graph (CFG). A CFG | |
196 | is a type of diagram that shows the basic blocks of a program and how control | |
197 | flow can go between them. Likewise, MIR also has a bunch of basic blocks with | |
198 | simple typed statements inside them (e.g. assignment, simple computations, | |
199 | etc) and control flow edges to other basic blocks (e.g., calls, dropping | |
200 | values). MIR is used for borrow checking and other | |
201 | important dataflow-based checks, such as checking for uninitialized values. | |
202 | It is also used for a series of optimizations and for constant evaluation (via | |
203 | Miri). Because MIR is still generic, we can do a lot of analyses here more | |
204 | efficiently than after monomorphization. | |
205 | - LLVM IR: This is the standard form of all input to the LLVM compiler. LLVM IR | |
206 | is a sort of typed assembly language with lots of annotations. It's | |
207 | a standard format that is used by all compilers that use LLVM (e.g. the clang | |
208 | C compiler also outputs LLVM IR). LLVM IR is designed to be easy for other | |
209 | compilers to emit and also rich enough for LLVM to run a bunch of | |
210 | optimizations on it. | |
211 | ||
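As an illustration of the first IR in the list above, the following toy lexer turns source text into a stream of tokens. It is only a sketch: the `Token` enum and the `lex` function are invented for this example and are far simpler than what `rustc_lexer` actually does.

```rust
// A toy token type; hypothetical, not the one used by `rustc_lexer`.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Ident(String),
    Number(u64),
    Plus,
    Eq,
    Semi,
    Unknown(char),
}

// Turn raw source text into a flat stream of tokens.
fn lex(src: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            // Whitespace only separates tokens in this sketch.
            chars.next();
        } else if c.is_ascii_digit() {
            // Consume a run of decimal digits as one number token.
            let mut n: u64 = 0;
            while let Some(&d) = chars.peek() {
                match d.to_digit(10) {
                    Some(v) => {
                        n = n.saturating_mul(10).saturating_add(u64::from(v));
                        chars.next();
                    }
                    None => break,
                }
            }
            tokens.push(Token::Number(n));
        } else if c.is_alphabetic() || c == '_' {
            // Consume an identifier.
            let mut name = String::new();
            while let Some(&d) = chars.peek() {
                if !(d.is_alphanumeric() || d == '_') {
                    break;
                }
                name.push(d);
                chars.next();
            }
            tokens.push(Token::Ident(name));
        } else {
            // Single-character punctuation.
            chars.next();
            tokens.push(match c {
                '+' => Token::Plus,
                '=' => Token::Eq,
                ';' => Token::Semi,
                other => Token::Unknown(other),
            });
        }
    }
    tokens
}
```

A parser working on `lex("x = 1 + 23;")` no longer has to think about whitespace or digit runs; it only sees `Ident`, `Eq`, `Number`, `Plus`, `Number`, `Semi`.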
212 | One other thing to note is that many values in the compiler are _interned_. | |
213 | This is a performance and memory optimization in which we allocate the values | |
214 | in a special allocator called an _arena_. Then, we pass around references to | |
215 | the values allocated in the arena. This allows us to make sure that identical | |
216 | values (e.g. types in your program) are only allocated once and can be compared | |
217 | cheaply by comparing pointers. Many of the intermediate representations are | |
218 | interned. | |
219 | ||
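The idea behind interning can be sketched in a few lines. Note the assumptions: this toy `Interner` hands out small integer handles (`Symbol`) instead of arena references, and it stores strings in a `Vec` rather than a real arena, so it is not how `rustc` implements interning. It only shows the property that matters: identical values are stored once and can be compared cheaply.

```rust
use std::collections::HashMap;

// A cheap, copyable handle to an interned string (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Symbol(u32);

#[derive(Default)]
struct Interner {
    // Maps each distinct string to its handle.
    map: HashMap<String, Symbol>,
    // Maps each handle back to its string.
    strings: Vec<String>,
}

impl Interner {
    // Return the existing handle if `s` was seen before,
    // otherwise store it and hand out a fresh handle.
    fn intern(&mut self, s: &str) -> Symbol {
        if let Some(&sym) = self.map.get(s) {
            return sym;
        }
        let sym = Symbol(self.strings.len() as u32);
        self.strings.push(s.to_owned());
        self.map.insert(s.to_owned(), sym);
        sym
    }

    // Look up the string behind a handle.
    fn resolve(&self, sym: Symbol) -> &str {
        &self.strings[sym.0 as usize]
    }
}
```

Interning `"foo"` twice yields the same `Symbol`, so equality checks become a single integer comparison instead of a string comparison. This plays the same role as the pointer comparison described above.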
220 | ### Queries | |
221 | ||
222 | The first big implementation choice is the _query_ system. The Rust compiler | |
223 | uses a query system which is unlike most textbook compilers, which are | |
224 | organized as a series of passes over the code that execute sequentially. The | |
225 | compiler does this to make incremental compilation possible -- that is, if the | |
226 | user makes a change to their program and recompiles, we want to do as little | |
227 | redundant work as possible to produce the new binary. | |
228 | ||
229 | In `rustc`, all the major steps above are organized as a bunch of queries that | |
230 | call each other. For example, there is a query to ask for the type of something | |
231 | and another to ask for the optimized MIR of a function. These | |
232 | queries can call each other and are all tracked through the query system. | |
233 | The results of the queries are cached on disk so that we can tell which | |
234 | queries' results changed from the last compilation and only redo those. This is | |
235 | how incremental compilation works. | |
236 | ||
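The following is a minimal sketch of the idea of queries that call each other and cache their results. Everything here is a simplifying assumption: `ToyCtxt` is a hypothetical stand-in for the real context, the "MIR" is just a string, the cache lives only in memory (the real system can also persist results to disk), and the method names merely echo the kind of queries mentioned above.

```rust
use std::cell::{Cell, RefCell};
use std::collections::HashMap;

// Hypothetical context holding the inputs and one cache per query.
struct ToyCtxt {
    source: HashMap<u32, String>,
    built: RefCell<HashMap<u32, String>>,
    optimized: RefCell<HashMap<u32, String>>,
    // Counts how many times a query body actually ran.
    executions: Cell<u32>,
}

impl ToyCtxt {
    fn new(source: HashMap<u32, String>) -> Self {
        ToyCtxt {
            source,
            built: RefCell::new(HashMap::new()),
            optimized: RefCell::new(HashMap::new()),
            executions: Cell::new(0),
        }
    }

    // Query 1: build "MIR" for an item (panics on an unknown id).
    fn mir_built(&self, id: u32) -> String {
        let cached = self.built.borrow().get(&id).cloned();
        if let Some(hit) = cached {
            return hit;
        }
        self.executions.set(self.executions.get() + 1);
        let result = format!("mir({})", self.source[&id]);
        self.built.borrow_mut().insert(id, result.clone());
        result
    }

    // Query 2: depends on query 1 by calling it.
    fn optimized_mir(&self, id: u32) -> String {
        let cached = self.optimized.borrow().get(&id).cloned();
        if let Some(hit) = cached {
            return hit;
        }
        self.executions.set(self.executions.get() + 1);
        let result = format!("opt({})", self.mir_built(id));
        self.optimized.borrow_mut().insert(id, result.clone());
        result
    }
}
```

Asking for `optimized_mir` of an item runs both query bodies once; asking again runs neither, because both results are served from the cache. Deciding which cached results are still valid after the user edits their program is the part this sketch leaves out.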
237 | In principle, for the query-fied steps, we do each of the above for each item | |
238 | individually. For example, we will take the HIR for a function and use queries | |
239 | to ask for the LLVM IR for that HIR. This drives the generation of optimized | |
240 | MIR, which drives the borrow checker, which drives the generation of MIR, and | |
241 | so on. | |
242 | ||
243 | ... except that this is very over-simplified. In fact, some queries are not | |
244 | cached on disk, and some parts of the compiler have to run for all code anyway | |
245 | for correctness even if the code is dead code (e.g. the borrow checker). For | |
246 | example, [currently the `mir_borrowck` query is first executed on all functions | |
247 | of a crate.][passes] Then the codegen backend invokes the | |
248 | `collect_and_partition_mono_items` query, which first recursively requests the | |
249 | `optimized_mir` for all reachable functions, which in turn runs `mir_borrowck` | |
250 | for that function and then creates codegen units. This kind of split will need | |
251 | to remain to ensure that unreachable functions still have their errors emitted. | |
252 | ||
253 | [passes]: https://github.com/rust-lang/rust/blob/45ebd5808afd3df7ba842797c0fcd4447ddf30fb/src/librustc_interface/passes.rs#L824 | |
254 | ||
255 | Moreover, the compiler wasn't originally built to use a query system; the query | |
256 | system has been retrofitted into the compiler, so parts of it are not query-fied | |
257 | yet. Also, LLVM isn't our code, so that isn't query-fied either. The plan is to | |
258 | eventually query-fy all of the steps listed in the previous section, | |
259 | but as of <!-- date: 2021-02 --> February 2021, only the steps between HIR and | |
260 | LLVM IR are query-fied. That is, lexing, parsing, name resolution, and macro | |
261 | expansion are done all at once for the whole program. | |
262 | ||
263 | One other thing to mention here is the all-important "typing context", | |
264 | [`TyCtxt`], which is a giant struct that is at the center of all things. | |
265 | (Note that the name is mostly historic. This is _not_ a "typing context" in the | |
266 | sense of `Γ` or `Δ` from type theory. The name is retained because that's what | |
267 | the name of the struct is in the source code.) All | |
268 | queries are defined as methods on the [`TyCtxt`] type, and the in-memory query | |
269 | cache is stored there too. In the code, there is usually a variable called | |
270 | `tcx` which is a handle on the typing context. You will also see lifetimes with | |
271 | the name `'tcx`, which means that something is tied to the lifetime of the | |
272 | `TyCtxt` (usually it is stored or interned there). | |
273 | ||
274 | [`TyCtxt`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.TyCtxt.html | |
275 | ||
276 | ### `ty::Ty` | |
277 | ||
278 | Types are really important in Rust, and they form the core of a lot of compiler | |
279 | analyses. The main type (in the compiler) that represents types (in the user's | |
280 | program) is [`rustc_middle::ty::Ty`][ty]. This is so important that we have a whole chapter | |
281 | on [`ty::Ty`][ty], but for now, we just want to mention that it exists and is the way | |
282 | `rustc` represents types! | |
283 | ||
284 | Also note that the `rustc_middle::ty` module defines the `TyCtxt` struct we mentioned before. | |
285 | ||
286 | [ty]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/type.Ty.html | |
287 | ||
288 | ### Parallelism | |
289 | ||
290 | Compiler performance is a problem that we would like to improve on | |
291 | (and are always working on). One aspect of that is parallelizing | |
292 | `rustc` itself. | |
293 | ||
294 | Currently, there is only one part of rustc that is already parallel: codegen. | |
295 | During monomorphization, the compiler will split up all the code to be | |
296 | generated into smaller chunks called _codegen units_. These are then generated | |
297 | by independent instances of LLVM. Since they are independent, we can run them | |
298 | in parallel. At the end, the linker is run to combine all the codegen units | |
299 | together into one binary. | |
300 | ||
301 | However, the rest of the compiler is still not yet parallel. There have been | |
302 | lots of efforts spent on this, but it is generally a hard problem. The current | |
303 | approach is to turn `RefCell`s into `Mutex`s -- that is, we | |
304 | switch to thread-safe internal mutability. However, there are ongoing | |
305 | challenges with lock contention, maintaining query-system invariants under | |
306 | concurrency, and the complexity of the code base. One can try out the current | |
307 | work by enabling parallel compilation in `config.toml`. It's still early days, | |
308 | but there are already some promising performance improvements. | |
309 | ||
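To show what switching from `RefCell` to `Mutex` buys, here is a small sketch of several threads filling one shared cache. The function `fill_cache_in_parallel` and the squaring "work" are made up for this example; it is not taken from the compiler's sources.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

// Compute one value per key on its own thread and record the results
// in a cache shared between all threads.
fn fill_cache_in_parallel(keys: Vec<u32>) -> HashMap<u32, u64> {
    // `RefCell<HashMap<..>>` could not be shared across threads;
    // `Mutex<HashMap<..>>` can, at the cost of taking a lock.
    let cache: Arc<Mutex<HashMap<u32, u64>>> = Arc::new(Mutex::new(HashMap::new()));

    let handles: Vec<_> = keys
        .into_iter()
        .map(|key| {
            let cache = Arc::clone(&cache);
            thread::spawn(move || {
                // Stand-in for real work done outside the lock.
                let value = u64::from(key) * u64::from(key);
                // Hold the lock only for the insertion itself.
                cache.lock().unwrap().insert(key, value);
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Copy the finished cache out from behind the lock.
    let guard = cache.lock().unwrap();
    (*guard).clone()
}
```

Every insertion has to take the same lock, which is a small-scale version of the lock contention mentioned above: the more often threads touch shared state, the less they actually run in parallel.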
310 | ### Bootstrapping | |
311 | ||
312 | `rustc` itself is written in Rust. So how do we compile the compiler? We use an | |
313 | older compiler to compile the newer compiler. This is called [_bootstrapping_]. | |
314 | ||
315 | Bootstrapping has a lot of interesting implications. For example, it means | |
316 | that one of the major users of Rust is the Rust compiler, so we are | |
317 | constantly testing our own software ("eating our own dogfood"). | |
318 | ||
319 | For more details on bootstrapping, see | |
320 | [the bootstrapping section of the guide][rustc-bootstrap]. | |
321 | ||
322 | [_bootstrapping_]: https://en.wikipedia.org/wiki/Bootstrapping_(compilers) | |
323 | [rustc-bootstrap]: building/bootstrapping.md | |
324 | ||
325 | # Unresolved Questions | |
326 | ||
327 | - Does LLVM ever do optimizations in debug builds? | |
328 | - How do I explore phases of the compile process in my own sources (lexer, | |
329 | parser, HIR, etc)? - e.g., `cargo rustc -- -Z unpretty=hir-tree` allows you to | |
330 | view HIR representation | |
331 | - What is the main source entry point for `X`? | |
332 | - Where do phases diverge for cross-compilation to machine code across | |
333 | different platforms? | |
334 | ||
335 | # References | |
336 | ||
337 | - Command line parsing | |
338 | - Guide: [The Rustc Driver and Interface](https://rustc-dev-guide.rust-lang.org/rustc-driver.html) | |
339 | - Driver definition: [`rustc_driver`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_driver/) | |
340 | - Main entry point: [`rustc_session::config::build_session_options`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_session/config/fn.build_session_options.html) | |
341 | - Lexical Analysis: Lex the user program to a stream of tokens | |
342 | - Guide: [Lexing and Parsing](https://rustc-dev-guide.rust-lang.org/the-parser.html) | |
343 | - Lexer definition: [`rustc_lexer`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_lexer/index.html) | |
344 | - Main entry point: [`rustc_lexer::first_token`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_lexer/fn.first_token.html) | |
345 | - Parsing: Parse the stream of tokens to an Abstract Syntax Tree (AST) | |
346 | - Guide: [Lexing and Parsing](https://rustc-dev-guide.rust-lang.org/the-parser.html) | |
347 | - Parser definition: [`rustc_parse`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/index.html) | |
348 | - Main entry points: | |
349 | - [Entry point for first file in crate](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_interface/passes/fn.parse.html) | |
350 | - [Entry point for outline module parsing](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/module/fn.parse_external_mod.html) | |
351 | - [Entry point for macro fragments][parse_nonterminal] | |
352 | - AST definition: [`rustc_ast`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/ast/index.html) | |
353 | - Expansion: **TODO** | |
354 | - Name Resolution: **TODO** | |
355 | - Feature gating: **TODO** | |
356 | - Early linting: **TODO** | |
357 | - The High Level Intermediate Representation (HIR) | |
358 | - Guide: [The HIR](https://rustc-dev-guide.rust-lang.org/hir.html) | |
359 | - Guide: [Identifiers in the HIR](https://rustc-dev-guide.rust-lang.org/hir.html#identifiers-in-the-hir) | |
360 | - Guide: [The HIR Map](https://rustc-dev-guide.rust-lang.org/hir.html#the-hir-map) | |
361 | - Guide: [Lowering AST to HIR](https://rustc-dev-guide.rust-lang.org/lowering.html) | |
362 | - How to view HIR representation for your code: `cargo rustc -- -Z unpretty=hir-tree` | |
363 | - Rustc HIR definition: [`rustc_hir`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/index.html) | |
364 | - Main entry point: **TODO** | |
365 | - Late linting: **TODO** | |
366 | - Type Inference | |
367 | - Guide: [Type Inference](https://rustc-dev-guide.rust-lang.org/type-inference.html) | |
368 | - Guide: [The ty Module: Representing Types](https://rustc-dev-guide.rust-lang.org/ty.html) (semantics) | |
369 | - Main entry point (type inference): [`InferCtxtBuilder::enter`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_infer/infer/struct.InferCtxtBuilder.html#method.enter) | |
370 | - Main entry point (type checking bodies): [the `typeck` query](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.TyCtxt.html#method.typeck) | |
371 | - These two functions can't be decoupled. | |
372 | - The Mid Level Intermediate Representation (MIR) | |
373 | - Guide: [The MIR (Mid level IR)](https://rustc-dev-guide.rust-lang.org/mir/index.html) | |
374 | - Definition: [`rustc_middle/src/mir`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/index.html) | |
375 | - Definition of source that manipulates the MIR: [`rustc_mir`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/index.html) | |
376 | - The Borrow Checker | |
377 | - Guide: [MIR Borrow Check](https://rustc-dev-guide.rust-lang.org/borrow_check.html) | |
378 | - Definition: [`rustc_mir/borrow_check`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/index.html) | |
379 | - Main entry point: [`mir_borrowck` query](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/borrow_check/fn.mir_borrowck.html) | |
380 | - MIR Optimizations | |
381 | - Guide: [MIR Optimizations](https://rustc-dev-guide.rust-lang.org/mir/optimizations.html) | |
382 | - Definition: [`rustc_mir/transform`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/index.html) | |
383 | - Main entry point: [`optimized_mir` query](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/fn.optimized_mir.html) | |
384 | - Code Generation | |
385 | - Guide: [Code Generation](https://rustc-dev-guide.rust-lang.org/backend/codegen.html) | |
386 | - Generating Machine Code from LLVM IR with LLVM - **TODO: reference?** | |
387 | - Main entry point: [`rustc_codegen_ssa::base::codegen_crate`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/base/fn.codegen_crate.html) | |
388 | - This monomorphizes and produces LLVM IR for one codegen unit. It then | |
389 | starts a background thread to run LLVM, which must be joined later. | |
390 | - Monomorphization happens lazily via [`FunctionCx::monomorphize`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/mir/struct.FunctionCx.html#method.monomorphize) and [`rustc_codegen_ssa::base::codegen_instance`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/base/fn.codegen_instance.html) |