Commit | Line | Data |
---|---|---|
ba9703b0 XL |
1 | # Overview of the Compiler |
2 | ||
6a06907d XL |
3 | <!-- toc --> |
4 | ||
5 | This chapter is about the overall process of compiling a program -- how | |
6 | everything fits together. | |
7 | ||
5099ac24 | 8 | The Rust compiler is special in two ways: it does things to your code that |
6a06907d XL |
9 | other compilers don't do (e.g. borrow checking) and it has a lot of |
10 | unconventional implementation choices (e.g. queries). We will talk about these | |
11 | in turn in this chapter, and in the rest of the guide, we will look at all the | |
12 | individual pieces in more detail. | |
13 | ||
14 | ## What the compiler does to your code | |
15 | ||
16 | So first, let's look at what the compiler does to your code. For now, we will | |
17 | avoid mentioning how the compiler implements these steps except as needed; | |
18 | we'll talk about that later. | |
19 | ||
20 | - The compile process begins when a user writes a Rust source program in text | |
21 | and invokes the `rustc` compiler on it. The work that the compiler needs to | |
22 | perform is defined by command-line options. For example, it is possible to | |
23 | enable nightly features (`-Z` flags), perform `check`-only builds, or emit | |
24 | LLVM-IR rather than executable machine code. The `rustc` executable call may | |
25 | be indirect through the use of `cargo`. | |
26 | - Command line argument parsing occurs in the [`rustc_driver`]. This crate | |
27 | defines the compile configuration that is requested by the user and passes it | |
28 | to the rest of the compilation process as a [`rustc_interface::Config`]. | |
29 | - The raw Rust source text is analyzed by a low-level lexer located in | |
30 | [`rustc_lexer`]. At this stage, the source text is turned into a stream of | |
31 | atomic source code units known as _tokens_. The lexer supports the | |
32 | Unicode character encoding. | |
33 | - The token stream passes through a higher-level lexer located in | |
34 | [`rustc_parse`] to prepare for the next stage of the compile process. The | |
35 | [`StringReader`] struct is used at this stage to perform a set of validations | |
36 | and turn strings into interned symbols (_interning_ is discussed later). | |
37 | [String interning] is a way of storing only one immutable | |
38 | copy of each distinct string value. | |
39 | ||
40 | - The lexer has a small interface and doesn't depend directly on the | |
41 | diagnostic infrastructure in `rustc`. Instead it provides diagnostics as plain | |
42 | data which are emitted in `rustc_parse::lexer::mod` as real diagnostics. | |
43 | - The lexer preserves full fidelity information for both IDEs and proc macros. | |
44 | - The parser [translates the token stream from the lexer into an Abstract Syntax | |
45 | Tree (AST)][parser]. It uses a recursive descent (top-down) approach to syntax | |
c295e0f8 XL |
46 | analysis. The crate entry points for the parser are the |
47 | [`Parser::parse_crate_mod()`][parse_crate_mod] and [`Parser::parse_mod()`][parse_mod] | |
48 | methods found in [`rustc_parse::parser::Parser`]. The external module parsing | |
49 | entry point is [`rustc_expand::module::parse_external_mod`][parse_external_mod]. | |
50 | And the macro parser entry point is [`Parser::parse_nonterminal()`][parse_nonterminal]. | |
6a06907d XL |
51 | - Parsing is performed with a set of `Parser` utility methods including `fn bump`, |
52 | `fn check`, `fn eat`, `fn expect`, `fn look_ahead`. | |
53 | - Parsing is organized by the semantic construct that is being parsed. Separate | |
c295e0f8 XL |
54 | `parse_*` methods can be found in [`rustc_parse` `parser`][rustc_parse_parser_dir] |
55 | directory. The source file name follows the construct name. For example, the | |
56 | following files are found in the parser: | |
6a06907d XL |
57 | - `expr.rs` |
58 | - `pat.rs` | |
59 | - `ty.rs` | |
60 | - `stmt.rs` | |
61 | - This naming scheme is used across many compiler stages. You will find | |
62 | either a file or directory with the same name across the parsing, lowering, | |
63 | type checking, THIR lowering, and MIR building sources. | |
64 | - Macro expansion, AST validation, name resolution, and early linting take place | |
65 | during this stage of the compile process. | |
66 | - The parser uses the standard `DiagnosticBuilder` API for error handling, but we | |
67 | try to recover, parsing a superset of Rust's grammar, while also emitting an error. | |
68 | - `rustc_ast::ast::{Crate, Mod, Expr, Pat, ...}` AST nodes are returned from the parser. | |
69 | - We then take the AST and [convert it to High-Level Intermediate | |
70 | Representation (HIR)][hir]. This is a compiler-friendly representation of the | |
71 | AST. This involves a lot of desugaring of things like loops and `async fn`. | |
136023e0 XL |
72 | - We use the HIR to do [type inference] (the process of automatic |
73 | detection of the type of an expression), [trait solving] (the process | |
74 | of pairing up an impl with each reference to a trait), and [type | |
75 | checking] (the process of converting the types found in the HIR | |
76 | (`hir::Ty`), which represent the syntactic things that the user wrote, | |
77 | into the internal representation used by the compiler (`Ty<'tcx>`), | |
78 | and using that information to verify the type safety, correctness and | |
79 | coherence of the types used in the program). | |
6a06907d XL |
80 | - The HIR is then [lowered to Mid-Level Intermediate Representation (MIR)][mir]. |
81 | - Along the way, we construct the THIR, which is an even more desugared HIR. | |
82 | THIR is used for pattern and exhaustiveness checking. It is also more | |
83 | convenient to convert into MIR than HIR is. | |
84 | - The MIR is used for [borrow checking]. | |
85 | - We (want to) do [many optimizations on the MIR][mir-opt] because it is still | |
86 | generic and that improves the code we generate later, improving compilation | |
87 | speed too. | |
88 | - MIR is a higher level (and generic) representation, so it is easier to do | |
89 | some optimizations at MIR level than at LLVM-IR level. For example, LLVM | |
90 | doesn't seem to be able to optimize the pattern the [`simplify_try`] mir | |
91 | opt looks for. | |
92 | - Rust code is _monomorphized_, which means making copies of all the generic | |
93 | code with the type parameters replaced by concrete types. To do | |
94 | this, we need to collect a list of what concrete types to generate code for. | |
95 | This is called _monomorphization collection_. | |
96 | - We then begin what is vaguely called _code generation_ or _codegen_. | |
97 | - The [code generation stage (codegen)][codegen] is when higher level | |
98 | representations of source are turned into an executable binary. `rustc` | |
99 | uses LLVM for code generation. The first step is to convert the MIR | |
100 | to LLVM Intermediate Representation (LLVM IR). This is where the MIR | |
101 | is actually monomorphized, according to the list we created in the | |
102 | previous step. | |
103 | - The LLVM IR is passed to LLVM, which does a lot more optimizations on it. | |
104 | It then emits machine code, which is basically assembly code with additional | |
105 | low-level types and annotations added (e.g. an ELF object or wasm). | |
106 | - The different libraries/binaries are linked together to produce the final | |
107 | binary. | |
108 | ||
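To make the first lexing step concrete, here is a minimal sketch of turning source text into a stream of tokens. It is a toy and not the `rustc_lexer` API: `TokenKind`, `Token`, `eat_while`, and `tokenize` are hypothetical names invented for this example, and only identifiers, integer literals, whitespace, and single-character punctuation are recognized. Each token records just a kind and a byte length, so no diagnostics or interning are involved at this level.

```rust
use std::iter::Peekable;
use std::str::Chars;

/// Hypothetical token kinds for a toy lexer.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TokenKind {
    Ident,
    Int,
    Whitespace,
    Punct,
}

/// A token is just a kind plus its length in bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Token {
    kind: TokenKind,
    len: usize,
}

/// Consumes characters while `pred` holds and returns the bytes consumed.
fn eat_while(chars: &mut Peekable<Chars<'_>>, pred: impl Fn(char) -> bool) -> usize {
    let mut len = 0;
    while let Some(&c) = chars.peek() {
        if !pred(c) {
            break;
        }
        len += c.len_utf8();
        chars.next();
    }
    len
}

/// Splits `src` into tokens; every byte of the input belongs to exactly one token.
fn tokenize(src: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(first) = chars.next() {
        let mut len = first.len_utf8();
        let kind = if first.is_alphabetic() || first == '_' {
            len += eat_while(&mut chars, |c| c.is_alphanumeric() || c == '_');
            TokenKind::Ident
        } else if first.is_ascii_digit() {
            len += eat_while(&mut chars, |c| c.is_ascii_digit());
            TokenKind::Int
        } else if first.is_whitespace() {
            len += eat_while(&mut chars, char::is_whitespace);
            TokenKind::Whitespace
        } else {
            // Anything else is treated as one punctuation character.
            TokenKind::Punct
        };
        tokens.push(Token { kind, len });
    }
    tokens
}
```

Calling `tokenize("let x = 42;")` yields eight tokens whose lengths add up to the length of the input, which is the sense in which a lexer like this keeps full-fidelity information about the source.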
109 | [String interning]: https://en.wikipedia.org/wiki/String_interning | |
110 | [`rustc_lexer`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_lexer/index.html | |
111 | [`rustc_driver`]: https://rustc-dev-guide.rust-lang.org/rustc-driver.html | |
112 | [`rustc_interface::Config`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_interface/interface/struct.Config.html | |
113 | [lex]: https://rustc-dev-guide.rust-lang.org/the-parser.html | |
114 | [`StringReader`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/lexer/struct.StringReader.html | |
115 | [`rustc_parse`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/index.html | |
116 | [parser]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/index.html | |
117 | [hir]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/index.html | |
118 | [type inference]: https://rustc-dev-guide.rust-lang.org/type-inference.html | |
136023e0 XL |
119 | [trait solving]: https://rustc-dev-guide.rust-lang.org/traits/resolution.html |
120 | [type checking]: https://rustc-dev-guide.rust-lang.org/type-checking.html | |
6a06907d XL |
121 | [mir]: https://rustc-dev-guide.rust-lang.org/mir/index.html |
122 | [borrow checking]: https://rustc-dev-guide.rust-lang.org/borrow_check.html | |
123 | [mir-opt]: https://rustc-dev-guide.rust-lang.org/mir/optimizations.html | |
124 | [`simplify_try`]: https://github.com/rust-lang/rust/pull/66282 | |
125 | [codegen]: https://rustc-dev-guide.rust-lang.org/backend/codegen.html | |
126 | [parse_nonterminal]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html#method.parse_nonterminal | |
c295e0f8 XL |
127 | [parse_crate_mod]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html#method.parse_crate_mod |
128 | [parse_mod]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html#method.parse_mod | |
129 | [`rustc_parse::parser::Parser`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html | |
130 | [parse_external_mod]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/module/fn.parse_external_mod.html | |
131 | [rustc_parse_parser_dir]: https://github.com/rust-lang/rust/tree/master/compiler/rustc_parse/src/parser | |
6a06907d XL |
132 | |
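As a small illustration of the monomorphization step described above, the generic function below is used at two concrete types. Conceptually, the compiler ends up generating one copy per type, roughly like the hand-written `largest_i32` and `largest_f64`. Those two names, and the `demo` function, are hypothetical and only show the idea; they are not the symbols `rustc` actually emits.

```rust
/// Generic source code: one definition, many possible instantiations.
fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    let mut iter = items.iter().copied();
    let first = iter.next()?;
    Some(iter.fold(first, |best, x| if x > best { x } else { best }))
}

/// Roughly what the copy for `T = i32` amounts to (hypothetical name).
fn largest_i32(items: &[i32]) -> Option<i32> {
    let mut iter = items.iter().copied();
    let first = iter.next()?;
    Some(iter.fold(first, |best, x| if x > best { x } else { best }))
}

/// Roughly what the copy for `T = f64` amounts to (hypothetical name).
fn largest_f64(items: &[f64]) -> Option<f64> {
    let mut iter = items.iter().copied();
    let first = iter.next()?;
    Some(iter.fold(first, |best, x| if x > best { x } else { best }))
}

/// These two call sites are the kind of uses that monomorphization
/// collection has to discover: `largest::<i32>` and `largest::<f64>`.
fn demo() -> (Option<i32>, Option<f64>) {
    (largest(&[3, 7, 2]), largest(&[1.5, 0.5]))
}
```

Only the instantiations that are actually used need code generated for them, which is why the collection step comes before codegen.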
133 | ## How it does it | |
134 | ||
135 | Ok, so now that we have a high-level view of what the compiler does to your | |
136 | code, let's take a high-level view of _how_ it does all that stuff. There are a | |
137 | lot of constraints and conflicting goals that the compiler needs to | |
138 | satisfy/optimize for. For example, | |
139 | ||
140 | - Compilation speed: how fast is it to compile a program? More/better | |
141 | compile-time analyses often mean compilation is slower. | |
142 | - Also, we want to support incremental compilation, so we need to take that | |
143 | into account. How can we keep track of what work needs to be redone and | |
144 | what can be reused if the user modifies their program? | |
145 | - Also we can't store too much stuff in the incremental cache because | |
146 | it would take a long time to load from disk and it could take a lot | |
147 | of space on the user's system... | |
148 | - Compiler memory usage: while compiling a program, we don't want to use more | |
149 | memory than we need. | |
150 | - Program speed: how fast is your compiled program? More/better compile-time | |
151 | analyses often mean the compiler can do better optimizations. | |
152 | - Program size: how large is the compiled binary? Similar to the previous | |
153 | point. | |
154 | - Compiler compilation speed: how long does it take to compile the compiler? | |
155 | This impacts contributors and compiler maintenance. | |
156 | - Implementation complexity: building a compiler is one of the hardest | |
157 | things a person/group can do, and Rust is not a very simple language, so how | |
158 | do we make the compiler's code base manageable? | |
159 | - Compiler correctness: the binaries produced by the compiler should do what | |
160 | the input programs say they do, and should continue to do so despite the | |
161 | tremendous amount of change constantly going on. | |
162 | - Integration: a number of other tools need to use the compiler in | |
163 | various ways (e.g. cargo, clippy, miri, RLS) that must be supported. | |
164 | - Compiler stability: the compiler should not crash or fail ungracefully on the | |
165 | stable channel. | |
166 | - Rust stability: the compiler must respect Rust's stability guarantees by not | |
167 | breaking programs that previously compiled despite the many changes that are | |
168 | always going on to its implementation. | |
169 | - Limitations of other tools: rustc uses LLVM in its backend, and LLVM has some | |
170 | strengths we leverage and some limitations/weaknesses we need to work around. | |
171 | ||
172 | So, as you read through the rest of the guide, keep these things in mind. They | |
173 | will often inform decisions that we make. | |
174 | ||
175 | ### Intermediate representations | |
176 | ||
177 | As with most compilers, `rustc` uses some intermediate representations (IRs) to | |
178 | facilitate computations. In general, working directly with the source code is | |
179 | extremely inconvenient and error-prone. Source code is designed to be human-friendly while at | |
180 | the same time being unambiguous, but it's less convenient for doing something | |
181 | like, say, type checking. | |
182 | ||
183 | Instead most compilers, including `rustc`, build some sort of IR out of the | |
184 | source code which is easier to analyze. `rustc` has a few IRs, each optimized | |
185 | for different purposes: | |
186 | ||
187 | - Token stream: the lexer produces a stream of tokens directly from the source | |
188 | code. This stream of tokens is easier for the parser to deal with than raw | |
189 | text. | |
190 | - Abstract Syntax Tree (AST): the abstract syntax tree is built from the stream | |
191 | of tokens produced by the lexer. It represents | |
192 | pretty much exactly what the user wrote. It helps to do some syntactic sanity | |
193 | checking (e.g. checking that a type is expected where the user wrote one). | |
194 | - High-level IR (HIR): This is a sort of desugared AST. It's still close | |
195 | to what the user wrote syntactically, but it includes some implicit things | |
196 | such as some elided lifetimes, etc. This IR is amenable to type checking. | |
197 | - Typed HIR (THIR): This is an intermediate between HIR and MIR, and used to be called | |
198 | High-level Abstract IR (HAIR). It is like the HIR but it is fully typed and a bit | |
199 | more desugared (e.g. method calls and implicit dereferences are made fully explicit). | |
200 | Moreover, it is easier to lower to MIR from THIR than from HIR. | |
201 | - Middle-level IR (MIR): This IR is basically a Control-Flow Graph (CFG). A CFG | |
202 | is a type of diagram that shows the basic blocks of a program and how control | |
203 | flow can go between them. Likewise, MIR also has a bunch of basic blocks with | |
204 | simple typed statements inside them (e.g. assignment, simple computations, | |
205 | etc) and control flow edges to other basic blocks (e.g., calls, dropping | |
206 | values). MIR is used for borrow checking and other | |
207 | important dataflow-based checks, such as checking for uninitialized values. | |
208 | It is also used for a series of optimizations and for constant evaluation (via | |
209 | MIRI). Because MIR is still generic, we can do a lot of analyses here more | |
210 | efficiently than after monomorphization. | |
211 | - LLVM IR: This is the standard form of all input to the LLVM compiler. LLVM IR | |
212 | is a sort of typed assembly language with lots of annotations. It's | |
213 | a standard format that is used by all compilers that use LLVM (e.g. the clang | |
214 | C compiler also outputs LLVM IR). LLVM IR is designed to be easy for other | |
215 | compilers to emit and also rich enough for LLVM to run a bunch of | |
216 | optimizations on it. | |
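The shape of a control-flow graph like the one MIR uses can be sketched with a few plain data types. The `Statement`, `Terminator`, and `BasicBlock` types below are simplified stand-ins invented for this example (statements are stored as text); the real definitions in `rustc_middle::mir` are much richer.

```rust
/// Index of a basic block within a function body.
type BlockId = usize;

/// A simple statement with no control flow (stored as text for brevity).
#[derive(Debug)]
struct Statement(&'static str);

/// How control leaves a basic block.
#[derive(Debug)]
enum Terminator {
    Goto(BlockId),
    Branch { cond: &'static str, if_true: BlockId, if_false: BlockId },
    Return,
}

#[derive(Debug)]
struct BasicBlock {
    statements: Vec<Statement>,
    terminator: Terminator,
}

/// The control-flow edges leaving `block`.
fn successors(block: &BasicBlock) -> Vec<BlockId> {
    match block.terminator {
        Terminator::Goto(target) => vec![target],
        Terminator::Branch { if_true, if_false, .. } => vec![if_true, if_false],
        Terminator::Return => vec![],
    }
}

/// A hand-built CFG for `fn abs(x: i32) -> i32 { if x < 0 { -x } else { x } }`.
fn abs_cfg() -> Vec<BasicBlock> {
    vec![
        BasicBlock {
            statements: vec![Statement("tmp = x < 0")],
            terminator: Terminator::Branch { cond: "tmp", if_true: 1, if_false: 2 },
        },
        BasicBlock { statements: vec![Statement("ret = -x")], terminator: Terminator::Goto(3) },
        BasicBlock { statements: vec![Statement("ret = x")], terminator: Terminator::Goto(3) },
        BasicBlock { statements: vec![], terminator: Terminator::Return },
    ]
}

/// Marks every block reachable from `entry` with a depth-first walk.
fn reachable(blocks: &[BasicBlock], entry: BlockId) -> Vec<bool> {
    let mut seen = vec![false; blocks.len()];
    let mut stack = vec![entry];
    while let Some(id) = stack.pop() {
        if !seen[id] {
            seen[id] = true;
            stack.extend(successors(&blocks[id]));
        }
    }
    seen
}
```

Dataflow-style checks are walks over this kind of graph; `reachable` is about the simplest such walk, and it is included only to show why an explicit block-and-edge structure is convenient for analysis.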
217 | ||
218 | One other thing to note is that many values in the compiler are _interned_. | |
219 | This is a performance and memory optimization in which we allocate the values | |
220 | in a special allocator called an _arena_. Then, we pass around references to | |
221 | the values allocated in the arena. This allows us to make sure that identical | |
222 | values (e.g. types in your program) are only allocated once and can be compared | |
223 | cheaply by comparing pointers. Many of the intermediate representations are | |
224 | interned. | |
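A minimal sketch of interning, under simplifying assumptions: the hypothetical `Interner` below gives each distinct string exactly one `Symbol`, so comparing two interned strings is an integer comparison. `rustc` interns many kinds of values in arenas and compares references rather than indices, and none of these names are the compiler's actual API.

```rust
use std::collections::HashMap;

/// A cheap, copyable handle standing in for an interned string.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Symbol(u32);

/// Hypothetical string interner. For simplicity the text is stored twice
/// (as a map key and in `strings`); a real interner would avoid that.
#[derive(Default)]
struct Interner {
    lookup: HashMap<String, Symbol>,
    strings: Vec<String>,
}

impl Interner {
    /// Returns the existing symbol for `s`, or allocates a new one.
    fn intern(&mut self, s: &str) -> Symbol {
        if let Some(&sym) = self.lookup.get(s) {
            return sym;
        }
        let sym = Symbol(self.strings.len() as u32);
        self.strings.push(s.to_owned());
        self.lookup.insert(s.to_owned(), sym);
        sym
    }

    /// Maps a symbol back to the text it stands for.
    fn resolve(&self, sym: Symbol) -> &str {
        &self.strings[sym.0 as usize]
    }
}
```

Interning `"foo"` twice returns the same `Symbol` both times, so later passes can compare and hash handles without touching the string data.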
225 | ||
226 | ### Queries | |
227 | ||
5099ac24 | 228 | The first big implementation choice is the _query_ system. The Rust compiler |
6a06907d XL |
229 | uses a query system which is unlike most textbook compilers, which are |
230 | organized as a series of passes over the code that execute sequentially. The | |
231 | compiler does this to make incremental compilation possible -- that is, if the | |
232 | user makes a change to their program and recompiles, we want to do as little | |
233 | redundant work as possible to produce the new binary. | |
234 | ||
235 | In `rustc`, all the major steps above are organized as a bunch of queries that | |
236 | call each other. For example, there is a query to ask for the type of something | |
237 | and another to ask for the optimized MIR of a function. These | |
238 | queries can call each other and are all tracked through the query system. | |
239 | The results of the queries are cached on disk so that we can tell which | |
240 | queries' results changed from the last compilation and only redo those. This is | |
241 | how incremental compilation works. | |
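The demand-driven, memoized flavor of queries can be sketched in a few lines. This is only an illustration under strong assumptions: the `Ctxt` type is hypothetical, the two queries merely wrap strings, the cache lives in memory rather than on disk, and there is no dependency tracking. The query names `mir_built` and `optimized_mir` are borrowed for flavor, but the signatures here are invented.

```rust
use std::cell::{Cell, RefCell};
use std::collections::HashMap;

/// Hypothetical stand-in for a context that owns inputs and query caches.
struct Ctxt {
    sources: HashMap<String, String>,
    mir_cache: RefCell<HashMap<String, String>>,
    opt_cache: RefCell<HashMap<String, String>>,
    /// Counts how often a query actually had to be computed.
    provider_runs: Cell<u32>,
}

impl Ctxt {
    fn new(items: &[(&str, &str)]) -> Self {
        Ctxt {
            sources: items.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect(),
            mir_cache: RefCell::new(HashMap::new()),
            opt_cache: RefCell::new(HashMap::new()),
            provider_runs: Cell::new(0),
        }
    }

    /// Query: build the (pretend) MIR for an item, computing it at most once.
    /// Panics if `item` is not a known item name.
    fn mir_built(&self, item: &str) -> String {
        let cached = self.mir_cache.borrow().get(item).cloned();
        if let Some(hit) = cached {
            return hit;
        }
        self.provider_runs.set(self.provider_runs.get() + 1);
        let result = format!("mir({})", self.sources[item]);
        self.mir_cache.borrow_mut().insert(item.to_owned(), result.clone());
        result
    }

    /// Query: optimize the MIR. It calls `mir_built`, so asking for this
    /// result is what drives the earlier step.
    fn optimized_mir(&self, item: &str) -> String {
        let cached = self.opt_cache.borrow().get(item).cloned();
        if let Some(hit) = cached {
            return hit;
        }
        self.provider_runs.set(self.provider_runs.get() + 1);
        let result = format!("opt({})", self.mir_built(item));
        self.opt_cache.borrow_mut().insert(item.to_owned(), result.clone());
        result
    }
}
```

Asking for `optimized_mir` of the same item twice runs each provider only once; the second request is answered from the cache. Incremental compilation additionally has to decide which cached results are still valid after the inputs change, which this sketch does not attempt.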
242 | ||
243 | In principle, for the query-fied steps, we do each of the above for each item | |
244 | individually. For example, we will take the HIR for a function and use queries | |
245 | to ask for the LLVM IR for that HIR. This drives the generation of optimized | |
246 | MIR, which drives the borrow checker, which drives the generation of MIR, and | |
247 | so on. | |
248 | ||
249 | ... except that this is very over-simplified. In fact, some queries are not | |
250 | cached on disk, and some parts of the compiler have to run for all code anyway | |
251 | for correctness even if the code is dead code (e.g. the borrow checker). For | |
252 | example, [currently the `mir_borrowck` query is first executed on all functions | |
253 | of a crate.][passes] Then the codegen backend invokes the | |
254 | `collect_and_partition_mono_items` query, which first recursively requests the | |
255 | `optimized_mir` for all reachable functions, which in turn runs `mir_borrowck` | |
256 | for that function and then creates codegen units. This kind of split will need | |
257 | to remain to ensure that unreachable functions still have their errors emitted. | |
258 | ||
259 | [passes]: https://github.com/rust-lang/rust/blob/45ebd5808afd3df7ba842797c0fcd4447ddf30fb/src/librustc_interface/passes.rs#L824 | |
260 | ||
261 | Moreover, the compiler wasn't originally built to use a query system; the query | |
262 | system has been retrofitted into the compiler, so parts of it are not query-fied | |
263 | yet. Also, LLVM isn't our code, so that isn't query-fied either. The plan is to | |
264 | eventually query-fy all of the steps listed in the previous section, | |
3c0e092e | 265 | but as of <!-- date: 2021-11 --> November 2021, only the steps between HIR and |
6a06907d XL |
266 | LLVM IR are query-fied. That is, lexing, parsing, name resolution, and macro |
267 | expansion are done all at once for the whole program. | |
268 | ||
269 | One other thing to mention here is the all-important "typing context", | |
270 | [`TyCtxt`], which is a giant struct that is at the center of all things. | |
271 | (Note that the name is mostly historic. This is _not_ a "typing context" in the | |
272 | sense of `Γ` or `Δ` from type theory. The name is retained because that's what | |
273 | the name of the struct is in the source code.) All | |
274 | queries are defined as methods on the [`TyCtxt`] type, and the in-memory query | |
275 | cache is stored there too. In the code, there is usually a variable called | |
276 | `tcx` which is a handle on the typing context. You will also see lifetimes with | |
277 | the name `'tcx`, which means that something is tied to the lifetime of the | |
278 | `TyCtxt` (usually it is stored or interned there). | |
279 | ||
280 | [`TyCtxt`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.TyCtxt.html | |
281 | ||
282 | ### `ty::Ty` | |
283 | ||
284 | Types are really important in Rust, and they form the core of a lot of compiler | |
285 | analyses. The main type (in the compiler) that represents types (in the user's | |
286 | program) is [`rustc_middle::ty::Ty`][ty]. This is so important that we have a whole chapter | |
287 | on [`ty::Ty`][ty], but for now, we just want to mention that it exists and is the way | |
288 | `rustc` represents types! | |
289 | ||
290 | Also note that the `rustc_middle::ty` module defines the `TyCtxt` struct we mentioned before. | |
291 | ||
5e7ed085 | 292 | [ty]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Ty.html |
6a06907d XL |
293 | |
294 | ### Parallelism | |
295 | ||
296 | Compiler performance is a problem that we would like to improve on | |
297 | (and are always working on). One aspect of that is parallelizing | |
298 | `rustc` itself. | |
299 | ||
c295e0f8 | 300 | Currently, there is only one part of rustc that is parallel by default: codegen. |
6a06907d XL |
301 | |
302 | However, the rest of the compiler is still not yet parallel. There have been | |
303 | lots of efforts spent on this, but it is generally a hard problem. The current | |
304 | approach is to turn `RefCell`s into `Mutex`es -- that is, we | |
305 | switch to thread-safe internal mutability. However, there are ongoing | |
306 | challenges with lock contention, maintaining query-system invariants under | |
307 | concurrency, and the complexity of the code base. One can try out the current | |
308 | work by enabling parallel compilation in `config.toml`. It's still early days, | |
309 | but there are already some promising performance improvements. | |
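To illustrate what switching from `RefCell` to `Mutex` buys, here is a sketch of a shared cache that several threads can fill concurrently. The `SharedCache` type and `square_in_parallel` function are hypothetical and unrelated to the compiler's real data structures; the example only assumes the standard library (`std::thread::scope` requires Rust 1.63 or later).

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::thread;

/// A single-threaded version would hold `RefCell<HashMap<u64, u64>>`.
/// `RefCell` is not `Sync`, so `&SharedCache` could not be sent to other
/// threads; wrapping the map in a `Mutex` makes that possible.
struct SharedCache {
    map: Mutex<HashMap<u64, u64>>,
}

impl SharedCache {
    fn new() -> Self {
        SharedCache { map: Mutex::new(HashMap::new()) }
    }

    /// Returns the cached value for `key`, computing it on a miss.
    /// The lock is held while `compute` runs, which keeps the sketch simple
    /// but is exactly the kind of choice that causes lock contention.
    fn get_or_compute(&self, key: u64, compute: impl FnOnce(u64) -> u64) -> u64 {
        let mut map = self.map.lock().unwrap();
        *map.entry(key).or_insert_with(|| compute(key))
    }

    fn len(&self) -> usize {
        self.map.lock().unwrap().len()
    }
}

/// Fills the cache from one scoped thread per key.
fn square_in_parallel(cache: &SharedCache, keys: &[u64]) {
    thread::scope(|s| {
        for &key in keys {
            s.spawn(move || {
                cache.get_or_compute(key, |k| k * k);
            });
        }
    });
}
```

Duplicate keys are computed only once no matter which thread gets there first, but every access now pays for a lock, which is one reason parallelizing a large code base this way is hard to do without regressions.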
310 | ||
311 | ### Bootstrapping | |
312 | ||
313 | `rustc` itself is written in Rust. So how do we compile the compiler? We use an | |
314 | older compiler to compile the newer compiler. This is called [_bootstrapping_]. | |
315 | ||
316 | Bootstrapping has a lot of interesting implications. For example, it means | |
317 | that one of the major users of Rust is the Rust compiler, so we are | |
318 | constantly testing our own software ("eating our own dogfood"). | |
319 | ||
320 | For more details on bootstrapping, see | |
321 | [the bootstrapping section of the guide][rustc-bootstrap]. | |
322 | ||
323 | [_bootstrapping_]: https://en.wikipedia.org/wiki/Bootstrapping_(compilers) | |
324 | [rustc-bootstrap]: building/bootstrapping.md | |
325 | ||
326 | # Unresolved Questions | |
327 | ||
328 | - Does LLVM ever do optimizations in debug builds? | |
329 | - How do I explore phases of the compile process in my own sources (lexer, | |
330 | parser, HIR, etc)? - e.g., `cargo rustc -- -Z unpretty=hir-tree` allows you to | |
331 | view HIR representation | |
332 | - What is the main source entry point for `X`? | |
333 | - Where do phases diverge for cross-compilation to machine code across | |
334 | different platforms? | |
335 | ||
336 | # References | |
337 | ||
338 | - Command line parsing | |
339 | - Guide: [The Rustc Driver and Interface](https://rustc-dev-guide.rust-lang.org/rustc-driver.html) | |
340 | - Driver definition: [`rustc_driver`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_driver/) | |
341 | - Main entry point: [`rustc_session::config::build_session_options`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_session/config/fn.build_session_options.html) | |
342 | - Lexical Analysis: Lex the user program to a stream of tokens | |
343 | - Guide: [Lexing and Parsing](https://rustc-dev-guide.rust-lang.org/the-parser.html) | |
344 | - Lexer definition: [`rustc_lexer`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_lexer/index.html) | |
345 | - Main entry point: [`rustc_lexer::first_token`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_lexer/fn.first_token.html) | |
346 | - Parsing: Parse the stream of tokens to an Abstract Syntax Tree (AST) | |
347 | - Guide: [Lexing and Parsing](https://rustc-dev-guide.rust-lang.org/the-parser.html) | |
348 | - Parser definition: [`rustc_parse`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/index.html) | |
349 | - Main entry points: | |
350 | - [Entry point for first file in crate](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_interface/passes/fn.parse.html) | |
351 | - [Entry point for outline module parsing](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/module/fn.parse_external_mod.html) | |
352 | - [Entry point for macro fragments][parse_nonterminal] | |
353 | - AST definition: [`rustc_ast`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/ast/index.html) | |
354 | - Expansion: **TODO** | |
355 | - Name Resolution: **TODO** | |
356 | - Feature gating: **TODO** | |
357 | - Early linting: **TODO** | |
358 | - The High Level Intermediate Representation (HIR) | |
359 | - Guide: [The HIR](https://rustc-dev-guide.rust-lang.org/hir.html) | |
360 | - Guide: [Identifiers in the HIR](https://rustc-dev-guide.rust-lang.org/hir.html#identifiers-in-the-hir) | |
361 | - Guide: [The HIR Map](https://rustc-dev-guide.rust-lang.org/hir.html#the-hir-map) | |
362 | - Guide: [Lowering AST to HIR](https://rustc-dev-guide.rust-lang.org/lowering.html) | |
363 | - How to view the HIR representation of your code: `cargo rustc -- -Z unpretty=hir-tree` | |
364 | - Rustc HIR definition: [`rustc_hir`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/index.html) | |
365 | - Main entry point: **TODO** | |
366 | - Late linting: **TODO** | |
367 | - Type Inference | |
368 | - Guide: [Type Inference](https://rustc-dev-guide.rust-lang.org/type-inference.html) | |
369 | - Guide: [The ty Module: Representing Types](https://rustc-dev-guide.rust-lang.org/ty.html) (semantics) | |
370 | - Main entry point (type inference): [`InferCtxtBuilder::enter`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_infer/infer/struct.InferCtxtBuilder.html#method.enter) | |
371 | - Main entry point (type checking bodies): [the `typeck` query](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.TyCtxt.html#method.typeck) | |
372 | - These two functions can't be decoupled. | |
373 | - The Mid Level Intermediate Representation (MIR) | |
374 | - Guide: [The MIR (Mid level IR)](https://rustc-dev-guide.rust-lang.org/mir/index.html) | |
375 | - Definition: [`rustc_middle/src/mir`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/index.html) | |
3c0e092e | 376 | - Definitions of sources that manipulate the MIR: [`rustc_mir_build`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_build/index.html), [`rustc_mir_dataflow`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_dataflow/index.html), [`rustc_mir_transform`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/index.html) |
6a06907d XL |
377 | - The Borrow Checker |
378 | - Guide: [MIR Borrow Check](https://rustc-dev-guide.rust-lang.org/borrow_check.html) | |
3c0e092e XL |
379 | - Definition: [`rustc_borrowck`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/index.html) |
380 | - Main entry point: [`mir_borrowck` query](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/fn.mir_borrowck.html) | |
6a06907d XL |
381 | - MIR Optimizations |
382 | - Guide: [MIR Optimizations](https://rustc-dev-guide.rust-lang.org/mir/optimizations.html) | |
3c0e092e XL |
383 | - Definition: [`rustc_mir_transform`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/index.html) |
384 | - Main entry point: [`optimized_mir` query](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/fn.optimized_mir.html) | |
6a06907d XL |
385 | - Code Generation |
386 | - Guide: [Code Generation](https://rustc-dev-guide.rust-lang.org/backend/codegen.html) | |
387 | - Generating Machine Code from LLVM IR with LLVM - **TODO: reference?** | |
388 | - Main entry point: [`rustc_codegen_ssa::base::codegen_crate`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/base/fn.codegen_crate.html) | |
389 | - This monomorphizes and produces LLVM IR for one codegen unit. It then | |
390 | starts a background thread to run LLVM, which must be joined later. | |
391 | - Monomorphization happens lazily via [`FunctionCx::monomorphize`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/mir/struct.FunctionCx.html#method.monomorphize) and [`rustc_codegen_ssa::base::codegen_instance`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/base/fn.codegen_instance.html)