New upstream version 1.56.0~beta.4+dfsg1
1 # Bootstrapping the Compiler
2
3 <!-- toc -->
4
5 This subchapter is about the bootstrapping process.
6
7 ## What is bootstrapping? How does it work?
8
9 [Bootstrapping] is the process of using a compiler to compile itself.
10 More accurately, it means using an older compiler to compile a newer version
11 of the same compiler.
12
13 This raises a chicken-and-egg paradox: where did the first compiler come from?
14 It must have been written in a different language. In Rust's case it was
15 [written in OCaml][ocaml-compiler]. However, it was abandoned long ago, and the
16 only way to build a modern version of rustc is with a slightly less modern
17 version.
18
19 This is exactly how `x.py` works: it downloads the current beta release of
20 rustc, then uses it to compile the new compiler.
21
22 ## Stages of bootstrapping
23
24 Compiling `rustc` is done in stages:
25
26 - **Stage 0:** the stage0 compiler is usually (you can configure `x.py` to use
27 something else) the current _beta_ `rustc` compiler and its associated dynamic
28 libraries (which `x.py` will download for you). This stage0 compiler is then
29 used only to compile `rustbuild`, `std`, and `rustc`. When compiling
30 `rustc`, this stage0 compiler uses the freshly compiled `std`.
31 There are two concepts at play here: a compiler (with its set of dependencies)
32 and its 'target' or 'object' libraries (`std` and `rustc`).
33 Both are staged, but in a staggered manner.
34 - **Stage 1:** the code in your clone (for the new version) is then
35 compiled with the stage0 compiler to produce the stage1 compiler.
36 However, it was built with an older compiler (stage0), so to
37 optimize the stage1 compiler we go to the next stage.
38 - In theory, the stage1 compiler is functionally identical to the
39 stage2 compiler, but in practice there are subtle differences. In
40 particular, the stage1 compiler itself was built by stage0 and
41 hence not by the source in your working directory: this means that
42 the symbol names used in the compiler source may not match the
43 symbol names that would have been made by the stage1 compiler. This
44 matters when using dynamic linking, because there is no ABI compatibility
45 between versions. This primarily manifests when tests try to link with any
46 of the `rustc_*` crates or use the (now deprecated) plugin infrastructure.
47 These tests are marked with `ignore-stage1`.
48 - **Stage 2:** we rebuild our stage1 compiler with itself to produce
49 the stage2 compiler (i.e. it builds itself) to have all the _latest
50 optimizations_. (By default, we copy the stage1 libraries for use by
51 the stage2 compiler, since they ought to be identical.)
52 - _(Optional)_ **Stage 3**: to sanity check our new compiler, we
53 can build the libraries with the stage2 compiler. The result ought
54 to be identical to before, unless something has broken.
55
56 The `stage2` compiler is the one distributed with `rustup` and all other
57 install methods. However, it takes a very long time to build because one must
58 first build the new compiler with an older compiler and then use that to
59 build the new compiler with itself. For development, you usually only want
60 the `stage1` compiler: `x.py build library/std`.
61
62 ### Default stages
63
64 `x.py` tries to be helpful and pick the stage you most likely meant for each subcommand.
65 These defaults are as follows:
66
67 - `check`: `--stage 0`
68 - `doc`: `--stage 0`
69 - `build`: `--stage 1`
70 - `test`: `--stage 1`
71 - `dist`: `--stage 2`
72 - `install`: `--stage 2`
73 - `bench`: `--stage 2`
74
75 You can always override the stage by passing `--stage N` explicitly.
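
The defaults above can be sketched as a small lookup. This is only an illustration of the list: `default_stage` and `effective_stage` are hypothetical names, not functions from `src/bootstrap`.

```rust
/// Default `--stage` for each subcommand, mirroring the list above.
/// Hypothetical helper; the real logic lives in the bootstrap sources.
fn default_stage(subcommand: &str) -> Option<u32> {
    match subcommand {
        "check" | "doc" => Some(0),
        "build" | "test" => Some(1),
        "dist" | "install" | "bench" => Some(2),
        _ => None, // subcommands not covered by the list above
    }
}

/// An explicit `--stage N` always wins over the default.
fn effective_stage(subcommand: &str, explicit: Option<u32>) -> Option<u32> {
    explicit.or_else(|| default_stage(subcommand))
}

fn main() {
    for cmd in ["check", "doc", "build", "test", "dist", "install", "bench"] {
        println!("{}: --stage {}", cmd, default_stage(cmd).unwrap());
    }
    // Passing `--stage 2` to `build` overrides the default of 1.
    println!("build --stage 2: {:?}", effective_stage("build", Some(2)));
}
```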
76
77 For more information about stages, [see below](#understanding-stages-of-bootstrap).
78
79 ## Complications of bootstrapping
80
81 Since the build system uses the current beta compiler to build the stage-1
82 bootstrapping compiler, the compiler source code can't use some features
83 until they reach beta (because otherwise the beta compiler doesn't support
84 them). On the other hand, for [compiler intrinsics][intrinsics] and internal
85 features, the features _have_ to be used. Additionally, the compiler makes
86 heavy use of nightly features (`#![feature(...)]`). How can we resolve this
87 problem?
88
89 There are two methods used:
90 1. The build system sets `--cfg bootstrap` when building with `stage0`, so we
91 can use `cfg(not(bootstrap))` to only use features when built with `stage1`.
92 This is useful for e.g. features that were just stabilized, which require
93 `#![feature(...)]` when built with `stage0`, but not for `stage1`.
94 2. The build system sets `RUSTC_BOOTSTRAP=1`. This special variable
95 _breaks the stability guarantees_ of Rust: it allows using `#![feature(...)]` with
96 a compiler that's not nightly. This should never be used except when
97 bootstrapping the compiler.
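
As a minimal sketch of the first method, the program below selects one of two definitions depending on whether `--cfg bootstrap` was passed. The function `built_with` is a made-up example and does not come from the compiler sources. Outside the `x.py` build, newer toolchains may warn that `bootstrap` is an unexpected cfg.

```rust
/// Selected when the build system passes `--cfg bootstrap`,
/// i.e. when the crate is being built with the stage0 compiler.
#[cfg(bootstrap)]
fn built_with() -> &'static str {
    "stage0 compiler"
}

/// Selected in every other case, i.e. when built with stage1 or later.
#[cfg(not(bootstrap))]
fn built_with() -> &'static str {
    "stage1 or later compiler"
}

fn main() {
    println!("built with the {}", built_with());
}
```

Compiling this with a plain `rustc` picks the `cfg(not(bootstrap))` branch; adding `--cfg bootstrap` on the command line picks the other one.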
98
99 [Bootstrapping]: https://en.wikipedia.org/wiki/Bootstrapping_(compilers)
100 [intrinsics]: ../appendix/glossary.md#intrinsic
101 [ocaml-compiler]: https://github.com/rust-lang/rust/tree/ef75860a0a72f79f97216f8aaa5b388d98da6480/src/boot
102
103 ## Contributing to bootstrap
104
105 When you use the bootstrap system, you'll call it through `x.py`.
106 However, most of the code lives in `src/bootstrap`.
107 `bootstrap` has a difficult problem: it is written in Rust, yet it is run
108 before the Rust compiler is built! To work around this, there are two
109 components of bootstrap: the main one written in Rust, and `bootstrap.py`.
110 `bootstrap.py` is what gets run by `x.py`. It takes care of downloading the
111 `stage0` compiler, which will then build the bootstrap binary written in
112 Rust.
113
114 Because there are two separate codebases behind `x.py`, they need to
115 be kept in sync. In particular, both `bootstrap.py` and the bootstrap binary
116 parse `config.toml` and read the same command line arguments. `bootstrap.py`
117 keeps these in sync by setting various environment variables, and the
118 programs sometimes have to add arguments that are explicitly ignored, to be
119 read by the other.
120
121 ### Adding a setting to config.toml
122
123 This section is a work in progress. In the meantime, you can see an example
124 contribution [here][bootstrap-build].
125
126 [bootstrap-build]: https://github.com/rust-lang/rust/pull/71994
127
128 ## Understanding stages of bootstrap
129
130 ### Overview
131
132 This is a detailed look into the separate bootstrap stages.
133
134 The convention `x.py` uses is that:
135 - A `--stage N` flag means to run the stage N compiler (`stageN/rustc`).
136 - A "stage N artifact" is a build artifact that is _produced_ by the stage N compiler.
137 - The "stage (N+1) compiler" is assembled from "stage N artifacts". This
138 process is called _uplifting_.
139
140 #### Build artifacts
141
142 Anything you can build with `x.py` is a _build artifact_.
143 Build artifacts include, but are not limited to:
144
145 - binaries, like `stage0-rustc/rustc-main`
146 - shared objects, like `stage0-sysroot/rustlib/libstd-6fae108520cf72fe.so`
147 - [rlib] files, like `stage0-sysroot/rustlib/libstd-6fae108520cf72fe.rlib`
148 - HTML files generated by rustdoc, like `doc/std`
149
150 [rlib]: ../serialization.md
151
152 #### Assembling the compiler
153
154 There is a separate step between building the compiler and making it possible
155 to run. This step is called _assembling_ or _uplifting_ the compiler. It copies
156 all the necessary build artifacts from `build/stageN-sysroot` to
157 `build/stage(N+1)`, which allows you to use `build/stage(N+1)` as a [toolchain]
158 with `rustup toolchain link`.
159
160 There is [no way to trigger this step on its own][#73519], but `x.py` will
161 perform it automatically any time you build with stage N+1.
162
163 [toolchain]: https://rustc-dev-guide.rust-lang.org/building/how-to-build-and-run.html#creating-a-rustup-toolchain
164 [#73519]: https://github.com/rust-lang/rust/issues/73519
165
166 #### Examples
167
168 - `x.py build --stage 0` means to build with the beta `rustc`.
169 - `x.py doc --stage 0` means to document using the beta `rustdoc`.
170 - `x.py test --stage 0 library/std` means to run tests on the standard library
171 without building `rustc` from source ('build with stage 0, then test the
172 artifacts'). If you're working on the standard library, this is normally the
173 test command you want.
174 - `x.py test src/test/ui` means to build the stage 1 compiler and run
175 `compiletest` on it. If you're working on the compiler, this is normally the
176 test command you want.
177
178 #### Examples of what *not* to do
179
180 - `x.py test --stage 0 src/test/ui` is not meaningful: it runs tests on the
181 _beta_ compiler and doesn't build `rustc` from source. Use `test src/test/ui`
182 instead, which builds stage 1 from source.
183 - `x.py test --stage 0 compiler/rustc` builds the compiler but runs no tests:
184 it's running `cargo test -p rustc`, but cargo doesn't understand Rust's
185 tests. You shouldn't need to use this; use `test` instead (without arguments).
186 - `x.py build --stage 0 compiler/rustc` builds the compiler, but does
187 not [assemble] it. Use `x.py build library/std` instead, which puts the
188 compiler in `stage1/rustc`.
189
190 [assemble]: #assembling-the-compiler
191
192 ### Building vs. Running
193
194
195 Note that `build --stage N compiler/rustc` **does not** build the stage N compiler:
196 instead it builds the stage _N+1_ compiler _using_ the stage N compiler.
197
198 In short, _stage 0 uses the stage0 compiler to create stage0 artifacts which
199 will later be uplifted to be the stage1 compiler_.
200
201 In each stage, two major steps are performed:
202
203 1. `std` is compiled by the stage N compiler.
204 2. That `std` is linked to programs built by the stage N compiler, including
205 the stage N artifacts (stage (N+1) compiler).
206
207 This is somewhat intuitive if one thinks of the stage N artifacts as "just"
208 another program we are building with the stage N compiler:
209 `build --stage N compiler/rustc` is linking the stage N artifacts to the `std`
210 built by the stage N compiler.
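
The relationship between building and uplifting can be modelled in a few lines. This is a toy model of the naming convention only; the types `Compiler` and `RustcArtifacts` and the functions `build_rustc` and `uplift` are hypothetical and do not correspond to the bootstrap implementation.

```rust
/// A compiler identified by its stage number (0 is the downloaded beta).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Compiler {
    stage: u32,
}

/// Artifacts produced *by* the stage N compiler are "stage N artifacts".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct RustcArtifacts {
    built_by_stage: u32,
}

/// Models `build --stage N compiler/rustc`: the stage N compiler
/// produces stage N artifacts. It does not produce a stage N compiler.
fn build_rustc(with: Compiler) -> RustcArtifacts {
    RustcArtifacts { built_by_stage: with.stage }
}

/// Models uplifting: stage N artifacts are assembled into the stage N+1 compiler.
fn uplift(artifacts: RustcArtifacts) -> Compiler {
    Compiler { stage: artifacts.built_by_stage + 1 }
}

fn main() {
    let mut compiler = Compiler { stage: 0 };
    while compiler.stage < 2 {
        let artifacts = build_rustc(compiler);
        let next = uplift(artifacts);
        println!(
            "stage{} compiler -> stage{} artifacts -> stage{} compiler",
            compiler.stage, artifacts.built_by_stage, next.stage
        );
        compiler = next;
    }
}
```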
211
212 Here is a chart of a full build using `x.py`:
213
214 <img alt="A diagram of the rustc compilation phases" src="../img/rustc_stages.svg" class="center" />
215
216 Keep in mind this diagram is a simplification: for example, `rustdoc` can be built at
217 different stages, and the process is a bit different when passing flags such as
218 `--keep-stage`, or if there are non-host targets.
219
220 The stage 2 compiler is what is shipped to end-users.
221
222 ### Stages and `std`
223
224 Note that there are two `std` libraries in play here:
225 1. The library _linked_ to `stageN/rustc`, which was built by stage N-1 (stage N-1 `std`)
226 2. The library _used to compile programs_ with `stageN/rustc`, which was
227 built by stage N (stage N `std`).
228
229 Stage N `std` is pretty much necessary for any useful work with the stage N compiler.
230 Without it, you can only compile programs with `#![no_core]` -- not terribly useful!
231
232 These need to be different because they aren't necessarily ABI-compatible:
233 there could be new layout optimizations, changes to MIR, or other changes
234 to Rust metadata on nightly that aren't present in beta.
235
236 This is also where `--keep-stage 1 library/std` comes into play. Since most
237 changes to the compiler don't actually change the ABI, once you've produced a
238 `std` in stage 1, you can probably just reuse it with a different compiler.
239 If the ABI hasn't changed, you're good to go, no need to spend time
240 recompiling that `std`.
241 `--keep-stage` simply assumes the previous compile is fine and copies those
242 artifacts into the appropriate place, skipping the cargo invocation.
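
The decision that `--keep-stage` makes can be pictured roughly as follows. This is a simplified sketch under the assumption that the kept stages are available as a plain list; `StdAction` and `std_action` are hypothetical names, and the real logic in `src/bootstrap` is more involved.

```rust
/// What to do about `std` for a given stage.
#[derive(Debug, PartialEq, Eq)]
enum StdAction {
    /// Assume the previous build is fine and copy its artifacts into place.
    ReuseExisting,
    /// Invoke cargo to (re)build `std`.
    RunCargo,
}

/// `keep_stages` stands in for the stages named via `--keep-stage`.
fn std_action(stage: u32, keep_stages: &[u32]) -> StdAction {
    if keep_stages.contains(&stage) {
        StdAction::ReuseExisting
    } else {
        StdAction::RunCargo
    }
}

fn main() {
    // With `--keep-stage 1`, the stage 1 `std` is reused as-is.
    println!("{:?}", std_action(1, &[1]));
    // Without the flag, cargo is invoked as usual.
    println!("{:?}", std_action(1, &[]));
}
```

Note that nothing in this model checks whether the ABI actually changed, which matches the text above: the flag simply trusts the previous build.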
243
244 ### Cross-compiling
245
246 Building stage2 `std` is different depending on whether you are cross-compiling or not
247 (see in the table how stage2 only builds non-host `std` targets).
248 This is because `x.py` uses a trick: if `HOST` and `TARGET` are the same,
249 it will reuse stage1 `std` for stage2! This is sound because stage1 `std`
250 was compiled with the stage1 compiler, i.e. a compiler using the source code
251 you currently have checked out. So it should be identical (and therefore ABI-compatible)
252 to the `std` that `stage2/rustc` would compile.
253
254 However, when cross-compiling, stage1 `std` will only run on the host.
255 So the stage2 compiler has to recompile `std` for the target.
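
The trick described above amounts to a single comparison. The sketch below is illustrative only; `Stage2Std` and `stage2_std_plan` are hypothetical names, and the target triples are just examples.

```rust
/// How stage2 obtains a `std` for a given target.
#[derive(Debug, PartialEq, Eq)]
enum Stage2Std {
    /// HOST == TARGET: the stage1 `std` is reused.
    ReuseStage1,
    /// Cross-compiling: the stage2 compiler rebuilds `std` for the target.
    Rebuild,
}

fn stage2_std_plan(host: &str, target: &str) -> Stage2Std {
    if host == target {
        Stage2Std::ReuseStage1
    } else {
        Stage2Std::Rebuild
    }
}

fn main() {
    let host = "x86_64-unknown-linux-gnu";
    for target in [host, "aarch64-unknown-linux-gnu"] {
        println!("{} -> {}: {:?}", host, target, stage2_std_plan(host, target));
    }
}
```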
256
257 ### Why does only libstd use `cfg(bootstrap)`?
258
259 The `rustc` generated by the stage0 compiler is linked to the freshly-built
260 `std`, which means that for the most part only `std` needs to be cfg-gated,
261 so that `rustc` can use features added to std immediately after their addition,
262 without need for them to get into the downloaded beta.
263
264 Note this is different from any other Rust program: stage1 `rustc`
265 is built by the _beta_ compiler, but using the _master_ version of libstd!
266
267 The only time `rustc` uses `cfg(bootstrap)` is when it adds internal lints
268 that use diagnostic items. This happens very rarely.
269
270 ### What is a 'sysroot'?
271
272 When you build a project with cargo, the build artifacts for dependencies
273 are normally stored in `target/debug/deps`. This only contains dependencies cargo
274 knows about; in particular, it doesn't have the standard library. Where do
275 `std` or `proc_macro` come from? They come from the **sysroot**, the root
276 of a number of directories where the compiler loads build artifacts at runtime.
277 The sysroot doesn't just store the standard library, though - it includes
278 anything that needs to be loaded at runtime. That includes (but is not limited
279 to):
280
281 - `libstd`/`libtest`/`libproc_macro`
282 - The compiler crates themselves, when using `rustc_private`. In-tree these
283 are always present; out of tree, you need to install `rustc-dev` with rustup.
284 - `libLLVM.so`, the shared object file for the LLVM project. In-tree this is
285 either built from source or downloaded from CI; out-of-tree, you need to
286 install `llvm-tools-preview` with rustup.
287
288 All the artifacts listed so far are *compiler* runtime dependencies. You can
289 see them with `rustc --print sysroot`:
290
291 ```
292 $ ls $(rustc --print sysroot)/lib
293 libchalk_derive-0685d79833dc9b2b.so libstd-25c6acf8063a3802.so
294 libLLVM-11-rust-1.50.0-nightly.so libtest-57470d2aa8f7aa83.so
295 librustc_driver-4f0cc9f50e53f0ba.so libtracing_attributes-e4be92c35ab2a33b.so
296 librustc_macros-5f0ec4a119c6ac86.so rustlib
297 ```
298
299 There are also runtime dependencies for the standard library! These are in
300 `lib/rustlib`, not `lib/` directly.
301
302 ```
303 $ ls $(rustc --print sysroot)/lib/rustlib/x86_64-unknown-linux-gnu/lib | head -n 5
304 libaddr2line-6c8e02b8fedc1e5f.rlib
305 libadler-9ef2480568df55af.rlib
306 liballoc-9c4002b5f79ba0e1.rlib
307 libcfg_if-512eb53291f6de7e.rlib
308 libcompiler_builtins-ef2408da76957905.rlib
309 ```
310
311 `rustlib` includes libraries like `hashbrown` and `cfg_if`, which are not part
312 of the public API of the standard library, but are used to implement it.
313 `rustlib` is part of the search path for linkers, but `lib` will never be part
314 of the search path.
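
The two directories discussed above can also be located programmatically. The sketch below shells out to `rustc --print sysroot` and derives both paths from the result; the helper names `compiler_lib_dir` and `target_lib_dir` are hypothetical, and the target triple is taken from the listing above rather than detected.

```rust
use std::path::{Path, PathBuf};
use std::process::Command;

/// Compiler runtime dependencies live directly in `<sysroot>/lib`.
fn compiler_lib_dir(sysroot: &Path) -> PathBuf {
    sysroot.join("lib")
}

/// Standard library artifacts for `target` live in
/// `<sysroot>/lib/rustlib/<target>/lib`.
fn target_lib_dir(sysroot: &Path, target: &str) -> PathBuf {
    sysroot.join("lib").join("rustlib").join(target).join("lib")
}

fn main() {
    // Ask whichever `rustc` is on PATH for its sysroot; give up if that fails.
    let output = match Command::new("rustc").args(["--print", "sysroot"]).output() {
        Ok(out) if out.status.success() => out,
        _ => {
            eprintln!("could not run `rustc --print sysroot`");
            return;
        }
    };
    let sysroot = PathBuf::from(String::from_utf8_lossy(&output.stdout).trim());

    println!("compiler libs: {}", compiler_lib_dir(&sysroot).display());
    println!(
        "target libs:   {}",
        target_lib_dir(&sysroot, "x86_64-unknown-linux-gnu").display()
    );
}
```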
315
316 #### -Z force-unstable-if-unmarked
317
318 Since `rustlib` is part of the search path, it means we have to be careful
319 about which crates are included in it. In particular, all crates except for
320 the standard library are built with the flag `-Z force-unstable-if-unmarked`,
321 which means that you have to use `#![feature(rustc_private)]` in order to
322 load them (as opposed to the standard library, which is always available).
323
324 The `-Z force-unstable-if-unmarked` flag has a variety of purposes to help
325 enforce that the correct crates are marked as unstable. It was introduced
326 primarily to allow rustc and the standard library to link to arbitrary crates
327 on crates.io which do not themselves use `staged_api`. `rustc` also relies on
328 this flag to mark all of its crates as unstable with the `rustc_private`
329 feature so that each crate does not need to be carefully marked with
330 `unstable`.
331
332 This flag is automatically applied to all of `rustc` and the standard library
333 by the bootstrap scripts. This is needed because the compiler and all of its
334 dependencies are shipped in the sysroot to all users.
335
336 This flag has the following effects:
337
338 - Marks the crate as "unstable" with the `rustc_private` feature if it is not
339 itself marked as stable or unstable.
340 - Allows these crates to access other forced-unstable crates without any need
341 for attributes. Normally a crate would need a `#![feature(rustc_private)]`
342 attribute to use other unstable crates. However, that would make it
343 impossible for a crate from crates.io to access its own dependencies since
344 that crate won't have a `feature(rustc_private)` attribute, but *everything*
345 is compiled with `-Z force-unstable-if-unmarked`.
346
347 Code which does not use `-Z force-unstable-if-unmarked` should include the
348 `#![feature(rustc_private)]` crate attribute to access these force-unstable
349 crates. This is needed for things that link `rustc`, such as `miri`, `rls`, or
350 `clippy`.
351
352 You can find more discussion about sysroots in:
353 - The [rustdoc PR] explaining why it uses `extern crate` for dependencies loaded from sysroot
354 - [Discussions about sysroot on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/deps.20in.20sysroot/)
355 - [Discussions about building rustdoc out of tree](https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/How.20to.20create.20an.20executable.20accessing.20.60rustc_private.60.3F)
356
357 [rustdoc PR]: https://github.com/rust-lang/rust/pull/76728
358
359 ### Directories and artifacts generated by x.py
360
361 The following tables indicate the outputs of various stage actions:
362
363 | Stage 0 Action | Output |
364 |-----------------------------------------------------------|----------------------------------------------|
365 | `beta` extracted | `build/HOST/stage0` |
366 | `stage0` builds `bootstrap` | `build/bootstrap` |
367 | `stage0` builds `test`/`std` | `build/HOST/stage0-std/TARGET` |
368 | copy `stage0-std` (HOST only) | `build/HOST/stage0-sysroot/lib/rustlib/HOST` |
369 | `stage0` builds `rustc` with `stage0-sysroot` | `build/HOST/stage0-rustc/HOST` |
370 | copy `stage0-rustc` (except executable) | `build/HOST/stage0-sysroot/lib/rustlib/HOST` |
371 | build `llvm` | `build/HOST/llvm` |
372 | `stage0` builds `codegen` with `stage0-sysroot` | `build/HOST/stage0-codegen/HOST` |
373 | `stage0` builds `rustdoc`, `clippy`, `miri`, with `stage0-sysroot` | `build/HOST/stage0-tools/HOST` |
374
375 `--stage=0` stops here.
376
377 | Stage 1 Action | Output |
378 |-----------------------------------------------------|---------------------------------------|
379 | copy (uplift) `stage0-rustc` executable to `stage1` | `build/HOST/stage1/bin` |
380 | copy (uplift) `stage0-codegen` to `stage1` | `build/HOST/stage1/lib` |
381 | copy (uplift) `stage0-sysroot` to `stage1` | `build/HOST/stage1/lib` |
382 | `stage1` builds `test`/`std` | `build/HOST/stage1-std/TARGET` |
383 | copy `stage1-std` (HOST only) | `build/HOST/stage1/lib/rustlib/HOST` |
384 | `stage1` builds `rustc` | `build/HOST/stage1-rustc/HOST` |
385 | copy `stage1-rustc` (except executable) | `build/HOST/stage1/lib/rustlib/HOST` |
386 | `stage1` builds `codegen` | `build/HOST/stage1-codegen/HOST` |
387
388 `--stage=1` stops here.
389
390 | Stage 2 Action | Output |
391 |--------------------------------------------------------|-----------------------------------------------------------------|
392 | copy (uplift) `stage1-rustc` executable | `build/HOST/stage2/bin` |
393 | copy (uplift) `stage1-sysroot` | `build/HOST/stage2/lib` and `build/HOST/stage2/lib/rustlib/HOST` |
394 | `stage2` builds `test`/`std` (not HOST targets) | `build/HOST/stage2-std/TARGET` |
395 | copy `stage2-std` (not HOST targets) | `build/HOST/stage2/lib/rustlib/TARGET` |
396 | `stage2` builds `rustdoc`, `clippy`, `miri` | `build/HOST/stage2-tools/HOST` |
397 | copy `rustdoc` | `build/HOST/stage2/bin` |
398
399 `--stage=2` stops here.
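
The output paths in the tables follow a regular naming pattern, which the sketch below reproduces. The functions `stage_out_dir` and `assembled_dir` are hypothetical helpers written for this illustration; they only restate the pattern visible in the tables and are not taken from `src/bootstrap`.

```rust
use std::path::PathBuf;

/// Cargo output directory for a stage: `build/HOST/stageN-KIND/TARGET`,
/// where KIND is e.g. `std`, `rustc`, `codegen` or `tools`.
fn stage_out_dir(host: &str, stage: u32, kind: &str, target: &str) -> PathBuf {
    PathBuf::from("build")
        .join(host)
        .join(format!("stage{}-{}", stage, kind))
        .join(target)
}

/// Directory of the assembled (uplifted) stage N compiler: `build/HOST/stageN`.
fn assembled_dir(host: &str, stage: u32) -> PathBuf {
    PathBuf::from("build").join(host).join(format!("stage{}", stage))
}

fn main() {
    let host = "x86_64-unknown-linux-gnu";
    // `stage0` builds `std` for the host.
    println!("{}", stage_out_dir(host, 0, "std", host).display());
    // `stage0` builds `rustc`.
    println!("{}", stage_out_dir(host, 0, "rustc", host).display());
    // The uplifted stage1 compiler executable ends up under `bin`.
    println!("{}", assembled_dir(host, 1).join("bin").display());
}
```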
400
401 ## Passing stage-specific flags to `rustc`
402
403 `x.py` allows you to pass stage-specific flags to `rustc` when bootstrapping.
404 The `RUSTFLAGS_BOOTSTRAP` environment variable is passed as `RUSTFLAGS` to the bootstrap stage
405 (stage0), and `RUSTFLAGS_NOT_BOOTSTRAP` is passed when building artifacts for later stages.
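
The choice between the two variables can be sketched as follows. The function `rustflags_var_for` is a hypothetical name used only for this illustration; it restates the rule above and is not the bootstrap implementation.

```rust
use std::env;

/// Name of the environment variable whose contents are passed as
/// `RUSTFLAGS` when building with the compiler of the given stage.
fn rustflags_var_for(compiler_stage: u32) -> &'static str {
    if compiler_stage == 0 {
        "RUSTFLAGS_BOOTSTRAP"
    } else {
        "RUSTFLAGS_NOT_BOOTSTRAP"
    }
}

fn main() {
    for stage in 0..3 {
        let var = rustflags_var_for(stage);
        // An unset variable is treated as "no extra flags".
        let flags = env::var(var).unwrap_or_default();
        println!("stage {}: {} = {:?}", stage, var, flags);
    }
}
```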
406
407 ## Environment Variables
408
409 During bootstrapping, a number of compiler-internal environment
410 variables are used. If you are trying to run an intermediate version of
411 `rustc`, sometimes you may need to set some of these environment variables
412 manually. Otherwise, you get an error like the following:
413
414 ```text
415 thread 'main' panicked at 'RUSTC_STAGE was not set: NotPresent', library/core/src/result.rs:1165:5
416 ```
417
418 If `./stageN/bin/rustc` gives an error about environment variables, that
419 usually means something is quite wrong -- or you're trying to compile e.g.
420 `rustc` or `std` or something that depends on environment variables. In
421 the unlikely case that you actually need to invoke rustc in such a situation,
422 you can find the environment variable values by adding the following flag to
423 your `x.py` command: `--on-fail=print-env`.
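
To see how a message like the one above can arise, consider a program that requires an environment variable and reports its absence. The sketch below is not the compiler's code: `parse_stage` is a hypothetical function, and treating `RUSTC_STAGE` as a plain number is an assumption made for the example.

```rust
use std::env::{self, VarError};

/// Turns the result of looking up `RUSTC_STAGE` into a stage number,
/// or into an error message resembling the panic shown above.
fn parse_stage(raw: Result<String, VarError>) -> Result<u32, String> {
    let value = raw.map_err(|e| format!("RUSTC_STAGE was not set: {:?}", e))?;
    value
        .trim()
        .parse::<u32>()
        .map_err(|e| format!("RUSTC_STAGE is not a number: {}", e))
}

fn main() {
    match parse_stage(env::var("RUSTC_STAGE")) {
        Ok(stage) => println!("running as stage {}", stage),
        // Report the problem instead of panicking.
        Err(msg) => eprintln!("error: {}", msg),
    }
}
```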