# Bootstrapping the Compiler

<!-- toc -->

[*Bootstrapping*][boot] is the process of using a compiler to compile itself.
More accurately, it means using an older compiler to compile a newer version
of the same compiler.

This raises a chicken-and-egg paradox: where did the first compiler come from?
It must have been written in a different language. In Rust's case it was
[written in OCaml][ocaml-compiler]. However, it was abandoned long ago, and the
only way to build a modern version of rustc is with a slightly less modern
version.

This is exactly how `x.py` works: it downloads the current beta release of
rustc, then uses it to compile the new compiler.

## Stages of bootstrapping

Compiling `rustc` is done in stages. Here's a diagram, adapted from Joshua Nelson's
[talk on bootstrapping][rustconf22-talk] at RustConf 2022, with detailed explanations below.

The `A`, `B`, `C`, and `D` show the ordering of the stages of bootstrapping.
<span style="background-color: lightblue; color: black">Blue</span> nodes are downloaded,
<span style="background-color: yellow; color: black">yellow</span> nodes are built with the
stage0 compiler, and
<span style="background-color: lightgreen; color: black">green</span> nodes are built with the
stage1 compiler.

[rustconf22-talk]: https://rustconf.com/schedule#bootstrapping-the-once-and-future-compiler

```mermaid
graph TD
    s0c["stage0 compiler (1.63)"]:::downloaded -->|A| s0l("stage0 std (1.64)"):::with-s0c;
    s0c & s0l --- stepb[ ]:::empty;
    stepb -->|B| s0ca["stage0 compiler artifacts (1.64)"]:::with-s0c;
    s0ca -->|copy| s1c["stage1 compiler (1.64)"]:::with-s0c;
    s1c -->|C| s1l("stage1 std (1.64)"):::with-s1c;
    s1c & s1l --- stepd[ ]:::empty;
    stepd -->|D| s1ca["stage1 compiler artifacts (1.64)"]:::with-s1c;
    s1ca -->|copy| s2c["stage2 compiler"]:::with-s1c;

    classDef empty width:0px,height:0px;
    classDef downloaded fill: lightblue;
    classDef with-s0c fill: yellow;
    classDef with-s1c fill: lightgreen;
```

### Stage 0

The stage0 compiler is usually the current _beta_ `rustc` compiler
and its associated dynamic libraries,
which `x.py` will download for you.
(You can also configure `x.py` to use something else.)

The stage0 compiler is then used only to compile `rustbuild`, `std`, and `rustc`.
When compiling `rustc`, the stage0 compiler uses the freshly compiled `std`.
There are two concepts at play here:
a compiler (with its set of dependencies)
and its 'target' or 'object' libraries (`std` and `rustc`).
Both are staged, but in a staggered manner.

### Stage 1

The rustc source code is then compiled with the stage0 compiler to produce the stage1 compiler.

### Stage 2

We then rebuild our stage1 compiler with itself to produce the stage2 compiler.

In theory, the stage1 compiler is functionally identical to the stage2 compiler,
but in practice there are subtle differences.
In particular, the stage1 compiler itself was built by stage0
and hence not by the source in your working directory.
This means that the symbol names used in the compiler source
may not match the symbol names that would have been made by the stage1 compiler,
which can cause problems for dynamic libraries and tests.

The `stage2` compiler is the one distributed with `rustup` and all other install methods.
However, it takes a very long time to build,
because one must first build the new compiler with an older compiler
and then use that to build the new compiler with itself.
For development, you usually only want the `stage1` compiler,
which you can build with `./x.py build library`.
See [Building the Compiler](./how-to-build-and-run.html#building-the-compiler).

### Stage 3

Stage 3 is optional. To sanity check our new compiler, we
can build the libraries with the stage2 compiler. The result ought
to be identical to before, unless something has broken.

### Building the stages

`x.py` tries to be helpful and pick the stage you most likely meant for each subcommand.
These defaults are as follows:

- `check`: `--stage 0`
- `doc`: `--stage 0`
- `build`: `--stage 1`
- `test`: `--stage 1`
- `dist`: `--stage 2`
- `install`: `--stage 2`
- `bench`: `--stage 2`

You can always override the stage by passing `--stage N` explicitly.

For more information about stages, [see below](#understanding-stages-of-bootstrap).

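The defaults listed above can be written down as a small lookup. This is only a sketch for illustration: `default_stage` and `effective_stage` are hypothetical names, not functions from `src/bootstrap`, and the real selection logic may take more inputs than the subcommand.

```rust
/// Default `--stage` for an `x.py` subcommand, mirroring the list in this section.
/// Hypothetical helper for illustration; not bootstrap's real API.
fn default_stage(subcommand: &str) -> Option<u32> {
    match subcommand {
        "check" | "doc" => Some(0),
        "build" | "test" => Some(1),
        "dist" | "install" | "bench" => Some(2),
        _ => None, // subcommands not covered by the list
    }
}

/// An explicit `--stage N` always overrides the default.
fn effective_stage(subcommand: &str, explicit: Option<u32>) -> Option<u32> {
    explicit.or_else(|| default_stage(subcommand))
}

fn main() {
    for cmd in &["check", "doc", "build", "test", "dist", "install", "bench"] {
        println!("{}: --stage {}", cmd, default_stage(cmd).unwrap());
    }
    // Passing `--stage 2` to `check` overrides its default of 0.
    println!("check --stage 2 -> {:?}", effective_stage("check", Some(2)));
}
```
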
## Complications of bootstrapping

Since the build system uses the current beta compiler to build the stage-1
bootstrapping compiler, the compiler source code can't use some features
until they reach beta (because otherwise the beta compiler doesn't support
them). On the other hand, for [compiler intrinsics][intrinsics] and internal
features, the features _have_ to be used. Additionally, the compiler makes
heavy use of nightly features (`#![feature(...)]`). How can we resolve this
problem?

There are two methods used:
1. The build system sets `--cfg bootstrap` when building with `stage0`, so we
can use `cfg(not(bootstrap))` to only use features when built with `stage1`.
This is useful for e.g. features that were just stabilized, which require
`#![feature(...)]` when built with `stage0`, but not for `stage1`.
2. The build system sets `RUSTC_BOOTSTRAP=1`. This special variable means to
_break the stability guarantees_ of Rust: it allows using `#![feature(...)]` with
a compiler that's not nightly. This should never be used except when
bootstrapping the compiler.

[boot]: https://en.wikipedia.org/wiki/Bootstrapping_(compilers)
[intrinsics]: ../appendix/glossary.md#intrinsic
[ocaml-compiler]: https://github.com/rust-lang/rust/tree/ef75860a0a72f79f97216f8aaa5b388d98da6480/src/boot

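As a rough illustration of the first method, the sketch below gates two versions of one function on `cfg(bootstrap)`. The function name `built_by` and its return strings are made up for this example. In the real tree the same cfg is more often used to gate `#![feature(...)]` attributes than whole functions.

```rust
/// Compiled only when the build system passes `--cfg bootstrap`,
/// i.e. when the crate is being built by the stage0 (beta) compiler.
#[cfg(bootstrap)]
fn built_by() -> &'static str {
    "stage0 (beta) compiler"
}

/// Compiled by every later stage, where `--cfg bootstrap` is not set.
#[cfg(not(bootstrap))]
fn built_by() -> &'static str {
    "stage1 or later compiler"
}

fn main() {
    println!("this crate was built by the {}", built_by());
}
```

Compiled on its own with a plain `rustc` invocation, the `not(bootstrap)` version is selected; adding `--cfg bootstrap` to the command line selects the other one. Recent compilers may warn that `bootstrap` is an unexpected cfg name when this is built outside the bootstrap system.
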
## Contributing to bootstrap

When you use the bootstrap system, you'll call it through `x.py`.
However, most of the code lives in `src/bootstrap`.
`bootstrap` has a difficult problem: it is written in Rust, yet it is run
before the Rust compiler is built! To work around this, there are two
components of bootstrap: the main one written in Rust, and `bootstrap.py`.
`bootstrap.py` is what gets run by `x.py`. It takes care of downloading the
`stage0` compiler, which will then build the bootstrap binary written in
Rust.

Because there are two separate codebases behind `x.py`, they need to
be kept in sync. In particular, both `bootstrap.py` and the bootstrap binary
parse `config.toml` and read the same command line arguments. `bootstrap.py`
keeps these in sync by setting various environment variables, and the
programs sometimes have to add arguments that are explicitly ignored, to be
read by the other.

### Adding a setting to config.toml

This section is a work in progress. In the meantime, you can see an example
contribution [here][bootstrap-build].

[bootstrap-build]: https://github.com/rust-lang/rust/pull/71994

## Understanding stages of bootstrap

### Overview

This is a detailed look into the separate bootstrap stages.

The convention `x.py` uses is that:

- A `--stage N` flag means to run the stage N compiler (`stageN/rustc`).
- A "stage N artifact" is a build artifact that is _produced_ by the stage N compiler.
- The stage N+1 compiler is assembled from stage N *artifacts*. This
  process is called _uplifting_.

#### Build artifacts

Anything you can build with `x.py` is a _build artifact_.
Build artifacts include, but are not limited to:

- binaries, like `stage0-rustc/rustc-main`
- shared objects, like `stage0-sysroot/rustlib/libstd-6fae108520cf72fe.so`
- [rlib] files, like `stage0-sysroot/rustlib/libstd-6fae108520cf72fe.rlib`
- HTML files generated by rustdoc, like `doc/std`

[rlib]: ../serialization.md

#### Examples

- `./x.py build --stage 0` means to build with the beta `rustc`.
- `./x.py doc --stage 0` means to document using the beta `rustdoc`.
- `./x.py test --stage 0 library/std` means to run tests on the standard library
  without building `rustc` from source ('build with stage 0, then test the
  artifacts'). If you're working on the standard library, this is normally the
  test command you want.
- `./x.py test src/test/ui` means to build the stage 1 compiler and run
  `compiletest` on it. If you're working on the compiler, this is normally the
  test command you want.

#### Examples of what *not* to do

- `./x.py test --stage 0 src/test/ui` is not useful: it runs tests on the
  _beta_ compiler and doesn't build `rustc` from source. Use `test src/test/ui`
  instead, which builds stage 1 from source.
- `./x.py test --stage 0 compiler/rustc` builds the compiler but runs no tests:
  it's running `cargo test -p rustc`, but cargo doesn't understand Rust's
  tests. You shouldn't need to use this; use `test` instead (without arguments).
- `./x.py build --stage 0 compiler/rustc` builds the compiler, but does not build
  libstd or even libcore. Most of the time, you'll want `./x.py build
  library` instead, which allows compiling programs without needing to define
  lang items.

### Building vs. running

Note that `build --stage N compiler/rustc` **does not** build the stage N compiler:
instead it builds the stage N+1 compiler _using_ the stage N compiler.

In short, _stage 0 uses the stage0 compiler to create stage0 artifacts which
will later be uplifted to be the stage1 compiler_.

In each stage, two major steps are performed:

1. `std` is compiled by the stage N compiler.
2. That `std` is linked to programs built by the stage N compiler,
   including the stage N artifacts (stage N+1 compiler).

This is somewhat intuitive if one thinks of the stage N artifacts as "just"
another program we are building with the stage N compiler:
`build --stage N compiler/rustc` is linking the stage N artifacts to the `std`
built by the stage N compiler.

Here is a chart of a full build using `x.py`:

<img alt="A diagram of the rustc compilation phases" src="../img/rustc_stages.svg" class="center" />

Keep in mind this diagram is a simplification: for example, `rustdoc` can be built at
different stages, and the process is a bit different when passing flags such as
`--keep-stage`, or if there are non-host targets.

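The numbering convention in this section (stage N compiler produces stage N artifacts, which are uplifted into the stage N+1 compiler) can be modelled in a few lines. This is a toy model only: the types `Compiler` and `RustcArtifacts` and the functions `build_rustc` and `uplift` are hypothetical and do not correspond to anything in `src/bootstrap`.

```rust
/// A compiler identified by its stage number (`stageN/rustc`).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Compiler {
    stage: u32,
}

/// Compiler artifacts, tagged with the stage of the compiler that produced them.
#[derive(Debug, Clone, Copy, PartialEq)]
struct RustcArtifacts {
    built_by_stage: u32,
}

impl Compiler {
    /// Models `build --stage N compiler/rustc`: run the stage N compiler
    /// over the compiler sources, producing stage N artifacts.
    fn build_rustc(self) -> RustcArtifacts {
        RustcArtifacts { built_by_stage: self.stage }
    }
}

/// Uplifting: stage N artifacts are assembled into the stage N+1 compiler.
fn uplift(artifacts: RustcArtifacts) -> Compiler {
    Compiler { stage: artifacts.built_by_stage + 1 }
}

fn main() {
    let stage0 = Compiler { stage: 0 }; // the downloaded beta compiler
    let stage1 = uplift(stage0.build_rustc());
    let stage2 = uplift(stage1.build_rustc());
    println!("{:?} -> {:?} -> {:?}", stage0, stage1, stage2);
}
```

The point of the model is that building and running are off by one: the artifacts carry the number of the compiler that built them, and only the uplift step increments it.
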
### Stages and `std`

Note that there are two `std` libraries in play here:
1. The library _linked_ to `stageN/rustc`, which was built by stage N-1 (stage N-1 `std`)
2. The library _used to compile programs_ with `stageN/rustc`, which was
   built by stage N (stage N `std`).

Stage N `std` is pretty much necessary for any useful work with the stage N compiler.
Without it, you can only compile programs with `#![no_core]` -- not terribly useful!

These need to be different because they aren't necessarily ABI-compatible:
there could be new layout optimizations, changes to MIR, or other changes
to Rust metadata on nightly that aren't present in beta.

This is also where `--keep-stage 1 library/std` comes into play. Since most
changes to the compiler don't actually change the ABI, once you've produced a
`std` in stage 1, you can probably just reuse it with a different compiler.
If the ABI hasn't changed, you're good to go, no need to spend time
recompiling that `std`.
`--keep-stage` simply assumes the previous compile is fine and copies those
artifacts into the appropriate place, skipping the cargo invocation.

### Cross-compiling rustc

*Cross-compiling* is the process of compiling code that will run on another architecture.
For instance, you might want to build an ARM version of rustc using an x86 machine.
Building stage2 `std` is different when you are cross-compiling.

This is because `x.py` uses a trick: if `HOST` and `TARGET` are the same,
it will reuse stage1 `std` for stage2! This is sound because stage1 `std`
was compiled with the stage1 compiler, i.e. a compiler using the source code
you currently have checked out. So it should be identical (and therefore ABI-compatible)
to the `std` that `stage2/rustc` would compile.

However, when cross-compiling, stage1 `std` will only run on the host.
So the stage2 compiler has to recompile `std` for the target.

(See in the tables below how stage2 only builds `std` for non-host targets.)

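The reuse rule described in this section boils down to a single comparison of the two triples. The sketch below is a hypothetical model of that decision; `Stage2Std` and `stage2_std_plan` are invented names, and the real bootstrap code handles more cases (multiple targets, `--keep-stage`, and so on).

```rust
/// What happens to `std` when assembling stage2 (hypothetical model).
#[derive(Debug, PartialEq)]
enum Stage2Std {
    /// HOST == TARGET: the stage1 `std` is reused as-is.
    ReuseStage1,
    /// Cross-compiling: the stage2 compiler rebuilds `std` for the target.
    Rebuild,
}

/// Decide how stage2 `std` is obtained for one HOST/TARGET pair.
fn stage2_std_plan(host: &str, target: &str) -> Stage2Std {
    if host == target {
        Stage2Std::ReuseStage1
    } else {
        Stage2Std::Rebuild
    }
}

fn main() {
    let host = "x86_64-unknown-linux-gnu";
    for target in &["x86_64-unknown-linux-gnu", "aarch64-unknown-linux-gnu"] {
        println!("{} -> {}: {:?}", host, target, stage2_std_plan(host, target));
    }
}
```
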
### Why does only libstd use `cfg(bootstrap)`?

The `rustc` generated by the stage0 compiler is linked to the freshly-built
`std`, which means that for the most part only `std` needs to be cfg-gated,
so that `rustc` can use features added to std immediately after their addition,
without need for them to get into the downloaded beta.

Note this is different from any other Rust program: stage1 `rustc`
is built by the _beta_ compiler, but using the _master_ version of libstd!

The only time `rustc` uses `cfg(bootstrap)` is when it adds internal lints
that use diagnostic items. This happens very rarely.

### What is a 'sysroot'?

When you build a project with cargo, the build artifacts for dependencies
are normally stored in `target/debug/deps`. This only contains dependencies cargo
knows about; in particular, it doesn't have the standard library. Where do
`std` or `proc_macro` come from? They come from the **sysroot**, the root
of a number of directories where the compiler loads build artifacts at runtime.
The sysroot doesn't just store the standard library, though - it includes
anything that needs to be loaded at runtime. That includes (but is not limited
to):

- `libstd`/`libtest`/`libproc_macro`
- The compiler crates themselves, when using `rustc_private`. In-tree these
  are always present; out of tree, you need to install `rustc-dev` with rustup.
- `libLLVM.so`, the shared object file for the LLVM project. In-tree this is
  either built from source or downloaded from CI; out-of-tree, you need to
  install `llvm-tools-preview` with rustup.

All the artifacts listed so far are *compiler* runtime dependencies. You can
see them with `rustc --print sysroot`:

```
$ ls $(rustc --print sysroot)/lib
libchalk_derive-0685d79833dc9b2b.so  libstd-25c6acf8063a3802.so
libLLVM-11-rust-1.50.0-nightly.so    libtest-57470d2aa8f7aa83.so
librustc_driver-4f0cc9f50e53f0ba.so  libtracing_attributes-e4be92c35ab2a33b.so
librustc_macros-5f0ec4a119c6ac86.so  rustlib
```

There are also runtime dependencies for the standard library! These are in
`lib/rustlib`, not `lib/` directly.

```
$ ls $(rustc --print sysroot)/lib/rustlib/x86_64-unknown-linux-gnu/lib | head -n 5
libaddr2line-6c8e02b8fedc1e5f.rlib
libadler-9ef2480568df55af.rlib
liballoc-9c4002b5f79ba0e1.rlib
libcfg_if-512eb53291f6de7e.rlib
libcompiler_builtins-ef2408da76957905.rlib
```

`rustlib` includes libraries like `hashbrown` and `cfg_if`, which are not part
of the public API of the standard library, but are used to implement it.
`rustlib` is part of the search path for linkers, but `lib` will never be part
of the search path.

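The two directories shown in the listings above can also be located programmatically. The sketch below shells out to `rustc --print sysroot`, the same command used in the listings, and joins the paths by hand. The helper names `sysroot`, `compiler_lib_dir` and `target_lib_dir` are made up for this example, it assumes a `rustc` on `PATH`, and it assumes the Linux layout shown above.

```rust
use std::path::{Path, PathBuf};
use std::process::Command;

/// Ask the `rustc` found on `PATH` for its sysroot.
fn sysroot() -> std::io::Result<PathBuf> {
    let output = Command::new("rustc").args(&["--print", "sysroot"]).output()?;
    let text = String::from_utf8_lossy(&output.stdout);
    Ok(PathBuf::from(text.trim()))
}

/// Compiler runtime dependencies live directly in `lib/`.
fn compiler_lib_dir(sysroot: &Path) -> PathBuf {
    sysroot.join("lib")
}

/// Standard-library artifacts for `target` live under `lib/rustlib/<target>/lib`.
fn target_lib_dir(sysroot: &Path, target: &str) -> PathBuf {
    sysroot.join("lib").join("rustlib").join(target).join("lib")
}

fn main() -> std::io::Result<()> {
    let root = sysroot()?;
    println!("compiler libs: {}", compiler_lib_dir(&root).display());
    println!(
        "std libs:      {}",
        target_lib_dir(&root, "x86_64-unknown-linux-gnu").display()
    );
    Ok(())
}
```
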
#### -Z force-unstable-if-unmarked

Since `rustlib` is part of the search path, we have to be careful
about which crates are included in it. In particular, all crates except for
the standard library are built with the flag `-Z force-unstable-if-unmarked`,
which means that you have to use `#![feature(rustc_private)]` in order to
load them (as opposed to the standard library, which is always available).

The `-Z force-unstable-if-unmarked` flag has a variety of purposes to help
enforce that the correct crates are marked as unstable. It was introduced
primarily to allow rustc and the standard library to link to arbitrary crates
on crates.io which do not themselves use `staged_api`. `rustc` also relies on
this flag to mark all of its crates as unstable with the `rustc_private`
feature so that each crate does not need to be carefully marked with
`unstable`.

This flag is automatically applied to all of `rustc` and the standard library
by the bootstrap scripts. This is needed because the compiler and all of its
dependencies are shipped in the sysroot to all users.

This flag has the following effects:

- Marks the crate as "unstable" with the `rustc_private` feature if it is not
  itself marked as stable or unstable.
- Allows these crates to access other forced-unstable crates without any need
  for attributes. Normally a crate would need a `#![feature(rustc_private)]`
  attribute to use other unstable crates. However, that would make it
  impossible for a crate from crates.io to access its own dependencies since
  that crate won't have a `feature(rustc_private)` attribute, but *everything*
  is compiled with `-Z force-unstable-if-unmarked`.

Code which does not use `-Z force-unstable-if-unmarked` should include the
`#![feature(rustc_private)]` crate attribute to access these force-unstable
crates. This is needed for things that link `rustc`, such as `miri`, `rls`, or
`clippy`.

You can find more discussion about sysroots in:
- The [rustdoc PR] explaining why it uses `extern crate` for dependencies loaded from sysroot
- [Discussions about sysroot on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/deps.20in.20sysroot/)
- [Discussions about building rustdoc out of tree](https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/How.20to.20create.20an.20executable.20accessing.20.60rustc_private.60.3F)

[rustdoc PR]: https://github.com/rust-lang/rust/pull/76728

### Directories and artifacts generated by `x.py`

The following tables indicate the outputs of various stage actions:

| Stage 0 Action                                                     | Output                                       |
|--------------------------------------------------------------------|----------------------------------------------|
| `beta` extracted                                                   | `build/HOST/stage0`                          |
| `stage0` builds `bootstrap`                                        | `build/bootstrap`                            |
| `stage0` builds `test`/`std`                                       | `build/HOST/stage0-std/TARGET`               |
| copy `stage0-std` (HOST only)                                      | `build/HOST/stage0-sysroot/lib/rustlib/HOST` |
| `stage0` builds `rustc` with `stage0-sysroot`                      | `build/HOST/stage0-rustc/HOST`               |
| copy `stage0-rustc` (except executable)                            | `build/HOST/stage0-sysroot/lib/rustlib/HOST` |
| build `llvm`                                                       | `build/HOST/llvm`                            |
| `stage0` builds `codegen` with `stage0-sysroot`                    | `build/HOST/stage0-codegen/HOST`             |
| `stage0` builds `rustdoc`, `clippy`, `miri`, with `stage0-sysroot` | `build/HOST/stage0-tools/HOST`               |

`--stage=0` stops here.

| Stage 1 Action                                      | Output                                |
|-----------------------------------------------------|---------------------------------------|
| copy (uplift) `stage0-rustc` executable to `stage1` | `build/HOST/stage1/bin`               |
| copy (uplift) `stage0-codegen` to `stage1`          | `build/HOST/stage1/lib`               |
| copy (uplift) `stage0-sysroot` to `stage1`          | `build/HOST/stage1/lib`               |
| `stage1` builds `test`/`std`                        | `build/HOST/stage1-std/TARGET`        |
| copy `stage1-std` (HOST only)                       | `build/HOST/stage1/lib/rustlib/HOST`  |
| `stage1` builds `rustc`                             | `build/HOST/stage1-rustc/HOST`        |
| copy `stage1-rustc` (except executable)             | `build/HOST/stage1/lib/rustlib/HOST`  |
| `stage1` builds `codegen`                           | `build/HOST/stage1-codegen/HOST`      |

`--stage=1` stops here.

| Stage 2 Action                                  | Output                                                           |
|-------------------------------------------------|------------------------------------------------------------------|
| copy (uplift) `stage1-rustc` executable         | `build/HOST/stage2/bin`                                          |
| copy (uplift) `stage1-sysroot`                  | `build/HOST/stage2/lib` and `build/HOST/stage2/lib/rustlib/HOST` |
| `stage2` builds `test`/`std` (not HOST targets) | `build/HOST/stage2-std/TARGET`                                   |
| copy `stage2-std` (not HOST targets)            | `build/HOST/stage2/lib/rustlib/TARGET`                           |
| `stage2` builds `rustdoc`, `clippy`, `miri`     | `build/HOST/stage2-tools/HOST`                                   |
| copy `rustdoc`                                  | `build/HOST/stage2/bin`                                          |

`--stage=2` stops here.

## Passing stage-specific flags to `rustc`

`x.py` allows you to pass stage-specific flags to `rustc` when bootstrapping.
The `RUSTFLAGS_BOOTSTRAP` environment variable is passed as `RUSTFLAGS` to the bootstrap stage
(stage0), and `RUSTFLAGS_NOT_BOOTSTRAP` is passed when building artifacts for later stages.

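The split between the two variables can be pictured as a lookup keyed on the stage. The following is a sketch under stated assumptions, not bootstrap's implementation: `stage_flags_var` and `stage_rustflags` are hypothetical helpers, and the lookup is passed in as a closure so the logic can be exercised without touching the real environment.

```rust
/// Name of the stage-specific flags variable for a given compiler stage:
/// stage0 is the bootstrap stage, everything later is "not bootstrap".
fn stage_flags_var(stage: u32) -> &'static str {
    if stage == 0 {
        "RUSTFLAGS_BOOTSTRAP"
    } else {
        "RUSTFLAGS_NOT_BOOTSTRAP"
    }
}

/// Flags to hand to `rustc` for `stage`, given a lookup function for
/// environment variables (hypothetical helper for illustration).
fn stage_rustflags(stage: u32, lookup: impl Fn(&str) -> Option<String>) -> Option<String> {
    lookup(stage_flags_var(stage))
}

fn main() {
    for stage in 0..3 {
        // Read from the real process environment here.
        let flags = stage_rustflags(stage, |name| std::env::var(name).ok());
        println!("stage{}: {} = {:?}", stage, stage_flags_var(stage), flags);
    }
}
```
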
## Environment Variables

During bootstrapping, a number of compiler-internal environment
variables are used. If you are trying to run an intermediate version of
`rustc`, sometimes you may need to set some of these environment variables
manually. Otherwise, you get an error like the following:

```text
thread 'main' panicked at 'RUSTC_STAGE was not set: NotPresent', library/core/src/result.rs:1165:5
```

If `./stageN/bin/rustc` gives an error about environment variables, that
usually means something is quite wrong -- or you're trying to compile e.g.
`rustc` or `std` or something that depends on environment variables. In
the unlikely case that you actually need to invoke rustc in such a situation,
you can find the environment variable values by adding the following flag to
your `x.py` command: `--on-fail=print-env`.