]> git.proxmox.com Git - rustc.git/blame - src/doc/rustc-dev-guide/src/queries/incremental-compilation.md
Merge tag 'debian/1.52.1+dfsg1-1_exp2' into proxmox/buster
[rustc.git] / src / doc / rustc-dev-guide / src / queries / incremental-compilation.md
CommitLineData
a1dfa0c6
XL
1# Incremental compilation
2
6a06907d
XL
3<!-- toc -->
4
a1dfa0c6
XL
5The incremental compilation scheme is, in essence, a surprisingly
6simple extension to the overall query system. We'll start by describing
7a slightly simplified variant of the real thing – the "basic algorithm" –
8and then describe some possible improvements.
9
10## The basic algorithm
11
12The basic algorithm is
13called the **red-green** algorithm[^salsa]. The high-level idea is
14that, after each run of the compiler, we will save the results of all
15the queries that we do, as well as the **query DAG**. The
16**query DAG** is a [DAG] that indexes which queries executed which
6a06907d 17other queries. So, for example, there would be an [edge] from a query Q1
a1dfa0c6
XL
18to another query Q2 if computing Q1 required computing Q2 (note that
19because queries cannot depend on themselves, this results in a DAG and
20not a general graph).
21
22[DAG]: https://en.wikipedia.org/wiki/Directed_acyclic_graph
23
24On the next run of the compiler, then, we can sometimes reuse these
25query results to avoid re-executing a query. We do this by assigning
26every query a **color**:
27
28- If a query is colored **red**, that means that its result during
29 this compilation has **changed** from the previous compilation.
30- If a query is colored **green**, that means that its result is
31 the **same** as the previous compilation.
32
33There are two key insights here:
34
35- First, if all the inputs to query Q are colored green, then the
36 query Q **must** result in the same value as last time and hence
37 need not be re-executed (or else the compiler is not deterministic).
38- Second, even if some inputs to a query changes, it may be that it
39 **still** produces the same result as the previous compilation. In
40 particular, the query may only use part of its input.
41 - Therefore, after executing a query, we always check whether it
42 produced the same result as the previous time. **If it did,** we
43 can still mark the query as green, and hence avoid re-executing
44 dependent queries.
45
46### The try-mark-green algorithm
47
48At the core of incremental compilation is an algorithm called
49"try-mark-green". It has the job of determining the color of a given
50query Q (which must not have yet been executed). In cases where Q has
51red inputs, determining Q's color may involve re-executing Q so that
52we can compare its output, but if all of Q's inputs are green, then we
53can conclude that Q must be green without re-executing it or inspecting
54its value at all. In the compiler, this allows us to avoid
55deserializing the result from disk when we don't need it, and in fact
56enables us to sometimes skip *serializing* the result as well
57(see the refinements section below).
58
59Try-mark-green works as follows:
60
61- First check if the query Q was executed during the previous compilation.
62 - If not, we can just re-execute the query as normal, and assign it the
63 color of red.
64- If yes, then load the 'dependent queries' of Q.
65- If there is a saved result, then we load the `reads(Q)` vector from the
66 query DAG. The "reads" is the set of queries that Q executed during
67 its execution.
68 - For each query R in `reads(Q)`, we recursively demand the color
69 of R using try-mark-green.
70 - Note: it is important that we visit each node in `reads(Q)` in same order
71 as they occurred in the original compilation. See [the section on the
72 query DAG below](#dag).
73 - If **any** of the nodes in `reads(Q)` wind up colored **red**, then Q is
74 dirty.
75 - We re-execute Q and compare the hash of its result to the hash of the
76 result from the previous compilation.
77 - If the hash has not changed, we can mark Q as **green** and return.
78 - Otherwise, **all** of the nodes in `reads(Q)` must be **green**. In that
79 case, we can color Q as **green** and return.
80
81<a name="dag"></a>
82
83### The query DAG
84
85The query DAG code is stored in
6a06907d 86[`compiler/rustc_middle/src/dep_graph`][dep_graph]. Construction of the DAG is done
a1dfa0c6
XL
87by instrumenting the query execution.
88
89One key point is that the query DAG also tracks ordering; that is, for
90each query Q, we not only track the queries that Q reads, we track the
91**order** in which they were read. This allows try-mark-green to walk
92those queries back in the same order. This is important because once a
93subquery comes back as red, we can no longer be sure that Q will continue
94along the same path as before. That is, imagine a query like this:
95
96```rust,ignore
97fn main_query(tcx) {
98 if tcx.subquery1() {
99 tcx.subquery2()
100 } else {
101 tcx.subquery3()
102 }
103}
104```
105
106Now imagine that in the first compilation, `main_query` starts by
107executing `subquery1`, and this returns true. In that case, the next
108query `main_query` executes will be `subquery2`, and `subquery3` will
109not be executed at all.
110
111But now imagine that in the **next** compilation, the input has
112changed such that `subquery1` returns **false**. In this case, `subquery2`
113would never execute. If try-mark-green were to visit `reads(main_query)` out
114of order, however, it might visit `subquery2` before `subquery1`, and hence
115execute it.
116This can lead to ICEs and other problems in the compiler.
117
6a06907d 118[dep_graph]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/dep_graph/index.html
a1dfa0c6
XL
119
120## Improvements to the basic algorithm
121
122In the description of the basic algorithm, we said that at the end of
123compilation we would save the results of all the queries that were
124performed. In practice, this can be quite wasteful – many of those
125results are very cheap to recompute, and serializing and deserializing
126them is not a particular win. In practice, what we would do is to save
127**the hashes** of all the subqueries that we performed. Then, in select cases,
128we **also** save the results.
129
130This is why the incremental algorithm separates computing the
131**color** of a node, which often does not require its value, from
132computing the **result** of a node. Computing the result is done via a simple
133algorithm like so:
134
135- Check if a saved result for Q is available. If so, compute the color of Q.
136 If Q is green, deserialize and return the saved result.
137- Otherwise, execute Q.
138 - We can then compare the hash of the result and color Q as green if
139 it did not change.
140
532ac7d7 141## Resources
6a06907d
XL
142The initial design document can be found [here][initial-design], which expands
143on the memoization details, provides more high-level overview and motivation
144for this system.
532ac7d7 145
a1dfa0c6
XL
146# Footnotes
147
148[^salsa]: I have long wanted to rename it to the Salsa algorithm, but it never caught on. -@nikomatsakis
6a06907d
XL
149
150[edge]: https://en.wikipedia.org/wiki/Glossary_of_graph_theory_terms#edge
151[initial-design]: https://github.com/nikomatsakis/rustc-on-demand-incremental-design-doc/blob/master/0000-rustc-on-demand-and-incremental.md