]> git.proxmox.com Git - rustc.git/blame - src/doc/rustc-dev-guide/src/mir/index.md
New upstream version 1.58.1+dfsg1
[rustc.git] / src / doc / rustc-dev-guide / src / mir / index.md
CommitLineData
a1dfa0c6
XL
1# The MIR (Mid-level IR)
2
6a06907d
XL
3<!-- toc -->
4
a1dfa0c6
XL
5MIR is Rust's _Mid-level Intermediate Representation_. It is
6constructed from [HIR](../hir.html). MIR was introduced in
7[RFC 1211]. It is a radically simplified form of Rust that is used for
8certain flow-sensitive safety checks – notably the borrow checker! –
9and also for optimization and code generation.
10
11If you'd like a very high-level introduction to MIR, as well as some
12of the compiler concepts that it relies on (such as control-flow
13graphs and desugaring), you may enjoy the
14[rust-lang blog post that introduced MIR][blog].
15
16[blog]: https://blog.rust-lang.org/2016/04/19/MIR.html
17
18## Introduction to MIR
19
6a06907d 20MIR is defined in the [`compiler/rustc_middle/src/mir/`][mir] module, but much of the code
3c0e092e
XL
21that manipulates it is found in [`compiler/rustc_mir_build`][mirmanip_build],
22[`compiler/rustc_mir_transform`][mirmanip_transform], and
23[`compiler/rustc_mir_dataflow`][mirmanip_dataflow].
a1dfa0c6 24
60c5eb7d 25[RFC 1211]: https://rust-lang.github.io/rfcs/1211-mir.html
a1dfa0c6
XL
26
27Some of the key characteristics of MIR are:
28
29- It is based on a [control-flow graph][cfg].
30- It does not have nested expressions.
31- All types in MIR are fully explicit.
32
33[cfg]: ../appendix/background.html#cfg
34
35## Key MIR vocabulary
36
37This section introduces the key concepts of MIR, summarized here:
38
39- **Basic blocks**: units of the control-flow graph, consisting of:
40 - **statements:** actions with one successor
41 - **terminators:** actions with potentially multiple successors; always at
42 the end of a block
43 - (if you're not familiar with the term *basic block*, see the [background
44 chapter][cfg])
45- **Locals:** Memory locations allocated on the stack (conceptually, at
46 least), such as function arguments, local variables, and
47 temporaries. These are identified by an index, written with a
48 leading underscore, like `_1`. There is also a special "local"
49 (`_0`) allocated to store the return value.
50- **Places:** expressions that identify a location in memory, like `_1` or
51 `_1.f`.
52- **Rvalues:** expressions that produce a value. The "R" stands for
53 the fact that these are the "right-hand side" of an assignment.
54 - **Operands:** the arguments to an rvalue, which can either be a
55 constant (like `22`) or a place (like `_1`).
56
3c0e092e 57You can get a feeling for how MIR is constructed by translating simple
a1dfa0c6
XL
58programs into MIR and reading the pretty printed output. In fact, the
59playground makes this easy, since it supplies a MIR button that will
60show you the MIR for your program. Try putting this program into play
61(or [clicking on this link][sample-play]), and then clicking the "MIR"
62button on the top:
63
64[sample-play]: https://play.rust-lang.org/?gist=30074856e62e74e91f06abd19bd72ece&version=stable
65
66```rust
67fn main() {
68 let mut vec = Vec::new();
69 vec.push(1);
70 vec.push(2);
71}
72```
73
74You should see something like:
75
76```mir
77// WARNING: This output format is intended for human consumers only
78// and is subject to change without notice. Knock yourself out.
79fn main() -> () {
80 ...
81}
82```
83
84This is the MIR format for the `main` function.
85
86**Variable declarations.** If we drill in a bit, we'll see it begins
87with a bunch of variable declarations. They look like this:
88
89```mir
90let mut _0: (); // return place
74b04a01 91let mut _1: std::vec::Vec<i32>; // in scope 0 at src/main.rs:2:9: 2:16
a1dfa0c6
XL
92let mut _2: ();
93let mut _3: &mut std::vec::Vec<i32>;
94let mut _4: ();
95let mut _5: &mut std::vec::Vec<i32>;
96```
97
98You can see that variables in MIR don't have names, they have indices,
99like `_0` or `_1`. We also intermingle the user's variables (e.g.,
74b04a01
XL
100`_1`) with temporary values (e.g., `_2` or `_3`). You can tell apart
101user-defined variables because they have debuginfo associated to them (see below).
102
103**User variable debuginfo.** Below the variable declarations, we find the only
104hint that `_1` represents a user variable:
105```mir
106scope 1 {
107 debug vec => _1; // in scope 1 at src/main.rs:2:9: 2:16
108}
109```
110Each `debug <Name> => <Place>;` annotation describes a named user variable,
111and where (i.e. the place) a debugger can find the data of that variable.
112Here the mapping is trivial, but optimizations may complicate the place,
113or lead to multiple user variables sharing the same place.
114Additionally, closure captures are described using the same system, and so
115they're complicated even without optimizations, e.g.: `debug x => (*((*_1).0: &T));`.
116
117The "scope" blocks (e.g., `scope 1 { .. }`) describe the lexical structure of
118the source program (which names were in scope when), so any part of the program
119annotated with `// in scope 0` would be missing `vec`, if you were stepping
120through the code in a debugger, for example.
a1dfa0c6
XL
121
122**Basic blocks.** Reading further, we see our first **basic block** (naturally
123it may look slightly different when you view it, and I am ignoring some of the
124comments):
125
126```mir
127bb0: {
128 StorageLive(_1);
129 _1 = const <std::vec::Vec<T>>::new() -> bb2;
130}
131```
132
133A basic block is defined by a series of **statements** and a final
134**terminator**. In this case, there is one statement:
135
136```mir
137StorageLive(_1);
138```
139
140This statement indicates that the variable `_1` is "live", meaning
141that it may be used later – this will persist until we encounter a
142`StorageDead(_1)` statement, which indicates that the variable `_1` is
143done being used. These "storage statements" are used by LLVM to
144allocate stack space.
145
146The **terminator** of the block `bb0` is the call to `Vec::new`:
147
148```mir
149_1 = const <std::vec::Vec<T>>::new() -> bb2;
150```
151
152Terminators are different from statements because they can have more
153than one successor – that is, control may flow to different
154places. Function calls like the call to `Vec::new` are always
155terminators because of the possibility of unwinding, although in the
156case of `Vec::new` we are able to see that indeed unwinding is not
416331ca 157possible, and hence we list only one successor block, `bb2`.
a1dfa0c6
XL
158
159If we look ahead to `bb2`, we will see it looks like this:
160
161```mir
162bb2: {
163 StorageLive(_3);
164 _3 = &mut _1;
165 _2 = const <std::vec::Vec<T>>::push(move _3, const 1i32) -> [return: bb3, unwind: bb4];
166}
167```
168
169Here there are two statements: another `StorageLive`, introducing the `_3`
170temporary, and then an assignment:
171
172```mir
173_3 = &mut _1;
174```
175
176Assignments in general have the form:
177
178```text
179<Place> = <Rvalue>
180```
181
182A place is an expression like `_3`, `_3.f` or `*_3` – it denotes a
183location in memory. An **Rvalue** is an expression that creates a
184value: in this case, the rvalue is a mutable borrow expression, which
185looks like `&mut <Place>`. So we can kind of define a grammar for
186rvalues like so:
187
188```text
189<Rvalue> = & (mut)? <Place>
190 | <Operand> + <Operand>
191 | <Operand> - <Operand>
192 | ...
193
194<Operand> = Constant
195 | copy Place
196 | move Place
197```
198
199As you can see from this grammar, rvalues cannot be nested – they can
200only reference places and constants. Moreover, when you use a place,
201we indicate whether we are **copying it** (which requires that the
202place have a type `T` where `T: Copy`) or **moving it** (which works
203for a place of any type). So, for example, if we had the expression `x
60c5eb7d 204= a + b + c` in Rust, that would get compiled to two statements and a
a1dfa0c6
XL
205temporary:
206
207```mir
208TMP1 = a + b
209x = TMP1 + c
210```
211
212([Try it and see][play-abc], though you may want to do release mode to skip
213over the overflow checks.)
214
215[play-abc]: https://play.rust-lang.org/?gist=1751196d63b2a71f8208119e59d8a5b6&version=stable
216
217## MIR data types
218
6a06907d 219The MIR data types are defined in the [`compiler/rustc_middle/src/mir/`][mir]
a1dfa0c6
XL
220module. Each of the key concepts mentioned in the previous section
221maps in a fairly straightforward way to a Rust type.
222
6a06907d 223The main MIR data type is [`Body`]. It contains the data for a single
a1dfa0c6
XL
224function (along with sub-instances of Mir for "promoted constants",
225but [you can read about those below](#promoted)).
226
227- **Basic blocks**: The basic blocks are stored in the field
6a06907d
XL
228 [`Body::basic_blocks`][basicblocks]; this is a vector
229 of [`BasicBlockData`] structures. Nobody ever references a
230 basic block directly: instead, we pass around [`BasicBlock`]
231 values, which are [newtype'd] indices into this vector.
232- **Statements** are represented by the type [`Statement`].
233- **Terminators** are represented by the [`Terminator`].
234- **Locals** are represented by a [newtype'd] index type [`Local`].
235 The data for a local variable is found in the
236 [`Body::local_decls`][localdecls] vector). There is also a special constant
237 [`RETURN_PLACE`] identifying the special "local" representing the return value.
238- **Places** are identified by the enum [`Place`]. There are a few
239 variants:
a1dfa0c6
XL
240 - Local variables like `_1`
241 - Static variables `FOO`
242 - **Projections**, which are fields or other things that "project
6a06907d
XL
243 out" from a base place. These are represented by the type
244 [`ProjectionElem`]. So e.g. the place `_1.f` is a projection,
245 with `f` being the "projection element" and `_1` being the base
a1dfa0c6 246 path. `*_1` is also a projection, with the `*` being represented
6a06907d
XL
247 by the [`ProjectionElem::Deref`] element.
248- **Rvalues** are represented by the enum [`Rvalue`].
249- **Operands** are represented by the enum [`Operand`].
a1dfa0c6
XL
250
251## Representing constants
252
253*to be written*
254
255<a name="promoted"></a>
256
257### Promoted constants
258
6a06907d 259See the const-eval WG's [docs on promotion](https://github.com/rust-lang/const-eval/blob/master/promotion.md).
a1dfa0c6
XL
260
261
6a06907d 262[mir]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/index.html
3c0e092e
XL
263[mirmanip_build]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_build/index.html
264[mirmanip_transform]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/index.html
265[mirmanip_dataflow]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_dataflow/index.html
6a06907d 266[`Body`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/struct.Body.html
ba9703b0 267[newtype'd]: ../appendix/glossary.html#newtype
6a06907d
XL
268[basicblocks]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/struct.Body.html#structfield.basic_blocks
269[`BasicBlock`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/struct.BasicBlock.html
270[`BasicBlockData`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/struct.BasicBlockData.html
271[`Statement`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/struct.Statement.html
272[`Terminator`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/terminator/struct.Terminator.html
273[`Local`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/struct.Local.html
274[localdecls]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/struct.Body.html#structfield.local_decls
275[`RETURN_PLACE`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/constant.RETURN_PLACE.html
276[`Place`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/struct.Place.html
277[`ProjectionElem`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/enum.ProjectionElem.html
278[`ProjectionElem::Deref`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/enum.ProjectionElem.html#variant.Deref
279[`Rvalue`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/enum.Rvalue.html
280[`Operand`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/enum.Operand.html