New upstream version 1.44.1+dfsg1
# The HIR

The HIR – "High-Level Intermediate Representation" – is the primary IR used
in most of rustc. It is a compiler-friendly representation of the abstract
syntax tree (AST) that is generated after parsing, macro expansion, and name
resolution (see [Lowering](./lowering.html) for how the HIR is created).
Many parts of HIR resemble Rust surface syntax quite closely, with
the exception that some of Rust's expression forms have been desugared away.
For example, `for` loops are converted into a `loop` and do not appear in
the HIR. This makes HIR more amenable to analysis than a normal AST.
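
As a rough illustration of that desugaring, the sketch below pairs an ordinary `for` loop with a hand-written `loop`/`match` equivalent. It only approximates what the compiler produces (the real lowering carries more detail), and the function names are made up for this example.

```rust
/// Sums a slice with an ordinary `for` loop (surface syntax).
fn sum_with_for(values: &[i32]) -> i32 {
    let mut total = 0;
    for v in values {
        total += *v;
    }
    total
}

/// Hand-written approximation of the desugared form:
/// a `loop` wrapped around a `match` on `Iterator::next`.
fn sum_desugared(values: &[i32]) -> i32 {
    let mut total = 0;
    let mut iter = IntoIterator::into_iter(values);
    loop {
        match iter.next() {
            Some(v) => total += *v,
            None => break,
        }
    }
    total
}

fn main() {
    let data = [1, 2, 3, 4];
    // Both forms must agree, since one is just a rewriting of the other.
    assert_eq!(sum_with_for(&data), sum_desugared(&data));
    println!("both forms sum to {}", sum_desugared(&data));
}
```

Because only the second shape survives lowering, an analysis that runs on the HIR needs to handle `loop` and `match` but never a `for` expression.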

This chapter covers the main concepts of the HIR.

You can view the HIR of your code by passing the
`-Zunpretty=hir-tree` flag to rustc (like all `-Z` flags, it requires a
nightly toolchain):

```bash
cargo rustc -- -Zunpretty=hir-tree
```

### Out-of-band storage and the `Crate` type

The top-level data structure in the HIR is the [`Crate`], which stores
the contents of the crate currently being compiled (we only ever
construct HIR for the current crate). Whereas in the AST the crate
data structure basically just contains the root module, the HIR
`Crate` structure contains a number of maps and other things that
serve to organize the content of the crate for easier access.

[`Crate`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/struct.Crate.html

For example, the contents of individual items (e.g. modules,
functions, traits, impls, etc.) in the HIR are not immediately
accessible in the parents. So, for example, if there is a module item
`foo` containing a function `bar()`:

```rust
mod foo {
    fn bar() { }
}
```

then in the HIR the representation of module `foo` (the [`Mod`]
struct) would only have the **`ItemId`** `I` of `bar()`. To get the
details of the function `bar()`, we would look up `I` in the
`items` map.

[`Mod`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/struct.Mod.html

One nice result from this representation is that one can iterate
over all items in the crate by iterating over the key-value pairs
in these maps (without the need to trawl through the whole HIR).
There are similar maps for things like trait items and impl items,
as well as "bodies" (explained below).
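
The layout described above can be modelled with a small, self-contained sketch. The types below (`ItemId`, `Item`, `Crate`) are simplified stand-ins invented for this example, not the real `rustc_hir` definitions; they only show a module holding ids while the item contents live in a crate-wide map.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an item id; not the real `rustc_hir::ItemId`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct ItemId(u32);

/// A toy item: a module stores only the ids of its children.
#[derive(Debug)]
enum Item {
    Mod { name: &'static str, item_ids: Vec<ItemId> },
    Fn { name: &'static str },
}

/// Toy crate: item contents live in a crate-wide map, not in their parents.
struct Crate {
    items: BTreeMap<ItemId, Item>,
}

impl Crate {
    /// Resolves an id to the item it names.
    fn item(&self, id: ItemId) -> &Item {
        &self.items[&id]
    }
}

/// Builds the toy equivalent of `mod foo { fn bar() { } }`.
fn example_crate() -> Crate {
    let mut items = BTreeMap::new();
    items.insert(ItemId(0), Item::Mod { name: "foo", item_ids: vec![ItemId(1)] });
    items.insert(ItemId(1), Item::Fn { name: "bar" });
    Crate { items }
}

/// Collects all function names by scanning the map instead of walking the tree.
fn fn_names(krate: &Crate) -> Vec<&'static str> {
    krate
        .items
        .values()
        .filter_map(|item| match item {
            Item::Fn { name } => Some(*name),
            Item::Mod { .. } => None,
        })
        .collect()
}

fn main() {
    let krate = example_crate();
    // The module only knows the id of `bar`; its contents need a lookup.
    if let Item::Mod { name, item_ids } = krate.item(ItemId(0)) {
        for id in item_ids {
            println!("{} contains {:?} -> {:?}", name, id, krate.item(*id));
        }
    }
    println!("functions: {:?}", fn_names(&krate));
}
```

Note that `fn_names` never descends through `foo`: every item is reachable directly from the map, which is the property the text describes.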

The other reason to set up the representation this way is for better
integration with incremental compilation. This way, if you gain access
to an [`&rustc_hir::Item`] (e.g. for the mod `foo`), you do not immediately
gain access to the contents of the function `bar()`. Instead, you only
gain access to the **id** for `bar()`, and you must invoke some
function to look up the contents of `bar()` given its id; this gives
the compiler a chance to observe that you accessed the data for
`bar()`, and then record the dependency.

[`&rustc_hir::Item`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/struct.Item.html
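
To make the "observe the access" idea concrete, here is a minimal sketch of a store whose only accessor logs each lookup. `TrackedItems` and its methods are hypothetical names for this example; the compiler's actual dependency tracking is far more involved.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Hypothetical item id for this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct ItemId(u32);

/// Toy item store that logs every lookup, mimicking dependency recording.
struct TrackedItems {
    contents: HashMap<ItemId, String>,
    reads: RefCell<Vec<ItemId>>,
}

impl TrackedItems {
    fn new() -> Self {
        TrackedItems { contents: HashMap::new(), reads: RefCell::new(Vec::new()) }
    }

    fn insert(&mut self, id: ItemId, source: &str) {
        self.contents.insert(id, source.to_string());
    }

    /// The only way to reach an item's contents; records the access first.
    fn item(&self, id: ItemId) -> Option<&str> {
        self.reads.borrow_mut().push(id);
        self.contents.get(&id).map(|s| s.as_str())
    }

    /// Ids that have been read so far, in order of access.
    fn recorded_reads(&self) -> Vec<ItemId> {
        self.reads.borrow().clone()
    }
}

fn main() {
    let mut items = TrackedItems::new();
    items.insert(ItemId(1), "fn bar() { }");
    // Merely holding `ItemId(1)` reveals nothing and records nothing.
    assert!(items.recorded_reads().is_empty());
    // Reading the contents goes through `item`, which logs the access.
    println!("{:?}", items.item(ItemId(1)));
    println!("reads: {:?}", items.recorded_reads());
}
```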

<a name="hir-id"></a>

### Identifiers in the HIR

Most of the code that has to deal with things in HIR tends not to
carry around references into the HIR, but rather to carry around
*identifier numbers* (or just "ids"). Right now, you will find four
sorts of identifiers in active use:

- [`DefId`], which primarily names "definitions" or top-level items.
  - You can think of a [`DefId`] as being shorthand for a very explicit
    and complete path, like `std::collections::HashMap`. However,
    these paths are able to name things that are not nameable in
    normal Rust (e.g. impls), and they also include extra information
    about the crate (such as its version number, as two versions of
    the same crate can co-exist).
  - A [`DefId`] really consists of two parts, a `CrateNum` (which
    identifies the crate) and a `DefIndex` (which indexes into a list
    of items that is maintained per crate).
- [`HirId`], which combines the index of a particular item with an
  offset within that item.
  - The key point of a [`HirId`] is that it is *relative* to some item
    (which is named via a [`DefId`]).
- [`BodyId`], which refers to a specific body (definition of a function
  or constant) in the crate. It is currently effectively a "newtype'd"
  [`HirId`].
- [`NodeId`], which is an absolute id that identifies a single node in the HIR
  tree.
  - While these are still in common use, **they are being slowly phased out**.
  - Since they are absolute within the crate, adding a new node anywhere in the
    tree causes the [`NodeId`]s of all subsequent code in the crate to change.
    This is terrible for incremental compilation, as you can perhaps imagine.

[`DefId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/def_id/struct.DefId.html
[`HirId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir_id/struct.HirId.html
[`BodyId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/struct.BodyId.html
[`NodeId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/node_id/struct.NodeId.html
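
The difference between absolute and item-relative ids can be shown with a toy model. The structs below mirror the shapes described in the list, but their field names and the two numbering functions are assumptions made for this sketch, not the compiler's definitions.

```rust
/// Toy versions of the id types; field names are illustrative assumptions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct CrateNum(u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct DefIndex(u32);

/// A definition id: which crate, and which entry in that crate's list.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct DefId {
    krate: CrateNum,
    index: DefIndex,
}

/// A node id that is relative to its owning item.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct HirId {
    owner: DefIndex,
    local_id: u32,
}

/// A body id is a thin wrapper ("newtype") around a `HirId`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct BodyId {
    hir_id: HirId,
}

/// Absolute numbering: one running counter across all items in the crate.
fn absolute_ids(nodes_per_item: &[u32]) -> Vec<Vec<u32>> {
    let mut next = 0;
    nodes_per_item
        .iter()
        .map(|&n| {
            let ids: Vec<u32> = (next..next + n).collect();
            next += n;
            ids
        })
        .collect()
}

/// Relative numbering: every item restarts at zero.
fn relative_ids(nodes_per_item: &[u32]) -> Vec<Vec<HirId>> {
    nodes_per_item
        .iter()
        .enumerate()
        .map(|(item, &n)| {
            (0..n)
                .map(|local_id| HirId { owner: DefIndex(item as u32), local_id })
                .collect()
        })
        .collect()
}

fn main() {
    let before = [2, 3]; // two items with 2 and 3 nodes
    let after = [3, 3]; // one node added to the first item
    // Absolute ids of the untouched second item shift...
    assert_ne!(absolute_ids(&before)[1], absolute_ids(&after)[1]);
    // ...while its item-relative ids stay the same.
    assert_eq!(relative_ids(&before)[1], relative_ids(&after)[1]);

    let def_id = DefId { krate: CrateNum(0), index: DefIndex(1) };
    let body = BodyId { hir_id: HirId { owner: def_id.index, local_id: 0 } };
    println!("{:?} owns {:?}", def_id, body);
}
```

In this model an edit to the first item leaves every id in the second item untouched under relative numbering, which is the stability that incremental compilation relies on.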

We also have an internal map to go from a `DefId` to what is called a "def path". A def path is like
a module path, but richer. For example, it may be `crate::foo::MyStruct`, which identifies
this definition uniquely. It differs from a module path in that it might include a type
parameter `T`, as in `crate::foo::MyStruct::T`, which you can't write in normal Rust. Def paths are
used in incremental compilation.
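
A def path can be pictured as a list of components that is rendered with `::` separators. The sketch below is only a mental model: `PathElem` and `render` are hypothetical names, and the real compiler's path data is structured differently.

```rust
/// One step in a toy def path; the variants are illustrative only.
#[derive(Clone, Debug, PartialEq, Eq)]
enum PathElem {
    CrateRoot,
    TypeNs(&'static str),
    TypeParam(&'static str),
}

/// Renders a toy def path in the `crate::foo::MyStruct::T` style.
fn render(path: &[PathElem]) -> String {
    path.iter()
        .map(|elem| match elem {
            PathElem::CrateRoot => "crate",
            PathElem::TypeNs(name) | PathElem::TypeParam(name) => *name,
        })
        .collect::<Vec<_>>()
        .join("::")
}

fn main() {
    // A type parameter is a valid path component here,
    // even though surface Rust has no way to write such a path.
    let path = [
        PathElem::CrateRoot,
        PathElem::TypeNs("foo"),
        PathElem::TypeNs("MyStruct"),
        PathElem::TypeParam("T"),
    ];
    println!("{}", render(&path));
}
```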

### The HIR Map

Most of the time when you are working with the HIR, you will do so via
the **HIR Map**, accessible in the tcx via [`tcx.hir_map`] (and defined in
the [`hir::map`] module). The [HIR map] contains a [number of methods] to
convert between IDs of various kinds and to look up data associated
with an HIR node.

[`tcx.hir_map`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/context/struct.GlobalCtxt.html#structfield.hir_map
[`hir::map`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/hir/map/index.html
[HIR map]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/hir/map/struct.Map.html
[number of methods]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/hir/map/struct.Map.html#methods

For example, if you have a [`DefId`], and you would like to convert it
to a [`NodeId`], you can use
[`tcx.hir().as_local_node_id(def_id)`][as_local_node_id]. This returns
an `Option<NodeId>`: it is `None` if the def-id refers to
something outside of the current crate (since then it has no HIR
node), and otherwise `Some(n)`, where `n` is the node-id of the
definition.

[as_local_node_id]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/hir/map/struct.Map.html#method.as_local_node_id
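
The `Option` result is easiest to see in a standalone model. The sketch below does not call into rustc: `ToyMap`, its field, and the convention that crate number 0 is the local crate are assumptions for illustration, and only the method name is borrowed from the text above.

```rust
use std::collections::HashMap;

/// In this sketch, crate number 0 stands for the crate being compiled.
const LOCAL_CRATE: u32 = 0;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct DefId {
    krate: u32,
    index: u32,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct NodeId(u32);

/// Toy map from local definition indices to node ids.
struct ToyMap {
    def_index_to_node: HashMap<u32, NodeId>,
}

impl ToyMap {
    /// Returns `None` for definitions from other crates, which have no HIR here.
    fn as_local_node_id(&self, def_id: DefId) -> Option<NodeId> {
        if def_id.krate != LOCAL_CRATE {
            return None;
        }
        self.def_index_to_node.get(&def_id.index).copied()
    }
}

/// Builds a map with a single local definition at index 7.
fn toy_map() -> ToyMap {
    let mut def_index_to_node = HashMap::new();
    def_index_to_node.insert(7, NodeId(42));
    ToyMap { def_index_to_node }
}

fn main() {
    let map = toy_map();
    let local = DefId { krate: LOCAL_CRATE, index: 7 };
    let foreign = DefId { krate: 3, index: 7 };
    for def_id in [local, foreign].iter() {
        match map.as_local_node_id(*def_id) {
            Some(node_id) => println!("{:?} -> {:?}", def_id, node_id),
            None => println!("{:?} has no HIR node in this crate", def_id),
        }
    }
}
```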

Similarly, you can use [`tcx.hir().find(n)`][find] to look up the node for a
[`NodeId`]. This returns an `Option<Node<'tcx>>`, where [`Node`] is an enum
defined in the map; by matching on this you can find out what sort of
node the node-id referred to and also get a pointer to the data
itself. Often, you know what sort of node `n` is – e.g. if you know
that `n` must be some HIR expression, you can do
[`tcx.hir().expect_expr(n)`][expect_expr], which will extract and return the
[`&hir::Expr`][Expr], panicking if `n` is not in fact an expression.

[find]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/hir/map/struct.Map.html#method.find
[`Node`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/enum.Node.html
[expect_expr]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/hir/map/struct.Map.html#method.expect_expr
[Expr]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/struct.Expr.html
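
The `find`/`expect_expr` pattern amounts to "look up an enum, then either match on it or assert its variant". The following sketch reproduces that pattern with toy types; `ToyMap`, the two-variant `Node` enum, and the plain `u32` ids are simplifications assumed for the example.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
struct Item {
    name: &'static str,
}

#[derive(Debug, PartialEq)]
struct Expr {
    text: &'static str,
}

/// Toy counterpart of the `Node` enum: a tagged pointer to the node's data.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Node<'hir> {
    Item(&'hir Item),
    Expr(&'hir Expr),
}

struct ToyMap<'hir> {
    nodes: HashMap<u32, Node<'hir>>,
}

impl<'hir> ToyMap<'hir> {
    /// Looks up a node; `None` if the id is unknown.
    fn find(&self, id: u32) -> Option<Node<'hir>> {
        self.nodes.get(&id).copied()
    }

    /// Returns the expression for `id`, panicking on any other kind of node.
    fn expect_expr(&self, id: u32) -> &'hir Expr {
        match self.find(id) {
            Some(Node::Expr(expr)) => expr,
            other => panic!("expected an expression, found {:?}", other),
        }
    }
}

/// Registers `item` under id 0 and `expr` under id 1.
fn toy_map<'hir>(item: &'hir Item, expr: &'hir Expr) -> ToyMap<'hir> {
    let mut nodes = HashMap::new();
    nodes.insert(0, Node::Item(item));
    nodes.insert(1, Node::Expr(expr));
    ToyMap { nodes }
}

fn main() {
    let item = Item { name: "foo" };
    let expr = Expr { text: "1 + 2" };
    let map = toy_map(&item, &expr);
    // Matching on the result tells us what kind of node the id names.
    match map.find(0) {
        Some(Node::Item(it)) => println!("id 0 is item `{}`", it.name),
        Some(Node::Expr(ex)) => println!("id 0 is expression `{}`", ex.text),
        None => println!("id 0 is unknown"),
    }
    // When the kind is already known, skip the match.
    println!("id 1 is expression `{}`", map.expect_expr(1).text);
}
```

Calling `map.expect_expr(0)` in this model would panic, because id 0 names an item rather than an expression.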

Finally, you can use the HIR map to find the parents of nodes, via
calls like [`tcx.hir().get_parent_node(n)`][get_parent_node].

[get_parent_node]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/hir/map/struct.Map.html#method.get_parent_node
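
Parent lookups are typically used to walk outward from a node to its enclosing item. The sketch below models this with a plain parent table; the `Parents` type, the `ancestors` helper, and the convention that the root is its own parent are assumptions of this toy model rather than documented behaviour of the real map.

```rust
use std::collections::HashMap;

/// Toy parent table: child id -> parent id.
struct Parents {
    parent_of: HashMap<u32, u32>,
}

impl Parents {
    /// Parent of `id`; in this toy model the root is its own parent.
    fn get_parent_node(&self, id: u32) -> u32 {
        self.parent_of.get(&id).copied().unwrap_or(id)
    }

    /// All ancestors of `id`, nearest first, ending at the root.
    fn ancestors(&self, mut id: u32) -> Vec<u32> {
        let mut chain = Vec::new();
        loop {
            let parent = self.get_parent_node(id);
            if parent == id {
                break; // reached the root
            }
            chain.push(parent);
            id = parent;
        }
        chain
    }
}

/// Builds the chain 0 <- 1 <- 2 <- 3 (e.g. crate, module, function, expression).
fn toy_parents() -> Parents {
    let mut parent_of = HashMap::new();
    parent_of.insert(1, 0);
    parent_of.insert(2, 1);
    parent_of.insert(3, 2);
    Parents { parent_of }
}

fn main() {
    let parents = toy_parents();
    println!("parent of 3: {}", parents.get_parent_node(3));
    println!("ancestors of 3: {:?}", parents.ancestors(3));
}
```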

### HIR Bodies

A [`rustc_hir::Body`] represents some kind of executable code, such as the body
of a function/closure or the definition of a constant. Bodies are
associated with an **owner**, which is typically some kind of item
(e.g. an `fn()` or `const`), but could also be a closure expression
(e.g. `|x, y| x + y`). You can use the HIR map to find the body
associated with a given def-id ([`maybe_body_owned_by`]) or to find
the owner of a body ([`body_owner_def_id`]).

[`rustc_hir::Body`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/struct.Body.html
[`maybe_body_owned_by`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/hir/map/struct.Map.html#method.maybe_body_owned_by
[`body_owner_def_id`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/hir/map/struct.Map.html#method.body_owner_def_id
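
The owner/body relationship is a two-way mapping, which the following toy model makes explicit. `Bodies`, `OwnerId`, and the `body_owner` method are hypothetical; only `maybe_body_owned_by` echoes the name used above, and its signature here is a simplification.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct OwnerId(u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct BodyId(u32);

/// Toy two-way table between owners (items or closures) and their bodies.
#[derive(Default)]
struct Bodies {
    body_of: HashMap<OwnerId, BodyId>,
    owner_of: HashMap<BodyId, OwnerId>,
}

impl Bodies {
    /// Registers `body` as the body owned by `owner`.
    fn add(&mut self, owner: OwnerId, body: BodyId) {
        self.body_of.insert(owner, body);
        self.owner_of.insert(body, owner);
    }

    /// `None` when the owner has no executable code (e.g. a struct).
    fn maybe_body_owned_by(&self, owner: OwnerId) -> Option<BodyId> {
        self.body_of.get(&owner).copied()
    }

    /// Every registered body has exactly one owner.
    fn body_owner(&self, body: BodyId) -> OwnerId {
        self.owner_of[&body]
    }
}

fn main() {
    let mut bodies = Bodies::default();
    bodies.add(OwnerId(1), BodyId(10)); // e.g. `fn bar() { }`
    println!("{:?}", bodies.maybe_body_owned_by(OwnerId(1)));
    println!("{:?}", bodies.maybe_body_owned_by(OwnerId(2))); // owner without a body
    println!("{:?}", bodies.body_owner(BodyId(10)));
}
```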