]> git.proxmox.com Git - rustc.git/blame - src/doc/rustc-guide/src/type-inference.md
New upstream version 1.36.0+dfsg1
[rustc.git] / src / doc / rustc-guide / src / type-inference.md
CommitLineData
a1dfa0c6
XL
1# Type inference
2
3Type inference is the process of automatic detection of the type of an
4expression.
5
6It is what allows Rust to work with fewer or no type annotations,
7making things easier for users:
8
9```rust,ignore
10fn main() {
11 let mut things = vec![];
12 things.push("thing")
13}
14```
15
16Here, the type of `things` is *inferenced* to be `&str` because that's the value
17we push into `things`.
18
19The type inference is based on the standard Hindley-Milner (HM) type inference
20algorithm, but extended in various way to accommodate subtyping, region
21inference, and higher-ranked types.
22
23## A note on terminology
24
25We use the notation `?T` to refer to inference variables, also called
26existential variables.
27
28We use the terms "region" and "lifetime" interchangeably. Both refer to
29the `'a` in `&'a T`.
30
31The term "bound region" refers to a region that is bound in a function
32signature, such as the `'a` in `for<'a> fn(&'a u32)`. A region is
33"free" if it is not bound.
34
35## Creating an inference context
36
37You create and "enter" an inference context by doing something like
38the following:
39
40```rust,ignore
41tcx.infer_ctxt().enter(|infcx| {
42 // Use the inference context `infcx` here.
43})
44```
45
46Each inference context creates a short-lived type arena to store the
47fresh types and things that it will create, as described in the
48[chapter on the `ty` module][ty-ch]. This arena is created by the `enter`
49function and disposed of after it returns.
50
51[ty-ch]: ty.html
52
53Within the closure, `infcx` has the type `InferCtxt<'cx, 'gcx, 'tcx>`
54for some fresh `'cx` and `'tcx` – the latter corresponds to the lifetime of
55this temporary arena, and the `'cx` is the lifetime of the `InferCtxt` itself.
56(Again, see the [`ty` chapter][ty-ch] for more details on this setup.)
57
58The `tcx.infer_ctxt` method actually returns a builder, which means
59there are some kinds of configuration you can do before the `infcx` is
60created. See `InferCtxtBuilder` for more information.
61
62<a name="vars"></a>
63
64## Inference variables
65
66The main purpose of the inference context is to house a bunch of
67**inference variables** – these represent types or regions whose precise
68value is not yet known, but will be uncovered as we perform type-checking.
69
70If you're familiar with the basic ideas of unification from H-M type
71systems, or logic languages like Prolog, this is the same concept. If
72you're not, you might want to read a tutorial on how H-M type
73inference works, or perhaps this blog post on
74[unification in the Chalk project].
75
76[Unification in the Chalk project]: http://smallcultfollowing.com/babysteps/blog/2017/03/25/unification-in-chalk-part-1/
77
78All told, the inference context stores four kinds of inference variables as of
79this writing:
80
81- Type variables, which come in three varieties:
82 - General type variables (the most common). These can be unified with any
83 type.
84 - Integral type variables, which can only be unified with an integral type,
85 and arise from an integer literal expression like `22`.
86 - Float type variables, which can only be unified with a float type, and
87 arise from a float literal expression like `22.0`.
88- Region variables, which represent lifetimes, and arise all over the place.
89
90All the type variables work in much the same way: you can create a new
91type variable, and what you get is `Ty<'tcx>` representing an
92unresolved type `?T`. Then later you can apply the various operations
93that the inferencer supports, such as equality or subtyping, and it
94will possibly **instantiate** (or **bind**) that `?T` to a specific
95value as a result.
96
97The region variables work somewhat differently, and are described
98below in a separate section.
99
100## Enforcing equality / subtyping
101
102The most basic operations you can perform in the type inferencer is
103**equality**, which forces two types `T` and `U` to be the same. The
104recommended way to add an equality constraint is to use the `at`
105method, roughly like so:
106
107```rust,ignore
108infcx.at(...).eq(t, u);
109```
110
111The first `at()` call provides a bit of context, i.e. why you are
112doing this unification, and in what environment, and the `eq` method
113performs the actual equality constraint.
114
115When you equate things, you force them to be precisely equal. Equating
116returns an `InferResult` – if it returns `Err(err)`, then equating
117failed, and the enclosing `TypeError` will tell you what went wrong.
118
119The success case is perhaps more interesting. The "primary" return
120type of `eq` is `()` – that is, when it succeeds, it doesn't return a
121value of any particular interest. Rather, it is executed for its
122side-effects of constraining type variables and so forth. However, the
123actual return type is not `()`, but rather `InferOk<()>`. The
124`InferOk` type is used to carry extra trait obligations – your job is
125to ensure that these are fulfilled (typically by enrolling them in a
126fulfillment context). See the [trait chapter] for more background on that.
127
128[trait chapter]: traits/resolution.html
129
130You can similarly enforce subtyping through `infcx.at(..).sub(..)`. The same
131basic concepts as above apply.
132
133## "Trying" equality
134
135Sometimes you would like to know if it is *possible* to equate two
136types without error. You can test that with `infcx.can_eq` (or
137`infcx.can_sub` for subtyping). If this returns `Ok`, then equality
138is possible – but in all cases, any side-effects are reversed.
139
140Be aware, though, that the success or failure of these methods is always
141**modulo regions**. That is, two types `&'a u32` and `&'b u32` will
142return `Ok` for `can_eq`, even if `'a != 'b`. This falls out from the
143"two-phase" nature of how we solve region constraints.
144
145## Snapshots
146
147As described in the previous section on `can_eq`, often it is useful
148to be able to do a series of operations and then roll back their
149side-effects. This is done for various reasons: one of them is to be
150able to backtrack, trying out multiple possibilities before settling
151on which path to take. Another is in order to ensure that a series of
152smaller changes take place atomically or not at all.
153
154To allow for this, the inference context supports a `snapshot` method.
155When you call it, it will start recording changes that occur from the
156operations you perform. When you are done, you can either invoke
157`rollback_to`, which will undo those changes, or else `confirm`, which
158will make the permanent. Snapshots can be nested as long as you follow
159a stack-like discipline.
160
161Rather than use snapshots directly, it is often helpful to use the
162methods like `commit_if_ok` or `probe` that encapsulate higher-level
163patterns.
164
165## Subtyping obligations
166
167One thing worth discussing is subtyping obligations. When you force
168two types to be a subtype, like `?T <: i32`, we can often convert those
169into equality constraints. This follows from Rust's rather limited notion
170of subtyping: so, in the above case, `?T <: i32` is equivalent to `?T = i32`.
171
172However, in some cases we have to be more careful. For example, when
173regions are involved. So if you have `?T <: &'a i32`, what we would do
174is to first "generalize" `&'a i32` into a type with a region variable:
175`&'?b i32`, and then unify `?T` with that (`?T = &'?b i32`). We then
176relate this new variable with the original bound:
177
178```text
179&'?b i32 <: &'a i32
180```
181
182This will result in a region constraint (see below) of `'?b: 'a`.
183
184One final interesting case is relating two unbound type variables,
185like `?T <: ?U`. In that case, we can't make progress, so we enqueue
186an obligation `Subtype(?T, ?U)` and return it via the `InferOk`
187mechanism. You'll have to try again when more details about `?T` or
188`?U` are known.
189
190## Region constraints
191
192Regions are inferenced somewhat differently from types. Rather than
193eagerly unifying things, we simply collect constraints as we go, but
194make (almost) no attempt to solve regions. These constraints have the
195form of an "outlives" constraint:
196
197```text
198'a: 'b
199```
200
201Actually the code tends to view them as a subregion relation, but it's the same
202idea:
203
204```text
205'b <= 'a
206```
207
208(There are various other kinds of constraints, such as "verifys"; see
209the `region_constraints` module for details.)
210
211There is one case where we do some amount of eager unification. If you have an
212equality constraint between two regions
213
214```text
215'a = 'b
216```
217
218we will record that fact in a unification table. You can then use
219`opportunistic_resolve_var` to convert `'b` to `'a` (or vice
220versa). This is sometimes needed to ensure termination of fixed-point
221algorithms.
222
223## Extracting region constraints
224
225Ultimately, region constraints are only solved at the very end of
226type-checking, once all other constraints are known. There are two
227ways to solve region constraints right now: lexical and
228non-lexical. Eventually there will only be one.
229
230To solve **lexical** region constraints, you invoke
231`resolve_regions_and_report_errors`. This "closes" the region
232constraint process and invoke the `lexical_region_resolve` code. Once
233this is done, any further attempt to equate or create a subtyping
234relationship will yield an ICE.
235
236Non-lexical region constraints are not handled within the inference
237context. Instead, the NLL solver (actually, the MIR type-checker)
238invokes `take_and_reset_region_constraints` periodically. This
239extracts all of the outlives constraints from the region solver, but
240leaves the set of variables intact. This is used to get *just* the
241region constraints that resulted from some particular point in the
242program, since the NLL solver needs to know not just *what* regions
243were subregions but *where*. Finally, the NLL solver invokes
244`take_region_var_origins`, which "closes" the region constraint
245process in the same way as normal solving.
246
247## Lexical region resolution
248
249Lexical region resolution is done by initially assigning each region
250variable to an empty value. We then process each outlives constraint
251repeatedly, growing region variables until a fixed-point is reached.
252Region variables can be grown using a least-upper-bound relation on
253the region lattice in a fairly straightforward fashion.