]> git.proxmox.com Git - rustc.git/blame - src/doc/rustc-dev-guide/src/type-inference.md
Merge tag 'debian/1.52.1+dfsg1-1_exp2' into proxmox/buster
[rustc.git] / src / doc / rustc-dev-guide / src / type-inference.md
CommitLineData
a1dfa0c6
XL
1# Type inference
2
6a06907d
XL
3<!-- toc -->
4
a1dfa0c6
XL
5Type inference is the process of automatic detection of the type of an
6expression.
7
8It is what allows Rust to work with fewer or no type annotations,
9making things easier for users:
10
dc9dc135 11```rust
a1dfa0c6
XL
12fn main() {
13 let mut things = vec![];
dc9dc135 14 things.push("thing");
a1dfa0c6
XL
15}
16```
17
dc9dc135 18Here, the type of `things` is *inferred* to be `Vec<&str>` because of the value
a1dfa0c6
XL
19we push into `things`.
20
21The type inference is based on the standard Hindley-Milner (HM) type inference
22algorithm, but extended in various way to accommodate subtyping, region
23inference, and higher-ranked types.
24
25## A note on terminology
26
27We use the notation `?T` to refer to inference variables, also called
28existential variables.
29
30We use the terms "region" and "lifetime" interchangeably. Both refer to
31the `'a` in `&'a T`.
32
33The term "bound region" refers to a region that is bound in a function
34signature, such as the `'a` in `for<'a> fn(&'a u32)`. A region is
35"free" if it is not bound.
36
37## Creating an inference context
38
39You create and "enter" an inference context by doing something like
40the following:
41
42```rust,ignore
43tcx.infer_ctxt().enter(|infcx| {
44 // Use the inference context `infcx` here.
45})
46```
47
dc9dc135
XL
48Within the closure, `infcx` has the type `InferCtxt<'cx, 'tcx>` for some
49fresh `'cx`, while `'tcx` is the same as outside the inference context.
a1dfa0c6
XL
50(Again, see the [`ty` chapter][ty-ch] for more details on this setup.)
51
74b04a01
XL
52[ty-ch]: ty.html
53
a1dfa0c6
XL
54The `tcx.infer_ctxt` method actually returns a builder, which means
55there are some kinds of configuration you can do before the `infcx` is
56created. See `InferCtxtBuilder` for more information.
57
58<a name="vars"></a>
59
60## Inference variables
61
62The main purpose of the inference context is to house a bunch of
63**inference variables** – these represent types or regions whose precise
64value is not yet known, but will be uncovered as we perform type-checking.
65
66If you're familiar with the basic ideas of unification from H-M type
67systems, or logic languages like Prolog, this is the same concept. If
68you're not, you might want to read a tutorial on how H-M type
69inference works, or perhaps this blog post on
70[unification in the Chalk project].
71
72[Unification in the Chalk project]: http://smallcultfollowing.com/babysteps/blog/2017/03/25/unification-in-chalk-part-1/
73
6a06907d
XL
74All told, the inference context stores four kinds of inference variables
75(as of <!-- date: 2018-01 --> January 2018):
a1dfa0c6
XL
76
77- Type variables, which come in three varieties:
78 - General type variables (the most common). These can be unified with any
79 type.
80 - Integral type variables, which can only be unified with an integral type,
81 and arise from an integer literal expression like `22`.
82 - Float type variables, which can only be unified with a float type, and
83 arise from a float literal expression like `22.0`.
84- Region variables, which represent lifetimes, and arise all over the place.
85
86All the type variables work in much the same way: you can create a new
87type variable, and what you get is `Ty<'tcx>` representing an
88unresolved type `?T`. Then later you can apply the various operations
89that the inferencer supports, such as equality or subtyping, and it
90will possibly **instantiate** (or **bind**) that `?T` to a specific
91value as a result.
92
93The region variables work somewhat differently, and are described
94below in a separate section.
95
96## Enforcing equality / subtyping
97
98The most basic operations you can perform in the type inferencer is
99**equality**, which forces two types `T` and `U` to be the same. The
100recommended way to add an equality constraint is to use the `at`
101method, roughly like so:
102
103```rust,ignore
104infcx.at(...).eq(t, u);
105```
106
107The first `at()` call provides a bit of context, i.e. why you are
108doing this unification, and in what environment, and the `eq` method
109performs the actual equality constraint.
110
111When you equate things, you force them to be precisely equal. Equating
112returns an `InferResult` – if it returns `Err(err)`, then equating
113failed, and the enclosing `TypeError` will tell you what went wrong.
114
115The success case is perhaps more interesting. The "primary" return
116type of `eq` is `()` – that is, when it succeeds, it doesn't return a
117value of any particular interest. Rather, it is executed for its
118side-effects of constraining type variables and so forth. However, the
119actual return type is not `()`, but rather `InferOk<()>`. The
120`InferOk` type is used to carry extra trait obligations – your job is
121to ensure that these are fulfilled (typically by enrolling them in a
122fulfillment context). See the [trait chapter] for more background on that.
123
124[trait chapter]: traits/resolution.html
125
126You can similarly enforce subtyping through `infcx.at(..).sub(..)`. The same
127basic concepts as above apply.
128
129## "Trying" equality
130
131Sometimes you would like to know if it is *possible* to equate two
132types without error. You can test that with `infcx.can_eq` (or
133`infcx.can_sub` for subtyping). If this returns `Ok`, then equality
134is possible – but in all cases, any side-effects are reversed.
135
136Be aware, though, that the success or failure of these methods is always
137**modulo regions**. That is, two types `&'a u32` and `&'b u32` will
138return `Ok` for `can_eq`, even if `'a != 'b`. This falls out from the
139"two-phase" nature of how we solve region constraints.
140
141## Snapshots
142
143As described in the previous section on `can_eq`, often it is useful
144to be able to do a series of operations and then roll back their
145side-effects. This is done for various reasons: one of them is to be
146able to backtrack, trying out multiple possibilities before settling
147on which path to take. Another is in order to ensure that a series of
148smaller changes take place atomically or not at all.
149
150To allow for this, the inference context supports a `snapshot` method.
151When you call it, it will start recording changes that occur from the
152operations you perform. When you are done, you can either invoke
153`rollback_to`, which will undo those changes, or else `confirm`, which
ba9703b0 154will make them permanent. Snapshots can be nested as long as you follow
a1dfa0c6
XL
155a stack-like discipline.
156
157Rather than use snapshots directly, it is often helpful to use the
158methods like `commit_if_ok` or `probe` that encapsulate higher-level
159patterns.
160
161## Subtyping obligations
162
163One thing worth discussing is subtyping obligations. When you force
164two types to be a subtype, like `?T <: i32`, we can often convert those
165into equality constraints. This follows from Rust's rather limited notion
166of subtyping: so, in the above case, `?T <: i32` is equivalent to `?T = i32`.
167
168However, in some cases we have to be more careful. For example, when
169regions are involved. So if you have `?T <: &'a i32`, what we would do
170is to first "generalize" `&'a i32` into a type with a region variable:
171`&'?b i32`, and then unify `?T` with that (`?T = &'?b i32`). We then
172relate this new variable with the original bound:
173
174```text
175&'?b i32 <: &'a i32
176```
177
178This will result in a region constraint (see below) of `'?b: 'a`.
179
180One final interesting case is relating two unbound type variables,
181like `?T <: ?U`. In that case, we can't make progress, so we enqueue
182an obligation `Subtype(?T, ?U)` and return it via the `InferOk`
183mechanism. You'll have to try again when more details about `?T` or
184`?U` are known.
185
186## Region constraints
187
188Regions are inferenced somewhat differently from types. Rather than
189eagerly unifying things, we simply collect constraints as we go, but
190make (almost) no attempt to solve regions. These constraints have the
191form of an "outlives" constraint:
192
193```text
194'a: 'b
195```
196
197Actually the code tends to view them as a subregion relation, but it's the same
198idea:
199
200```text
201'b <= 'a
202```
203
204(There are various other kinds of constraints, such as "verifys"; see
ba9703b0 205the [`region_constraints`] module for details.)
a1dfa0c6
XL
206
207There is one case where we do some amount of eager unification. If you have an
208equality constraint between two regions
209
210```text
211'a = 'b
212```
213
214we will record that fact in a unification table. You can then use
ba9703b0 215[`opportunistic_resolve_var`] to convert `'b` to `'a` (or vice
a1dfa0c6
XL
216versa). This is sometimes needed to ensure termination of fixed-point
217algorithms.
218
ba9703b0
XL
219[`region_constraints`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_infer/infer/region_constraints/index.html
220[`opportunistic_resolve_var`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_infer/infer/region_constraints/struct.RegionConstraintCollector.html#method.opportunistic_resolve_var
221
a1dfa0c6
XL
222## Extracting region constraints
223
224Ultimately, region constraints are only solved at the very end of
225type-checking, once all other constraints are known. There are two
226ways to solve region constraints right now: lexical and
227non-lexical. Eventually there will only be one.
228
229To solve **lexical** region constraints, you invoke
ba9703b0
XL
230[`resolve_regions_and_report_errors`]. This "closes" the region
231constraint process and invokes the [`lexical_region_resolve`] code. Once
a1dfa0c6
XL
232this is done, any further attempt to equate or create a subtyping
233relationship will yield an ICE.
234
235Non-lexical region constraints are not handled within the inference
236context. Instead, the NLL solver (actually, the MIR type-checker)
ba9703b0 237invokes [`take_and_reset_region_constraints`] periodically. This
a1dfa0c6
XL
238extracts all of the outlives constraints from the region solver, but
239leaves the set of variables intact. This is used to get *just* the
240region constraints that resulted from some particular point in the
241program, since the NLL solver needs to know not just *what* regions
ba9703b0
XL
242were subregions, but also *where*. Finally, the NLL solver invokes
243[`take_region_var_origins`], which "closes" the region constraint
a1dfa0c6
XL
244process in the same way as normal solving.
245
ba9703b0
XL
246[`resolve_regions_and_report_errors`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_infer/infer/struct.InferCtxt.html#method.resolve_regions_and_report_errors
247[`lexical_region_resolve`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_infer/infer/lexical_region_resolve/index.html
248[`take_and_reset_region_constraints`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_infer/infer/struct.InferCtxt.html#method.take_and_reset_region_constraints
249[`take_region_var_origins`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_infer/infer/struct.InferCtxt.html#method.take_region_var_origins
250
a1dfa0c6
XL
251## Lexical region resolution
252
253Lexical region resolution is done by initially assigning each region
254variable to an empty value. We then process each outlives constraint
255repeatedly, growing region variables until a fixed-point is reached.
256Region variables can be grown using a least-upper-bound relation on
257the region lattice in a fairly straightforward fashion.