]>
Commit | Line | Data |
---|---|---|
85aaf69f SL |
1 | Region inference |
2 | ||
3 | # Terminology | |
4 | ||
b039eaaf | 5 | Note that we use the terms region and lifetime interchangeably. |
85aaf69f SL |
6 | |
7 | # Introduction | |
8 | ||
9 | Region inference uses a somewhat more involved algorithm than type | |
b039eaaf | 10 | inference. It is not the most efficient thing ever written though it |
85aaf69f SL |
11 | seems to work well enough in practice (famous last words). The reason |
12 | that we use a different algorithm is because, unlike with types, it is | |
13 | impractical to hand-annotate with regions (in some cases, there aren't | |
14 | even the requisite syntactic forms). So we have to get it right, and | |
15 | it's worth spending more time on a more involved analysis. Moreover, | |
16 | regions are a simpler case than types: they don't have aggregate | |
17 | structure, for example. | |
18 | ||
19 | Unlike normal type inference, which is similar in spirit to H-M and thus | |
20 | works progressively, the region type inference works by accumulating | |
21 | constraints over the course of a function. Finally, at the end of | |
22 | processing a function, we process and solve the constraints all at | |
23 | once. | |
24 | ||
25 | The constraints are always of one of three possible forms: | |
26 | ||
b039eaaf SL |
27 | - `ConstrainVarSubVar(Ri, Rj)` states that region variable Ri must be |
28 | a subregion of Rj | |
29 | - `ConstrainRegSubVar(R, Ri)` states that the concrete region R (which | |
30 | must not be a variable) must be a subregion of the variable Ri | |
31 | - `ConstrainVarSubReg(Ri, R)` states the variable Ri shoudl be less | |
32 | than the concrete region R. This is kind of deprecated and ought to | |
33 | be replaced with a verify (they essentially play the same role). | |
34 | ||
35 | In addition to constraints, we also gather up a set of "verifys" | |
36 | (what, you don't think Verify is a noun? Get used to it my | |
37 | friend!). These represent relations that must hold but which don't | |
38 | influence inference proper. These take the form of: | |
39 | ||
40 | - `VerifyRegSubReg(Ri, Rj)` indicates that Ri <= Rj must hold, | |
41 | where Rj is not an inference variable (and Ri may or may not contain | |
42 | one). This doesn't influence inference because we will already have | |
43 | inferred Ri to be as small as possible, so then we just test whether | |
44 | that result was less than Rj or not. | |
45 | - `VerifyGenericBound(R, Vb)` is a more complex expression which tests | |
46 | that the region R must satisfy the bound `Vb`. The bounds themselves | |
47 | may have structure like "must outlive one of the following regions" | |
48 | or "must outlive ALL of the following regions. These bounds arise | |
49 | from constraints like `T: 'a` -- if we know that `T: 'b` and `T: 'c` | |
50 | (say, from where clauses), then we can conclude that `T: 'a` if `'b: | |
51 | 'a` *or* `'c: 'a`. | |
85aaf69f SL |
52 | |
53 | # Building up the constraints | |
54 | ||
55 | Variables and constraints are created using the following methods: | |
56 | ||
57 | - `new_region_var()` creates a new, unconstrained region variable; | |
b039eaaf SL |
58 | - `make_subregion(Ri, Rj)` states that Ri is a subregion of Rj |
59 | - `lub_regions(Ri, Rj) -> Rk` returns a region Rk which is | |
60 | the smallest region that is greater than both Ri and Rj | |
61 | - `glb_regions(Ri, Rj) -> Rk` returns a region Rk which is | |
62 | the greatest region that is smaller than both Ri and Rj | |
85aaf69f SL |
63 | |
64 | The actual region resolution algorithm is not entirely | |
65 | obvious, though it is also not overly complex. | |
66 | ||
67 | ## Snapshotting | |
68 | ||
69 | It is also permitted to try (and rollback) changes to the graph. This | |
70 | is done by invoking `start_snapshot()`, which returns a value. Then | |
71 | later you can call `rollback_to()` which undoes the work. | |
72 | Alternatively, you can call `commit()` which ends all snapshots. | |
73 | Snapshots can be recursive---so you can start a snapshot when another | |
74 | is in progress, but only the root snapshot can "commit". | |
75 | ||
85aaf69f SL |
76 | ## The problem |
77 | ||
78 | Basically our input is a directed graph where nodes can be divided | |
79 | into two categories: region variables and concrete regions. Each edge | |
80 | `R -> S` in the graph represents a constraint that the region `R` is a | |
81 | subregion of the region `S`. | |
82 | ||
83 | Region variable nodes can have arbitrary degree. There is one region | |
84 | variable node per region variable. | |
85 | ||
86 | Each concrete region node is associated with some, well, concrete | |
87 | region: e.g., a free lifetime, or the region for a particular scope. | |
88 | Note that there may be more than one concrete region node for a | |
89 | particular region value. Moreover, because of how the graph is built, | |
90 | we know that all concrete region nodes have either in-degree 1 or | |
91 | out-degree 1. | |
92 | ||
93 | Before resolution begins, we build up the constraints in a hashmap | |
94 | that maps `Constraint` keys to spans. During resolution, we construct | |
95 | the actual `Graph` structure that we describe here. | |
96 | ||
b039eaaf SL |
97 | ## Computing the values for region variables |
98 | ||
99 | The algorithm is a simple dataflow algorithm. Each region variable | |
100 | begins as empty. We iterate over the constraints, and for each constraint | |
101 | we grow the relevant region variable to be as big as it must be to meet all the | |
102 | constraints. This means the region variables can grow to be `'static` if | |
103 | necessary. | |
104 | ||
105 | ## Verification | |
106 | ||
107 | After all constraints are fully propoagated, we do a "verification" | |
108 | step where we walk over the verify bounds and check that they are | |
109 | satisfied. These bounds represent the "maximal" values that a region | |
110 | variable can take on, basically. | |
85aaf69f SL |
111 | |
112 | # The Region Hierarchy | |
113 | ||
114 | ## Without closures | |
115 | ||
116 | Let's first consider the region hierarchy without thinking about | |
117 | closures, because they add a lot of complications. The region | |
118 | hierarchy *basically* mirrors the lexical structure of the code. | |
119 | There is a region for every piece of 'evaluation' that occurs, meaning | |
120 | every expression, block, and pattern (patterns are considered to | |
121 | "execute" by testing the value they are applied to and creating any | |
122 | relevant bindings). So, for example: | |
123 | ||
c1a9b12d | 124 | fn foo(x: isize, y: isize) { // -+ |
85aaf69f SL |
125 | // +------------+ // | |
126 | // | +-----+ // | | |
127 | // | +-+ +-+ +-+ // | | |
128 | // | | | | | | | // | | |
129 | // v v v v v v v // | | |
130 | let z = x + y; // | | |
131 | ... // | | |
132 | } // -+ | |
133 | ||
134 | fn bar() { ... } | |
135 | ||
136 | In this example, there is a region for the fn body block as a whole, | |
137 | and then a subregion for the declaration of the local variable. | |
138 | Within that, there are sublifetimes for the assignment pattern and | |
139 | also the expression `x + y`. The expression itself has sublifetimes | |
140 | for evaluating `x` and `y`. | |
141 | ||
142 | ## Function calls | |
143 | ||
144 | Function calls are a bit tricky. I will describe how we handle them | |
145 | *now* and then a bit about how we can improve them (Issue #6268). | |
146 | ||
147 | Consider a function call like `func(expr1, expr2)`, where `func`, | |
148 | `arg1`, and `arg2` are all arbitrary expressions. Currently, | |
149 | we construct a region hierarchy like: | |
150 | ||
151 | +----------------+ | |
152 | | | | |
153 | +--+ +---+ +---+| | |
154 | v v v v v vv | |
155 | func(expr1, expr2) | |
156 | ||
157 | Here you can see that the call as a whole has a region and the | |
158 | function plus arguments are subregions of that. As a side-effect of | |
159 | this, we get a lot of spurious errors around nested calls, in | |
160 | particular when combined with `&mut` functions. For example, a call | |
161 | like this one | |
162 | ||
163 | self.foo(self.bar()) | |
164 | ||
165 | where both `foo` and `bar` are `&mut self` functions will always yield | |
166 | an error. | |
167 | ||
168 | Here is a more involved example (which is safe) so we can see what's | |
169 | going on: | |
170 | ||
c1a9b12d | 171 | struct Foo { f: usize, g: usize } |
85aaf69f | 172 | ... |
c1a9b12d | 173 | fn add(p: &mut usize, v: usize) { |
85aaf69f SL |
174 | *p += v; |
175 | } | |
176 | ... | |
c1a9b12d | 177 | fn inc(p: &mut usize) -> usize { |
85aaf69f SL |
178 | *p += 1; *p |
179 | } | |
180 | fn weird() { | |
181 | let mut x: Box<Foo> = box Foo { ... }; | |
182 | 'a: add(&mut (*x).f, | |
183 | 'b: inc(&mut (*x).f)) // (..) | |
184 | } | |
185 | ||
186 | The important part is the line marked `(..)` which contains a call to | |
187 | `add()`. The first argument is a mutable borrow of the field `f`. The | |
188 | second argument also borrows the field `f`. Now, in the current borrow | |
189 | checker, the first borrow is given the lifetime of the call to | |
190 | `add()`, `'a`. The second borrow is given the lifetime of `'b` of the | |
191 | call to `inc()`. Because `'b` is considered to be a sublifetime of | |
192 | `'a`, an error is reported since there are two co-existing mutable | |
193 | borrows of the same data. | |
194 | ||
195 | However, if we were to examine the lifetimes a bit more carefully, we | |
196 | can see that this error is unnecessary. Let's examine the lifetimes | |
197 | involved with `'a` in detail. We'll break apart all the steps involved | |
198 | in a call expression: | |
199 | ||
200 | 'a: { | |
201 | 'a_arg1: let a_temp1: ... = add; | |
c1a9b12d SL |
202 | 'a_arg2: let a_temp2: &'a mut usize = &'a mut (*x).f; |
203 | 'a_arg3: let a_temp3: usize = { | |
85aaf69f SL |
204 | let b_temp1: ... = inc; |
205 | let b_temp2: &'b = &'b mut (*x).f; | |
206 | 'b_call: b_temp1(b_temp2) | |
207 | }; | |
208 | 'a_call: a_temp1(a_temp2, a_temp3) // (**) | |
209 | } | |
210 | ||
211 | Here we see that the lifetime `'a` includes a number of substatements. | |
212 | In particular, there is this lifetime I've called `'a_call` that | |
213 | corresponds to the *actual execution of the function `add()`*, after | |
214 | all arguments have been evaluated. There is a corresponding lifetime | |
215 | `'b_call` for the execution of `inc()`. If we wanted to be precise | |
216 | about it, the lifetime of the two borrows should be `'a_call` and | |
217 | `'b_call` respectively, since the references that were created | |
218 | will not be dereferenced except during the execution itself. | |
219 | ||
220 | However, this model by itself is not sound. The reason is that | |
221 | while the two references that are created will never be used | |
222 | simultaneously, it is still true that the first reference is | |
223 | *created* before the second argument is evaluated, and so even though | |
224 | it will not be *dereferenced* during the evaluation of the second | |
225 | argument, it can still be *invalidated* by that evaluation. Consider | |
226 | this similar but unsound example: | |
227 | ||
c1a9b12d | 228 | struct Foo { f: usize, g: usize } |
85aaf69f | 229 | ... |
c1a9b12d | 230 | fn add(p: &mut usize, v: usize) { |
85aaf69f SL |
231 | *p += v; |
232 | } | |
233 | ... | |
c1a9b12d | 234 | fn consume(x: Box<Foo>) -> usize { |
85aaf69f SL |
235 | x.f + x.g |
236 | } | |
237 | fn weird() { | |
238 | let mut x: Box<Foo> = box Foo { ... }; | |
239 | 'a: add(&mut (*x).f, consume(x)) // (..) | |
240 | } | |
241 | ||
242 | In this case, the second argument to `add` actually consumes `x`, thus | |
243 | invalidating the first argument. | |
244 | ||
245 | So, for now, we exclude the `call` lifetimes from our model. | |
246 | Eventually I would like to include them, but we will have to make the | |
247 | borrow checker handle this situation correctly. In particular, if | |
248 | there is a reference created whose lifetime does not enclose | |
249 | the borrow expression, we must issue sufficient restrictions to ensure | |
250 | that the pointee remains valid. | |
251 | ||
c34b1796 AL |
252 | ## Modeling closures |
253 | ||
254 | Integrating closures properly into the model is a bit of | |
255 | work-in-progress. In an ideal world, we would model closures as | |
256 | closely as possible after their desugared equivalents. That is, a | |
257 | closure type would be modeled as a struct, and the region hierarchy of | |
258 | different closure bodies would be completely distinct from all other | |
259 | fns. We are generally moving in that direction but there are | |
260 | complications in terms of the implementation. | |
261 | ||
262 | In practice what we currently do is somewhat different. The basis for | |
263 | the current approach is the observation that the only time that | |
264 | regions from distinct fn bodies interact with one another is through | |
265 | an upvar or the type of a fn parameter (since closures live in the fn | |
266 | body namespace, they can in fact have fn parameters whose types | |
267 | include regions from the surrounding fn body). For these cases, there | |
268 | are separate mechanisms which ensure that the regions that appear in | |
269 | upvars/parameters outlive the dynamic extent of each call to the | |
270 | closure: | |
271 | ||
272 | 1. Types must outlive the region of any expression where they are used. | |
273 | For a closure type `C` to outlive a region `'r`, that implies that the | |
274 | types of all its upvars must outlive `'r`. | |
275 | 2. Parameters must outlive the region of any fn that they are passed to. | |
276 | ||
277 | Therefore, we can -- sort of -- assume that any region from an | |
278 | enclosing fns is larger than any region from one of its enclosed | |
279 | fn. And that is precisely what we do: when building the region | |
280 | hierarchy, each region lives in its own distinct subtree, but if we | |
281 | are asked to compute the `LUB(r1, r2)` of two regions, and those | |
282 | regions are in disjoint subtrees, we compare the lexical nesting of | |
283 | the two regions. | |
284 | ||
285 | *Ideas for improving the situation:* (FIXME #3696) The correctness | |
286 | argument here is subtle and a bit hand-wavy. The ideal, as stated | |
287 | earlier, would be to model things in such a way that it corresponds | |
288 | more closely to the desugared code. The best approach for doing this | |
289 | is a bit unclear: it may in fact be possible to *actually* desugar | |
290 | before we start, but I don't think so. The main option that I've been | |
291 | thinking through is imposing a "view shift" as we enter the fn body, | |
292 | so that regions appearing in the types of fn parameters and upvars are | |
293 | translated from being regions in the outer fn into free region | |
294 | parameters, just as they would be if we applied the desugaring. The | |
295 | challenge here is that type inference may not have fully run, so the | |
296 | types may not be fully known: we could probably do this translation | |
297 | lazilly, as type variables are instantiated. We would also have to | |
298 | apply a kind of inverse translation to the return value. This would be | |
299 | a good idea anyway, as right now it is possible for free regions | |
300 | instantiated within the closure to leak into the parent: this | |
301 | currently leads to type errors, since those regions cannot outlive any | |
302 | expressions within the parent hierarchy. Much like the current | |
303 | handling of closures, there are no known cases where this leads to a | |
304 | type-checking accepting incorrect code (though it sometimes rejects | |
305 | what might be considered correct code; see rust-lang/rust#22557), but | |
306 | it still doesn't feel like the right approach. | |
85aaf69f SL |
307 | |
308 | ### Skolemization | |
309 | ||
310 | For a discussion on skolemization and higher-ranked subtyping, please | |
311 | see the module `middle::infer::higher_ranked::doc`. |