src/librustc/middle/infer/region_inference/README.md

   1 Region inference
   2
   3 # Terminology
   4
   5 Note that we use the terms region and lifetime interchangeably,
   6 though the term `lifetime` is preferred.
   7
   8 # Introduction
   9
  10 Region inference uses a somewhat more involved algorithm than type
  11 inference.  It is not the most efficient thing ever written though it
  12 seems to work well enough in practice (famous last words).  The reason
  13 that we use a different algorithm is because, unlike with types, it is
  14 impractical to hand-annotate with regions (in some cases, there aren't
  15 even the requisite syntactic forms).  So we have to get it right, and
  16 it's worth spending more time on a more involved analysis.  Moreover,
  17 regions are a simpler case than types: they don't have aggregate
  18 structure, for example.
  19
  20 Unlike normal type inference, which is similar in spirit to H-M and thus
  21 works progressively, the region type inference works by accumulating
  22 constraints over the course of a function.  Finally, at the end of
  23 processing a function, we process and solve the constraints all at
  24 once.
  25
  26 The constraints are always of one of three possible forms:
  27
  28 - ConstrainVarSubVar(R_i, R_j) states that region variable R_i
  29   must be a subregion of R_j
  30 - ConstrainRegSubVar(R, R_i) states that the concrete region R
  31   (which must not be a variable) must be a subregion of the variable R_i
  32 - ConstrainVarSubReg(R_i, R) is the inverse
  33
  34 # Building up the constraints
  35
  36 Variables and constraints are created using the following methods:
  37
  38 - `new_region_var()` creates a new, unconstrained region variable;
  39 - `make_subregion(R_i, R_j)` states that R_i is a subregion of R_j
  40 - `lub_regions(R_i, R_j) -> R_k` returns a region R_k which is
  41   the smallest region that is greater than both R_i and R_j
  42 - `glb_regions(R_i, R_j) -> R_k` returns a region R_k which is
  43   the greatest region that is smaller than both R_i and R_j
  44
  45 The actual region resolution algorithm is not entirely
  46 obvious, though it is also not overly complex.
  47
  48 ## Snapshotting
  49
  50 It is also permitted to try (and rollback) changes to the graph.  This
  51 is done by invoking `start_snapshot()`, which returns a value.  Then
  52 later you can call `rollback_to()` which undoes the work.
  53 Alternatively, you can call `commit()` which ends all snapshots.
  54 Snapshots can be recursive---so you can start a snapshot when another
  55 is in progress, but only the root snapshot can "commit".
  56
  57 # Resolving constraints
  58
  59 The constraint resolution algorithm is not super complex but also not
  60 entirely obvious.  Here I describe the problem somewhat abstractly,
  61 then describe how the current code works.  There may be other, smarter
  62 ways of doing this with which I am unfamiliar and can't be bothered to
  63 research at the moment. - NDM
  64
  65 ## The problem
  66
  67 Basically our input is a directed graph where nodes can be divided
  68 into two categories: region variables and concrete regions.  Each edge
  69 `R -> S` in the graph represents a constraint that the region `R` is a
  70 subregion of the region `S`.
  71
  72 Region variable nodes can have arbitrary degree.  There is one region
  73 variable node per region variable.
  74
  75 Each concrete region node is associated with some, well, concrete
  76 region: e.g., a free lifetime, or the region for a particular scope.
  77 Note that there may be more than one concrete region node for a
  78 particular region value.  Moreover, because of how the graph is built,
  79 we know that all concrete region nodes have either in-degree 1 or
  80 out-degree 1.
  81
  82 Before resolution begins, we build up the constraints in a hashmap
  83 that maps `Constraint` keys to spans.  During resolution, we construct
  84 the actual `Graph` structure that we describe here.
  85
  86 ## Our current algorithm
  87
  88 We divide region variables into two groups: Expanding and Contracting.
  89 Expanding region variables are those that have a concrete region
  90 predecessor (direct or indirect).  Contracting region variables are
  91 all others.
  92
  93 We first resolve the values of Expanding region variables and then
  94 process Contracting ones.  We currently use an iterative, fixed-point
  95 procedure (but read on, I believe this could be replaced with a linear
  96 walk).  Basically we iterate over the edges in the graph, ensuring
  97 that, if the source of the edge has a value, then this value is a
  98 subregion of the target value.  If the target does not yet have a
  99 value, it takes the value from the source.  If the target already had
 100 a value, then the resulting value is Least Upper Bound of the old and
 101 new values. When we are done, each Expanding node will have the
 102 smallest region that it could possibly have and still satisfy the
 103 constraints.
 104
 105 We next process the Contracting nodes.  Here we again iterate over the
 106 edges, only this time we move values from target to source (if the
 107 source is a Contracting node).  For each contracting node, we compute
 108 its value as the GLB of all its successors.  Basically contracting
 109 nodes ensure that there is overlap between their successors; we will
 110 ultimately infer the largest overlap possible.
 111
 112 # The Region Hierarchy
 113
 114 ## Without closures
 115
 116 Let's first consider the region hierarchy without thinking about
 117 closures, because they add a lot of complications. The region
 118 hierarchy *basically* mirrors the lexical structure of the code.
 119 There is a region for every piece of 'evaluation' that occurs, meaning
 120 every expression, block, and pattern (patterns are considered to
 121 "execute" by testing the value they are applied to and creating any
 122 relevant bindings).  So, for example:
 123
 124     fn foo(x: int, y: int) { // -+
 125     //  +------------+       //  |
 126     //  |      +-----+       //  |
 127     //  |  +-+ +-+ +-+       //  |
 128     //  |  | | | | | |       //  |
 129     //  v  v v v v v v       //  |
 130         let z = x + y;       //  |
 131         ...                  //  |
 132     }                        // -+
 133
 134     fn bar() { ... }
 135
 136 In this example, there is a region for the fn body block as a whole,
 137 and then a subregion for the declaration of the local variable.
 138 Within that, there are sublifetimes for the assignment pattern and
 139 also the expression `x + y`. The expression itself has sublifetimes
 140 for evaluating `x` and `y`.
 141
 142 ## Function calls
 143
 144 Function calls are a bit tricky. I will describe how we handle them
 145 *now* and then a bit about how we can improve them (Issue #6268).
 146
 147 Consider a function call like `func(expr1, expr2)`, where `func`,
 148 `arg1`, and `arg2` are all arbitrary expressions. Currently,
 149 we construct a region hierarchy like:
 150
 151     +----------------+
 152     |                |
 153     +--+ +---+  +---+|
 154     v  v v   v  v   vv
 155     func(expr1, expr2)
 156
 157 Here you can see that the call as a whole has a region and the
 158 function plus arguments are subregions of that. As a side-effect of
 159 this, we get a lot of spurious errors around nested calls, in
 160 particular when combined with `&mut` functions. For example, a call
 161 like this one
 162
 163     self.foo(self.bar())
 164
 165 where both `foo` and `bar` are `&mut self` functions will always yield
 166 an error.
 167
 168 Here is a more involved example (which is safe) so we can see what's
 169 going on:
 170
 171     struct Foo { f: uint, g: uint }
 172     ...
 173     fn add(p: &mut uint, v: uint) {
 174         *p += v;
 175     }
 176     ...
 177     fn inc(p: &mut uint) -> uint {
 178         *p += 1; *p
 179     }
 180     fn weird() {
 181         let mut x: Box<Foo> = box Foo { ... };
 182         'a: add(&mut (*x).f,
 183                 'b: inc(&mut (*x).f)) // (..)
 184     }
 185
 186 The important part is the line marked `(..)` which contains a call to
 187 `add()`. The first argument is a mutable borrow of the field `f`.  The
 188 second argument also borrows the field `f`. Now, in the current borrow
 189 checker, the first borrow is given the lifetime of the call to
 190 `add()`, `'a`.  The second borrow is given the lifetime of `'b` of the
 191 call to `inc()`. Because `'b` is considered to be a sublifetime of
 192 `'a`, an error is reported since there are two co-existing mutable
 193 borrows of the same data.
 194
 195 However, if we were to examine the lifetimes a bit more carefully, we
 196 can see that this error is unnecessary. Let's examine the lifetimes
 197 involved with `'a` in detail. We'll break apart all the steps involved
 198 in a call expression:
 199
 200     'a: {
 201         'a_arg1: let a_temp1: ... = add;
 202         'a_arg2: let a_temp2: &'a mut uint = &'a mut (*x).f;
 203         'a_arg3: let a_temp3: uint = {
 204             let b_temp1: ... = inc;
 205             let b_temp2: &'b = &'b mut (*x).f;
 206             'b_call: b_temp1(b_temp2)
 207         };
 208         'a_call: a_temp1(a_temp2, a_temp3) // (**)
 209     }
 210
 211 Here we see that the lifetime `'a` includes a number of substatements.
 212 In particular, there is this lifetime I've called `'a_call` that
 213 corresponds to the *actual execution of the function `add()`*, after
 214 all arguments have been evaluated. There is a corresponding lifetime
 215 `'b_call` for the execution of `inc()`. If we wanted to be precise
 216 about it, the lifetime of the two borrows should be `'a_call` and
 217 `'b_call` respectively, since the references that were created
 218 will not be dereferenced except during the execution itself.
 219
 220 However, this model by itself is not sound. The reason is that
 221 while the two references that are created will never be used
 222 simultaneously, it is still true that the first reference is
 223 *created* before the second argument is evaluated, and so even though
 224 it will not be *dereferenced* during the evaluation of the second
 225 argument, it can still be *invalidated* by that evaluation. Consider
 226 this similar but unsound example:
 227
 228     struct Foo { f: uint, g: uint }
 229     ...
 230     fn add(p: &mut uint, v: uint) {
 231         *p += v;
 232     }
 233     ...
 234     fn consume(x: Box<Foo>) -> uint {
 235         x.f + x.g
 236     }
 237     fn weird() {
 238         let mut x: Box<Foo> = box Foo { ... };
 239         'a: add(&mut (*x).f, consume(x)) // (..)
 240     }
 241
 242 In this case, the second argument to `add` actually consumes `x`, thus
 243 invalidating the first argument.
 244
 245 So, for now, we exclude the `call` lifetimes from our model.
 246 Eventually I would like to include them, but we will have to make the
 247 borrow checker handle this situation correctly. In particular, if
 248 there is a reference created whose lifetime does not enclose
 249 the borrow expression, we must issue sufficient restrictions to ensure
 250 that the pointee remains valid.
 251
 252 ## Adding closures
 253
 254 The other significant complication to the region hierarchy is
 255 closures. I will describe here how closures should work, though some
 256 of the work to implement this model is ongoing at the time of this
 257 writing.
 258
 259 The body of closures are type-checked along with the function that
 260 creates them. However, unlike other expressions that appear within the
 261 function body, it is not entirely obvious when a closure body executes
 262 with respect to the other expressions. This is because the closure
 263 body will execute whenever the closure is called; however, we can
 264 never know precisely when the closure will be called, especially
 265 without some sort of alias analysis.
 266
 267 However, we can place some sort of limits on when the closure
 268 executes.  In particular, the type of every closure `fn:'r K` includes
 269 a region bound `'r`. This bound indicates the maximum lifetime of that
 270 closure; once we exit that region, the closure cannot be called
 271 anymore. Therefore, we say that the lifetime of the closure body is a
 272 sublifetime of the closure bound, but the closure body itself is unordered
 273 with respect to other parts of the code.
 274
 275 For example, consider the following fragment of code:
 276
 277     'a: {
 278          let closure: fn:'a() = || 'b: {
 279              'c: ...
 280          };
 281          'd: ...
 282     }
 283
 284 Here we have four lifetimes, `'a`, `'b`, `'c`, and `'d`. The closure
 285 `closure` is bounded by the lifetime `'a`. The lifetime `'b` is the
 286 lifetime of the closure body, and `'c` is some statement within the
 287 closure body. Finally, `'d` is a statement within the outer block that
 288 created the closure.
 289
 290 We can say that the closure body `'b` is a sublifetime of `'a` due to
 291 the closure bound. By the usual lexical scoping conventions, the
 292 statement `'c` is clearly a sublifetime of `'b`, and `'d` is a
 293 sublifetime of `'d`. However, there is no ordering between `'c` and
 294 `'d` per se (this kind of ordering between statements is actually only
 295 an issue for dataflow; passes like the borrow checker must assume that
 296 closures could execute at any time from the moment they are created
 297 until they go out of scope).
 298
 299 ### Complications due to closure bound inference
 300
 301 There is only one problem with the above model: in general, we do not
 302 actually *know* the closure bounds during region inference! In fact,
 303 closure bounds are almost always region variables! This is very tricky
 304 because the inference system implicitly assumes that we can do things
 305 like compute the LUB of two scoped lifetimes without needing to know
 306 the values of any variables.
 307
 308 Here is an example to illustrate the problem:
 309
 310     fn identify<T>(x: T) -> T { x }
 311
 312     fn foo() { // 'foo is the function body
 313       'a: {
 314            let closure = identity(|| 'b: {
 315                'c: ...
 316            });
 317            'd: closure();
 318       }
 319       'e: ...;
 320     }
 321
 322 In this example, the closure bound is not explicit. At compile time,
 323 we will create a region variable (let's call it `V0`) to represent the
 324 closure bound.
 325
 326 The primary difficulty arises during the constraint propagation phase.
 327 Imagine there is some variable with incoming edges from `'c` and `'d`.
 328 This means that the value of the variable must be `LUB('c,
 329 'd)`. However, without knowing what the closure bound `V0` is, we
 330 can't compute the LUB of `'c` and `'d`! Any we don't know the closure
 331 bound until inference is done.
 332
 333 The solution is to rely on the fixed point nature of inference.
 334 Basically, when we must compute `LUB('c, 'd)`, we just use the current
 335 value for `V0` as the closure's bound. If `V0`'s binding should
 336 change, then we will do another round of inference, and the result of
 337 `LUB('c, 'd)` will change.
 338
 339 One minor implication of this is that the graph does not in fact track
 340 the full set of dependencies between edges. We cannot easily know
 341 whether the result of a LUB computation will change, since there may
 342 be indirect dependencies on other variables that are not reflected on
 343 the graph. Therefore, we must *always* iterate over all edges when
 344 doing the fixed point calculation, not just those adjacent to nodes
 345 whose values have changed.
 346
 347 Were it not for this requirement, we could in fact avoid fixed-point
 348 iteration altogether. In that universe, we could instead first
 349 identify and remove strongly connected components (SCC) in the graph.
 350 Note that such components must consist solely of region variables; all
 351 of these variables can effectively be unified into a single variable.
 352 Once SCCs are removed, we are left with a DAG.  At this point, we
 353 could walk the DAG in topological order once to compute the expanding
 354 nodes, and again in reverse topological order to compute the
 355 contracting nodes. However, as I said, this does not work given the
 356 current treatment of closure bounds, but perhaps in the future we can
 357 address this problem somehow and make region inference somewhat more
 358 efficient. Note that this is solely a matter of performance, not
 359 expressiveness.
 360
 361 ### Skolemization
 362
 363 For a discussion on skolemization and higher-ranked subtyping, please
 364 see the module `middle::infer::higher_ranked::doc`.