src/librustc/middle/traits/README.md

   1 # TRAIT RESOLUTION
   2
   3 This document describes the general process and points out some non-obvious
   4 things.
   5
   6 ## Major concepts
   7
   8 Trait resolution is the process of pairing up an impl with each
   9 reference to a trait. So, for example, if there is a generic function like:
  10
  11     fn clone_slice<T:Clone>(x: &[T]) -> Vec<T> { ... }
  12
  13 and then a call to that function:
  14
  15     let v: Vec<isize> = clone_slice([1, 2, 3])
  16
  17 it is the job of trait resolution to figure out (in which case)
  18 whether there exists an impl of `isize : Clone`
  19
  20 Note that in some cases, like generic functions, we may not be able to
  21 find a specific impl, but we can figure out that the caller must
  22 provide an impl. To see what I mean, consider the body of `clone_slice`:
  23
  24     fn clone_slice<T:Clone>(x: &[T]) -> Vec<T> {
  25         let mut v = Vec::new();
  26         for e in &x {
  27             v.push((*e).clone()); // (*)
  28         }
  29     }
  30
  31 The line marked `(*)` is only legal if `T` (the type of `*e`)
  32 implements the `Clone` trait. Naturally, since we don't know what `T`
  33 is, we can't find the specific impl; but based on the bound `T:Clone`,
  34 we can say that there exists an impl which the caller must provide.
  35
  36 We use the term *obligation* to refer to a trait reference in need of
  37 an impl.
  38
  39 ## Overview
  40
  41 Trait resolution consists of three major parts:
  42
  43 - SELECTION: Deciding how to resolve a specific obligation. For
  44   example, selection might decide that a specific obligation can be
  45   resolved by employing an impl which matches the self type, or by
  46   using a parameter bound. In the case of an impl, Selecting one
  47   obligation can create *nested obligations* because of where clauses
  48   on the impl itself. It may also require evaluating those nested
  49   obligations to resolve ambiguities.
  50
  51 - FULFILLMENT: The fulfillment code is what tracks that obligations
  52   are completely fulfilled. Basically it is a worklist of obligations
  53   to be selected: once selection is successful, the obligation is
  54   removed from the worklist and any nested obligations are enqueued.
  55
  56 - COHERENCE: The coherence checks are intended to ensure that there
  57   are never overlapping impls, where two impls could be used with
  58   equal precedence.
  59
  60 ## Selection
  61
  62 Selection is the process of deciding whether an obligation can be
  63 resolved and, if so, how it is to be resolved (via impl, where clause, etc).
  64 The main interface is the `select()` function, which takes an obligation
  65 and returns a `SelectionResult`. There are three possible outcomes:
  66
  67 - `Ok(Some(selection))` -- yes, the obligation can be resolved, and
  68   `selection` indicates how. If the impl was resolved via an impl,
  69   then `selection` may also indicate nested obligations that are required
  70   by the impl.
  71
  72 - `Ok(None)` -- we are not yet sure whether the obligation can be
  73   resolved or not. This happens most commonly when the obligation
  74   contains unbound type variables.
  75
  76 - `Err(err)` -- the obligation definitely cannot be resolved due to a
  77   type error, or because there are no impls that could possibly apply,
  78   etc.
  79
  80 The basic algorithm for selection is broken into two big phases:
  81 candidate assembly and confirmation.
  82
  83 ### Candidate assembly
  84
  85 Searches for impls/where-clauses/etc that might
  86 possibly be used to satisfy the obligation. Each of those is called
  87 a candidate. To avoid ambiguity, we want to find exactly one
  88 candidate that is definitively applicable. In some cases, we may not
  89 know whether an impl/where-clause applies or not -- this occurs when
  90 the obligation contains unbound inference variables.
  91
  92 The basic idea for candidate assembly is to do a first pass in which
  93 we identify all possible candidates. During this pass, all that we do
  94 is try and unify the type parameters. (In particular, we ignore any
  95 nested where clauses.) Presuming that this unification succeeds, the
  96 impl is added as a candidate.
  97
  98 Once this first pass is done, we can examine the set of candidates. If
  99 it is a singleton set, then we are done: this is the only impl in
 100 scope that could possibly apply. Otherwise, we can winnow down the set
 101 of candidates by using where clauses and other conditions. If this
 102 reduced set yields a single, unambiguous entry, we're good to go,
 103 otherwise the result is considered ambiguous.
 104
 105 #### The basic process: Inferring based on the impls we see
 106
 107 This process is easier if we work through some examples. Consider
 108 the following trait:
 109
 110 ```
 111 trait Convert<Target> {
 112     fn convert(&self) -> Target;
 113 }
 114 ```
 115
 116 This trait just has one method. It's about as simple as it gets. It
 117 converts from the (implicit) `Self` type to the `Target` type. If we
 118 wanted to permit conversion between `isize` and `usize`, we might
 119 implement `Convert` like so:
 120
 121 ```rust
 122 impl Convert<usize> for isize { ... } // isize -> usize
 123 impl Convert<isize> for usize { ... } // usize -> isize
 124 ```
 125
 126 Now imagine there is some code like the following:
 127
 128 ```rust
 129 let x: isize = ...;
 130 let y = x.convert();
 131 ```
 132
 133 The call to convert will generate a trait reference `Convert<$Y> for
 134 isize`, where `$Y` is the type variable representing the type of
 135 `y`. When we match this against the two impls we can see, we will find
 136 that only one remains: `Convert<usize> for isize`. Therefore, we can
 137 select this impl, which will cause the type of `$Y` to be unified to
 138 `usize`. (Note that while assembling candidates, we do the initial
 139 unifications in a transaction, so that they don't affect one another.)
 140
 141 There are tests to this effect in src/test/run-pass:
 142
 143    traits-multidispatch-infer-convert-source-and-target.rs
 144    traits-multidispatch-infer-convert-target.rs
 145
 146 #### Winnowing: Resolving ambiguities
 147
 148 But what happens if there are multiple impls where all the types
 149 unify? Consider this example:
 150
 151 ```rust
 152 trait Get {
 153     fn get(&self) -> Self;
 154 }
 155
 156 impl<T:Copy> Get for T {
 157     fn get(&self) -> T { *self }
 158 }
 159
 160 impl<T:Get> Get for Box<T> {
 161     fn get(&self) -> Box<T> { box get_it(&**self) }
 162 }
 163 ```
 164
 165 What happens when we invoke `get_it(&box 1_u16)`, for example? In this
 166 case, the `Self` type is `Box<u16>` -- that unifies with both impls,
 167 because the first applies to all types, and the second to all
 168 boxes. In the olden days we'd have called this ambiguous. But what we
 169 do now is do a second *winnowing* pass that considers where clauses
 170 and attempts to remove candidates -- in this case, the first impl only
 171 applies if `Box<u16> : Copy`, which doesn't hold. After winnowing,
 172 then, we are left with just one candidate, so we can proceed. There is
 173 a test of this in `src/test/run-pass/traits-conditional-dispatch.rs`.
 174
 175 #### Matching
 176
 177 The subroutines that decide whether a particular impl/where-clause/etc
 178 applies to a particular obligation. At the moment, this amounts to
 179 unifying the self types, but in the future we may also recursively
 180 consider some of the nested obligations, in the case of an impl.
 181
 182 #### Lifetimes and selection
 183
 184 Because of how that lifetime inference works, it is not possible to
 185 give back immediate feedback as to whether a unification or subtype
 186 relationship between lifetimes holds or not. Therefore, lifetime
 187 matching is *not* considered during selection. This is reflected in
 188 the fact that subregion assignment is infallible. This may yield
 189 lifetime constraints that will later be found to be in error (in
 190 contrast, the non-lifetime-constraints have already been checked
 191 during selection and can never cause an error, though naturally they
 192 may lead to other errors downstream).
 193
 194 #### Where clauses
 195
 196 Besides an impl, the other major way to resolve an obligation is via a
 197 where clause. The selection process is always given a *parameter
 198 environment* which contains a list of where clauses, which are
 199 basically obligations that can assume are satisfiable. We will iterate
 200 over that list and check whether our current obligation can be found
 201 in that list, and if so it is considered satisfied. More precisely, we
 202 want to check whether there is a where-clause obligation that is for
 203 the same trait (or some subtrait) and for which the self types match,
 204 using the definition of *matching* given above.
 205
 206 Consider this simple example:
 207
 208      trait A1 { ... }
 209      trait A2 : A1 { ... }
 210
 211      trait B { ... }
 212
 213      fn foo<X:A2+B> { ... }
 214
 215 Clearly we can use methods offered by `A1`, `A2`, or `B` within the
 216 body of `foo`. In each case, that will incur an obligation like `X :
 217 A1` or `X : A2`. The parameter environment will contain two
 218 where-clauses, `X : A2` and `X : B`. For each obligation, then, we
 219 search this list of where-clauses.  To resolve an obligation `X:A1`,
 220 we would note that `X:A2` implies that `X:A1`.
 221
 222 ### Confirmation
 223
 224 Confirmation unifies the output type parameters of the trait with the
 225 values found in the obligation, possibly yielding a type error.  If we
 226 return to our example of the `Convert` trait from the previous
 227 section, confirmation is where an error would be reported, because the
 228 impl specified that `T` would be `usize`, but the obligation reported
 229 `char`. Hence the result of selection would be an error.
 230
 231 ### Selection during translation
 232
 233 During type checking, we do not store the results of trait selection.
 234 We simply wish to verify that trait selection will succeed. Then
 235 later, at trans time, when we have all concrete types available, we
 236 can repeat the trait selection.  In this case, we do not consider any
 237 where-clauses to be in scope. We know that therefore each resolution
 238 will resolve to a particular impl.
 239
 240 One interesting twist has to do with nested obligations. In general, in trans,
 241 we only need to do a "shallow" selection for an obligation. That is, we wish to
 242 identify which impl applies, but we do not (yet) need to decide how to select
 243 any nested obligations. Nonetheless, we *do* currently do a complete resolution,
 244 and that is because it can sometimes inform the results of type inference. That is,
 245 we do not have the full substitutions in terms of the type variables of the impl available
 246 to us, so we must run trait selection to figure everything out.
 247
 248 Here is an example:
 249
 250     trait Foo { ... }
 251     impl<U,T:Bar<U>> Foo for Vec<T> { ... }
 252
 253     impl Bar<usize> for isize { ... }
 254
 255 After one shallow round of selection for an obligation like `Vec<isize>
 256 : Foo`, we would know which impl we want, and we would know that
 257 `T=isize`, but we do not know the type of `U`.  We must select the
 258 nested obligation `isize : Bar<U>` to find out that `U=usize`.
 259
 260 It would be good to only do *just as much* nested resolution as
 261 necessary. Currently, though, we just do a full resolution.
 262
 263 # Higher-ranked trait bounds
 264
 265 One of the more subtle concepts at work are *higher-ranked trait
 266 bounds*. An example of such a bound is `for<'a> MyTrait<&'a isize>`.
 267 Let's walk through how selection on higher-ranked trait references
 268 works.
 269
 270 ## Basic matching and skolemization leaks
 271
 272 Let's walk through the test `compile-fail/hrtb-just-for-static.rs` to see
 273 how it works. The test starts with the trait `Foo`:
 274
 275 ```rust
 276 trait Foo<X> {
 277     fn foo(&self, x: X) { }
 278 }
 279 ```
 280
 281 Let's say we have a function `want_hrtb` that wants a type which
 282 implements `Foo<&'a isize>` for any `'a`:
 283
 284 ```rust
 285 fn want_hrtb<T>() where T : for<'a> Foo<&'a isize> { ... }
 286 ```
 287
 288 Now we have a struct `AnyInt` that implements `Foo<&'a isize>` for any
 289 `'a`:
 290
 291 ```rust
 292 struct AnyInt;
 293 impl<'a> Foo<&'a isize> for AnyInt { }
 294 ```
 295
 296 And the question is, does `AnyInt : for<'a> Foo<&'a isize>`? We want the
 297 answer to be yes. The algorithm for figuring it out is closely related
 298 to the subtyping for higher-ranked types (which is described in
 299 `middle::infer::higher_ranked::doc`, but also in a [paper by SPJ] that
 300 I recommend you read).
 301
 302 1. Skolemize the obligation.
 303 2. Match the impl against the skolemized obligation.
 304 3. Check for skolemization leaks.
 305
 306 [paper by SPJ]: http://research.microsoft.com/en-us/um/people/simonpj/papers/higher-rank/
 307
 308 So let's work through our example. The first thing we would do is to
 309 skolemize the obligation, yielding `AnyInt : Foo<&'0 isize>` (here `'0`
 310 represents skolemized region #0). Note that now have no quantifiers;
 311 in terms of the compiler type, this changes from a `ty::PolyTraitRef`
 312 to a `TraitRef`. We would then create the `TraitRef` from the impl,
 313 using fresh variables for it's bound regions (and thus getting
 314 `Foo<&'$a isize>`, where `'$a` is the inference variable for `'a`). Next
 315 we relate the two trait refs, yielding a graph with the constraint
 316 that `'0 == '$a`. Finally, we check for skolemization "leaks" -- a
 317 leak is basically any attempt to relate a skolemized region to another
 318 skolemized region, or to any region that pre-existed the impl match.
 319 The leak check is done by searching from the skolemized region to find
 320 the set of regions that it is related to in any way. This is called
 321 the "taint" set. To pass the check, that set must consist *solely* of
 322 itself and region variables from the impl. If the taint set includes
 323 any other region, then the match is a failure. In this case, the taint
 324 set for `'0` is `{'0, '$a}`, and hence the check will succeed.
 325
 326 Let's consider a failure case. Imagine we also have a struct
 327
 328 ```rust
 329 struct StaticInt;
 330 impl Foo<&'static isize> for StaticInt;
 331 ```
 332
 333 We want the obligation `StaticInt : for<'a> Foo<&'a isize>` to be
 334 considered unsatisfied. The check begins just as before. `'a` is
 335 skolemized to `'0` and the impl trait reference is instantiated to
 336 `Foo<&'static isize>`. When we relate those two, we get a constraint
 337 like `'static == '0`. This means that the taint set for `'0` is `{'0,
 338 'static}`, which fails the leak check.
 339
 340 ## Higher-ranked trait obligations
 341
 342 Once the basic matching is done, we get to another interesting topic:
 343 how to deal with impl obligations. I'll work through a simple example
 344 here. Imagine we have the traits `Foo` and `Bar` and an associated impl:
 345
 346 ```
 347 trait Foo<X> {
 348     fn foo(&self, x: X) { }
 349 }
 350
 351 trait Bar<X> {
 352     fn bar(&self, x: X) { }
 353 }
 354
 355 impl<X,F> Foo<X> for F
 356     where F : Bar<X>
 357 {
 358 }
 359 ```
 360
 361 Now let's say we have a obligation `for<'a> Foo<&'a isize>` and we match
 362 this impl. What obligation is generated as a result? We want to get
 363 `for<'a> Bar<&'a isize>`, but how does that happen?
 364
 365 After the matching, we are in a position where we have a skolemized
 366 substitution like `X => &'0 isize`. If we apply this substitution to the
 367 impl obligations, we get `F : Bar<&'0 isize>`. Obviously this is not
 368 directly usable because the skolemized region `'0` cannot leak out of
 369 our computation.
 370
 371 What we do is to create an inverse mapping from the taint set of `'0`
 372 back to the original bound region (`'a`, here) that `'0` resulted
 373 from. (This is done in `higher_ranked::plug_leaks`). We know that the
 374 leak check passed, so this taint set consists solely of the skolemized
 375 region itself plus various intermediate region variables. We then walk
 376 the trait-reference and convert every region in that taint set back to
 377 a late-bound region, so in this case we'd wind up with `for<'a> F :
 378 Bar<&'a isize>`.
 379
 380 # Caching and subtle considerations therewith
 381
 382 In general we attempt to cache the results of trait selection.  This
 383 is a somewhat complex process. Part of the reason for this is that we
 384 want to be able to cache results even when all the types in the trait
 385 reference are not fully known. In that case, it may happen that the
 386 trait selection process is also influencing type variables, so we have
 387 to be able to not only cache the *result* of the selection process,
 388 but *replay* its effects on the type variables.
 389
 390 ## An example
 391
 392 The high-level idea of how the cache works is that we first replace
 393 all unbound inference variables with skolemized versions. Therefore,
 394 if we had a trait reference `usize : Foo<$1>`, where `$n` is an unbound
 395 inference variable, we might replace it with `usize : Foo<%0>`, where
 396 `%n` is a skolemized type. We would then look this up in the cache.
 397 If we found a hit, the hit would tell us the immediate next step to
 398 take in the selection process: i.e., apply impl #22, or apply where
 399 clause `X : Foo<Y>`. Let's say in this case there is no hit.
 400 Therefore, we search through impls and where clauses and so forth, and
 401 we come to the conclusion that the only possible impl is this one,
 402 with def-id 22:
 403
 404     impl Foo<isize> for usize { ... } // Impl #22
 405
 406 We would then record in the cache `usize : Foo<%0> ==>
 407 ImplCandidate(22)`. Next we would confirm `ImplCandidate(22)`, which
 408 would (as a side-effect) unify `$1` with `isize`.
 409
 410 Now, at some later time, we might come along and see a `usize :
 411 Foo<$3>`.  When skolemized, this would yield `usize : Foo<%0>`, just as
 412 before, and hence the cache lookup would succeed, yielding
 413 `ImplCandidate(22)`. We would confirm `ImplCandidate(22)` which would
 414 (as a side-effect) unify `$3` with `isize`.
 415
 416 ## Where clauses and the local vs global cache
 417
 418 One subtle interaction is that the results of trait lookup will vary
 419 depending on what where clauses are in scope. Therefore, we actually
 420 have *two* caches, a local and a global cache. The local cache is
 421 attached to the `ParameterEnvironment` and the global cache attached
 422 to the `tcx`. We use the local cache whenever the result might depend
 423 on the where clauses that are in scope. The determination of which
 424 cache to use is done by the method `pick_candidate_cache` in
 425 `select.rs`. At the moment, we use a very simple, conservative rule:
 426 if there are any where-clauses in scope, then we use the local cache.
 427 We used to try and draw finer-grained distinctions, but that led to a
 428 serious of annoying and weird bugs like #22019 and #18290. This simple
 429 rule seems to be pretty clearly safe and also still retains a very
 430 high hit rate (~95% when compiling rustc).