]>
Commit | Line | Data |
---|---|---|
6a06907d XL |
1 | # Early and Late Bound Variables |
2 | ||
3 | In Rust, item definitions (like `fn`) can often have generic parameters, which | |
4 | are always [_universally_ quantified][quant]. That is, if you have a function | |
5 | like | |
6 | ||
7 | ```rust | |
8 | fn foo<T>(x: T) { } | |
9 | ``` | |
10 | ||
11 | this function is defined "for all T" (not "for some specific T", which would be | |
12 | [_existentially_ quantified][quant]). | |
13 | ||
14 | [quant]: ./appendix/background.md#quantified | |
15 | ||
16 | While Rust *items* can be quantified over types, lifetimes, and constants, the | |
17 | types of values in Rust are only ever quantified over lifetimes. So you can | |
18 | have a type like `for<'a> fn(&'a u32)`, which represents a function pointer | |
19 | that takes a reference with any lifetime, or `for<'a> dyn Trait<'a>`, which is | |
20 | a `dyn` trait for a trait implemented for any lifetime; but we have no type | |
21 | like `for<T> fn(T)`, which would be a function that takes a value of *any type* | |
22 | as a parameter. This is a consequence of monomorphization -- to support a value | |
23 | of type `for<T> fn(T)`, we would need a single function pointer that can be | |
24 | used for a parameter of any type, but in Rust we generate customized code for | |
25 | each parameter type. | |
26 | ||
3c0e092e | 27 | One consequence of this asymmetry is a weird split in how we represent some |
6a06907d XL |
28 | generic types: _early-_ and _late-_ bound parameters. |
29 | Basically, if we cannot represent a type (e.g. a universally quantified type), | |
30 | we have to bind it _early_ so that the unrepresentable type is never around. | |
31 | ||
32 | Consider the following example: | |
33 | ||
34 | ```rust,ignore | |
35 | fn foo<'a, 'b, T>(x: &'a u32, y: &'b T) where T: 'b { ... } | |
36 | ``` | |
37 | ||
38 | We cannot treat `'a`, `'b`, and `T` in the same way. Types in Rust can't have | |
39 | `for<T> { .. }`, only `for<'a> {...}`, so whenever you reference `foo` the type | |
40 | you get back can't be `for<'a, 'b, T> fn(&'a u32, y: &'b T)`. Instead, the `T` | |
41 | must be substituted early. In particular, you have: | |
42 | ||
43 | ```rust,ignore | |
44 | let x = foo; // T, 'b have to be substituted here | |
45 | x(...); // 'a substituted here, at the point of call | |
46 | x(...); // 'a substituted here with a different value | |
47 | ``` | |
48 | ||
49 | ## Early-bound parameters | |
50 | ||
51 | Early-bound parameters in rustc are identified by an index, stored in the | |
52 | [`ParamTy`] struct for types or the [`EarlyBoundRegion`] struct for lifetimes. | |
53 | The index counts from the outermost declaration in scope. This means that as you | |
54 | add more binders inside, the index doesn't change. | |
55 | ||
56 | For example, | |
57 | ||
58 | ```rust,ignore | |
59 | trait Foo<T> { | |
60 | type Bar<U> = (Self, T, U); | |
61 | } | |
62 | ``` | |
63 | ||
64 | Here, the type `(Self, T, U)` would be `($0, $1, $2)`, where `$N` means a | |
65 | [`ParamTy`] with the index of `N`. | |
66 | ||
67 | In rustc, the [`Generics`] structure carries this information. So the | |
68 | [`Generics`] for `Bar` above would be just like for `U` and would indicate the | |
69 | 'parent' generics of `Foo`, which declares `Self` and `T`. You can read more | |
70 | in [this chapter](./generics.md). | |
71 | ||
72 | [`ParamTy`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.ParamTy.html | |
73 | [`EarlyBoundRegion`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.EarlyBoundRegion.html | |
74 | [`Generics`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Generics.html | |
75 | ||
76 | ## Late-bound parameters | |
77 | ||
78 | Late-bound parameters in `rustc` are handled quite differently (they are also | |
79 | specialized to lifetimes since, right now, only late-bound lifetimes are | |
80 | supported, though with GATs that has to change). We indicate their potential | |
81 | presence by a [`Binder`] type. The [`Binder`] doesn't know how many variables | |
82 | there are at that binding level. This can only be determined by walking the | |
83 | type itself and collecting them. So a type like `for<'a, 'b> ('a, 'b)` would be | |
84 | `for (^0.a, ^0.b)`. Here, we just write `for` because we don't know the names | |
85 | of the things bound within. | |
86 | ||
87 | Moreover, a reference to a late-bound lifetime is written `^0.a`: | |
88 | ||
89 | - The `0` is the index; it identifies that this lifetime is bound in the | |
90 | innermost binder (the `for`). | |
91 | - The `a` is the "name"; late-bound lifetimes in rustc are identified by a | |
92 | "name" -- the [`BoundRegionKind`] enum. This enum can contain a | |
93 | [`DefId`][defid] or it might have various "anonymous" numbered names. The | |
94 | latter arise from types like `fn(&u32, &u32)`, which are equivalent to | |
95 | something like `for<'a, 'b> fn(&'a u32, &'b u32)`, but the names of those | |
96 | lifetimes must be generated. | |
97 | ||
98 | This setup of not knowing the full set of variables at a binding level has some | |
99 | advantages and some disadvantages. The disadvantage is that you must walk the | |
100 | type to find out what is bound at the given level and so forth. The advantage | |
101 | is primarily that, when constructing types from Rust syntax, if we encounter | |
102 | anonymous regions like in `fn(&u32)`, we just create a fresh index and don't have | |
103 | to update the binder. | |
104 | ||
105 | [`Binder`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Binder.html | |
106 | [`BoundRegionKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/enum.BoundRegionKind.html | |
107 | [defid]: ./hir.html#identifiers-in-the-hir |