# Subtyping and Variance

Subtyping is a relationship between types that allows statically typed
languages to be a bit more flexible and permissive.

Subtyping in Rust is a bit different from subtyping in other languages. This
makes it harder to give simple examples, which is a problem since subtyping,
and especially variance, is already hard to understand properly. As in,
even compiler writers mess it up all the time.

To keep things simple, this section will consider a small extension to the
Rust language that adds a new and simpler subtyping relationship. After
establishing concepts and issues under this simpler system,
we will then relate it back to how subtyping actually occurs in Rust.

So here's our simple extension, *Objective Rust*, featuring three new types:

```rust
trait Animal {
    fn snuggle(&self);
    fn eat(&mut self);
}

trait Cat: Animal {
    fn meow(&self);
}

trait Dog: Animal {
    fn bark(&self);
}
```

But unlike normal traits, we can use them as concrete and sized types, just like structs.

Now, say we have a very simple function that takes an Animal, like this:

<!-- ignore: simplified code -->
```rust,ignore
fn love(pet: Animal) {
    pet.snuggle();
}
```

By default, static types must match *exactly* for a program to compile. As such,
this code won't compile:

<!-- ignore: simplified code -->
```rust,ignore
let mr_snuggles: Cat = ...;
love(mr_snuggles);         // ERROR: expected Animal, found Cat
```

Mr. Snuggles is a Cat, and Cats aren't *exactly* Animals, so we can't love him! 😿

This is annoying because Cats *are* Animals. They support every operation
an Animal supports, so intuitively `love` shouldn't care if we pass it a `Cat`.
We should be able to just **forget** the non-animal parts of our `Cat`, as they
aren't necessary to love it.

This is exactly the problem that *subtyping* is intended to fix. Because Cats are just
Animals **and more**, we say Cat is a *subtype* of Animal (because Cats are a *subset*
of all the Animals). Equivalently, we say that Animal is a *supertype* of Cat.
With subtypes, we can tweak our overly strict static type system
with a simple rule: anywhere a value of type `T` is expected, we will also
accept values that are subtypes of `T`.

Or more concretely: anywhere an Animal is expected, a Cat or Dog will also work.

As we will see throughout the rest of this section, subtyping is a lot more complicated
and subtle than this, but this simple rule is a very good 99% intuition. And unless you
write unsafe code, the compiler will automatically handle all the corner cases for you.

But this is the Rustonomicon. We're writing unsafe code, so we need to understand how
this stuff really works, and how we can mess it up.

The core problem is that this rule, naively applied, will lead to *meowing Dogs*. That is,
we can convince someone that a Dog is actually a Cat. This completely destroys the fabric
of our static type system, making it worse than useless (and leading to Undefined Behavior).

Here's a simple example of this happening when we apply subtyping in a completely naive
"find and replace" way.

<!-- ignore: simplified code -->
```rust,ignore
fn evil_feeder(pet: &mut Animal) {
    let spike: Dog = ...;

    // `pet` is an Animal, and Dog is a subtype of Animal,
    // so this should be fine, right..?
    *pet = spike;
}

fn main() {
    let mut mr_snuggles: Cat = ...;
    evil_feeder(&mut mr_snuggles);  // Replaces mr_snuggles with a Dog
    mr_snuggles.meow();             // OH NO, MEOWING DOG!
}
```

Clearly, we need a more robust system than "find and replace". That system is *variance*,
which is a set of rules governing how subtyping should compose. Most importantly, variance
defines situations where subtyping should be disabled.

But before we get into variance, let's take a quick peek at where subtyping actually occurs in
Rust: *lifetimes*!

> NOTE: The typed-ness of lifetimes is a fairly arbitrary construct that some
> disagree with. However it simplifies our analysis to treat lifetimes
> and types uniformly.

Lifetimes are just regions of code, and regions can be partially ordered with the *contains*
(outlives) relationship. Subtyping on lifetimes is in terms of that relationship:
if `'big: 'small` ("big contains small" or "big outlives small"), then `'big` is a subtype
of `'small`. This is a large source of confusion, because it seems backwards
to many: the bigger region is a *subtype* of the smaller region. But it makes
sense if you consider our Animal example: Cat is an Animal *and more*,
just as `'big` is `'small` *and more*.

Put another way, if someone wants a reference that lives for `'small`,
usually what they actually mean is that they want a reference that lives
for *at least* `'small`. They don't actually care if the lifetimes match
exactly. So it should be ok for us to **forget** that something lives for
`'big` and only remember that it lives for `'small`.
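
As a small sketch of that "forgetting" step in real Rust: the helper below (the name `forget_big` is made up for illustration, not taken from any library) accepts a reference that lives for `'big` and hands it back as one that only claims to live for `'small`. The bound `'big: 'small` is all the compiler needs to accept the conversion.

```rust
// Hypothetical helper: `'big: 'small` states that 'big outlives 'small,
// so `&'big u32` is a subtype of `&'small u32` and can be returned unchanged.
fn forget_big<'big: 'small, 'small>(long_lived: &'big u32) -> &'small u32 {
    long_lived
}

// Hypothetical caller: a `&'static u32` is accepted wherever a shorter
// lifetime would have been enough.
fn read_forever() -> u32 {
    static FOREVER: u32 = 42;
    let shorter: &u32 = forget_big(&FOREVER);
    *shorter
}
```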

The meowing dog problem for lifetimes will result in us being able to
store a short-lived reference in a place that expects a longer-lived one,
creating a dangling reference and letting us use-after-free.

It will be useful to note that `'static`, the forever lifetime, is a subtype of
every lifetime because by definition it outlives everything. We will be using
this relationship in later examples to keep them as simple as possible.

With all that said, we still have no idea how to actually *use* subtyping of lifetimes,
because nothing ever has type `'a`. Lifetimes only occur as part of some larger type
like `&'a u32` or `IterMut<'a, u32>`. To apply lifetime subtyping, we need to know
how to compose subtyping. Once again, we need *variance*.

## Variance

Variance is where things get a bit complicated.

Variance is a property that *type constructors* have with respect to their
arguments. A type constructor in Rust is any generic type with unbound arguments.
For instance `Vec` is a type constructor that takes a type `T` and returns
`Vec<T>`. `&` and `&mut` are type constructors that take two inputs: a
lifetime, and a type to point to.

> NOTE: For convenience we will often refer to `F<T>` as a type constructor just so
> that we can easily talk about `T`. Hopefully this is clear in context.

A type constructor F's *variance* is how the subtyping of its inputs affects the
subtyping of its outputs. There are three kinds of variance in Rust. Given two
types `Sub` and `Super`, where `Sub` is a subtype of `Super`:

* `F` is *covariant* if `F<Sub>` is a subtype of `F<Super>` (subtyping "passes through")
* `F` is *contravariant* if `F<Super>` is a subtype of `F<Sub>` (subtyping is "inverted")
* `F` is *invariant* otherwise (no subtyping relationship exists)

If `F` has multiple type parameters, we can talk about the individual variances
by saying that, for example, `F<T, U>` is covariant over `T` and invariant over `U`.
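
To make the definitions a little more concrete, here is a minimal sketch using lifetimes as the `Sub`/`Super` pair (`'static` is a subtype of any `'a`). The function names are invented for this illustration. The covariant case compiles; the invariant case is left commented out because the compiler rejects it.

```rust
use std::cell::Cell;

// Covariant: `Vec<&'static str>` is a subtype of `Vec<&'a str>`,
// so the vector can be returned under the shorter lifetime.
fn shrink_vec<'a>(v: Vec<&'static str>) -> Vec<&'a str> {
    v
}

// Invariant: the analogous conversion for `Cell` does not compile,
// because no subtyping relationship exists between the two `Cell` types.
// fn shrink_cell<'a>(c: Cell<&'static str>) -> Cell<&'a str> { c }

// With an invariant constructor, the argument has to match exactly.
fn keep_cell(c: Cell<&'static str>) -> Cell<&'static str> {
    c
}
```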

It is very useful to keep in mind that covariance is, in practical terms, "the"
variance. Almost all consideration of variance is in terms of whether something
should be covariant or invariant. Actually witnessing contravariance is quite difficult
in Rust, though it does in fact exist.

Here is a table of important variances which the rest of this section will be devoted
to trying to explain:

|   |                 |     'a    |         T         |     U     |
|---|-----------------|:---------:|:-----------------:|:---------:|
| * | `&'a T `        | covariant | covariant         |           |
| * | `&'a mut T`     | covariant | invariant         |           |
| * | `Box<T>`        |           | covariant         |           |
|   | `Vec<T>`        |           | covariant         |           |
| * | `UnsafeCell<T>` |           | invariant         |           |
|   | `Cell<T>`       |           | invariant         |           |
| * | `fn(T) -> U`    |           | **contra**variant | covariant |
|   | `*const T`      |           | covariant         |           |
|   | `*mut T`        |           | invariant         |           |

The types with \*'s are the ones we will be focusing on, as they are in
some sense "fundamental". All the others can be understood by analogy to them:

* `Vec<T>` and all other owning pointers and collections follow the same logic as `Box<T>`
* `Cell<T>` and all other interior mutability types follow the same logic as `UnsafeCell<T>`
* `*const T` follows the logic of `&T`
* `*mut T` follows the logic of `&mut T` (or `UnsafeCell<T>`)

For more types, see the ["Variance" section][variance-table] in the reference.

[variance-table]: ../reference/subtyping.html#variance

> NOTE: the *only* source of contravariance in the language is the arguments to
> a function, which is why it really doesn't come up much in practice. Invoking
> contravariance involves higher-order programming with function pointers that
> take references with specific lifetimes (as opposed to the usual "any lifetime",
> which gets into higher rank lifetimes, which work independently of subtyping).

Ok, that's enough type theory! Let's try to apply the concept of variance to Rust
and look at some examples.

First off, let's revisit the meowing dog example:

<!-- ignore: simplified code -->
```rust,ignore
fn evil_feeder(pet: &mut Animal) {
    let spike: Dog = ...;

    // `pet` is an Animal, and Dog is a subtype of Animal,
    // so this should be fine, right..?
    *pet = spike;
}

fn main() {
    let mut mr_snuggles: Cat = ...;
    evil_feeder(&mut mr_snuggles);  // Replaces mr_snuggles with a Dog
    mr_snuggles.meow();             // OH NO, MEOWING DOG!
}
```

If we look at our table of variances, we see that `&mut T` is *invariant* over `T`.
As it turns out, this completely fixes the issue! With invariance, the fact that
Cat is a subtype of Animal doesn't matter; `&mut Cat` still won't be a subtype of
`&mut Animal`. The static type checker will then correctly stop us from passing
a Cat into `evil_feeder`.

The soundness of subtyping is based on the idea that it's ok to forget unnecessary
details. But with references, there's always someone that remembers those details:
the value being referenced. That value expects those details to keep being true,
and may behave incorrectly if its expectations are violated.

The problem with making `&mut T` covariant over `T` is that it gives us the power
to modify the original value *when we don't remember all of its constraints*.
And so, we can make someone have a Dog when they're certain they still have a Cat.

With that established, we can easily see why `&T` being covariant over `T` *is*
sound: it doesn't let you modify the value, only look at it. Without any way to
mutate, there's no way for us to mess with any details. We can also see why
`UnsafeCell` and all the other interior mutability types must be invariant: they
make `&T` work like `&mut T`!

Now what about the lifetime on references? Why is it ok for both kinds of references
to be covariant over their lifetimes? Well, here's a two-pronged argument:

First and foremost, subtyping references based on their lifetimes is *the entire point
of subtyping in Rust*. The only reason we have subtyping is so we can pass
long-lived things where short-lived things are expected. So it better work!

Second, and more seriously, lifetimes are only a part of the reference itself. The
type of the referent is shared knowledge, which is why adjusting that type in only
one place (the reference) can lead to issues. But if you shrink down a reference's
lifetime when you hand it to someone, that lifetime information isn't shared in
any way. There are now two independent references with independent lifetimes.
There's no way to mess with the original reference's lifetime using the other one.

Or rather, the only way to mess with someone's lifetime is to build a meowing dog.
But as soon as you try to build a meowing dog, the lifetime should be wrapped up
in an invariant type, preventing the lifetime from being shrunk. To understand this
better, let's port the meowing dog problem over to real Rust.

In the meowing dog problem we take a subtype (Cat), convert it into a supertype
(Animal), and then use that fact to overwrite the subtype with a value that satisfies
the constraints of the supertype but not the subtype (Dog).

So with lifetimes, we want to take a long-lived thing, convert it into a
short-lived thing, and then use that to write something that doesn't live long
enough into the place expecting something long-lived.

Here it is:

```rust,compile_fail
fn evil_feeder<T>(input: &mut T, val: T) {
    *input = val;
}

fn main() {
    let mut mr_snuggles: &'static str = "meow! :3";  // mr. snuggles forever!!
    {
        let spike = String::from("bark! >:V");
        let spike_str: &str = &spike;                // Only lives for the block
        evil_feeder(&mut mr_snuggles, spike_str);    // EVIL!
    }
    println!("{}", mr_snuggles);                     // Use after free?
}
```

And what do we get when we try to compile this?

```text
error[E0597]: `spike` does not live long enough
  --> src/main.rs:9:31
   |
6  |     let mut mr_snuggles: &'static str = "meow! :3";  // mr. snuggles forever!!
   |                          ------------ type annotation requires that `spike` is borrowed for `'static`
...
9  |         let spike_str: &str = &spike;                // Only lives for the block
   |                               ^^^^^^ borrowed value does not live long enough
10 |         evil_feeder(&mut mr_snuggles, spike_str);    // EVIL!
11 |     }
   |     - `spike` dropped here while still borrowed
```

Good, it doesn't compile! Let's break down what's happening here in detail.

First let's look at the new `evil_feeder` function:

```rust
fn evil_feeder<T>(input: &mut T, val: T) {
    *input = val;
}
```

All it does is take a mutable reference and a value and overwrite the referent with it.
What's important about this function is that it creates a type equality constraint. It
clearly says in its signature the referent and the value must be the *exact same* type.

Meanwhile, in the caller we pass in `&mut &'static str` and `&'spike_str str`.

Because `&mut T` is invariant over `T`, the compiler concludes it can't apply any subtyping
to the first argument, and so `T` must be exactly `&'static str`.

The other argument is only an `&'a str`, which *is* covariant over `'a`. So the compiler
adopts a constraint: `&'spike_str str` must be a subtype of `&'static str` (inclusive),
which in turn implies `'spike_str` must be a subtype of `'static` (inclusive). Which is to say,
`'spike_str` must contain `'static`. But only one thing contains `'static` -- `'static` itself!

This is why we get an error when we try to assign `&spike` to `spike_str`. The
compiler has worked backwards to conclude `spike_str` must live forever, and `&spike`
simply can't live that long.

So even though references are covariant over their lifetimes, they "inherit" invariance
whenever they're put into a context that could do something bad with that. In this case,
we inherited invariance as soon as we put our reference inside an `&mut T`.
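
For contrast, here is a sketch of a variant that does compile. It assumes we drop the `'static` annotation, so the lifetime is shortened by covariance *before* the reference ends up behind `&mut`. The names `overwrite` and `feed_without_static` are invented for this illustration; `overwrite` has the same shape as `evil_feeder`.

```rust
// Same shape as `evil_feeder`: referent and value must have the exact same type.
fn overwrite<T>(input: &mut T, val: T) {
    *input = val;
}

// Hypothetical demo function; returns the final contents of `pet`.
fn feed_without_static() -> String {
    let spike = String::from("bark! >:V");

    // No `'static` annotation here: covariance lets the literal's
    // `&'static str` be shortened to some `&'a str` at this point.
    let mut pet: &str = "meow! :3";

    // `T` is `&'a str` for an `'a` that `spike` outlives, so both
    // arguments agree on `T` exactly and nothing can dangle.
    overwrite(&mut pet, spike.as_str());

    pet.to_string()
}
```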

As it turns out, the argument for why it's ok for Box (and Vec, HashMap, etc.) to
be covariant is pretty similar to the argument for why it's ok for
lifetimes to be covariant: as soon as you try to stuff them in something like a
mutable reference, they inherit invariance and you're prevented from doing anything
bad.

However Box makes it easier to focus on the by-value aspect of references that we
partially glossed over.

Unlike a lot of languages which allow values to be freely aliased at all times,
Rust has a very strict rule: if you're allowed to mutate or move a value, you
are guaranteed to be the only one with access to it.

Consider the following code:

<!-- ignore: simplified code -->
```rust,ignore
let mr_snuggles: Box<Cat> = ..;
let spike: Box<Dog> = ..;

let mut pet: Box<Animal>;
pet = mr_snuggles;
pet = spike;
```

There is no problem at all with the fact that we have forgotten that `mr_snuggles` was a Cat,
or that we overwrote him with a Dog, because as soon as we moved mr_snuggles to a variable
that only knew he was an Animal, **we destroyed the only thing in the universe that
remembered he was a Cat**!

In contrast to the argument about immutable references being soundly covariant because they
don't let you change anything, owned values can be covariant because they make you
change *everything*. There is no connection between old locations and new locations.
Applying by-value subtyping is an irreversible act of knowledge destruction, and
without any memory of how things used to be, no one can be tricked into acting on
that old information!
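
The same argument can be sketched in real Rust by letting lifetimes play the role of Cat and Animal. In the illustration below (the function name `rehome` is made up), a `Box<&'static str>` is moved into a variable that only knows about a shorter lifetime, and is then overwritten with a shorter-lived reference.

```rust
// Hypothetical demo function; returns what `pet` held before and after
// the overwrite.
fn rehome() -> (String, String) {
    let spike = String::from("bark");
    let mr_snuggles: Box<&'static str> = Box::new("meow");

    let mut pet: Box<&str>;

    // Moving the box "forgets" `'static`: `Box<&'static str>` is a subtype
    // of `Box<&'a str>`, and `mr_snuggles` cannot be used after the move.
    pet = mr_snuggles;
    let before = String::from(*pet);

    // Overwriting with a shorter-lived reference is fine: nobody is left
    // who still believes the box holds a `&'static str`.
    pet = Box::new(spike.as_str());
    let after = String::from(*pet);

    (before, after)
}
```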

Only one thing left to explain: function pointers.

To see why `fn(T) -> U` should be covariant over `U`, consider the following signature:

<!-- ignore: simplified code -->
```rust,ignore
fn get_animal() -> Animal;
```

This function claims to produce an Animal. As such, it is perfectly valid to
provide a function with the following signature instead:

<!-- ignore: simplified code -->
```rust,ignore
fn get_animal() -> Cat;
```

After all, Cats are Animals, so always producing a Cat is a perfectly valid way
to produce Animals. Or to relate it back to real Rust: if we need a function
that is supposed to produce something that lives for `'short`, it's perfectly
fine for it to produce something that lives for `'long`. We don't care, we can
just forget that fact.

However, the same logic does not apply to *arguments*. Consider trying to satisfy:

<!-- ignore: simplified code -->
```rust,ignore
fn handle_animal(Animal);
```

with:

<!-- ignore: simplified code -->
```rust,ignore
fn handle_animal(Cat);
```

The first function can accept Dogs, but the second function absolutely can't.
Covariance doesn't work here. But if we flip it around, it actually *does*
work! If we need a function that can handle Cats, a function that can handle *any*
Animal will surely work fine. Or to relate it back to real Rust: if we need a
function that can handle anything that lives for at least `'long`, it's perfectly
fine for it to be able to handle anything that lives for at least `'short`.

And that's why function types, unlike anything else in the language, are
**contra**variant over their arguments.
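
Both directions can be witnessed in real Rust with function pointers that mention a specific lifetime. The following is only a sketch; all four function names are invented for illustration. Each conversion function simply returns its argument, so whether it compiles depends only on the subtyping relationship between the two function pointer types.

```rust
// Covariant in the return type: a function returning `&'static str`
// can stand in for one that returns `&'short str`.
fn shorten_return<'short>(f: fn() -> &'static str) -> fn() -> &'short str {
    f
}

// Contravariant in the argument: a function accepting any `&'short str`
// can stand in for one that only ever receives `&'static str`.
fn narrow_argument<'short>(f: fn(&'short str) -> usize) -> fn(&'static str) -> usize {
    f
}

// Two ordinary functions to pass through the conversions above.
fn pet_name() -> &'static str {
    "mr_snuggles"
}

fn count_bytes(s: &str) -> usize {
    s.len()
}
```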

Now, this is all well and good for the types the standard library provides, but
how is variance determined for types that *you* define? A struct, informally
speaking, inherits the variance of its fields. If a struct `MyType`
has a generic argument `A` that is used in a field `a`, then MyType's variance
over `A` is exactly `a`'s variance over `A`.

However if `A` is used in multiple fields:

* If all uses of `A` are covariant, then MyType is covariant over `A`
* If all uses of `A` are contravariant, then MyType is contravariant over `A`
* Otherwise, MyType is invariant over `A`

```rust
use std::cell::Cell;

struct MyType<'a, 'b, A: 'a, B: 'b, C, D, E, F, G, H, In, Out, Mixed> {
    a: &'a A,     // covariant over 'a and A
    b: &'b mut B, // covariant over 'b and invariant over B

    c: *const C,  // covariant over C
    d: *mut D,    // invariant over D

    e: E,         // covariant over E
    f: Vec<F>,    // covariant over F
    g: Cell<G>,   // invariant over G

    h1: H,        // would also be covariant over H except...
    h2: Cell<H>,  // invariant over H, because invariance wins all conflicts

    i: fn(In) -> Out,       // contravariant over In, covariant over Out

    k1: fn(Mixed) -> usize, // would be contravariant over Mixed except..
    k2: Mixed,              // invariant over Mixed, because invariance wins all conflicts
}
```
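
One way to check such an inherited variance is to ask the compiler directly. In the sketch below, `Tag` and `Slot` are hypothetical structs invented for this illustration: shrinking the lifetime of the covariant one compiles, while the same function for the invariant one is rejected and is therefore left commented out.

```rust
use std::cell::Cell;

// Hypothetical struct: covariant over 'a, inherited from `&'a str`.
struct Tag<'a> {
    name: &'a str,
}

// Hypothetical struct: invariant over 'a, inherited from `Cell<&'a str>`.
struct Slot<'a> {
    current: Cell<&'a str>,
}

// Accepted: `Tag<'static>` is a subtype of `Tag<'a>`.
fn shrink_tag<'a>(tag: Tag<'static>) -> Tag<'a> {
    tag
}

// Rejected by the compiler, because `Slot` is invariant over its lifetime:
// fn shrink_slot<'a>(slot: Slot<'static>) -> Slot<'a> { slot }
```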