% Trait Objects

When code involves polymorphism, there needs to be a mechanism to determine
which specific version is actually run. This is called ‘dispatch’. There are
two major forms of dispatch: static dispatch and dynamic dispatch. While Rust
favors static dispatch, it also supports dynamic dispatch through a mechanism
called ‘trait objects’.

## Background

For the rest of this chapter, we’ll need a trait and some implementations.
Let’s make a simple one, `Foo`. It has one method that is expected to return a
`String`.

```rust
trait Foo {
    fn method(&self) -> String;
}
```

We’ll also implement this trait for `u8` and `String`:

```rust
# trait Foo { fn method(&self) -> String; }
impl Foo for u8 {
    fn method(&self) -> String { format!("u8: {}", *self) }
}

impl Foo for String {
    fn method(&self) -> String { format!("string: {}", *self) }
}
```

## Static dispatch

We can use this trait to perform static dispatch with trait bounds:

```rust
# trait Foo { fn method(&self) -> String; }
# impl Foo for u8 { fn method(&self) -> String { format!("u8: {}", *self) } }
# impl Foo for String { fn method(&self) -> String { format!("string: {}", *self) } }
fn do_something<T: Foo>(x: T) {
    x.method();
}

fn main() {
    let x = 5u8;
    let y = "Hello".to_string();

    do_something(x);
    do_something(y);
}
```

Rust uses ‘monomorphization’ to perform static dispatch here. This means that
Rust will create a special version of `do_something()` for both `u8` and
`String`, and then replace the call sites with calls to these specialized
functions. In other words, Rust generates something like this:

```rust
# trait Foo { fn method(&self) -> String; }
# impl Foo for u8 { fn method(&self) -> String { format!("u8: {}", *self) } }
# impl Foo for String { fn method(&self) -> String { format!("string: {}", *self) } }
fn do_something_u8(x: u8) {
    x.method();
}

fn do_something_string(x: String) {
    x.method();
}

fn main() {
    let x = 5u8;
    let y = "Hello".to_string();

    do_something_u8(x);
    do_something_string(y);
}
```

This has a great upside: static dispatch allows function calls to be
inlined because the callee is known at compile time, and inlining is
the key to good optimization. Static dispatch is fast, but it comes with
a tradeoff: ‘code bloat’, due to many copies of the same function
existing in the binary, one for each type.

Furthermore, compilers aren’t perfect and may “optimize” code to become slower.
For example, functions inlined too eagerly will bloat the instruction cache
(cache rules everything around us). This is part of the reason that `#[inline]`
and `#[inline(always)]` should be used carefully, and one reason why using
dynamic dispatch is sometimes more efficient.

However, the common case is that it is more efficient to use static dispatch,
and one can always have a thin statically-dispatched wrapper function that does
a dynamic dispatch, but not vice versa, meaning static calls are more flexible.
The standard library tries to be statically dispatched where possible for this
reason.

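
The ‘thin wrapper’ idea can be sketched roughly as follows. The names
`describe` and `describe_dyn` are hypothetical, chosen only for this
illustration, and the sketch uses the `&dyn Foo` spelling that current Rust
editions require for the trait object type written `&Foo` in this chapter.

```rust
trait Foo {
    fn method(&self) -> String;
}

impl Foo for u8 {
    fn method(&self) -> String { format!("u8: {}", *self) }
}

impl Foo for String {
    fn method(&self) -> String { format!("string: {}", *self) }
}

// Hypothetical helper: the single, dynamically dispatched copy of the work.
fn describe_dyn(x: &dyn Foo) -> String {
    x.method()
}

// Hypothetical thin wrapper: monomorphized for each `T`, but every copy only
// coerces `&x` to a trait object and forwards the call.
fn describe<T: Foo>(x: T) -> String {
    describe_dyn(&x)
}

fn main() {
    println!("{}", describe(5u8));
    println!("{}", describe("Hello".to_string()));
}
```

Callers keep the convenient generic signature, while the body that could
cause code bloat exists only once. Going the other way is not possible: a
function that only receives a trait object no longer knows the concrete type
it would need in order to call a generic function.
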
## Dynamic dispatch

Rust provides dynamic dispatch through a feature called ‘trait objects’. Trait
objects, like `&Foo` or `Box<Foo>`, are normal values that store a value of
*any* type that implements the given trait, where the precise type can only be
known at runtime.

A trait object can be obtained from a pointer to a concrete type that
implements the trait by *casting* it (e.g. `&x as &Foo`) or *coercing* it
(e.g. using `&x` as an argument to a function that takes `&Foo`).

These trait object coercions and casts also work for pointers like `&mut T` to
`&mut Foo` and `Box<T>` to `Box<Foo>`, but that’s all at the moment. Coercions
and casts are identical.

This operation can be seen as ‘erasing’ the compiler’s knowledge about the
specific type of the pointer, and hence the use of trait objects is sometimes
referred to as ‘type erasure’.

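
The `&mut` and `Box` forms mentioned above might look like the following
sketch. `collect_descriptions` is a hypothetical helper used only to gather
the results, and the trait object types are written with the `dyn` keyword
that current Rust editions require (`Box<dyn Foo>` for `Box<Foo>`).

```rust
trait Foo {
    fn method(&self) -> String;
}

impl Foo for u8 {
    fn method(&self) -> String { format!("u8: {}", *self) }
}

impl Foo for String {
    fn method(&self) -> String { format!("string: {}", *self) }
}

// Hypothetical helper that gathers the output of each call for inspection.
fn collect_descriptions() -> Vec<String> {
    let mut out = Vec::new();

    // Coercion: `&mut u8` becomes `&mut dyn Foo` because of the annotation.
    let mut n = 7u8;
    let by_mut_ref: &mut dyn Foo = &mut n;
    out.push(by_mut_ref.method());

    // Cast: `Box<u8>` becomes `Box<dyn Foo>` with an explicit `as`.
    let cast = Box::new(5u8) as Box<dyn Foo>;
    out.push(cast.method());

    // Coercion: `Box<String>` becomes `Box<dyn Foo>` because of the annotation.
    let coerced: Box<dyn Foo> = Box::new("Hello".to_string());
    out.push(coerced.method());

    out
}

fn main() {
    for line in collect_descriptions() {
        println!("{}", line);
    }
}
```
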
Coming back to the example above, we can use the same trait to perform dynamic
dispatch with trait objects by casting:

```rust
# trait Foo { fn method(&self) -> String; }
# impl Foo for u8 { fn method(&self) -> String { format!("u8: {}", *self) } }
# impl Foo for String { fn method(&self) -> String { format!("string: {}", *self) } }

fn do_something(x: &Foo) {
    x.method();
}

fn main() {
    let x = 5u8;
    do_something(&x as &Foo);
}
```

or by coercing:

```rust
# trait Foo { fn method(&self) -> String; }
# impl Foo for u8 { fn method(&self) -> String { format!("u8: {}", *self) } }
# impl Foo for String { fn method(&self) -> String { format!("string: {}", *self) } }

fn do_something(x: &Foo) {
    x.method();
}

fn main() {
    let x = "Hello".to_string();
    do_something(&x);
}
```

A function that takes a trait object is not specialized to each of the types
that implements `Foo`: only one copy is generated, often (but not always)
resulting in less code bloat. However, this comes at the cost of requiring
slower virtual function calls, and effectively inhibiting any chance of
inlining and related optimizations from occurring.

### Why pointers?

Rust does not put things behind a pointer by default, unlike many managed
languages, so types can have different sizes. Knowing the size of the value at
compile time is important for things like passing it as an argument to a
function, moving it about on the stack and allocating (and deallocating) space
on the heap to store it.

For `Foo`, we would need to have a value that could be at least either a
`String` (24 bytes on a 64-bit platform) or a `u8` (1 byte), as well as any
other type for which dependent crates may implement `Foo` (any number of bytes
at all). There’s no way to guarantee that this last point can work if the
values are stored without a pointer, because those other types can be
arbitrarily large.

Putting the value behind a pointer means the size of the value is not relevant
when we are tossing a trait object around, only the size of the pointer itself.

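
The sizes involved can be inspected with `std::mem::size_of`. The sketch
below assumes the current `dyn` spelling of trait object types, and
`pointer_sizes` is a hypothetical helper introduced only for this
illustration.

```rust
use std::mem::size_of;

#[allow(dead_code)]
trait Foo {
    fn method(&self) -> String;
}

// Hypothetical helper returning the sizes of a plain reference, a borrowed
// trait object and an owning trait object, in that order.
fn pointer_sizes() -> (usize, usize, usize) {
    (
        size_of::<&u8>(),
        size_of::<&dyn Foo>(),
        size_of::<Box<dyn Foo>>(),
    )
}

fn main() {
    // The implementing types have very different sizes...
    println!("u8:           {}", size_of::<u8>());
    println!("String:       {}", size_of::<String>());

    // ...but every trait object pointer has the same size.
    let (thin, borrowed, owned) = pointer_sizes();
    println!("&u8:          {}", thin);
    println!("&dyn Foo:     {}", borrowed);
    println!("Box<dyn Foo>: {}", owned);
}
```

On a typical 64-bit target this prints 1, 24, 8, 16 and 16, although the exact
layout of `String` is an implementation detail rather than a documented
guarantee.
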
### Representation

The methods of the trait can be called on a trait object via a special record
of function pointers traditionally called a ‘vtable’ (created and managed by
the compiler).

Trait objects are both simple and complicated: their core representation and
layout are quite straightforward, but there are some curly error messages and
surprising behaviors to discover.

Let’s start simple, with the runtime representation of a trait object. The
`std::raw` module contains structs with layouts that are the same as the
complicated built-in types, [including trait objects][stdraw]:

```rust
# mod foo {
pub struct TraitObject {
    pub data: *mut (),
    pub vtable: *mut (),
}
# }
```

[stdraw]: ../std/raw/struct.TraitObject.html

That is, a trait object like `&Foo` consists of a ‘data’ pointer and a ‘vtable’
pointer.

The data pointer addresses the data (of some unknown type `T`) that the trait
object is storing, and the vtable pointer points to the vtable (‘virtual method
table’) corresponding to the implementation of `Foo` for `T`.

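
One way to observe the data pointer without relying on `std::raw` is to cast
the trait object to a thin raw pointer, which keeps the address and discards
the vtable half. In this sketch `thin_address` and `same_address` are
hypothetical helpers, and `&dyn Foo` is the current spelling of `&Foo`.

```rust
trait Foo {
    fn method(&self) -> String;
}

impl Foo for u8 {
    fn method(&self) -> String { format!("u8: {}", *self) }
}

// Hypothetical helper: keep only the address of a reference, dropping any
// extra pointer metadata such as a vtable pointer.
fn thin_address<T: ?Sized>(r: &T) -> *const () {
    r as *const T as *const ()
}

// Hypothetical check: the data half of the trait object is the address of
// the original value.
fn same_address() -> bool {
    let x = 5u8;
    let obj: &dyn Foo = &x;
    thin_address(obj) == thin_address(&x)
}

fn main() {
    let x = 5u8;
    let obj: &dyn Foo = &x;
    println!("{}", obj.method());
    println!("data pointer addresses the original value: {}", same_address());
}
```
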
A vtable is essentially a struct of function pointers, pointing to the concrete
piece of machine code for each method in the implementation. A method call like
`trait_object.method()` will retrieve the correct pointer out of the vtable and
then do a dynamic call of it. For example:

```rust,ignore
struct FooVtable {
    destructor: fn(*mut ()),
    size: usize,
    align: usize,
    method: fn(*const ()) -> String,
}

// u8:

fn call_method_on_u8(x: *const ()) -> String {
    // the compiler guarantees that this function is only called
    // with `x` pointing to a u8
    let byte: &u8 = unsafe { &*(x as *const u8) };

    byte.method()
}

static Foo_for_u8_vtable: FooVtable = FooVtable {
    destructor: /* compiler magic */,
    size: 1,
    align: 1,

    // cast to a function pointer
    method: call_method_on_u8 as fn(*const ()) -> String,
};


// String:

fn call_method_on_String(x: *const ()) -> String {
    // the compiler guarantees that this function is only called
    // with `x` pointing to a String
    let string: &String = unsafe { &*(x as *const String) };

    string.method()
}

static Foo_for_String_vtable: FooVtable = FooVtable {
    destructor: /* compiler magic */,
    // values for a 64-bit computer, halve them for 32-bit ones
    size: 24,
    align: 8,

    method: call_method_on_String as fn(*const ()) -> String,
};
```

The `destructor` field in each vtable points to a function that will clean up
any resources of the vtable’s type: for `u8` it is trivial, but for `String` it
will free the memory. This is necessary for owning trait objects like
`Box<Foo>`, which need to clean up both the `Box` allocation and the
internal type when they go out of scope. The `size` and `align` fields store
the size of the erased type, and its alignment requirements; these are
essentially unused at the moment since the information is embedded in the
destructor, but will be used in the future, as trait objects are progressively
made more flexible.

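
In current Rust, the size and alignment of the erased type can be queried
through `std::mem::size_of_val` and `std::mem::align_of_val`, which accept a
reference to a trait object. The sketch below assumes the `dyn` spelling of
trait object types, and `size_and_align` is a hypothetical helper.

```rust
use std::mem::{align_of_val, size_of_val};

trait Foo {
    fn method(&self) -> String;
}

impl Foo for u8 {
    fn method(&self) -> String { format!("u8: {}", *self) }
}

impl Foo for String {
    fn method(&self) -> String { format!("string: {}", *self) }
}

// Hypothetical helper: the concrete type is erased here, so the answer has to
// come from information stored alongside the trait object at runtime.
fn size_and_align(obj: &dyn Foo) -> (usize, usize) {
    (size_of_val(obj), align_of_val(obj))
}

fn main() {
    let x = 5u8;
    let s = "Hello".to_string();

    println!("{}: {:?}", x.method(), size_and_align(&x));
    println!("{}: {:?}", s.method(), size_and_align(&s));
}
```
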
Suppose we’ve got some values that implement `Foo`, then the explicit form of
construction and use of `Foo` trait objects might look a bit like (ignoring the
type mismatches: they’re all just pointers anyway):

```rust,ignore
let a: String = "foo".to_string();
let x: u8 = 1;

// let b: &Foo = &a;
let b = TraitObject {
    // store the data
    data: &a,
    // store the methods
    vtable: &Foo_for_String_vtable
};

// let y: &Foo = &x;
let y = TraitObject {
    // store the data
    data: &x,
    // store the methods
    vtable: &Foo_for_u8_vtable
};

// b.method();
(b.vtable.method)(b.data);

// y.method();
(y.vtable.method)(y.data);
```

If `b` or `y` were owning trait objects (`Box<Foo>`), there would be a
`(b.vtable.destructor)(b.data)` (respectively `y`) call when they went out of
scope.
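
That destructor call can be observed from safe code. The sketch below uses a
hypothetical `Noisy` type whose `Drop` implementation sets a shared flag, and
writes the owning trait object as `Box<dyn Foo>`, the current spelling of
`Box<Foo>`.

```rust
use std::cell::Cell;
use std::rc::Rc;

trait Foo {
    fn method(&self) -> String;
}

// Hypothetical type whose destructor records that it ran.
struct Noisy {
    dropped: Rc<Cell<bool>>,
}

impl Foo for Noisy {
    fn method(&self) -> String { "noisy".to_string() }
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

// Returns the flag as seen just before and just after the box leaves scope.
fn drop_runs_through_trait_object() -> (bool, bool) {
    let dropped = Rc::new(Cell::new(false));
    let before;
    {
        // The concrete type is erased as soon as the box is coerced.
        let b: Box<dyn Foo> = Box::new(Noisy { dropped: dropped.clone() });
        b.method();
        before = dropped.get();
    } // `b` goes out of scope here, so the erased type's destructor runs.
    (before, dropped.get())
}

fn main() {
    println!("{:?}", drop_runs_through_trait_object());
}
```

Nothing at the point where `b` goes out of scope names `Noisy`, so the call
to its destructor has to be found through the trait object itself.
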