src/doc/book/trait-objects.md

   1 % Trait Objects
   2
   3 When code involves polymorphism, there needs to be a mechanism to determine
   4 which specific version is actually run. This is called ‘dispatch’. There are
   5 two major forms of dispatch: static dispatch and dynamic dispatch. While Rust
   6 favors static dispatch, it also supports dynamic dispatch through a mechanism
   7 called ‘trait objects’.
   8
   9 ## Background
  10
  11 For the rest of this chapter, we’ll need a trait and some implementations.
  12 Let’s make a simple one, `Foo`. It has one method that is expected to return a
  13 `String`.
  14
  15 ```rust
  16 trait Foo {
  17     fn method(&self) -> String;
  18 }
  19 ```
  20
  21 We’ll also implement this trait for `u8` and `String`:
  22
  23 ```rust
  24 # trait Foo { fn method(&self) -> String; }
  25 impl Foo for u8 {
  26     fn method(&self) -> String { format!("u8: {}", *self) }
  27 }
  28
  29 impl Foo for String {
  30     fn method(&self) -> String { format!("string: {}", *self) }
  31 }
  32 ```
  33
  34
  35 ## Static dispatch
  36
  37 We can use this trait to perform static dispatch with trait bounds:
  38
  39 ```rust
  40 # trait Foo { fn method(&self) -> String; }
  41 # impl Foo for u8 { fn method(&self) -> String { format!("u8: {}", *self) } }
  42 # impl Foo for String { fn method(&self) -> String { format!("string: {}", *self) } }
  43 fn do_something<T: Foo>(x: T) {
  44     x.method();
  45 }
  46
  47 fn main() {
  48     let x = 5u8;
  49     let y = "Hello".to_string();
  50
  51     do_something(x);
  52     do_something(y);
  53 }
  54 ```
  55
  56 Rust uses ‘monomorphization’ to perform static dispatch here. This means that
  57 Rust will create a special version of `do_something()` for both `u8` and
  58 `String`, and then replace the call sites with calls to these specialized
  59 functions. In other words, Rust generates something like this:
  60
  61 ```rust
  62 # trait Foo { fn method(&self) -> String; }
  63 # impl Foo for u8 { fn method(&self) -> String { format!("u8: {}", *self) } }
  64 # impl Foo for String { fn method(&self) -> String { format!("string: {}", *self) } }
  65 fn do_something_u8(x: u8) {
  66     x.method();
  67 }
  68
  69 fn do_something_string(x: String) {
  70     x.method();
  71 }
  72
  73 fn main() {
  74     let x = 5u8;
  75     let y = "Hello".to_string();
  76
  77     do_something_u8(x);
  78     do_something_string(y);
  79 }
  80 ```
  81
  82 This has a great upside: static dispatch allows function calls to be
  83 inlined because the callee is known at compile time, and inlining is
  84 the key to good optimization. Static dispatch is fast, but it comes with
  85 a tradeoff: ‘code bloat’, due to many copies of the same function
  86 existing in the binary, one for each type.
  87
  88 Furthermore, compilers aren’t perfect and may “optimize” code to become slower.
  89 For example, functions inlined too eagerly will bloat the instruction cache
  90 (cache rules everything around us). This is part of the reason that `#[inline]`
  91 and `#[inline(always)]` should be used carefully, and one reason why using
  92 dynamic dispatch is sometimes more efficient.
  93
  94 However, the common case is that it is more efficient to use static dispatch,
  95 and one can always have a thin statically-dispatched wrapper function that does
  96 a dynamic dispatch, but not vice versa, meaning static calls are more flexible.
  97 The standard library tries to be statically dispatched where possible for this
  98 reason.
  99
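The wrapper pattern mentioned above can be sketched as follows. This is a minimal illustration, not code from the standard library: the names `describe` and `describe_dyn` are hypothetical, and the functions return the `String` so the result can be inspected. Current Rust spells trait object types with the `dyn` keyword (`&dyn Foo` rather than `&Foo`), so the sketch uses that spelling.

```rust
trait Foo {
    fn method(&self) -> String;
}

impl Foo for u8 {
    fn method(&self) -> String { format!("u8: {}", *self) }
}

impl Foo for String {
    fn method(&self) -> String { format!("string: {}", *self) }
}

// Dynamically dispatched core (hypothetical name): compiled once and
// called through the vtable of whatever type is behind the reference.
fn describe_dyn(x: &dyn Foo) -> String {
    x.method()
}

// Thin statically dispatched wrapper (hypothetical name): one copy is
// generated per `T`, but each copy only coerces `&T` to `&dyn Foo` and
// forwards the call, so the duplicated code stays small.
fn describe<T: Foo>(x: &T) -> String {
    describe_dyn(x)
}
```

Callers keep the generic interface, for example `describe(&5u8)`, while the bulk of the work lives in a single non-generic function.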
 100 ## Dynamic dispatch
 101
 102 Rust provides dynamic dispatch through a feature called ‘trait objects’. Trait
 103 objects, like `&Foo` or `Box<Foo>`, are normal values that store a value of
 104 *any* type that implements the given trait, where the precise type can only be
 105 known at runtime.
 106
 107 A trait object can be obtained from a pointer to a concrete type that
 108 implements the trait by *casting* it (e.g. `&x as &Foo`) or *coercing* it
 109 (e.g. using `&x` as an argument to a function that takes `&Foo`).
 110
 111 These trait object coercions and casts also work for pointers like `&mut T` to
 112 `&mut Foo` and `Box<T>` to `Box<Foo>`, but that’s all at the moment. Coercions
 113 and casts produce identical trait objects.
 114
 115 This operation can be seen as ‘erasing’ the compiler’s knowledge about the
 116 specific type of the pointer, and hence the use of trait objects is sometimes
 117 referred to as ‘type erasure’.
 118
 119 Coming back to the example above, we can use the same trait to perform dynamic
 120 dispatch with trait objects by casting:
 121
 122 ```rust
 123 # trait Foo { fn method(&self) -> String; }
 124 # impl Foo for u8 { fn method(&self) -> String { format!("u8: {}", *self) } }
 125 # impl Foo for String { fn method(&self) -> String { format!("string: {}", *self) } }
 126 fn do_something(x: &Foo) {
 127     x.method();
 128 }
 129
 130 fn main() {
 131     let x = 5u8;
 132     do_something(&x as &Foo);
 133 }
 134 ```
 135
 136 or by coercing:
 137
 138 ```rust
 139 # trait Foo { fn method(&self) -> String; }
 140 # impl Foo for u8 { fn method(&self) -> String { format!("u8: {}", *self) } }
 141 # impl Foo for String { fn method(&self) -> String { format!("string: {}", *self) } }
 142 fn do_something(x: &Foo) {
 143     x.method();
 144 }
 145
 146 fn main() {
 147     let x = "Hello".to_string();
 148     do_something(&x);
 149 }
 150 ```
 151
 152 A function that takes a trait object is not specialized to each of the types
 153 that implement `Foo`: only one copy is generated, often (but not always)
 154 resulting in less code bloat. However, this comes at the cost of requiring
 155 slower virtual function calls, and effectively inhibiting any chance of
 156 inlining and related optimizations from occurring.
 157
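Because the concrete type is erased, values of different types can sit behind the same trait object type. The sketch below assumes the `Foo` trait and impls from the Background section and uses the current `dyn` spelling (`Box<dyn Foo>`); the helper names `collect_descriptions` and `demo` are hypothetical.

```rust
trait Foo {
    fn method(&self) -> String;
}

impl Foo for u8 {
    fn method(&self) -> String { format!("u8: {}", *self) }
}

impl Foo for String {
    fn method(&self) -> String { format!("string: {}", *self) }
}

// A single non-generic function handles every implementor of `Foo`;
// each call to `method` is resolved at runtime.
fn collect_descriptions(items: &[Box<dyn Foo>]) -> Vec<String> {
    items.iter().map(|item| item.method()).collect()
}

fn demo() -> Vec<String> {
    // `Box<u8>` and `Box<String>` are each coerced to `Box<dyn Foo>`,
    // so both fit in the same vector.
    let items: Vec<Box<dyn Foo>> = vec![
        Box::new(5u8),
        Box::new("Hello".to_string()),
    ];
    collect_descriptions(&items)
}
```

A generic `Vec<T>` with `T: Foo` could not hold both a `u8` and a `String`, since `T` has to be one type per vector.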
 158 ### Why pointers?
 159
 160 Rust does not put things behind a pointer by default, unlike many managed
 161 languages, so types can have different sizes. Knowing the size of the value at
 162 compile time is important for things like passing it as an argument to a
 163 function, moving it about on the stack and allocating (and deallocating) space
 164 on the heap to store it.
 165
 166 For `Foo`, we would need to have a value that could be at least either a
 167 `String` (24 bytes on a 64-bit target) or a `u8` (1 byte), as well as any other
 168 type for which dependent crates may implement `Foo` (any number of bytes at
 169 all). There’s no way to guarantee that this last point can work if the values
 170 are stored without a pointer, because those other types can be arbitrarily large.
 171
 172 Putting the value behind a pointer means the size of the value is not relevant
 173 when we are tossing a trait object around, only the size of the pointer itself.
 174
 175 ### Representation
 176
 177 The methods of the trait can be called on a trait object via a special record
 178 of function pointers traditionally called a ‘vtable’ (created and managed by
 179 the compiler).
 180
 181 Trait objects are both simple and complicated: their core representation and
 182 layout are quite straightforward, but there are some tricky error messages and
 183 surprising behaviors to discover.
 184
 185 Let’s start simple, with the runtime representation of a trait object. The
 186 `std::raw` module contains structs with layouts that are the same as the
 187 complicated built-in types, [including trait objects][stdraw]:
 188
 189 ```rust
 190 # mod foo {
 191 pub struct TraitObject {
 192     pub data: *mut (),
 193     pub vtable: *mut (),
 194 }
 195 # }
 196 ```
 197
 198 [stdraw]: ../std/raw/struct.TraitObject.html
 199
 200 That is, a trait object like `&Foo` consists of a ‘data’ pointer and a ‘vtable’
 201 pointer.
 202
 203 The data pointer addresses the data (of some unknown type `T`) that the trait
 204 object is storing, and the vtable pointer points to the vtable (‘virtual method
 205 table’) corresponding to the implementation of `Foo` for `T`.
 206
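The two-pointer layout can be observed indirectly by comparing pointer sizes. This is a small check rather than a guarantee of the language: it reflects how current compilers lay out trait objects, uses the current `dyn` spelling, and the helper name `pointer_sizes` is hypothetical.

```rust
use std::mem::size_of;

trait Foo {
    fn method(&self) -> String;
}

// Returns the sizes of a thin reference, a borrowed trait object,
// and an owning trait object, in that order.
fn pointer_sizes() -> (usize, usize, usize) {
    (
        size_of::<&u8>(),          // thin pointer: one machine word
        size_of::<&dyn Foo>(),     // data pointer + vtable pointer
        size_of::<Box<dyn Foo>>(), // owning trait object, same shape
    )
}
```

On current compilers the second and third values are twice the first, whatever type ends up behind the pointer.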
 207
 208 A vtable is essentially a struct of function pointers, pointing to the concrete
 209 piece of machine code for each method in the implementation. A method call like
 210 `trait_object.method()` will retrieve the correct pointer out of the vtable and
 211 then do a dynamic call of it. For example:
 212
 213 ```rust,ignore
 214 struct FooVtable {
 215     destructor: fn(*mut ()),
 216     size: usize,
 217     align: usize,
 218     method: fn(*const ()) -> String,
 219 }
 220
 221 // u8:
 222
 223 fn call_method_on_u8(x: *const ()) -> String {
 224     // the compiler guarantees that this function is only called
 225     // with `x` pointing to a u8
 226     let byte: &u8 = unsafe { &*(x as *const u8) };
 227
 228     byte.method()
 229 }
 230
 231 static Foo_for_u8_vtable: FooVtable = FooVtable {
 232     destructor: /* compiler magic */,
 233     size: 1,
 234     align: 1,
 235
 236     // cast to a function pointer
 237     method: call_method_on_u8 as fn(*const ()) -> String,
 238 };
 239
 240
 241 // String:
 242
 243 fn call_method_on_String(x: *const ()) -> String {
 244     // the compiler guarantees that this function is only called
 245     // with `x` pointing to a String
 246     let string: &String = unsafe { &*(x as *const String) };
 247
 248     string.method()
 249 }
 250
 251 static Foo_for_String_vtable: FooVtable = FooVtable {
 252     destructor: /* compiler magic */,
 253     // values for a 64-bit computer, halve them for 32-bit ones
 254     size: 24,
 255     align: 8,
 256
 257     method: call_method_on_String as fn(*const ()) -> String,
 258 };
 259 ```
 260
 261 The `destructor` field in each vtable points to a function that will clean up
 262 any resources of the vtable’s type: for `u8` it is trivial, but for `String` it
 263 will free the memory. This is necessary for owning trait objects like
 264 `Box<Foo>`, which need to clean up both the `Box` allocation and the
 265 internal type when they go out of scope. The `size` and `align` fields store
 266 the size of the erased type, and its alignment requirements; these are
 267 essentially unused at the moment since the information is embedded in the
 268 destructor, but will be used in the future, as trait objects are progressively
 269 made more flexible.
 270
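In current Rust, the size and alignment of an erased type can be queried at runtime with `std::mem::size_of_val` and `std::mem::align_of_val`, both of which accept trait objects. The sketch below assumes the `Foo` impls from the Background section; the helper name `erased_layout` is hypothetical, and the idea that these values come from the vtable entries described above is how current compilers are understood to work, not something this chapter states.

```rust
use std::mem::{align_of_val, size_of_val};

trait Foo {
    fn method(&self) -> String;
}

impl Foo for u8 {
    fn method(&self) -> String { format!("u8: {}", *self) }
}

impl Foo for String {
    fn method(&self) -> String { format!("string: {}", *self) }
}

// The static type here is only `dyn Foo`, so the size and alignment
// must be looked up at runtime for whatever value sits behind `x`.
fn erased_layout(x: &dyn Foo) -> (usize, usize) {
    (size_of_val(x), align_of_val(x))
}
```

For a `u8` this reports a size and alignment of 1; for a `String` it reports whatever `size_of::<String>()` and `align_of::<String>()` are on the target.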
 271 Suppose we’ve got some values that implement `Foo`. The explicit form of
 272 construction and use of `Foo` trait objects might look a bit like (ignoring the
 273 type mismatches: they’re all pointers anyway):
 274
 275 ```rust,ignore
 276 let a: String = "foo".to_string();
 277 let x: u8 = 1;
 278
 279 // let b: &Foo = &a;
 280 let b = TraitObject {
 281     // store the data
 282     data: &a,
 283     // store the methods
 284     vtable: &Foo_for_String_vtable
 285 };
 286
 287 // let y: &Foo = &x;
 288 let y = TraitObject {
 289     // store the data
 290     data: &x,
 291     // store the methods
 292     vtable: &Foo_for_u8_vtable
 293 };
 294
 295 // b.method();
 296 (b.vtable.method)(b.data);
 297
 298 // y.method();
 299 (y.vtable.method)(y.data);
 300 ```
 301
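The pseudocode above glosses over type mismatches. A version that actually compiles might look like the following sketch. It is a hand-written imitation, not what the compiler generates: `HandRolledFoo`, its constructors, and the one-entry `FooVtable` are hypothetical, and the destructor, size, and align entries are left out for brevity.

```rust
use std::marker::PhantomData;

trait Foo {
    fn method(&self) -> String;
}

impl Foo for u8 {
    fn method(&self) -> String { format!("u8: {}", *self) }
}

impl Foo for String {
    fn method(&self) -> String { format!("string: {}", *self) }
}

// Simplified, hypothetical vtable: only the `method` entry is modeled.
struct FooVtable {
    method: fn(*const ()) -> String,
}

fn call_method_on_u8(x: *const ()) -> String {
    // Sound only because `HandRolledFoo::from_u8` is the one place that
    // pairs this function with a data pointer, and that pointer is a `&u8`.
    let byte: &u8 = unsafe { &*(x as *const u8) };
    byte.method()
}

fn call_method_on_string(x: *const ()) -> String {
    // Likewise, only ever paired with a pointer to a `String`.
    let string: &String = unsafe { &*(x as *const String) };
    string.method()
}

static FOO_FOR_U8_VTABLE: FooVtable = FooVtable { method: call_method_on_u8 };
static FOO_FOR_STRING_VTABLE: FooVtable = FooVtable { method: call_method_on_string };

// Hypothetical stand-in for `&Foo`: a data pointer plus a vtable pointer.
// The marker ties the raw pointer to the lifetime of the borrowed value.
struct HandRolledFoo<'a> {
    data: *const (),
    vtable: &'static FooVtable,
    _borrow: PhantomData<&'a ()>,
}

impl<'a> HandRolledFoo<'a> {
    fn from_u8(x: &'a u8) -> Self {
        HandRolledFoo {
            data: x as *const u8 as *const (),
            vtable: &FOO_FOR_U8_VTABLE,
            _borrow: PhantomData,
        }
    }

    fn from_string(x: &'a String) -> Self {
        HandRolledFoo {
            data: x as *const String as *const (),
            vtable: &FOO_FOR_STRING_VTABLE,
            _borrow: PhantomData,
        }
    }

    // Mirrors `(b.vtable.method)(b.data)` from the pseudocode.
    fn method(&self) -> String {
        (self.vtable.method)(self.data)
    }
}
```

Each constructor picks the vtable that matches the concrete type, which is the step the compiler performs when it coerces a reference into a trait object.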
 302 ## Object Safety
 303
 304 Not every trait can be used to make a trait object. For example, vectors implement
 305 `Clone`, but if we try to make a trait object:
 306
 307 ```rust,ignore
 308 let v = vec![1, 2, 3];
 309 let o = &v as &Clone;
 310 ```
 311
 312 We get an error:
 313
 314 ```text
 315 error: cannot convert to a trait object because trait `core::clone::Clone` is not object-safe [E0038]
 316 let o = &v as &Clone;
 317         ^~
 318 note: the trait cannot require that `Self : Sized`
 319 let o = &v as &Clone;
 320         ^~
 321 ```
 322
 323 The error says that `Clone` is not ‘object-safe’. Only traits that are
 324 object-safe can be made into trait objects. A trait is object-safe if both of
 325 these are true:
 326
 327 * the trait does not require that `Self: Sized`
 328 * all of its methods are object-safe
 329
 330 So what makes a method object-safe? Each method must require that `Self: Sized`
 331 or all of the following:
 332
 333 * must not have any type parameters
 334 * must not use `Self`
 335
 336 Whew! As we can see, almost all of these rules talk about `Self`. A good intuition
 337 is “except in special circumstances, if your trait’s method uses `Self`, it is not
 338 object-safe.”
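
The “require that `Self: Sized`” escape hatch for methods can be sketched as follows. The trait `Shape`, the type `Circle`, and the function `name_of` are hypothetical names chosen for illustration, and the trait object type uses the current `dyn` spelling.

```rust
trait Shape {
    fn name(&self) -> String;

    // This method returns `Self`, which would normally make the trait
    // unusable as a trait object. The `Self: Sized` bound excludes the
    // method from trait objects, so the trait as a whole stays object-safe.
    fn duplicate(&self) -> Self
    where
        Self: Sized;
}

struct Circle;

impl Shape for Circle {
    fn name(&self) -> String { "circle".to_string() }
    fn duplicate(&self) -> Self { Circle }
}

// Accepting `&dyn Shape` compiles only because `Shape` is object-safe.
fn name_of(shape: &dyn Shape) -> String {
    shape.name()
}
```

`duplicate` remains callable on concrete types such as `Circle`, but not through a `&dyn Shape`.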