## Generic Data Types

We can use generics to create definitions for items like function signatures or
structs, which we can then use with many different concrete data types. Let’s
first look at how to define functions, structs, enums, and methods using
generics. Then we’ll discuss how generics affect code performance.

### In Function Definitions

When defining a function that uses generics, we place the generics in the
signature of the function where we would usually specify the data types of the
parameters and return value. Doing so makes our code more flexible and provides
more functionality to callers of our function while preventing code duplication.

Continuing with our `largest` function, Listing 10-4 shows two functions that
both find the largest value in a slice.

<span class="filename">Filename: src/main.rs</span>

```rust
fn largest_i32(list: &[i32]) -> i32 {
    let mut largest = list[0];

    for &item in list.iter() {
        if item > largest {
            largest = item;
        }
    }

    largest
}

fn largest_char(list: &[char]) -> char {
    let mut largest = list[0];

    for &item in list.iter() {
        if item > largest {
            largest = item;
        }
    }

    largest
}

fn main() {
    let number_list = vec![34, 50, 25, 100, 65];

    let result = largest_i32(&number_list);
    println!("The largest number is {}", result);
#    assert_eq!(result, 100);

    let char_list = vec!['y', 'm', 'a', 'q'];

    let result = largest_char(&char_list);
    println!("The largest char is {}", result);
#    assert_eq!(result, 'y');
}
```

<span class="caption">Listing 10-4: Two functions that differ only in their
names and the types in their signatures</span>

The `largest_i32` function is the one we extracted in Listing 10-3 that finds
the largest `i32` in a slice. The `largest_char` function finds the largest
`char` in a slice. The function bodies have the same code, so let’s eliminate
the duplication by introducing a generic type parameter in a single function.

To parameterize the types in the new function we’ll define, we need to name the
type parameter, just as we do for the value parameters to a function. You can
use any identifier as a type parameter name. But we’ll use `T` because, by
convention, type parameter names in Rust are short, often just a letter, and
Rust’s type-naming convention is CamelCase. Short for “type,” `T` is the
default choice of most Rust programmers.

When we use a parameter in the body of the function, we have to declare the
parameter name in the signature so the compiler knows what that name means.
Similarly, when we use a type parameter name in a function signature, we have
to declare the type parameter name before we use it. To define the generic
`largest` function, place type name declarations inside angle brackets, `<>`,
between the name of the function and the parameter list, like this:

```rust,ignore
fn largest<T>(list: &[T]) -> T {
```

We read this definition as: the function `largest` is generic over some type
`T`. This function has one parameter named `list`, which is a slice of values
of type `T`. The `largest` function will return a value of the same type `T`.

Listing 10-5 shows the combined `largest` function definition using the generic
data type in its signature. The listing also shows how we can call the function
with either a slice of `i32` values or `char` values. Note that this code won’t
compile yet, but we’ll fix it later in this chapter.

<span class="filename">Filename: src/main.rs</span>

```rust,ignore,does_not_compile
fn largest<T>(list: &[T]) -> T {
    let mut largest = list[0];

    for &item in list.iter() {
        if item > largest {
            largest = item;
        }
    }

    largest
}

fn main() {
    let number_list = vec![34, 50, 25, 100, 65];

    let result = largest(&number_list);
    println!("The largest number is {}", result);

    let char_list = vec!['y', 'm', 'a', 'q'];

    let result = largest(&char_list);
    println!("The largest char is {}", result);
}
```

<span class="caption">Listing 10-5: A definition of the `largest` function that
uses generic type parameters but doesn’t compile yet</span>

If we compile this code right now, we’ll get this error:

```text
error[E0369]: binary operation `>` cannot be applied to type `T`
 --> src/main.rs:5:12
  |
5 |         if item > largest {
  |            ^^^^^^^^^^^^^^
  |
  = note: an implementation of `std::cmp::PartialOrd` might be missing for `T`
```

The note mentions `std::cmp::PartialOrd`, which is a *trait*. We’ll talk about
traits in the next section. For now, this error states that the body of
`largest` won’t work for all possible types that `T` could be. Because we want
to compare values of type `T` in the body, we can only use types whose values
can be ordered. To enable comparisons, the standard library has the
`std::cmp::PartialOrd` trait that you can implement on types (see Appendix C
for more on this trait). You’ll learn how to specify that a generic type has a
particular trait in the [“Traits as Parameters”][traits-as-parameters]<!--
ignore --> section, but let’s first explore other ways of using generic type
parameters.
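
As a preview of where this is heading, here is a minimal sketch of a `largest`
that does compile. It assumes two trait bounds on `T`: `PartialOrd`, so that
`>` is available, and `Copy`, so that values can be taken out of the slice by
value. This is only one possible fix (a version that returns a reference would
not need `Copy`), and the bound syntax itself is explained in the next section.

```rust
// Sketch: `T: PartialOrd + Copy` is an assumed pair of bounds that lets the
// body from Listing 10-5 type-check; other designs are possible.
fn largest<T: PartialOrd + Copy>(list: &[T]) -> T {
    let mut largest = list[0];

    for &item in list.iter() {
        if item > largest {
            largest = item;
        }
    }

    largest
}

fn main() {
    let number_list = vec![34, 50, 25, 100, 65];
    println!("The largest number is {}", largest(&number_list));

    let char_list = vec!['y', 'm', 'a', 'q'];
    println!("The largest char is {}", largest(&char_list));
}
```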

### In Struct Definitions

We can also define structs to use a generic type parameter in one or more
fields using the `<>` syntax. Listing 10-6 shows how to define a `Point<T>`
struct to hold `x` and `y` coordinate values of any type.

<span class="filename">Filename: src/main.rs</span>

```rust
struct Point<T> {
    x: T,
    y: T,
}

fn main() {
    let integer = Point { x: 5, y: 10 };
    let float = Point { x: 1.0, y: 4.0 };
}
```

<span class="caption">Listing 10-6: A `Point<T>` struct that holds `x` and `y`
values of type `T`</span>

The syntax for using generics in struct definitions is similar to that used in
function definitions. First, we declare the name of the type parameter inside
angle brackets just after the name of the struct. Then we can use the generic
type in the struct definition where we would otherwise specify concrete data
types.

Note that because we’ve used only one generic type to define `Point<T>`, this
definition says that the `Point<T>` struct is generic over some type `T`, and
the fields `x` and `y` are *both* that same type, whatever that type may be. If
we create an instance of a `Point<T>` that has values of different types, as in
Listing 10-7, our code won’t compile.

<span class="filename">Filename: src/main.rs</span>

```rust,ignore,does_not_compile
struct Point<T> {
    x: T,
    y: T,
}

fn main() {
    let wont_work = Point { x: 5, y: 4.0 };
}
```

<span class="caption">Listing 10-7: The fields `x` and `y` must be the same
type because both have the same generic data type `T`.</span>

In this example, when we assign the integer value 5 to `x`, we let the
compiler know that the generic type `T` will be an integer for this instance of
`Point<T>`. Then when we specify 4.0 for `y`, which we’ve defined to have the
same type as `x`, we’ll get a type mismatch error like this:

```text
error[E0308]: mismatched types
 --> src/main.rs:7:38
  |
7 |     let wont_work = Point { x: 5, y: 4.0 };
  |                                      ^^^ expected integral variable, found floating-point variable
  |
  = note: expected type `{integer}`
             found type `{float}`
```

To define a `Point` struct where `x` and `y` are both generics but could have
different types, we can use multiple generic type parameters. For example, in
Listing 10-8, we can change the definition of `Point` to be generic over types
`T` and `U` where `x` is of type `T` and `y` is of type `U`.

<span class="filename">Filename: src/main.rs</span>

```rust
struct Point<T, U> {
    x: T,
    y: U,
}

fn main() {
    let both_integer = Point { x: 5, y: 10 };
    let both_float = Point { x: 1.0, y: 4.0 };
    let integer_and_float = Point { x: 5, y: 4.0 };
}
```

<span class="caption">Listing 10-8: A `Point<T, U>` generic over two types so
that `x` and `y` can be values of different types</span>

Now all the instances of `Point` shown are allowed! You can use as many generic
type parameters in a definition as you want, but using more than a few makes
your code hard to read. When you need lots of generic types in your code, it
could indicate that your code needs restructuring into smaller pieces.
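
If it helps to see which concrete types end up filling in `T` and `U`, you can
write them out with a type annotation. The sketch below reuses `Point<T, U>`
from Listing 10-8; the variable names are only illustrative.

```rust
struct Point<T, U> {
    x: T,
    y: U,
}

fn main() {
    // The annotation spells out what inference would otherwise choose.
    let integer_and_float: Point<i32, f64> = Point { x: 5, y: 4.0 };

    // An annotation can also select non-default numeric types.
    let small: Point<u8, f32> = Point { x: 5, y: 4.0 };

    println!("{} {}", integer_and_float.x, integer_and_float.y);
    println!("{} {}", small.x, small.y);
}
```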

### In Enum Definitions

As we did with structs, we can define enums to hold generic data types in their
variants. Let’s take another look at the `Option<T>` enum that the standard
library provides, which we used in Chapter 6:

```rust
enum Option<T> {
    Some(T),
    None,
}
```

This definition should now make more sense to you. As you can see, `Option<T>`
is an enum that is generic over type `T` and has two variants: `Some`, which
holds one value of type `T`, and a `None` variant that doesn’t hold any value.
By using the `Option<T>` enum, we can express the abstract concept of having an
optional value, and because `Option<T>` is generic, we can use this abstraction
no matter what the type of the optional value is.

Enums can use multiple generic types as well. The definition of the `Result`
enum that we used in Chapter 9 is one example:

```rust
enum Result<T, E> {
    Ok(T),
    Err(E),
}
```

The `Result` enum is generic over two types, `T` and `E`, and has two variants:
`Ok`, which holds a value of type `T`, and `Err`, which holds a value of type
`E`. This definition makes it convenient to use the `Result` enum anywhere we
have an operation that might succeed (return a value of some type `T`) or fail
(return an error of some type `E`). In fact, this is what we used to open a
file in Listing 9-3, where `T` was filled in with the type `std::fs::File` when
the file was opened successfully and `E` was filled in with the type
`std::io::Error` when there were problems opening the file.

When you recognize situations in your code with multiple struct or enum
definitions that differ only in the types of the values they hold, you can
avoid duplication by using generic types instead.
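
As a small illustration of that refactoring, suppose a program had two structs,
say `LabeledInt` and `LabeledText`, that differ only in the type of one field.
Both names are hypothetical and don’t come from earlier listings. A single
generic struct, here called `Labeled<T>`, could stand in for both:

```rust
// Hypothetical replacement for two near-identical structs:
// `LabeledInt { label, value: i32 }` and `LabeledText { label, value: String }`.
struct Labeled<T> {
    label: String,
    value: T,
}

fn main() {
    let count = Labeled {
        label: String::from("count"),
        value: 3,
    };
    let name = Labeled {
        label: String::from("name"),
        value: String::from("Ferris"),
    };

    println!("{} = {}", count.label, count.value);
    println!("{} = {}", name.label, name.value);
}
```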

### In Method Definitions

We can implement methods on structs and enums (as we did in Chapter 5) and use
generic types in their definitions, too. Listing 10-9 shows the `Point<T>`
struct we defined in Listing 10-6 with a method named `x` implemented on it.

<span class="filename">Filename: src/main.rs</span>

```rust
struct Point<T> {
    x: T,
    y: T,
}

impl<T> Point<T> {
    fn x(&self) -> &T {
        &self.x
    }
}

fn main() {
    let p = Point { x: 5, y: 10 };

    println!("p.x = {}", p.x());
}
```

<span class="caption">Listing 10-9: Implementing a method named `x` on the
`Point<T>` struct that will return a reference to the `x` field of type
`T`</span>

Here, we’ve defined a method named `x` on `Point<T>` that returns a reference
to the data in the field `x`.

Note that we have to declare `T` just after `impl` so we can use it to specify
that we’re implementing methods on the type `Point<T>`. Declaring `T` as a
generic type after `impl` lets Rust identify that the type in the angle
brackets in `Point` is a generic type rather than a concrete type.

We could, for example, implement methods only on `Point<f32>` instances rather
than on `Point<T>` instances with any generic type. In Listing 10-10 we use the
concrete type `f32`, meaning we don’t declare any types after `impl`.

```rust
# struct Point<T> {
#     x: T,
#     y: T,
# }
#
impl Point<f32> {
    fn distance_from_origin(&self) -> f32 {
        (self.x.powi(2) + self.y.powi(2)).sqrt()
    }
}
```

<span class="caption">Listing 10-10: An `impl` block that only applies to a
struct with a particular concrete type for the generic type parameter `T`</span>

This code means the type `Point<f32>` will have a method named
`distance_from_origin`, and other instances of `Point<T>` where `T` is not of
type `f32` will not have this method defined. The method measures how far our
point is from the point at coordinates (0.0, 0.0) and uses mathematical
operations that are available only for floating-point types.
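
The two kinds of `impl` block can sit side by side. The following sketch
combines Listings 10-9 and 10-10: `x` is available on every `Point<T>`, while
`distance_from_origin` exists only on `Point<f32>`. The variable names are
illustrative, and the commented-out call marks what the compiler would reject.

```rust
struct Point<T> {
    x: T,
    y: T,
}

// Available for every `T`.
impl<T> Point<T> {
    fn x(&self) -> &T {
        &self.x
    }
}

// Available only when `T` is `f32`.
impl Point<f32> {
    fn distance_from_origin(&self) -> f32 {
        (self.x.powi(2) + self.y.powi(2)).sqrt()
    }
}

fn main() {
    let float_point: Point<f32> = Point { x: 3.0, y: 4.0 };
    let int_point = Point { x: 5, y: 10 };

    println!("float_point.x = {}", float_point.x());
    println!("int_point.x = {}", int_point.x());
    println!("distance = {}", float_point.distance_from_origin());

    // int_point.distance_from_origin(); // rejected: no such method on `Point<i32>`
}
```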

Generic type parameters in a struct definition aren’t always the same as those
you use in that struct’s method signatures. For example, Listing 10-11 defines
the method `mixup` on the `Point<T, U>` struct from Listing 10-8. The method
takes another `Point` as a parameter, which might have different types from the
`self` `Point` we’re calling `mixup` on. The method creates a new `Point`
instance with the `x` value from the `self` `Point` (of type `T`) and the `y`
value from the passed-in `Point` (of type `W`).

<span class="filename">Filename: src/main.rs</span>

```rust
struct Point<T, U> {
    x: T,
    y: U,
}

impl<T, U> Point<T, U> {
    fn mixup<V, W>(self, other: Point<V, W>) -> Point<T, W> {
        Point {
            x: self.x,
            y: other.y,
        }
    }
}

fn main() {
    let p1 = Point { x: 5, y: 10.4 };
    let p2 = Point { x: "Hello", y: 'c' };

    let p3 = p1.mixup(p2);

    println!("p3.x = {}, p3.y = {}", p3.x, p3.y);
}
```

<span class="caption">Listing 10-11: A method that uses different generic types
from its struct’s definition</span>

In `main`, we’ve defined a `Point` that has an `i32` for `x` (with value `5`)
and an `f64` for `y` (with value `10.4`). The `p2` variable is a `Point` struct
that has a string slice for `x` (with value `"Hello"`) and a `char` for `y`
(with value `'c'`). Calling `mixup` on `p1` with the argument `p2` gives us
`p3`, which will have an `i32` for `x`, because `x` came from `p1`. The `p3`
variable will have a `char` for `y`, because `y` came from `p2`. The `println!`
macro call will print `p3.x = 5, p3.y = c`.

The purpose of this example is to demonstrate a situation in which some generic
parameters are declared with `impl` and some are declared with the method
definition. Here, the generic parameters `T` and `U` are declared after `impl`,
because they go with the struct definition. The generic parameters `V` and `W`
are declared after `fn mixup`, because they’re only relevant to the method.

### Performance of Code Using Generics

You might be wondering whether there is a runtime cost when you’re using
generic type parameters. The good news is that Rust implements generics in such
a way that your code doesn’t run any slower using generic types than it would
with concrete types.

Rust accomplishes this by performing monomorphization of the code that is using
generics at compile time. *Monomorphization* is the process of turning generic
code into specific code by filling in the concrete types that are used when
compiled.

In this process, the compiler does the opposite of the steps we used to create
the generic function in Listing 10-5: the compiler looks at all the places
where generic code is called and generates code for the concrete types the
generic code is called with.

Let’s look at how this works with an example that uses the standard library’s
`Option<T>` enum:

```rust
let integer = Some(5);
let float = Some(5.0);
```

When Rust compiles this code, it performs monomorphization. During that
process, the compiler reads the values that have been used in `Option<T>`
instances and identifies two kinds of `Option<T>`: one holds an `i32` and the
other holds an `f64`. As such, it expands the generic definition of `Option<T>`
into `Option_i32` and `Option_f64`, thereby replacing the generic definition
with the specific ones.

The monomorphized version of the code looks like the following. The generic
`Option<T>` is replaced with the specific definitions created by the compiler:

<span class="filename">Filename: src/main.rs</span>

```rust
enum Option_i32 {
    Some(i32),
    None,
}

enum Option_f64 {
    Some(f64),
    None,
}

fn main() {
    let integer = Option_i32::Some(5);
    let float = Option_f64::Some(5.0);
}
```

Because Rust compiles generic code into code that specifies the type in each
instance, we pay no runtime cost for using generics. When the code runs, it
performs just as it would if we had duplicated each definition by hand. The
process of monomorphization makes Rust’s generics extremely efficient at
runtime.
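
The same idea applies to generic functions. The sketch below pairs a generic
function with the hand-written copies it is conceptually equivalent to. The
names `wrap`, `wrap_i32`, and `wrap_f64` are made up for illustration; the
code the compiler actually generates is named and laid out differently.

```rust
// Generic version: one definition for every `T`.
fn wrap<T>(value: T) -> Option<T> {
    Some(value)
}

// Roughly what monomorphization yields for the two calls in `main`.
fn wrap_i32(value: i32) -> Option<i32> {
    Some(value)
}

fn wrap_f64(value: f64) -> Option<f64> {
    Some(value)
}

fn main() {
    let integer = wrap(5);
    let float = wrap(5.0);

    println!("{:?} {:?}", integer, float);
    println!("{:?} {:?}", wrap_i32(5), wrap_f64(5.0));
}
```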

[traits-as-parameters]: ch10-02-traits.html#traits-as-parameters