src/doc/book/second-edition/src/ch10-01-syntax.md

## Generic Data Types

Using generics where we usually place types, like in function signatures or
structs, lets us create definitions that we can use for many different concrete
data types. Let’s take a look at how to define functions, structs, enums, and
methods using generics, and at the end of this section we’ll discuss the
performance of code using generics.

### Using Generic Data Types in Function Definitions

When defining a function that uses generics, we place the generics in the
function’s signature, where we would usually specify the data types of the
parameters and return value. Doing so makes our code more flexible and provides
more functionality to callers of our function, without introducing code
duplication.

Continuing with our `largest` function, Listing 10-4 shows two functions
providing the same functionality to find the largest value in a slice. The
first function is the one we extracted in Listing 10-3 that finds the largest
`i32` in a slice. The second function finds the largest `char` in a slice:

<span class="filename">Filename: src/main.rs</span>

```rust
fn largest_i32(list: &[i32]) -> i32 {
    let mut largest = list[0];

    for &item in list.iter() {
        if item > largest {
            largest = item;
        }
    }

    largest
}

fn largest_char(list: &[char]) -> char {
    let mut largest = list[0];

    for &item in list.iter() {
        if item > largest {
            largest = item;
        }
    }

    largest
}

fn main() {
    let number_list = vec![34, 50, 25, 100, 65];

    let result = largest_i32(&number_list);
    println!("The largest number is {}", result);
#    assert_eq!(result, 100);

    let char_list = vec!['y', 'm', 'a', 'q'];

    let result = largest_char(&char_list);
    println!("The largest char is {}", result);
#    assert_eq!(result, 'y');
}
```

<span class="caption">Listing 10-4: Two functions that differ only in their
names and the types in their signatures</span>

Here, the functions `largest_i32` and `largest_char` have the exact same body,
so it would be nice if we could turn these two functions into one and get rid
of the duplication. Luckily, we can do that by introducing a generic type
parameter!

To parameterize the types in the signature of the one function we’re going to
define, we need to create a name for the type parameter, just as we name the
value parameters of a function. We’re going to choose the name `T`. Any
identifier can be used as a type parameter name, but we’re choosing `T` because
Rust’s type naming convention is CamelCase. Generic type parameter names also
tend to be short by convention, often just one letter. Short for “type”, `T` is
the default choice of most Rust programmers.

When we use a parameter in the body of the function, we have to declare the
parameter in the signature so that the compiler knows what that name in the
body means. Similarly, when we use a type parameter name in a function
signature, we have to declare the type parameter name before we use it. Type
parameter declarations go in angle brackets between the name of the function
and the parameter list.

The function signature of the generic `largest` function we’re going to define
will look like this:

```rust,ignore
fn largest<T>(list: &[T]) -> T {
```

We would read this as: the function `largest` is generic over some type `T`. It
has one parameter named `list`, and the type of `list` is a slice of values of
type `T`. The `largest` function will return a value of the same type `T`.

Listing 10-5 shows the unified `largest` function definition using the generic
data type in its signature, and shows how we’ll be able to call `largest` with
either a slice of `i32` values or `char` values. Note that this code won’t
compile yet!

<span class="filename">Filename: src/main.rs</span>

```rust,ignore
fn largest<T>(list: &[T]) -> T {
    let mut largest = list[0];

    for &item in list.iter() {
        if item > largest {
            largest = item;
        }
    }

    largest
}

fn main() {
    let number_list = vec![34, 50, 25, 100, 65];

    let result = largest(&number_list);
    println!("The largest number is {}", result);

    let char_list = vec!['y', 'm', 'a', 'q'];

    let result = largest(&char_list);
    println!("The largest char is {}", result);
}
```

<span class="caption">Listing 10-5: A definition of the `largest` function that
uses generic type parameters but doesn’t compile yet</span>

If we try to compile this code right now, we’ll get this error:

```text
error[E0369]: binary operation `>` cannot be applied to type `T`
  |
5 |         if item > largest {
  |            ^^^^
  |
note: an implementation of `std::cmp::PartialOrd` might be missing for `T`
```

The note mentions `std::cmp::PartialOrd`, which is a *trait*. We’re going to
talk about traits in the next section, but briefly, what this error is saying
is that the body of `largest` won’t work for all possible types that `T` could
be: since we want to compare values of type `T` in the body, we can only use
types that know how to be ordered. The standard library has defined the trait
`std::cmp::PartialOrd` that types can implement to enable comparisons. We’ll
come back to how to specify that a generic type has a particular trait in the
next section, but let’s set this example aside for a moment and explore other
places we can use generic type parameters first.
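
For readers who want a preview before moving on, here is a minimal sketch of
one way the generic `largest` could be made to compile. It assumes two trait
bounds on `T`: `PartialOrd`, which the compiler note asks for, and `Copy`, so
that `list[0]` can be copied out of the slice. Restricting `T` to `Copy` types
is a simplifying assumption of this sketch rather than the only option, and the
bound syntax itself is explained in the next section.

```rust
// Sketch only: the `PartialOrd + Copy` bounds are an assumption of this
// example; trait bounds are covered in the next section.
fn largest<T: PartialOrd + Copy>(list: &[T]) -> T {
    // Copying a value out of the slice is what requires `Copy`.
    let mut largest = list[0];

    for &item in list.iter() {
        // The `>` comparison is what requires `PartialOrd`.
        if item > largest {
            largest = item;
        }
    }

    largest
}

fn main() {
    let number_list = vec![34, 50, 25, 100, 65];
    println!("The largest number is {}", largest(&number_list));

    let char_list = vec!['y', 'm', 'a', 'q'];
    println!("The largest char is {}", largest(&char_list));
}
```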

### Using Generic Data Types in Struct Definitions

We can define structs to use a generic type parameter in one or more of the
struct’s fields with the `<>` syntax too. Listing 10-6 shows the definition and
use of a `Point` struct that can hold `x` and `y` coordinate values of any type:

<span class="filename">Filename: src/main.rs</span>

```rust
struct Point<T> {
    x: T,
    y: T,
}

fn main() {
    let integer = Point { x: 5, y: 10 };
    let float = Point { x: 1.0, y: 4.0 };
}
```

<span class="caption">Listing 10-6: A `Point` struct that holds `x` and `y`
values of type `T`</span>

The syntax is similar to using generics in function definitions. First, we have
to declare the name of the type parameter within angle brackets just after the
name of the struct. Then we can use the generic type in the struct definition
where we would specify concrete data types.
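
To make the inferred types visible, the following sketch spells out what `T`
becomes for each instance. It shows two ways of naming the concrete type: an
annotation on the variable, and the type written on the struct name itself
(often called the “turbofish” form). Neither is required here, since the
compiler can infer `T` from the field values.

```rust
struct Point<T> {
    x: T,
    y: T,
}

fn main() {
    // The annotation spells out what the compiler would otherwise infer.
    let integer: Point<i32> = Point { x: 5, y: 10 };

    // The type parameter can also be named on the struct itself.
    let float = Point::<f64> { x: 1.0, y: 4.0 };

    println!("integer = ({}, {})", integer.x, integer.y);
    println!("float = ({}, {})", float.x, float.y);
}
```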

Note that because we’ve only used one generic type in the definition of
`Point`, what we’re saying is that the `Point` struct is generic over some type
`T`, and the fields `x` and `y` are *both* that same type, whatever it ends up
being. If we try to create an instance of a `Point` that has values of
different types, as in Listing 10-7, our code won’t compile:

<span class="filename">Filename: src/main.rs</span>

```rust,ignore
struct Point<T> {
    x: T,
    y: T,
}

fn main() {
    let wont_work = Point { x: 5, y: 4.0 };
}
```

<span class="caption">Listing 10-7: The fields `x` and `y` must be the same
type because both have the same generic data type `T`</span>

If we try to compile this, we’ll get the following error:

```text
error[E0308]: mismatched types
 -->
  |
7 |     let wont_work = Point { x: 5, y: 4.0 };
  |                                      ^^^ expected integral variable, found
  floating-point variable
  |
  = note: expected type `{integer}`
  = note:    found type `{float}`
```

When we assign the integer value 5 to `x`, the compiler knows that for this
instance of `Point` the generic type `T` will be an integer. Then when we
specify 4.0 for `y`, which is defined to have the same type as `x`, we get a
type mismatch error.

If we want to define a `Point` struct where `x` and `y` can have different
types but still have those types be generic, we can use multiple generic type
parameters. In Listing 10-8, we’ve changed the definition of `Point` to be
generic over types `T` and `U`. The field `x` is of type `T`, and the field `y`
is of type `U`:

<span class="filename">Filename: src/main.rs</span>

```rust
struct Point<T, U> {
    x: T,
    y: U,
}

fn main() {
    let both_integer = Point { x: 5, y: 10 };
    let both_float = Point { x: 1.0, y: 4.0 };
    let integer_and_float = Point { x: 5, y: 4.0 };
}
```

<span class="caption">Listing 10-8: A `Point` generic over two types so that
`x` and `y` may be values of different types</span>

Now all of these instances of `Point` are allowed! You can use as many generic
type parameters in a definition as you want, but using more than a few gets
hard to read and understand. If you get to a point of needing lots of generic
types, it’s probably a sign that your code could use some restructuring to be
separated into smaller pieces.

### Using Generic Data Types in Enum Definitions

Similarly to structs, enums can be defined to hold generic data types in their
variants. We used the `Option<T>` enum provided by the standard library in
Chapter 6, and now its definition should make more sense. Let’s take another
look:

```rust
enum Option<T> {
    Some(T),
    None,
}
```

In other words, `Option<T>` is an enum generic in type `T`. It has two
variants: `Some`, which holds one value of type `T`, and a `None` variant that
doesn’t hold any value. The standard library only has to have this one
definition to support the creation of values of this enum that have any
concrete type. The idea of “an optional value” is a more abstract concept than
one specific type, and Rust lets us express this abstract concept without lots
of duplication.
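
As an illustration of one definition serving many concrete types, the sketch
below uses the standard library’s `Option<T>` as the return type of a generic
function. The `first` helper is hypothetical and exists only for this example;
it returns a reference to the first element of a slice, or `None` if the slice
is empty.

```rust
// Hypothetical helper for illustration: returns the first element of a
// slice, if there is one. The same `Option<T>` definition covers every `T`.
fn first<T>(list: &[T]) -> Option<&T> {
    if list.is_empty() {
        None
    } else {
        Some(&list[0])
    }
}

fn main() {
    let numbers = [10, 20, 30];
    let words: [&str; 0] = [];

    // Here the result is an `Option<&i32>`.
    match first(&numbers) {
        Some(n) => println!("first number: {}", n),
        None => println!("no numbers"),
    }

    // Here the result is an `Option<&&str>`, and the slice is empty.
    match first(&words) {
        Some(w) => println!("first word: {}", w),
        None => println!("no words"),
    }
}
```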

Enums can use multiple generic types as well. The definition of the `Result`
enum that we used in Chapter 9 is one example:

```rust
enum Result<T, E> {
    Ok(T),
    Err(E),
}
```

The `Result` enum is generic over two types, `T` and `E`. `Result` has two
variants: `Ok`, which holds a value of type `T`, and `Err`, which holds a value
of type `E`. This definition makes it convenient to use the `Result` enum
anywhere we have an operation that might succeed (and return a value of some
type `T`) or fail (and return an error of some type `E`). Recall Listing 9-2
when we opened a file: in that case, `T` was filled in with the type
`std::fs::File` when the file was opened successfully and `E` was filled in
with the type `std::io::Error` when there were problems opening the file.
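
The same filling-in happens for any fallible operation, not just file access.
As a small sketch, the hypothetical `parse_number` function below returns a
`Result` where `T` is `i32` and `E` is `std::num::ParseIntError`, the error
type the standard library uses when parsing an integer from a string fails.

```rust
use std::num::ParseIntError;

// Hypothetical helper for illustration: `T` is filled in with `i32` and
// `E` with `ParseIntError`.
fn parse_number(text: &str) -> Result<i32, ParseIntError> {
    text.trim().parse::<i32>()
}

fn main() {
    match parse_number("42") {
        Ok(n) => println!("parsed {}", n),
        Err(e) => println!("could not parse: {}", e),
    }

    match parse_number("forty-two") {
        Ok(n) => println!("parsed {}", n),
        Err(e) => println!("could not parse: {}", e),
    }
}
```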

When you recognize situations in your code with multiple struct or enum
definitions that differ only in the types of the values they hold, you can
remove the duplication by using the same process we used with the function
definitions to introduce generic types instead.

### Using Generic Data Types in Method Definitions

As we did in Chapter 5, we can implement methods on structs and enums that
have generic types in their definitions. Listing 10-9 shows the `Point<T>`
struct we defined in Listing 10-6. We’ve then defined a method named `x` on
`Point<T>` that returns a reference to the data in the field `x`:

<span class="filename">Filename: src/main.rs</span>

```rust
struct Point<T> {
    x: T,
    y: T,
}

impl<T> Point<T> {
    fn x(&self) -> &T {
        &self.x
    }
}

fn main() {
    let p = Point { x: 5, y: 10 };

    println!("p.x = {}", p.x());
}
```

<span class="caption">Listing 10-9: Implementing a method named `x` on the
`Point<T>` struct that will return a reference to the `x` field, which is of
type `T`</span>

Note that we have to declare `T` just after `impl` in order to use `T` in the
type `Point<T>`. Declaring `T` as a generic type after the `impl` is how Rust
knows the type in the angle brackets in `Point` is a generic type rather than a
concrete type. For example, we could choose to implement methods on
`Point<f32>` instances rather than `Point` instances with any generic type.
Listing 10-10 shows that we don’t declare anything after the `impl` in this
case, since we’re using a concrete type, `f32`:

```rust
# struct Point<T> {
#     x: T,
#     y: T,
# }
#
impl Point<f32> {
    fn distance_from_origin(&self) -> f32 {
        (self.x.powi(2) + self.y.powi(2)).sqrt()
    }
}
```

<span class="caption">Listing 10-10: An `impl` block that only applies to a
struct with a particular concrete type for the generic type parameter
`T`</span>

This code means the type `Point<f32>` will have a method named
`distance_from_origin`, and other instances of `Point<T>` where `T` is not
`f32` will not have this method defined. This method measures how far our
point is from the point at coordinates (0.0, 0.0) and uses mathematical
operations that are only available for floating-point types.
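
The following sketch puts the generic `impl` block and the `f32`-only `impl`
block side by side to show which methods each instance gets. The coordinate
values are chosen for illustration only.

```rust
struct Point<T> {
    x: T,
    y: T,
}

// Available on every `Point<T>`, whatever `T` is.
impl<T> Point<T> {
    fn x(&self) -> &T {
        &self.x
    }
}

// Available only when `T` is `f32`.
impl Point<f32> {
    fn distance_from_origin(&self) -> f32 {
        (self.x.powi(2) + self.y.powi(2)).sqrt()
    }
}

fn main() {
    let float_point: Point<f32> = Point { x: 3.0, y: 4.0 };
    let int_point = Point { x: 3, y: 4 };

    // Both instances have the generic method `x`.
    println!("float x = {}", float_point.x());
    println!("int x = {}", int_point.x());

    // Only the `Point<f32>` instance has `distance_from_origin`.
    println!("distance = {}", float_point.distance_from_origin());

    // Uncommenting the next line is a compile error, because `Point<i32>`
    // has no method named `distance_from_origin`:
    // int_point.distance_from_origin();
}
```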

Generic type parameters in a struct definition aren’t always the same generic
type parameters you want to use in that struct’s method signatures. Listing
10-11 defines a method `mixup` on the `Point<T, U>` struct from Listing 10-8.
The method takes another `Point` as a parameter, which might have different
types than the `self` `Point` that we’re calling `mixup` on. The method creates
a new `Point` instance that has the `x` value from the `self` `Point` (which is
of type `T`) and the `y` value from the passed-in `Point` (which is of type
`W`):

<span class="filename">Filename: src/main.rs</span>

```rust
struct Point<T, U> {
    x: T,
    y: U,
}

impl<T, U> Point<T, U> {
    fn mixup<V, W>(self, other: Point<V, W>) -> Point<T, W> {
        Point {
            x: self.x,
            y: other.y,
        }
    }
}

fn main() {
    let p1 = Point { x: 5, y: 10.4 };
    let p2 = Point { x: "Hello", y: 'c' };

    let p3 = p1.mixup(p2);

    println!("p3.x = {}, p3.y = {}", p3.x, p3.y);
}
```

<span class="caption">Listing 10-11: Methods that use different generic types
than their struct’s definition</span>

In `main`, we’ve defined a `Point` that has an `i32` for `x` (with value `5`)
and an `f64` for `y` (with value `10.4`). `p2` is a `Point` that has a string
slice for `x` (with value `"Hello"`) and a `char` for `y` (with value `c`).
Calling `mixup` on `p1` with the argument `p2` gives us `p3`, which will have
an `i32` for `x`, since `x` came from `p1`. `p3` will have a `char` for `y`,
since `y` came from `p2`. The `println!` will print `p3.x = 5, p3.y = c`.

Note that the generic parameters `T` and `U` are declared after `impl`, since
they go with the struct definition. The generic parameters `V` and `W` are
declared after `fn mixup`, since they are only relevant to the method.

### Performance of Code Using Generics

You may have been reading this section and wondering if there’s a runtime cost
to using generic type parameters. Good news: the way that Rust has implemented
generics means that your code will not run any slower than if you had specified
concrete types instead of generic type parameters!

Rust accomplishes this by performing *monomorphization* of code using generics
at compile time. Monomorphization is the process of turning generic code into
specific code with the concrete types that are actually used filled in.

What the compiler does is the opposite of the steps that we performed to create
the generic function in Listing 10-5. The compiler looks at all the places that
generic code is called and generates code for the concrete types that the
generic code is called with.

Let’s work through an example that uses the standard library’s `Option` enum:

```rust
let integer = Some(5);
let float = Some(5.0);
```

When Rust compiles this code, it will perform monomorphization. The compiler
will look at the values used to create instances of `Option<T>` and see that we
have two kinds of `Option<T>`: one holding an `i32`, and one holding an `f64`.
As such, it will expand the generic definition of `Option<T>` into `Option_i32`
and `Option_f64`, thereby replacing the generic definition with the specific
ones.

The monomorphized version of our code that the compiler generates looks like
this, with the uses of the generic `Option` replaced with the specific
definitions created by the compiler:

<span class="filename">Filename: src/main.rs</span>

```rust
enum Option_i32 {
    Some(i32),
    None,
}

enum Option_f64 {
    Some(f64),
    None,
}

fn main() {
    let integer = Option_i32::Some(5);
    let float = Option_f64::Some(5.0);
}
```

We can write the non-duplicated code using generics, and Rust will compile that
into code that specifies the type in each instance. That means we pay no
runtime cost for using generics; when the code runs, it performs just like it
would if we had duplicated each particular definition by hand. The process of
monomorphization is what makes Rust’s generics extremely efficient at runtime.
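
The same reasoning applies to generic functions. The sketch below writes one
generic function, `wrap`, next to hand-written copies that approximate what
monomorphization would generate for the two calls in `main`. All three function
names are hypothetical and chosen for illustration; the compiler picks its own
internal names for the code it generates.

```rust
// Generic version: written once.
fn wrap<T>(value: T) -> Option<T> {
    Some(value)
}

// Roughly what monomorphization produces for the two calls in `main`.
// These names are hypothetical; the compiler uses its own internal names.
fn wrap_i32(value: i32) -> Option<i32> {
    Some(value)
}

fn wrap_f64(value: f64) -> Option<f64> {
    Some(value)
}

fn main() {
    let integer = wrap(5); // `T` is `i32` here
    let float = wrap(5.0); // `T` is `f64` here

    // The generic calls behave exactly like the hand-written copies.
    assert_eq!(integer, wrap_i32(5));
    assert_eq!(float, wrap_f64(5.0));

    println!("{:?} {:?}", integer, float);
}
```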