New upstream version 1.63.0+dfsg1
1 <!-- DO NOT EDIT THIS FILE.
2
3 This file is periodically generated from the content in the `/src/`
4 directory, so all fixes need to be made in `/src/`.
5 -->
6
7 [TOC]
8
9 # Advanced Features
10
11 By now, you’ve learned the most commonly used parts of the Rust programming
12 language. Before we do one more project in Chapter 20, we’ll look at a few
13 aspects of the language you might run into every once in a while, but may not
14 use every day. You can use this chapter as a reference for when you encounter
15 any unknowns. The features covered here are useful in very specific situations.
16 Although you might not reach for them often, we want to make sure you have a
17 grasp of all the features Rust has to offer.
18
19 In this chapter, we’ll cover:
20
21 * Unsafe Rust: how to opt out of some of Rust’s guarantees and take
22 responsibility for manually upholding those guarantees
23 * Advanced traits: associated types, default type parameters, fully qualified
24 syntax, supertraits, and the newtype pattern in relation to traits
25 * Advanced types: more about the newtype pattern, type aliases, the never type,
26 and dynamically sized types
27 * Advanced functions and closures: function pointers and returning closures
28 * Macros: ways to define code that defines more code at compile time
29
30 It’s a panoply of Rust features with something for everyone! Let’s dive in!
31
32 ## Unsafe Rust
33
34 All the code we’ve discussed so far has had Rust’s memory safety guarantees
35 enforced at compile time. However, Rust has a second language hidden inside it
36 that doesn’t enforce these memory safety guarantees: it’s called *unsafe Rust*
37 and works just like regular Rust, but gives us extra superpowers.
38
39 Unsafe Rust exists because, by nature, static analysis is conservative. When
40 the compiler tries to determine whether or not code upholds the guarantees,
41 it’s better for it to reject some valid programs than to accept some invalid
42 programs. Although the code *might* be okay, if the Rust compiler doesn’t have
43 enough information to be confident, it will reject the code. In these cases,
44 you can use unsafe code to tell the compiler, “Trust me, I know what I’m
45 doing.” Be warned, however, that you use unsafe Rust at your own risk: if you
46 use unsafe code incorrectly, problems can occur due to memory unsafety, such as
47 null pointer dereferencing.
48
49 Another reason Rust has an unsafe alter ego is that the underlying computer
50 hardware is inherently unsafe. If Rust didn’t let you do unsafe operations, you
51 couldn’t do certain tasks. Rust needs to allow you to do low-level systems
52 programming, such as directly interacting with the operating system or even
53 writing your own operating system. Working with low-level systems programming
54 is one of the goals of the language. Let’s explore what we can do with unsafe
55 Rust and how to do it.
56
57 ### Unsafe Superpowers
58
59 To switch to unsafe Rust, use the `unsafe` keyword and then start a new block
60 that holds the unsafe code. You can take five actions in unsafe Rust that you
61 can’t in safe Rust, which we call *unsafe superpowers*. Those superpowers
62 include the ability to:
63
64 * Dereference a raw pointer
65 * Call an unsafe function or method
66 * Access or modify a mutable static variable
67 * Implement an unsafe trait
68 * Access fields of `union`s
69
70 It’s important to understand that `unsafe` doesn’t turn off the borrow checker
71 or disable any other of Rust’s safety checks: if you use a reference in unsafe
72 code, it will still be checked. The `unsafe` keyword only gives you access to
73 these five features that are then not checked by the compiler for memory
74 safety. You’ll still get some degree of safety inside of an unsafe block.
75
76 In addition, `unsafe` does not mean the code inside the block is necessarily
77 dangerous or that it will definitely have memory safety problems: the intent is
78 that as the programmer, you’ll ensure the code inside an `unsafe` block will
79 access memory in a valid way.
80
81 People are fallible, and mistakes will happen, but by requiring these five
82 unsafe operations to be inside blocks annotated with `unsafe` you’ll know that
83 any errors related to memory safety must be within an `unsafe` block. Keep
84 `unsafe` blocks small; you’ll be thankful later when you investigate memory
85 bugs.
86
87 To isolate unsafe code as much as possible, it’s best to enclose unsafe code
88 within a safe abstraction and provide a safe API, which we’ll discuss later in
89 the chapter when we examine unsafe functions and methods. Parts of the standard
90 library are implemented as safe abstractions over unsafe code that has been
91 audited. Wrapping unsafe code in a safe abstraction prevents uses of `unsafe`
92 from leaking out into all the places that you or your users might want to use
93 the functionality implemented with `unsafe` code, because using a safe
94 abstraction is safe.
95
96 Let’s look at each of the five unsafe superpowers in turn. We’ll also look at
97 some abstractions that provide a safe interface to unsafe code.
98
99 ### Dereferencing a Raw Pointer
100
101 In Chapter 4, in the “Dangling References” section, we mentioned that the
102 compiler ensures references are always valid. Unsafe Rust has two new types
103 called *raw pointers* that are similar to references. As with references, raw
104 pointers can be immutable or mutable and are written as `*const T` and `*mut
105 T`, respectively. The asterisk isn’t the dereference operator; it’s part of the
106 type name. In the context of raw pointers, *immutable* means that the pointer
107 can’t be directly assigned to after being dereferenced.
108
109 Different from references and smart pointers, raw pointers:
110
111 * Are allowed to ignore the borrowing rules by having both immutable and
112 mutable pointers or multiple mutable pointers to the same location
113 * Aren’t guaranteed to point to valid memory
114 * Are allowed to be null
115 * Don’t implement any automatic cleanup
116
117 By opting out of having Rust enforce these guarantees, you can give up
118 guaranteed safety in exchange for greater performance or the ability to
119 interface with another language or hardware where Rust’s guarantees don’t apply.
120
121 Listing 19-1 shows how to create an immutable and a mutable raw pointer from
122 references.
123
```
let mut num = 5;

let r1 = &num as *const i32;
let r2 = &mut num as *mut i32;
```
130
131 Listing 19-1: Creating raw pointers from references
132
133 Notice that we don’t include the `unsafe` keyword in this code. We can create
134 raw pointers in safe code; we just can’t dereference raw pointers outside an
135 unsafe block, as you’ll see in a bit.
136
137 We’ve created raw pointers by using `as` to cast an immutable and a mutable
138 reference into their corresponding raw pointer types. Because we created them
139 directly from references guaranteed to be valid, we know these particular raw
140 pointers are valid, but we can’t make that assumption about just any raw
141 pointer.
142
143 To demonstrate this, next we’ll create a raw pointer whose validity we can’t be
144 so certain of. Listing 19-2 shows how to create a raw pointer to an arbitrary
145 location in memory. Trying to use arbitrary memory is undefined: there might be
146 data at that address or there might not, the compiler might optimize the code
147 so there is no memory access, or the program might error with a segmentation
148 fault. Usually, there is no good reason to write code like this, but it is
149 possible.
150
```
let address = 0x012345usize;
let r = address as *const i32;
```
155
156 Listing 19-2: Creating a raw pointer to an arbitrary memory address
157
Recall that we can create raw pointers in safe code, but we can’t *dereference*
raw pointers and read the data being pointed to outside an `unsafe` block. In
Listing 19-3, we use the dereference operator `*` on raw pointers, which
requires an `unsafe` block.
161
```
let mut num = 5;

let r1 = &num as *const i32;
let r2 = &mut num as *mut i32;

unsafe {
    println!("r1 is: {}", *r1);
    println!("r2 is: {}", *r2);
}
```
173
174 Listing 19-3: Dereferencing raw pointers within an `unsafe` block
175
176 Creating a pointer does no harm; it’s only when we try to access the value that
177 it points at that we might end up dealing with an invalid value.
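As a minimal sketch of that distinction, the hypothetical helper `read_via_raw` below (not part of the standard library) builds a raw pointer that may be null and checks it before the one unsafe step, the dereference. It assumes the pointer only ever comes from a valid reference or from `std::ptr::null`.

```rust
use std::ptr;

/// Hypothetical helper: turns an optional reference into a raw pointer
/// (null when there is no reference) and reads the value back through it.
fn read_via_raw(source: Option<&i32>) -> Option<i32> {
    // Creating the pointer is safe, even when it ends up null.
    let ptr: *const i32 = match source {
        Some(reference) => reference as *const i32,
        None => ptr::null(),
    };

    if ptr.is_null() {
        None
    } else {
        // Only the dereference needs `unsafe`. It is sound here because a
        // non-null `ptr` was created from a valid reference just above.
        Some(unsafe { *ptr })
    }
}

fn main() {
    let num = 5;
    println!("{:?}", read_via_raw(Some(&num)));
    println!("{:?}", read_via_raw(None));
}
```

A null check alone does not prove that an arbitrary pointer is valid; it works in this sketch only because of where the pointer comes from.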
178
179 Note also that in Listing 19-1 and 19-3, we created `*const i32` and `*mut i32`
180 raw pointers that both pointed to the same memory location, where `num` is
181 stored. If we instead tried to create an immutable and a mutable reference to
182 `num`, the code would not have compiled because Rust’s ownership rules don’t
183 allow a mutable reference at the same time as any immutable references. With
184 raw pointers, we can create a mutable pointer and an immutable pointer to the
185 same location and change data through the mutable pointer, potentially creating
186 a data race. Be careful!
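The following sketch shows a write through a mutable raw pointer being observed through an immutable raw pointer to the same location. The function name `increment_through_raw` is made up for illustration, and as a simplifying assumption both pointers are derived from a single mutable borrow rather than from two separate references.

```rust
/// Hypothetical example: writes through a `*mut i32` and reads the same
/// location through a `*const i32`.
fn increment_through_raw(start: i32) -> (i32, i32) {
    let mut num = start;

    // Both raw pointers come from one mutable borrow of `num`.
    let writer = &mut num as *mut i32;
    let reader = writer as *const i32;

    unsafe {
        *writer += 1;
        // The immutable pointer sees the change made through the mutable one.
        let seen = *reader;
        (seen, *writer)
    }
}

fn main() {
    let (seen, current) = increment_through_raw(5);
    println!("reader saw {seen}, writer holds {current}");
}
```

Nothing stops two threads from doing this at the same time, which is where the data race mentioned above would come from.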
187
188 With all of these dangers, why would you ever use raw pointers? One major use
189 case is when interfacing with C code, as you’ll see in the next section,
190 “Calling an Unsafe Function or Method.” Another case is when building up safe
191 abstractions that the borrow checker doesn’t understand. We’ll introduce unsafe
192 functions and then look at an example of a safe abstraction that uses unsafe
193 code.
194
195 ### Calling an Unsafe Function or Method
196
197 The second type of operation you can perform in an unsafe block is calling
198 unsafe functions. Unsafe functions and methods look exactly like regular
199 functions and methods, but they have an extra `unsafe` before the rest of the
200 definition. The `unsafe` keyword in this context indicates the function has
201 requirements we need to uphold when we call this function, because Rust can’t
202 guarantee we’ve met these requirements. By calling an unsafe function within an
203 `unsafe` block, we’re saying that we’ve read this function’s documentation and
204 take responsibility for upholding the function’s contracts.
205
206 Here is an unsafe function named `dangerous` that doesn’t do anything in its
207 body:
208
```
unsafe fn dangerous() {}

unsafe {
    dangerous();
}
```
216
217 We must call the `dangerous` function within a separate `unsafe` block. If we
218 try to call `dangerous` without the `unsafe` block, we’ll get an error:
219
```
error[E0133]: call to unsafe function is unsafe and requires unsafe function or block
 --> src/main.rs:4:5
  |
4 |     dangerous();
  |     ^^^^^^^^^^^ call to unsafe function
  |
  = note: consult the function's documentation for information on how to avoid undefined behavior
```
229
230 With the `unsafe` block, we’re asserting to Rust that we’ve read the function’s
231 documentation, we understand how to use it properly, and we’ve verified that
232 we’re fulfilling the contract of the function.
233
234 Bodies of unsafe functions are effectively `unsafe` blocks, so to perform other
235 unsafe operations within an unsafe function, we don’t need to add another
236 `unsafe` block.
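To make the idea of a contract concrete, here is a sketch of an unsafe function that documents its requirement, plus a safe caller that checks it. Both `get_unchecked_i32` and `get_or_zero` are hypothetical names introduced for this example.

```rust
/// Returns the element at `index` without a bounds check.
///
/// # Safety
///
/// The caller must guarantee that `index < values.len()`.
unsafe fn get_unchecked_i32(values: &[i32], index: usize) -> i32 {
    // The body of an unsafe function can perform unsafe operations directly.
    // (Newer toolchains may lint on this and suggest an explicit inner
    // `unsafe` block instead.)
    *values.as_ptr().add(index)
}

/// Safe caller: verifies the contract before making the unsafe call.
fn get_or_zero(values: &[i32], index: usize) -> i32 {
    if index < values.len() {
        // The bounds check above fulfills the documented requirement.
        unsafe { get_unchecked_i32(values, index) }
    } else {
        0
    }
}

fn main() {
    let values = [10, 20, 30];
    println!("{}", get_or_zero(&values, 1));
    println!("{}", get_or_zero(&values, 3));
}
```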
237
238 #### Creating a Safe Abstraction over Unsafe Code
239
240 Just because a function contains unsafe code doesn’t mean we need to mark the
241 entire function as unsafe. In fact, wrapping unsafe code in a safe function is
242 a common abstraction. As an example, let’s study the `split_at_mut` function
243 from the standard library, which requires some unsafe code. We’ll explore how
244 we might implement it. This safe method is defined on mutable slices: it takes
245 one slice and makes it two by splitting the slice at the index given as an
246 argument. Listing 19-4 shows how to use `split_at_mut`.
247
```
let mut v = vec![1, 2, 3, 4, 5, 6];

let r = &mut v[..];

let (a, b) = r.split_at_mut(3);

assert_eq!(a, &mut [1, 2, 3]);
assert_eq!(b, &mut [4, 5, 6]);
```
258
259 Listing 19-4: Using the safe `split_at_mut` function
260
261 We can’t implement this function using only safe Rust. An attempt might look
262 something like Listing 19-5, which won’t compile. For simplicity, we’ll
263 implement `split_at_mut` as a function rather than a method and only for slices
264 of `i32` values rather than for a generic type `T`.
265
```
fn split_at_mut(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    let len = values.len();

    assert!(mid <= len);

    (&mut values[..mid], &mut values[mid..])
}
```
275
276 Listing 19-5: An attempted implementation of `split_at_mut` using only safe Rust
277
278 This function first gets the total length of the slice. Then it asserts that
279 the index given as a parameter is within the slice by checking whether it’s
280 less than or equal to the length. The assertion means that if we pass an index
281 that is greater than the length to split the slice at, the function will panic
282 before it attempts to use that index.
283
284 Then we return two mutable slices in a tuple: one from the start of the
285 original slice to the `mid` index and another from `mid` to the end of the
286 slice.
287
288 When we try to compile the code in Listing 19-5, we’ll get an error:
289
```
error[E0499]: cannot borrow `*values` as mutable more than once at a time
 --> src/main.rs:6:31
  |
1 | fn split_at_mut(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
  |                         - let's call the lifetime of this reference `'1`
...
6 |     (&mut values[..mid], &mut values[mid..])
  |     --------------------------^^^^^^--------
  |     |     |                   |
  |     |     |                   second mutable borrow occurs here
  |     |     first mutable borrow occurs here
  |     returning this value requires that `*values` is borrowed for `'1`
```
304
305 Rust’s borrow checker can’t understand that we’re borrowing different parts of
306 the slice; it only knows that we’re borrowing from the same slice twice.
307 Borrowing different parts of a slice is fundamentally okay because the two
308 slices aren’t overlapping, but Rust isn’t smart enough to know this. When we
309 know code is okay, but Rust doesn’t, it’s time to reach for unsafe code.
310
311 Listing 19-6 shows how to use an `unsafe` block, a raw pointer, and some calls
312 to unsafe functions to make the implementation of `split_at_mut` work.
313
```
use std::slice;

fn split_at_mut(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    let len = values.len(); // [1]
    let ptr = values.as_mut_ptr(); // [2]

    assert!(mid <= len); // [3]

    unsafe { // [4]
        (
            slice::from_raw_parts_mut(ptr, mid), // [5]
            slice::from_raw_parts_mut(ptr.add(mid), len - mid), // [6]
        )
    }
}
```
331
332 Listing 19-6: Using unsafe code in the implementation of the `split_at_mut`
333 function
334
335
336 Recall from “The Slice Type” section in Chapter 4 that a slice is a pointer to
337 some data and the length of the slice. We use the `len` method to get the
338 length of a slice [1] and the `as_mut_ptr` method to access the raw pointer of
339 a slice [2]. In this case, because we have a mutable slice to `i32` values,
340 `as_mut_ptr` returns a raw pointer with the type `*mut i32`, which we’ve stored
341 in the variable `ptr`.
342
343 We keep the assertion that the `mid` index is within the slice [3]. Then we get
344 to the unsafe code [4]: the `slice::from_raw_parts_mut` function takes a raw
345 pointer and a length, and it creates a slice. We use it to create a slice that
346 starts from `ptr` and is `mid` items long [5]. Then we call the `add` method on
347 `ptr` with `mid` as an argument to get a raw pointer that starts at `mid`, and
348 we create a slice using that pointer and the remaining number of items after
349 `mid` as the length [6].
350
351 The function `slice::from_raw_parts_mut` is unsafe because it takes a raw
352 pointer and must trust that this pointer is valid. The `add` method on raw
353 pointers is also unsafe, because it must trust that the offset location is also
354 a valid pointer. Therefore, we had to put an `unsafe` block around our calls to
355 `slice::from_raw_parts_mut` and `add` so we could call them. By looking at
356 the code and by adding the assertion that `mid` must be less than or equal to
357 `len`, we can tell that all the raw pointers used within the `unsafe` block
358 will be valid pointers to data within the slice. This is an acceptable and
359 appropriate use of `unsafe`.
360
361 Note that we don’t need to mark the resulting `split_at_mut` function as
362 `unsafe`, and we can call this function from safe Rust. We’ve created a safe
363 abstraction to the unsafe code with an implementation of the function that uses
364 `unsafe` code in a safe way, because it creates only valid pointers from the
365 data this function has access to.
366
367 In contrast, the use of `slice::from_raw_parts_mut` in Listing 19-7 would
368 likely crash when the slice is used. This code takes an arbitrary memory
369 location and creates a slice 10,000 items long.
370
```
use std::slice;

let address = 0x01234usize;
let r = address as *mut i32;

let values: &[i32] = unsafe { slice::from_raw_parts_mut(r, 10000) };
```
379
380 Listing 19-7: Creating a slice from an arbitrary memory location
381
382 We don’t own the memory at this arbitrary location, and there is no guarantee
383 that the slice this code creates contains valid `i32` values. Attempting to use
384 `values` as though it’s a valid slice results in undefined behavior.
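Returning to the sound version in Listing 19-6: the same reasoning carries over to a generic element type. The sketch below is an illustration under that assumption, not the standard library's actual implementation, and the name `split_at_mut_generic` is made up to avoid confusion with the real method.

```rust
use std::slice;

/// Hypothetical generic variant of the function in Listing 19-6.
fn split_at_mut_generic<T>(values: &mut [T], mid: usize) -> (&mut [T], &mut [T]) {
    let len = values.len();
    let ptr = values.as_mut_ptr();

    // Panics before any unsafe code runs if `mid` is out of range.
    assert!(mid <= len);

    // The two slices cover `[0, mid)` and `[mid, len)`, so they never overlap
    // and both stay inside the original slice.
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

fn main() {
    let mut words = vec![String::from("a"), String::from("b"), String::from("c")];

    // Both halves can be mutated while the other is still borrowed.
    let (left, right) = split_at_mut_generic(&mut words, 1);
    left[0].push('!');
    right[0].push('?');

    println!("{:?}", words);
}
```

Callers never write `unsafe` themselves; the assertion and the non-overlapping ranges are what make the wrapper safe to expose.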
385
386 #### Using `extern` Functions to Call External Code
387
388 Sometimes, your Rust code might need to interact with code written in another
389 language. For this, Rust has the keyword `extern` that facilitates the creation
390 and use of a *Foreign Function Interface (FFI)*. An FFI is a way for a
391 programming language to define functions and enable a different (foreign)
392 programming language to call those functions.
393
394 Listing 19-8 demonstrates how to set up an integration with the `abs` function
395 from the C standard library. Functions declared within `extern` blocks are
396 always unsafe to call from Rust code. The reason is that other languages don’t
397 enforce Rust’s rules and guarantees, and Rust can’t check them, so
398 responsibility falls on the programmer to ensure safety.
399
400 Filename: src/main.rs
401
```
extern "C" {
    fn abs(input: i32) -> i32;
}

fn main() {
    unsafe {
        println!("Absolute value of -3 according to C: {}", abs(-3));
    }
}
```
413
414 Listing 19-8: Declaring and calling an `extern` function defined in another
415 language
416
417 Within the `extern "C"` block, we list the names and signatures of external
418 functions from another language we want to call. The `"C"` part defines which
419 *application binary interface (ABI)* the external function uses: the ABI
420 defines how to call the function at the assembly level. The `"C"` ABI is the
421 most common and follows the C programming language’s ABI.
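Because the programmer carries the responsibility for an `extern` function’s requirements, it is common to hide the call behind a safe wrapper. The sketch below does that for C’s `abs`, whose behavior is undefined in C when the result does not fit in an `int`. The name `checked_c_abs` is hypothetical. The sketch also assumes a toolchain of Rust 1.82 or later, where the block is written `unsafe extern`; on older toolchains, write plain `extern "C"` as in Listing 19-8.

```rust
use std::os::raw::c_int;

// Assumption: Rust 1.82 or later, which accepts `unsafe extern` blocks.
unsafe extern "C" {
    fn abs(input: c_int) -> c_int;
}

/// Hypothetical safe wrapper around C's `abs`.
/// Returns `None` for the one input C leaves undefined.
fn checked_c_abs(input: i32) -> Option<i32> {
    // The absolute value of `i32::MIN` does not fit in an `i32`.
    // This sketch assumes `c_int` is 32 bits wide, as on mainstream targets.
    if input == i32::MIN {
        return None;
    }

    // The check above rules out the input with undefined behavior.
    Some(unsafe { abs(input as c_int) } as i32)
}

fn main() {
    println!("{:?}", checked_c_abs(-3));
    println!("{:?}", checked_c_abs(i32::MIN));
}
```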
422
430
431 > #### Calling Rust Functions from Other Languages
432 >
> We can also use `extern` to create an interface that allows other languages
> to call Rust functions. Instead of creating a whole `extern` block, we add
435 > the `extern` keyword and specify the ABI to use just before the `fn` keyword
436 > for the relevant function. We also need to add a `#[no_mangle]` annotation to
437 > tell the Rust compiler not to mangle the name of this function. *Mangling* is
438 > when a compiler changes the name we’ve given a function to a different name
439 > that contains more information for other parts of the compilation process to
440 > consume but is less human readable. Every programming language compiler
441 > mangles names slightly differently, so for a Rust function to be nameable by
442 > other languages, we must disable the Rust compiler’s name mangling.
443 >
444 > In the following example, we make the `call_from_c` function accessible from
445 > C code, after it’s compiled to a shared library and linked from C:
446 >
> ```
> #[no_mangle]
> pub extern "C" fn call_from_c() {
>     println!("Just called a Rust function from C!");
> }
> ```
453 >
454 > This usage of `extern` does not require `unsafe`.
455
456 ### Accessing or Modifying a Mutable Static Variable
457
In this book, we’ve not yet talked about *global variables*, which Rust does
support but which can be problematic with Rust’s ownership rules. If two
threads access the same mutable global variable, a data race can result.
461
462 In Rust, global variables are called *static* variables. Listing 19-9 shows an
463 example declaration and use of a static variable with a string slice as a
464 value.
465
466 Filename: src/main.rs
467
```
static HELLO_WORLD: &str = "Hello, world!";

fn main() {
    println!("name is: {}", HELLO_WORLD);
}
```
475
476 Listing 19-9: Defining and using an immutable static variable
477
Static variables are similar to constants, which we discussed in the
“Differences Between Variables and Constants” section in Chapter 3. The names
of static variables are in `SCREAMING_SNAKE_CASE` by convention. Any reference
stored in a static variable must have the `'static` lifetime, which means
the Rust compiler can figure out the lifetime and we aren’t required to
annotate it explicitly. Accessing an immutable static variable is safe.
484
485 A subtle difference between constants and immutable static variables is that
486 values in a static variable have a fixed address in memory. Using the value
487 will always access the same data. Constants, on the other hand, are allowed to
488 duplicate their data whenever they’re used. Another difference is that static
489 variables can be mutable. Accessing and modifying mutable static variables is
490 *unsafe*. Listing 19-10 shows how to declare, access, and modify a mutable
491 static variable named `COUNTER`.
492
493 Filename: src/main.rs
494
```
static mut COUNTER: u32 = 0;

fn add_to_count(inc: u32) {
    unsafe {
        COUNTER += inc;
    }
}

fn main() {
    add_to_count(3);

    unsafe {
        println!("COUNTER: {}", COUNTER);
    }
}
```
512
513 Listing 19-10: Reading from or writing to a mutable static variable is unsafe
514
As with regular variables, we specify mutability using the `mut` keyword. Any
code that reads from or writes to `COUNTER` must be within an `unsafe` block.
This code compiles and prints `COUNTER: 3` as we would expect because it’s
single threaded. Having multiple threads access `COUNTER` would likely result
in data races.
520
With mutable data that is globally accessible, it’s difficult to ensure there
are no data races, which is why Rust considers mutable static variables to be
unsafe. Where possible, it’s preferable to use the concurrency techniques and
thread-safe smart pointers we discussed in Chapter 16 so the compiler checks
that data is accessed from different threads safely.
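One such alternative, sketched here as an option rather than the only approach, is to swap the `static mut` counter for an atomic type from `std::sync::atomic`. The static itself is then immutable, so no `unsafe` is needed, and the updates are safe across threads. The helper `current_count` is a name introduced for this example.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

// An immutable static holding an atomic: interior mutability, no `static mut`.
static COUNTER: AtomicU32 = AtomicU32::new(0);

fn add_to_count(inc: u32) {
    // `fetch_add` performs the read-modify-write as one atomic step.
    COUNTER.fetch_add(inc, Ordering::SeqCst);
}

fn current_count() -> u32 {
    COUNTER.load(Ordering::SeqCst)
}

fn main() {
    // Four threads update the counter concurrently without `unsafe`.
    let handles: Vec<_> = (0..4)
        .map(|_| thread::spawn(|| add_to_count(3)))
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    println!("COUNTER: {}", current_count());
}
```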
526
527 ### Implementing an Unsafe Trait
528
529 We can use `unsafe` to implement an unsafe trait. A trait is unsafe when at
530 least one of its methods has some invariant that the compiler can’t verify. We
531 declare that a trait is `unsafe` by adding the `unsafe` keyword before `trait`
532 and marking the implementation of the trait as `unsafe` too, as shown in
533 Listing 19-11.
534
```
unsafe trait Foo {
    // methods go here
}

unsafe impl Foo for i32 {
    // method implementations go here
}

fn main() {}
```
546
547 Listing 19-11: Defining and implementing an unsafe trait
548
549 By using `unsafe impl`, we’re promising that we’ll uphold the invariants that
550 the compiler can’t verify.
551
552 As an example, recall the `Sync` and `Send` marker traits we discussed in the
553 “Extensible Concurrency with the `Sync` and `Send` Traits” section in Chapter
554 16: the compiler implements these traits automatically if our types are
555 composed entirely of `Send` and `Sync` types. If we implement a type that
556 contains a type that is not `Send` or `Sync`, such as raw pointers, and we want
557 to mark that type as `Send` or `Sync`, we must use `unsafe`. Rust can’t verify
558 that our type upholds the guarantees that it can be safely sent across threads
559 or accessed from multiple threads; therefore, we need to do those checks
560 manually and indicate as such with `unsafe`.
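As a sketch of what that manual check can look like, the hypothetical `OwnedPtr` type below wraps a raw pointer, so the compiler does not implement `Send` for it automatically. The `unsafe impl` is justified only by an invariant this example maintains itself: each `OwnedPtr` is the sole owner of its heap allocation.

```rust
use std::thread;

/// Hypothetical wrapper that uniquely owns a heap-allocated `i32`.
struct OwnedPtr(*mut i32);

// Our promise to the compiler: the pointer is never shared or copied, so
// moving the wrapper to another thread moves the only access path to the data.
unsafe impl Send for OwnedPtr {}

impl OwnedPtr {
    fn new(value: i32) -> Self {
        OwnedPtr(Box::into_raw(Box::new(value)))
    }

    fn increment(&mut self) {
        // Valid because the pointer came from `Box::into_raw` in `new`.
        unsafe { *self.0 += 1 }
    }

    fn get(&self) -> i32 {
        unsafe { *self.0 }
    }
}

impl Drop for OwnedPtr {
    fn drop(&mut self) {
        // Rebuild the `Box` so the allocation is freed exactly once.
        unsafe { drop(Box::from_raw(self.0)) }
    }
}

/// Moves the value to a worker thread, mutates it there, and gets it back.
fn increment_on_other_thread(mut owned: OwnedPtr) -> OwnedPtr {
    thread::spawn(move || {
        owned.increment();
        owned
    })
    .join()
    .expect("worker thread panicked")
}

fn main() {
    let owned = increment_on_other_thread(OwnedPtr::new(41));
    println!("{}", owned.get());
}
```

Without the `unsafe impl Send` line, the call to `thread::spawn` is rejected because `*mut i32` is not `Send`.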
561
562 ### Accessing Fields of a Union
563
564 The final action that works only with `unsafe` is accessing fields of a
565 *union*. A `union` is similar to a `struct`, but only one declared field is
566 used in a particular instance at one time. Unions are primarily used to
567 interface with unions in C code. Accessing union fields is unsafe because Rust
568 can’t guarantee the type of the data currently being stored in the union
569 instance. You can learn more about unions in the Rust Reference at
570 *https://doc.rust-lang.org/reference/items/unions.html*.
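A small sketch may help here. The union `IntOrFloat` and the function `float_bits` are names invented for this example. Both fields occupy the same four bytes, and reading the `int` field after writing the `float` field reinterprets those bytes, which is why the read must be in an `unsafe` block.

```rust
/// Hypothetical union whose two fields share the same four bytes.
#[repr(C)]
union IntOrFloat {
    int: u32,
    float: f32,
}

/// Returns the raw bit pattern of an `f32` by reading the other union field.
fn float_bits(value: f32) -> u32 {
    // Writing a field while constructing the union is safe.
    let data = IntOrFloat { float: value };

    // Reading a field is unsafe: the compiler does not track which field was
    // written last. It is fine here because every bit pattern is a valid `u32`.
    unsafe { data.int }
}

fn main() {
    println!("{:#x}", float_bits(1.0));
}
```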
571
572 ### When to Use Unsafe Code
573
574 Using `unsafe` to use one of the five superpowers just discussed isn’t wrong or
575 even frowned upon, but it is trickier to get `unsafe` code correct because the
576 compiler can’t help uphold memory safety. When you have a reason to use
577 `unsafe` code, you can do so, and having the explicit `unsafe` annotation makes
578 it easier to track down the source of problems when they occur.
579
580 ## Advanced Traits
581
582 We first covered traits in the “Traits: Defining Shared Behavior” section of
583 Chapter 10, but we didn’t discuss the more advanced details. Now that you know
584 more about Rust, we can get into the nitty-gritty.
585
586 ### Specifying Placeholder Types in Trait Definitions with Associated Types
587
588 *Associated types* connect a type placeholder with a trait such that the trait
589 method definitions can use these placeholder types in their signatures. The
590 implementor of a trait will specify the concrete type to be used instead of the
591 placeholder type for the particular implementation. That way, we can define a
592 trait that uses some types without needing to know exactly what those types are
593 until the trait is implemented.
594
595 We’ve described most of the advanced features in this chapter as being rarely
596 needed. Associated types are somewhere in the middle: they’re used more rarely
597 than features explained in the rest of the book but more commonly than many of
598 the other features discussed in this chapter.
599
600 One example of a trait with an associated type is the `Iterator` trait that the
601 standard library provides. The associated type is named `Item` and stands in
602 for the type of the values the type implementing the `Iterator` trait is
603 iterating over. The definition of the `Iterator` trait is as shown in Listing
604 19-12.
605
```
pub trait Iterator {
    type Item;

    fn next(&mut self) -> Option<Self::Item>;
}
```
613
614 Listing 19-12: The definition of the `Iterator` trait that has an associated
615 type `Item`
616
617 The type `Item` is a placeholder, and the `next` method’s definition shows that
618 it will return values of type `Option<Self::Item>`. Implementors of the
619 `Iterator` trait will specify the concrete type for `Item`, and the `next`
620 method will return an `Option` containing a value of that concrete type.
621
622 Associated types might seem like a similar concept to generics, in that the
623 latter allow us to define a function without specifying what types it can
624 handle. To examine the difference between the two concepts, we’ll look at an
625 implementation of the `Iterator` trait on a type named `Counter` that specifies
626 the `Item` type is `u32`:
627
628 Filename: src/lib.rs
629
```
impl Iterator for Counter {
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        // --snip--
```
637
638 This syntax seems comparable to that of generics. So why not just define the
639 `Iterator` trait with generics, as shown in Listing 19-13?
640
```
pub trait Iterator<T> {
    fn next(&mut self) -> Option<T>;
}
```
646
647 Listing 19-13: A hypothetical definition of the `Iterator` trait using generics
648
649 The difference is that when using generics, as in Listing 19-13, we must
650 annotate the types in each implementation; because we can also implement
651 `Iterator<String> for Counter` or any other type, we could have multiple
652 implementations of `Iterator` for `Counter`. In other words, when a trait has a
653 generic parameter, it can be implemented for a type multiple times, changing
654 the concrete types of the generic type parameters each time. When we use the
655 `next` method on `Counter`, we would have to provide type annotations to
656 indicate which implementation of `Iterator` we want to use.
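To see that ambiguity in practice, the sketch below defines a hypothetical trait `GenericIterator<T>` (named differently from the real `Iterator` to avoid a clash) and implements it twice for a `Counter`. The method bodies are placeholders chosen only so the two implementations are distinguishable.

```rust
/// Hypothetical generic counterpart of the trait in Listing 19-13.
trait GenericIterator<T> {
    fn next_item(&mut self) -> Option<T>;
}

struct Counter {
    count: u32,
}

// First implementation: yields plain numbers.
impl GenericIterator<u32> for Counter {
    fn next_item(&mut self) -> Option<u32> {
        self.count += 1;
        Some(self.count)
    }
}

// Second implementation for the same type: yields labels.
impl GenericIterator<String> for Counter {
    fn next_item(&mut self) -> Option<String> {
        self.count += 1;
        Some(format!("#{}", self.count))
    }
}

fn main() {
    let mut counter = Counter { count: 0 };

    // A bare `counter.next_item()` is ambiguous, so each call must say which
    // implementation it wants: by annotating the result...
    let number: Option<u32> = counter.next_item();

    // ...or by naming the implementation explicitly.
    let label = <Counter as GenericIterator<String>>::next_item(&mut counter);

    println!("{:?} {:?}", number, label);
}
```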
657
658 With associated types, we don’t need to annotate types because we can’t
659 implement a trait on a type multiple times. In Listing 19-12 with the
660 definition that uses associated types, we can only choose what the type of
661 `Item` will be once, because there can only be one `impl Iterator for Counter`.
662 We don’t have to specify that we want an iterator of `u32` values everywhere
663 that we call `next` on `Counter`.
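For comparison, here is a complete sketch using the associated type. The body of `next`, which counts from 1 to 5, and the `new` constructor are assumptions filled in for illustration, since the snippet above elides them.

```rust
struct Counter {
    count: u32,
}

impl Counter {
    fn new() -> Counter {
        Counter { count: 0 }
    }
}

impl Iterator for Counter {
    // Chosen once, here; there cannot be a second `impl Iterator for Counter`.
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        // Assumed behavior for this sketch: yield 1 through 5, then stop.
        if self.count < 5 {
            self.count += 1;
            Some(self.count)
        } else {
            None
        }
    }
}

fn main() {
    let mut counter = Counter::new();

    // No type annotation is needed to call `next`.
    let first = counter.next();
    let rest: Vec<u32> = counter.collect();

    println!("{:?} {:?}", first, rest);
}
```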
664
665 Associated types also become part of the trait’s contract: implementors of the
666 trait must provide a type to stand in for the associated type placeholder.
667 Associated types often have a name that describes how the type will be used,
668 and documenting the associated type in the API documentation is good practice.
669
676
677 ### Default Generic Type Parameters and Operator Overloading
678
679 When we use generic type parameters, we can specify a default concrete type for
680 the generic type. This eliminates the need for implementors of the trait to
681 specify a concrete type if the default type works. You specify a default type
682 when declaring a generic type with the `<PlaceholderType=ConcreteType>` syntax.
683
684 A great example of a situation where this technique is useful is with *operator
685 overloading*, in which you customize the behavior of an operator (such as `+`)
686 in particular situations.
687
688 Rust doesn’t allow you to create your own operators or overload arbitrary
689 operators. But you can overload the operations and corresponding traits listed
690 in `std::ops` by implementing the traits associated with the operator. For
691 example, in Listing 19-14 we overload the `+` operator to add two `Point`
692 instances together. We do this by implementing the `Add` trait on a `Point`
693 struct:
694
695 Filename: src/main.rs
696
```
use std::ops::Add;

#[derive(Debug, Copy, Clone, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

impl Add for Point {
    type Output = Point;

    fn add(self, other: Point) -> Point {
        Point {
            x: self.x + other.x,
            y: self.y + other.y,
        }
    }
}

fn main() {
    assert_eq!(
        Point { x: 1, y: 0 } + Point { x: 2, y: 3 },
        Point { x: 3, y: 3 }
    );
}
```
724
725 Listing 19-14: Implementing the `Add` trait to overload the `+` operator for
726 `Point` instances
727
728 The `add` method adds the `x` values of two `Point` instances and the `y`
729 values of two `Point` instances to create a new `Point`. The `Add` trait has an
730 associated type named `Output` that determines the type returned from the `add`
731 method.
732
733 The default generic type in this code is within the `Add` trait. Here is its
734 definition:
735
```
trait Add<Rhs=Self> {
    type Output;

    fn add(self, rhs: Rhs) -> Self::Output;
}
```
743
744 This code should look generally familiar: a trait with one method and an
745 associated type. The new part is `Rhs=Self`: this syntax is called *default
746 type parameters*. The `Rhs` generic type parameter (short for “right hand
747 side”) defines the type of the `rhs` parameter in the `add` method. If we don’t
748 specify a concrete type for `Rhs` when we implement the `Add` trait, the type
749 of `Rhs` will default to `Self`, which will be the type we’re implementing
750 `Add` on.
751
752 When we implemented `Add` for `Point`, we used the default for `Rhs` because we
753 wanted to add two `Point` instances. Let’s look at an example of implementing
754 the `Add` trait where we want to customize the `Rhs` type rather than using the
755 default.
756
757 We have two structs, `Millimeters` and `Meters`, holding values in different
758 units. This thin wrapping of an existing type in another struct is known as the
759 *newtype pattern*, which we describe in more detail in the “Using the Newtype
760 Pattern to Implement External Traits on External Types” section. We want to add
761 values in millimeters to values in meters and have the implementation of `Add`
762 do the conversion correctly. We can implement `Add` for `Millimeters` with
763 `Meters` as the `Rhs`, as shown in Listing 19-15.
764
765 Filename: src/lib.rs
766
```
use std::ops::Add;

struct Millimeters(u32);
struct Meters(u32);

impl Add<Meters> for Millimeters {
    type Output = Millimeters;

    fn add(self, other: Meters) -> Millimeters {
        Millimeters(self.0 + (other.0 * 1000))
    }
}
```
781
782 Listing 19-15: Implementing the `Add` trait on `Millimeters` to add
783 `Millimeters` to `Meters`
784
785 To add `Millimeters` and `Meters`, we specify `impl Add<Meters>` to set the
786 value of the `Rhs` type parameter instead of using the default of `Self`.
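
As a minimal sketch of how this implementation might be used, the following program adds a `Meters` value to a `Millimeters` value with `+`. The `Debug` and `PartialEq` derives and the sample values are additions for illustration; they are not part of Listing 19-15.

```rust
use std::ops::Add;

// `Debug` and `PartialEq` are derived only so the result can be compared
// and printed in this sketch.
#[derive(Debug, PartialEq)]
struct Millimeters(u32);
struct Meters(u32);

impl Add<Meters> for Millimeters {
    type Output = Millimeters;

    fn add(self, other: Meters) -> Millimeters {
        Millimeters(self.0 + (other.0 * 1000))
    }
}

fn main() {
    // The left operand is `Self` (`Millimeters`); the right is `Rhs` (`Meters`).
    let total = Millimeters(500) + Meters(2);
    assert_eq!(total, Millimeters(2500));
    println!("{:?}", total);

    // The reverse order, `Meters(2) + Millimeters(500)`, would not compile,
    // because there is no `impl Add<Millimeters> for Meters`.
}
```

Note that the implementation is one-directional: supporting `Meters + Millimeters` would need a separate `impl`.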
787
788 You’ll use default type parameters in two main ways:
789
790 * To extend a type without breaking existing code
791 * To allow customization in specific cases most users won’t need
792
793 The standard library’s `Add` trait is an example of the second purpose:
794 usually, you’ll add two like types, but the `Add` trait provides the ability to
795 customize beyond that. Using a default type parameter in the `Add` trait
796 definition means you don’t have to specify the extra parameter most of the
797 time. In other words, a bit of implementation boilerplate isn’t needed, making
798 it easier to use the trait.
799
800 The first purpose is similar to the second but in reverse: if you want to add a
801 type parameter to an existing trait, you can give it a default to allow
802 extension of the functionality of the trait without breaking the existing
803 implementation code.
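
The first purpose can be sketched as follows. The `Scale` trait and the `Length` type are hypothetical names invented for this example. Assume `Scale` originally had no type parameter and its method always took a `u32`; giving the new `Factor` parameter a default of `u32` lets the older `impl` keep compiling unchanged.

```rust
// Hypothetical trait. Imagine an earlier version looked like this:
//     trait Scale { fn scale(self, factor: u32) -> Self; }
// The `Factor` parameter was added later, defaulting to the old type.
trait Scale<Factor = u32> {
    fn scale(self, factor: Factor) -> Self;
}

#[derive(Debug, PartialEq)]
struct Length(f64);

// Written against the old trait. `Scale` here now means `Scale<u32>`,
// so this impl did not need to change.
impl Scale for Length {
    fn scale(self, factor: u32) -> Self {
        Length(self.0 * f64::from(factor))
    }
}

// Newer code can opt in to a different factor type.
impl Scale<f64> for Length {
    fn scale(self, factor: f64) -> Self {
        Length(self.0 * factor)
    }
}

fn main() {
    assert_eq!(Length(2.0).scale(3_u32), Length(6.0));
    assert_eq!(Length(2.0).scale(1.5_f64), Length(3.0));
}
```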
804
805 ### Fully Qualified Syntax for Disambiguation: Calling Methods with the Same Name
806
807 Nothing in Rust prevents a trait from having a method with the same name as
808 another trait’s method, nor does Rust prevent you from implementing both traits
809 on one type. It’s also possible to implement a method directly on the type with
810 the same name as methods from traits.
811
812 When calling methods with the same name, you’ll need to tell Rust which one you
813 want to use. Consider the code in Listing 19-16 where we’ve defined two traits,
814 `Pilot` and `Wizard`, that both have a method called `fly`. We then implement
815 both traits on a type `Human` that already has a method named `fly` implemented
816 on it. Each `fly` method does something different.
817
818 Filename: src/main.rs
819
```
trait Pilot {
    fn fly(&self);
}

trait Wizard {
    fn fly(&self);
}

struct Human;

impl Pilot for Human {
    fn fly(&self) {
        println!("This is your captain speaking.");
    }
}

impl Wizard for Human {
    fn fly(&self) {
        println!("Up!");
    }
}

impl Human {
    fn fly(&self) {
        println!("*waving arms furiously*");
    }
}
```
849
850 Listing 19-16: Two traits are defined to have a `fly` method and are
851 implemented on the `Human` type, and a `fly` method is implemented on `Human`
852 directly
853
854 When we call `fly` on an instance of `Human`, the compiler defaults to calling
855 the method that is directly implemented on the type, as shown in Listing 19-17.
856
857 Filename: src/main.rs
858
```
fn main() {
    let person = Human;
    person.fly();
}
```
865
866 Listing 19-17: Calling `fly` on an instance of `Human`
867
868 Running this code will print `*waving arms furiously*`, showing that Rust
869 called the `fly` method implemented on `Human` directly.
870
871 To call the `fly` methods from either the `Pilot` trait or the `Wizard` trait,
872 we need to use more explicit syntax to specify which `fly` method we mean.
873 Listing 19-18 demonstrates this syntax.
874
875 Filename: src/main.rs
876
```
fn main() {
    let person = Human;
    Pilot::fly(&person);
    Wizard::fly(&person);
    person.fly();
}
```
885
886 Listing 19-18: Specifying which trait’s `fly` method we want to call
887
888 Specifying the trait name before the method name clarifies to Rust which
889 implementation of `fly` we want to call. We could also write
890 `Human::fly(&person)`, which is equivalent to the `person.fly()` that we used
891 in Listing 19-18, but this is a bit longer to write if we don’t need to
892 disambiguate.
893
894 Running this code prints the following:
895
```
$ cargo run
   Compiling traits-example v0.1.0 (file:///projects/traits-example)
    Finished dev [unoptimized + debuginfo] target(s) in 0.46s
     Running `target/debug/traits-example`
This is your captain speaking.
Up!
*waving arms furiously*
```
905
906 Because the `fly` method takes a `self` parameter, if we had two *types* that
907 both implement one *trait*, Rust could figure out which implementation of a
908 trait to use based on the type of `self`.
909
910 However, associated functions that are not methods don’t have a `self`
911 parameter. When there are multiple types or traits that define non-method
912 functions with the same function name, Rust doesn't always know which type you
913 mean unless you use *fully qualified syntax*. For example, in Listing 19-19 we
914 create a trait for an animal shelter that wants to name all baby dogs *Spot*.
915 We make an `Animal` trait with an associated non-method function `baby_name`.
916 The `Animal` trait is implemented for the struct `Dog`, on which we also
917 provide an associated non-method function `baby_name` directly.
918
919 Filename: src/main.rs
920
```
trait Animal {
    fn baby_name() -> String;
}

struct Dog;

impl Dog {
    fn baby_name() -> String {
        String::from("Spot")
    }
}

impl Animal for Dog {
    fn baby_name() -> String {
        String::from("puppy")
    }
}

fn main() {
    println!("A baby dog is called a {}", Dog::baby_name());
}
```
944
945 Listing 19-19: A trait with an associated function and a type with an
946 associated function of the same name that also implements the trait
947
948 We implement the code for naming all puppies Spot in the `baby_name` associated
949 function that is defined on `Dog`. The `Dog` type also implements the trait
950 `Animal`, which describes characteristics that all animals have. Baby dogs are
951 called puppies, and that is expressed in the implementation of the `Animal`
952 trait on `Dog` in the `baby_name` function associated with the `Animal` trait.
953
954 In `main`, we call the `Dog::baby_name` function, which calls the associated
955 function defined on `Dog` directly. This code prints the following:
956
```
A baby dog is called a Spot
```
960
961 This output isn’t what we wanted. We want to call the `baby_name` function that
962 is part of the `Animal` trait that we implemented on `Dog` so the code prints
963 `A baby dog is called a puppy`. The technique of specifying the trait name that
964 we used in Listing 19-18 doesn’t help here; if we change `main` to the code in
965 Listing 19-20, we’ll get a compilation error.
966
967 Filename: src/main.rs
968
```
fn main() {
    println!("A baby dog is called a {}", Animal::baby_name());
}
```
974
975 Listing 19-20: Attempting to call the `baby_name` function from the `Animal`
976 trait, but Rust doesn’t know which implementation to use
977
978 Because `Animal::baby_name` doesn’t have a `self` parameter, and there could be
979 other types that implement the `Animal` trait, Rust can’t figure out which
980 implementation of `Animal::baby_name` we want. We’ll get this compiler error:
981
```
error[E0283]: type annotations needed
  --> src/main.rs:20:43
   |
20 |     println!("A baby dog is called a {}", Animal::baby_name());
   |                                           ^^^^^^^^^^^^^^^^^ cannot infer type
   |
   = note: cannot satisfy `_: Animal`
```
991
992 To disambiguate and tell Rust that we want to use the implementation of
993 `Animal` for `Dog` as opposed to the implementation of `Animal` for some other
994 type, we need to use fully qualified syntax. Listing 19-21 demonstrates how to
995 use fully qualified syntax.
996
997 Filename: src/main.rs
998
```
fn main() {
    println!("A baby dog is called a {}", <Dog as Animal>::baby_name());
}
```
1004
1005 Listing 19-21: Using fully qualified syntax to specify that we want to call the
1006 `baby_name` function from the `Animal` trait as implemented on `Dog`
1007
1008 We’re providing Rust with a type annotation within the angle brackets, which
indicates we want to call the `baby_name` function from the `Animal` trait as
1010 implemented on `Dog` by saying that we want to treat the `Dog` type as an
1011 `Animal` for this function call. This code will now print what we want:
1012
```
A baby dog is called a puppy
```
1016
1017 In general, fully qualified syntax is defined as follows:
1018
```
<Type as Trait>::function(receiver_if_method, next_arg, ...);
```
1022
1023 For associated functions that aren’t methods, there would not be a `receiver`:
1024 there would only be the list of other arguments. You could use fully qualified
1025 syntax everywhere that you call functions or methods. However, you’re allowed
1026 to omit any part of this syntax that Rust can figure out from other information
1027 in the program. You only need to use this more verbose syntax in cases where
1028 there are multiple implementations that use the same name and Rust needs help
1029 to identify which implementation you want to call.
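
To make the relationship between the forms concrete, here is a small sketch based on the `Human` example from Listing 19-16. As an assumption made only for this example, each `fly` method returns a `String` instead of printing, so that the different call forms can be compared directly.

```rust
trait Pilot {
    fn fly(&self) -> String;
}

trait Wizard {
    fn fly(&self) -> String;
}

struct Human;

impl Pilot for Human {
    fn fly(&self) -> String {
        String::from("This is your captain speaking.")
    }
}

impl Wizard for Human {
    fn fly(&self) -> String {
        String::from("Up!")
    }
}

impl Human {
    fn fly(&self) -> String {
        String::from("*waving arms furiously*")
    }
}

fn main() {
    let person = Human;

    // Fully qualified form: type, trait, and function are all spelled out.
    assert_eq!(<Human as Pilot>::fly(&person), "This is your captain speaking.");
    assert_eq!(<Human as Wizard>::fly(&person), "Up!");

    // Shorter forms work when Rust can infer the omitted parts:
    // the type is inferred from the `&person` receiver.
    assert_eq!(Pilot::fly(&person), <Human as Pilot>::fly(&person));
    // Method syntax resolves to the method defined directly on `Human`.
    assert_eq!(person.fly(), Human::fly(&person));
}
```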
1030
1031 ### Using Supertraits to Require One Trait’s Functionality Within Another Trait
1032
1033 Sometimes, you might write a trait definition that depends on another trait:
1034 for a type to implement the first trait, you want to require that type to also
1035 implement the second trait. You would do this so that your trait definition can
1036 make use of the associated items of the second trait. The trait your trait
1037 definition is relying on is called a *supertrait* of your trait.
1038
1039 For example, let’s say we want to make an `OutlinePrint` trait with an
`outline_print` method that will print a given value formatted so that it’s
framed in asterisks. That is, given a `Point` struct that implements the
standard library trait `Display` to produce output of the form `(x, y)`, when we
1043 call `outline_print` on a `Point` instance that has `1` for `x` and `3` for
1044 `y`, it should print the following:
1045
```
**********
*        *
* (1, 3) *
*        *
**********
```
1053
1054 In the implementation of the `outline_print` method, we want to use the
1055 `Display` trait’s functionality. Therefore, we need to specify that the
1056 `OutlinePrint` trait will work only for types that also implement `Display` and
1057 provide the functionality that `OutlinePrint` needs. We can do that in the
1058 trait definition by specifying `OutlinePrint: Display`. This technique is
1059 similar to adding a trait bound to the trait. Listing 19-22 shows an
1060 implementation of the `OutlinePrint` trait.
1061
1062 Filename: src/main.rs
1063
```
use std::fmt;

trait OutlinePrint: fmt::Display {
    fn outline_print(&self) {
        let output = self.to_string();
        let len = output.len();
        println!("{}", "*".repeat(len + 4));
        println!("*{}*", " ".repeat(len + 2));
        println!("* {} *", output);
        println!("*{}*", " ".repeat(len + 2));
        println!("{}", "*".repeat(len + 4));
    }
}
```
1079
1080 Listing 19-22: Implementing the `OutlinePrint` trait that requires the
1081 functionality from `Display`
1082
1083 Because we’ve specified that `OutlinePrint` requires the `Display` trait, we
1084 can use the `to_string` function that is automatically implemented for any type
1085 that implements `Display`. If we tried to use `to_string` without adding a
1086 colon and specifying the `Display` trait after the trait name, we’d get an
1087 error saying that no method named `to_string` was found for the type `&Self` in
1088 the current scope.
1089
1090 Let’s see what happens when we try to implement `OutlinePrint` on a type that
1091 doesn’t implement `Display`, such as the `Point` struct:
1092
1093 Filename: src/main.rs
1094
```
struct Point {
    x: i32,
    y: i32,
}

impl OutlinePrint for Point {}
```
1103
1104 We get an error saying that `Display` is required but not implemented:
1105
```
error[E0277]: `Point` doesn't implement `std::fmt::Display`
  --> src/main.rs:20:6
   |
20 | impl OutlinePrint for Point {}
   |      ^^^^^^^^^^^^ `Point` cannot be formatted with the default formatter
   |
   = help: the trait `std::fmt::Display` is not implemented for `Point`
   = note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead
note: required by a bound in `OutlinePrint`
  --> src/main.rs:3:21
   |
3  | trait OutlinePrint: fmt::Display {
   |                     ^^^^^^^^^^^^ required by this bound in `OutlinePrint`
```
1121
1122 To fix this, we implement `Display` on `Point` and satisfy the constraint that
1123 `OutlinePrint` requires, like so:
1124
1125 Filename: src/main.rs
1126
```
use std::fmt;

impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}
```
1136
1137 Then implementing the `OutlinePrint` trait on `Point` will compile
1138 successfully, and we can call `outline_print` on a `Point` instance to display
1139 it within an outline of asterisks.
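
Putting the pieces together, a complete program might look like the sketch below. It differs from Listing 19-22 in one respect, which is an assumption made for this example: a hypothetical `outline` helper builds the framed text as a `String`, and `outline_print` prints it. This makes the result easy to inspect without capturing standard output.

```rust
use std::fmt;

trait OutlinePrint: fmt::Display {
    // Hypothetical helper, not part of Listing 19-22: returns the framed text.
    // `to_string` is available because `Display` is a supertrait.
    fn outline(&self) -> String {
        let output = self.to_string();
        let len = output.len();
        [
            "*".repeat(len + 4),
            format!("*{}*", " ".repeat(len + 2)),
            format!("* {} *", output),
            format!("*{}*", " ".repeat(len + 2)),
            "*".repeat(len + 4),
        ]
        .join("\n")
    }

    fn outline_print(&self) {
        println!("{}", self.outline());
    }
}

struct Point {
    x: i32,
    y: i32,
}

// Satisfies the supertrait requirement of `OutlinePrint`.
impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}

// The default method bodies are enough, so the impl block is empty.
impl OutlinePrint for Point {}

fn main() {
    let point = Point { x: 1, y: 3 };
    point.outline_print();
}
```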
1140
1141 ### Using the Newtype Pattern to Implement External Traits on External Types
1142
In Chapter 10 in the “Implementing a Trait on a Type” section, we mentioned the
orphan rule that states we’re only allowed to implement a trait on a type if
either the trait or the type is local to our crate. It’s possible to get
1147 around this restriction using the *newtype pattern*, which involves creating a
1148 new type in a tuple struct. (We covered tuple structs in the “Using Tuple
1149 Structs without Named Fields to Create Different Types” section of Chapter 5.)
1150 The tuple struct will have one field and be a thin wrapper around the type we
1151 want to implement a trait for. Then the wrapper type is local to our crate, and
1152 we can implement the trait on the wrapper. *Newtype* is a term that originates
1153 from the Haskell programming language. There is no runtime performance penalty
1154 for using this pattern, and the wrapper type is elided at compile time.
1155
1156 As an example, let’s say we want to implement `Display` on `Vec<T>`, which the
1157 orphan rule prevents us from doing directly because the `Display` trait and the
1158 `Vec<T>` type are defined outside our crate. We can make a `Wrapper` struct
1159 that holds an instance of `Vec<T>`; then we can implement `Display` on
1160 `Wrapper` and use the `Vec<T>` value, as shown in Listing 19-23.
1161
1162 Filename: src/main.rs
1163
```
use std::fmt;

struct Wrapper(Vec<String>);

impl fmt::Display for Wrapper {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "[{}]", self.0.join(", "))
    }
}

fn main() {
    let w = Wrapper(vec![String::from("hello"), String::from("world")]);
    println!("w = {}", w);
}
```
1180
1181 Listing 19-23: Creating a `Wrapper` type around `Vec<String>` to implement
1182 `Display`
1183
1184 The implementation of `Display` uses `self.0` to access the inner `Vec<T>`,
1185 because `Wrapper` is a tuple struct and `Vec<T>` is the item at index 0 in the
tuple. Then we can use the functionality of the `Display` trait on `Wrapper`.
1187
1188 The downside of using this technique is that `Wrapper` is a new type, so it
1189 doesn’t have the methods of the value it’s holding. We would have to implement
1190 all the methods of `Vec<T>` directly on `Wrapper` such that the methods
1191 delegate to `self.0`, which would allow us to treat `Wrapper` exactly like a
1192 `Vec<T>`. If we wanted the new type to have every method the inner type has,
1193 implementing the `Deref` trait (discussed in Chapter 15 in the “Treating Smart
1194 Pointers Like Regular References with the `Deref` Trait” section) on the
1195 `Wrapper` to return the inner type would be a solution. If we don’t want the
1196 `Wrapper` type to have all the methods of the inner type—for example, to
1197 restrict the `Wrapper` type’s behavior—we would have to implement just the
1198 methods we do want manually.
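
As a rough sketch of the `Deref` approach, the following extends the `Wrapper` from Listing 19-23. The `Deref` implementation is an addition for illustration; with it in place, methods of `Vec<String>` such as `len` and `iter` can be called on a `Wrapper` through deref coercion.

```rust
use std::fmt;
use std::ops::Deref;

struct Wrapper(Vec<String>);

impl fmt::Display for Wrapper {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "[{}]", self.0.join(", "))
    }
}

// Added for this sketch: expose the inner `Vec<String>` by reference.
impl Deref for Wrapper {
    type Target = Vec<String>;

    fn deref(&self) -> &Vec<String> {
        &self.0
    }
}

fn main() {
    let w = Wrapper(vec![String::from("hello"), String::from("world")]);

    // `len` and `iter` are not defined on `Wrapper`; they are found on the
    // inner `Vec<String>` through `Deref`.
    assert_eq!(w.len(), 2);
    assert!(w.iter().any(|s| s == "world"));

    // `Display` is still the implementation written for `Wrapper`.
    assert_eq!(w.to_string(), "[hello, world]");
}
```

Because `deref` takes `&self`, this only exposes methods that borrow the vector immutably; mutating methods such as `push` would additionally need `DerefMut`.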
1199
1200 This newtype pattern is also useful even when traits are not involved. Let’s
1201 switch focus and look at some advanced ways to interact with Rust’s type system.
1202
1203 ## Advanced Types
1204
1205 The Rust type system has some features that we’ve so far mentioned but haven’t
1206 yet discussed. We’ll start by discussing newtypes in general as we examine why
1207 newtypes are useful as types. Then we’ll move on to type aliases, a feature
1208 similar to newtypes but with slightly different semantics. We’ll also discuss
1209 the `!` type and dynamically sized types.
1210
1211 ### Using the Newtype Pattern for Type Safety and Abstraction
1212
1213 > Note: This section assumes you’ve read the earlier section “Using the
1214 > Newtype Pattern to Implement External Traits on External
1215 > Types.”
1216
1217 The newtype pattern is also useful for tasks beyond those we’ve discussed so
1218 far, including statically enforcing that values are never confused and
1219 indicating the units of a value. You saw an example of using newtypes to
1220 indicate units in Listing 19-15: recall that the `Millimeters` and `Meters`
1221 structs wrapped `u32` values in a newtype. If we wrote a function with a
1222 parameter of type `Millimeters`, we couldn’t compile a program that
1223 accidentally tried to call that function with a value of type `Meters` or a
1224 plain `u32`.
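
The following sketch illustrates that check. The `fits_on_shelf` function, the 800 mm limit, and the `From<Meters>` conversion are hypothetical additions for this example.

```rust
struct Millimeters(u32);
struct Meters(u32);

// Hypothetical explicit conversion between the two units.
impl From<Meters> for Millimeters {
    fn from(m: Meters) -> Millimeters {
        Millimeters(m.0 * 1000)
    }
}

// Hypothetical function that only accepts a length in millimeters.
fn fits_on_shelf(length: Millimeters) -> bool {
    length.0 <= 800
}

fn main() {
    assert!(fits_on_shelf(Millimeters(750)));

    // Both of these would be rejected at compile time with a
    // mismatched-types error:
    // fits_on_shelf(Meters(1));
    // fits_on_shelf(750);

    // A `Meters` value has to be converted explicitly first.
    assert!(!fits_on_shelf(Meters(1).into()));
}
```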
1225
1226 We can also use the newtype pattern to abstract away some implementation
1227 details of a type: the new type can expose a public API that is different from
1228 the API of the private inner type.
1229
1230 Newtypes can also hide internal implementation. For example, we could provide a
1231 `People` type to wrap a `HashMap<i32, String>` that stores a person’s ID
1232 associated with their name. Code using `People` would only interact with the
1233 public API we provide, such as a method to add a name string to the `People`
1234 collection; that code wouldn’t need to know that we assign an `i32` ID to names
1235 internally. The newtype pattern is a lightweight way to achieve encapsulation
1236 to hide implementation details, which we discussed in the “Encapsulation that
1237 Hides Implementation Details” section of Chapter 17.
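
One possible shape for such a `People` type is sketched below. The method names (`new`, `add`, `contains`, `len`) and the ID scheme are assumptions for illustration; the text above only says that callers add names and never see the `i32` IDs.

```rust
use std::collections::HashMap;

// The inner map is private, so callers cannot depend on it.
pub struct People(HashMap<i32, String>);

impl People {
    pub fn new() -> People {
        People(HashMap::new())
    }

    // Callers supply only a name; the ID is an internal detail.
    pub fn add(&mut self, name: &str) {
        // Assumed ID scheme for this sketch: the next unused index.
        let id = self.0.len() as i32;
        self.0.insert(id, name.to_string());
    }

    pub fn contains(&self, name: &str) -> bool {
        self.0.values().any(|stored| stored == name)
    }

    pub fn len(&self) -> usize {
        self.0.len()
    }
}

fn main() {
    let mut people = People::new();
    people.add("Ferris");
    people.add("Corro");

    assert_eq!(people.len(), 2);
    assert!(people.contains("Ferris"));
    assert!(!people.contains("Nobody"));
}
```

Because the `HashMap` never appears in the public API, the storage could later be swapped for something else without breaking code that uses `People`.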
1238
1239 ### Creating Type Synonyms with Type Aliases
1240
1241 Rust provides the ability to declare a *type alias* to give an existing type
1242 another name. For this we use the `type` keyword. For example, we can create
1243 the alias `Kilometers` to `i32` like so:
1244
```
type Kilometers = i32;
```
1248
1249 Now, the alias `Kilometers` is a *synonym* for `i32`; unlike the `Millimeters`
1250 and `Meters` types we created in Listing 19-15, `Kilometers` is not a separate,
1251 new type. Values that have the type `Kilometers` will be treated the same as
1252 values of type `i32`:
1253
```
type Kilometers = i32;

let x: i32 = 5;
let y: Kilometers = 5;

println!("x + y = {}", x + y);
```
1262
1263 Because `Kilometers` and `i32` are the same type, we can add values of both
1264 types and we can pass `Kilometers` values to functions that take `i32`
1265 parameters. However, using this method, we don’t get the type checking benefits
1266 that we get from the newtype pattern discussed earlier. In other words, if we
1267 mix up `Kilometers` and `i32` values somewhere, the compiler will not give us
1268 an error.
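
The sketch below shows how easily this can go unnoticed when two aliases share the same underlying type. The `Miles` alias and the `remaining` function are hypothetical names added for this example.

```rust
type Kilometers = i32;
// Hypothetical second alias to the same underlying type.
type Miles = i32;

// Hypothetical function that is meant to work in kilometers only.
fn remaining(total: Kilometers, driven: Kilometers) -> Kilometers {
    total - driven
}

fn main() {
    let total: Kilometers = 100;
    let driven: Miles = 30;

    // This compiles without complaint even though the units are mixed:
    // to the compiler, both arguments are simply `i32`.
    assert_eq!(remaining(total, driven), 70);
}
```

With newtypes such as the `Millimeters` and `Meters` structs from Listing 19-15, the equivalent mistake would be a compile-time error.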
1269
1278
1279 The main use case for type synonyms is to reduce repetition. For example, we
1280 might have a lengthy type like this:
1281
```
Box<dyn Fn() + Send + 'static>
```
1285
1286 Writing this lengthy type in function signatures and as type annotations all
1287 over the code can be tiresome and error prone. Imagine having a project full of
1288 code like that in Listing 19-24.
1289
```
let f: Box<dyn Fn() + Send + 'static> = Box::new(|| println!("hi"));

fn takes_long_type(f: Box<dyn Fn() + Send + 'static>) {
    // --snip--
}

fn returns_long_type() -> Box<dyn Fn() + Send + 'static> {
    // --snip--
}
```
1301
1302 Listing 19-24: Using a long type in many places
1303
1304 A type alias makes this code more manageable by reducing the repetition. In
1305 Listing 19-25, we’ve introduced an alias named `Thunk` for the verbose type and
1306 can replace all uses of the type with the shorter alias `Thunk`.
1307
```
type Thunk = Box<dyn Fn() + Send + 'static>;

let f: Thunk = Box::new(|| println!("hi"));

fn takes_long_type(f: Thunk) {
    // --snip--
}

fn returns_long_type() -> Thunk {
    // --snip--
}
```
1321
1322 Listing 19-25: Introducing a type alias `Thunk` to reduce repetition
1323
1324 This code is much easier to read and write! Choosing a meaningful name for a
1325 type alias can help communicate your intent as well (*thunk* is a word for code
1326 to be evaluated at a later time, so it’s an appropriate name for a closure that
1327 gets stored).
1328
Type aliases are also commonly used with the `Result<T, E>` type for reducing
repetition. Consider the `std::io` module in the standard library. I/O
operations often return a `Result<T, E>` to handle situations in which they
fail. This library has a `std::io::Error` struct that represents all
possible I/O errors. Many of the functions in `std::io` return
`Result<T, E>` where the `E` is `std::io::Error`, such as these functions in
the `Write` trait:
1336
```
use std::fmt;
use std::io::Error;

pub trait Write {
    fn write(&mut self, buf: &[u8]) -> Result<usize, Error>;
    fn flush(&mut self) -> Result<(), Error>;

    fn write_all(&mut self, buf: &[u8]) -> Result<(), Error>;
    fn write_fmt(&mut self, fmt: fmt::Arguments) -> Result<(), Error>;
}
```
1349
1350 The `Result<..., Error>` is repeated a lot. As such, `std::io` has this type
1351 alias declaration:
1352
```
type Result<T> = std::result::Result<T, std::io::Error>;
```
1356
1357 Because this declaration is in the `std::io` module, we can use the fully
1358 qualified alias `std::io::Result<T>`; that is, a `Result<T, E>` with the `E`
1359 filled in as `std::io::Error`. The `Write` trait function signatures end up
1360 looking like this:
1361
```
pub trait Write {
    fn write(&mut self, buf: &[u8]) -> Result<usize>;
    fn flush(&mut self) -> Result<()>;

    fn write_all(&mut self, buf: &[u8]) -> Result<()>;
    fn write_fmt(&mut self, fmt: fmt::Arguments) -> Result<()>;
}
```
1371
1372 The type alias helps in two ways: it makes code easier to write *and* it gives
1373 us a consistent interface across all of `std::io`. Because it’s an alias, it’s
1374 just another `Result<T, E>`, which means we can use any methods that work on
1375 `Result<T, E>` with it, as well as special syntax like the `?` operator.
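
As a small illustration, the hypothetical `write_greeting` function below returns `io::Result<usize>` and uses the `?` operator on the results of `write_all` and `flush`. The function name and the greeting text are assumptions for this example; `Vec<u8>` is used as the writer because the standard library implements `Write` for it.

```rust
use std::io::{self, Write};

// Hypothetical helper: writes a greeting and reports how many bytes it wrote.
// `io::Result<usize>` is the alias for `Result<usize, io::Error>`.
fn write_greeting<W: Write>(out: &mut W, name: &str) -> io::Result<usize> {
    let greeting = format!("Hello, {}!\n", name);
    // `?` works as usual because the alias is an ordinary `Result`.
    out.write_all(greeting.as_bytes())?;
    out.flush()?;
    Ok(greeting.len())
}

fn main() -> io::Result<()> {
    let mut buffer: Vec<u8> = Vec::new();

    // The alias and the spelled-out type are interchangeable.
    let written: Result<usize, io::Error> = write_greeting(&mut buffer, "Rust");
    assert_eq!(written?, 13);
    assert_eq!(buffer, b"Hello, Rust!\n");
    Ok(())
}
```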
1376
1377 ### The Never Type that Never Returns
1378
1379 Rust has a special type named `!` that’s known in type theory lingo as the
1380 *empty type* because it has no values. We prefer to call it the *never type*
1381 because it stands in the place of the return type when a function will never
1382 return. Here is an example:
1383
```
fn bar() -> ! {
    // --snip--
}
```
1389
1390 This code is read as “the function `bar` returns never.” Functions that return
1391 never are called *diverging functions*. We can’t create values of the type `!`
1392 so `bar` can never possibly return.
1393
1394 But what use is a type you can never create values for? Recall the code from
1395 Listing 2-5, part of the number guessing game; we’ve reproduced a bit of it
1396 here in Listing 19-26.
1397
```
let guess: u32 = match guess.trim().parse() {
    Ok(num) => num,
    Err(_) => continue,
};
```
1404
1405 Listing 19-26: A `match` with an arm that ends in `continue`
1406
At the time, we skipped over some details in this code. In the “The `match`
Control Flow Operator” section of Chapter 6, we discussed that `match` arms must
all return the same type. So, for example, the following code doesn’t work:
1410
```
let guess = match guess.trim().parse() {
    Ok(_) => 5,
    Err(_) => "hello",
};
```
1417
1418 The type of `guess` in this code would have to be an integer *and* a string,
1419 and Rust requires that `guess` have only one type. So what does `continue`
1420 return? How were we allowed to return a `u32` from one arm and have another arm
1421 that ends with `continue` in Listing 19-26?
1422
As you might have guessed, `continue` has the type `!`. That is, when Rust
computes the type of `guess`, it looks at both match arms, the former with a
value of type `u32` and the latter of type `!`. Because `!` can never have a
value, Rust decides that the type of `guess` is `u32`.
1427
1428 The formal way of describing this behavior is that expressions of type `!` can
1429 be coerced into any other type. We’re allowed to end this `match` arm with
1430 `continue` because `continue` doesn’t return a value; instead, it moves control
1431 back to the top of the loop, so in the `Err` case, we never assign a value to
1432 `guess`.
1433
The never type is useful with the `panic!` macro as well. Recall the `unwrap`
function that we call on `Option<T>` values to produce a value or panic. It has
this definition:
1437
```
impl<T> Option<T> {
    pub fn unwrap(self) -> T {
        match self {
            Some(val) => val,
            None => panic!("called `Option::unwrap()` on a `None` value"),
        }
    }
}
```
1448
1449 In this code, the same thing happens as in the `match` in Listing 19-26: Rust
1450 sees that `val` has the type `T` and `panic!` has the type `!`, so the result
1451 of the overall `match` expression is `T`. This code works because `panic!`
1452 doesn’t produce a value; it ends the program. In the `None` case, we won’t be
1453 returning a value from `unwrap`, so this code is valid.
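
The same reasoning applies to functions we write ourselves. In the sketch below, `give_up` and `parse_port` are hypothetical names: `give_up` is declared to return `!`, so a call to it can stand in a `match` arm next to an arm that produces a `u16`.

```rust
// Hypothetical diverging function: it always panics, so it never returns.
fn give_up(reason: &str) -> ! {
    panic!("giving up: {}", reason);
}

// Hypothetical parser: one arm yields a `u16`, the other has type `!`,
// so the whole `match` has type `u16`.
fn parse_port(text: &str) -> u16 {
    match text.trim().parse() {
        Ok(port) => port,
        Err(_) => give_up("not a valid port number"),
    }
}

fn main() {
    assert_eq!(parse_port("8080"), 8080);
    assert_eq!(parse_port(" 443 "), 443);
    // `parse_port("http")` would panic with "giving up: not a valid port number".
}
```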
1454
1455 One final expression that has the type `!` is a `loop`:
1456
```
print!("forever ");

loop {
    print!("and ever ");
}
```
1464
Here, the loop never ends, so `!` is the type of the expression. However, this
wouldn’t be true if we included a `break`, because the loop would terminate
when it got to the `break`.
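
For contrast, here is a sketch of a `loop` that does contain a `break`. The function name `first_power_of_two_above` is made up for this example. Because the loop exits through `break n`, the loop expression has the type of `n` (`u32`) rather than `!`.

```rust
// Hypothetical helper: finds the first power of two greater than `limit`.
fn first_power_of_two_above(limit: u32) -> u32 {
    let mut n = 1;
    // The value given to `break` becomes the value of the `loop` expression,
    // which is also the function's return value here.
    loop {
        if n > limit {
            break n;
        }
        n *= 2;
    }
}

fn main() {
    assert_eq!(first_power_of_two_above(100), 128);
    assert_eq!(first_power_of_two_above(0), 1);
}
```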
1468
1469 ### Dynamically Sized Types and the `Sized` Trait
1470
1471 Rust needs to know certain details about its types, such as how much space to
1472 allocate for a value of a particular type. This leaves one corner of its type
1473 system a little confusing at first: the concept of *dynamically sized types*.
1474 Sometimes referred to as *DSTs* or *unsized types*, these types let us write
1475 code using values whose size we can know only at runtime.
1476
1477 Let’s dig into the details of a dynamically sized type called `str`, which
1478 we’ve been using throughout the book. That’s right, not `&str`, but `str` on
1479 its own, is a DST. We can’t know how long the string is until runtime, meaning
1480 we can’t create a variable of type `str`, nor can we take an argument of type
1481 `str`. Consider the following code, which does not work:
1482
```
let s1: str = "Hello there!";
let s2: str = "How's it going?";
```
1487
1488 Rust needs to know how much memory to allocate for any value of a particular
1489 type, and all values of a type must use the same amount of memory. If Rust
1490 allowed us to write this code, these two `str` values would need to take up the
1491 same amount of space. But they have different lengths: `s1` needs 12 bytes of
1492 storage and `s2` needs 15. This is why it’s not possible to create a variable
1493 holding a dynamically sized type.
1494
1495 So what do we do? In this case, you already know the answer: we make the types
1496 of `s1` and `s2` a `&str` rather than a `str`. Recall from the “String Slices”
1497 section of Chapter 4 that the slice data structure just stores the starting
1498 position and the length of the slice. So although a `&T` is a single value that
1499 stores the memory address of where the `T` is located, a `&str` is *two*
1500 values: the address of the `str` and its length. As such, we can know the size
of a `&str` value at compile time: it’s twice the size of a `usize`. That is,
1502 we always know the size of a `&str`, no matter how long the string it refers to
1503 is. In general, this is the way in which dynamically sized types are used in
1504 Rust: they have an extra bit of metadata that stores the size of the dynamic
1505 information. The golden rule of dynamically sized types is that we must always
1506 put values of dynamically sized types behind a pointer of some kind.
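
These sizes can be checked with `std::mem::size_of`. The sketch below reflects how current Rust compilers lay out these pointers; treat it as an observation rather than a documented layout guarantee.

```rust
use std::mem::size_of;

fn main() {
    let word = size_of::<usize>();

    // A reference to a sized type is a single address.
    assert_eq!(size_of::<&u8>(), word);

    // A reference to `str` carries an address and a length.
    assert_eq!(size_of::<&str>(), 2 * word);

    // The same holds for other pointers to DSTs, such as `Box<str>`
    // and slice references.
    assert_eq!(size_of::<Box<str>>(), 2 * word);
    assert_eq!(size_of::<&[u8]>(), 2 * word);
}
```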
1507
1508 We can combine `str` with all kinds of pointers: for example, `Box<str>` or
1509 `Rc<str>`. In fact, you’ve seen this before but with a different dynamically
1510 sized type: traits. Every trait is a dynamically sized type we can refer to by
1511 using the name of the trait. In Chapter 17 in the “Using Trait Objects That
1512 Allow for Values of Different Types” section, we mentioned that to use traits
1513 as trait objects, we must put them behind a pointer, such as `&dyn Trait` or
1514 `Box<dyn Trait>` (`Rc<dyn Trait>` would work too).
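
As a brief illustration of these pointer combinations, the sketch below puts `str` and a trait object behind several kinds of pointers. The variable names are made up for the example, and `Display` stands in for any trait.

```rust
use std::fmt::Display;
use std::rc::Rc;

fn main() {
    // `str` behind an owning pointer and behind a reference-counted pointer.
    let boxed: Box<str> = "boxed".into();
    let shared: Rc<str> = Rc::from("shared");
    let also_shared = Rc::clone(&shared);

    // A trait object behind a `Box` and behind a plain reference.
    let boxed_display: Box<dyn Display> = Box::new(42);
    let borrowed_display: &dyn Display = &3.5;

    println!(
        "{} {} {} {} {}",
        boxed, shared, also_shared, boxed_display, borrowed_display
    );
}
```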
1515
1516 To work with DSTs, Rust provides the `Sized` trait to determine whether or not
1517 a type’s size is known at compile time. This trait is automatically implemented
1518 for everything whose size is known at compile time. In addition, Rust
1519 implicitly adds a bound on `Sized` to every generic function. That is, a
1520 generic function definition like this:
1521
```
fn generic<T>(t: T) {
    // --snip--
}
```
1527
1528 is actually treated as though we had written this:
1529
```
fn generic<T: Sized>(t: T) {
    // --snip--
}
```
1535
1536 By default, generic functions will work only on types that have a known size at
1537 compile time. However, you can use the following special syntax to relax this
1538 restriction:
1539
```
fn generic<T: ?Sized>(t: &T) {
    // --snip--
}
```
1545
1546 A trait bound on `?Sized` means “`T` may or may not be `Sized`” and this
1547 notation overrides the default that generic types must have a known size at
1548 compile time. The `?Trait` syntax with this meaning is only available for
1549 `Sized`, not any other traits.
1550
1551 Also note that we switched the type of the `t` parameter from `T` to `&T`.
1552 Because the type might not be `Sized`, we need to use it behind some kind of
1553 pointer. In this case, we’ve chosen a reference.
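
To make the relaxed bound concrete, here is a minimal sketch. The `describe` function and its `Display` bound are assumptions added for illustration; they are not part of the original example.

```rust
use std::fmt::Display;

// `T: ?Sized` lets `T` be a DST such as `str`. Taking `&T` keeps the
// parameter itself a fixed size.
fn describe<T: ?Sized + Display>(t: &T) -> String {
    format!("value: {}", t)
}

fn main() {
    // Here `T` is `str`, which is not `Sized`.
    println!("{}", describe("hello"));

    // Here `T` is `i32`; sized types still work.
    println!("{}", describe(&5));
}
```

Without `?Sized`, the first call would be rejected because `str` does not implement `Sized`.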
1554
1555 Next, we’ll talk about functions and closures!
1556
1557 ## Advanced Functions and Closures
1558
1559 This section explores some advanced features related to functions and closures,
1560 including function pointers and returning closures.
1561
1562 ### Function Pointers
1563
1564 We’ve talked about how to pass closures to functions; you can also pass regular
1565 functions to functions! This technique is useful when you want to pass a
1566 function you’ve already defined rather than defining a new closure. Functions
1567 coerce to the type `fn` (with a lowercase f), not to be confused with the `Fn`
1568 closure trait. The `fn` type is called a *function pointer*. Passing functions
1569 with function pointers will allow you to use functions as arguments to other
1570 functions.
1571
The syntax for specifying that a parameter is a function pointer is similar to
that of closures, as shown in Listing 19-27, where we’ve defined a function
`add_one` that adds one to its parameter. The function `do_twice` takes two
parameters: a function pointer to any function that takes an `i32` parameter
and returns an `i32`, and one `i32` value. The `do_twice` function calls the
function `f` twice, passing it the `arg` value, then adds the two function call
results together. The `main` function calls `do_twice` with the arguments
`add_one` and `5`.
1580
1581 Filename: src/main.rs
1582
```
fn add_one(x: i32) -> i32 {
    x + 1
}

fn do_twice(f: fn(i32) -> i32, arg: i32) -> i32 {
    f(arg) + f(arg)
}

fn main() {
    let answer = do_twice(add_one, 5);

    println!("The answer is: {}", answer);
}
```
1598
1599 Listing 19-27: Using the `fn` type to accept a function pointer as an argument
1600
1601 This code prints `The answer is: 12`. We specify that the parameter `f` in
1602 `do_twice` is an `fn` that takes one parameter of type `i32` and returns an
1603 `i32`. We can then call `f` in the body of `do_twice`. In `main`, we can pass
1604 the function name `add_one` as the first argument to `do_twice`.
1605
1606 Unlike closures, `fn` is a type rather than a trait, so we specify `fn` as the
1607 parameter type directly rather than declaring a generic type parameter with one
1608 of the `Fn` traits as a trait bound.
1609
1610 Function pointers implement all three of the closure traits (`Fn`, `FnMut`, and
1611 `FnOnce`), meaning you can always pass a function pointer as an argument for a
1612 function that expects a closure. It’s best to write functions using a generic
1613 type and one of the closure traits so your functions can accept either
1614 functions or closures.
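
A generic variant of `do_twice` might look like the following sketch. The name `do_twice_generic` is hypothetical and chosen only to distinguish it from Listing 19-27.

```rust
fn add_one(x: i32) -> i32 {
    x + 1
}

// Generic over any `Fn(i32) -> i32`, so it accepts function pointers
// and closures alike.
fn do_twice_generic<F>(f: F, arg: i32) -> i32
where
    F: Fn(i32) -> i32,
{
    f(arg) + f(arg)
}

fn main() {
    let offset = 10;

    // A named function works...
    println!("{}", do_twice_generic(add_one, 5));

    // ...and so does a closure that captures its environment.
    println!("{}", do_twice_generic(|x| x + offset, 5));
}
```

The `fn`-typed `do_twice` from Listing 19-27 would reject the second call, because a closure that captures `offset` is not a function pointer.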
1615
1616 That said, one example of where you would want to only accept `fn` and not
1617 closures is when interfacing with external code that doesn’t have closures: C
1618 functions can accept functions as arguments, but C doesn’t have closures.
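
The sketch below imitates that situation without any real C code. `apply_callback` is a hypothetical stand-in, written in Rust with the C calling convention, for a C function that takes a callback; in a real project it would be declared in an `extern` block and linked from a C library.

```rust
// Stand-in for a C API that accepts a callback (an assumption for this
// sketch; a real one would be implemented in C).
extern "C" fn apply_callback(cb: extern "C" fn(i32) -> i32, value: i32) -> i32 {
    cb(value)
}

// A plain function with the C ABI can be passed as the callback.
extern "C" fn double(x: i32) -> i32 {
    x * 2
}

fn main() {
    println!("{}", apply_callback(double, 21));
}
```

The parameter type is a function pointer, so there is nowhere to store a captured environment, which is why such an API cannot take a capturing closure.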
1619
1620 As an example of where you could use either a closure defined inline or a named
1621 function, let’s look at a use of the `map` method provided by the `Iterator`
1622 trait in the standard library. To use the `map` function to turn a
1623 vector of numbers into a vector of strings, we could use a closure, like this:
1624
```
let list_of_numbers = vec![1, 2, 3];
let list_of_strings: Vec<String> =
    list_of_numbers.iter().map(|i| i.to_string()).collect();
```
1630
1631 Or we could name a function as the argument to `map` instead of the closure,
1632 like this:
1633
```
let list_of_numbers = vec![1, 2, 3];
let list_of_strings: Vec<String> =
    list_of_numbers.iter().map(ToString::to_string).collect();
```
1639
1640 Note that we must use the fully qualified syntax that we talked about earlier
1641 in the “Advanced Traits” section because there are multiple functions available
1642 named `to_string`.
1643
1644 Here, we’re using the `to_string` function defined in the
1645 `ToString` trait, which the standard library has implemented for any type that
1646 implements `Display`.
1647
1648 Recall from the “Enum values” section of Chapter 6 that the name of each enum
1649 variant that we define also becomes an initializer function. We can use these
1650 initializer functions as function pointers that implement the closure traits,
1651 which means we can specify the initializer functions as arguments for methods
1652 that take closures, like so:
1653
```
enum Status {
    Value(u32),
    Stop,
}

let list_of_statuses: Vec<Status> = (0u32..20).map(Status::Value).collect();
```
1662
1663 Here we create `Status::Value` instances using each `u32` value in the range
1664 that `map` is called on by using the initializer function of `Status::Value`.
1665 Some people prefer this style, and some people prefer to use closures. They
1666 compile to the same code, so use whichever style is clearer to you.
1667
1668 ### Returning Closures
1669
Closures are represented by traits, which means you can’t return closures
directly. In most cases where you might want to return a trait, you can instead
use the concrete type that implements the trait as the return value of the
function. However, you can’t do that with closures because they don’t have a
concrete type that you can write out; a closure that captures its environment
can’t be returned as the function pointer type `fn`, for example.
1676
1677 The following code tries to return a closure directly, but it won’t compile:
1678
```
fn returns_closure() -> dyn Fn(i32) -> i32 {
    |x| x + 1
}
```
1684
1685 The compiler error is as follows:
1686
```
error[E0746]: return type cannot have an unboxed trait object
 --> src/lib.rs:1:25
  |
1 | fn returns_closure() -> dyn Fn(i32) -> i32 {
  |                         ^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
  |
  = note: for information on `impl Trait`, see <https://doc.rust-lang.org/book/ch10-02-traits.html#returning-types-that-implement-traits>
help: use `impl Fn(i32) -> i32` as the return type, as all return paths are of type `[closure@src/lib.rs:2:5: 2:14]`, which implements `Fn(i32) -> i32`
  |
1 | fn returns_closure() -> impl Fn(i32) -> i32 {
  |                         ~~~~~~~~~~~~~~~~~~~
```
1700
The error says the return type doesn’t have a size known at compile time: this
is the `Sized` requirement again! Rust doesn’t know how much space it will need
to store the closure. We saw a solution to this problem earlier. We can use a
trait object:
1704
```
fn returns_closure() -> Box<dyn Fn(i32) -> i32> {
    Box::new(|x| x + 1)
}
```
1710
1711 This code will compile just fine. For more about trait objects, refer to the
1712 section “Using Trait Objects That Allow for Values of Different Types” in
1713 Chapter 17.
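
The compiler’s help text in the error above also suggests `impl Fn(i32) -> i32`. The sketch below shows that option next to the boxed one. `returns_impl` and `returns_boxed` are hypothetical names, and the `add` flag exists only to give the second function two different closures to return.

```rust
// `impl Trait` works when every return path yields the same closure type.
fn returns_impl() -> impl Fn(i32) -> i32 {
    |x| x + 1
}

// A boxed trait object also works when the branches return different
// closures, since each one is coerced to `Box<dyn Fn(i32) -> i32>`.
fn returns_boxed(add: bool) -> Box<dyn Fn(i32) -> i32> {
    if add {
        Box::new(|x| x + 1)
    } else {
        Box::new(|x| x - 1)
    }
}

fn main() {
    println!("{}", returns_impl()(1));
    println!("{}", returns_boxed(true)(1));
    println!("{}", returns_boxed(false)(1));
}
```

The boxed form costs a heap allocation and dynamic dispatch. The `impl Trait` form avoids both, but it can name only one concrete closure type per function.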
1714
1715 Next, let’s look at macros!
1716
1717 ## Macros
1718
1719 We’ve used macros like `println!` throughout this book, but we haven’t fully
1720 explored what a macro is and how it works. The term *macro* refers to a family
1721 of features in Rust: *declarative* macros with `macro_rules!` and three kinds
1722 of *procedural* macros:
1723
1724 * Custom `#[derive]` macros that specify code added with the `derive` attribute
1725 used on structs and enums
1726 * Attribute-like macros that define custom attributes usable on any item
1727 * Function-like macros that look like function calls but operate on the tokens
1728 specified as their argument
1729
1730 We’ll talk about each of these in turn, but first, let’s look at why we even
1731 need macros when we already have functions.
1732
1733 ### The Difference Between Macros and Functions
1734
1735 Fundamentally, macros are a way of writing code that writes other code, which
1736 is known as *metaprogramming*. In Appendix C, we discuss the `derive`
1737 attribute, which generates an implementation of various traits for you. We’ve
1738 also used the `println!` and `vec!` macros throughout the book. All of these
1739 macros *expand* to produce more code than the code you’ve written manually.
1740
1741 Metaprogramming is useful for reducing the amount of code you have to write and
1742 maintain, which is also one of the roles of functions. However, macros have
1743 some additional powers that functions don’t.
1744
1745 A function signature must declare the number and type of parameters the
1746 function has. Macros, on the other hand, can take a variable number of
1747 parameters: we can call `println!("hello")` with one argument or
1748 `println!("hello {}", name)` with two arguments. Also, macros are expanded
1749 before the compiler interprets the meaning of the code, so a macro can, for
1750 example, implement a trait on a given type. A function can’t, because it gets
1751 called at runtime and a trait needs to be implemented at compile time.
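
As a sketch of that last point, the declarative macro below generates one `impl` block per type it is given. The `Describe` trait and the `impl_describe!` macro are invented for this illustration.

```rust
trait Describe {
    fn describe() -> String;
}

// One macro invocation generates an `impl` block for each listed type.
macro_rules! impl_describe {
    ( $( $t:ty ),* ) => {
        $(
            impl Describe for $t {
                fn describe() -> String {
                    format!("I am {}", stringify!($t))
                }
            }
        )*
    };
}

// The macro accepts any number of types, which a function could not do.
impl_describe!(u8, String, bool);

fn main() {
    println!("{}", u8::describe());
    println!("{}", String::describe());
    println!("{}", bool::describe());
}
```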
1752
1753 The downside to implementing a macro instead of a function is that macro
1754 definitions are more complex than function definitions because you’re writing
1755 Rust code that writes Rust code. Due to this indirection, macro definitions are
1756 generally more difficult to read, understand, and maintain than function
1757 definitions.
1758
1759 Another important difference between macros and functions is that you must
1760 define macros or bring them into scope *before* you call them in a file, as
1761 opposed to functions you can define anywhere and call anywhere.
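
The ordering rule can be seen in a single file. In this sketch, `square!` and `triple` are hypothetical names: the function is defined after `main`, while the macro has to appear before its first use.

```rust
// Defined first: a `macro_rules!` macro is visible only to code that
// comes after it in the file.
macro_rules! square {
    ($x:expr) => {
        $x * $x
    };
}

fn main() {
    // `triple` is defined further down, which is fine for functions.
    println!("{}", triple(2));

    // `square!` had to be defined above this point.
    println!("{}", square!(4));
}

fn triple(x: i32) -> i32 {
    x * 3
}
```

Moving the `macro_rules!` block below `main` would make the call to `square!` fail to compile, while `triple` can sit anywhere in the module.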
1762
1763 ### Declarative Macros with `macro_rules!` for General Metaprogramming
1764
1765 The most widely used form of macros in Rust is the *declarative macro*. These
1766 are also sometimes referred to as “macros by example,” “`macro_rules!` macros,”
1767 or just plain “macros.” At their core, declarative macros allow you to write
1768 something similar to a Rust `match` expression. As discussed in Chapter 6,
1769 `match` expressions are control structures that take an expression, compare the
1770 resulting value of the expression to patterns, and then run the code associated
1771 with the matching pattern. Macros also compare a value to patterns that are
1772 associated with particular code: in this situation, the value is the literal
1773 Rust source code passed to the macro; the patterns are compared with the
1774 structure of that source code; and the code associated with each pattern, when
1775 matched, replaces the code passed to the macro. This all happens during
1776 compilation.
1777
1778 To define a macro, you use the `macro_rules!` construct. Let’s explore how to
1779 use `macro_rules!` by looking at how the `vec!` macro is defined. Chapter 8
1780 covered how we can use the `vec!` macro to create a new vector with particular
1781 values. For example, the following macro creates a new vector containing three
1782 integers:
1783
```
let v: Vec<u32> = vec![1, 2, 3];
```
1787
1788 We could also use the `vec!` macro to make a vector of two integers or a vector
1789 of five string slices. We wouldn’t be able to use a function to do the same
1790 because we wouldn’t know the number or type of values up front.
1791
1792 Listing 19-28 shows a slightly simplified definition of the `vec!` macro.
1793
1794 Filename: src/lib.rs
1795
```
[1] #[macro_export]
[2] macro_rules! vec {
  [3] ( $( $x:expr ),* ) => {
        {
            let mut temp_vec = Vec::new();
          [4] $(
              [5] temp_vec.push($x [6]);
            )*
          [7] temp_vec
        }
    };
}
```
1810
1811 Listing 19-28: A simplified version of the `vec!` macro definition
1812
1813 > Note: The actual definition of the `vec!` macro in the standard library
1814 > includes code to preallocate the correct amount of memory up front. That code
1815 > is an optimization that we don’t include here to make the example simpler.
1816
1817 The `#[macro_export]` annotation [1] indicates that this macro should be made
1818 available whenever the crate in which the macro is defined is brought into
1819 scope. Without this annotation, the macro can’t be brought into scope.
1820
1821 We then start the macro definition with `macro_rules!` and the name of the
1822 macro we’re defining *without* the exclamation mark [2]. The name, in this case
1823 `vec`, is followed by curly brackets denoting the body of the macro definition.
1824
1825 The structure in the `vec!` body is similar to the structure of a `match`
1826 expression. Here we have one arm with the pattern `( $( $x:expr ),* )`,
1827 followed by `=>` and the block of code associated with this pattern [3]. If the
1828 pattern matches, the associated block of code will be emitted. Given that this
1829 is the only pattern in this macro, there is only one valid way to match; any
1830 other pattern will result in an error. More complex macros will have more than
1831 one arm.
1832
Valid pattern syntax in macro definitions is different from the pattern syntax
covered in Chapter 18 because macro patterns are matched against Rust code
structure rather than values. Let’s walk through what the pattern pieces in
Listing 19-28 mean; for the full macro pattern syntax, see the Rust Reference
at *https://doc.rust-lang.org/reference/macros-by-example.html*.
1838
1839 First, we use a set of parentheses to encompass the whole pattern. We use a
1840 dollar sign (`$`) to declare a variable in the macro system that will contain
1841 the Rust code matching the pattern. The dollar sign makes it clear this is a
1842 macro variable as opposed to a regular Rust variable.
1843 Next comes a set of parentheses that captures values that match the
1844 pattern within the parentheses for use in the replacement code. Within `$()` is
1845 `$x:expr`, which matches any Rust expression and gives the expression the name
1846 `$x`.
1847
The comma following `$()` indicates that a literal comma separator character
must appear between each instance of the code that matches the pattern in
`$()`. The `*` specifies that the pattern matches zero or more of whatever
precedes the `*`.
1851
1852 When we call this macro with `vec![1, 2, 3];`, the `$x` pattern matches three
1853 times with the three expressions `1`, `2`, and `3`.
1854
1855 Now let’s look at the pattern in the body of the code associated with this arm:
1856 `temp_vec.push()` [5] within `$()*` [4][7] is generated for each part that
1857 matches `$()` in the pattern zero or more times depending on how many times the
1858 pattern matches. The `$x` [6] is replaced with each expression matched. When we
1859 call this macro with `vec![1, 2, 3];`, the code generated that replaces this
1860 macro call will be the following:
1861
```
{
    let mut temp_vec = Vec::new();
    temp_vec.push(1);
    temp_vec.push(2);
    temp_vec.push(3);
    temp_vec
}
```
1871
1872 We’ve defined a macro that can take any number of arguments of any type and can
1873 generate code to create a vector containing the specified elements.
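
To experiment with the pattern without shadowing the standard `vec!`, the same arm can be written under another name. In this sketch, `my_vec!` is a hypothetical name, and the `$(,)?` fragment is an addition, not part of Listing 19-28, that lets the macro accept one trailing comma.

```rust
macro_rules! my_vec {
    // `$( $x:expr ),*` matches zero or more comma-separated expressions;
    // `$(,)?` additionally tolerates a single trailing comma.
    ( $( $x:expr ),* $(,)? ) => {
        {
            let mut temp_vec = Vec::new();
            $(
                temp_vec.push($x);
            )*
            temp_vec
        }
    };
}

fn main() {
    // Three integers, no trailing comma.
    let numbers: Vec<u32> = my_vec![1, 2, 3];

    // Two string slices, with a trailing comma.
    let words = my_vec!["a", "b",];

    println!("{:?} {:?}", numbers, words);
}
```

The comma in `),*` acts as a separator between repetitions, so `my_vec![1, 2, 3]` matches `$x` three times.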
1874
1875 To learn more about how to write macros, consult the online documentation or
1876 other resources, such as “The Little Book of Rust Macros” at
1877 *https://veykril.github.io/tlborm/* started by Daniel Keep and continued by
1878 Lukas Wirth.
1879
1888 ### Procedural Macros for Generating Code from Attributes
1889
1890 The second form of macros is the *procedural macro*, which acts more like a
1891 function (and is a type of procedure). Procedural macros accept some code as an
1892 input, operate on that code, and produce some code as an output rather than
1893 matching against patterns and replacing the code with other code as declarative
1894 macros do. The three kinds of procedural macros are custom derive,
1895 attribute-like, and function-like, and all work in a similar fashion.
1896
1897 When creating procedural macros, the definitions must reside in their own crate
1898 with a special crate type. This is for complex technical reasons that we hope
1899 to eliminate in the future. In Listing 19-29, we show how to define a
1900 procedural macro, where `some_attribute` is a placeholder for using a specific
1901 macro variety.
1902
1903 Filename: src/lib.rs
1904
```
use proc_macro::TokenStream;

#[some_attribute]
pub fn some_name(input: TokenStream) -> TokenStream {
}
```
1912
1913 Listing 19-29: An example of defining a procedural macro
1914
1915 The function that defines a procedural macro takes a `TokenStream` as an input
1916 and produces a `TokenStream` as an output. The `TokenStream` type is defined by
1917 the `proc_macro` crate that is included with Rust and represents a sequence of
1918 tokens. This is the core of the macro: the source code that the macro is
1919 operating on makes up the input `TokenStream`, and the code the macro produces
1920 is the output `TokenStream`. The function also has an attribute attached to it
1921 that specifies which kind of procedural macro we’re creating. We can have
1922 multiple kinds of procedural macros in the same crate.
1923
1924 Let’s look at the different kinds of procedural macros. We’ll start with a
1925 custom derive macro and then explain the small dissimilarities that make the
1926 other forms different.
1927
1928 ### How to Write a Custom `derive` Macro
1929
1930 Let’s create a crate named `hello_macro` that defines a trait named
1931 `HelloMacro` with one associated function named `hello_macro`. Rather than
1932 making our users implement the `HelloMacro` trait for each of their types,
1933 we’ll provide a procedural macro so users can annotate their type with
1934 `#[derive(HelloMacro)]` to get a default implementation of the `hello_macro`
1935 function. The default implementation will print `Hello, Macro! My name is
1936 TypeName!` where `TypeName` is the name of the type on which this trait has
1937 been defined. In other words, we’ll write a crate that enables another
1938 programmer to write code like Listing 19-30 using our crate.
1939
1940 Filename: src/main.rs
1941
```
use hello_macro::HelloMacro;
use hello_macro_derive::HelloMacro;

#[derive(HelloMacro)]
struct Pancakes;

fn main() {
    Pancakes::hello_macro();
}
```
1953
1954 Listing 19-30: The code a user of our crate will be able to write when using
1955 our procedural macro
1956
1957 This code will print `Hello, Macro! My name is Pancakes!` when we’re done. The
1958 first step is to make a new library crate, like this:
1959
```
$ cargo new hello_macro --lib
```
1963
1964 Next, we’ll define the `HelloMacro` trait and its associated function:
1965
1966 Filename: src/lib.rs
1967
```
pub trait HelloMacro {
    fn hello_macro();
}
```
1973
1974 We have a trait and its function. At this point, our crate user could implement
1975 the trait to achieve the desired functionality, like so:
1976
```
use hello_macro::HelloMacro;

struct Pancakes;

impl HelloMacro for Pancakes {
    fn hello_macro() {
        println!("Hello, Macro! My name is Pancakes!");
    }
}

fn main() {
    Pancakes::hello_macro();
}
```
1992
1993 However, they would need to write the implementation block for each type they
1994 wanted to use with `hello_macro`; we want to spare them from having to do this
1995 work.
1996
Additionally, we can’t yet provide the `hello_macro` function with a default
implementation that will print the name of the type the trait is implemented
on: Rust doesn’t have reflection capabilities, so it can’t look up the type’s
name at runtime. We need a macro to generate code at compile time.
2001
2002 The next step is to define the procedural macro. At the time of this writing,
2003 procedural macros need to be in their own crate. Eventually, this restriction
2004 might be lifted. The convention for structuring crates and macro crates is as
2005 follows: for a crate named `foo`, a custom derive procedural macro crate is
2006 called `foo_derive`. Let’s start a new crate called `hello_macro_derive` inside
2007 our `hello_macro` project:
2008
```
$ cargo new hello_macro_derive --lib
```
2012
2013 Our two crates are tightly related, so we create the procedural macro crate
2014 within the directory of our `hello_macro` crate. If we change the trait
2015 definition in `hello_macro`, we’ll have to change the implementation of the
2016 procedural macro in `hello_macro_derive` as well. The two crates will need to
2017 be published separately, and programmers using these crates will need to add
2018 both as dependencies and bring them both into scope. We could instead have the
2019 `hello_macro` crate use `hello_macro_derive` as a dependency and re-export the
2020 procedural macro code. However, the way we’ve structured the project makes it
2021 possible for programmers to use `hello_macro` even if they don’t want the
2022 `derive` functionality.
2023
2024 We need to declare the `hello_macro_derive` crate as a procedural macro crate.
2025 We’ll also need functionality from the `syn` and `quote` crates, as you’ll see
2026 in a moment, so we need to add them as dependencies. Add the following to the
2027 *Cargo.toml* file for `hello_macro_derive`:
2028
2029 Filename: hello_macro_derive/Cargo.toml
2030
```
[lib]
proc-macro = true

[dependencies]
syn = "1.0"
quote = "1.0"
```
2039
2040 To start defining the procedural macro, place the code in Listing 19-31 into
2041 your *src/lib.rs* file for the `hello_macro_derive` crate. Note that this code
2042 won’t compile until we add a definition for the `impl_hello_macro` function.
2043
2044 Filename: hello_macro_derive/src/lib.rs
2045
```
use proc_macro::TokenStream;
use quote::quote;
use syn;

#[proc_macro_derive(HelloMacro)]
pub fn hello_macro_derive(input: TokenStream) -> TokenStream {
    // Construct a representation of Rust code as a syntax tree
    // that we can manipulate
    let ast = syn::parse(input).unwrap();

    // Build the trait implementation
    impl_hello_macro(&ast)
}
```
2061
2062 Listing 19-31: Code that most procedural macro crates will require in order to
2063 process Rust code
2064
2065 Notice that we’ve split the code into the `hello_macro_derive` function, which
2066 is responsible for parsing the `TokenStream`, and the `impl_hello_macro`
2067 function, which is responsible for transforming the syntax tree: this makes
2068 writing a procedural macro more convenient. The code in the outer function
2069 (`hello_macro_derive` in this case) will be the same for almost every
2070 procedural macro crate you see or create. The code you specify in the body of
2071 the inner function (`impl_hello_macro` in this case) will be different
2072 depending on your procedural macro’s purpose.
2073
2074 We’ve introduced three new crates: `proc_macro`, `syn` (available from
2075 *https://crates.io/crates/syn*), and `quote` (available from
2076 *https://crates.io/crates/quote*). The `proc_macro` crate comes with Rust, so
2077 we didn’t need to add that to the dependencies in *Cargo.toml*. The
2078 `proc_macro` crate is the compiler’s API that allows us to read and manipulate
2079 Rust code from our code.
2080
2081 The `syn` crate parses Rust code from a string into a data structure that we
2082 can perform operations on. The `quote` crate turns `syn` data structures back
2083 into Rust code. These crates make it much simpler to parse any sort of Rust
2084 code we might want to handle: writing a full parser for Rust code is no simple
2085 task.
2086
2087 The `hello_macro_derive` function will be called when a user of our library
2088 specifies `#[derive(HelloMacro)]` on a type. This is possible because we’ve
2089 annotated the `hello_macro_derive` function here with `proc_macro_derive` and
2090 specified the name `HelloMacro`, which matches our trait name; this is the
2091 convention most procedural macros follow.
2092
2093 The `hello_macro_derive` function first converts the `input` from a
2094 `TokenStream` to a data structure that we can then interpret and perform
2095 operations on. This is where `syn` comes into play. The `parse` function in
2096 `syn` takes a `TokenStream` and returns a `DeriveInput` struct representing the
2097 parsed Rust code. Listing 19-32 shows the relevant parts of the `DeriveInput`
2098 struct we get from parsing the `struct Pancakes;` string:
2099
```
DeriveInput {
    // --snip--

    ident: Ident {
        ident: "Pancakes",
        span: #0 bytes(95..103)
    },
    data: Struct(
        DataStruct {
            struct_token: Struct,
            fields: Unit,
            semi_token: Some(
                Semi
            )
        }
    )
}
```
2119
2120 Listing 19-32: The `DeriveInput` instance we get when parsing the code that has
2121 the macro’s attribute in Listing 19-30
2122
2123 The fields of this struct show that the Rust code we’ve parsed is a unit struct
2124 with the `ident` (identifier, meaning the name) of `Pancakes`. There are more
2125 fields on this struct for describing all sorts of Rust code; check the `syn`
2126 documentation for `DeriveInput` at
2127 *https://docs.rs/syn/1.0/syn/struct.DeriveInput.html* for more information.
2128
2129 Soon we’ll define the `impl_hello_macro` function, which is where we’ll build
2130 the new Rust code we want to include. But before we do, note that the output
2131 for our derive macro is also a `TokenStream`. The returned `TokenStream` is
2132 added to the code that our crate users write, so when they compile their crate,
2133 they’ll get the extra functionality that we provide in the modified
2134 `TokenStream`.
2135
2136 You might have noticed that we’re calling `unwrap` to cause the
2137 `hello_macro_derive` function to panic if the call to the `syn::parse` function
2138 fails here. It’s necessary for our procedural macro to panic on errors because
2139 `proc_macro_derive` functions must return `TokenStream` rather than `Result` to
2140 conform to the procedural macro API. We’ve simplified this example by using
2141 `unwrap`; in production code, you should provide more specific error messages
2142 about what went wrong by using `panic!` or `expect`.
2143
2144 Now that we have the code to turn the annotated Rust code from a `TokenStream`
2145 into a `DeriveInput` instance, let’s generate the code that implements the
2146 `HelloMacro` trait on the annotated type, as shown in Listing 19-33.
2147
2148 Filename: hello_macro_derive/src/lib.rs
2149
```
fn impl_hello_macro(ast: &syn::DeriveInput) -> TokenStream {
    let name = &ast.ident;
    let gen = quote! {
        impl HelloMacro for #name {
            fn hello_macro() {
                println!("Hello, Macro! My name is {}!", stringify!(#name));
            }
        }
    };
    gen.into()
}
```
2163
2164 Listing 19-33: Implementing the `HelloMacro` trait using the parsed Rust code
2165
2166 We get an `Ident` struct instance containing the name (identifier) of the
2167 annotated type using `ast.ident`. The struct in Listing 19-32 shows that when
2168 we run the `impl_hello_macro` function on the code in Listing 19-30, the
2169 `ident` we get will have the `ident` field with a value of `"Pancakes"`. Thus,
2170 the `name` variable in Listing 19-33 will contain an `Ident` struct instance
2171 that, when printed, will be the string `"Pancakes"`, the name of the struct in
2172 Listing 19-30.
2173
2174 The `quote!` macro lets us define the Rust code that we want to return. The
2175 compiler expects something different to the direct result of the `quote!`
2176 macro’s execution, so we need to convert it to a `TokenStream`. We do this by
2177 calling the `into` method, which consumes this intermediate representation and
2178 returns a value of the required `TokenStream` type.
2179
2180 The `quote!` macro also provides some very cool templating mechanics: we can
2181 enter `#name`, and `quote!` will replace it with the value in the variable
2182 `name`. You can even do some repetition similar to the way regular macros work.
2183 Check out the `quote` crate’s docs at *https://docs.rs/quote* for a thorough
2184 introduction.
2185
2186 We want our procedural macro to generate an implementation of our `HelloMacro`
2187 trait for the type the user annotated, which we can get by using `#name`. The
2188 trait implementation has the one function `hello_macro`, whose body contains the
2189 functionality we want to provide: printing `Hello, Macro! My name is` and then
2190 the name of the annotated type.
2191
The `stringify!` macro used here is built into Rust. It takes a Rust
expression, such as `1 + 2`, and at compile time turns the expression into a
string literal, such as `"1 + 2"`. This is different from `format!` or
`println!`, macros that evaluate the expression and then turn the result into
a `String`. There is a possibility that the `#name` input might be an
expression to print literally, so we use `stringify!`. Using `stringify!` also
saves an allocation by converting `#name` to a string literal at compile time.
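
The difference is easy to observe outside a procedural macro. This short sketch uses only the standard library, and the variable names are illustrative.

```rust
fn main() {
    // `stringify!` turns the tokens themselves into a string literal at
    // compile time; nothing is evaluated.
    let literal: &'static str = stringify!(1 + 2);

    // `format!` evaluates the expression and allocates a `String`.
    let evaluated: String = format!("{}", 1 + 2);

    println!("{} vs {}", literal, evaluated);

    // An identifier such as a type name is stringified the same way,
    // which is how the derive macro gets the text of the type's name.
    let type_name: &'static str = stringify!(Pancakes);
    println!("{}", type_name);
}
```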
2199
At this point, `cargo build` should complete successfully in both `hello_macro`
and `hello_macro_derive`. Let’s hook up these crates to the code in Listing
19-30 to see the procedural macro in action! Create a new binary project in
your *projects* directory using `cargo new pancakes`. We need to add
`hello_macro` and `hello_macro_derive` as dependencies in the `pancakes`
crate’s *Cargo.toml*. If you’re publishing your versions of `hello_macro` and
`hello_macro_derive` to *https://crates.io/*, they would be regular
dependencies; if not, you can specify them as `path` dependencies as follows:

```
[dependencies]
hello_macro = { path = "../hello_macro" }
hello_macro_derive = { path = "../hello_macro/hello_macro_derive" }
```

Put the code in Listing 19-30 into *src/main.rs*, and run `cargo run`: it
should print `Hello, Macro! My name is Pancakes!`. The implementation of the
`HelloMacro` trait from the procedural macro was included without the
`pancakes` crate needing to implement it; the `#[derive(HelloMacro)]` added the
trait implementation.

Next, let’s explore how the other kinds of procedural macros differ from custom
derive macros.

### Attribute-like macros

Attribute-like macros are similar to custom derive macros, but instead of
generating code for the `derive` attribute, they allow you to create new
attributes. They’re also more flexible: `derive` only works for structs, enums,
and unions; attributes can be applied to other items as well, such as
functions. Here’s an example of using an attribute-like macro: say you have an
attribute named `route` that annotates functions when using a web application
framework:

```
#[route(GET, "/")]
fn index() {
```

This `#[route]` attribute would be defined by the framework as a procedural
macro. The signature of the macro definition function would look like this:

```
#[proc_macro_attribute]
pub fn route(attr: TokenStream, item: TokenStream) -> TokenStream {
```

Here, we have two parameters of type `TokenStream`. The first is for the
contents of the attribute: the `GET, "/"` part. The second is the item the
attribute is attached to: in this case, `fn index() {}`, including the rest of
the function’s body.

Other than that, attribute-like macros work the same way as custom derive
macros: you create a crate with the `proc-macro` crate type and implement a
function that generates the code you want!

### Function-like macros

Function-like macros define macros that look like function calls. Similarly to
`macro_rules!` macros, they’re more flexible than functions; for example, they
can take an unknown number of arguments. However, `macro_rules!` macros can be
defined only using the match-like syntax we discussed in the section
“Declarative Macros with `macro_rules!` for General Metaprogramming” earlier.
Function-like macros take a `TokenStream` parameter, and their definition
manipulates that `TokenStream` using Rust code, as the other two types of
procedural macros do. An example of a function-like macro is an `sql!` macro
that might be called like so:

```
let sql = sql!(SELECT * FROM posts WHERE id=1);
```

This macro would parse the SQL statement inside it and check that it’s
syntactically correct, which is much more complex processing than a
`macro_rules!` macro can do. The `sql!` macro would be defined like this:

```
#[proc_macro]
pub fn sql(input: TokenStream) -> TokenStream {
```

This definition is similar to the custom derive macro’s signature: we receive
the tokens that are inside the parentheses and return the code we want to
generate.

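For comparison, here is a sketch of the match-like `macro_rules!` form handling
an unknown number of arguments. The `sum!` macro and the `totals` function are
hypothetical names used only for illustration. A function-like procedural macro
could accept the same input, but it would process it as a `TokenStream` in
ordinary Rust code instead of through patterns like this one.

```rust
// Hypothetical declarative macro: adds up any number of comma-separated
// expressions, allowing an optional trailing comma.
macro_rules! sum {
    ($($x:expr),* $(,)?) => {
        // Start from zero and add each matched expression in turn.
        0 $(+ $x)*
    };
}

/// Calls `sum!` with zero, one, and three arguments; returns `(0, 4, 6)`.
fn totals() -> (i32, i32, i32) {
    (sum!(), sum!(4), sum!(1, 2, 3))
}
```
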
<!-- I may get a few looks for this, but I wonder if we should trim the
procedural macros section above a bit. There's a lot of information in there,
but it feels like something we could intro and then point people off to other
materials for. Reason being (and I know I may be in the minority here),
procedural macros are something we should use only rarely in our Rust projects.
They are a burden on the compiler, have the potential to hurt readability and
maintainability, and... you know the saying with great power comes great
responsibility and all that. /JT -->
<!-- I think we felt obligated to have this section when procedural macros were
introduced because there wasn't any documentation for them. I feel like the
custom derive is the most common kind people want to make... While I'd love to
not have to maintain this section, I asked around and people seemed generally
in favor of keeping it, so I think I will, for now. /Carol -->

## Summary

Whew! Now you have some Rust features in your toolbox that you likely won’t use
often, but you’ll know they’re available when particular circumstances call for
them. We’ve introduced several complex topics so that when you encounter them
in error message suggestions or in other people’s code, you’ll be able to
recognize these concepts and syntax. Use this chapter as a reference to guide
you to solutions.

Next, we’ll put everything we’ve discussed throughout the book into practice
and do one more project!