git.proxmox.com Git - rustc.git/blame - src/doc/book/nostarch/chapter19.md
New upstream version 1.63.0+dfsg1
1<!-- DO NOT EDIT THIS FILE.
2
3This file is periodically generated from the content in the `/src/`
4directory, so all fixes need to be made in `/src/`.
5-->
6
7[TOC]
8
9# Advanced Features
10
11By now, you’ve learned the most commonly used parts of the Rust programming
12language. Before we do one more project in Chapter 20, we’ll look at a few
13aspects of the language you might run into every once in a while, but may not
14use every day. You can use this chapter as a reference for when you encounter
15any unknowns. The features covered here are useful in very specific situations.
16Although you might not reach for them often, we want to make sure you have a
17grasp of all the features Rust has to offer.
18
19In this chapter, we’ll cover:
20
21* Unsafe Rust: how to opt out of some of Rust’s guarantees and take
22 responsibility for manually upholding those guarantees
23* Advanced traits: associated types, default type parameters, fully qualified
24 syntax, supertraits, and the newtype pattern in relation to traits
25* Advanced types: more about the newtype pattern, type aliases, the never type,
26 and dynamically sized types
27* Advanced functions and closures: function pointers and returning closures
28* Macros: ways to define code that defines more code at compile time
29
30It’s a panoply of Rust features with something for everyone! Let’s dive in!
31
32## Unsafe Rust
33
34All the code we’ve discussed so far has had Rust’s memory safety guarantees
35enforced at compile time. However, Rust has a second language hidden inside it
36that doesn’t enforce these memory safety guarantees: it’s called *unsafe Rust*
37and works just like regular Rust, but gives us extra superpowers.
38
39Unsafe Rust exists because, by nature, static analysis is conservative. When
40the compiler tries to determine whether or not code upholds the guarantees,
41it’s better for it to reject some valid programs than to accept some invalid
42programs. Although the code *might* be okay, if the Rust compiler doesn’t have
43enough information to be confident, it will reject the code. In these cases,
44you can use unsafe code to tell the compiler, “Trust me, I know what I’m
45doing.” Be warned, however, that you use unsafe Rust at your own risk: if you
46use unsafe code incorrectly, problems can occur due to memory unsafety, such as
47null pointer dereferencing.
48
49Another reason Rust has an unsafe alter ego is that the underlying computer
50hardware is inherently unsafe. If Rust didn’t let you do unsafe operations, you
51couldn’t do certain tasks. Rust needs to allow you to do low-level systems
52programming, such as directly interacting with the operating system or even
53writing your own operating system. Working with low-level systems programming
54is one of the goals of the language. Let’s explore what we can do with unsafe
55Rust and how to do it.
56
57### Unsafe Superpowers
58
59To switch to unsafe Rust, use the `unsafe` keyword and then start a new block
60that holds the unsafe code. You can take five actions in unsafe Rust that you
61can’t in safe Rust, which we call *unsafe superpowers*. Those superpowers
62include the ability to:
63
64* Dereference a raw pointer
65* Call an unsafe function or method
66* Access or modify a mutable static variable
67* Implement an unsafe trait
68* Access fields of `union`s
69
70It’s important to understand that `unsafe` doesn’t turn off the borrow checker
71or disable any other of Rust’s safety checks: if you use a reference in unsafe
72code, it will still be checked. The `unsafe` keyword only gives you access to
73these five features that are then not checked by the compiler for memory
74safety. You’ll still get some degree of safety inside of an unsafe block.
75
76In addition, `unsafe` does not mean the code inside the block is necessarily
77dangerous or that it will definitely have memory safety problems: the intent is
78that as the programmer, you’ll ensure the code inside an `unsafe` block will
79access memory in a valid way.
80
81People are fallible, and mistakes will happen, but by requiring these five
82unsafe operations to be inside blocks annotated with `unsafe` you’ll know that
83any errors related to memory safety must be within an `unsafe` block. Keep
84`unsafe` blocks small; you’ll be thankful later when you investigate memory
85bugs.
86
87To isolate unsafe code as much as possible, it’s best to enclose unsafe code
88within a safe abstraction and provide a safe API, which we’ll discuss later in
89the chapter when we examine unsafe functions and methods. Parts of the standard
90library are implemented as safe abstractions over unsafe code that has been
91audited. Wrapping unsafe code in a safe abstraction prevents uses of `unsafe`
92from leaking out into all the places that you or your users might want to use
93the functionality implemented with `unsafe` code, because using a safe
94abstraction is safe.
95
96Let’s look at each of the five unsafe superpowers in turn. We’ll also look at
97some abstractions that provide a safe interface to unsafe code.
98
99### Dereferencing a Raw Pointer
100
101In Chapter 4, in the “Dangling References” section, we mentioned that the
102compiler ensures references are always valid. Unsafe Rust has two new types
103called *raw pointers* that are similar to references. As with references, raw
104pointers can be immutable or mutable and are written as `*const T` and `*mut
105T`, respectively. The asterisk isn’t the dereference operator; it’s part of the
106type name. In the context of raw pointers, *immutable* means that the pointer
107can’t be directly assigned to after being dereferenced.
108
109Different from references and smart pointers, raw pointers:
110
111* Are allowed to ignore the borrowing rules by having both immutable and
112 mutable pointers or multiple mutable pointers to the same location
113* Aren’t guaranteed to point to valid memory
114* Are allowed to be null
115* Don’t implement any automatic cleanup
116
117By opting out of having Rust enforce these guarantees, you can give up
118guaranteed safety in exchange for greater performance or the ability to
119interface with another language or hardware where Rust’s guarantees don’t apply.
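
To make the list above concrete, here is a minimal sketch showing that a raw pointer may be null and that inspecting a pointer, as opposed to dereferencing it, needs no `unsafe`. The `describe` helper is a name invented for this illustration.

```rust
use std::ptr;

// Hypothetical helper for this sketch: reports whether a raw pointer is null.
// Checking for null only inspects the address, so no `unsafe` is required.
fn describe(p: *const i32) -> &'static str {
    if p.is_null() {
        "null"
    } else {
        "non-null"
    }
}

fn main() {
    let num = 5;
    let valid = &num as *const i32;
    // `ptr::null` builds a null raw pointer in safe code.
    let null: *const i32 = ptr::null();

    println!("valid is {}", describe(valid));
    println!("null is {}", describe(null));
}
```

Neither pointer is dereferenced here, which is why the whole program stays in safe Rust.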
120
121Listing 19-1 shows how to create an immutable and a mutable raw pointer from
122references.
123
```
let mut num = 5;

let r1 = &num as *const i32;
let r2 = &mut num as *mut i32;
```
130
131Listing 19-1: Creating raw pointers from references
132
133Notice that we don’t include the `unsafe` keyword in this code. We can create
134raw pointers in safe code; we just can’t dereference raw pointers outside an
135unsafe block, as you’ll see in a bit.
136
137We’ve created raw pointers by using `as` to cast an immutable and a mutable
138reference into their corresponding raw pointer types. Because we created them
139directly from references guaranteed to be valid, we know these particular raw
140pointers are valid, but we can’t make that assumption about just any raw
141pointer.
142
143To demonstrate this, next we’ll create a raw pointer whose validity we can’t be
144so certain of. Listing 19-2 shows how to create a raw pointer to an arbitrary
145location in memory. Trying to use arbitrary memory is undefined: there might be
146data at that address or there might not, the compiler might optimize the code
147so there is no memory access, or the program might error with a segmentation
148fault. Usually, there is no good reason to write code like this, but it is
149possible.
150
```
let address = 0x012345usize;
let r = address as *const i32;
```
155
156Listing 19-2: Creating a raw pointer to an arbitrary memory address
157
158Recall that we can create raw pointers in safe code, but we can’t *dereference*
159raw pointers and read the data being pointed to. In Listing 19-3, we use the
160dereference operator `*` on a raw pointer that requires an `unsafe` block.
161
```
let mut num = 5;

let r1 = &num as *const i32;
let r2 = &mut num as *mut i32;

unsafe {
    println!("r1 is: {}", *r1);
    println!("r2 is: {}", *r2);
}
```
173
174Listing 19-3: Dereferencing raw pointers within an `unsafe` block
175
176Creating a pointer does no harm; it’s only when we try to access the value that
177it points at that we might end up dealing with an invalid value.
178
179Note also that in Listing 19-1 and 19-3, we created `*const i32` and `*mut i32`
180raw pointers that both pointed to the same memory location, where `num` is
181stored. If we instead tried to create an immutable and a mutable reference to
182`num`, the code would not have compiled because Rust’s ownership rules don’t
183allow a mutable reference at the same time as any immutable references. With
184raw pointers, we can create a mutable pointer and an immutable pointer to the
185same location and change data through the mutable pointer, potentially creating
186a data race. Be careful!
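
As a rough sketch of changing data through a mutable raw pointer while an immutable raw pointer to the same location exists, consider the hypothetical `write_then_read` helper below. One assumption differs from the listings: both raw pointers are derived from a single mutable borrow, which keeps the aliasing easy to reason about in a single-threaded example.

```rust
// Hypothetical helper: writes through a mutable raw pointer, then reads the
// same location through an immutable raw pointer.
fn write_then_read(num: &mut i32, new_value: i32) -> i32 {
    // Derive both raw pointers from one mutable borrow.
    let writer = num as *mut i32;
    let reader = writer as *const i32;

    unsafe {
        // Both pointers refer to the same `i32`, so the write is visible
        // through the reader.
        *writer = new_value;
        *reader
    }
}

fn main() {
    let mut num = 5;
    let seen = write_then_read(&mut num, 6);
    println!("read back: {}, num is now: {}", seen, num);
}
```

Nothing in the compiler stops two such pointers from being used on different threads at once, which is where the data race risk comes from.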
187
188With all of these dangers, why would you ever use raw pointers? One major use
189case is when interfacing with C code, as you’ll see in the next section,
190“Calling an Unsafe Function or Method.” Another case is when building up safe
191abstractions that the borrow checker doesn’t understand. We’ll introduce unsafe
192functions and then look at an example of a safe abstraction that uses unsafe
193code.
194
195### Calling an Unsafe Function or Method
196
197The second type of operation you can perform in an unsafe block is calling
198unsafe functions. Unsafe functions and methods look exactly like regular
199functions and methods, but they have an extra `unsafe` before the rest of the
200definition. The `unsafe` keyword in this context indicates the function has
201requirements we need to uphold when we call this function, because Rust can’t
202guarantee we’ve met these requirements. By calling an unsafe function within an
203`unsafe` block, we’re saying that we’ve read this function’s documentation and
204take responsibility for upholding the function’s contracts.
205
206Here is an unsafe function named `dangerous` that doesn’t do anything in its
207body:
208
```
unsafe fn dangerous() {}

unsafe {
    dangerous();
}
```
216
217We must call the `dangerous` function within a separate `unsafe` block. If we
218try to call `dangerous` without the `unsafe` block, we’ll get an error:
219
```
error[E0133]: call to unsafe function is unsafe and requires unsafe function or block
 --> src/main.rs:4:5
  |
4 |     dangerous();
  |     ^^^^^^^^^^^ call to unsafe function
  |
  = note: consult the function's documentation for information on how to avoid undefined behavior
```
229
230With the `unsafe` block, we’re asserting to Rust that we’ve read the function’s
231documentation, we understand how to use it properly, and we’ve verified that
232we’re fulfilling the contract of the function.
233
234Bodies of unsafe functions are effectively `unsafe` blocks, so to perform other
235unsafe operations within an unsafe function, we don’t need to add another
236`unsafe` block.
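
Here is a small sketch of an unsafe function whose body performs an unsafe operation directly. The `read_value` name and its safety contract are assumptions made up for this example; the point is only that the dereference needs no inner `unsafe` block, while the call site still does.

```rust
/// Reads the `i32` behind `ptr`.
///
/// # Safety
///
/// The caller must ensure `ptr` is non-null, properly aligned, and points to
/// an initialized `i32` that is valid for reads.
unsafe fn read_value(ptr: *const i32) -> i32 {
    // No inner `unsafe` block: the body of an unsafe function already
    // permits unsafe operations such as this dereference.
    *ptr
}

fn main() {
    let x = 42;
    // The pointer comes from a live reference, so the contract is upheld.
    let value = unsafe { read_value(&x as *const i32) };
    println!("value is: {}", value);
}
```

Documenting the requirements in a `# Safety` section is a common convention that tells callers what they are promising when they write the `unsafe` block.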
237
238#### Creating a Safe Abstraction over Unsafe Code
239
240Just because a function contains unsafe code doesn’t mean we need to mark the
241entire function as unsafe. In fact, wrapping unsafe code in a safe function is
242a common abstraction. As an example, let’s study the `split_at_mut` function
243from the standard library, which requires some unsafe code. We’ll explore how
244we might implement it. This safe method is defined on mutable slices: it takes
245one slice and makes it two by splitting the slice at the index given as an
246argument. Listing 19-4 shows how to use `split_at_mut`.
247
```
let mut v = vec![1, 2, 3, 4, 5, 6];

let r = &mut v[..];

let (a, b) = r.split_at_mut(3);

assert_eq!(a, &mut [1, 2, 3]);
assert_eq!(b, &mut [4, 5, 6]);
```
258
259Listing 19-4: Using the safe `split_at_mut` function
260
261We can’t implement this function using only safe Rust. An attempt might look
262something like Listing 19-5, which won’t compile. For simplicity, we’ll
263implement `split_at_mut` as a function rather than a method and only for slices
264of `i32` values rather than for a generic type `T`.
265
```
fn split_at_mut(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    let len = values.len();

    assert!(mid <= len);

    (&mut values[..mid], &mut values[mid..])
}
```
275
276Listing 19-5: An attempted implementation of `split_at_mut` using only safe Rust
277
278This function first gets the total length of the slice. Then it asserts that
279the index given as a parameter is within the slice by checking whether it’s
280less than or equal to the length. The assertion means that if we pass an index
281that is greater than the length to split the slice at, the function will panic
282before it attempts to use that index.
283
284Then we return two mutable slices in a tuple: one from the start of the
285original slice to the `mid` index and another from `mid` to the end of the
286slice.
287
288When we try to compile the code in Listing 19-5, we’ll get an error:
289
```
error[E0499]: cannot borrow `*values` as mutable more than once at a time
 --> src/main.rs:6:31
  |
1 | fn split_at_mut(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
  |                         - let's call the lifetime of this reference `'1`
...
6 |     (&mut values[..mid], &mut values[mid..])
  |     --------------------------^^^^^^--------
  |     |     |                   |
  |     |     |                   second mutable borrow occurs here
  |     |     first mutable borrow occurs here
  |     returning this value requires that `*values` is borrowed for `'1`
```
304
305Rust’s borrow checker can’t understand that we’re borrowing different parts of
306the slice; it only knows that we’re borrowing from the same slice twice.
307Borrowing different parts of a slice is fundamentally okay because the two
308slices aren’t overlapping, but Rust isn’t smart enough to know this. When we
309know code is okay, but Rust doesn’t, it’s time to reach for unsafe code.
310
311Listing 19-6 shows how to use an `unsafe` block, a raw pointer, and some calls
312to unsafe functions to make the implementation of `split_at_mut` work.
313
```
use std::slice;

fn split_at_mut(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    [1] let len = values.len();
    [2] let ptr = values.as_mut_ptr();

    [3] assert!(mid <= len);

    [4] unsafe {
        (
            [5] slice::from_raw_parts_mut(ptr, mid),
            [6] slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}
```
331
332Listing 19-6: Using unsafe code in the implementation of the `split_at_mut`
333function
334
336Recall from “The Slice Type” section in Chapter 4 that a slice is a pointer to
337some data and the length of the slice. We use the `len` method to get the
338length of a slice [1] and the `as_mut_ptr` method to access the raw pointer of
339a slice [2]. In this case, because we have a mutable slice to `i32` values,
340`as_mut_ptr` returns a raw pointer with the type `*mut i32`, which we’ve stored
341in the variable `ptr`.
342
343We keep the assertion that the `mid` index is within the slice [3]. Then we get
344to the unsafe code [4]: the `slice::from_raw_parts_mut` function takes a raw
345pointer and a length, and it creates a slice. We use it to create a slice that
346starts from `ptr` and is `mid` items long [5]. Then we call the `add` method on
347`ptr` with `mid` as an argument to get a raw pointer that starts at `mid`, and
348we create a slice using that pointer and the remaining number of items after
349`mid` as the length [6].
350
351The function `slice::from_raw_parts_mut` is unsafe because it takes a raw
352pointer and must trust that this pointer is valid. The `add` method on raw
353pointers is also unsafe, because it must trust that the offset location is also
354a valid pointer. Therefore, we had to put an `unsafe` block around our calls to
355`slice::from_raw_parts_mut` and `add` so we could call them. By looking at
356the code and by adding the assertion that `mid` must be less than or equal to
357`len`, we can tell that all the raw pointers used within the `unsafe` block
358will be valid pointers to data within the slice. This is an acceptable and
359appropriate use of `unsafe`.
360
361Note that we don’t need to mark the resulting `split_at_mut` function as
362`unsafe`, and we can call this function from safe Rust. We’ve created a safe
363abstraction to the unsafe code with an implementation of the function that uses
364`unsafe` code in a safe way, because it creates only valid pointers from the
365data this function has access to.
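
The same approach carries over to a generic element type. The sketch below is one possible way to write it; `split_at_mut_generic` is a hypothetical name chosen to avoid confusion with the standard library method, and the body follows Listing 19-6 with `T` in place of `i32`.

```rust
use std::slice;

// Hypothetical generic variant of the function in Listing 19-6.
fn split_at_mut_generic<T>(values: &mut [T], mid: usize) -> (&mut [T], &mut [T]) {
    let len = values.len();
    let ptr = values.as_mut_ptr();

    // Panics before any raw pointer is used if `mid` is out of range.
    assert!(mid <= len);

    unsafe {
        (
            // Covers elements [0, mid).
            slice::from_raw_parts_mut(ptr, mid),
            // Covers elements [mid, len); the two ranges never overlap.
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

fn main() {
    let mut words = vec!["a", "b", "c", "d"];

    // Called from safe code: no `unsafe` block at the call site.
    let (left, right) = split_at_mut_generic(&mut words, 1);
    left[0] = "first";
    right[0] = "second";

    println!("{:?}", words);
}
```

Because both halves can be mutated at the same time, callers get the behavior the borrow checker rejected in Listing 19-5 without writing any `unsafe` themselves.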
366
367In contrast, the use of `slice::from_raw_parts_mut` in Listing 19-7 would
368likely crash when the slice is used. This code takes an arbitrary memory
369location and creates a slice 10,000 items long.
370
```
use std::slice;

let address = 0x01234usize;
let r = address as *mut i32;

let values: &[i32] = unsafe { slice::from_raw_parts_mut(r, 10000) };
```
379
380Listing 19-7: Creating a slice from an arbitrary memory location
381
382We don’t own the memory at this arbitrary location, and there is no guarantee
383that the slice this code creates contains valid `i32` values. Attempting to use
384`values` as though it’s a valid slice results in undefined behavior.
385
386#### Using `extern` Functions to Call External Code
387
388Sometimes, your Rust code might need to interact with code written in another
language. For this, Rust has the keyword `extern` that facilitates the creation
390and use of a *Foreign Function Interface (FFI)*. An FFI is a way for a
391programming language to define functions and enable a different (foreign)
392programming language to call those functions.
393
394Listing 19-8 demonstrates how to set up an integration with the `abs` function
395from the C standard library. Functions declared within `extern` blocks are
396always unsafe to call from Rust code. The reason is that other languages don’t
397enforce Rust’s rules and guarantees, and Rust can’t check them, so
398responsibility falls on the programmer to ensure safety.
399
400Filename: src/main.rs
401
```
extern "C" {
    fn abs(input: i32) -> i32;
}

fn main() {
    unsafe {
        println!("Absolute value of -3 according to C: {}", abs(-3));
    }
}
```
413
414Listing 19-8: Declaring and calling an `extern` function defined in another
415language
416
417Within the `extern "C"` block, we list the names and signatures of external
418functions from another language we want to call. The `"C"` part defines which
419*application binary interface (ABI)* the external function uses: the ABI
420defines how to call the function at the assembly level. The `"C"` ABI is the
421most common and follows the C programming language’s ABI.
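
Since the responsibility for safety falls on the programmer, one option is to wrap the foreign call in a safe function that enforces the foreign function's requirements. The sketch below is an illustration, not part of any library: `checked_c_abs` is a hypothetical wrapper, and it assumes, as Listing 19-8 does, that C's `int` is 32 bits wide on the target platform.

```rust
extern "C" {
    fn abs(input: i32) -> i32;
}

// Hypothetical safe wrapper. C leaves `abs` undefined for the most negative
// `int`, because the result cannot be represented, so that input is rejected
// before the foreign call is made.
fn checked_c_abs(input: i32) -> Option<i32> {
    if input == i32::MIN {
        None
    } else {
        // Every other `i32` has a representable absolute value.
        Some(unsafe { abs(input) })
    }
}

fn main() {
    println!("{:?}", checked_c_abs(-3));
    println!("{:?}", checked_c_abs(i32::MIN));
}
```

Callers of `checked_c_abs` never write `unsafe`; the one `unsafe` block sits next to the check that justifies it.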
422
431> #### Calling Rust Functions from Other Languages
432>
433> We can also use `extern` to create an interface that allows other languages
> to call Rust functions. Instead of creating a whole `extern` block, we add
435> the `extern` keyword and specify the ABI to use just before the `fn` keyword
436> for the relevant function. We also need to add a `#[no_mangle]` annotation to
437> tell the Rust compiler not to mangle the name of this function. *Mangling* is
438> when a compiler changes the name we’ve given a function to a different name
439> that contains more information for other parts of the compilation process to
440> consume but is less human readable. Every programming language compiler
441> mangles names slightly differently, so for a Rust function to be nameable by
442> other languages, we must disable the Rust compiler’s name mangling.
443>
444> In the following example, we make the `call_from_c` function accessible from
445> C code, after it’s compiled to a shared library and linked from C:
446>
> ```
> #[no_mangle]
> pub extern "C" fn call_from_c() {
>     println!("Just called a Rust function from C!");
> }
> ```
453>
454> This usage of `extern` does not require `unsafe`.
455
456### Accessing or Modifying a Mutable Static Variable
457
In this book, we’ve not yet talked about *global variables*, which Rust does
support but which can be problematic with Rust’s ownership rules. If two threads
access the same mutable global variable, a data race can result.
461
462In Rust, global variables are called *static* variables. Listing 19-9 shows an
463example declaration and use of a static variable with a string slice as a
464value.
465
466Filename: src/main.rs
467
```
static HELLO_WORLD: &str = "Hello, world!";

fn main() {
    println!("name is: {}", HELLO_WORLD);
}
```
475
476Listing 19-9: Defining and using an immutable static variable
477
478Static variables are similar to constants, which we discussed in the
479“Differences Between Variables and Constants” section in Chapter 3. The names
480of static variables are in `SCREAMING_SNAKE_CASE` by convention. Static
481variables can only store references with the `'static` lifetime, which means
482the Rust compiler can figure out the lifetime and we aren’t required to
483annotate it explicitly. Accessing an immutable static variable is safe.
484
485A subtle difference between constants and immutable static variables is that
486values in a static variable have a fixed address in memory. Using the value
487will always access the same data. Constants, on the other hand, are allowed to
488duplicate their data whenever they’re used. Another difference is that static
489variables can be mutable. Accessing and modifying mutable static variables is
490*unsafe*. Listing 19-10 shows how to declare, access, and modify a mutable
491static variable named `COUNTER`.
492
493Filename: src/main.rs
494
```
static mut COUNTER: u32 = 0;

fn add_to_count(inc: u32) {
    unsafe {
        COUNTER += inc;
    }
}

fn main() {
    add_to_count(3);

    unsafe {
        println!("COUNTER: {}", COUNTER);
    }
}
```
512
513Listing 19-10: Reading from or writing to a mutable static variable is unsafe
514
515As with regular variables, we specify mutability using the `mut` keyword. Any
516code that reads or writes from `COUNTER` must be within an `unsafe` block. This
517code compiles and prints `COUNTER: 3` as we would expect because it’s single
518threaded. Having multiple threads access `COUNTER` would likely result in data
519races.
520
521With mutable data that is globally accessible, it’s difficult to ensure there
522are no data races, which is why Rust considers mutable static variables to be
523unsafe. Where possible, it’s preferable to use the concurrency techniques and
524thread-safe smart pointers we discussed in Chapter 16 so the compiler checks
525that data accessed from different threads is done safely.
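
As one possible alternative to Listing 19-10, the sketch below keeps the global counter but stores it in an atomic type, so the static itself is immutable and no `unsafe` is needed. Choosing `AtomicU32` with `Ordering::SeqCst` is an assumption made for this illustration, and `current_count` is a helper added here; a `Mutex` would also work.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

// An immutable static holding an atomic: no `static mut`, no `unsafe`.
static COUNTER: AtomicU32 = AtomicU32::new(0);

fn add_to_count(inc: u32) {
    // `fetch_add` performs the read-modify-write as one atomic step.
    COUNTER.fetch_add(inc, Ordering::SeqCst);
}

// Helper added for this sketch: reads the current total.
fn current_count() -> u32 {
    COUNTER.load(Ordering::SeqCst)
}

fn main() {
    // Several threads update the same global without a data race.
    let handles: Vec<_> = (0..4)
        .map(|_| thread::spawn(|| add_to_count(3)))
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    println!("COUNTER: {}", current_count());
}
```

Here the compiler checks the thread safety for us, because a static's type must be `Sync` and `AtomicU32` is.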
526
527### Implementing an Unsafe Trait
528
529We can use `unsafe` to implement an unsafe trait. A trait is unsafe when at
530least one of its methods has some invariant that the compiler can’t verify. We
531declare that a trait is `unsafe` by adding the `unsafe` keyword before `trait`
532and marking the implementation of the trait as `unsafe` too, as shown in
533Listing 19-11.
534
```
unsafe trait Foo {
    // methods go here
}

unsafe impl Foo for i32 {
    // method implementations go here
}

fn main() {}
```
546
547Listing 19-11: Defining and implementing an unsafe trait
548
549By using `unsafe impl`, we’re promising that we’ll uphold the invariants that
550the compiler can’t verify.
551
552As an example, recall the `Sync` and `Send` marker traits we discussed in the
553“Extensible Concurrency with the `Sync` and `Send` Traits” section in Chapter
55416: the compiler implements these traits automatically if our types are
555composed entirely of `Send` and `Sync` types. If we implement a type that
556contains a type that is not `Send` or `Sync`, such as raw pointers, and we want
557to mark that type as `Send` or `Sync`, we must use `unsafe`. Rust can’t verify
558that our type upholds the guarantees that it can be safely sent across threads
559or accessed from multiple threads; therefore, we need to do those checks
560manually and indicate as such with `unsafe`.
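
The following sketch shows what marking such a type as `Send` might look like. `OwnedInt` is a hypothetical type invented for this example: it holds a raw pointer to a heap allocation that it alone owns, which is the manual check that justifies the `unsafe impl`.

```rust
use std::thread;

// Hypothetical type: the raw pointer field makes it `!Send` by default.
struct OwnedInt {
    ptr: *mut i32,
}

impl OwnedInt {
    fn new(value: i32) -> OwnedInt {
        // The allocation is created here and never shared elsewhere.
        OwnedInt {
            ptr: Box::into_raw(Box::new(value)),
        }
    }

    fn get(&self) -> i32 {
        // Valid because `ptr` came from `Box::into_raw` and is freed only
        // in `drop`.
        unsafe { *self.ptr }
    }
}

impl Drop for OwnedInt {
    fn drop(&mut self) {
        // Rebuild the `Box` so the allocation is released exactly once.
        unsafe { drop(Box::from_raw(self.ptr)) }
    }
}

// Our promise to the compiler: `OwnedInt` uniquely owns its allocation, so
// moving it to another thread cannot create shared unsynchronized access.
unsafe impl Send for OwnedInt {}

fn main() {
    let owned = OwnedInt::new(5);
    let handle = thread::spawn(move || owned.get());
    println!("read on another thread: {}", handle.join().unwrap());
}
```

Without the `unsafe impl Send` line, the call to `thread::spawn` would be rejected, because the compiler cannot see that the pointer is uniquely owned.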
561
562### Accessing Fields of a Union
563
564The final action that works only with `unsafe` is accessing fields of a
565*union*. A `union` is similar to a `struct`, but only one declared field is
566used in a particular instance at one time. Unions are primarily used to
567interface with unions in C code. Accessing union fields is unsafe because Rust
568can’t guarantee the type of the data currently being stored in the union
569instance. You can learn more about unions in the Rust Reference at
570*https://doc.rust-lang.org/reference/items/unions.html*.
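
As a brief illustration, the sketch below defines a hypothetical `IntOrFloat` union and reads a field other than the one that was written. The names are invented for this example. Reading the `u32` field is sound here only because every 32-bit pattern is a valid `u32`.

```rust
// Hypothetical union for this sketch: both fields share the same four bytes.
#[repr(C)]
union IntOrFloat {
    i: u32,
    f: f32,
}

// Returns the raw bit pattern of an `f32` by writing one field and reading
// the other.
fn float_bits(value: f32) -> u32 {
    // Creating a union and writing a field is safe.
    let u = IntOrFloat { f: value };
    // Reading a field is unsafe: Rust can't know which field was last
    // written.
    unsafe { u.i }
}

fn main() {
    println!("bits of 1.0: {:#010x}", float_bits(1.0));
}
```

In everyday code, `f32::to_bits` does the same job without any `unsafe`; the union form is mainly useful when matching a layout that C code defines.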
571
572### When to Use Unsafe Code
573
574Using `unsafe` to use one of the five superpowers just discussed isn’t wrong or
575even frowned upon, but it is trickier to get `unsafe` code correct because the
576compiler can’t help uphold memory safety. When you have a reason to use
577`unsafe` code, you can do so, and having the explicit `unsafe` annotation makes
578it easier to track down the source of problems when they occur.
579
580## Advanced Traits
581
582We first covered traits in the “Traits: Defining Shared Behavior” section of
583Chapter 10, but we didn’t discuss the more advanced details. Now that you know
584more about Rust, we can get into the nitty-gritty.
585
586### Specifying Placeholder Types in Trait Definitions with Associated Types
587
588*Associated types* connect a type placeholder with a trait such that the trait
589method definitions can use these placeholder types in their signatures. The
590implementor of a trait will specify the concrete type to be used instead of the
591placeholder type for the particular implementation. That way, we can define a
592trait that uses some types without needing to know exactly what those types are
593until the trait is implemented.
594
595We’ve described most of the advanced features in this chapter as being rarely
596needed. Associated types are somewhere in the middle: they’re used more rarely
597than features explained in the rest of the book but more commonly than many of
598the other features discussed in this chapter.
599
600One example of a trait with an associated type is the `Iterator` trait that the
601standard library provides. The associated type is named `Item` and stands in
602for the type of the values the type implementing the `Iterator` trait is
603iterating over. The definition of the `Iterator` trait is as shown in Listing
60419-12.
605
```
pub trait Iterator {
    type Item;

    fn next(&mut self) -> Option<Self::Item>;
}
```
613
614Listing 19-12: The definition of the `Iterator` trait that has an associated
615type `Item`
616
617The type `Item` is a placeholder, and the `next` method’s definition shows that
618it will return values of type `Option<Self::Item>`. Implementors of the
619`Iterator` trait will specify the concrete type for `Item`, and the `next`
620method will return an `Option` containing a value of that concrete type.
621
622Associated types might seem like a similar concept to generics, in that the
623latter allow us to define a function without specifying what types it can
624handle. To examine the difference between the two concepts, we’ll look at an
625implementation of the `Iterator` trait on a type named `Counter` that specifies
626the `Item` type is `u32`:
627
628Filename: src/lib.rs
629
```
impl Iterator for Counter {
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        // --snip--
```
637
638This syntax seems comparable to that of generics. So why not just define the
639`Iterator` trait with generics, as shown in Listing 19-13?
640
```
pub trait Iterator<T> {
    fn next(&mut self) -> Option<T>;
}
```
646
647Listing 19-13: A hypothetical definition of the `Iterator` trait using generics
648
649The difference is that when using generics, as in Listing 19-13, we must
650annotate the types in each implementation; because we can also implement
651`Iterator<String> for Counter` or any other type, we could have multiple
652implementations of `Iterator` for `Counter`. In other words, when a trait has a
653generic parameter, it can be implemented for a type multiple times, changing
654the concrete types of the generic type parameters each time. When we use the
655`next` method on `Counter`, we would have to provide type annotations to
656indicate which implementation of `Iterator` we want to use.
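
To see why annotations become necessary, here is a sketch that uses a stand-in trait. `GenericIterator` and `next_item` are hypothetical names (chosen so they don't clash with the real `Iterator`), and the `Counter` struct with a single `count` field is an assumption for this example.

```rust
// Hypothetical stand-in for the generic trait in Listing 19-13.
trait GenericIterator<T> {
    fn next_item(&mut self) -> Option<T>;
}

struct Counter {
    count: u32,
}

// First implementation: yields numbers.
impl GenericIterator<u32> for Counter {
    fn next_item(&mut self) -> Option<u32> {
        self.count += 1;
        Some(self.count)
    }
}

// Second implementation on the same type: yields strings.
impl GenericIterator<String> for Counter {
    fn next_item(&mut self) -> Option<String> {
        self.count += 1;
        Some(self.count.to_string())
    }
}

fn main() {
    let mut counter = Counter { count: 0 };

    // Each call needs a type annotation to pick an implementation.
    let number: Option<u32> = counter.next_item();
    let text: Option<String> = counter.next_item();

    // Fully qualified syntax is another way to choose.
    let again = <Counter as GenericIterator<u32>>::next_item(&mut counter);

    println!("{:?} {:?} {:?}", number, text, again);
}
```

Removing the annotation from either `let` leaves the compiler unable to decide which `next_item` is meant.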
657
658With associated types, we don’t need to annotate types because we can’t
659implement a trait on a type multiple times. In Listing 19-12 with the
660definition that uses associated types, we can only choose what the type of
661`Item` will be once, because there can only be one `impl Iterator for Counter`.
662We don’t have to specify that we want an iterator of `u32` values everywhere
663that we call `next` on `Counter`.
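
For comparison, here is a complete sketch of the associated-type version. The `Counter` struct, its `count` field, and the choice to stop after five values are assumptions filled in for this example, since the snippet above elides the body of `next`.

```rust
struct Counter {
    count: u32,
}

impl Counter {
    fn new() -> Counter {
        Counter { count: 0 }
    }
}

impl Iterator for Counter {
    // Chosen once: there can be only one `impl Iterator for Counter`.
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        // Assumed behavior for this sketch: count from 1 up to 5, then stop.
        if self.count < 5 {
            self.count += 1;
            Some(self.count)
        } else {
            None
        }
    }
}

fn main() {
    let mut counter = Counter::new();

    // No annotation needed: `next` can only return `Option<u32>` here.
    let first = counter.next();
    println!("first: {:?}", first);

    for value in counter {
        println!("then: {}", value);
    }
}
```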
664
665Associated types also become part of the trait’s contract: implementors of the
666trait must provide a type to stand in for the associated type placeholder.
667Associated types often have a name that describes how the type will be used,
668and documenting the associated type in the API documentation is good practice.
669
677### Default Generic Type Parameters and Operator Overloading
678
679When we use generic type parameters, we can specify a default concrete type for
680the generic type. This eliminates the need for implementors of the trait to
681specify a concrete type if the default type works. You specify a default type
682when declaring a generic type with the `<PlaceholderType=ConcreteType>` syntax.

A great example of a situation where this technique is useful is with *operator
overloading*, in which you customize the behavior of an operator (such as `+`)
in particular situations.
687
688Rust doesn’t allow you to create your own operators or overload arbitrary
689operators. But you can overload the operations and corresponding traits listed
690in `std::ops` by implementing the traits associated with the operator. For
691example, in Listing 19-14 we overload the `+` operator to add two `Point`
692instances together. We do this by implementing the `Add` trait on a `Point`
693struct:
694
695Filename: src/main.rs
696
```
use std::ops::Add;

#[derive(Debug, Copy, Clone, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

impl Add for Point {
    type Output = Point;

    fn add(self, other: Point) -> Point {
        Point {
            x: self.x + other.x,
            y: self.y + other.y,
        }
    }
}

fn main() {
    assert_eq!(
        Point { x: 1, y: 0 } + Point { x: 2, y: 3 },
        Point { x: 3, y: 3 }
    );
}
```
724
725Listing 19-14: Implementing the `Add` trait to overload the `+` operator for
726`Point` instances
727
728The `add` method adds the `x` values of two `Point` instances and the `y`
729values of two `Point` instances to create a new `Point`. The `Add` trait has an
730associated type named `Output` that determines the type returned from the `add`
731method.
732
733The default generic type in this code is within the `Add` trait. Here is its
734definition:
735
```
trait Add<Rhs=Self> {
    type Output;

    fn add(self, rhs: Rhs) -> Self::Output;
}
```
743
744This code should look generally familiar: a trait with one method and an
745associated type. The new part is `Rhs=Self`: this syntax is called *default
746type parameters*. The `Rhs` generic type parameter (short for “right hand
747side”) defines the type of the `rhs` parameter in the `add` method. If we don’t
748specify a concrete type for `Rhs` when we implement the `Add` trait, the type
749of `Rhs` will default to `Self`, which will be the type we’re implementing
750`Add` on.
751
752When we implemented `Add` for `Point`, we used the default for `Rhs` because we
753wanted to add two `Point` instances. Let’s look at an example of implementing
754the `Add` trait where we want to customize the `Rhs` type rather than using the
755default.
756
757We have two structs, `Millimeters` and `Meters`, holding values in different
758units. This thin wrapping of an existing type in another struct is known as the
759*newtype pattern*, which we describe in more detail in the “Using the Newtype
760Pattern to Implement External Traits on External Types” section. We want to add
761values in millimeters to values in meters and have the implementation of `Add`
762do the conversion correctly. We can implement `Add` for `Millimeters` with
763`Meters` as the `Rhs`, as shown in Listing 19-15.
764
765Filename: src/lib.rs
766
```
use std::ops::Add;

struct Millimeters(u32);
struct Meters(u32);

impl Add<Meters> for Millimeters {
    type Output = Millimeters;

    fn add(self, other: Meters) -> Millimeters {
        Millimeters(self.0 + (other.0 * 1000))
    }
}
```
781
782Listing 19-15: Implementing the `Add` trait on `Millimeters` to add
783`Millimeters` to `Meters`
784
785To add `Millimeters` and `Meters`, we specify `impl Add<Meters>` to set the
786value of the `Rhs` type parameter instead of using the default of `Self`.
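
As a minimal sketch of how the default and a customized `Rhs` can coexist on
one type, the following adds a second `impl Add for Millimeters` that is not
part of Listing 19-15. The extra derives and the `total_length` helper are
assumptions made for illustration only.

```rust
use std::ops::Add;

#[derive(Debug, PartialEq, Clone, Copy)]
struct Millimeters(u32);

#[derive(Debug, PartialEq, Clone, Copy)]
struct Meters(u32);

// Uses the default `Rhs = Self`: Millimeters + Millimeters.
impl Add for Millimeters {
    type Output = Millimeters;

    fn add(self, other: Millimeters) -> Millimeters {
        Millimeters(self.0 + other.0)
    }
}

// Overrides the default: Millimeters + Meters, converting meters first.
impl Add<Meters> for Millimeters {
    type Output = Millimeters;

    fn add(self, other: Meters) -> Millimeters {
        Millimeters(self.0 + (other.0 * 1000))
    }
}

// Hypothetical helper: the type of each right-hand operand selects the impl.
fn total_length() -> Millimeters {
    Millimeters(500) + Meters(2) + Millimeters(25)
}
```

The compiler chooses between the two implementations purely from the type on
the right-hand side of each `+`.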
787
788You’ll use default type parameters in two main ways:
789
790* To extend a type without breaking existing code
791* To allow customization in specific cases most users won’t need
792
793The standard library’s `Add` trait is an example of the second purpose:
794usually, you’ll add two like types, but the `Add` trait provides the ability to
795customize beyond that. Using a default type parameter in the `Add` trait
796definition means you don’t have to specify the extra parameter most of the
797time. In other words, a bit of implementation boilerplate isn’t needed, making
798it easier to use the trait.
799
800The first purpose is similar to the second but in reverse: if you want to add a
801type parameter to an existing trait, you can give it a default to allow
802extension of the functionality of the trait without breaking the existing
803implementation code.
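
To illustrate the first purpose, here is a sketch built around a hypothetical
`Scale` trait (not from the standard library). Imagine the `Factor` parameter
was added in a later version of the trait: because it defaults to `u32`, an
older `impl Scale for Width` that never mentions `Factor` keeps compiling.

```rust
// Hypothetical trait; `Factor` defaults to `u32`.
trait Scale<Factor = u32> {
    fn scale(&self, factor: Factor) -> Self;
}

#[derive(Debug, PartialEq)]
struct Width(u32);

// An "existing" implementation written without naming `Factor`.
impl Scale for Width {
    fn scale(&self, factor: u32) -> Self {
        Width(self.0 * factor)
    }
}

// A newer implementation that opts in to a different parameter type.
impl Scale<f64> for Width {
    fn scale(&self, factor: f64) -> Self {
        Width((self.0 as f64 * factor).round() as u32)
    }
}
```

Calling `Width(10).scale(3_u32)` uses the first implementation and
`Width(10).scale(1.5_f64)` uses the second; the literal suffixes are there to
make the choice explicit.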
804
805### Fully Qualified Syntax for Disambiguation: Calling Methods with the Same Name
806
807Nothing in Rust prevents a trait from having a method with the same name as
808another trait’s method, nor does Rust prevent you from implementing both traits
809on one type. It’s also possible to implement a method directly on the type with
810the same name as methods from traits.
811
812When calling methods with the same name, you’ll need to tell Rust which one you
813want to use. Consider the code in Listing 19-16 where we’ve defined two traits,
814`Pilot` and `Wizard`, that both have a method called `fly`. We then implement
815both traits on a type `Human` that already has a method named `fly` implemented
816on it. Each `fly` method does something different.
817
818Filename: src/main.rs
819
```
trait Pilot {
    fn fly(&self);
}

trait Wizard {
    fn fly(&self);
}

struct Human;

impl Pilot for Human {
    fn fly(&self) {
        println!("This is your captain speaking.");
    }
}

impl Wizard for Human {
    fn fly(&self) {
        println!("Up!");
    }
}

impl Human {
    fn fly(&self) {
        println!("*waving arms furiously*");
    }
}
```
849
850Listing 19-16: Two traits are defined to have a `fly` method and are
851implemented on the `Human` type, and a `fly` method is implemented on `Human`
852directly
853
854When we call `fly` on an instance of `Human`, the compiler defaults to calling
855the method that is directly implemented on the type, as shown in Listing 19-17.
856
857Filename: src/main.rs
858
```
fn main() {
    let person = Human;
    person.fly();
}
```
865
866Listing 19-17: Calling `fly` on an instance of `Human`
867
868Running this code will print `*waving arms furiously*`, showing that Rust
869called the `fly` method implemented on `Human` directly.
870
871To call the `fly` methods from either the `Pilot` trait or the `Wizard` trait,
872we need to use more explicit syntax to specify which `fly` method we mean.
873Listing 19-18 demonstrates this syntax.
874
875Filename: src/main.rs
876
```
fn main() {
    let person = Human;
    Pilot::fly(&person);
    Wizard::fly(&person);
    person.fly();
}
```
885
886Listing 19-18: Specifying which trait’s `fly` method we want to call
887
888Specifying the trait name before the method name clarifies to Rust which
889implementation of `fly` we want to call. We could also write
890`Human::fly(&person)`, which is equivalent to the `person.fly()` that we used
891in Listing 19-18, but this is a bit longer to write if we don’t need to
892disambiguate.
893
894Running this code prints the following:
895
```
$ cargo run
   Compiling traits-example v0.1.0 (file:///projects/traits-example)
    Finished dev [unoptimized + debuginfo] target(s) in 0.46s
     Running `target/debug/traits-example`
This is your captain speaking.
Up!
*waving arms furiously*
```
905
906Because the `fly` method takes a `self` parameter, if we had two *types* that
907both implement one *trait*, Rust could figure out which implementation of a
908trait to use based on the type of `self`.
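
A small sketch of that situation follows. The `Robot` type and the
`announcements` helper are hypothetical, and `fly` returns a `String` here
(rather than printing) only so the result is easy to inspect.

```rust
trait Pilot {
    fn fly(&self) -> String;
}

struct Human;
struct Robot; // hypothetical second implementor

impl Pilot for Human {
    fn fly(&self) -> String {
        String::from("This is your captain speaking.")
    }
}

impl Pilot for Robot {
    fn fly(&self) -> String {
        String::from("Autopilot engaged.")
    }
}

fn announcements() -> (String, String) {
    // Same path both times: the type of the `self` argument picks the impl.
    (Pilot::fly(&Human), Pilot::fly(&Robot))
}
```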
909
910However, associated functions that are not methods don’t have a `self`
911parameter. When there are multiple types or traits that define non-method
912functions with the same function name, Rust doesn't always know which type you
913mean unless you use *fully qualified syntax*. For example, in Listing 19-19 we
914create a trait for an animal shelter that wants to name all baby dogs *Spot*.
915We make an `Animal` trait with an associated non-method function `baby_name`.
916The `Animal` trait is implemented for the struct `Dog`, on which we also
917provide an associated non-method function `baby_name` directly.
918
919Filename: src/main.rs
920
```
trait Animal {
    fn baby_name() -> String;
}

struct Dog;

impl Dog {
    fn baby_name() -> String {
        String::from("Spot")
    }
}

impl Animal for Dog {
    fn baby_name() -> String {
        String::from("puppy")
    }
}

fn main() {
    println!("A baby dog is called a {}", Dog::baby_name());
}
```
944
945Listing 19-19: A trait with an associated function and a type with an
946associated function of the same name that also implements the trait
947
948We implement the code for naming all puppies Spot in the `baby_name` associated
949function that is defined on `Dog`. The `Dog` type also implements the trait
950`Animal`, which describes characteristics that all animals have. Baby dogs are
951called puppies, and that is expressed in the implementation of the `Animal`
952trait on `Dog` in the `baby_name` function associated with the `Animal` trait.
953
954In `main`, we call the `Dog::baby_name` function, which calls the associated
955function defined on `Dog` directly. This code prints the following:
956
```
A baby dog is called a Spot
```
960
961This output isn’t what we wanted. We want to call the `baby_name` function that
962is part of the `Animal` trait that we implemented on `Dog` so the code prints
963`A baby dog is called a puppy`. The technique of specifying the trait name that
964we used in Listing 19-18 doesn’t help here; if we change `main` to the code in
965Listing 19-20, we’ll get a compilation error.
966
967Filename: src/main.rs
968
```
fn main() {
    println!("A baby dog is called a {}", Animal::baby_name());
}
```
974
975Listing 19-20: Attempting to call the `baby_name` function from the `Animal`
976trait, but Rust doesn’t know which implementation to use
977
978Because `Animal::baby_name` doesn’t have a `self` parameter, and there could be
979other types that implement the `Animal` trait, Rust can’t figure out which
980implementation of `Animal::baby_name` we want. We’ll get this compiler error:
981
```
error[E0283]: type annotations needed
  --> src/main.rs:20:43
   |
20 |     println!("A baby dog is called a {}", Animal::baby_name());
   |                                           ^^^^^^^^^^^^^^^^^ cannot infer type
   |
   = note: cannot satisfy `_: Animal`
```
991
992To disambiguate and tell Rust that we want to use the implementation of
993`Animal` for `Dog` as opposed to the implementation of `Animal` for some other
994type, we need to use fully qualified syntax. Listing 19-21 demonstrates how to
995use fully qualified syntax.
996
997Filename: src/main.rs
998
```
fn main() {
    println!("A baby dog is called a {}", <Dog as Animal>::baby_name());
}
```
1004
1005Listing 19-21: Using fully qualified syntax to specify that we want to call the
1006`baby_name` function from the `Animal` trait as implemented on `Dog`
1007
We’re providing Rust with a type annotation within the angle brackets, which
indicates we want to call the `baby_name` function from the `Animal` trait as
implemented on `Dog` by saying that we want to treat the `Dog` type as an
`Animal` for this function call. This code will now print what we want:
1012
```
A baby dog is called a puppy
```
1016
1017In general, fully qualified syntax is defined as follows:
1018
```
<Type as Trait>::function(receiver_if_method, next_arg, ...);
```
1022
1023For associated functions that aren’t methods, there would not be a `receiver`:
1024there would only be the list of other arguments. You could use fully qualified
1025syntax everywhere that you call functions or methods. However, you’re allowed
1026to omit any part of this syntax that Rust can figure out from other information
1027in the program. You only need to use this more verbose syntax in cases where
1028there are multiple implementations that use the same name and Rust needs help
1029to identify which implementation you want to call.
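
The general form also works for methods that do take a receiver. The sketch
below reuses the `Pilot`, `Wizard`, and `Human` names from Listing 19-16; the
`all_three` helper is hypothetical, and the methods return string slices
instead of printing so the results can be compared.

```rust
trait Pilot {
    fn fly(&self) -> &'static str;
}

trait Wizard {
    fn fly(&self) -> &'static str;
}

struct Human;

impl Pilot for Human {
    fn fly(&self) -> &'static str {
        "This is your captain speaking."
    }
}

impl Wizard for Human {
    fn fly(&self) -> &'static str {
        "Up!"
    }
}

impl Human {
    fn fly(&self) -> &'static str {
        "*waving arms furiously*"
    }
}

fn all_three(person: &Human) -> [&'static str; 3] {
    [
        // Fully qualified: the receiver is passed as the first argument.
        <Human as Pilot>::fly(person),
        <Human as Wizard>::fly(person),
        // Inherent method: there is no trait to name, so the type is enough.
        Human::fly(person),
    ]
}
```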
1030
1031### Using Supertraits to Require One Trait’s Functionality Within Another Trait
1032
1033Sometimes, you might write a trait definition that depends on another trait:
1034for a type to implement the first trait, you want to require that type to also
1035implement the second trait. You would do this so that your trait definition can
1036make use of the associated items of the second trait. The trait your trait
1037definition is relying on is called a *supertrait* of your trait.
1038
1039For example, let’s say we want to make an `OutlinePrint` trait with an
1040`outline_print` method that will print a given value formatted so that it's
1041framed in asterisks. That is, given a `Point` struct that implements the
1042standard library trait `Display` to result in `(x, y)`, when we
1043call `outline_print` on a `Point` instance that has `1` for `x` and `3` for
1044`y`, it should print the following:
1045
```
**********
*        *
* (1, 3) *
*        *
**********
```
1053
1054In the implementation of the `outline_print` method, we want to use the
1055`Display` trait’s functionality. Therefore, we need to specify that the
1056`OutlinePrint` trait will work only for types that also implement `Display` and
1057provide the functionality that `OutlinePrint` needs. We can do that in the
1058trait definition by specifying `OutlinePrint: Display`. This technique is
1059similar to adding a trait bound to the trait. Listing 19-22 shows an
1060implementation of the `OutlinePrint` trait.
1061
1062Filename: src/main.rs
1063
```
use std::fmt;

trait OutlinePrint: fmt::Display {
    fn outline_print(&self) {
        let output = self.to_string();
        let len = output.len();
        println!("{}", "*".repeat(len + 4));
        println!("*{}*", " ".repeat(len + 2));
        println!("* {} *", output);
        println!("*{}*", " ".repeat(len + 2));
        println!("{}", "*".repeat(len + 4));
    }
}
```
1079
1080Listing 19-22: Implementing the `OutlinePrint` trait that requires the
1081functionality from `Display`
1082
1083Because we’ve specified that `OutlinePrint` requires the `Display` trait, we
1084can use the `to_string` function that is automatically implemented for any type
1085that implements `Display`. If we tried to use `to_string` without adding a
1086colon and specifying the `Display` trait after the trait name, we’d get an
1087error saying that no method named `to_string` was found for the type `&Self` in
1088the current scope.
1089
1090Let’s see what happens when we try to implement `OutlinePrint` on a type that
1091doesn’t implement `Display`, such as the `Point` struct:
1092
1093Filename: src/main.rs
1094
```
struct Point {
    x: i32,
    y: i32,
}

impl OutlinePrint for Point {}
```
1103
1104We get an error saying that `Display` is required but not implemented:
1105
```
error[E0277]: `Point` doesn't implement `std::fmt::Display`
  --> src/main.rs:20:6
   |
20 | impl OutlinePrint for Point {}
   |      ^^^^^^^^^^^^ `Point` cannot be formatted with the default formatter
   |
   = help: the trait `std::fmt::Display` is not implemented for `Point`
   = note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead
note: required by a bound in `OutlinePrint`
  --> src/main.rs:3:21
   |
3  | trait OutlinePrint: fmt::Display {
   |                     ^^^^^^^^^^^^ required by this bound in `OutlinePrint`
```
1121
1122To fix this, we implement `Display` on `Point` and satisfy the constraint that
1123`OutlinePrint` requires, like so:
1124
1125Filename: src/main.rs
1126
```
use std::fmt;

impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}
```
1136
1137Then implementing the `OutlinePrint` trait on `Point` will compile
1138successfully, and we can call `outline_print` on a `Point` instance to display
1139it within an outline of asterisks.
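
Putting the pieces together, one possible complete version looks like the
sketch below. The `outline` method that returns a `String` is our own
variation, added only so the framed text can be checked without capturing
standard output; `outline_print` still prints it.

```rust
use std::fmt;

trait OutlinePrint: fmt::Display {
    // Hypothetical variation: build the outline as a `String`.
    fn outline(&self) -> String {
        // `to_string` is available because `Display` is a supertrait.
        let output = self.to_string();
        let len = output.len();
        let border = "*".repeat(len + 4);
        let padding = format!("*{}*", " ".repeat(len + 2));
        format!(
            "{}\n{}\n* {} *\n{}\n{}",
            border, padding, output, padding, border
        )
    }

    fn outline_print(&self) {
        println!("{}", self.outline());
    }
}

struct Point {
    x: i32,
    y: i32,
}

impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}

// Compiles only because `Point` implements `Display`.
impl OutlinePrint for Point {}
```

With these definitions, `Point { x: 1, y: 3 }.outline_print()` displays the
point inside the asterisk frame described earlier.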
1140
1141### Using the Newtype Pattern to Implement External Traits on External Types
1142
In Chapter 10 in the “Implementing a Trait on a Type” section, we mentioned the
orphan rule that states we’re only allowed to implement a trait on a type if
either the trait or the type is local to our crate. It’s possible to get
around this restriction using the *newtype pattern*, which involves creating a
new type in a tuple struct. (We covered tuple structs in the “Using Tuple
Structs without Named Fields to Create Different Types” section of Chapter 5.)
The tuple struct will have one field and be a thin wrapper around the type we
want to implement a trait for. Then the wrapper type is local to our crate, and
we can implement the trait on the wrapper. *Newtype* is a term that originates
from the Haskell programming language. There is no runtime performance penalty
for using this pattern, and the wrapper type is elided at compile time.
1155
1156As an example, let’s say we want to implement `Display` on `Vec<T>`, which the
1157orphan rule prevents us from doing directly because the `Display` trait and the
1158`Vec<T>` type are defined outside our crate. We can make a `Wrapper` struct
1159that holds an instance of `Vec<T>`; then we can implement `Display` on
1160`Wrapper` and use the `Vec<T>` value, as shown in Listing 19-23.
1161
1162Filename: src/main.rs
1163
```
use std::fmt;

struct Wrapper(Vec<String>);

impl fmt::Display for Wrapper {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "[{}]", self.0.join(", "))
    }
}

fn main() {
    let w = Wrapper(vec![String::from("hello"), String::from("world")]);
    println!("w = {}", w);
}
```
1180
1181Listing 19-23: Creating a `Wrapper` type around `Vec<String>` to implement
1182`Display`
1183
The implementation of `Display` uses `self.0` to access the inner `Vec<T>`,
because `Wrapper` is a tuple struct and `Vec<T>` is the item at index 0 in the
tuple. Then we can use the functionality of the `Display` trait on `Wrapper`.
1187
1188The downside of using this technique is that `Wrapper` is a new type, so it
1189doesn’t have the methods of the value it’s holding. We would have to implement
1190all the methods of `Vec<T>` directly on `Wrapper` such that the methods
1191delegate to `self.0`, which would allow us to treat `Wrapper` exactly like a
1192`Vec<T>`. If we wanted the new type to have every method the inner type has,
1193implementing the `Deref` trait (discussed in Chapter 15 in the “Treating Smart
1194Pointers Like Regular References with the `Deref` Trait” section) on the
1195`Wrapper` to return the inner type would be a solution. If we don’t want the
1196`Wrapper` type to have all the methods of the inner type—for example, to
1197restrict the `Wrapper` type’s behavior—we would have to implement just the
1198methods we do want manually.
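
As a sketch of the `Deref` approach, the following implements `Deref` on the
`Wrapper` from Listing 19-23 so that methods of the inner `Vec<String>` become
callable on the wrapper. The `summary` helper is hypothetical and exists only
to exercise a few calls.

```rust
use std::fmt;
use std::ops::Deref;

struct Wrapper(Vec<String>);

impl Deref for Wrapper {
    type Target = Vec<String>;

    fn deref(&self) -> &Vec<String> {
        &self.0
    }
}

impl fmt::Display for Wrapper {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "[{}]", self.0.join(", "))
    }
}

fn summary() -> (usize, bool, String) {
    let w = Wrapper(vec![String::from("hello"), String::from("world")]);
    // `len` and `contains` are not defined on `Wrapper`; they are found
    // on the inner vector through deref coercion.
    (w.len(), w.contains(&String::from("world")), w.to_string())
}
```

The trade-off is that every method of the inner type is exposed this way,
which may be more than we want the wrapper to offer.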
1199
1200This newtype pattern is also useful even when traits are not involved. Let’s
1201switch focus and look at some advanced ways to interact with Rust’s type system.
1202
1203## Advanced Types
1204
1205The Rust type system has some features that we’ve so far mentioned but haven’t
1206yet discussed. We’ll start by discussing newtypes in general as we examine why
1207newtypes are useful as types. Then we’ll move on to type aliases, a feature
1208similar to newtypes but with slightly different semantics. We’ll also discuss
1209the `!` type and dynamically sized types.
1210
1211### Using the Newtype Pattern for Type Safety and Abstraction
1212
1213> Note: This section assumes you’ve read the earlier section “Using the
1214> Newtype Pattern to Implement External Traits on External
1215> Types.”
1216
1217The newtype pattern is also useful for tasks beyond those we’ve discussed so
1218far, including statically enforcing that values are never confused and
1219indicating the units of a value. You saw an example of using newtypes to
1220indicate units in Listing 19-15: recall that the `Millimeters` and `Meters`
1221structs wrapped `u32` values in a newtype. If we wrote a function with a
1222parameter of type `Millimeters`, we couldn’t compile a program that
1223accidentally tried to call that function with a value of type `Meters` or a
1224plain `u32`.
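
A brief sketch of that guarantee, using a hypothetical `thickness_label`
function: the commented-out calls are the ones the compiler would reject.

```rust
struct Millimeters(u32);

#[allow(dead_code)]
struct Meters(u32);

// Hypothetical function that only accepts millimeters.
fn thickness_label(value: Millimeters) -> String {
    format!("{} mm", value.0)
}

fn label() -> String {
    // thickness_label(Meters(3)); // error: expected `Millimeters`, found `Meters`
    // thickness_label(3);         // error: expected `Millimeters`, found integer
    thickness_label(Millimeters(3))
}
```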
1225
1226We can also use the newtype pattern to abstract away some implementation
1227details of a type: the new type can expose a public API that is different from
1228the API of the private inner type.
1229
1230Newtypes can also hide internal implementation. For example, we could provide a
1231`People` type to wrap a `HashMap<i32, String>` that stores a person’s ID
1232associated with their name. Code using `People` would only interact with the
1233public API we provide, such as a method to add a name string to the `People`
1234collection; that code wouldn’t need to know that we assign an `i32` ID to names
1235internally. The newtype pattern is a lightweight way to achieve encapsulation
1236to hide implementation details, which we discussed in the “Encapsulation that
1237Hides Implementation Details” section of Chapter 17.
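
One way such a `People` type might look is sketched below. The method names
(`new`, `add`, `contains`, `count`) and the way IDs are assigned are
assumptions for illustration; the point is only that callers never see the
`HashMap` or the `i32` keys.

```rust
use std::collections::HashMap;

// Hypothetical newtype: the inner map is a private field.
pub struct People(HashMap<i32, String>);

impl People {
    pub fn new() -> People {
        People(HashMap::new())
    }

    // Adds a name; the ID is an internal detail chosen here, not by the caller.
    pub fn add(&mut self, name: &str) {
        let id = self.0.len() as i32;
        self.0.insert(id, name.to_string());
    }

    pub fn contains(&self, name: &str) -> bool {
        self.0.values().any(|stored| stored.as_str() == name)
    }

    pub fn count(&self) -> usize {
        self.0.len()
    }
}
```

Because the field is private, we could later swap the `HashMap` for another
collection without changing code that uses `People`.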
1238
1239### Creating Type Synonyms with Type Aliases
1240
1241Rust provides the ability to declare a *type alias* to give an existing type
1242another name. For this we use the `type` keyword. For example, we can create
1243the alias `Kilometers` to `i32` like so:
1244
```
type Kilometers = i32;
```
1248
1249Now, the alias `Kilometers` is a *synonym* for `i32`; unlike the `Millimeters`
1250and `Meters` types we created in Listing 19-15, `Kilometers` is not a separate,
1251new type. Values that have the type `Kilometers` will be treated the same as
1252values of type `i32`:
1253
```
type Kilometers = i32;

let x: i32 = 5;
let y: Kilometers = 5;

println!("x + y = {}", x + y);
```
1262
1263Because `Kilometers` and `i32` are the same type, we can add values of both
1264types and we can pass `Kilometers` values to functions that take `i32`
1265parameters. However, using this method, we don’t get the type checking benefits
1266that we get from the newtype pattern discussed earlier. In other words, if we
1267mix up `Kilometers` and `i32` values somewhere, the compiler will not give us
1268an error.
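
To make the risk concrete, here is a sketch with a second, hypothetical alias
named `Miles`. Both aliases are just `i32`, so passing a `Miles` value where
`Kilometers` is expected compiles without complaint.

```rust
type Kilometers = i32;
type Miles = i32; // hypothetical second alias to the same base type

fn remaining(total: Kilometers, travelled: Kilometers) -> Kilometers {
    total - travelled
}

fn mixed_up() -> Kilometers {
    let total: Kilometers = 100;
    let travelled: Miles = 30;
    // Compiles even though the units disagree: both are plain `i32`.
    remaining(total, travelled)
}
```

Had `Kilometers` and `Miles` been newtypes, the call in `mixed_up` would have
been a compile-time error.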
1269
1278
1279The main use case for type synonyms is to reduce repetition. For example, we
1280might have a lengthy type like this:
1281
```
Box<dyn Fn() + Send + 'static>
```
1285
1286Writing this lengthy type in function signatures and as type annotations all
1287over the code can be tiresome and error prone. Imagine having a project full of
1288code like that in Listing 19-24.
1289
```
let f: Box<dyn Fn() + Send + 'static> = Box::new(|| println!("hi"));

fn takes_long_type(f: Box<dyn Fn() + Send + 'static>) {
    // --snip--
}

fn returns_long_type() -> Box<dyn Fn() + Send + 'static> {
    // --snip--
}
```
1301
1302Listing 19-24: Using a long type in many places
1303
1304A type alias makes this code more manageable by reducing the repetition. In
1305Listing 19-25, we’ve introduced an alias named `Thunk` for the verbose type and
1306can replace all uses of the type with the shorter alias `Thunk`.
1307
```
type Thunk = Box<dyn Fn() + Send + 'static>;

let f: Thunk = Box::new(|| println!("hi"));

fn takes_long_type(f: Thunk) {
    // --snip--
}

fn returns_long_type() -> Thunk {
    // --snip--
}
```
1321
1322Listing 19-25: Introducing a type alias `Thunk` to reduce repetition
1323
1324This code is much easier to read and write! Choosing a meaningful name for a
1325type alias can help communicate your intent as well (*thunk* is a word for code
1326to be evaluated at a later time, so it’s an appropriate name for a closure that
1327gets stored).
1328
1329Type aliases are also commonly used with the `Result<T, E>` type for reducing
1330repetition. Consider the `std::io` module in the standard library. I/O
1331operations often return a `Result<T, E>` to handle situations when operations
1332fail to work. This library has a `std::io::Error` struct that represents all
1333possible I/O errors. Many of the functions in `std::io` will be returning
1334`Result<T, E>` where the `E` is `std::io::Error`, such as these functions in
1335the `Write` trait:
1336
```
use std::fmt;
use std::io::Error;

pub trait Write {
    fn write(&mut self, buf: &[u8]) -> Result<usize, Error>;
    fn flush(&mut self) -> Result<(), Error>;

    fn write_all(&mut self, buf: &[u8]) -> Result<(), Error>;
    fn write_fmt(&mut self, fmt: fmt::Arguments) -> Result<(), Error>;
}
```
1349
1350The `Result<..., Error>` is repeated a lot. As such, `std::io` has this type
1351alias declaration:
1352
```
type Result<T> = std::result::Result<T, std::io::Error>;
```
1356
1357Because this declaration is in the `std::io` module, we can use the fully
1358qualified alias `std::io::Result<T>`; that is, a `Result<T, E>` with the `E`
1359filled in as `std::io::Error`. The `Write` trait function signatures end up
1360looking like this:
1361
```
pub trait Write {
    fn write(&mut self, buf: &[u8]) -> Result<usize>;
    fn flush(&mut self) -> Result<()>;

    fn write_all(&mut self, buf: &[u8]) -> Result<()>;
    fn write_fmt(&mut self, fmt: fmt::Arguments) -> Result<()>;
}
```
1371
1372The type alias helps in two ways: it makes code easier to write *and* it gives
1373us a consistent interface across all of `std::io`. Because it’s an alias, it’s
1374just another `Result<T, E>`, which means we can use any methods that work on
1375`Result<T, E>` with it, as well as special syntax like the `?` operator.
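
As a small sketch of that last point, the hypothetical function below returns
`std::io::Result<String>` and uses `?` exactly as it would with a spelled-out
`Result<String, std::io::Error>`. The function names are our own; reading from
a byte slice works because `&[u8]` implements `Read`.

```rust
use std::io::{self, Read};

// Hypothetical helper: `io::Result<String>` is `Result<String, io::Error>`.
fn read_to_uppercase(mut reader: impl Read) -> io::Result<String> {
    let mut text = String::new();
    // `?` propagates the `io::Error` through the alias unchanged.
    reader.read_to_string(&mut text)?;
    Ok(text.to_uppercase())
}

fn shout() -> String {
    // Ordinary `Result` methods are available on the alias too.
    read_to_uppercase("hello".as_bytes()).unwrap_or_default()
}
```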
1376
1377### The Never Type that Never Returns
1378
1379Rust has a special type named `!` that’s known in type theory lingo as the
1380*empty type* because it has no values. We prefer to call it the *never type*
1381because it stands in the place of the return type when a function will never
1382return. Here is an example:
1383
```
fn bar() -> ! {
    // --snip--
}
```
1389
1390This code is read as “the function `bar` returns never.” Functions that return
1391never are called *diverging functions*. We can’t create values of the type `!`
1392so `bar` can never possibly return.
1393
1394But what use is a type you can never create values for? Recall the code from
1395Listing 2-5, part of the number guessing game; we’ve reproduced a bit of it
1396here in Listing 19-26.
1397
```
let guess: u32 = match guess.trim().parse() {
    Ok(num) => num,
    Err(_) => continue,
};
```
1404
1405Listing 19-26: A `match` with an arm that ends in `continue`
1406
1407At the time, we skipped over some details in this code. In Chapter 6 in “The
1408`match` Control Flow Operator” section, we discussed that `match` arms must all
1409return the same type. So, for example, the following code doesn’t work:
1410
```
let guess = match guess.trim().parse() {
    Ok(_) => 5,
    Err(_) => "hello",
};
```
1417
1418The type of `guess` in this code would have to be an integer *and* a string,
1419and Rust requires that `guess` have only one type. So what does `continue`
1420return? How were we allowed to return a `u32` from one arm and have another arm
1421that ends with `continue` in Listing 19-26?
1422
As you might have guessed, `continue` has the type `!`. That is, when Rust
computes the type of `guess`, it looks at both match arms, the former with a
value of type `u32` and the latter with the type `!`. Because `!` can never
have a value, Rust decides that the type of `guess` is `u32`.
1427
1428The formal way of describing this behavior is that expressions of type `!` can
1429be coerced into any other type. We’re allowed to end this `match` arm with
1430`continue` because `continue` doesn’t return a value; instead, it moves control
1431back to the top of the loop, so in the `Err` case, we never assign a value to
1432`guess`.
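
The same pattern can be sketched outside the guessing game. In the
hypothetical `sum_valid` function below, the `Err` arm ends in `continue`, so
the `match` as a whole still has the type `u32`.

```rust
// Hypothetical helper: sums the inputs that parse as `u32`, skipping the rest.
fn sum_valid(inputs: &[&str]) -> u32 {
    let mut total = 0;
    for input in inputs {
        let n: u32 = match input.trim().parse() {
            Ok(num) => num,
            // `continue` has type `!`, which coerces to `u32`.
            Err(_) => continue,
        };
        total += n;
    }
    total
}
```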
1433
1434The never type is useful with the `panic!` macro as well. Recall the `unwrap`
1435function that we call on `Option<T>` values to produce a value or panic with
1436this definition:
1437
```
impl<T> Option<T> {
    pub fn unwrap(self) -> T {
        match self {
            Some(val) => val,
            None => panic!("called `Option::unwrap()` on a `None` value"),
        }
    }
}
```
1448
1449In this code, the same thing happens as in the `match` in Listing 19-26: Rust
1450sees that `val` has the type `T` and `panic!` has the type `!`, so the result
1451of the overall `match` expression is `T`. This code works because `panic!`
1452doesn’t produce a value; it ends the program. In the `None` case, we won’t be
1453returning a value from `unwrap`, so this code is valid.
1454
1455One final expression that has the type `!` is a `loop`:
1456
```
print!("forever ");

loop {
    print!("and ever ");
}
```
1464
Here, the loop never ends, so `!` is the type of the expression. However, this
wouldn’t be true if we included a `break`, because the loop would terminate
when it got to the `break`.
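
For contrast, here is a sketch of a `loop` that does contain a `break`. The
function name is hypothetical; the value passed to `break` gives the loop
expression an ordinary type, `u32` in this case, instead of `!`.

```rust
// Hypothetical helper: returns the first power of two greater than `limit`.
fn first_power_of_two_above(limit: u32) -> u32 {
    let mut n = 1;
    // Because of `break n`, this `loop` expression has type `u32`, not `!`.
    loop {
        if n > limit {
            break n;
        }
        n *= 2;
    }
}
```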
1468
1469### Dynamically Sized Types and the `Sized` Trait
1470
1471Rust needs to know certain details about its types, such as how much space to
1472allocate for a value of a particular type. This leaves one corner of its type
1473system a little confusing at first: the concept of *dynamically sized types*.
1474Sometimes referred to as *DSTs* or *unsized types*, these types let us write
1475code using values whose size we can know only at runtime.
1476
1477Let’s dig into the details of a dynamically sized type called `str`, which
1478we’ve been using throughout the book. That’s right, not `&str`, but `str` on
1479its own, is a DST. We can’t know how long the string is until runtime, meaning
1480we can’t create a variable of type `str`, nor can we take an argument of type
1481`str`. Consider the following code, which does not work:
1482
```
let s1: str = "Hello there!";
let s2: str = "How's it going?";
```
1487
1488Rust needs to know how much memory to allocate for any value of a particular
1489type, and all values of a type must use the same amount of memory. If Rust
1490allowed us to write this code, these two `str` values would need to take up the
1491same amount of space. But they have different lengths: `s1` needs 12 bytes of
1492storage and `s2` needs 15. This is why it’s not possible to create a variable
1493holding a dynamically sized type.
1494
So what do we do? In this case, you already know the answer: we make the types
of `s1` and `s2` a `&str` rather than a `str`. Recall from the “String Slices”
section of Chapter 4 that the slice data structure just stores the starting
position and the length of the slice. So although a `&T` is a single value that
stores the memory address of where the `T` is located, a `&str` is *two*
values: the address of the `str` and its length. As such, we can know the size
of a `&str` value at compile time: it’s twice the size of a `usize`. That is,
we always know the size of a `&str`, no matter how long the string it refers to
is. In general, this is the way in which dynamically sized types are used in
Rust: they have an extra bit of metadata that stores the size of the dynamic
information. The golden rule of dynamically sized types is that we must always
put values of dynamically sized types behind a pointer of some kind.
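
We can check these sizes with `std::mem::size_of`. The `reference_sizes`
helper below is a hypothetical name used only for this sketch; it compares a
reference to a sized type with a `&str`.

```rust
use std::mem::size_of;

// Returns the sizes of `usize`, a reference to a sized type, and a `&str`.
fn reference_sizes() -> (usize, usize, usize) {
    (size_of::<usize>(), size_of::<&u8>(), size_of::<&str>())
}
```

On any target, the second value equals the first and the third is twice the
first: the `&str` carries a length alongside the address.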
1507
1508We can combine `str` with all kinds of pointers: for example, `Box<str>` or
1509`Rc<str>`. In fact, you’ve seen this before but with a different dynamically
1510sized type: traits. Every trait is a dynamically sized type we can refer to by
1511using the name of the trait. In Chapter 17 in the “Using Trait Objects That
1512Allow for Values of Different Types” section, we mentioned that to use traits
1513as trait objects, we must put them behind a pointer, such as `&dyn Trait` or
1514`Box<dyn Trait>` (`Rc<dyn Trait>` would work too).
1515
1516To work with DSTs, Rust provides the `Sized` trait to determine whether or not
1517a type’s size is known at compile time. This trait is automatically implemented
1518for everything whose size is known at compile time. In addition, Rust
1519implicitly adds a bound on `Sized` to every generic function. That is, a
1520generic function definition like this:
1521
```
fn generic<T>(t: T) {
    // --snip--
}
```
1527
1528is actually treated as though we had written this:
1529
```
fn generic<T: Sized>(t: T) {
    // --snip--
}
```
1535
1536By default, generic functions will work only on types that have a known size at
1537compile time. However, you can use the following special syntax to relax this
1538restriction:
1539
```
fn generic<T: ?Sized>(t: &T) {
    // --snip--
}
```
1545
1546A trait bound on `?Sized` means “`T` may or may not be `Sized`” and this
1547notation overrides the default that generic types must have a known size at
1548compile time. The `?Trait` syntax with this meaning is only available for
1549`Sized`, not any other traits.
1550
1551Also note that we switched the type of the `t` parameter from `T` to `&T`.
1552Because the type might not be `Sized`, we need to use it behind some kind of
1553pointer. In this case, we’ve chosen a reference.
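
To see the relaxed bound at work, here is a small sketch, assuming a made-up
helper named `byte_size` that forwards to the standard library's
`std::mem::size_of_val`. Because of `?Sized`, the same function accepts
references to sized types and to DSTs such as `str` and `[u8]`.

```rust
use std::mem::size_of_val;

// Hypothetical helper: `?Sized` lets `T` be a sized type or a DST.
fn byte_size<T: ?Sized>(t: &T) -> usize {
    size_of_val(t)
}

fn main() {
    let number: u32 = 7;
    let text: &str = "Hello there!";
    let slice: &[u8] = &[1, 2, 3];

    println!("{}", byte_size(&number)); // `T` is `u32`, a sized type
    println!("{}", byte_size(text)); // `T` is `str`, a DST
    println!("{}", byte_size(slice)); // `T` is `[u8]`, another DST
}
```

Without `?Sized`, the calls that pass `text` and `slice` would be rejected,
because `T` would have to be `str` or `[u8]`, neither of which is `Sized`.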
1554
1555Next, we’ll talk about functions and closures!
1556
1557## Advanced Functions and Closures
1558
1559This section explores some advanced features related to functions and closures,
1560including function pointers and returning closures.
1561
1562### Function Pointers
1563
We’ve talked about how to pass closures to functions; you can also pass regular
functions to functions! This technique is useful when you want to pass a
function you’ve already defined rather than defining a new closure. Functions
coerce to the type `fn` (with a lowercase f), not to be confused with the `Fn`
closure trait. The `fn` type is called a *function pointer*. Passing functions
with function pointers will allow you to use functions as arguments to other
functions.
1571
The syntax for specifying that a parameter is a function pointer is similar to
that of closures, as shown in Listing 19-27, where we’ve defined a function
`add_one` that adds one to its parameter. The function `do_twice` takes two
parameters: a function pointer to any function that takes an `i32` parameter
and returns an `i32`, and one `i32` value. The `do_twice` function calls the
function `f` twice, passing it the `arg` value, then adds the two function call
results together. The `main` function calls `do_twice` with the arguments
`add_one` and `5`.
1580
1581Filename: src/main.rs
1582
```
fn add_one(x: i32) -> i32 {
    x + 1
}

fn do_twice(f: fn(i32) -> i32, arg: i32) -> i32 {
    f(arg) + f(arg)
}

fn main() {
    let answer = do_twice(add_one, 5);

    println!("The answer is: {}", answer);
}
```
1598
1599Listing 19-27: Using the `fn` type to accept a function pointer as an argument
1600
1601This code prints `The answer is: 12`. We specify that the parameter `f` in
1602`do_twice` is an `fn` that takes one parameter of type `i32` and returns an
1603`i32`. We can then call `f` in the body of `do_twice`. In `main`, we can pass
1604the function name `add_one` as the first argument to `do_twice`.
1605
1606Unlike closures, `fn` is a type rather than a trait, so we specify `fn` as the
1607parameter type directly rather than declaring a generic type parameter with one
1608of the `Fn` traits as a trait bound.
1609
Function pointers implement all three of the closure traits (`Fn`, `FnMut`, and
`FnOnce`), meaning you can always pass a function pointer as an argument for a
function that expects a closure. It’s best to write functions using a generic
type and one of the closure traits so your functions can accept either
functions or closures.
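
As a sketch of that advice, the following variant of Listing 19-27 replaces the
`fn` parameter with a generic type bounded by `Fn`. The name
`do_twice_generic` is made up for this illustration.

```rust
fn add_one(x: i32) -> i32 {
    x + 1
}

// Generic over any `Fn(i32) -> i32`, so it accepts functions and closures.
fn do_twice_generic<F>(f: F, arg: i32) -> i32
where
    F: Fn(i32) -> i32,
{
    f(arg) + f(arg)
}

fn main() {
    let offset = 10;

    // A named function is passed exactly as before.
    let from_fn = do_twice_generic(add_one, 5);
    // A closure that captures `offset` works too; it could not be passed
    // to a parameter of type `fn(i32) -> i32`.
    let from_closure = do_twice_generic(|x| x + offset, 5);

    println!("{} {}", from_fn, from_closure);
}
```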
1615
1616That said, one example of where you would want to only accept `fn` and not
1617closures is when interfacing with external code that doesn’t have closures: C
1618functions can accept functions as arguments, but C doesn’t have closures.
1619
1620As an example of where you could use either a closure defined inline or a named
1621function, let’s look at a use of the `map` method provided by the `Iterator`
1622trait in the standard library. To use the `map` function to turn a
1623vector of numbers into a vector of strings, we could use a closure, like this:
1624
```
let list_of_numbers = vec![1, 2, 3];
let list_of_strings: Vec<String> =
    list_of_numbers.iter().map(|i| i.to_string()).collect();
```
1630
1631Or we could name a function as the argument to `map` instead of the closure,
1632like this:
1633
```
let list_of_numbers = vec![1, 2, 3];
let list_of_strings: Vec<String> =
    list_of_numbers.iter().map(ToString::to_string).collect();
```
1639
1640Note that we must use the fully qualified syntax that we talked about earlier
1641in the “Advanced Traits” section because there are multiple functions available
1642named `to_string`.
1643
1644Here, we’re using the `to_string` function defined in the
1645`ToString` trait, which the standard library has implemented for any type that
1646implements `Display`.
1647
Recall from the “Enum Values” section of Chapter 6 that the name of each enum
variant that we define also becomes an initializer function. We can use these
initializer functions as function pointers that implement the closure traits,
which means we can specify the initializer functions as arguments for methods
that take closures, like so:
1653
```
enum Status {
    Value(u32),
    Stop,
}

let list_of_statuses: Vec<Status> = (0u32..20).map(Status::Value).collect();
```
1662
1663Here we create `Status::Value` instances using each `u32` value in the range
1664that `map` is called on by using the initializer function of `Status::Value`.
1665Some people prefer this style, and some people prefer to use closures. They
1666compile to the same code, so use whichever style is clearer to you.
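
The same technique works with the `Some` variant from the standard library
and, as far as we can tell, with any tuple struct, whose name also acts as a
constructor function. A minimal sketch, assuming a made-up `Meters` type and
`wrap_all` helper:

```rust
// Hypothetical tuple struct; its name doubles as a constructor function.
#[derive(Debug, PartialEq)]
struct Meters(u32);

// Hypothetical helper that passes the constructor straight to `map`.
fn wrap_all(values: Vec<u32>) -> Vec<Meters> {
    values.into_iter().map(Meters).collect()
}

fn main() {
    let lengths = wrap_all(vec![1, 2, 3]);
    // `Some` is the initializer function of the `Option::Some` variant.
    let maybe: Vec<Option<u32>> = (0u32..3).map(Some).collect();

    println!("{:?} {:?}", lengths, maybe);
}
```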
1667
1668### Returning Closures
1669
Closures are represented by traits, which means you can’t return closures
directly. In most cases where you might want to return a trait, you can instead
use the concrete type that implements the trait as the return value of the
function. However, you can’t do that with closures because they don’t have a
concrete type that you can write down. The function pointer type `fn` is not a
general substitute either: only closures that capture nothing from their
environment coerce to `fn`.
1676
1677The following code tries to return a closure directly, but it won’t compile:
1678
```
fn returns_closure() -> dyn Fn(i32) -> i32 {
    |x| x + 1
}
```
1684
1685The compiler error is as follows:
1686
```
error[E0746]: return type cannot have an unboxed trait object
 --> src/lib.rs:1:25
  |
1 | fn returns_closure() -> dyn Fn(i32) -> i32 {
  |                         ^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
  |
  = note: for information on `impl Trait`, see <https://doc.rust-lang.org/book/ch10-02-traits.html#returning-types-that-implement-traits>
help: use `impl Fn(i32) -> i32` as the return type, as all return paths are of type `[closure@src/lib.rs:2:5: 2:14]`, which implements `Fn(i32) -> i32`
  |
1 | fn returns_closure() -> impl Fn(i32) -> i32 {
  |                         ~~~~~~~~~~~~~~~~~~~
```
1700
The error is about a size that isn’t known at compile time, which is the
`Sized` problem again! Rust doesn’t know how much space it will need to store
the closure. We saw a solution to this problem earlier. We can use a trait
object:
1704
```
fn returns_closure() -> Box<dyn Fn(i32) -> i32> {
    Box::new(|x| x + 1)
}
```
1710
1711This code will compile just fine. For more about trait objects, refer to the
1712section “Using Trait Objects That Allow for Values of Different Types” in
1713Chapter 17.
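
The `help` text in the compiler error points at another option, `impl Trait`
in return position. The sketch below tries both approaches side by side; the
`returns_either` function is made up for this illustration and shows a case
where the boxed trait object still seems to be the better fit.

```rust
// Follows the compiler's `help` suggestion: a single opaque closure type.
fn returns_closure() -> impl Fn(i32) -> i32 {
    |x| x + 1
}

// Hypothetical function: the two branches produce closures of different
// types, so they are returned behind a boxed trait object.
fn returns_either(add: bool) -> Box<dyn Fn(i32) -> i32> {
    if add {
        Box::new(|x: i32| x + 1)
    } else {
        Box::new(|x: i32| x - 1)
    }
}

fn main() {
    let f = returns_closure();
    let g = returns_either(false);

    println!("{} {}", f(1), g(1));
}
```

With `impl Fn`, every return path has to produce the same closure type, which
is why the branching version uses `Box<dyn Fn(i32) -> i32>`.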
1714
1715Next, let’s look at macros!
1716
1717## Macros
1718
1719We’ve used macros like `println!` throughout this book, but we haven’t fully
1720explored what a macro is and how it works. The term *macro* refers to a family
1721of features in Rust: *declarative* macros with `macro_rules!` and three kinds
1722of *procedural* macros:
1723
1724* Custom `#[derive]` macros that specify code added with the `derive` attribute
1725 used on structs and enums
1726* Attribute-like macros that define custom attributes usable on any item
1727* Function-like macros that look like function calls but operate on the tokens
1728 specified as their argument
1729
1730We’ll talk about each of these in turn, but first, let’s look at why we even
1731need macros when we already have functions.
1732
1733### The Difference Between Macros and Functions
1734
1735Fundamentally, macros are a way of writing code that writes other code, which
1736is known as *metaprogramming*. In Appendix C, we discuss the `derive`
1737attribute, which generates an implementation of various traits for you. We’ve
1738also used the `println!` and `vec!` macros throughout the book. All of these
1739macros *expand* to produce more code than the code you’ve written manually.
1740
1741Metaprogramming is useful for reducing the amount of code you have to write and
1742maintain, which is also one of the roles of functions. However, macros have
1743some additional powers that functions don’t.
1744
1745A function signature must declare the number and type of parameters the
1746function has. Macros, on the other hand, can take a variable number of
1747parameters: we can call `println!("hello")` with one argument or
1748`println!("hello {}", name)` with two arguments. Also, macros are expanded
1749before the compiler interprets the meaning of the code, so a macro can, for
1750example, implement a trait on a given type. A function can’t, because it gets
1751called at runtime and a trait needs to be implemented at compile time.
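
Both points can be sketched with one small declarative macro. The `TypeName`
trait and the `impl_type_name!` macro below are made up for this illustration:
the macro takes a variable number of types and implements the trait for each
of them at compile time.

```rust
// Hypothetical trait used only for this illustration.
trait TypeName {
    fn type_name() -> &'static str;
}

// Expands to one `impl` block per type passed to the macro.
macro_rules! impl_type_name {
    ( $( $t:ty ),* ) => {
        $(
            impl TypeName for $t {
                fn type_name() -> &'static str {
                    stringify!($t)
                }
            }
        )*
    };
}

// The macro accepts any number of types.
impl_type_name!(u8, i32, String);

fn main() {
    println!("{}", <u8 as TypeName>::type_name());
    println!("{}", <String as TypeName>::type_name());
}
```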
1752
1753The downside to implementing a macro instead of a function is that macro
1754definitions are more complex than function definitions because you’re writing
1755Rust code that writes Rust code. Due to this indirection, macro definitions are
1756generally more difficult to read, understand, and maintain than function
1757definitions.
1758
1759Another important difference between macros and functions is that you must
1760define macros or bring them into scope *before* you call them in a file, as
1761opposed to functions you can define anywhere and call anywhere.
1762
1763### Declarative Macros with `macro_rules!` for General Metaprogramming
1764
1765The most widely used form of macros in Rust is the *declarative macro*. These
1766are also sometimes referred to as “macros by example,” “`macro_rules!` macros,”
1767or just plain “macros.” At their core, declarative macros allow you to write
1768something similar to a Rust `match` expression. As discussed in Chapter 6,
1769`match` expressions are control structures that take an expression, compare the
1770resulting value of the expression to patterns, and then run the code associated
1771with the matching pattern. Macros also compare a value to patterns that are
1772associated with particular code: in this situation, the value is the literal
1773Rust source code passed to the macro; the patterns are compared with the
1774structure of that source code; and the code associated with each pattern, when
1775matched, replaces the code passed to the macro. This all happens during
1776compilation.
1777
1778To define a macro, you use the `macro_rules!` construct. Let’s explore how to
1779use `macro_rules!` by looking at how the `vec!` macro is defined. Chapter 8
1780covered how we can use the `vec!` macro to create a new vector with particular
1781values. For example, the following macro creates a new vector containing three
1782integers:
1783
```
let v: Vec<u32> = vec![1, 2, 3];
```
1787
1788We could also use the `vec!` macro to make a vector of two integers or a vector
1789of five string slices. We wouldn’t be able to use a function to do the same
1790because we wouldn’t know the number or type of values up front.
1791
1792Listing 19-28 shows a slightly simplified definition of the `vec!` macro.
1793
1794Filename: src/lib.rs
1795
```
[1] #[macro_export]
[2] macro_rules! vec {
    [3] ( $( $x:expr ),* ) => {
        {
            let mut temp_vec = Vec::new();
            [4] $(
                [5] temp_vec.push($x [6]);
            )*
            [7] temp_vec
        }
    };
}
```
1810
1811Listing 19-28: A simplified version of the `vec!` macro definition
1812
1813> Note: The actual definition of the `vec!` macro in the standard library
1814> includes code to preallocate the correct amount of memory up front. That code
1815> is an optimization that we don’t include here to make the example simpler.
1816
1817The `#[macro_export]` annotation [1] indicates that this macro should be made
1818available whenever the crate in which the macro is defined is brought into
1819scope. Without this annotation, the macro can’t be brought into scope.
1820
1821We then start the macro definition with `macro_rules!` and the name of the
1822macro we’re defining *without* the exclamation mark [2]. The name, in this case
1823`vec`, is followed by curly brackets denoting the body of the macro definition.
1824
1825The structure in the `vec!` body is similar to the structure of a `match`
1826expression. Here we have one arm with the pattern `( $( $x:expr ),* )`,
1827followed by `=>` and the block of code associated with this pattern [3]. If the
1828pattern matches, the associated block of code will be emitted. Given that this
1829is the only pattern in this macro, there is only one valid way to match; any
1830other pattern will result in an error. More complex macros will have more than
1831one arm.
1832
Valid pattern syntax in macro definitions is different from the pattern syntax
covered in Chapter 18 because macro patterns are matched against Rust code
structure rather than values. Let’s walk through what the pattern pieces in
Listing 19-28 mean; for the full macro pattern syntax, see the Rust Reference
at *https://doc.rust-lang.org/reference/macros-by-example.html*.
1838
1839First, we use a set of parentheses to encompass the whole pattern. We use a
1840dollar sign (`$`) to declare a variable in the macro system that will contain
1841the Rust code matching the pattern. The dollar sign makes it clear this is a
1842macro variable as opposed to a regular Rust variable.
1843Next comes a set of parentheses that captures values that match the
1844pattern within the parentheses for use in the replacement code. Within `$()` is
1845`$x:expr`, which matches any Rust expression and gives the expression the name
1846`$x`.
1847
1848The comma following `$()` indicates that a literal comma separator character
1849could optionally appear after the code that matches the code in `$()`. The `*`
1850specifies that the pattern matches zero or more of whatever precedes the `*`.
1851
1852When we call this macro with `vec![1, 2, 3];`, the `$x` pattern matches three
1853times with the three expressions `1`, `2`, and `3`.
1854
1855Now let’s look at the pattern in the body of the code associated with this arm:
1856`temp_vec.push()` [5] within `$()*` [4][7] is generated for each part that
1857matches `$()` in the pattern zero or more times depending on how many times the
1858pattern matches. The `$x` [6] is replaced with each expression matched. When we
1859call this macro with `vec![1, 2, 3];`, the code generated that replaces this
1860macro call will be the following:
1861
```
{
    let mut temp_vec = Vec::new();
    temp_vec.push(1);
    temp_vec.push(2);
    temp_vec.push(3);
    temp_vec
}
```
1871
1872We’ve defined a macro that can take any number of arguments of any type and can
1873generate code to create a vector containing the specified elements.
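
Macros with more than one arm follow the same structure. The following sketch
is not the standard library’s definition; it uses a made-up name, `my_vec!`,
to avoid shadowing the real `vec!`, and adds an arm for the `[value; count]`
form plus support for an optional trailing comma.

```rust
// `my_vec!` is a hypothetical name chosen to avoid shadowing `vec!`.
macro_rules! my_vec {
    // Arm 1: no arguments at all.
    () => {
        Vec::new()
    };
    // Arm 2: `my_vec![value; count]` repeats one value `count` times.
    ( $elem:expr ; $n:expr ) => {
        {
            let mut temp_vec = Vec::new();
            temp_vec.resize($n, $elem);
            temp_vec
        }
    };
    // Arm 3: a comma-separated list with an optional trailing comma.
    ( $( $x:expr ),+ $(,)? ) => {
        {
            let mut temp_vec = Vec::new();
            $(
                temp_vec.push($x);
            )+
            temp_vec
        }
    };
}

fn main() {
    let empty: Vec<i32> = my_vec![];
    let zeros = my_vec![0u8; 3];
    let list = my_vec![1, 2, 3,];

    println!("{:?} {:?} {:?}", empty, zeros, list);
}
```

The arms are tried from top to bottom, so the more specific
`[value; count]` arm is placed before the general list arm.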
1874
1875To learn more about how to write macros, consult the online documentation or
1876other resources, such as “The Little Book of Rust Macros” at
1877*https://veykril.github.io/tlborm/* started by Daniel Keep and continued by
1878Lukas Wirth.

### Procedural Macros for Generating Code from Attributes

1890The second form of macros is the *procedural macro*, which acts more like a
1891function (and is a type of procedure). Procedural macros accept some code as an
1892input, operate on that code, and produce some code as an output rather than
1893matching against patterns and replacing the code with other code as declarative
1894macros do. The three kinds of procedural macros are custom derive,
1895attribute-like, and function-like, and all work in a similar fashion.
1896
1897When creating procedural macros, the definitions must reside in their own crate
1898with a special crate type. This is for complex technical reasons that we hope
1899to eliminate in the future. In Listing 19-29, we show how to define a
1900procedural macro, where `some_attribute` is a placeholder for using a specific
1901macro variety.
1902
1903Filename: src/lib.rs
1904
```
use proc_macro::TokenStream;

#[some_attribute]
pub fn some_name(input: TokenStream) -> TokenStream {
    // --snip--
}
```
1912
1913Listing 19-29: An example of defining a procedural macro
1914
1915The function that defines a procedural macro takes a `TokenStream` as an input
1916and produces a `TokenStream` as an output. The `TokenStream` type is defined by
1917the `proc_macro` crate that is included with Rust and represents a sequence of
1918tokens. This is the core of the macro: the source code that the macro is
1919operating on makes up the input `TokenStream`, and the code the macro produces
1920is the output `TokenStream`. The function also has an attribute attached to it
1921that specifies which kind of procedural macro we’re creating. We can have
1922multiple kinds of procedural macros in the same crate.
1923
1924Let’s look at the different kinds of procedural macros. We’ll start with a
1925custom derive macro and then explain the small dissimilarities that make the
1926other forms different.
1927
1928### How to Write a Custom `derive` Macro
1929
1930Let’s create a crate named `hello_macro` that defines a trait named
1931`HelloMacro` with one associated function named `hello_macro`. Rather than
1932making our users implement the `HelloMacro` trait for each of their types,
1933we’ll provide a procedural macro so users can annotate their type with
1934`#[derive(HelloMacro)]` to get a default implementation of the `hello_macro`
1935function. The default implementation will print `Hello, Macro! My name is
1936TypeName!` where `TypeName` is the name of the type on which this trait has
1937been defined. In other words, we’ll write a crate that enables another
1938programmer to write code like Listing 19-30 using our crate.
1939
1940Filename: src/main.rs
1941
```
use hello_macro::HelloMacro;
use hello_macro_derive::HelloMacro;

#[derive(HelloMacro)]
struct Pancakes;

fn main() {
    Pancakes::hello_macro();
}
```
1953
1954Listing 19-30: The code a user of our crate will be able to write when using
1955our procedural macro
1956
1957This code will print `Hello, Macro! My name is Pancakes!` when we’re done. The
1958first step is to make a new library crate, like this:
1959
```
$ cargo new hello_macro --lib
```
1963
1964Next, we’ll define the `HelloMacro` trait and its associated function:
1965
1966Filename: src/lib.rs
1967
```
pub trait HelloMacro {
    fn hello_macro();
}
```
1973
1974We have a trait and its function. At this point, our crate user could implement
1975the trait to achieve the desired functionality, like so:
1976
```
use hello_macro::HelloMacro;

struct Pancakes;

impl HelloMacro for Pancakes {
    fn hello_macro() {
        println!("Hello, Macro! My name is Pancakes!");
    }
}

fn main() {
    Pancakes::hello_macro();
}
```
1992
1993However, they would need to write the implementation block for each type they
1994wanted to use with `hello_macro`; we want to spare them from having to do this
1995work.
1996
Additionally, we can’t yet provide the `hello_macro` function with a default
implementation that will print the name of the type the trait is implemented
on: Rust doesn’t have reflection capabilities, so it can’t look up the type’s
name at runtime. We need a macro to generate code at compile time.
2001
2002The next step is to define the procedural macro. At the time of this writing,
2003procedural macros need to be in their own crate. Eventually, this restriction
2004might be lifted. The convention for structuring crates and macro crates is as
2005follows: for a crate named `foo`, a custom derive procedural macro crate is
2006called `foo_derive`. Let’s start a new crate called `hello_macro_derive` inside
2007our `hello_macro` project:
2008
```
$ cargo new hello_macro_derive --lib
```
2012
2013Our two crates are tightly related, so we create the procedural macro crate
2014within the directory of our `hello_macro` crate. If we change the trait
2015definition in `hello_macro`, we’ll have to change the implementation of the
2016procedural macro in `hello_macro_derive` as well. The two crates will need to
2017be published separately, and programmers using these crates will need to add
2018both as dependencies and bring them both into scope. We could instead have the
2019`hello_macro` crate use `hello_macro_derive` as a dependency and re-export the
2020procedural macro code. However, the way we’ve structured the project makes it
2021possible for programmers to use `hello_macro` even if they don’t want the
2022`derive` functionality.
2023
2024We need to declare the `hello_macro_derive` crate as a procedural macro crate.
2025We’ll also need functionality from the `syn` and `quote` crates, as you’ll see
2026in a moment, so we need to add them as dependencies. Add the following to the
2027*Cargo.toml* file for `hello_macro_derive`:
2028
2029Filename: hello_macro_derive/Cargo.toml
2030
```
[lib]
proc-macro = true

[dependencies]
syn = "1.0"
quote = "1.0"
```
2039
2040To start defining the procedural macro, place the code in Listing 19-31 into
2041your *src/lib.rs* file for the `hello_macro_derive` crate. Note that this code
2042won’t compile until we add a definition for the `impl_hello_macro` function.
2043
2044Filename: hello_macro_derive/src/lib.rs
2045
```
use proc_macro::TokenStream;
use quote::quote;
use syn;

#[proc_macro_derive(HelloMacro)]
pub fn hello_macro_derive(input: TokenStream) -> TokenStream {
    // Construct a representation of Rust code as a syntax tree
    // that we can manipulate
    let ast = syn::parse(input).unwrap();

    // Build the trait implementation
    impl_hello_macro(&ast)
}
```
2061
2062Listing 19-31: Code that most procedural macro crates will require in order to
2063process Rust code
2064
2065Notice that we’ve split the code into the `hello_macro_derive` function, which
2066is responsible for parsing the `TokenStream`, and the `impl_hello_macro`
2067function, which is responsible for transforming the syntax tree: this makes
2068writing a procedural macro more convenient. The code in the outer function
2069(`hello_macro_derive` in this case) will be the same for almost every
2070procedural macro crate you see or create. The code you specify in the body of
2071the inner function (`impl_hello_macro` in this case) will be different
2072depending on your procedural macro’s purpose.
2073
2074We’ve introduced three new crates: `proc_macro`, `syn` (available from
2075*https://crates.io/crates/syn*), and `quote` (available from
2076*https://crates.io/crates/quote*). The `proc_macro` crate comes with Rust, so
2077we didn’t need to add that to the dependencies in *Cargo.toml*. The
2078`proc_macro` crate is the compiler’s API that allows us to read and manipulate
2079Rust code from our code.
2080
2081The `syn` crate parses Rust code from a string into a data structure that we
2082can perform operations on. The `quote` crate turns `syn` data structures back
2083into Rust code. These crates make it much simpler to parse any sort of Rust
2084code we might want to handle: writing a full parser for Rust code is no simple
2085task.
2086
The `hello_macro_derive` function will be called when a user of our library
specifies `#[derive(HelloMacro)]` on a type. This is possible because we’ve
annotated the `hello_macro_derive` function here with `proc_macro_derive` and
specified the name `HelloMacro`, which matches our trait name; this is the
convention most procedural macros follow.
2092
2093The `hello_macro_derive` function first converts the `input` from a
2094`TokenStream` to a data structure that we can then interpret and perform
2095operations on. This is where `syn` comes into play. The `parse` function in
2096`syn` takes a `TokenStream` and returns a `DeriveInput` struct representing the
2097parsed Rust code. Listing 19-32 shows the relevant parts of the `DeriveInput`
2098struct we get from parsing the `struct Pancakes;` string:
2099
```
DeriveInput {
    // --snip--

    ident: Ident {
        ident: "Pancakes",
        span: #0 bytes(95..103)
    },
    data: Struct(
        DataStruct {
            struct_token: Struct,
            fields: Unit,
            semi_token: Some(
                Semi
            )
        }
    )
}
```
2119
2120Listing 19-32: The `DeriveInput` instance we get when parsing the code that has
2121the macro’s attribute in Listing 19-30
2122
2123The fields of this struct show that the Rust code we’ve parsed is a unit struct
2124with the `ident` (identifier, meaning the name) of `Pancakes`. There are more
2125fields on this struct for describing all sorts of Rust code; check the `syn`
2126documentation for `DeriveInput` at
2127*https://docs.rs/syn/1.0/syn/struct.DeriveInput.html* for more information.
2128
2129Soon we’ll define the `impl_hello_macro` function, which is where we’ll build
2130the new Rust code we want to include. But before we do, note that the output
2131for our derive macro is also a `TokenStream`. The returned `TokenStream` is
2132added to the code that our crate users write, so when they compile their crate,
2133they’ll get the extra functionality that we provide in the modified
2134`TokenStream`.
2135
2136You might have noticed that we’re calling `unwrap` to cause the
2137`hello_macro_derive` function to panic if the call to the `syn::parse` function
2138fails here. It’s necessary for our procedural macro to panic on errors because
2139`proc_macro_derive` functions must return `TokenStream` rather than `Result` to
2140conform to the procedural macro API. We’ve simplified this example by using
2141`unwrap`; in production code, you should provide more specific error messages
2142about what went wrong by using `panic!` or `expect`.
2143
2144Now that we have the code to turn the annotated Rust code from a `TokenStream`
2145into a `DeriveInput` instance, let’s generate the code that implements the
2146`HelloMacro` trait on the annotated type, as shown in Listing 19-33.
2147
2148Filename: hello_macro_derive/src/lib.rs
2149
```
fn impl_hello_macro(ast: &syn::DeriveInput) -> TokenStream {
    let name = &ast.ident;
    let gen = quote! {
        impl HelloMacro for #name {
            fn hello_macro() {
                println!("Hello, Macro! My name is {}!", stringify!(#name));
            }
        }
    };
    gen.into()
}
```
2163
2164Listing 19-33: Implementing the `HelloMacro` trait using the parsed Rust code
2165
2166We get an `Ident` struct instance containing the name (identifier) of the
2167annotated type using `ast.ident`. The struct in Listing 19-32 shows that when
2168we run the `impl_hello_macro` function on the code in Listing 19-30, the
2169`ident` we get will have the `ident` field with a value of `"Pancakes"`. Thus,
2170the `name` variable in Listing 19-33 will contain an `Ident` struct instance
2171that, when printed, will be the string `"Pancakes"`, the name of the struct in
2172Listing 19-30.
2173
The `quote!` macro lets us define the Rust code that we want to return. The
direct result of the `quote!` macro is a `proc_macro2::TokenStream` rather than
the `proc_macro::TokenStream` the compiler expects, so we need to convert it.
We do this by calling the `into` method, which consumes this intermediate
representation and returns a value of the required `TokenStream` type.
2179
2180The `quote!` macro also provides some very cool templating mechanics: we can
2181enter `#name`, and `quote!` will replace it with the value in the variable
2182`name`. You can even do some repetition similar to the way regular macros work.
2183Check out the `quote` crate’s docs at *https://docs.rs/quote* for a thorough
2184introduction.
2185
We want our procedural macro to generate an implementation of our `HelloMacro`
trait for the type the user annotated, which we can get by using `#name`. The
trait implementation has the one function `hello_macro`, whose body contains the
functionality we want to provide: printing `Hello, Macro! My name is` and then
the name of the annotated type.
2191
The `stringify!` macro used here is built into Rust. It takes a Rust
expression, such as `1 + 2`, and at compile time turns the expression into a
string literal, such as `"1 + 2"`. This is different from `format!` or
`println!`, which are macros that evaluate the expression and then turn the
result into a `String`. There is a possibility that the `#name` input might be
an expression to print literally, so we use `stringify!`. Using `stringify!`
also saves an allocation by converting `#name` to a string literal at compile
time.
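
The contrast can be seen in a few lines of ordinary Rust. The two helper
functions here are made up for this illustration and are not part of the
`hello_macro_derive` crate.

```rust
// Hypothetical helper: the tokens become a string literal at compile time.
fn as_source_text() -> &'static str {
    stringify!(1 + 2)
}

// Hypothetical helper: the expression is evaluated, then formatted at runtime.
fn as_evaluated() -> String {
    format!("{}", 1 + 2)
}

fn main() {
    println!("{} vs {}", as_source_text(), as_evaluated());
}
```

The first helper returns a `&'static str`, so no `String` is allocated for it.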
2199
At this point, `cargo build` should complete successfully in both `hello_macro`
and `hello_macro_derive`. Let’s hook up these crates to the code in Listing
19-30 to see the procedural macro in action! Create a new binary project in
your *projects* directory using `cargo new pancakes`. We need to add
`hello_macro` and `hello_macro_derive` as dependencies in the `pancakes`
crate’s *Cargo.toml*. If you’re publishing your versions of `hello_macro` and
`hello_macro_derive` to *https://crates.io/*, they would be regular
dependencies; if not, you can specify them as `path` dependencies as follows:
2208
```
[dependencies]
hello_macro = { path = "../hello_macro" }
hello_macro_derive = { path = "../hello_macro/hello_macro_derive" }
```
2214
2215Put the code in Listing 19-30 into *src/main.rs*, and run `cargo run`: it
2216should print `Hello, Macro! My name is Pancakes!` The implementation of the
2217`HelloMacro` trait from the procedural macro was included without the
2218`pancakes` crate needing to implement it; the `#[derive(HelloMacro)]` added the
2219trait implementation.
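As a rough picture of what the derive saves us from writing, the sketch below spells out by hand an implementation equivalent to the one described above. It assumes the `HelloMacro` trait has the single associated function `hello_macro`, and it defines the trait locally so the example stands alone; in the real project the trait comes from the `hello_macro` crate and the `impl` block is generated for us.

```rust
// Local stand-in for the trait from the `hello_macro` crate (an assumption
// made so this sketch compiles without the other crates).
pub trait HelloMacro {
    fn hello_macro();
}

struct Pancakes;

// Hand-written equivalent of what `#[derive(HelloMacro)]` is described as
// generating when the annotated type is `Pancakes`.
impl HelloMacro for Pancakes {
    fn hello_macro() {
        // `stringify!(Pancakes)` becomes the literal "Pancakes" at compile time.
        println!("Hello, Macro! My name is {}!", stringify!(Pancakes));
    }
}

fn main() {
    Pancakes::hello_macro();
}
```

With the derive in place, none of the `impl` block needs to appear in the `pancakes` crate.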

Next, let’s explore how the other kinds of procedural macros differ from custom
derive macros.

### Attribute-like macros

Attribute-like macros are similar to custom derive macros, but instead of
generating code for the `derive` attribute, they allow you to create new
attributes. They’re also more flexible: `derive` only works for structs and
enums; attributes can be applied to other items as well, such as functions.
Here’s an example of using an attribute-like macro: say you have an attribute
named `route` that annotates functions when using a web application framework:

```
#[route(GET, "/")]
fn index() {
```

This `#[route]` attribute would be defined by the framework as a procedural
macro. The signature of the macro definition function would look like this:

```
#[proc_macro_attribute]
pub fn route(attr: TokenStream, item: TokenStream) -> TokenStream {
```

Here, we have two parameters of type `TokenStream`. The first is for the
contents of the attribute: the `GET, "/"` part. The second is the body of the
item the attribute is attached to: in this case, `fn index() {}` and the rest
of the function’s body.

Other than that, attribute-like macros work the same way as custom derive
macros: you create a crate with the `proc-macro` crate type and implement a
function that generates the code you want!

### Function-like macros

Function-like macros define macros that look like function calls. Similarly to
`macro_rules!` macros, they’re more flexible than functions; for example, they
can take an unknown number of arguments. However, `macro_rules!` macros can be
defined only using the match-like syntax we discussed in the section
“Declarative Macros with `macro_rules!` for General Metaprogramming” earlier.
Function-like macros take a `TokenStream` parameter and their definition
manipulates that `TokenStream` using Rust code as the other two types of
procedural macros do. An example of a function-like macro is an `sql!` macro
that might be called like so:

```
let sql = sql!(SELECT * FROM posts WHERE id=1);
```

This macro would parse the SQL statement inside it and check that it’s
syntactically correct, which is much more complex processing than a
`macro_rules!` macro can do. The `sql!` macro would be defined like this:

```
#[proc_macro]
pub fn sql(input: TokenStream) -> TokenStream {
```

This definition is similar to the custom derive macro’s signature: we receive
the tokens that are inside the parentheses and return the code we wanted to
generate.
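For comparison, the flexibility that function-like macros share with `macro_rules!` macros, taking an unknown number of arguments, can be sketched with a small declarative macro. The name `sum_all!` is invented for this illustration. A function-like procedural macro would be called with the same syntax but would process its input tokens with ordinary Rust code instead of match-like patterns.

```rust
// Hypothetical macro for illustration; `sum_all!` is not from the chapter.
// It accepts zero or more comma-separated expressions and adds them up.
macro_rules! sum_all {
    // No arguments: the sum of nothing is zero.
    () => { 0 };
    // One or more arguments, with an optional trailing comma.
    ($first:expr $(, $rest:expr)* $(,)?) => {
        $first $(+ $rest)*
    };
}

fn main() {
    // The same macro is called with zero, three, and four arguments,
    // which a regular function signature could not express.
    println!("{}", sum_all!());
    println!("{}", sum_all!(1, 2, 3));
    println!("{}", sum_all!(10, 20, 30, 40,));
}
```

Pattern matching like this works well for simple shapes of input, but something like validating an SQL statement is where a procedural macro’s ability to run arbitrary Rust code over the tokens becomes useful.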

<!-- I may get a few looks for this, but I wonder if we should trim the
procedural macros section above a bit. There's a lot of information in there,
but it feels like something we could intro and then point people off to other
materials for. Reason being (and I know I may be in the minority here),
procedural macros are something we should use only rarely in our Rust projects.
They are a burden on the compiler, have the potential to hurt readability and
maintainability, and... you know the saying: with great power comes great
responsibility, and all that. /JT -->
<!-- I think we felt obligated to have this section when procedural macros were
introduced because there wasn't any documentation for them. I feel like the
custom derive is the most common kind people want to make... While I'd love to
not have to maintain this section, I asked around and people seemed generally
in favor of keeping it, so I think I will, for now. /Carol -->

## Summary

Whew! Now you have some Rust features in your toolbox that you likely won’t use
often, but you’ll know they’re available in very particular circumstances.
We’ve introduced several complex topics so that when you encounter them in
error message suggestions or in other people’s code, you’ll be able to
recognize these concepts and syntax. Use this chapter as a reference to guide
you to solutions.

Next, we’ll put everything we’ve discussed throughout the book into practice
and do one more project!