]> git.proxmox.com Git - rustc.git/blob - src/doc/book/src/ch08-03-hash-maps.md
New upstream version 1.55.0+dfsg1
[rustc.git] / src / doc / book / src / ch08-03-hash-maps.md
1 ## Storing Keys with Associated Values in Hash Maps
2
3 The last of our common collections is the *hash map*. The type `HashMap<K, V>`
4 stores a mapping of keys of type `K` to values of type `V`. It does this via a
5 *hashing function*, which determines how it places these keys and values into
6 memory. Many programming languages support this kind of data structure, but
7 they often use a different name, such as hash, map, object, hash table,
8 dictionary, or associative array, just to name a few.
9
10 Hash maps are useful when you want to look up data not by using an index, as
11 you can with vectors, but by using a key that can be of any type. For example,
12 in a game, you could keep track of each team’s score in a hash map in which
13 each key is a team’s name and the values are each team’s score. Given a team
14 name, you can retrieve its score.
15
16 We’ll go over the basic API of hash maps in this section, but many more goodies
17 are hiding in the functions defined on `HashMap<K, V>` by the standard library.
18 As always, check the standard library documentation for more information.
19
20 ### Creating a New Hash Map
21
22 You can create an empty hash map with `new` and add elements with `insert`. In
23 Listing 8-20, we’re keeping track of the scores of two teams whose names are
24 Blue and Yellow. The Blue team starts with 10 points, and the Yellow team
25 starts with 50.
26
27 ```rust
28 {{#rustdoc_include ../listings/ch08-common-collections/listing-08-20/src/main.rs:here}}
29 ```
30
31 <span class="caption">Listing 8-20: Creating a new hash map and inserting some
32 keys and values</span>
33
34 Note that we need to first `use` the `HashMap` from the collections portion of
35 the standard library. Of our three common collections, this one is the least
36 often used, so it’s not included in the features brought into scope
37 automatically in the prelude. Hash maps also have less support from the
38 standard library; there’s no built-in macro to construct them, for example.
39
40 Just like vectors, hash maps store their data on the heap. This `HashMap` has
41 keys of type `String` and values of type `i32`. Like vectors, hash maps are
42 homogeneous: all of the keys must have the same type, and all of the values
43 must have the same type.
44
45 Another way of constructing a hash map is by using iterators and the `collect`
46 method on a vector of tuples, where each tuple consists of a key and its value.
47 We’ll be going into more detail about iterators and their associated methods in
48 the [”Processing a Series of Items with Iterators” section of Chapter
49 13][iterators]<!-- ignore -->. The `collect` method gathers data into a number
50 of collection types, including `HashMap`. For example, if we had the team names
51 and initial scores in two separate vectors, we could use the `zip` method to
52 create an iterator of tuples where “Blue” is paired with 10, and so forth. Then
53 we could use the `collect` method to turn that iterator of tuples into a hash
54 map, as shown in Listing 8-21.
55
56 ```rust
57 {{#rustdoc_include ../listings/ch08-common-collections/listing-08-21/src/main.rs:here}}
58 ```
59
60 <span class="caption">Listing 8-21: Creating a hash map from a list of teams
61 and a list of scores</span>
62
63 The type annotation `HashMap<_, _>` is needed here because it’s possible to
64 `collect` into many different data structures and Rust doesn’t know which you
65 want unless you specify. For the parameters for the key and value types,
66 however, we use underscores, and Rust can infer the types that the hash map
67 contains based on the types of the data in the vectors. In Listing 8-21, the
68 key type will be `String` and the value type will be `i32`, just as the types
69 were in Listing 8-20.
70
71 ### Hash Maps and Ownership
72
73 For types that implement the `Copy` trait, like `i32`, the values are copied
74 into the hash map. For owned values like `String`, the values will be moved and
75 the hash map will be the owner of those values, as demonstrated in Listing 8-22.
76
77 ```rust
78 {{#rustdoc_include ../listings/ch08-common-collections/listing-08-22/src/main.rs:here}}
79 ```
80
81 <span class="caption">Listing 8-22: Showing that keys and values are owned by
82 the hash map once they’re inserted</span>
83
84 We aren’t able to use the variables `field_name` and `field_value` after
85 they’ve been moved into the hash map with the call to `insert`.
86
87 If we insert references to values into the hash map, the values won’t be moved
88 into the hash map. The values that the references point to must be valid for at
89 least as long as the hash map is valid. We’ll talk more about these issues in
90 the [“Validating References with
91 Lifetimes”][validating-references-with-lifetimes]<!-- ignore --> section in
92 Chapter 10.
93
94 ### Accessing Values in a Hash Map
95
96 We can get a value out of the hash map by providing its key to the `get`
97 method, as shown in Listing 8-23.
98
99 ```rust
100 {{#rustdoc_include ../listings/ch08-common-collections/listing-08-23/src/main.rs:here}}
101 ```
102
103 <span class="caption">Listing 8-23: Accessing the score for the Blue team
104 stored in the hash map</span>
105
106 Here, `score` will have the value that’s associated with the Blue team, and the
107 result will be `Some(&10)`. The result is wrapped in `Some` because `get`
108 returns an `Option<&V>`; if there’s no value for that key in the hash map,
109 `get` will return `None`. The program will need to handle the `Option` in one
110 of the ways that we covered in Chapter 6.
111
112 We can iterate over each key/value pair in a hash map in a similar manner as we
113 do with vectors, using a `for` loop:
114
115 ```rust
116 {{#rustdoc_include ../listings/ch08-common-collections/no-listing-03-iterate-over-hashmap/src/main.rs:here}}
117 ```
118
119 This code will print each pair in an arbitrary order:
120
121 ```text
122 Yellow: 50
123 Blue: 10
124 ```
125
126 ### Updating a Hash Map
127
128 Although the number of keys and values is growable, each key can only have one
129 value associated with it at a time. When you want to change the data in a hash
130 map, you have to decide how to handle the case when a key already has a value
131 assigned. You could replace the old value with the new value, completely
132 disregarding the old value. You could keep the old value and ignore the new
133 value, only adding the new value if the key *doesn’t* already have a value. Or
134 you could combine the old value and the new value. Let’s look at how to do each
135 of these!
136
137 #### Overwriting a Value
138
139 If we insert a key and a value into a hash map and then insert that same key
140 with a different value, the value associated with that key will be replaced.
141 Even though the code in Listing 8-24 calls `insert` twice, the hash map will
142 only contain one key/value pair because we’re inserting the value for the Blue
143 team’s key both times.
144
145 ```rust
146 {{#rustdoc_include ../listings/ch08-common-collections/listing-08-24/src/main.rs:here}}
147 ```
148
149 <span class="caption">Listing 8-24: Replacing a value stored with a particular
150 key</span>
151
152 This code will print `{"Blue": 25}`. The original value of `10` has been
153 overwritten.
154
155 #### Only Inserting a Value If the Key Has No Value
156
157 It’s common to check whether a particular key has a value and, if it doesn’t,
158 insert a value for it. Hash maps have a special API for this called `entry`
159 that takes the key you want to check as a parameter. The return value of the
160 `entry` method is an enum called `Entry` that represents a value that might or
161 might not exist. Let’s say we want to check whether the key for the Yellow team
162 has a value associated with it. If it doesn’t, we want to insert the value 50,
163 and the same for the Blue team. Using the `entry` API, the code looks like
164 Listing 8-25.
165
166 ```rust
167 {{#rustdoc_include ../listings/ch08-common-collections/listing-08-25/src/main.rs:here}}
168 ```
169
170 <span class="caption">Listing 8-25: Using the `entry` method to only insert if
171 the key does not already have a value</span>
172
173 The `or_insert` method on `Entry` is defined to return a mutable reference to
174 the value for the corresponding `Entry` key if that key exists, and if not,
175 inserts the parameter as the new value for this key and returns a mutable
176 reference to the new value. This technique is much cleaner than writing the
177 logic ourselves and, in addition, plays more nicely with the borrow checker.
178
179 Running the code in Listing 8-25 will print `{"Yellow": 50, "Blue": 10}`. The
180 first call to `entry` will insert the key for the Yellow team with the value
181 50 because the Yellow team doesn’t have a value already. The second call to
182 `entry` will not change the hash map because the Blue team already has the
183 value 10.
184
185 #### Updating a Value Based on the Old Value
186
187 Another common use case for hash maps is to look up a key’s value and then
188 update it based on the old value. For instance, Listing 8-26 shows code that
189 counts how many times each word appears in some text. We use a hash map with
190 the words as keys and increment the value to keep track of how many times we’ve
191 seen that word. If it’s the first time we’ve seen a word, we’ll first insert
192 the value 0.
193
194 ```rust
195 {{#rustdoc_include ../listings/ch08-common-collections/listing-08-26/src/main.rs:here}}
196 ```
197
198 <span class="caption">Listing 8-26: Counting occurrences of words using a hash
199 map that stores words and counts</span>
200
201 This code will print `{"world": 2, "hello": 1, "wonderful": 1}`. The
202 `or_insert` method actually returns a mutable reference (`&mut V`) to the value
203 for this key. Here we store that mutable reference in the `count` variable, so
204 in order to assign to that value, we must first dereference `count` using the
205 asterisk (`*`). The mutable reference goes out of scope at the end of the `for`
206 loop, so all of these changes are safe and allowed by the borrowing rules.
207
208 ### Hashing Functions
209
210 By default, `HashMap` uses a hashing function called SipHash that can provide
211 resistance to Denial of Service (DoS) attacks involving hash tables[^siphash]. This
212 is not the fastest hashing algorithm available, but the trade-off for better
213 security that comes with the drop in performance is worth it. If you profile
214 your code and find that the default hash function is too slow for your
215 purposes, you can switch to another function by specifying a different
216 *hasher*. A hasher is a type that implements the `BuildHasher` trait. We’ll
217 talk about traits and how to implement them in Chapter 10. You don’t
218 necessarily have to implement your own hasher from scratch;
219 [crates.io](https://crates.io/) has libraries shared by other Rust users that
220 provide hashers implementing many common hashing algorithms.
221
222 [^siphash]: [https://en.wikipedia.org/wiki/SipHash](https://en.wikipedia.org/wiki/SipHash)
223
224 ## Summary
225
226 Vectors, strings, and hash maps will provide a large amount of functionality
227 necessary in programs when you need to store, access, and modify data. Here are
228 some exercises you should now be equipped to solve:
229
230 * Given a list of integers, use a vector and return the mean (the average
231 value), median (when sorted, the value in the middle position), and mode (the
232 value that occurs most often; a hash map will be helpful here) of the list.
233 * Convert strings to pig latin. The first consonant of each word is moved to
234 the end of the word and “ay” is added, so “first” becomes “irst-fay.” Words
235 that start with a vowel have “hay” added to the end instead (“apple” becomes
236 “apple-hay”). Keep in mind the details about UTF-8 encoding!
237 * Using a hash map and vectors, create a text interface to allow a user to add
238 employee names to a department in a company. For example, “Add Sally to
239 Engineering” or “Add Amir to Sales.” Then let the user retrieve a list of all
240 people in a department or all people in the company by department, sorted
241 alphabetically.
242
243 The standard library API documentation describes methods that vectors, strings,
244 and hash maps have that will be helpful for these exercises!
245
246 We’re getting into more complex programs in which operations can fail, so, it’s
247 a perfect time to discuss error handling. We’ll do that next!
248
249 [iterators]: ch13-02-iterators.html
250 [validating-references-with-lifetimes]:
251 ch10-03-lifetime-syntax.html#validating-references-with-lifetimes