]>
Commit | Line | Data |
---|---|---|
13cf67c4 XL |
1 | ## Storing Keys with Associated Values in Hash Maps |
2 | ||
3 | The last of our common collections is the *hash map*. The type `HashMap<K, V>` | |
4 | stores a mapping of keys of type `K` to values of type `V`. It does this via a | |
5 | *hashing function*, which determines how it places these keys and values into | |
6 | memory. Many programming languages support this kind of data structure, but | |
7 | they often use a different name, such as hash, map, object, hash table, | |
8 | dictionary, or associative array, just to name a few. | |
9 | ||
10 | Hash maps are useful when you want to look up data not by using an index, as | |
11 | you can with vectors, but by using a key that can be of any type. For example, | |
12 | in a game, you could keep track of each team’s score in a hash map in which | |
13 | each key is a team’s name and the values are each team’s score. Given a team | |
14 | name, you can retrieve its score. | |
15 | ||
16 | We’ll go over the basic API of hash maps in this section, but many more goodies | |
17 | are hiding in the functions defined on `HashMap<K, V>` by the standard library. | |
18 | As always, check the standard library documentation for more information. | |
19 | ||
20 | ### Creating a New Hash Map | |
21 | ||
22 | You can create an empty hash map with `new` and add elements with `insert`. In | |
23 | Listing 8-20, we’re keeping track of the scores of two teams whose names are | |
24 | Blue and Yellow. The Blue team starts with 10 points, and the Yellow team | |
69743fb6 | 25 | starts with 50. |
13cf67c4 XL |
26 | |
27 | ```rust | |
74b04a01 | 28 | {{#rustdoc_include ../listings/ch08-common-collections/listing-08-20/src/main.rs:here}} |
13cf67c4 XL |
29 | ``` |
30 | ||
31 | <span class="caption">Listing 8-20: Creating a new hash map and inserting some | |
32 | keys and values</span> | |
33 | ||
34 | Note that we need to first `use` the `HashMap` from the collections portion of | |
35 | the standard library. Of our three common collections, this one is the least | |
36 | often used, so it’s not included in the features brought into scope | |
37 | automatically in the prelude. Hash maps also have less support from the | |
38 | standard library; there’s no built-in macro to construct them, for example. | |
39 | ||
40 | Just like vectors, hash maps store their data on the heap. This `HashMap` has | |
41 | keys of type `String` and values of type `i32`. Like vectors, hash maps are | |
42 | homogeneous: all of the keys must have the same type, and all of the values | |
43 | must have the same type. | |
44 | ||
74b04a01 XL |
45 | Another way of constructing a hash map is by using iterators and the `collect` |
46 | method on a vector of tuples, where each tuple consists of a key and its value. | |
47 | We’ll be going into more detail about iterators and their associated methods in | |
48 | the [”Processing a Series of Items with Iterators” section of Chapter | |
49 | 13][iterators]<!-- ignore -->. The `collect` method gathers data into a number | |
50 | of collection types, including `HashMap`. For example, if we had the team names | |
51 | and initial scores in two separate vectors, we could use the `zip` method to | |
52 | create a vector of tuples where “Blue” is paired with 10, and so forth. Then we | |
53 | could use the `collect` method to turn that vector of tuples into a hash map, | |
54 | as shown in Listing 8-21. | |
13cf67c4 XL |
55 | |
56 | ```rust | |
74b04a01 | 57 | {{#rustdoc_include ../listings/ch08-common-collections/listing-08-21/src/main.rs:here}} |
13cf67c4 XL |
58 | ``` |
59 | ||
60 | <span class="caption">Listing 8-21: Creating a hash map from a list of teams | |
61 | and a list of scores</span> | |
62 | ||
63 | The type annotation `HashMap<_, _>` is needed here because it’s possible to | |
64 | `collect` into many different data structures and Rust doesn’t know which you | |
65 | want unless you specify. For the parameters for the key and value types, | |
66 | however, we use underscores, and Rust can infer the types that the hash map | |
74b04a01 XL |
67 | contains based on the types of the data in the vectors. In Listing 8-21, the |
68 | key type will be `String` and the value type will be `i32`, just as the types | |
69 | were in Listing 8-20. | |
13cf67c4 XL |
70 | |
71 | ### Hash Maps and Ownership | |
72 | ||
73 | For types that implement the `Copy` trait, like `i32`, the values are copied | |
74 | into the hash map. For owned values like `String`, the values will be moved and | |
69743fb6 | 75 | the hash map will be the owner of those values, as demonstrated in Listing 8-22. |
13cf67c4 XL |
76 | |
77 | ```rust | |
74b04a01 | 78 | {{#rustdoc_include ../listings/ch08-common-collections/listing-08-22/src/main.rs:here}} |
13cf67c4 XL |
79 | ``` |
80 | ||
81 | <span class="caption">Listing 8-22: Showing that keys and values are owned by | |
82 | the hash map once they’re inserted</span> | |
83 | ||
84 | We aren’t able to use the variables `field_name` and `field_value` after | |
85 | they’ve been moved into the hash map with the call to `insert`. | |
86 | ||
87 | If we insert references to values into the hash map, the values won’t be moved | |
88 | into the hash map. The values that the references point to must be valid for at | |
89 | least as long as the hash map is valid. We’ll talk more about these issues in | |
dc9dc135 XL |
90 | the [“Validating References with |
91 | Lifetimes”][validating-references-with-lifetimes]<!-- ignore --> section in | |
92 | Chapter 10. | |
13cf67c4 XL |
93 | |
94 | ### Accessing Values in a Hash Map | |
95 | ||
96 | We can get a value out of the hash map by providing its key to the `get` | |
69743fb6 | 97 | method, as shown in Listing 8-23. |
13cf67c4 XL |
98 | |
99 | ```rust | |
74b04a01 | 100 | {{#rustdoc_include ../listings/ch08-common-collections/listing-08-23/src/main.rs:here}} |
13cf67c4 XL |
101 | ``` |
102 | ||
103 | <span class="caption">Listing 8-23: Accessing the score for the Blue team | |
104 | stored in the hash map</span> | |
105 | ||
106 | Here, `score` will have the value that’s associated with the Blue team, and the | |
107 | result will be `Some(&10)`. The result is wrapped in `Some` because `get` | |
108 | returns an `Option<&V>`; if there’s no value for that key in the hash map, | |
109 | `get` will return `None`. The program will need to handle the `Option` in one | |
110 | of the ways that we covered in Chapter 6. | |
111 | ||
112 | We can iterate over each key/value pair in a hash map in a similar manner as we | |
113 | do with vectors, using a `for` loop: | |
114 | ||
115 | ```rust | |
74b04a01 | 116 | {{#rustdoc_include ../listings/ch08-common-collections/no-listing-03-iterate-over-hashmap/src/main.rs:here}} |
13cf67c4 XL |
117 | ``` |
118 | ||
119 | This code will print each pair in an arbitrary order: | |
120 | ||
121 | ```text | |
122 | Yellow: 50 | |
123 | Blue: 10 | |
124 | ``` | |
125 | ||
126 | ### Updating a Hash Map | |
127 | ||
128 | Although the number of keys and values is growable, each key can only have one | |
129 | value associated with it at a time. When you want to change the data in a hash | |
130 | map, you have to decide how to handle the case when a key already has a value | |
131 | assigned. You could replace the old value with the new value, completely | |
132 | disregarding the old value. You could keep the old value and ignore the new | |
133 | value, only adding the new value if the key *doesn’t* already have a value. Or | |
134 | you could combine the old value and the new value. Let’s look at how to do each | |
135 | of these! | |
136 | ||
137 | #### Overwriting a Value | |
138 | ||
139 | If we insert a key and a value into a hash map and then insert that same key | |
140 | with a different value, the value associated with that key will be replaced. | |
141 | Even though the code in Listing 8-24 calls `insert` twice, the hash map will | |
142 | only contain one key/value pair because we’re inserting the value for the Blue | |
69743fb6 | 143 | team’s key both times. |
13cf67c4 XL |
144 | |
145 | ```rust | |
74b04a01 | 146 | {{#rustdoc_include ../listings/ch08-common-collections/listing-08-24/src/main.rs:here}} |
13cf67c4 XL |
147 | ``` |
148 | ||
149 | <span class="caption">Listing 8-24: Replacing a value stored with a particular | |
150 | key</span> | |
151 | ||
152 | This code will print `{"Blue": 25}`. The original value of `10` has been | |
153 | overwritten. | |
154 | ||
155 | #### Only Inserting a Value If the Key Has No Value | |
156 | ||
157 | It’s common to check whether a particular key has a value and, if it doesn’t, | |
158 | insert a value for it. Hash maps have a special API for this called `entry` | |
159 | that takes the key you want to check as a parameter. The return value of the | |
69743fb6 XL |
160 | `entry` method is an enum called `Entry` that represents a value that might or |
161 | might not exist. Let’s say we want to check whether the key for the Yellow team | |
162 | has a value associated with it. If it doesn’t, we want to insert the value 50, | |
163 | and the same for the Blue team. Using the `entry` API, the code looks like | |
164 | Listing 8-25. | |
13cf67c4 XL |
165 | |
166 | ```rust | |
74b04a01 | 167 | {{#rustdoc_include ../listings/ch08-common-collections/listing-08-25/src/main.rs:here}} |
13cf67c4 XL |
168 | ``` |
169 | ||
170 | <span class="caption">Listing 8-25: Using the `entry` method to only insert if | |
171 | the key does not already have a value</span> | |
172 | ||
173 | The `or_insert` method on `Entry` is defined to return a mutable reference to | |
174 | the value for the corresponding `Entry` key if that key exists, and if not, | |
175 | inserts the parameter as the new value for this key and returns a mutable | |
176 | reference to the new value. This technique is much cleaner than writing the | |
177 | logic ourselves and, in addition, plays more nicely with the borrow checker. | |
178 | ||
179 | Running the code in Listing 8-25 will print `{"Yellow": 50, "Blue": 10}`. The | |
180 | first call to `entry` will insert the key for the Yellow team with the value | |
69743fb6 | 181 | 50 because the Yellow team doesn’t have a value already. The second call to |
13cf67c4 | 182 | `entry` will not change the hash map because the Blue team already has the |
69743fb6 | 183 | value 10. |
13cf67c4 XL |
184 | |
185 | #### Updating a Value Based on the Old Value | |
186 | ||
187 | Another common use case for hash maps is to look up a key’s value and then | |
188 | update it based on the old value. For instance, Listing 8-26 shows code that | |
189 | counts how many times each word appears in some text. We use a hash map with | |
190 | the words as keys and increment the value to keep track of how many times we’ve | |
191 | seen that word. If it’s the first time we’ve seen a word, we’ll first insert | |
69743fb6 | 192 | the value 0. |
13cf67c4 XL |
193 | |
194 | ```rust | |
74b04a01 | 195 | {{#rustdoc_include ../listings/ch08-common-collections/listing-08-26/src/main.rs:here}} |
13cf67c4 XL |
196 | ``` |
197 | ||
198 | <span class="caption">Listing 8-26: Counting occurrences of words using a hash | |
199 | map that stores words and counts</span> | |
200 | ||
201 | This code will print `{"world": 2, "hello": 1, "wonderful": 1}`. The | |
202 | `or_insert` method actually returns a mutable reference (`&mut V`) to the value | |
203 | for this key. Here we store that mutable reference in the `count` variable, so | |
204 | in order to assign to that value, we must first dereference `count` using the | |
205 | asterisk (`*`). The mutable reference goes out of scope at the end of the `for` | |
206 | loop, so all of these changes are safe and allowed by the borrowing rules. | |
207 | ||
208 | ### Hashing Functions | |
209 | ||
6a06907d XL |
210 | By default, `HashMap` uses a hashing function called SipHash that can provide |
211 | resistance to Denial of Service (DoS) attacks involving hash tables[^siphash]. This | |
69743fb6 XL |
212 | is not the fastest hashing algorithm available, but the trade-off for better |
213 | security that comes with the drop in performance is worth it. If you profile | |
214 | your code and find that the default hash function is too slow for your | |
215 | purposes, you can switch to another function by specifying a different | |
216 | *hasher*. A hasher is a type that implements the `BuildHasher` trait. We’ll | |
217 | talk about traits and how to implement them in Chapter 10. You don’t | |
218 | necessarily have to implement your own hasher from scratch; | |
dc9dc135 | 219 | [crates.io](https://crates.io/) has libraries shared by other Rust users that |
69743fb6 | 220 | provide hashers implementing many common hashing algorithms. |
13cf67c4 | 221 | |
fc512014 | 222 | [^siphash]: [https://en.wikipedia.org/wiki/SipHash](https://en.wikipedia.org/wiki/SipHash) |
13cf67c4 XL |
223 | |
224 | ## Summary | |
225 | ||
226 | Vectors, strings, and hash maps will provide a large amount of functionality | |
227 | necessary in programs when you need to store, access, and modify data. Here are | |
228 | some exercises you should now be equipped to solve: | |
229 | ||
230 | * Given a list of integers, use a vector and return the mean (the average | |
231 | value), median (when sorted, the value in the middle position), and mode (the | |
232 | value that occurs most often; a hash map will be helpful here) of the list. | |
233 | * Convert strings to pig latin. The first consonant of each word is moved to | |
234 | the end of the word and “ay” is added, so “first” becomes “irst-fay.” Words | |
235 | that start with a vowel have “hay” added to the end instead (“apple” becomes | |
236 | “apple-hay”). Keep in mind the details about UTF-8 encoding! | |
237 | * Using a hash map and vectors, create a text interface to allow a user to add | |
238 | employee names to a department in a company. For example, “Add Sally to | |
239 | Engineering” or “Add Amir to Sales.” Then let the user retrieve a list of all | |
240 | people in a department or all people in the company by department, sorted | |
241 | alphabetically. | |
242 | ||
243 | The standard library API documentation describes methods that vectors, strings, | |
244 | and hash maps have that will be helpful for these exercises! | |
245 | ||
246 | We’re getting into more complex programs in which operations can fail, so, it’s | |
247 | a perfect time to discuss error handling. We’ll do that next! | |
9fa01778 | 248 | |
74b04a01 | 249 | [iterators]: ch13-02-iterators.html |
9fa01778 XL |
250 | [validating-references-with-lifetimes]: |
251 | ch10-03-lifetime-syntax.html#validating-references-with-lifetimes |