git.proxmox.com Git - rustc.git/blame - library/std/src/collections/mod.rs
New upstream version 1.52.0~beta.3+dfsg1
//! Collection types.
//!
//! Rust's standard collection library provides efficient implementations of the
//! most common general purpose programming data structures. By using the
//! standard implementations, it should be possible for two libraries to
//! communicate without significant data conversion.
//!
//! To get this out of the way: you should probably just use [`Vec`] or [`HashMap`].
//! These two collections cover most use cases for generic data storage and
//! processing. They are exceptionally good at doing what they do. All the other
//! collections in the standard library have specific use cases where they are
//! the optimal choice, but these cases are borderline *niche* in comparison.
//! Even when `Vec` and `HashMap` are technically suboptimal, they're probably a
//! good enough choice to get started.
//!
//! Rust's collections can be grouped into four major categories:
//!
//! * Sequences: [`Vec`], [`VecDeque`], [`LinkedList`]
//! * Maps: [`HashMap`], [`BTreeMap`]
//! * Sets: [`HashSet`], [`BTreeSet`]
//! * Misc: [`BinaryHeap`]
//!
//! # When Should You Use Which Collection?
//!
//! These are fairly high-level and quick break-downs of when each collection
//! should be considered. Detailed discussions of strengths and weaknesses of
//! individual collections can be found on their own documentation pages.
//!
//! ### Use a `Vec` when:
//! * You want to collect items up to be processed or sent elsewhere later, and
//!   don't care about any properties of the actual values being stored.
//! * You want a sequence of elements in a particular order, and will only be
//!   appending to (or near) the end.
//! * You want a stack.
//! * You want a resizable array.
//! * You want a heap-allocated array.
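//!
//! As a rough sketch of the stack case (`reverse_with_stack` is a hypothetical
//! helper made up for this illustration, not a standard library function),
//! `push` and `pop` both operate on the end of the `Vec`:
//!
//! ```
//! // Hypothetical helper: reverses a slice by using a `Vec` as a stack.
//! fn reverse_with_stack(items: &[i32]) -> Vec<i32> {
//!     let mut stack: Vec<i32> = Vec::new();
//!     for &item in items {
//!         stack.push(item); // push onto the top of the stack
//!     }
//!
//!     let mut reversed = Vec::with_capacity(stack.len());
//!     // `pop` returns the most recently pushed element, or `None` when empty.
//!     while let Some(top) = stack.pop() {
//!         reversed.push(top);
//!     }
//!     reversed
//! }
//! ```
//!
//! With this sketch, `reverse_with_stack(&[1, 2, 3])` returns `vec![3, 2, 1]`.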
//!
//! ### Use a `VecDeque` when:
//! * You want a [`Vec`] that supports efficient insertion at both ends of the
//!   sequence.
//! * You want a queue.
//! * You want a double-ended queue (deque).
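//!
//! A minimal sketch of the queue and deque cases, assuming a made-up helper
//! named `serve_order`: ordinary jobs are enqueued at the back, one urgent job
//! jumps in at the front, and everything is served from the front.
//!
//! ```
//! use std::collections::VecDeque;
//!
//! // Hypothetical helper: returns the order in which jobs would be served.
//! fn serve_order(jobs: &[u32], urgent: u32) -> Vec<u32> {
//!     let mut queue: VecDeque<u32> = VecDeque::new();
//!     for &job in jobs {
//!         queue.push_back(job); // enqueue at the back
//!     }
//!     queue.push_front(urgent); // insertion at the front is also cheap
//!
//!     let mut served = Vec::with_capacity(queue.len());
//!     while let Some(job) = queue.pop_front() {
//!         served.push(job); // dequeue from the front
//!     }
//!     served
//! }
//! ```
//!
//! Here `serve_order(&[1, 2, 3], 9)` returns `vec![9, 1, 2, 3]`.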
//!
//! ### Use a `LinkedList` when:
//! * You want a [`Vec`] or [`VecDeque`] of unknown size, and can't tolerate
//!   amortization.
//! * You want to efficiently split and append lists.
//! * You are *absolutely* certain you *really*, *truly*, want a doubly linked
//!   list.
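//!
//! The split-and-append case might look like the following sketch. The helper
//! `rotate_left` is hypothetical and only meant to show `split_off` and
//! `append` working together.
//!
//! ```
//! use std::collections::LinkedList;
//!
//! // Hypothetical helper: moves the first `k` elements to the end of the list.
//! // Like `split_off` itself, this panics if `k` is greater than the length.
//! fn rotate_left(mut list: LinkedList<i32>, k: usize) -> LinkedList<i32> {
//!     // `list` keeps the first `k` elements; `tail` receives the rest.
//!     let mut tail = list.split_off(k);
//!     // `append` relinks all of `list` onto the end of `tail`, leaving `list` empty.
//!     tail.append(&mut list);
//!     tail
//! }
//! ```
//!
//! Rotating the list `1, 2, 3, 4, 5` by `2` yields `3, 4, 5, 1, 2`.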
//!
//! ### Use a `HashMap` when:
//! * You want to associate arbitrary keys with arbitrary values.
//! * You want a cache.
//! * You want a map, with no extra functionality.
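//!
//! As an illustration of the cache case, here is a small sketch. The
//! `SquareCache` type and its `misses` counter are invented for this example,
//! and squaring stands in for genuinely expensive work.
//!
//! ```
//! use std::collections::HashMap;
//!
//! // Hypothetical cache: remembers results that were already computed.
//! struct SquareCache {
//!     cache: HashMap<u64, u64>,
//!     misses: usize,
//! }
//!
//! impl SquareCache {
//!     fn new() -> Self {
//!         SquareCache { cache: HashMap::new(), misses: 0 }
//!     }
//!
//!     fn get(&mut self, n: u64) -> u64 {
//!         // Cache hit: return the stored result.
//!         if let Some(&hit) = self.cache.get(&n) {
//!             return hit;
//!         }
//!         // Cache miss: compute, remember, and return.
//!         self.misses += 1;
//!         let value = n * n; // stand-in for an expensive computation
//!         self.cache.insert(n, value);
//!         value
//!     }
//! }
//! ```
//!
//! Calling `get(4)` twice on a fresh `SquareCache` returns `16` both times and
//! leaves `misses` at `1`.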
//!
//! ### Use a `BTreeMap` when:
//! * You want a map sorted by its keys.
//! * You want to be able to get a range of entries on-demand.
//! * You're interested in what the smallest or largest key-value pair is.
//! * You want to find the largest or smallest key that is smaller or larger
//!   than something.
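//!
//! The ordered queries above can be sketched with `range`, `keys`, and
//! double-ended iteration. The helper names `largest_below` and `key_bounds`
//! are assumptions made for this example.
//!
//! ```
//! use std::collections::BTreeMap;
//!
//! // Hypothetical helper: the entry with the largest key strictly below `limit`.
//! fn largest_below(
//!     map: &BTreeMap<u32, &'static str>,
//!     limit: u32,
//! ) -> Option<(u32, &'static str)> {
//!     // `range(..limit)` visits keys below `limit` in sorted order;
//!     // `next_back` takes the last (largest) of them.
//!     map.range(..limit).next_back().map(|(&key, &value)| (key, value))
//! }
//!
//! // Hypothetical helper: the smallest and largest keys of a non-empty map.
//! fn key_bounds(map: &BTreeMap<u32, &'static str>) -> Option<(u32, u32)> {
//!     let smallest = *map.keys().next()?;
//!     let largest = *map.keys().next_back()?;
//!     Some((smallest, largest))
//! }
//! ```
//!
//! For a map with keys `10`, `20` and `30`, `largest_below(&map, 25)` finds the
//! entry for `20`, and `key_bounds(&map)` returns `Some((10, 30))`.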
//!
//! ### Use the `Set` variant of any of these `Map`s when:
//! * You just want to remember which keys you've seen.
//! * There is no meaningful value to associate with your keys.
//! * You just want a set.
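//!
//! A short sketch of the "remember what you've seen" case, using a made-up
//! helper `dedup_preserving_order`. It relies on `HashSet::insert` returning
//! `true` only when the value was not already present.
//!
//! ```
//! use std::collections::HashSet;
//!
//! // Hypothetical helper: keeps the first occurrence of each value, in order.
//! fn dedup_preserving_order(items: &[i32]) -> Vec<i32> {
//!     let mut seen = HashSet::new();
//!     let mut unique = Vec::new();
//!     for &item in items {
//!         // `insert` reports whether the value is new to the set.
//!         if seen.insert(item) {
//!             unique.push(item);
//!         }
//!     }
//!     unique
//! }
//! ```
//!
//! For example, `dedup_preserving_order(&[3, 1, 3, 2, 1])` returns
//! `vec![3, 1, 2]`.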
//!
//! ### Use a `BinaryHeap` when:
//!
//! * You want to store a bunch of elements, but only ever want to process the
//!   "biggest" or "most important" one at any given time.
//! * You want a priority queue.
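//!
//! As a sketch of the priority-queue case, the hypothetical helpers below pop
//! the greatest elements first, and use `std::cmp::Reverse` to flip the order
//! when the smallest element is wanted instead.
//!
//! ```
//! use std::cmp::Reverse;
//! use std::collections::BinaryHeap;
//!
//! // Hypothetical helper: returns up to `k` of the largest values, biggest first.
//! fn k_largest(items: &[i32], k: usize) -> Vec<i32> {
//!     let mut heap: BinaryHeap<i32> = items.iter().copied().collect();
//!     let mut out = Vec::new();
//!     while out.len() < k {
//!         match heap.pop() {
//!             Some(greatest) => out.push(greatest), // `pop` yields the greatest item
//!             None => break,                        // fewer than `k` items available
//!         }
//!     }
//!     out
//! }
//!
//! // Hypothetical helper: wrapping items in `Reverse` turns the heap into a min-heap.
//! fn smallest(items: &[i32]) -> Option<i32> {
//!     let mut heap: BinaryHeap<Reverse<i32>> =
//!         items.iter().map(|&x| Reverse(x)).collect();
//!     heap.pop().map(|Reverse(x)| x)
//! }
//! ```
//!
//! With the input `[5, 1, 8, 3]`, `k_largest(.., 2)` returns `vec![8, 5]` and
//! `smallest(..)` returns `Some(1)`.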
//!
//! # Performance
//!
//! Choosing the right collection for the job requires an understanding of what
//! each collection is good at. Here we briefly summarize the performance of
//! different collections for certain important operations. For further details,
//! see each type's documentation, and note that the names of actual methods may
//! differ from the tables below on certain collections.
//!
//! Throughout the documentation, we will follow a few conventions. For all
//! operations, the collection's size is denoted by n. If another collection is
//! involved in the operation, it contains m elements. Operations which have an
//! *amortized* cost are suffixed with a `*`. Operations with an *expected*
//! cost are suffixed with a `~`.
//!
//! All amortized costs are for the potential need to resize when capacity is
//! exhausted. If a resize occurs it will take *O*(*n*) time. Our collections never
//! automatically shrink, so removal operations aren't amortized. Over a
//! sufficiently large series of operations, the average cost per operation will
//! deterministically equal the given cost.
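//!
//! One way to observe amortization is to count how often a `Vec` changes its
//! capacity while being filled. This is only a sketch: `count_resizes` is a
//! hypothetical helper, and the exact growth strategy of `Vec` is an
//! implementation detail that this documentation does not specify.
//!
//! ```
//! // Hypothetical helper: pushes `n` elements into an empty `Vec` and counts
//! // how many times the capacity changed, i.e. how many resizes happened.
//! fn count_resizes(n: usize) -> usize {
//!     let mut vec: Vec<usize> = Vec::new();
//!     let mut resizes = 0;
//!     let mut last_capacity = vec.capacity();
//!     for i in 0..n {
//!         vec.push(i);
//!         if vec.capacity() != last_capacity {
//!             resizes += 1;
//!             last_capacity = vec.capacity();
//!         }
//!     }
//!     resizes
//! }
//! ```
//!
//! The number returned for a large `n` should be far smaller than `n` itself,
//! which is why the occasional *O*(*n*) resize averages out per `push`.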
//!
//! Only [`HashMap`] has expected costs, due to the probabilistic nature of hashing.
//! It is theoretically possible, though very unlikely, for [`HashMap`] to
//! experience worse performance.
//!
//! ## Sequences
//!
//! |                | get(i)         | insert(i)       | remove(i)      | append | split_off(i)   |
//! |----------------|----------------|-----------------|----------------|--------|----------------|
//! | [`Vec`]        | O(1)           | O(n-i)*         | O(n-i)         | O(m)*  | O(n-i)         |
//! | [`VecDeque`]   | O(1)           | O(min(i, n-i))* | O(min(i, n-i)) | O(m)*  | O(min(i, n-i)) |
//! | [`LinkedList`] | O(min(i, n-i)) | O(min(i, n-i))  | O(min(i, n-i)) | O(1)   | O(min(i, n-i)) |
//!
//! Note that where ties occur, [`Vec`] is generally going to be faster than [`VecDeque`], and
//! [`VecDeque`] is generally going to be faster than [`LinkedList`].
//!
//! ## Maps
//!
//! For Sets, all operations have the cost of the equivalent Map operation.
//!
//! |              | get       | insert    | remove    | range     | append |
//! |--------------|-----------|-----------|-----------|-----------|--------|
//! | [`HashMap`]  | O(1)~     | O(1)~*    | O(1)~     | N/A       | N/A    |
//! | [`BTreeMap`] | O(log(n)) | O(log(n)) | O(log(n)) | O(log(n)) | O(n+m) |
//!
//! # Correct and Efficient Usage of Collections
//!
//! Of course, knowing which collection is the right one for the job doesn't
//! instantly permit you to use it correctly. Here are some quick tips for
//! efficient and correct usage of the standard collections in general. If
//! you're interested in how to use a specific collection in particular, consult
//! its documentation for detailed discussion and code examples.
//!
//! ## Capacity Management
//!
//! Many collections provide several constructors and methods that refer to
//! "capacity". These collections are generally built on top of an array.
//! Optimally, this array would be exactly the right size to fit only the
//! elements stored in the collection, but for the collection to do this would
//! be very inefficient. If the backing array were exactly the right size at all
//! times, then every time an element is inserted, the collection would have to
//! grow the array to fit it. Due to the way memory is allocated and managed on
//! most computers, this would almost surely require allocating an entirely new
//! array and copying every single element from the old one into the new one.
//! Hopefully you can see that this wouldn't be very efficient to do on every
//! operation.
//!
//! Most collections therefore use an *amortized* allocation strategy. They
//! generally let themselves have a fair amount of unoccupied space so that they
//! only have to grow on occasion. When they do grow, they allocate a
//! substantially larger array to move the elements into so that it will take a
//! while for another grow to be required. While this strategy is great in
//! general, it would be even better if the collection *never* had to resize its
//! backing array. Unfortunately, the collection doesn't have enough
//! information to do this itself. Therefore, it is up to us programmers to give
//! it hints.
//!
//! Any `with_capacity` constructor will instruct the collection to allocate
//! enough space for the specified number of elements. Ideally this will be for
//! exactly that many elements, but some implementation details may prevent
//! this. See collection-specific documentation for details. In general, use
//! `with_capacity` when you know exactly how many elements will be inserted, or
//! at least have a reasonable upper-bound on that number.
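//!
//! For instance, when the final length is known up front, a sketch like the
//! following requests the space once. The helper `squares` is hypothetical;
//! the check relies on `Vec::with_capacity(n)` providing room for at least `n`
//! elements.
//!
//! ```
//! // Hypothetical helper: collects the squares of `0..n` without regrowing.
//! fn squares(n: usize) -> Vec<usize> {
//!     // The element count is known, so ask for all the space at once.
//!     let mut out: Vec<usize> = Vec::with_capacity(n);
//!     let initial_capacity = out.capacity();
//!     for i in 0..n {
//!         out.push(i * i);
//!     }
//!     // No push exceeded the requested capacity, so no resize took place.
//!     debug_assert_eq!(out.capacity(), initial_capacity);
//!     out
//! }
//! ```
//!
//! Here `squares(4)` returns `vec![0, 1, 4, 9]`.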
//!
//! When anticipating a large influx of elements, the `reserve` family of
//! methods can be used to hint to the collection how much room it should make
//! for the coming items. As with `with_capacity`, the precise behavior of
//! these methods will be specific to the collection of interest.
//!
//! For optimal performance, collections will generally avoid shrinking
//! themselves. If you believe that a collection will not soon contain any more
//! elements, or just really need the memory, the `shrink_to_fit` method prompts
//! the collection to shrink the backing array to the minimum size capable of
//! holding its elements.
//!
//! Finally, if ever you're interested in what the actual capacity of the
//! collection is, most collections provide a `capacity` method to query this
//! information on demand. This can be useful for debugging purposes, or for
//! use with the `reserve` methods.
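//!
//! The three methods can be combined as in the sketch below, shown here for
//! `Vec`. The helper `append_and_trim` is made up for illustration, and the
//! capacity left after `shrink_to_fit` may still be somewhat larger than the
//! length.
//!
//! ```
//! // Hypothetical helper: appends `extra` to `vec`, then releases spare space.
//! fn append_and_trim(vec: &mut Vec<u8>, extra: &[u8]) {
//!     // Hint that `extra.len()` more elements are about to arrive.
//!     vec.reserve(extra.len());
//!     debug_assert!(vec.capacity() >= vec.len() + extra.len());
//!
//!     for &byte in extra {
//!         vec.push(byte); // none of these pushes needs to grow the buffer
//!     }
//!
//!     // No more elements are expected, so give back unused capacity.
//!     vec.shrink_to_fit();
//! }
//! ```
//!
//! After `append_and_trim(&mut v, &[3, 4, 5])` on `v = vec![1, 2]`, `v` holds
//! `[1, 2, 3, 4, 5]` and `v.capacity()` is at least `v.len()`.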
//!
//! ## Iterators
//!
//! Iterators are a powerful and robust mechanism used throughout Rust's
//! standard libraries. Iterators provide a sequence of values in a generic,
//! safe, efficient and convenient way. The contents of an iterator are usually
//! *lazily* evaluated, so that only the values that are actually needed are
//! ever actually produced, and no allocation need be done to temporarily store
//! them. Iterators are primarily consumed using a `for` loop, although many
//! functions also take iterators where a collection or sequence of values is
//! desired.
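//!
//! Laziness is what makes the following sketch terminate even though it starts
//! from an unbounded range. The helper name `first_even_squares` is an
//! assumption for this example.
//!
//! ```
//! // Hypothetical helper: returns the first `n` even square numbers.
//! fn first_even_squares(n: usize) -> Vec<u64> {
//!     // `1..` has no upper bound. Nothing is computed until `collect` pulls
//!     // values through the chain, and `take` stops the chain after `n` items.
//!     (1u64..)
//!         .map(|x| x * x)
//!         .filter(|square| square % 2 == 0)
//!         .take(n)
//!         .collect()
//! }
//! ```
//!
//! Here `first_even_squares(3)` returns `vec![4, 16, 36]`, and no intermediate
//! collection of squares is ever allocated.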
//!
//! All of the standard collections provide several iterators for performing
//! bulk manipulation of their contents. The three primary iterators almost
//! every collection should provide are `iter`, `iter_mut`, and `into_iter`.
//! Some of these are not provided on collections where it would be unsound or
//! unreasonable to provide them.
//!
//! `iter` provides an iterator of immutable references to all the contents of a
//! collection in the most "natural" order. For sequence collections like [`Vec`],
//! this means the items will be yielded in increasing order of index starting
//! at 0. For ordered collections like [`BTreeMap`], this means that the items
//! will be yielded in sorted order. For unordered collections like [`HashMap`],
//! the items will be yielded in whatever order the internal representation made
//! most convenient. This is great for reading through all the contents of the
//! collection.
//!
//! ```
//! let vec = vec![1, 2, 3, 4];
//! for x in vec.iter() {
//!     println!("vec contained {}", x);
//! }
//! ```
//!
//! `iter_mut` provides an iterator of *mutable* references in the same order as
//! `iter`. This is great for mutating all the contents of the collection.
//!
//! ```
//! let mut vec = vec![1, 2, 3, 4];
//! for x in vec.iter_mut() {
//!     *x += 1;
//! }
//! ```
//!
//! `into_iter` transforms the actual collection into an iterator over its
//! contents by-value. This is great when the collection itself is no longer
//! needed, and the values are needed elsewhere. Using `extend` with `into_iter`
//! is the main way that contents of one collection are moved into another.
//! `extend` automatically calls `into_iter`, and takes any `T: `[`IntoIterator`].
//! Calling `collect` on an iterator itself is also a great way to convert one
//! collection into another. Both of these methods should internally use the
//! capacity management tools discussed in the previous section to do this as
//! efficiently as possible.
//!
//! ```
//! let mut vec1 = vec![1, 2, 3, 4];
//! let vec2 = vec![10, 20, 30, 40];
//! vec1.extend(vec2);
//! ```
//!
//! ```
//! use std::collections::VecDeque;
//!
//! let vec = vec![1, 2, 3, 4];
//! let buf: VecDeque<_> = vec.into_iter().collect();
//! ```
//!
//! Iterators also provide a series of *adapter* methods for performing common
//! transformations on sequences. Among the adapters are functional favorites like `map`,
//! `fold`, `skip` and `take`. Of particular interest to collections is the
//! `rev` adapter, which reverses any iterator that supports this operation. Most
//! collections provide reversible iterators as the way to iterate over them in
//! reverse order.
//!
//! ```
//! let vec = vec![1, 2, 3, 4];
//! for x in vec.iter().rev() {
//!     println!("vec contained {}", x);
//! }
//! ```
//!
//! Several other collection methods also return iterators to yield a sequence
//! of results but avoid allocating an entire collection to store the result in.
//! This provides maximum flexibility as `collect` or `extend` can be called to
//! "pipe" the sequence into any collection if desired. Otherwise, the sequence
//! can be looped over with a `for` loop. The iterator can also be discarded
//! after partial use, preventing the computation of the unused items.
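//!
//! `Vec::drain` is one such method. In the sketch below (the helper
//! `take_front` is hypothetical), the drained items are piped straight into a
//! different collection with `collect`:
//!
//! ```
//! use std::collections::VecDeque;
//!
//! // Hypothetical helper: moves up to `n` leading items out of `source`.
//! fn take_front(source: &mut Vec<i32>, n: usize) -> VecDeque<i32> {
//!     // Clamp `n` so the range passed to `drain` is always in bounds.
//!     let n = n.min(source.len());
//!     // `drain` yields the removed items as an iterator; `collect` decides
//!     // which collection they end up in.
//!     source.drain(..n).collect()
//! }
//! ```
//!
//! Calling `take_front(&mut v, 2)` on `v = vec![1, 2, 3, 4]` returns a
//! `VecDeque` containing `1, 2` and leaves `v` as `[3, 4]`.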
//!
//! ## Entries
//!
//! The `entry` API is intended to provide an efficient mechanism for
//! manipulating the contents of a map conditionally on the presence of a key or
//! not. The primary motivating use case for this is to provide efficient
//! accumulator maps. For instance, if one wishes to maintain a count of the
//! number of times each key has been seen, they will have to perform some
//! conditional logic on whether this is the first time the key has been seen or
//! not. Normally, this would require a lookup (such as `get`) followed by an
//! `insert`, effectively duplicating the search effort on each insertion.
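//!
//! To make the difference concrete, here is a sketch of both approaches to the
//! counting problem. The helper names `count_two_step` and `count_with_entry`
//! are invented for this comparison.
//!
//! ```
//! use std::collections::HashMap;
//!
//! // Hypothetical helper: a lookup followed by a separate `insert`. A key that
//! // is not present yet is searched for twice.
//! fn count_two_step(words: &[&'static str]) -> HashMap<&'static str, u32> {
//!     let mut counts: HashMap<&'static str, u32> = HashMap::new();
//!     for &word in words {
//!         if let Some(n) = counts.get_mut(word) {
//!             *n += 1;
//!         } else {
//!             counts.insert(word, 1);
//!         }
//!     }
//!     counts
//! }
//!
//! // Hypothetical helper: the same logic with `entry`, one search per word.
//! fn count_with_entry(words: &[&'static str]) -> HashMap<&'static str, u32> {
//!     let mut counts: HashMap<&'static str, u32> = HashMap::new();
//!     for &word in words {
//!         *counts.entry(word).or_insert(0) += 1;
//!     }
//!     counts
//! }
//! ```
//!
//! Both helpers produce the same map; for `["a", "b", "a"]` each reports a
//! count of `2` for `"a"` and `1` for `"b"`.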
//!
//! When a user calls `map.entry(key)`, the map will search for the key and
//! then yield a variant of the `Entry` enum.
//!
//! If a `Vacant(entry)` is yielded, then the key *was not* found. In this case
//! the only valid operation is to `insert` a value into the entry. When this is
//! done, the vacant entry is consumed and converted into a mutable reference to
//! the value that was inserted. This allows for further manipulation of the
//! value beyond the lifetime of the search itself. This is useful if complex
//! logic needs to be performed on the value regardless of whether the value was
//! just inserted.
//!
//! If an `Occupied(entry)` is yielded, then the key *was* found. In this case,
//! the user has several options: they can `get`, `insert` or `remove` the
//! value of the occupied entry. Additionally, they can convert the occupied
//! entry into a mutable reference to its value, providing symmetry to the
//! vacant `insert` case.
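//!
//! Matching on the two variants directly might look like the sketch below. The
//! helper `record_best` and its "keep the highest score" rule are assumptions
//! chosen for illustration.
//!
//! ```
//! use std::collections::hash_map::Entry;
//! use std::collections::HashMap;
//!
//! // Hypothetical helper: records `score` for `player`, keeping only the best
//! // one. Returns `true` if the stored score changed.
//! fn record_best(scores: &mut HashMap<String, u32>, player: &str, score: u32) -> bool {
//!     match scores.entry(player.to_string()) {
//!         Entry::Vacant(entry) => {
//!             // The key was not found: insert a value into the entry.
//!             entry.insert(score);
//!             true
//!         }
//!         Entry::Occupied(mut entry) => {
//!             // The key was found: read the value and replace it if needed.
//!             if *entry.get() < score {
//!                 entry.insert(score);
//!                 true
//!             } else {
//!                 false
//!             }
//!         }
//!     }
//! }
//! ```
//!
//! Recording `10`, then `7`, then `12` for the same player returns `true`,
//! `false`, `true` and leaves `12` in the map.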
//!
//! ### Examples
//!
//! Here are the two primary ways in which `entry` is used. First, a simple
//! example where the logic performed on the values is trivial.
//!
//! #### Counting the number of times each character in a string occurs
//!
//! ```
//! use std::collections::btree_map::BTreeMap;
//!
//! let mut count = BTreeMap::new();
//! let message = "she sells sea shells by the sea shore";
//!
//! for c in message.chars() {
//!     *count.entry(c).or_insert(0) += 1;
//! }
//!
//! assert_eq!(count.get(&'s'), Some(&8));
//!
//! println!("Number of occurrences of each character");
//! for (char, count) in &count {
//!     println!("{}: {}", char, count);
//! }
//! ```
//!
//! When the logic to be performed on the value is more complex, we may simply
//! use the `entry` API to ensure that the value is initialized and perform the
//! logic afterwards.
//!
//! #### Tracking the inebriation of customers at a bar
//!
//! ```
//! use std::collections::btree_map::BTreeMap;
//!
//! // A client of the bar. They have a blood alcohol level.
//! struct Person { blood_alcohol: f32 }
//!
//! // All the orders made to the bar, by client ID.
//! let orders = vec![1, 2, 1, 2, 3, 4, 1, 2, 2, 3, 4, 1, 1, 1];
//!
//! // Our clients.
//! let mut blood_alcohol = BTreeMap::new();
//!
//! for id in orders {
//!     // If this is the first time we've seen this customer, initialize them
//!     // with no blood alcohol. Otherwise, just retrieve them.
//!     let person = blood_alcohol.entry(id).or_insert(Person { blood_alcohol: 0.0 });
//!
//!     // Reduce their blood alcohol level. It takes time to order and drink a beer!
//!     person.blood_alcohol *= 0.9;
//!
//!     // Check if they're sober enough to have another beer.
//!     if person.blood_alcohol > 0.3 {
//!         // Too drunk... for now.
//!         println!("Sorry {}, I have to cut you off", id);
//!     } else {
//!         // Have another!
//!         person.blood_alcohol += 0.1;
//!     }
//! }
//! ```
//!
//! # Insert and complex keys
//!
//! If we have a more complex key, calls to `insert` will
//! not update the key itself, only the value stored under it. For example:
//!
//! ```
//! use std::cmp::Ordering;
//! use std::collections::BTreeMap;
//! use std::hash::{Hash, Hasher};
//!
//! #[derive(Debug)]
//! struct Foo {
//!     a: u32,
//!     b: &'static str,
//! }
//!
//! // we will compare `Foo`s by their `a` value only.
//! impl PartialEq for Foo {
//!     fn eq(&self, other: &Self) -> bool { self.a == other.a }
//! }
//!
//! impl Eq for Foo {}
//!
//! // we will hash `Foo`s by their `a` value only.
//! impl Hash for Foo {
//!     fn hash<H: Hasher>(&self, h: &mut H) { self.a.hash(h); }
//! }
//!
//! impl PartialOrd for Foo {
//!     fn partial_cmp(&self, other: &Self) -> Option<Ordering> { self.a.partial_cmp(&other.a) }
//! }
//!
//! impl Ord for Foo {
//!     fn cmp(&self, other: &Self) -> Ordering { self.a.cmp(&other.a) }
//! }
//!
//! let mut map = BTreeMap::new();
//! map.insert(Foo { a: 1, b: "baz" }, 99);
//!
//! // We already have a Foo with an a of 1, so this will be updating the value.
//! map.insert(Foo { a: 1, b: "xyz" }, 100);
//!
//! // The value has been updated...
//! assert_eq!(map.values().next().unwrap(), &100);
//!
//! // ...but the key hasn't changed. b is still "baz", not "xyz".
//! assert_eq!(map.keys().next().unwrap().b, "baz");
//! ```
//!
//! [`IntoIterator`]: crate::iter::IntoIterator

#![stable(feature = "rust1", since = "1.0.0")]

#[stable(feature = "rust1", since = "1.0.0")]
// FIXME(#82080) The deprecation here is only theoretical, and does not actually produce a warning.
#[rustc_deprecated(reason = "moved to `std::ops::Bound`", since = "1.26.0")]
#[doc(hidden)]
pub use crate::ops::Bound;

#[stable(feature = "rust1", since = "1.0.0")]
pub use alloc_crate::collections::{binary_heap, btree_map, btree_set};
#[stable(feature = "rust1", since = "1.0.0")]
pub use alloc_crate::collections::{linked_list, vec_deque};
#[stable(feature = "rust1", since = "1.0.0")]
pub use alloc_crate::collections::{BTreeMap, BTreeSet, BinaryHeap};
#[stable(feature = "rust1", since = "1.0.0")]
pub use alloc_crate::collections::{LinkedList, VecDeque};

#[stable(feature = "rust1", since = "1.0.0")]
pub use self::hash_map::HashMap;
#[stable(feature = "rust1", since = "1.0.0")]
pub use self::hash_set::HashSet;

#[unstable(feature = "try_reserve", reason = "new API", issue = "48043")]
pub use alloc_crate::collections::TryReserveError;

mod hash;

#[stable(feature = "rust1", since = "1.0.0")]
pub mod hash_map {
    //! A hash map implemented with quadratic probing and SIMD lookup.
    #[stable(feature = "rust1", since = "1.0.0")]
    pub use super::hash::map::*;
}

#[stable(feature = "rust1", since = "1.0.0")]
pub mod hash_set {
    //! A hash set implemented as a `HashMap` where the value is `()`.
    #[stable(feature = "rust1", since = "1.0.0")]
    pub use super::hash::set::*;
}