# Handling Zero-Sized Types

It's time. We're going to fight the specter that is zero-sized types. Safe Rust
*never* needs to care about this, but Vec leans heavily on raw pointers and
raw allocations, which are exactly the two things that care about
zero-sized types. We need to be careful of two things:

* The raw allocator API has undefined behavior if you pass in 0 for an
  allocation size.
* Raw pointer offsets are no-ops for zero-sized types, which will break our
  C-style pointer iterator.
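The second point is easy to see in isolation: offsetting a raw pointer moves it
by `count * size_of::<T>()` bytes, which is zero bytes for a ZST. A minimal,
safe sketch (using `wrapping_add` so it runs standalone):

```rust
use std::mem;

fn main() {
    // A ZST: size 0, exactly one possible value.
    assert_eq!(mem::size_of::<()>(), 0);

    let x = ();
    let p: *const () = &x;

    // Offsetting by one "element" moves the pointer by
    // 1 * size_of::<()>() = 0 bytes: the address is unchanged.
    assert_eq!(p, p.wrapping_add(1));
    assert_eq!(p, p.wrapping_add(1000));

    // So `start` can never catch up to `end` by stepping forward,
    // which is exactly what breaks a C-style pointer iterator.
    println!("offsets on ZST pointers are no-ops");
}
```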

Thankfully we abstracted out pointer-iterators and allocation handling into
`RawValIter` and `RawVec` respectively. How mysteriously convenient.

## Allocating Zero-Sized Types

So if the allocator API doesn't support zero-sized allocations, what on earth
do we store as our allocation? `NonNull::dangling()` of course! Almost every operation
with a ZST is a no-op since ZSTs have exactly one value, and therefore no state needs
to be considered to store or load them. This actually extends to `ptr::read` and
`ptr::write`: they won't actually look at the pointer at all. As such, we never need
to change the pointer.
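This is easy to convince yourself of: a zero-sized type has no bytes to touch,
so a well-aligned, non-null (but dangling) pointer is a perfectly good address
to "store" and "load" through. A minimal sketch:

```rust
use std::mem;
use std::ptr::{self, NonNull};

// A zero-sized type: it has exactly one value, so no bytes are
// needed to represent it.
#[derive(Debug, PartialEq)]
struct Zst;

fn main() {
    assert_eq!(mem::size_of::<Zst>(), 0);

    // No allocation needed: a dangling (but well-aligned, non-null)
    // pointer is valid for zero-sized accesses.
    let p: NonNull<Zst> = NonNull::dangling();

    unsafe {
        // Neither of these touches memory; they compile to nothing.
        ptr::write(p.as_ptr(), Zst);
        let v = ptr::read(p.as_ptr());
        assert_eq!(v, Zst);
    }
}
```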

Note however that our previous reliance on running out of memory before overflow is
no longer valid with zero-sized types. We must explicitly guard against capacity
overflow for zero-sized types.

Due to our current architecture, all this means is writing 3 guards, one in each
method of `RawVec`.
<!-- ignore: simplified code -->
```rust,ignore
impl<T> RawVec<T> {
    fn new() -> Self {
        // This branch should be stripped at compile time.
        let cap = if mem::size_of::<T>() == 0 { usize::MAX } else { 0 };

        // `NonNull::dangling()` doubles as "unallocated" and "zero-sized allocation"
        RawVec {
            ptr: NonNull::dangling(),
            cap: cap,
        }
    }

    fn grow(&mut self) {
        // since we set the capacity to usize::MAX when T has size 0,
        // getting to here necessarily means the Vec is overfull.
        assert!(mem::size_of::<T>() != 0, "capacity overflow");

        let (new_cap, new_layout) = if self.cap == 0 {
            (1, Layout::array::<T>(1).unwrap())
        } else {
            // This can't overflow because we ensure self.cap <= isize::MAX.
            let new_cap = 2 * self.cap;

            // `Layout::array` checks that the number of bytes is <= usize::MAX,
            // but this is redundant since old_layout.size() <= isize::MAX,
            // so the `unwrap` should never fail.
            let new_layout = Layout::array::<T>(new_cap).unwrap();
            (new_cap, new_layout)
        };

        // Ensure that the new allocation doesn't exceed `isize::MAX` bytes.
        assert!(new_layout.size() <= isize::MAX as usize, "Allocation too large");

        let new_ptr = if self.cap == 0 {
            unsafe { alloc::alloc(new_layout) }
        } else {
            let old_layout = Layout::array::<T>(self.cap).unwrap();
            let old_ptr = self.ptr.as_ptr() as *mut u8;
            unsafe { alloc::realloc(old_ptr, old_layout, new_layout.size()) }
        };

        // If allocation fails, `new_ptr` will be null, in which case we abort.
        self.ptr = match NonNull::new(new_ptr as *mut T) {
            Some(p) => p,
            None => alloc::handle_alloc_error(new_layout),
        };
        self.cap = new_cap;
    }
}

impl<T> Drop for RawVec<T> {
    fn drop(&mut self) {
        let elem_size = mem::size_of::<T>();

        if self.cap != 0 && elem_size != 0 {
            unsafe {
                alloc::dealloc(
                    self.ptr.as_ptr() as *mut u8,
                    Layout::array::<T>(self.cap).unwrap(),
                );
            }
        }
    }
}
```
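As a sanity check, the standard library's `Vec` uses the same strategy: it
never allocates for ZSTs and reports a capacity of `usize::MAX`, so pushing
and popping just bump the length:

```rust
fn main() {
    // The standard Vec takes the same approach: ZSTs never allocate,
    // and the capacity starts (and stays) at usize::MAX.
    let mut v: Vec<()> = Vec::new();
    assert_eq!(v.capacity(), usize::MAX);

    // Pushing and popping ZSTs only adjusts the length.
    for _ in 0..3 {
        v.push(());
    }
    assert_eq!(v.len(), 3);
    assert_eq!(v.pop(), Some(()));
    assert_eq!(v.len(), 2);
}
```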

That's it. We support pushing and popping zero-sized types now. Our iterators
(that aren't provided by slice Deref) are still busted, though.

## Iterating Zero-Sized Types

Zero-sized offsets are no-ops. This means that our current design will always
initialize `start` and `end` as the same value, and our iterators will yield
nothing. The current solution to this is to cast the pointers to integers,
increment, and then cast them back:

<!-- ignore: simplified code -->
```rust,ignore
impl<T> RawValIter<T> {
    unsafe fn new(slice: &[T]) -> Self {
        RawValIter {
            start: slice.as_ptr(),
            end: if mem::size_of::<T>() == 0 {
                ((slice.as_ptr() as usize) + slice.len()) as *const _
            } else if slice.len() == 0 {
                slice.as_ptr()
            } else {
                slice.as_ptr().add(slice.len())
            },
        }
    }
}
```

Now we have a different bug. Instead of our iterators not running at all, our
iterators now run *forever*. We need to do the same trick in our iterator impls.
Also, our `size_hint` computation code will divide by 0 for ZSTs. Since we'll
basically be treating the two pointers as if they point to bytes, we'll just
map size 0 to divide by 1. Here's what `next` will be:
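The length arithmetic being patched here is just the byte distance between the
two pointers divided by the element size, with the ZST divisor forced to 1.
A standalone sketch of that formula (the `iter_len` helper is made up for
illustration; it mirrors the `size_hint` math below):

```rust
use std::mem;

// Hypothetical helper mirroring the size_hint arithmetic: the distance
// between the two "pointers" in bytes, divided by the element size,
// with size 0 mapped to a divisor of 1.
fn iter_len<T>(start: usize, end: usize) -> usize {
    let elem_size = mem::size_of::<T>();
    (end - start) / if elem_size == 0 { 1 } else { elem_size }
}

fn main() {
    // For a ZST, start/end are counters: 3 "increments" means 3 elements.
    assert_eq!(iter_len::<()>(100, 103), 3);

    // For a sized type, it's the usual byte distance over element size.
    assert_eq!(iter_len::<u32>(0, 12), 3);
}
```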

<!-- ignore: simplified code -->
```rust,ignore
fn next(&mut self) -> Option<T> {
    if self.start == self.end {
        None
    } else {
        unsafe {
            let result = ptr::read(self.start);
            self.start = if mem::size_of::<T>() == 0 {
                (self.start as usize + 1) as *const _
            } else {
                self.start.offset(1)
            };
            Some(result)
        }
    }
}
```

Do you see the "bug"? No one else did! The original author only noticed the
problem when linking to this page years later. This code is kind of dubious
because abusing the iterator pointers to be *counters* makes them unaligned!
Our *one job* when using ZSTs is to keep pointers aligned! *forehead slap*

Raw pointers don't need to be aligned at all times, so the basic trick of
using pointers as counters is *fine*, but they *should* definitely be aligned
when passed to `ptr::read`! This is *possibly* needless pedantry
because `ptr::read` is a no-op for a ZST, but let's be a *little* more
responsible and read from `NonNull::dangling` on the ZST path.

(Alternatively you could call `read_unaligned` on the ZST path. Either is fine,
because either way we're making up a value from nothing and it all compiles
to doing nothing.)

<!-- ignore: simplified code -->
```rust,ignore
impl<T> Iterator for RawValIter<T> {
    type Item = T;
    fn next(&mut self) -> Option<T> {
        if self.start == self.end {
            None
        } else {
            unsafe {
                if mem::size_of::<T>() == 0 {
                    self.start = (self.start as usize + 1) as *const _;
                    Some(ptr::read(NonNull::<T>::dangling().as_ptr()))
                } else {
                    let old_ptr = self.start;
                    self.start = self.start.offset(1);
                    Some(ptr::read(old_ptr))
                }
            }
        }
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        let elem_size = mem::size_of::<T>();
        let len = (self.end as usize - self.start as usize)
            / if elem_size == 0 { 1 } else { elem_size };
        (len, Some(len))
    }
}

impl<T> DoubleEndedIterator for RawValIter<T> {
    fn next_back(&mut self) -> Option<T> {
        if self.start == self.end {
            None
        } else {
            unsafe {
                if mem::size_of::<T>() == 0 {
                    self.end = (self.end as usize - 1) as *const _;
                    Some(ptr::read(NonNull::<T>::dangling().as_ptr()))
                } else {
                    self.end = self.end.offset(-1);
                    Some(ptr::read(self.end))
                }
            }
        }
    }
}
```

And that's it. Iteration works!
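As a quick end-to-end check of the behavior we just implemented, the standard
`Vec` (which handles ZSTs the same way) iterates zero-sized elements correctly
in both directions, yielding exactly `len()` values and then stopping:

```rust
fn main() {
    let v: Vec<()> = vec![(), (), ()];
    let mut iter = v.into_iter();

    // The ZST-aware size_hint reports the true length.
    assert_eq!(iter.size_hint(), (3, Some(3)));

    // Forward and backward iteration both terminate.
    assert_eq!(iter.next(), Some(()));
    assert_eq!(iter.next_back(), Some(()));
    assert_eq!(iter.next(), Some(()));
    assert_eq!(iter.next(), None);
}
```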