# Handling Zero-Sized Types
It's time. We're going to fight the specter that is zero-sized types. Safe Rust
*never* needs to care about this, but Vec is very intensive on raw pointers and
raw allocations, which are exactly the two things that care about
zero-sized types. We need to be careful of two things:
* The raw allocator API has undefined behavior if you pass in 0 for an
  allocation size.
* Raw pointer offsets are no-ops for zero-sized types, which will break our
  C-style pointer iterator.
Thankfully we abstracted out pointer iteration and allocation handling into
`RawValIter` and `RawVec` respectively. How mysteriously convenient.
## Allocating Zero-Sized Types
So if the allocator API doesn't support zero-sized allocations, what on earth
do we store as our allocation? `NonNull::dangling()` of course! Almost every operation
with a ZST is a no-op since ZSTs have exactly one value, and therefore no state needs
to be considered to store or load them. This actually extends to `ptr::read` and
`ptr::write`: they won't actually look at the pointer at all. As such we never need
to change the pointer.
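As a quick sanity check of those claims (this snippet is not part of our Vec; `Unit` is just an illustrative ZST made up for the demo):

```rust
use std::mem;
use std::ptr::{self, NonNull};

// An illustrative user-defined ZST (any fieldless struct works).
struct Unit;

fn main() {
    // ZSTs occupy no space...
    assert_eq!(mem::size_of::<Unit>(), 0);

    // ...so reading one through a well-aligned dangling pointer never
    // touches memory: `ptr::read` ignores the address entirely.
    let p = NonNull::<Unit>::dangling().as_ptr();
    let _value: Unit = unsafe { ptr::read(p) };

    // Writing is equally a no-op.
    unsafe { ptr::write(p, Unit) };
    println!("reads and writes of ZSTs are no-ops");
}
```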
Note however that our previous reliance on running out of memory before overflow is
no longer valid with zero-sized types. We must explicitly guard against capacity
overflow for zero-sized types.
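The standard library's Vec takes the same approach, so we can observe the sentinel directly: a Vec of ZSTs reports `usize::MAX` capacity and never allocates.

```rust
fn main() {
    // std's Vec handles ZSTs the same way we're about to: a Vec of
    // zero-sized elements has capacity usize::MAX from the start, and
    // no memory is ever allocated for the elements.
    let v: Vec<()> = Vec::new();
    assert_eq!(v.capacity(), usize::MAX);

    // Pushing just bumps `len`; "running out of memory" can never
    // happen here, so only an explicit capacity-overflow guard
    // protects against `len` wrapping around.
    let mut v = Vec::new();
    for _ in 0..1_000 {
        v.push(());
    }
    assert_eq!(v.len(), 1_000);
    assert_eq!(v.capacity(), usize::MAX);
}
```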
Due to our current architecture, all this means is writing 3 guards, one in each
method of `RawVec`:
<!-- ignore: simplified code -->
```rust,ignore
impl<T> RawVec<T> {
    fn new() -> Self {
        // This branch should be stripped at compile time.
        let cap = if mem::size_of::<T>() == 0 { usize::MAX } else { 0 };

        // `NonNull::dangling()` doubles as "unallocated" and "zero-sized allocation"
        RawVec {
            ptr: NonNull::dangling(),
            cap: cap,
        }
    }

    fn grow(&mut self) {
        // Since we set the capacity to usize::MAX when T has size 0,
        // getting to here necessarily means the Vec is overfull.
        assert!(mem::size_of::<T>() != 0, "capacity overflow");

        let (new_cap, new_layout) = if self.cap == 0 {
            (1, Layout::array::<T>(1).unwrap())
        } else {
            // This can't overflow because we ensure self.cap <= isize::MAX.
            let new_cap = 2 * self.cap;

            // `Layout::array` checks that the number of bytes is <= usize::MAX,
            // but this is redundant since old_layout.size() <= isize::MAX,
            // so the `unwrap` should never fail.
            let new_layout = Layout::array::<T>(new_cap).unwrap();
            (new_cap, new_layout)
        };

        // Ensure that the new allocation doesn't exceed `isize::MAX` bytes.
        assert!(new_layout.size() <= isize::MAX as usize, "Allocation too large");

        let new_ptr = if self.cap == 0 {
            unsafe { alloc::alloc(new_layout) }
        } else {
            let old_layout = Layout::array::<T>(self.cap).unwrap();
            let old_ptr = self.ptr.as_ptr() as *mut u8;
            unsafe { alloc::realloc(old_ptr, old_layout, new_layout.size()) }
        };

        // If allocation fails, `new_ptr` will be null, in which case we abort.
        self.ptr = match NonNull::new(new_ptr as *mut T) {
            Some(p) => p,
            None => alloc::handle_alloc_error(new_layout),
        };
        self.cap = new_cap;
    }
}

impl<T> Drop for RawVec<T> {
    fn drop(&mut self) {
        let elem_size = mem::size_of::<T>();

        if self.cap != 0 && elem_size != 0 {
            unsafe {
                alloc::dealloc(
                    self.ptr.as_ptr() as *mut u8,
                    Layout::array::<T>(self.cap).unwrap(),
                );
            }
        }
    }
}
```
That's it. We support pushing and popping zero-sized types now. Our iterators
(that aren't provided by slice Deref) are still busted, though.
## Iterating Zero-Sized Types
Zero-sized offsets are no-ops. This means that our current design will always
initialize `start` and `end` as the same value, and our iterators will yield
nothing. The current solution to this is to cast the pointers to integers,
increment, and then cast them back:
<!-- ignore: simplified code -->
```rust,ignore
impl<T> RawValIter<T> {
    unsafe fn new(slice: &[T]) -> Self {
        RawValIter {
            start: slice.as_ptr(),
            end: if mem::size_of::<T>() == 0 {
                ((slice.as_ptr() as usize) + slice.len()) as *const _
            } else if slice.len() == 0 {
                slice.as_ptr()
            } else {
                slice.as_ptr().add(slice.len())
            },
        }
    }
}
```
Now we have a different bug. Instead of our iterators not running at all, our
iterators now run *forever*. We need to do the same trick in our iterator impls.
Also, our `size_hint` computation code will divide by 0 for ZSTs. Since we'll
basically be treating the two pointers as if they point to bytes, we'll just
map size 0 to divide by 1. Here's what `next` will be:
<!-- ignore: simplified code -->
```rust,ignore
fn next(&mut self) -> Option<T> {
    if self.start == self.end {
        None
    } else {
        unsafe {
            let result = ptr::read(self.start);
            self.start = if mem::size_of::<T>() == 0 {
                (self.start as usize + 1) as *const _
            } else {
                self.start.offset(1)
            };
            Some(result)
        }
    }
}
```
Do you see the "bug"? No one else did! The original author only noticed the
problem when linking to this page years later. This code is kind of dubious
because abusing the iterator pointers to be *counters* makes them unaligned!
Our *one job* when using ZSTs is to keep pointers aligned! *forehead slap*
Raw pointers don't need to be aligned at all times, so the basic trick of
using pointers as counters is *fine*, but they *should* definitely be aligned
when passed to `ptr::read`! This is *possibly* needless pedantry
because `ptr::read` is a no-op for a ZST, but let's be a *little* more
responsible and read from `NonNull::dangling` on the ZST path.
(Alternatively you could call `read_unaligned` on the ZST path. Either is fine,
because either way we're making up a value from nothing and it all compiles
to doing nothing.)
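Why does `NonNull::dangling` settle the alignment concern? It is guaranteed to be non-null and well-aligned for `T`, which is everything `ptr::read` asks of a ZST pointer. A small standalone check (not part of the Vec code):

```rust
use std::mem;
use std::ptr::NonNull;

fn main() {
    // `NonNull::dangling()` is documented to be well-aligned for T.
    let addr = NonNull::<u64>::dangling().as_ptr() as usize;
    assert_eq!(addr % mem::align_of::<u64>(), 0);

    // By contrast, a pointer abused as a byte-wise counter quickly
    // stops being aligned: "address" 3 is not a valid u64 address.
    assert_ne!(3 % mem::align_of::<u64>(), 0);
}
```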
<!-- ignore: simplified code -->
```rust,ignore
impl<T> Iterator for RawValIter<T> {
    type Item = T;
    fn next(&mut self) -> Option<T> {
        if self.start == self.end {
            None
        } else {
            unsafe {
                if mem::size_of::<T>() == 0 {
                    self.start = (self.start as usize + 1) as *const _;
                    Some(ptr::read(NonNull::<T>::dangling().as_ptr()))
                } else {
                    let old_ptr = self.start;
                    self.start = self.start.offset(1);
                    Some(ptr::read(old_ptr))
                }
            }
        }
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        let elem_size = mem::size_of::<T>();
        let len = (self.end as usize - self.start as usize)
            / if elem_size == 0 { 1 } else { elem_size };
        (len, Some(len))
    }
}

impl<T> DoubleEndedIterator for RawValIter<T> {
    fn next_back(&mut self) -> Option<T> {
        if self.start == self.end {
            None
        } else {
            unsafe {
                if mem::size_of::<T>() == 0 {
                    self.end = (self.end as usize - 1) as *const _;
                    Some(ptr::read(NonNull::<T>::dangling().as_ptr()))
                } else {
                    self.end = self.end.offset(-1);
                    Some(ptr::read(self.end))
                }
            }
        }
    }
}
```
And that's it. Iteration works!
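As a final smoke test, here is the behavior we were after, demonstrated with the standard library's Vec (which handles ZSTs with the same tricks): iteration yields exactly `len` values and then stops, from either end.

```rust
fn main() {
    // Forward iteration over ZSTs terminates after exactly `len` items.
    let v = vec![(); 5];
    assert_eq!(v.into_iter().count(), 5);

    // Double-ended iteration works too: front and back cursors meet
    // in the middle instead of running forever.
    let v = vec![(); 3];
    let mut it = v.into_iter();
    assert_eq!(it.next(), Some(()));
    assert_eq!(it.next_back(), Some(()));
    assert_eq!(it.count(), 1);
}
```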