]>
Commit | Line | Data |
---|---|---|
8bb4bdeb | 1 | # Allocating Memory |
c1a9b12d SL |
2 | |
3 | Using Unique throws a wrench in an important feature of Vec (and indeed all of | |
4 | the std collections): an empty Vec doesn't actually allocate at all. So if we | |
5 | can't allocate, but also can't put a null pointer in `ptr`, what do we do in | |
6 | `Vec::new`? Well, we just put some other garbage in there! | |
7 | ||
8 | This is perfectly fine because we already have `cap == 0` as our sentinel for no | |
9 | allocation. We don't even need to handle it specially in almost any code because | |
7cac9316 XL |
10 | we usually need to check if `cap > len` or `len > 0` anyway. The recommended |
11 | Rust value to put here is `mem::align_of::<T>()`. Unique provides a convenience | |
12 | for this: `Unique::empty()`. There are quite a few places where we'll | |
13 | want to use `empty` because there's no real allocation to talk about but | |
c1a9b12d SL |
14 | `null` would make the compiler do bad things. |
15 | ||
c1a9b12d SL |
16 | So: |
17 | ||
18 | ```rust,ignore | |
e9174d1e | 19 | #![feature(alloc, heap_api)] |
c1a9b12d | 20 | |
c1a9b12d SL |
21 | use std::mem; |
22 | ||
23 | impl<T> Vec<T> { | |
24 | fn new() -> Self { | |
25 | assert!(mem::size_of::<T>() != 0, "We're not ready to handle ZSTs"); | |
7cac9316 | 26 | Vec { ptr: Unique::empty(), len: 0, cap: 0 } |
c1a9b12d SL |
27 | } |
28 | } | |
29 | ``` | |
30 | ||
31 | I slipped in that assert there because zero-sized types will require some | |
32 | special handling throughout our code, and I want to defer the issue for now. | |
33 | Without this assert, some of our early drafts will do some Very Bad Things. | |
34 | ||
35 | Next we need to figure out what to actually do when we *do* want space. For | |
36 | that, we'll need to use the rest of the heap APIs. These basically allow us to | |
37 | talk directly to Rust's allocator (jemalloc by default). | |
38 | ||
39 | We'll also need a way to handle out-of-memory (OOM) conditions. The standard | |
83c7162d XL |
40 | library calls `std::alloc::oom()`, which in turn calls the the `oom` langitem. |
41 | By default this just aborts the program by executing an illegal cpu instruction. | |
42 | The reason we abort and don't panic is because unwinding can cause allocations | |
43 | to happen, and that seems like a bad thing to do when your allocator just came | |
44 | back with "hey I don't have any more memory". | |
c1a9b12d SL |
45 | |
46 | Of course, this is a bit silly since most platforms don't actually run out of | |
47 | memory in a conventional way. Your operating system will probably kill the | |
48 | application by another means if you legitimately start using up all the memory. | |
49 | The most likely way we'll trigger OOM is by just asking for ludicrous quantities | |
50 | of memory at once (e.g. half the theoretical address space). As such it's | |
51 | *probably* fine to panic and nothing bad will happen. Still, we're trying to be | |
52 | like the standard library as much as possible, so we'll just kill the whole | |
53 | program. | |
54 | ||
c1a9b12d SL |
55 | Okay, now we can write growing. Roughly, we want to have this logic: |
56 | ||
57 | ```text | |
58 | if cap == 0: | |
59 | allocate() | |
60 | cap = 1 | |
61 | else: | |
62 | reallocate() | |
63 | cap *= 2 | |
64 | ``` | |
65 | ||
66 | But Rust's only supported allocator API is so low level that we'll need to do a | |
67 | fair bit of extra work. We also need to guard against some special | |
68 | conditions that can occur with really large allocations or empty allocations. | |
69 | ||
70 | In particular, `ptr::offset` will cause us a lot of trouble, because it has | |
71 | the semantics of LLVM's GEP inbounds instruction. If you're fortunate enough to | |
72 | not have dealt with this instruction, here's the basic story with GEP: alias | |
73 | analysis, alias analysis, alias analysis. It's super important to an optimizing | |
74 | compiler to be able to reason about data dependencies and aliasing. | |
75 | ||
76 | As a simple example, consider the following fragment of code: | |
77 | ||
78 | ```rust | |
79 | # let x = &mut 0; | |
80 | # let y = &mut 0; | |
81 | *x *= 7; | |
82 | *y *= 3; | |
83 | ``` | |
84 | ||
85 | If the compiler can prove that `x` and `y` point to different locations in | |
86 | memory, the two operations can in theory be executed in parallel (by e.g. | |
87 | loading them into different registers and working on them independently). | |
88 | However the compiler can't do this in general because if x and y point to | |
89 | the same location in memory, the operations need to be done to the same value, | |
90 | and they can't just be merged afterwards. | |
91 | ||
92 | When you use GEP inbounds, you are specifically telling LLVM that the offsets | |
93 | you're about to do are within the bounds of a single "allocated" entity. The | |
94 | ultimate payoff being that LLVM can assume that if two pointers are known to | |
95 | point to two disjoint objects, all the offsets of those pointers are *also* | |
96 | known to not alias (because you won't just end up in some random place in | |
97 | memory). LLVM is heavily optimized to work with GEP offsets, and inbounds | |
98 | offsets are the best of all, so it's important that we use them as much as | |
99 | possible. | |
100 | ||
101 | So that's what GEP's about, how can it cause us trouble? | |
102 | ||
103 | The first problem is that we index into arrays with unsigned integers, but | |
104 | GEP (and as a consequence `ptr::offset`) takes a signed integer. This means | |
105 | that half of the seemingly valid indices into an array will overflow GEP and | |
106 | actually go in the wrong direction! As such we must limit all allocations to | |
107 | `isize::MAX` elements. This actually means we only need to worry about | |
108 | byte-sized objects, because e.g. `> isize::MAX` `u16`s will truly exhaust all of | |
109 | the system's memory. However in order to avoid subtle corner cases where someone | |
110 | reinterprets some array of `< isize::MAX` objects as bytes, std limits all | |
111 | allocations to `isize::MAX` bytes. | |
112 | ||
113 | On all 64-bit targets that Rust currently supports we're artificially limited | |
114 | to significantly less than all 64 bits of the address space (modern x64 | |
115 | platforms only expose 48-bit addressing), so we can rely on just running out of | |
116 | memory first. However on 32-bit targets, particularly those with extensions to | |
117 | use more of the address space (PAE x86 or x32), it's theoretically possible to | |
118 | successfully allocate more than `isize::MAX` bytes of memory. | |
119 | ||
120 | However since this is a tutorial, we're not going to be particularly optimal | |
121 | here, and just unconditionally check, rather than use clever platform-specific | |
122 | `cfg`s. | |
123 | ||
124 | The other corner-case we need to worry about is empty allocations. There will | |
125 | be two kinds of empty allocations we need to worry about: `cap = 0` for all T, | |
126 | and `cap > 0` for zero-sized types. | |
127 | ||
128 | These cases are tricky because they come | |
129 | down to what LLVM means by "allocated". LLVM's notion of an | |
130 | allocation is significantly more abstract than how we usually use it. Because | |
131 | LLVM needs to work with different languages' semantics and custom allocators, | |
132 | it can't really intimately understand allocation. Instead, the main idea behind | |
133 | allocation is "doesn't overlap with other stuff". That is, heap allocations, | |
134 | stack allocations, and globals don't randomly overlap. Yep, it's about alias | |
a7813a04 | 135 | analysis. As such, Rust can technically play a bit fast and loose with the notion of |
c1a9b12d SL |
136 | an allocation as long as it's *consistent*. |
137 | ||
138 | Getting back to the empty allocation case, there are a couple of places where | |
139 | we want to offset by 0 as a consequence of generic code. The question is then: | |
140 | is it consistent to do so? For zero-sized types, we have concluded that it is | |
141 | indeed consistent to do a GEP inbounds offset by an arbitrary number of | |
142 | elements. This is a runtime no-op because every element takes up no space, | |
143 | and it's fine to pretend that there's infinite zero-sized types allocated | |
144 | at `0x01`. No allocator will ever allocate that address, because they won't | |
145 | allocate `0x00` and they generally allocate to some minimal alignment higher | |
146 | than a byte. Also generally the whole first page of memory is | |
147 | protected from being allocated anyway (a whole 4k, on many platforms). | |
148 | ||
149 | However what about for positive-sized types? That one's a bit trickier. In | |
150 | principle, you can argue that offsetting by 0 gives LLVM no information: either | |
151 | there's an element before the address or after it, but it can't know which. | |
152 | However we've chosen to conservatively assume that it may do bad things. As | |
153 | such we will guard against this case explicitly. | |
154 | ||
155 | *Phew* | |
156 | ||
157 | Ok with all the nonsense out of the way, let's actually allocate some memory: | |
158 | ||
159 | ```rust,ignore | |
83c7162d XL |
160 | use std::alloc::oom; |
161 | ||
c1a9b12d SL |
162 | fn grow(&mut self) { |
163 | // this is all pretty delicate, so let's say it's all unsafe | |
164 | unsafe { | |
165 | // current API requires us to specify size and alignment manually. | |
166 | let align = mem::align_of::<T>(); | |
167 | let elem_size = mem::size_of::<T>(); | |
168 | ||
169 | let (new_cap, ptr) = if self.cap == 0 { | |
170 | let ptr = heap::allocate(elem_size, align); | |
171 | (1, ptr) | |
172 | } else { | |
173 | // as an invariant, we can assume that `self.cap < isize::MAX`, | |
174 | // so this doesn't need to be checked. | |
175 | let new_cap = self.cap * 2; | |
176 | // Similarly this can't overflow due to previously allocating this | |
177 | let old_num_bytes = self.cap * elem_size; | |
178 | ||
179 | // check that the new allocation doesn't exceed `isize::MAX` at all | |
180 | // regardless of the actual size of the capacity. This combines the | |
181 | // `new_cap <= isize::MAX` and `new_num_bytes <= usize::MAX` checks | |
182 | // we need to make. We lose the ability to allocate e.g. 2/3rds of | |
183 | // the address space with a single Vec of i16's on 32-bit though. | |
184 | // Alas, poor Yorick -- I knew him, Horatio. | |
185 | assert!(old_num_bytes <= (::std::isize::MAX as usize) / 2, | |
186 | "capacity overflow"); | |
187 | ||
188 | let new_num_bytes = old_num_bytes * 2; | |
7cac9316 | 189 | let ptr = heap::reallocate(self.ptr.as_ptr() as *mut _, |
c1a9b12d SL |
190 | old_num_bytes, |
191 | new_num_bytes, | |
192 | align); | |
193 | (new_cap, ptr) | |
194 | }; | |
195 | ||
196 | // If allocate or reallocate fail, we'll get `null` back | |
197 | if ptr.is_null() { oom(); } | |
198 | ||
199 | self.ptr = Unique::new(ptr as *mut _); | |
200 | self.cap = new_cap; | |
201 | } | |
202 | } | |
203 | ``` | |
204 | ||
205 | Nothing particularly tricky here. Just computing sizes and alignments and doing | |
206 | some careful multiplication checks. | |
207 |