New upstream version 1.62.1+dfsg1
# Leaking

Ownership-based resource management is intended to simplify composition. You
acquire resources when you create the object, and you release the resources when
it gets destroyed. Since destruction is handled for you, it means you can't
forget to release the resources, and it happens as soon as possible! Surely this
is perfect and all of our problems are solved.

Everything is terrible and we have new and exotic problems to try to solve.

Many people like to believe that Rust eliminates resource leaks. In practice,
this is basically true. You would be surprised to see a Safe Rust program
leak resources in an uncontrolled way.

However, from a theoretical perspective this is absolutely not the case, no
matter how you look at it. In the strictest sense, "leaking" is so abstract as
to be unpreventable. It's quite trivial to initialize a collection at the start
of a program, fill it with tons of objects with destructors, and then enter an
infinite event loop that never refers to it. The collection will sit around
uselessly, holding on to its precious resources until the program terminates (at
which point all those resources would have been reclaimed by the OS anyway).

We may consider a more restricted form of leak: failing to drop a value that is
unreachable. Rust also doesn't prevent this. In fact Rust *has a function for
doing this*: `mem::forget`. This function consumes the value it is passed *and
then doesn't run its destructor*.

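As a small illustration of that behavior, the sketch below drops one value and forgets another, then counts how many destructors actually ran. The `Noisy` type and the `count_drops` function are hypothetical names made up for this example; only `mem::forget` and `drop` come from the standard library.

```rust
use std::cell::Cell;
use std::mem;

/// Hypothetical helper type: bumps a shared counter when its destructor runs.
struct Noisy<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Drops one `Noisy` and forgets another, returning how many destructors ran.
fn count_drops() -> u32 {
    let drops = Cell::new(0);
    drop(Noisy { drops: &drops }); // destructor runs
    mem::forget(Noisy { drops: &drops }); // value is consumed, destructor is skipped
    drops.get()
}

fn main() {
    println!("destructors run: {}", count_drops());
}
```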
In the past `mem::forget` was marked as unsafe as a sort of lint against using
it, since failing to call a destructor is generally not a well-behaved thing to
do (though useful for some special unsafe code). However, this was generally
determined to be an untenable stance to take: there are many ways to fail to
call a destructor in safe code. The most famous example is creating a cycle of
reference-counted pointers using interior mutability.

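The cycle case can be sketched in entirely safe code. The `Node` type and the `cycle_node_was_dropped` function below are hypothetical names for illustration: a node stores an `Rc` to itself through a `RefCell`, so its strong count never reaches zero and its destructor never runs.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

/// Hypothetical node that can point at another node (or at itself).
struct Node {
    next: RefCell<Option<Rc<Node>>>,
    dropped: Rc<Cell<bool>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        // Record that the destructor ran.
        self.dropped.set(true);
    }
}

/// Builds a one-node cycle, lets the last handle go out of scope, and
/// reports whether the node's destructor ran.
fn cycle_node_was_dropped() -> bool {
    let dropped = Rc::new(Cell::new(false));
    {
        let a = Rc::new(Node {
            next: RefCell::new(None),
            dropped: Rc::clone(&dropped),
        });
        // Interior mutability lets us close the cycle after construction.
        *a.next.borrow_mut() = Some(Rc::clone(&a));
        // `a` goes out of scope here, but the node still holds a strong
        // reference to itself, so the count only falls from 2 to 1.
    }
    dropped.get()
}

fn main() {
    println!("destructor ran: {}", cycle_node_was_dropped());
}
```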
It is reasonable for safe code to assume that destructor leaks do not happen, as
any program that leaks destructors is probably wrong. However, *unsafe* code
cannot rely on destructors to be run in order to be safe. For most types this
doesn't matter: if you leak the destructor then the type is by definition
inaccessible, so it doesn't matter, right? For instance, if you leak a `Box<u8>`
then you waste some memory but that's hardly going to violate memory-safety.

Where we must be careful with destructor leaks, however, is with *proxy* types.
These are types which manage access to a distinct object, but don't actually own
it. Proxy objects are quite rare. Proxy objects you'll need to care about are
even rarer. However, we'll focus on three interesting examples in the standard
library:

* `vec::Drain`
* `Rc`
* `thread::scoped::JoinGuard`

## Drain

`drain` is a collections API that moves data out of the container without
consuming the container. This enables us to reuse the allocation of a `Vec`
after claiming ownership over all of its contents. It produces an iterator
(`Drain`) that returns the contents of the `Vec` by-value.

Now, consider `Drain` in the middle of iteration: some values have been moved out,
and others haven't. This means that part of the `Vec` is now full of logically
uninitialized data! We could backshift all the elements in the `Vec` every time we
remove a value, but this would have pretty catastrophic performance
consequences.

Instead, we would like `Drain` to fix the `Vec`'s backing storage when it is
dropped. It should run itself to completion, backshift any elements that weren't
removed (`drain` supports subranges), and then fix `Vec`'s `len`. It's even
unwinding-safe! Easy!

Now consider the following:

<!-- ignore: simplified code -->
```rust,ignore
let mut vec = vec![Box::new(0); 4];

{
    // start draining, vec can no longer be accessed
    let mut drainer = vec.drain(..);

    // pull out two elements and immediately drop them
    drainer.next();
    drainer.next();

    // get rid of drainer, but don't call its destructor
    mem::forget(drainer);
}

// Oops, vec[0] was dropped, we're reading a pointer into free'd memory!
println!("{}", vec[0]);
```

This is pretty clearly Not Good. Unfortunately, we're kind of stuck between a
rock and a hard place: maintaining consistent state at every step has an
enormous cost (and would negate any benefits of the API). Failing to maintain
consistent state gives us Undefined Behavior in safe code (making the API
unsound).

So what can we do? Well, we can pick a trivially consistent state: set the `Vec`'s
`len` to be 0 when we start the iteration, and fix it up if necessary in the
destructor. That way, if everything executes like normal we get the desired
behavior with minimal overhead. But if someone has the *audacity* to
`mem::forget` us in the middle of the iteration, all that does is *leak even more*
(and possibly leave the `Vec` in an unexpected but otherwise consistent state).
Since we've accepted that `mem::forget` is safe, this is definitely safe. We call
leaks causing more leaks a *leak amplification*.

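To see leak amplification with the real standard library, the sketch below repeats the experiment and then inspects the vector instead of indexing into it. The function name `len_after_forgotten_drain` is made up for this example. The standard library documents that a leaked `Drain` may leave the vector having lost and leaked elements arbitrarily, so the exact length afterwards is an implementation detail; the only thing assumed here is that the vector remains in a valid state.

```rust
use std::mem;

/// Forgets a partially consumed `Drain` and reports the vector's length
/// afterwards. The exact value is unspecified; it is only guaranteed that
/// the vector is still safe to use.
fn len_after_forgotten_drain() -> usize {
    let mut vec = vec![Box::new(0); 4];
    {
        let mut drainer = vec.drain(..);

        // Pull out two elements and immediately drop them.
        drainer.next();
        drainer.next();

        // Skip the destructor that would normally repair the Vec.
        mem::forget(drainer);
    }
    // No use-after-free: the elements that were never yielded are leaked,
    // not exposed through the Vec.
    vec.len()
}

fn main() {
    println!("len after leak: {}", len_after_forgotten_drain());
}
```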
## Rc

`Rc` is an interesting case because at first glance it doesn't appear to be a
proxy value at all. After all, it manages the data it points to, and dropping
all the `Rc`s for a value will drop that value. Leaking an `Rc` doesn't seem like it
would be particularly dangerous. It will leave the refcount permanently
incremented and prevent the data from being freed or dropped, but that seems
just like `Box`, right?

Nope.

Let's consider a simplified implementation of `Rc`:

<!-- ignore: simplified code -->
```rust,ignore
struct Rc<T> {
    ptr: *mut RcBox<T>,
}

struct RcBox<T> {
    data: T,
    ref_count: usize,
}

impl<T> Rc<T> {
    fn new(data: T) -> Self {
        unsafe {
            // Wouldn't it be nice if heap::allocate worked like this?
            let ptr = heap::allocate::<RcBox<T>>();
            ptr::write(ptr, RcBox {
                data: data,
                ref_count: 1,
            });
            Rc { ptr: ptr }
        }
    }

    fn clone(&self) -> Self {
        unsafe {
            (*self.ptr).ref_count += 1;
        }
        Rc { ptr: self.ptr }
    }
}

impl<T> Drop for Rc<T> {
    fn drop(&mut self) {
        unsafe {
            (*self.ptr).ref_count -= 1;
            if (*self.ptr).ref_count == 0 {
                // drop the data and then free it
                ptr::read(self.ptr);
                heap::deallocate(self.ptr);
            }
        }
    }
}
```

This code contains an implicit and subtle assumption: `ref_count` can fit in a
`usize`, because there can't be more than `usize::MAX` `Rc`s in memory. However,
this itself assumes that the `ref_count` accurately reflects the number of `Rc`s
in memory, which we know is false with `mem::forget`. Using `mem::forget` we can
overflow the `ref_count`, and then get it down to 0 with outstanding `Rc`s. Then
we can happily use-after-free the inner data. Bad Bad Not Good.

This can be solved by just checking the `ref_count` for overflow and doing
*something*. The standard library's stance is to just abort, because your
program has become horribly degenerate. Also *oh my gosh* it's such a
ridiculous corner case.

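A minimal sketch of such a check, assuming a hypothetical `RefCount` wrapper rather than the real `Rc` internals (which differ in detail): the increment goes through `checked_add`, and the process aborts instead of letting the count wrap around.

```rust
use std::cell::Cell;
use std::process;

/// Hypothetical stand-in for the `ref_count` field of the simplified `RcBox`.
struct RefCount(Cell<usize>);

impl RefCount {
    fn new() -> Self {
        RefCount(Cell::new(1))
    }

    /// Increments the count, aborting the process rather than wrapping
    /// around to a value that no longer reflects the outstanding handles.
    fn increment(&self) {
        match self.0.get().checked_add(1) {
            Some(n) => self.0.set(n),
            None => process::abort(),
        }
    }

    fn get(&self) -> usize {
        self.0.get()
    }
}

fn main() {
    let count = RefCount::new();
    count.increment();
    println!("ref_count = {}", count.get());
}
```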
## thread::scoped::JoinGuard

> Note: This API has already been removed from std. For more information,
> you may refer to [issue #24292](https://github.com/rust-lang/rust/issues/24292).
>
> This section remains here because we think this example is still
> important, regardless of whether it is part of std or not.

The `thread::scoped` API was intended to allow threads to be spawned that reference
data on their parent's stack without any synchronization over that data by
ensuring the parent joins the thread before any of the shared data goes out
of scope.

<!-- ignore: simplified code -->
```rust,ignore
pub fn scoped<'a, F>(f: F) -> JoinGuard<'a>
    where F: FnOnce() + Send + 'a
```

Here `f` is some closure for the other thread to execute. Saying that
`F: Send + 'a` is saying that it closes over data that lives for `'a`, and it
either owns that data or the data was `Sync` (implying `&data` is `Send`).

Because `JoinGuard` has a lifetime, it keeps all the data it closes over
borrowed in the parent thread. This means the `JoinGuard` can't outlive
the data that the other thread is working on. When the `JoinGuard` *does* get
dropped it blocks the parent thread, ensuring the child terminates before any
of the closed-over data goes out of scope in the parent.

Usage looked like:

<!-- ignore: simplified code -->
```rust,ignore
let mut data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
{
    let mut guards = vec![];
    for x in &mut data {
        // Move the mutable reference into the closure, and execute
        // it on a different thread. The closure has a lifetime bound
        // by the lifetime of the mutable reference `x` we store in it.
        // The guard that is returned is in turn assigned the lifetime
        // of the closure, so it also mutably borrows `data` as `x` did.
        // This means we cannot access `data` until the guard goes away.
        let guard = thread::scoped(move || {
            *x *= 2;
        });
        // store the thread's guard for later
        guards.push(guard);
    }
    // All guards are dropped here, forcing the threads to join
    // (this thread blocks here until the others terminate).
    // Once the threads join, the borrow expires and the data becomes
    // accessible again in this thread.
}
// data is definitely mutated here.
```

In principle, this totally works! Rust's ownership system perfectly ensures it!
...except it relies on a destructor being called to be safe.

<!-- ignore: simplified code -->
```rust,ignore
let mut data = Box::new(0);
{
    let guard = thread::scoped(|| {
        // This is at best a data race. At worst, it's also a use-after-free.
        *data += 1;
    });
    // The guard is forgotten here, expiring the loan without blocking this
    // thread.
    mem::forget(guard);
}
// So the Box is dropped here while the scoped thread may or may not be trying
// to access it.
```

Dang. Here the destructor running was pretty fundamental to the API, and it had
to be scrapped in favor of a completely different design.
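
For comparison, here is a sketch of one closure-based design that does not depend on a destructor: `std::thread::scope`, available in the standard library since Rust 1.63. The text above does not say which replacement design it has in mind, so treat this as one example rather than the design being referred to. The `double_all` function is a hypothetical name. The join happens inside `scope` itself before it returns, so forgetting a thread handle cannot skip it.

```rust
use std::thread;

/// Doubles every element, using one scoped thread per element.
fn double_all(data: &mut [i32]) {
    thread::scope(|s| {
        for x in data.iter_mut() {
            // Each thread borrows one element mutably.
            s.spawn(move || {
                *x *= 2;
            });
        }
        // `scope` joins every spawned thread before it returns, whether or
        // not the handles returned by `spawn` were kept, dropped, or forgotten.
    });
}

fn main() {
    let mut data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    double_all(&mut data);
    println!("{:?}", data);
}
```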