[rustc.git] / src / doc / nomicon / src / vec-alloc.md

# Allocating Memory

Using Unique throws a wrench in an important feature of Vec (and indeed all of
the std collections): an empty Vec doesn't actually allocate at all. So if we
can't allocate, but also can't put a null pointer in `ptr`, what do we do in
`Vec::new`? Well, we just put some other garbage in there!

This is perfectly fine because we already have `cap == 0` as our sentinel for no
allocation. We don't even need to handle it specially in almost any code because
we usually need to check if `cap > len` or `len > 0` anyway. The recommended
Rust value to put here is `mem::align_of::<T>()`. Unique provides a convenience
for this: `Unique::empty()`. There are quite a few places where we'll
want to use `empty` because there's no real allocation to talk about but
`null` would make the compiler do bad things.

So:

```rust,ignore
#![feature(alloc, heap_api)]

use std::mem;

impl<T> Vec<T> {
    fn new() -> Self {
        assert!(mem::size_of::<T>() != 0, "We're not ready to handle ZSTs");
        Vec { ptr: Unique::empty(), len: 0, cap: 0 }
    }
}
```

I slipped in that assert there because zero-sized types will require some
special handling throughout our code, and I want to defer the issue for now.
Without this assert, some of our early drafts will do some Very Bad Things.

Next we need to figure out what to actually do when we *do* want space. For
that, we'll need to use the rest of the heap APIs. These basically allow us to
talk directly to Rust's allocator (jemalloc by default).

We'll also need a way to handle out-of-memory (OOM) conditions. The standard
library calls `std::alloc::oom()`, which in turn calls the the `oom` langitem.
By default this just aborts the program by executing an illegal cpu instruction.
The reason we abort and don't panic is because unwinding can cause allocations
to happen, and that seems like a bad thing to do when your allocator just came
back with "hey I don't have any more memory".

Of course, this is a bit silly since most platforms don't actually run out of
memory in a conventional way. Your operating system will probably kill the
application by another means if you legitimately start using up all the memory.
The most likely way we'll trigger OOM is by just asking for ludicrous quantities
of memory at once (e.g. half the theoretical address space). As such it's
*probably* fine to panic and nothing bad will happen. Still, we're trying to be
like the standard library as much as possible, so we'll just kill the whole
program.

Okay, now we can write growing. Roughly, we want to have this logic:

```text
if cap == 0:
    allocate()
    cap = 1
else:
    reallocate()
    cap *= 2
```

But Rust's only supported allocator API is so low level that we'll need to do a
fair bit of extra work. We also need to guard against some special
conditions that can occur with really large allocations or empty allocations.

In particular, `ptr::offset` will cause us a lot of trouble, because it has
the semantics of LLVM's GEP inbounds instruction. If you're fortunate enough to
not have dealt with this instruction, here's the basic story with GEP: alias
analysis, alias analysis, alias analysis. It's super important to an optimizing
compiler to be able to reason about data dependencies and aliasing.

As a simple example, consider the following fragment of code:

```rust
# let x = &mut 0;
# let y = &mut 0;
*x *= 7;
*y *= 3;
```

If the compiler can prove that `x` and `y` point to different locations in
memory, the two operations can in theory be executed in parallel (by e.g.
loading them into different registers and working on them independently).
However the compiler can't do this in general because if x and y point to
the same location in memory, the operations need to be done to the same value,
and they can't just be merged afterwards.

When you use GEP inbounds, you are specifically telling LLVM that the offsets
you're about to do are within the bounds of a single "allocated" entity. The
ultimate payoff being that LLVM can assume that if two pointers are known to
point to two disjoint objects, all the offsets of those pointers are *also*
known to not alias (because you won't just end up in some random place in
memory). LLVM is heavily optimized to work with GEP offsets, and inbounds
offsets are the best of all, so it's important that we use them as much as
possible.

So that's what GEP's about, how can it cause us trouble?

The first problem is that we index into arrays with unsigned integers, but
GEP (and as a consequence `ptr::offset`) takes a signed integer. This means
that half of the seemingly valid indices into an array will overflow GEP and
actually go in the wrong direction! As such we must limit all allocations to
`isize::MAX` elements. This actually means we only need to worry about
byte-sized objects, because e.g. `> isize::MAX` `u16`s will truly exhaust all of
the system's memory. However in order to avoid subtle corner cases where someone
reinterprets some array of `< isize::MAX` objects as bytes, std limits all
allocations to `isize::MAX` bytes.

On all 64-bit targets that Rust currently supports we're artificially limited
to significantly less than all 64 bits of the address space (modern x64
platforms only expose 48-bit addressing), so we can rely on just running out of
memory first. However on 32-bit targets, particularly those with extensions to
use more of the address space (PAE x86 or x32), it's theoretically possible to
successfully allocate more than `isize::MAX` bytes of memory.

However since this is a tutorial, we're not going to be particularly optimal
here, and just unconditionally check, rather than use clever platform-specific
`cfg`s.

The other corner-case we need to worry about is empty allocations. There will
be two kinds of empty allocations we need to worry about: `cap = 0` for all T,
and `cap > 0` for zero-sized types.

These cases are tricky because they come
down to what LLVM means by "allocated". LLVM's notion of an
allocation is significantly more abstract than how we usually use it. Because
LLVM needs to work with different languages' semantics and custom allocators,
it can't really intimately understand allocation. Instead, the main idea behind
allocation is "doesn't overlap with other stuff". That is, heap allocations,
stack allocations, and globals don't randomly overlap. Yep, it's about alias
analysis. As such, Rust can technically play a bit fast and loose with the notion of
an allocation as long as it's *consistent*.

Getting back to the empty allocation case, there are a couple of places where
we want to offset by 0 as a consequence of generic code. The question is then:
is it consistent to do so? For zero-sized types, we have concluded that it is
indeed consistent to do a GEP inbounds offset by an arbitrary number of
elements. This is a runtime no-op because every element takes up no space,
and it's fine to pretend that there's infinite zero-sized types allocated
at `0x01`. No allocator will ever allocate that address, because they won't
allocate `0x00` and they generally allocate to some minimal alignment higher
than a byte. Also generally the whole first page of memory is
protected from being allocated anyway (a whole 4k, on many platforms).

However what about for positive-sized types? That one's a bit trickier. In
principle, you can argue that offsetting by 0 gives LLVM no information: either
there's an element before the address or after it, but it can't know which.
However we've chosen to conservatively assume that it may do bad things. As
such we will guard against this case explicitly.

*Phew*

Ok with all the nonsense out of the way, let's actually allocate some memory:

```rust,ignore
use std::alloc::oom;

fn grow(&mut self) {
    // this is all pretty delicate, so let's say it's all unsafe
    unsafe {
        // current API requires us to specify size and alignment manually.
        let align = mem::align_of::<T>();
        let elem_size = mem::size_of::<T>();

        let (new_cap, ptr) = if self.cap == 0 {
            let ptr = heap::allocate(elem_size, align);
            (1, ptr)
        } else {
            // as an invariant, we can assume that `self.cap < isize::MAX`,
            // so this doesn't need to be checked.
            let new_cap = self.cap * 2;
            // Similarly this can't overflow due to previously allocating this
            let old_num_bytes = self.cap * elem_size;

            // check that the new allocation doesn't exceed `isize::MAX` at all
            // regardless of the actual size of the capacity. This combines the
            // `new_cap <= isize::MAX` and `new_num_bytes <= usize::MAX` checks
            // we need to make. We lose the ability to allocate e.g. 2/3rds of
            // the address space with a single Vec of i16's on 32-bit though.
            // Alas, poor Yorick -- I knew him, Horatio.
            assert!(old_num_bytes <= (::std::isize::MAX as usize) / 2,
                    "capacity overflow");

            let new_num_bytes = old_num_bytes * 2;
            let ptr = heap::reallocate(self.ptr.as_ptr() as *mut _,
                                        old_num_bytes,
                                        new_num_bytes,
                                        align);
            (new_cap, ptr)
        };

        // If allocate or reallocate fail, we'll get `null` back
        if ptr.is_null() { oom(); }

        self.ptr = Unique::new(ptr as *mut _);
        self.cap = new_cap;
    }
}
```

Nothing particularly tricky here. Just computing sizes and alignments and doing
some careful multiplication checks.
Commit	Line	Data
8bb4bdeb	1	# Allocating Memory
c1a9b12d SL	2
	3	Using Unique throws a wrench in an important feature of Vec (and indeed all of
	4	the std collections): an empty Vec doesn't actually allocate at all. So if we
	5	can't allocate, but also can't put a null pointer in `ptr`, what do we do in
	6	`Vec::new`? Well, we just put some other garbage in there!
	7
	8	This is perfectly fine because we already have `cap == 0` as our sentinel for no
	9	allocation. We don't even need to handle it specially in almost any code because
7cac9316 XL	10	we usually need to check if `cap > len` or `len > 0` anyway. The recommended
	11	Rust value to put here is `mem::align_of::<T>()`. Unique provides a convenience
	12	for this: `Unique::empty()`. There are quite a few places where we'll
	13	want to use `empty` because there's no real allocation to talk about but
c1a9b12d SL	14	`null` would make the compiler do bad things.
c1a9b12d SL	15
c1a9b12d SL	16	So:
	17
	18	```rust,ignore
e9174d1e	19	#![feature(alloc, heap_api)]
c1a9b12d	20
c1a9b12d SL	21	use std::mem;
	22
	23	impl<T> Vec<T> {
	24	fn new() -> Self {
	25	assert!(mem::size_of::<T>() != 0, "We're not ready to handle ZSTs");
7cac9316	26	Vec { ptr: Unique::empty(), len: 0, cap: 0 }
c1a9b12d SL	27	}
	28	}
	29	```
	30
	31	I slipped in that assert there because zero-sized types will require some
	32	special handling throughout our code, and I want to defer the issue for now.
	33	Without this assert, some of our early drafts will do some Very Bad Things.
	34
	35	Next we need to figure out what to actually do when we do want space. For
	36	that, we'll need to use the rest of the heap APIs. These basically allow us to
	37	talk directly to Rust's allocator (jemalloc by default).
	38
	39	We'll also need a way to handle out-of-memory (OOM) conditions. The standard
83c7162d XL	40	library calls `std::alloc::oom()`, which in turn calls the the `oom` langitem.
	41	By default this just aborts the program by executing an illegal cpu instruction.
	42	The reason we abort and don't panic is because unwinding can cause allocations
	43	to happen, and that seems like a bad thing to do when your allocator just came
	44	back with "hey I don't have any more memory".
c1a9b12d SL	45
	46	Of course, this is a bit silly since most platforms don't actually run out of
	47	memory in a conventional way. Your operating system will probably kill the
	48	application by another means if you legitimately start using up all the memory.
	49	The most likely way we'll trigger OOM is by just asking for ludicrous quantities
	50	of memory at once (e.g. half the theoretical address space). As such it's
	51	probably fine to panic and nothing bad will happen. Still, we're trying to be
	52	like the standard library as much as possible, so we'll just kill the whole
	53	program.
	54
c1a9b12d SL	55	Okay, now we can write growing. Roughly, we want to have this logic:
	56
	57	```text
	58	if cap == 0:
	59	allocate()
	60	cap = 1
	61	else:
	62	reallocate()
	63	cap *= 2
	64	```
	65
	66	But Rust's only supported allocator API is so low level that we'll need to do a
	67	fair bit of extra work. We also need to guard against some special
	68	conditions that can occur with really large allocations or empty allocations.
	69
	70	In particular, `ptr::offset` will cause us a lot of trouble, because it has
	71	the semantics of LLVM's GEP inbounds instruction. If you're fortunate enough to
	72	not have dealt with this instruction, here's the basic story with GEP: alias
	73	analysis, alias analysis, alias analysis. It's super important to an optimizing
	74	compiler to be able to reason about data dependencies and aliasing.
	75
	76	As a simple example, consider the following fragment of code:
	77
	78	```rust
	79	# let x = &mut 0;
	80	# let y = &mut 0;
	81	x = 7;
	82	y = 3;
	83	```
	84
	85	If the compiler can prove that `x` and `y` point to different locations in
	86	memory, the two operations can in theory be executed in parallel (by e.g.
	87	loading them into different registers and working on them independently).
	88	However the compiler can't do this in general because if x and y point to
	89	the same location in memory, the operations need to be done to the same value,
	90	and they can't just be merged afterwards.
	91
	92	When you use GEP inbounds, you are specifically telling LLVM that the offsets
	93	you're about to do are within the bounds of a single "allocated" entity. The
	94	ultimate payoff being that LLVM can assume that if two pointers are known to
	95	point to two disjoint objects, all the offsets of those pointers are also
	96	known to not alias (because you won't just end up in some random place in
	97	memory). LLVM is heavily optimized to work with GEP offsets, and inbounds
	98	offsets are the best of all, so it's important that we use them as much as
	99	possible.
	100
	101	So that's what GEP's about, how can it cause us trouble?
	102
	103	The first problem is that we index into arrays with unsigned integers, but
	104	GEP (and as a consequence `ptr::offset`) takes a signed integer. This means
	105	that half of the seemingly valid indices into an array will overflow GEP and
	106	actually go in the wrong direction! As such we must limit all allocations to
	107	`isize::MAX` elements. This actually means we only need to worry about
	108	byte-sized objects, because e.g. `> isize::MAX` `u16`s will truly exhaust all of
	109	the system's memory. However in order to avoid subtle corner cases where someone
	110	reinterprets some array of `< isize::MAX` objects as bytes, std limits all
	111	allocations to `isize::MAX` bytes.
	112
	113	On all 64-bit targets that Rust currently supports we're artificially limited
	114	to significantly less than all 64 bits of the address space (modern x64
	115	platforms only expose 48-bit addressing), so we can rely on just running out of
	116	memory first. However on 32-bit targets, particularly those with extensions to
	117	use more of the address space (PAE x86 or x32), it's theoretically possible to
	118	successfully allocate more than `isize::MAX` bytes of memory.
119
120	However since this is a tutorial, we're not going to be particularly optimal
121	here, and just unconditionally check, rather than use clever platform-specific
122	`cfg`s.
123
124	The other corner-case we need to worry about is empty allocations. There will
125	be two kinds of empty allocations we need to worry about: `cap = 0` for all T,
126	and `cap > 0` for zero-sized types.
127
128	These cases are tricky because they come
129	down to what LLVM means by "allocated". LLVM's notion of an
130	allocation is significantly more abstract than how we usually use it. Because
131	LLVM needs to work with different languages' semantics and custom allocators,
132	it can't really intimately understand allocation. Instead, the main idea behind
133	allocation is "doesn't overlap with other stuff". That is, heap allocations,
134	stack allocations, and globals don't randomly overlap. Yep, it's about alias
a7813a04	135	analysis. As such, Rust can technically play a bit fast and loose with the notion of
c1a9b12d SL	136	an allocation as long as it's consistent.
	137
	138	Getting back to the empty allocation case, there are a couple of places where
	139	we want to offset by 0 as a consequence of generic code. The question is then:
	140	is it consistent to do so? For zero-sized types, we have concluded that it is
	141	indeed consistent to do a GEP inbounds offset by an arbitrary number of
	142	elements. This is a runtime no-op because every element takes up no space,
	143	and it's fine to pretend that there's infinite zero-sized types allocated
	144	at `0x01`. No allocator will ever allocate that address, because they won't
	145	allocate `0x00` and they generally allocate to some minimal alignment higher
	146	than a byte. Also generally the whole first page of memory is
	147	protected from being allocated anyway (a whole 4k, on many platforms).
	148
	149	However what about for positive-sized types? That one's a bit trickier. In
	150	principle, you can argue that offsetting by 0 gives LLVM no information: either
	151	there's an element before the address or after it, but it can't know which.
	152	However we've chosen to conservatively assume that it may do bad things. As
	153	such we will guard against this case explicitly.
	154
	155	Phew
	156
	157	Ok with all the nonsense out of the way, let's actually allocate some memory:
	158
	159	```rust,ignore
83c7162d XL	160	use std::alloc::oom;
83c7162d XL	161
c1a9b12d SL	162	fn grow(&mut self) {
	163	// this is all pretty delicate, so let's say it's all unsafe
	164	unsafe {
	165	// current API requires us to specify size and alignment manually.
	166	let align = mem::align_of::<T>();
	167	let elem_size = mem::size_of::<T>();
	168
	169	let (new_cap, ptr) = if self.cap == 0 {
	170	let ptr = heap::allocate(elem_size, align);
	171	(1, ptr)
	172	} else {
	173	// as an invariant, we can assume that `self.cap < isize::MAX`,
	174	// so this doesn't need to be checked.
	175	let new_cap = self.cap * 2;
	176	// Similarly this can't overflow due to previously allocating this
	177	let old_num_bytes = self.cap * elem_size;
	178
	179	// check that the new allocation doesn't exceed `isize::MAX` at all
	180	// regardless of the actual size of the capacity. This combines the
	181	// `new_cap <= isize::MAX` and `new_num_bytes <= usize::MAX` checks
	182	// we need to make. We lose the ability to allocate e.g. 2/3rds of
	183	// the address space with a single Vec of i16's on 32-bit though.
	184	// Alas, poor Yorick -- I knew him, Horatio.
	185	assert!(old_num_bytes <= (::std::isize::MAX as usize) / 2,
	186	"capacity overflow");
	187
	188	let new_num_bytes = old_num_bytes * 2;
7cac9316	189	let ptr = heap::reallocate(self.ptr.as_ptr() as *mut _,
c1a9b12d SL	190	old_num_bytes,
	191	new_num_bytes,
	192	align);
	193	(new_cap, ptr)
	194	};
	195
	196	// If allocate or reallocate fail, we'll get `null` back
	197	if ptr.is_null() { oom(); }
	198
	199	self.ptr = Unique::new(ptr as *mut _);
	200	self.cap = new_cap;
	201	}
	202	}
	203	```
	204
	205	Nothing particularly tricky here. Just computing sizes and alignments and doing
	206	some careful multiplication checks.
	207