Imported Upstream version 1.0.0+dfsg1
% Unsafe Code

# Introduction

Rust aims to provide safe abstractions over the low-level details of
the CPU and operating system, but sometimes one needs to drop down and
write code at that level. This guide aims to provide an overview of
the dangers and power one gets with Rust's unsafe subset.

Rust provides an escape hatch in the form of the `unsafe { ... }`
block, which allows the programmer to dodge some of the compiler's
checks and do a wide range of operations, such as:

- dereferencing [raw pointers](#raw-pointers)
- calling a function via FFI ([covered by the FFI guide](ffi.html))
- casting between types bitwise (`transmute`, aka "reinterpret cast")
- [inline assembly](#inline-assembly)
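
As a minimal sketch of two of the operations listed above, the following
shows a bitwise cast and a raw pointer dereference, each confined to a
small `unsafe` block inside a safe function. The helper names
`float_bits` and `read_via_raw` are hypothetical and exist only for
illustration.

```rust
use std::mem;

/// Hypothetical helper: reinterprets the bits of an `f32` as a `u32`.
fn float_bits(x: f32) -> u32 {
    // Both types are four bytes wide and every bit pattern is a valid
    // `u32`, so this particular transmute is sound.
    unsafe { mem::transmute::<f32, u32>(x) }
}

/// Hypothetical helper: reads an integer through a raw pointer.
fn read_via_raw(value: &i32) -> i32 {
    // Creating the raw pointer is safe; only the dereference is not.
    let p: *const i32 = value;
    // `p` was derived from a live reference, so it is valid to read.
    unsafe { *p }
}

fn main() {
    // 1.0 in IEEE 754 single precision is 0x3f800000.
    assert_eq!(float_bits(1.0), 0x3f80_0000);
    assert_eq!(read_via_raw(&7), 7);
}
```

Callers of these functions never write `unsafe` themselves; the
justification for each block sits right next to it.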

Note that an `unsafe` block does not relax the rules about lifetimes
of `&` and the freezing of borrowed data.

Any use of `unsafe` is the programmer saying "I know more than you" to
the compiler, and, as such, the programmer should be very sure that
they actually do know more about why that piece of code is valid. In
general, one should try to minimize the amount of unsafe code in a
code base, preferably by using the bare minimum of `unsafe` blocks to
build safe interfaces.

> **Note**: the low-level details of the Rust language are still in
> flux, and there is no guarantee of stability or backwards
> compatibility. In particular, there may be changes that do not cause
> compilation errors, but do cause semantic changes (such as invoking
> undefined behaviour). As such, extreme care is required.

# Pointers

## References

One of Rust's biggest features is memory safety. This is achieved in
part via [the ownership system](ownership.html), which is how the
compiler can guarantee that every `&` reference is always valid, and,
for example, never pointing to freed memory.

These restrictions on `&` have huge advantages. However, they also
constrain how we can use them. For example, `&` doesn't behave
identically to C's pointers, and so cannot be used for pointers in
foreign function interfaces (FFI). Additionally, both immutable (`&`)
and mutable (`&mut`) references have some aliasing and freezing
guarantees, required for memory safety.

In particular, if you have an `&T` reference, then the `T` must not be
modified through that reference or any other reference. There are some
standard library types, e.g. `Cell` and `RefCell`, that provide interior
mutability by replacing compile-time guarantees with dynamic checks at
runtime.
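
To illustrate the dynamic checks mentioned above, here is a small
sketch using `Cell` and `RefCell` from the standard library. The
functions `bump` and `push_checked` are hypothetical names, and
`RefCell::try_borrow_mut` is assumed to be available, which holds for
current stable Rust but not for the oldest releases.

```rust
use std::cell::{Cell, RefCell};

/// Hypothetical helper: increments a counter through a shared reference.
fn bump(counter: &Cell<u32>) -> u32 {
    // `Cell` copies values in and out, so no reference to the interior
    // ever escapes.
    counter.set(counter.get() + 1);
    counter.get()
}

/// Hypothetical helper: pushes a value unless the vector is already
/// borrowed, and reports whether the push happened.
fn push_checked(log: &RefCell<Vec<u32>>, value: u32) -> bool {
    // The borrow rules are checked here, at runtime.
    match log.try_borrow_mut() {
        Ok(mut guard) => {
            guard.push(value);
            true
        }
        Err(_) => false,
    }
}

fn main() {
    let counter = Cell::new(0);
    assert_eq!(bump(&counter), 1);

    let log = RefCell::new(Vec::new());
    assert!(push_checked(&log, 1));

    // While a shared borrow is alive, a mutable borrow is refused.
    let reader = log.borrow();
    assert!(!push_checked(&log, 2));
    assert_eq!(*reader, vec![1]);
}
```

Neither helper needs `unsafe`: the types keep the aliasing rules intact
by checking them while the program runs.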

An `&mut` reference has a different constraint: when an object has an
`&mut T` pointing into it, then that `&mut` reference must be the only
such usable path to that object in the whole program. That is, an
`&mut` cannot alias with any other references.

Using `unsafe` code to incorrectly circumvent and violate these
restrictions is undefined behaviour. For example, the following
creates two aliasing `&mut` pointers, and is invalid.

```
use std::mem;
let mut x: u8 = 1;

let ref_1: &mut u8 = &mut x;
let ref_2: &mut u8 = unsafe { mem::transmute(&mut *ref_1) };

// oops, ref_1 and ref_2 point to the same piece of data (x) and are
// both usable
*ref_1 = 10;
*ref_2 = 20;
```

## Raw pointers

Rust offers two additional pointer types (*raw pointers*), written as
`*const T` and `*mut T`. They're an approximation of C's `const T*` and `T*`
respectively; indeed, one of their most common uses is for FFI,
interfacing with external C libraries.

Raw pointers have far fewer guarantees than other pointer types
offered by the Rust language and libraries. For example, they

- are not guaranteed to point to valid memory and are not even
  guaranteed to be non-null (unlike both `Box` and `&`);
- do not have any automatic clean-up, unlike `Box`, and so require
  manual resource management;
- are plain-old-data, that is, they don't move ownership, again unlike
  `Box`, hence the Rust compiler cannot protect against bugs like
  use-after-free;
- lack any form of lifetimes, unlike `&`, and so the compiler cannot
  reason about dangling pointers; and
- have no guarantees about aliasing or mutability other than mutation
  not being allowed directly through a `*const T`.

Fortunately, they come with a redeeming feature: the weaker guarantees
mean weaker restrictions. The missing restrictions make raw pointers
appropriate as a building block for implementing things like smart
pointers and vectors inside libraries. For example, `*` pointers are
allowed to alias, allowing them to be used to write shared-ownership
types like reference counted and garbage collected pointers, and even
thread-safe shared memory types (`Rc` and the `Arc` types are both
implemented entirely in Rust).
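
The sketch below illustrates two of the points above: raw pointers may
alias freely, and they may be null. The function `write_twice` is a
hypothetical name used only for this example.

```rust
use std::ptr;

/// Hypothetical helper: writes to the same byte through two aliasing
/// raw pointers and returns the final value.
fn write_twice(x: &mut u8) -> u8 {
    let p1: *mut u8 = x;
    // Raw pointers are plain data: copying one yields a second pointer
    // to the same location, and the compiler does not object.
    let p2: *mut u8 = p1;
    unsafe {
        *p1 = 10;
        *p2 += 5;
        *p1
    }
}

fn main() {
    let mut x = 1u8;
    assert_eq!(write_twice(&mut x), 15);
    assert_eq!(x, 15);

    // Unlike `&` and `Box`, a raw pointer can be null. Creating and
    // inspecting it is safe; dereferencing it would not be.
    let nothing: *const u8 = ptr::null();
    assert!(nothing.is_null());
}
```

Contrast this with the aliasing `&mut` example in the previous section:
the same pattern is permitted here because raw pointers carry no
uniqueness guarantee for the compiler to rely on.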

Two operations on raw pointers demand care, and therefore require an
`unsafe { ... }` block:

- dereferencing: a raw pointer can hold any value, so possible results
  include a crash, a read of uninitialised memory, a use-after-free, or
  reading data as normal.
- pointer arithmetic via the `offset` [intrinsic](#intrinsics) (or
  `.offset` method): this intrinsic uses so-called "in-bounds"
  arithmetic, that is, it is only defined behaviour if the result is
  inside (or one-byte-past-the-end) of the object from which the
  original pointer came.

The latter assumption allows the compiler to optimize more
effectively. As can be seen, actually *creating* a raw pointer is not
unsafe, and neither is converting one to an integer.
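
A short sketch of these rules follows. The helper names `third_element`
and `address_of` are hypothetical. Note that `.offset` counts in units
of the pointed-to type, not in bytes.

```rust
use std::mem;

/// Hypothetical helper: reads the element at index 2 via pointer
/// arithmetic.
fn third_element(data: &[i32; 4]) -> i32 {
    let base: *const i32 = data.as_ptr();
    // Offset 2 stays inside the four-element array, so both the
    // arithmetic and the read are defined.
    unsafe { *base.offset(2) }
}

/// Hypothetical helper: creates a raw pointer and converts it to an
/// integer. No `unsafe` block is needed for either step.
fn address_of(value: &i32) -> usize {
    let p: *const i32 = value;
    p as usize
}

fn main() {
    let data = [10, 20, 30, 40];
    assert_eq!(third_element(&data), 30);

    // Computing the one-past-the-end pointer is allowed; reading
    // through it would not be.
    let one_past: *const i32 = unsafe { data.as_ptr().offset(4) };
    let distance = one_past as usize - data.as_ptr() as usize;
    assert_eq!(distance, 4 * mem::size_of::<i32>());

    let v = 5;
    assert_eq!(address_of(&v), &v as *const i32 as usize);
}
```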

### References and raw pointers

At runtime, a raw pointer `*` and a reference pointing to the same
piece of data have an identical representation. In fact, an `&T`
reference will implicitly coerce to an `*const T` raw pointer in safe code
and similarly for the `mut` variants (both coercions can be performed
explicitly with, respectively, `value as *const T` and `value as *mut T`).

Going the opposite direction, from `*const` to a reference `&`, is not
safe. A `&T` is always valid, and so, at a minimum, the raw pointer
`*const T` has to point to a valid instance of type `T`. Furthermore,
the resulting pointer must satisfy the aliasing and mutability laws of
references. The compiler assumes these properties are true for any
references, no matter how they are created, and so any conversion from
raw pointers is asserting that they hold. The programmer *must*
guarantee this.

The recommended method for the conversion is:

```
let i: u32 = 1;
// explicit cast
let p_imm: *const u32 = &i as *const u32;
let mut m: u32 = 2;
// implicit coercion
let p_mut: *mut u32 = &mut m;

unsafe {
    let ref_imm: &u32 = &*p_imm;
    let ref_mut: &mut u32 = &mut *p_mut;
}
```

The `&*x` dereferencing style is preferred to using a `transmute`.
The latter is far more powerful than necessary, and the more
restricted operation is harder to use incorrectly; for example, it
requires that `x` is a pointer (unlike `transmute`).

## Making the unsafe safe(r)

There are various ways to expose a safe interface around some unsafe
code:

- store pointers privately (i.e. not in public fields of public
  structs), so that you can see and control all reads and writes to
  the pointer in one place.
- use `assert!()` a lot: since you can't rely on the protection of the
  compiler & type-system to ensure that your `unsafe` code is correct
  at compile-time, use `assert!()` to verify that it is doing the
  right thing at run-time.
- implement the `Drop` trait for resource clean-up via a destructor, and
  use RAII (Resource Acquisition Is Initialization). This reduces the
  need for any manual memory management by users, and automatically
  ensures that clean-up is always run, even when the thread panics.
- ensure that any data stored behind a raw pointer is destroyed at the
  appropriate time.
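
The following is a minimal sketch of how these guidelines can combine,
not a production-ready smart pointer. The `Owned` and `Tracker` types
are hypothetical, and the sketch assumes `Box::into_raw` and
`Box::from_raw` are available, as they are in current stable Rust.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical owning wrapper around a heap allocation.
struct Owned<T> {
    // Private field: every read and write of the pointer happens in
    // this one place.
    ptr: *mut T,
}

impl<T> Owned<T> {
    fn new(value: T) -> Owned<T> {
        // `Box::into_raw` hands over the allocation without freeing it.
        Owned { ptr: Box::into_raw(Box::new(value)) }
    }

    fn get(&self) -> &T {
        assert!(!self.ptr.is_null(), "Owned must never hold null");
        // Valid as long as `ptr` came from `new` and was not yet freed.
        unsafe { &*self.ptr }
    }

    fn set(&mut self, value: T) {
        assert!(!self.ptr.is_null(), "Owned must never hold null");
        // Assignment drops the old value before storing the new one.
        unsafe { *self.ptr = value; }
    }
}

impl<T> Drop for Owned<T> {
    fn drop(&mut self) {
        // Rebuild the `Box` so the value is destroyed and the memory
        // freed exactly once, even during a panic unwind.
        unsafe { drop(Box::from_raw(self.ptr)); }
    }
}

/// Hypothetical helper that counts how often it has been dropped.
struct Tracker(Rc<Cell<u32>>);

impl Drop for Tracker {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

fn main() {
    let mut n = Owned::new(1);
    n.set(2);
    assert_eq!(*n.get(), 2);

    let drops = Rc::new(Cell::new(0));
    {
        let _guard = Owned::new(Tracker(drops.clone()));
    } // `_guard` goes out of scope here and runs `Drop`.
    assert_eq!(drops.get(), 1);
}
```

Users of `Owned` never see the raw pointer, so they cannot free it twice
or forget to free it; the only `unsafe` blocks sit behind three small
methods and the destructor.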