Commit | Line | Data |
---|---|---|
5099ac24 FG |
1 | //! This is a densely packed error representation which is used on targets with |
2 | //! 64-bit pointers. | |
3 | //! | |
4 | //! (Note that `bitpacked` vs `unpacked` here has no relationship to | |
5 | //! `#[repr(packed)]`, it just refers to attempting to use any available bits in | |
6 | //! a more clever manner than `rustc`'s default layout algorithm would). | |
7 | //! | |
8 | //! Conceptually, it stores the same data as the "unpacked" equivalent we use on | |
9 | //! other targets. Specifically, you can imagine it as an optimized version of | |
10 | //! the following enum (which is roughly equivalent to what's stored by | |
11 | //! `repr_unpacked::Repr`, e.g. `super::ErrorData<Box<Custom>>`): | |
12 | //! | |
13 | //! ```ignore (exposition-only) | |
14 | //! enum ErrorData { | |
15 | //! Os(i32), | |
16 | //! Simple(ErrorKind), | |
17 | //! SimpleMessage(&'static SimpleMessage), | |
18 | //! Custom(Box<Custom>), | |
19 | //! } | |
20 | //! ``` | |
21 | //! | |
22 | //! However, it packs this data into a 64-bit non-zero value. | |
23 | //! | |
24 | //! This optimization not only allows `io::Error` to occupy a single pointer, | |
25 | //! but improves `io::Result` as well, especially for situations like | |
26 | //! `io::Result<()>` (which is now 64 bits) or `io::Result<u64>` (which is now | |
27 | //! 128 bits), which are quite common. | |
28 | //! | |
29 | //! # Layout | |
30 | //! Tagged values are 64 bits, with the 2 least significant bits used for the | |
1f0639a9 | 31 | //! tag. This means there are 4 "variants": |
5099ac24 FG |
32 | //! |
33 | //! - **Tag 0b00**: The first variant is equivalent to | |
34 | //! `ErrorData::SimpleMessage`, and holds a `&'static SimpleMessage` directly. | |
35 | //! | |
36 | //! `SimpleMessage` has an alignment >= 4 (which is requested with | |
37 | //! `#[repr(align)]` and checked statically at the bottom of this file), which | |
38 | //! means every `&'static SimpleMessage` should have both tag bits as 0, | |
39 | //! meaning its tagged and untagged representation are equivalent. | |
40 | //! | |
41 | //! This means we can skip tagging it, which is necessary as this variant can | |
42 | //! be constructed from a `const fn`, which probably cannot tag pointers (or | |
43 | //! at least it would be difficult). | |
44 | //! | |
45 | //! - **Tag 0b01**: The other pointer variant holds the data for | |
46 | //! `ErrorData::Custom` and the remaining 62 bits are used to store a | |
47 | //! `Box<Custom>`. `Custom` also has alignment >= 4, so the bottom two bits | |
48 | //! are free to use for the tag. | |
49 | //! | |
50 | //! The only important thing to note is that `ptr::wrapping_add` and | |
51 | //! `ptr::wrapping_sub` are used to tag the pointer, rather than bitwise | |
52 | //! operations. This should preserve the pointer's provenance, which would | |
53 | //! otherwise be lost. | |
54 | //! | |
55 | //! - **Tag 0b10**: Holds the data for `ErrorData::Os(i32)`. We store the `i32` | |
56 | //! in the pointer's most significant 32 bits, and don't use the bits `2..32` | |
57 | //! for anything. Using the top 32 bits is just to let us easily recover the | |
58 | //! `i32` code with the correct sign. | |
59 | //! | |
60 | //! - **Tag 0b11**: Holds the data for `ErrorData::Simple(ErrorKind)`. This | |
61 | //! stores the `ErrorKind` in the top 32 bits as well, although it doesn't | |
62 | //! occupy nearly that many. Most of the bits are unused here, but it's not | |
63 | //! like we need them for anything else yet. | |
64 | //! | |
65 | //! # Use of `NonNull<()>` | |
66 | //! | |
67 | //! Everything is stored in a `NonNull<()>`, which is odd, but actually serves a | |
68 | //! purpose. | |
69 | //! | |
70 | //! Conceptually you might think of this more like: | |
71 | //! | |
72 | //! ```ignore (exposition-only) | |
73 | //! union Repr { | |
74 | //! // holds integer (Simple/Os) variants, and | |
75 | //! // provides access to the tag bits. | |
c620b35d | 76 | //! bits: NonZero<u64>, |
5099ac24 FG |
77 | //! // Tag is 0, so this is stored untagged. |
78 | //! msg: &'static SimpleMessage, | |
79 | //! // Tagged (offset) `Box<Custom>` pointer. | |
80 | //! tagged_custom: NonNull<()>, | |
81 | //! } | |
82 | //! ``` | |
83 | //! | |
84 | //! But there are a few problems with this: | |
85 | //! | |
86 | //! 1. Union access is equivalent to a transmute, so this representation would | |
87 | //! require we transmute between integers and pointers in at least one | |
88 | //! direction, which may be UB (and even if not, it is likely harder for a | |
89 | //! compiler to reason about than explicit ptr->int operations). | |
90 | //! | |
91 | //! 2. Even if all fields of a union have a niche, the union itself doesn't, | |
92 | //! although this may change in the future. This would make things like | |
93 | //! `io::Result<()>` and `io::Result<usize>` larger, which defeats part of | |
94 | //! the motivation of this bitpacking. | |
95 | //! | |
c620b35d | 96 | //! Storing everything in a `NonZero<usize>` (or some other integer) would be a |
5099ac24 FG |
97 | //! bit more traditional for pointer tagging, but it would lose provenance |
98 | //! information, couldn't be constructed from a `const fn`, and would probably | |
99 | //! run into other issues as well. | |
100 | //! | |
101 | //! The `NonNull<()>` seems like the only alternative, even if it's fairly odd | |
102 | //! to use a pointer type to store something that may hold an integer, some of | |
103 | //! the time. | |
104 | ||
9ffffee4 | 105 | use super::{Custom, ErrorData, ErrorKind, RawOsError, SimpleMessage}; |
5099ac24 | 106 | use core::marker::PhantomData; |
5e7ed085 | 107 | use core::ptr::{self, NonNull}; |
5099ac24 FG |
108 | |
109 | // The 2 least-significant bits are used as tag. | |
110 | const TAG_MASK: usize = 0b11; | |
111 | const TAG_SIMPLE_MESSAGE: usize = 0b00; | |
112 | const TAG_CUSTOM: usize = 0b01; | |
113 | const TAG_OS: usize = 0b10; | |
114 | const TAG_SIMPLE: usize = 0b11; | |
115 | ||
116 | /// The internal representation. | |
117 | /// | |
118 | /// See the module docs for more, this is just a way to hack in a check that we | |
119 | /// indeed are not unwind-safe. | |
120 | /// | |
121 | /// ```compile_fail,E0277 | |
122 | /// fn is_unwind_safe<T: core::panic::UnwindSafe>() {} | |
123 | /// is_unwind_safe::<std::io::Error>(); | |
124 | /// ``` | |
125 | #[repr(transparent)] | |
126 | pub(super) struct Repr(NonNull<()>, PhantomData<ErrorData<Box<Custom>>>); | |
127 | ||
128 | // All the types `Repr` stores internally are Send + Sync, and so is it. | |
129 | unsafe impl Send for Repr {} | |
130 | unsafe impl Sync for Repr {} | |
131 | ||
132 | impl Repr { | |
064997fb FG |
133 | pub(super) fn new(dat: ErrorData<Box<Custom>>) -> Self { |
134 | match dat { | |
135 | ErrorData::Os(code) => Self::new_os(code), | |
136 | ErrorData::Simple(kind) => Self::new_simple(kind), | |
137 | ErrorData::SimpleMessage(simple_message) => Self::new_simple_message(simple_message), | |
138 | ErrorData::Custom(b) => Self::new_custom(b), | |
139 | } | |
140 | } | |
141 | ||
5099ac24 FG |
142 | pub(super) fn new_custom(b: Box<Custom>) -> Self { |
143 | let p = Box::into_raw(b).cast::<u8>(); | |
144 | // Should only be possible if an allocator handed out a pointer with | |
145 | // wrong alignment. | |
5e7ed085 | 146 | debug_assert_eq!(p.addr() & TAG_MASK, 0); |
5099ac24 FG |
147 | // Note: We know `TAG_CUSTOM <= size_of::<Custom>()` (static_assert at |
148 | // end of file), and both the start and end of the expression must be | |
149 | // valid without address space wraparound due to `Box`'s semantics. | |
150 | // | |
151 | // This means it would be correct to implement this using `ptr::add` | |
152 | // (rather than `ptr::wrapping_add`), but it's unclear this would give | |
153 | // any benefit, so we just use `wrapping_add` instead. | |
154 | let tagged = p.wrapping_add(TAG_CUSTOM).cast::<()>(); | |
155 | // Safety: `TAG_CUSTOM + p` is the same as `TAG_CUSTOM | p`, | |
156 | // because `p`'s alignment means it isn't allowed to have any of the | |
157 | // `TAG_MASK` bits set (you can verify that addition and bitwise-or are the | |
158 | // same when the operands have no bits in common using a truth table). | |
159 | // | |
160 | // Then, `TAG_CUSTOM | p` is not zero, as that would require | |
161 | // `TAG_CUSTOM` and `p` both be zero, and neither is (as `p` came from a | |
162 | // box, and `TAG_CUSTOM` just... isn't zero -- it's `0b01`). Therefore, | |
163 | // `TAG_CUSTOM + p` isn't zero and so `tagged` can't be, and the | |
164 | // `new_unchecked` is safe. | |
165 | let res = Self(unsafe { NonNull::new_unchecked(tagged) }, PhantomData); | |
166 | // quickly smoke-check we encoded the right thing (This generally will | |
9c376795 | 167 | // only run in std's tests, unless the user uses -Zbuild-std) |
5099ac24 FG |
168 | debug_assert!(matches!(res.data(), ErrorData::Custom(_)), "repr(custom) encoding failed"); |
169 | res | |
170 | } | |
171 | ||
172 | #[inline] | |
9ffffee4 | 173 | pub(super) fn new_os(code: RawOsError) -> Self { |
5099ac24 FG |
174 | let utagged = ((code as usize) << 32) | TAG_OS; |
175 | // Safety: `TAG_OS` is not zero, so the result of the `|` is not 0. | |
c620b35d FG |
176 | let res = Self( |
177 | unsafe { NonNull::new_unchecked(ptr::without_provenance_mut(utagged)) }, | |
178 | PhantomData, | |
179 | ); | |
5099ac24 | 180 | // quickly smoke-check we encoded the right thing (This generally will |
9c376795 | 181 | // only run in std's tests, unless the user uses -Zbuild-std) |
5099ac24 FG |
182 | debug_assert!( |
183 | matches!(res.data(), ErrorData::Os(c) if c == code), | |
5e7ed085 | 184 | "repr(os) encoding failed for {code}" |
5099ac24 FG |
185 | ); |
186 | res | |
187 | } | |
188 | ||
189 | #[inline] | |
190 | pub(super) fn new_simple(kind: ErrorKind) -> Self { | |
191 | let utagged = ((kind as usize) << 32) | TAG_SIMPLE; | |
192 | // Safety: `TAG_SIMPLE` is not zero, so the result of the `|` is not 0. | |
c620b35d FG |
193 | let res = Self( |
194 | unsafe { NonNull::new_unchecked(ptr::without_provenance_mut(utagged)) }, | |
195 | PhantomData, | |
196 | ); | |
5099ac24 | 197 | // quickly smoke-check we encoded the right thing (This generally will |
9c376795 | 198 | // only run in std's tests, unless the user uses -Zbuild-std) |
5099ac24 FG |
199 | debug_assert!( |
200 | matches!(res.data(), ErrorData::Simple(k) if k == kind), | |
201 | "repr(simple) encoding failed {:?}", | |
202 | kind, | |
203 | ); | |
204 | res | |
205 | } | |
206 | ||
207 | #[inline] | |
208 | pub(super) const fn new_simple_message(m: &'static SimpleMessage) -> Self { | |
209 | // Safety: References are never null. | |
210 | Self(unsafe { NonNull::new_unchecked(m as *const _ as *mut ()) }, PhantomData) | |
211 | } | |
212 | ||
213 | #[inline] | |
214 | pub(super) fn data(&self) -> ErrorData<&Custom> { | |
215 | // Safety: We're a Repr, decode_repr is fine. | |
216 | unsafe { decode_repr(self.0, |c| &*c) } | |
217 | } | |
218 | ||
219 | #[inline] | |
220 | pub(super) fn data_mut(&mut self) -> ErrorData<&mut Custom> { | |
221 | // Safety: We're a Repr, decode_repr is fine. | |
222 | unsafe { decode_repr(self.0, |c| &mut *c) } | |
223 | } | |
224 | ||
225 | #[inline] | |
226 | pub(super) fn into_data(self) -> ErrorData<Box<Custom>> { | |
227 | let this = core::mem::ManuallyDrop::new(self); | |
228 | // Safety: We're a Repr, decode_repr is fine. The `Box::from_raw` is | |
229 | // safe because we prevent double-drop using `ManuallyDrop`. | |
230 | unsafe { decode_repr(this.0, |p| Box::from_raw(p)) } | |
231 | } | |
232 | } | |
233 | ||
234 | impl Drop for Repr { | |
235 | #[inline] | |
236 | fn drop(&mut self) { | |
237 | // Safety: We're a Repr, decode_repr is fine. The `Box::from_raw` is | |
238 | // safe because we're being dropped. | |
239 | unsafe { | |
240 | let _ = decode_repr(self.0, |p| Box::<Custom>::from_raw(p)); | |
241 | } | |
242 | } | |
243 | } | |
244 | ||
245 | // Shared helper to decode a `Repr`'s internal pointer into an ErrorData. | |
246 | // | |
247 | // Safety: `ptr`'s bits should be encoded as described in the document at the | |
248 | // top (it should be `some_repr.0`) | |
249 | #[inline] | |
250 | unsafe fn decode_repr<C, F>(ptr: NonNull<()>, make_custom: F) -> ErrorData<C> | |
251 | where | |
252 | F: FnOnce(*mut Custom) -> C, | |
253 | { | |
5e7ed085 | 254 | let bits = ptr.as_ptr().addr(); |
5099ac24 FG |
255 | match bits & TAG_MASK { |
256 | TAG_OS => { | |
9ffffee4 | 257 | let code = ((bits as i64) >> 32) as RawOsError; |
5099ac24 FG |
258 | ErrorData::Os(code) |
259 | } | |
260 | TAG_SIMPLE => { | |
261 | let kind_bits = (bits >> 32) as u32; | |
262 | let kind = kind_from_prim(kind_bits).unwrap_or_else(|| { | |
263 | debug_assert!(false, "Invalid io::error::Repr bits: `Repr({:#018x})`", bits); | |
264 | // This means the `ptr` passed in was not valid, which violates | |
265 | // the unsafe contract of `decode_repr`. | |
266 | // | |
267 | // Using this rather than unwrap meaningfully improves the code | |
268 | // for callers which only care about one variant (usually | |
269 | // `Custom`) | |
1f0639a9 | 270 | unsafe { core::hint::unreachable_unchecked() }; |
5099ac24 FG |
271 | }); |
272 | ErrorData::Simple(kind) | |
273 | } | |
1f0639a9 FG |
274 | TAG_SIMPLE_MESSAGE => { |
275 | // SAFETY: per tag | |
276 | unsafe { ErrorData::SimpleMessage(&*ptr.cast::<SimpleMessage>().as_ptr()) } | |
277 | } | |
5099ac24 | 278 | TAG_CUSTOM => { |
f2b60f7d | 279 | // It would be correct for us to use `ptr::byte_sub` here (see the |
5099ac24 FG |
280 | // comment above the `wrapping_add` call in `new_custom` for why), |
281 | // but it isn't clear that it makes a difference, so we don't. | |
f2b60f7d | 282 | let custom = ptr.as_ptr().wrapping_byte_sub(TAG_CUSTOM).cast::<Custom>(); |
5099ac24 FG |
283 | ErrorData::Custom(make_custom(custom)) |
284 | } | |
285 | _ => { | |
286 | // Can't happen, and compiler can tell | |
287 | unreachable!(); | |
288 | } | |
289 | } | |
290 | } | |
291 | ||
292 | // This compiles to the same code as the check+transmute, but doesn't require | |
293 | // unsafe, or to hard-code max ErrorKind or its size in a way the compiler | |
294 | // couldn't verify. | |
295 | #[inline] | |
296 | fn kind_from_prim(ek: u32) -> Option<ErrorKind> { | |
297 | macro_rules! from_prim { | |
298 | ($prim:expr => $Enum:ident { $($Variant:ident),* $(,)? }) => {{ | |
299 | // Force a compile error if the list gets out of date. | |
300 | const _: fn(e: $Enum) = |e: $Enum| match e { | |
301 | $($Enum::$Variant => ()),* | |
302 | }; | |
303 | match $prim { | |
304 | $(v if v == ($Enum::$Variant as _) => Some($Enum::$Variant),)* | |
305 | _ => None, | |
306 | } | |
307 | }} | |
308 | } | |
309 | from_prim!(ek => ErrorKind { | |
310 | NotFound, | |
311 | PermissionDenied, | |
312 | ConnectionRefused, | |
313 | ConnectionReset, | |
314 | HostUnreachable, | |
315 | NetworkUnreachable, | |
316 | ConnectionAborted, | |
317 | NotConnected, | |
318 | AddrInUse, | |
319 | AddrNotAvailable, | |
320 | NetworkDown, | |
321 | BrokenPipe, | |
322 | AlreadyExists, | |
323 | WouldBlock, | |
324 | NotADirectory, | |
325 | IsADirectory, | |
326 | DirectoryNotEmpty, | |
327 | ReadOnlyFilesystem, | |
328 | FilesystemLoop, | |
329 | StaleNetworkFileHandle, | |
330 | InvalidInput, | |
331 | InvalidData, | |
332 | TimedOut, | |
333 | WriteZero, | |
334 | StorageFull, | |
335 | NotSeekable, | |
336 | FilesystemQuotaExceeded, | |
337 | FileTooLarge, | |
338 | ResourceBusy, | |
339 | ExecutableFileBusy, | |
340 | Deadlock, | |
341 | CrossesDevices, | |
342 | TooManyLinks, | |
343 | InvalidFilename, | |
344 | ArgumentListTooLong, | |
345 | Interrupted, | |
346 | Other, | |
347 | UnexpectedEof, | |
348 | Unsupported, | |
349 | OutOfMemory, | |
350 | Uncategorized, | |
351 | }) | |
352 | } | |
353 | ||
354 | // Some static checking to alert us if a change breaks any of the assumptions | |
355 | // that our encoding relies on for correctness and soundness. (Some of these are | |
356 | // a bit overly thorough/cautious, admittedly) | |
357 | // | |
9c376795 | 358 | // If any of these are hit on a platform that std supports, we should likely |
5099ac24 FG |
359 | // just use `repr_unpacked.rs` there instead (unless the fix is easy). |
360 | macro_rules! static_assert { | |
361 | ($condition:expr) => { | |
362 | const _: () = assert!($condition); | |
363 | }; | |
364 | (@usize_eq: $lhs:expr, $rhs:expr) => { | |
365 | const _: [(); $lhs] = [(); $rhs]; | |
366 | }; | |
367 | } | |
368 | ||
369 | // The bitpacking we use requires pointers be exactly 64 bits. | |
370 | static_assert!(@usize_eq: size_of::<NonNull<()>>(), 8); | |
371 | ||
372 | // We also require pointers and usize be the same size. | |
373 | static_assert!(@usize_eq: size_of::<NonNull<()>>(), size_of::<usize>()); | |
374 | ||
375 | // `Custom` and `SimpleMessage` need to be thin pointers. | |
376 | static_assert!(@usize_eq: size_of::<&'static SimpleMessage>(), 8); | |
377 | static_assert!(@usize_eq: size_of::<Box<Custom>>(), 8); | |
378 | ||
379 | static_assert!((TAG_MASK + 1).is_power_of_two()); | |
380 | // And they must have sufficient alignment. | |
381 | static_assert!(align_of::<SimpleMessage>() >= TAG_MASK + 1); | |
382 | static_assert!(align_of::<Custom>() >= TAG_MASK + 1); | |
383 | ||
9c376795 FG |
384 | static_assert!(@usize_eq: TAG_MASK & TAG_SIMPLE_MESSAGE, TAG_SIMPLE_MESSAGE); |
385 | static_assert!(@usize_eq: TAG_MASK & TAG_CUSTOM, TAG_CUSTOM); | |
386 | static_assert!(@usize_eq: TAG_MASK & TAG_OS, TAG_OS); | |
387 | static_assert!(@usize_eq: TAG_MASK & TAG_SIMPLE, TAG_SIMPLE); | |
5099ac24 FG |
388 | |
389 | // This is obviously true (`TAG_CUSTOM` is `0b01`), but in `Repr::new_custom` we | |
390 | // offset a pointer by this value, and expect it to both be within the same | |
391 | // object, and to not wrap around the address space. See the comment in that | |
392 | // function for further details. | |
393 | // | |
394 | // Actually, at the moment we use `ptr::wrapping_add`, not `ptr::add`, so this | |
395 | // check isn't needed for that one, although the assertion that we don't | |
396 | // actually wrap around in that wrapping_add does simplify the safety reasoning | |
397 | // elsewhere considerably. | |
398 | static_assert!(size_of::<Custom>() >= TAG_CUSTOM); | |
399 | ||
400 | // These two store a payload which is allowed to be zero, so they must be | |
401 | // non-zero to preserve the `NonNull`'s range invariant. | |
402 | static_assert!(TAG_OS != 0); | |
403 | static_assert!(TAG_SIMPLE != 0); | |
404 | // We can't tag `SimpleMessage`s, the tag must be 0. | |
405 | static_assert!(@usize_eq: TAG_SIMPLE_MESSAGE, 0); | |
406 | ||
407 | // Check that the point of all of this still holds. | |
408 | // | |
409 | // We'd check against `io::Error`, but *technically* it's allowed to vary, | |
410 | // as it's not `#[repr(transparent)]`/`#[repr(C)]`. We could add that, but | |
411 | // the `#[repr()]` would show up in rustdoc, which might be seen as a stable | |
412 | // commitment. | |
413 | static_assert!(@usize_eq: size_of::<Repr>(), 8); | |
414 | static_assert!(@usize_eq: size_of::<Option<Repr>>(), 8); | |
415 | static_assert!(@usize_eq: size_of::<Result<(), Repr>>(), 8); | |
416 | static_assert!(@usize_eq: size_of::<Result<usize, Repr>>(), 16); |
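
The tagging scheme described in the module docs can be tried outside of `std` with a minimal sketch like the one below. It is not the code from this file: `pack_os`, `unpack_os`, `Payload`, `tag_ptr` and `untag_ptr` are hypothetical names chosen for illustration, `Payload` stands in for `Custom`, and the sketch assumes a target with 64-bit pointers. It shows the two ideas the layout relies on: an `i32` stored in the top 32 bits is recovered with its sign by an arithmetic shift on `i64`, and a pointer with alignment >= 4 can carry a tag in its low two bits when the tag is applied with `wrapping_add` rather than integer arithmetic.

```rust
// Illustrative constants mirroring the ones in the file above.
const TAG_MASK: usize = 0b11;
const TAG_CUSTOM: usize = 0b01;
const TAG_OS: usize = 0b10;

/// Pack an OS error code into the top 32 bits, tag in the low 2 bits.
/// Assumes `usize` is 64 bits wide.
fn pack_os(code: i32) -> usize {
    ((code as usize) << 32) | TAG_OS
}

/// Recover the code if the tag matches. Shifting as `i64` is an arithmetic
/// shift, so negative codes come back with the correct sign.
fn unpack_os(bits: usize) -> Option<i32> {
    if bits & TAG_MASK == TAG_OS {
        Some(((bits as i64) >> 32) as i32)
    } else {
        None
    }
}

/// Hypothetical stand-in for `Custom`: alignment 4 leaves the two low
/// address bits zero.
#[repr(align(4))]
struct Payload(u32);

/// Tag a boxed payload by offsetting the pointer one byte at a time,
/// which keeps the pointer's provenance.
fn tag_ptr(b: Box<Payload>) -> *mut () {
    let p = Box::into_raw(b).cast::<u8>();
    p.wrapping_add(TAG_CUSTOM).cast::<()>()
}

/// Undo `tag_ptr`.
///
/// Safety: `tagged` must come from `tag_ptr` and must not be reused after
/// this call, since ownership moves back into the returned `Box`.
unsafe fn untag_ptr(tagged: *mut ()) -> Box<Payload> {
    let p = tagged.cast::<u8>().wrapping_sub(TAG_CUSTOM).cast::<Payload>();
    unsafe { Box::from_raw(p) }
}
```

A caller would round-trip values through these helpers, for example `unpack_os(pack_os(-1))`, and check the low bits of a tagged pointer with `(ptr as usize) & TAG_MASK`. The real `Repr` additionally wraps the result in `NonNull<()>` so that `Option<Repr>` and `Result<(), Repr>` can use the null niche.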