]> git.proxmox.com Git - rustc.git/blame - library/std/src/io/error/repr_bitpacked.rs
bump version to 1.81.0+dfsg1-2~bpo12+pve1
[rustc.git] / library / std / src / io / error / repr_bitpacked.rs
CommitLineData
5099ac24
FG
1//! This is a densely packed error representation which is used on targets with
2//! 64-bit pointers.
3//!
4//! (Note that `bitpacked` vs `unpacked` here has no relationship to
5//! `#[repr(packed)]`, it just refers to attempting to use any available bits in
6//! a more clever manner than `rustc`'s default layout algorithm would).
7//!
8//! Conceptually, it stores the same data as the "unpacked" equivalent we use on
9//! other targets. Specifically, you can imagine it as an optimized version of
10//! the following enum (which is roughly equivalent to what's stored by
11//! `repr_unpacked::Repr`, e.g. `super::ErrorData<Box<Custom>>`):
12//!
13//! ```ignore (exposition-only)
14//! enum ErrorData {
15//! Os(i32),
16//! Simple(ErrorKind),
17//! SimpleMessage(&'static SimpleMessage),
18//! Custom(Box<Custom>),
19//! }
20//! ```
21//!
22//! However, it packs this data into a 64bit non-zero value.
23//!
24//! This optimization not only allows `io::Error` to occupy a single pointer,
25//! but improves `io::Result` as well, especially for situations like
26//! `io::Result<()>` (which is now 64 bits) or `io::Result<u64>` (which is now
27//! 128 bits), which are quite common.
28//!
29//! # Layout
30//! Tagged values are 64 bits, with the 2 least significant bits used for the
1f0639a9 31//! tag. This means there are 4 "variants":
5099ac24
FG
32//!
33//! - **Tag 0b00**: The first variant is equivalent to
34//! `ErrorData::SimpleMessage`, and holds a `&'static SimpleMessage` directly.
35//!
36//! `SimpleMessage` has an alignment >= 4 (which is requested with
37//! `#[repr(align)]` and checked statically at the bottom of this file), which
38//! means every `&'static SimpleMessage` should have the both tag bits as 0,
39//! meaning its tagged and untagged representation are equivalent.
40//!
41//! This means we can skip tagging it, which is necessary as this variant can
42//! be constructed from a `const fn`, which probably cannot tag pointers (or
43//! at least it would be difficult).
44//!
45//! - **Tag 0b01**: The other pointer variant holds the data for
46//! `ErrorData::Custom` and the remaining 62 bits are used to store a
47//! `Box<Custom>`. `Custom` also has alignment >= 4, so the bottom two bits
48//! are free to use for the tag.
49//!
50//! The only important thing to note is that `ptr::wrapping_add` and
51//! `ptr::wrapping_sub` are used to tag the pointer, rather than bitwise
52//! operations. This should preserve the pointer's provenance, which would
53//! otherwise be lost.
54//!
55//! - **Tag 0b10**: Holds the data for `ErrorData::Os(i32)`. We store the `i32`
56//! in the pointer's most significant 32 bits, and don't use the bits `2..32`
57//! for anything. Using the top 32 bits is just to let us easily recover the
58//! `i32` code with the correct sign.
59//!
60//! - **Tag 0b11**: Holds the data for `ErrorData::Simple(ErrorKind)`. This
61//! stores the `ErrorKind` in the top 32 bits as well, although it doesn't
62//! occupy nearly that many. Most of the bits are unused here, but it's not
63//! like we need them for anything else yet.
64//!
65//! # Use of `NonNull<()>`
66//!
67//! Everything is stored in a `NonNull<()>`, which is odd, but actually serves a
68//! purpose.
69//!
70//! Conceptually you might think of this more like:
71//!
72//! ```ignore (exposition-only)
73//! union Repr {
74//! // holds integer (Simple/Os) variants, and
75//! // provides access to the tag bits.
c620b35d 76//! bits: NonZero<u64>,
5099ac24
FG
77//! // Tag is 0, so this is stored untagged.
78//! msg: &'static SimpleMessage,
79//! // Tagged (offset) `Box<Custom>` pointer.
80//! tagged_custom: NonNull<()>,
81//! }
82//! ```
83//!
84//! But there are a few problems with this:
85//!
86//! 1. Union access is equivalent to a transmute, so this representation would
87//! require we transmute between integers and pointers in at least one
88//! direction, which may be UB (and even if not, it is likely harder for a
89//! compiler to reason about than explicit ptr->int operations).
90//!
91//! 2. Even if all fields of a union have a niche, the union itself doesn't,
92//! although this may change in the future. This would make things like
93//! `io::Result<()>` and `io::Result<usize>` larger, which defeats part of
94//! the motivation of this bitpacking.
95//!
c620b35d 96//! Storing everything in a `NonZero<usize>` (or some other integer) would be a
5099ac24
FG
97//! bit more traditional for pointer tagging, but it would lose provenance
98//! information, couldn't be constructed from a `const fn`, and would probably
99//! run into other issues as well.
100//!
101//! The `NonNull<()>` seems like the only alternative, even if it's fairly odd
102//! to use a pointer type to store something that may hold an integer, some of
103//! the time.
104
9ffffee4 105use super::{Custom, ErrorData, ErrorKind, RawOsError, SimpleMessage};
5099ac24 106use core::marker::PhantomData;
5e7ed085 107use core::ptr::{self, NonNull};
5099ac24
FG
108
109// The 2 least-significant bits are used as tag.
110const TAG_MASK: usize = 0b11;
111const TAG_SIMPLE_MESSAGE: usize = 0b00;
112const TAG_CUSTOM: usize = 0b01;
113const TAG_OS: usize = 0b10;
114const TAG_SIMPLE: usize = 0b11;
115
116/// The internal representation.
117///
118/// See the module docs for more, this is just a way to hack in a check that we
119/// indeed are not unwind-safe.
120///
121/// ```compile_fail,E0277
122/// fn is_unwind_safe<T: core::panic::UnwindSafe>() {}
123/// is_unwind_safe::<std::io::Error>();
124/// ```
125#[repr(transparent)]
126pub(super) struct Repr(NonNull<()>, PhantomData<ErrorData<Box<Custom>>>);
127
128// All the types `Repr` stores internally are Send + Sync, and so is it.
129unsafe impl Send for Repr {}
130unsafe impl Sync for Repr {}
131
132impl Repr {
064997fb
FG
133 pub(super) fn new(dat: ErrorData<Box<Custom>>) -> Self {
134 match dat {
135 ErrorData::Os(code) => Self::new_os(code),
136 ErrorData::Simple(kind) => Self::new_simple(kind),
137 ErrorData::SimpleMessage(simple_message) => Self::new_simple_message(simple_message),
138 ErrorData::Custom(b) => Self::new_custom(b),
139 }
140 }
141
5099ac24
FG
142 pub(super) fn new_custom(b: Box<Custom>) -> Self {
143 let p = Box::into_raw(b).cast::<u8>();
144 // Should only be possible if an allocator handed out a pointer with
145 // wrong alignment.
5e7ed085 146 debug_assert_eq!(p.addr() & TAG_MASK, 0);
5099ac24
FG
147 // Note: We know `TAG_CUSTOM <= size_of::<Custom>()` (static_assert at
148 // end of file), and both the start and end of the expression must be
149 // valid without address space wraparound due to `Box`'s semantics.
150 //
151 // This means it would be correct to implement this using `ptr::add`
152 // (rather than `ptr::wrapping_add`), but it's unclear this would give
153 // any benefit, so we just use `wrapping_add` instead.
154 let tagged = p.wrapping_add(TAG_CUSTOM).cast::<()>();
155 // Safety: `TAG_CUSTOM + p` is the same as `TAG_CUSTOM | p`,
156 // because `p`'s alignment means it isn't allowed to have any of the
157 // `TAG_BITS` set (you can verify that addition and bitwise-or are the
158 // same when the operands have no bits in common using a truth table).
159 //
160 // Then, `TAG_CUSTOM | p` is not zero, as that would require
161 // `TAG_CUSTOM` and `p` both be zero, and neither is (as `p` came from a
162 // box, and `TAG_CUSTOM` just... isn't zero -- it's `0b01`). Therefore,
163 // `TAG_CUSTOM + p` isn't zero and so `tagged` can't be, and the
164 // `new_unchecked` is safe.
165 let res = Self(unsafe { NonNull::new_unchecked(tagged) }, PhantomData);
166 // quickly smoke-check we encoded the right thing (This generally will
9c376795 167 // only run in std's tests, unless the user uses -Zbuild-std)
5099ac24
FG
168 debug_assert!(matches!(res.data(), ErrorData::Custom(_)), "repr(custom) encoding failed");
169 res
170 }
171
172 #[inline]
9ffffee4 173 pub(super) fn new_os(code: RawOsError) -> Self {
5099ac24
FG
174 let utagged = ((code as usize) << 32) | TAG_OS;
175 // Safety: `TAG_OS` is not zero, so the result of the `|` is not 0.
c620b35d
FG
176 let res = Self(
177 unsafe { NonNull::new_unchecked(ptr::without_provenance_mut(utagged)) },
178 PhantomData,
179 );
5099ac24 180 // quickly smoke-check we encoded the right thing (This generally will
9c376795 181 // only run in std's tests, unless the user uses -Zbuild-std)
5099ac24
FG
182 debug_assert!(
183 matches!(res.data(), ErrorData::Os(c) if c == code),
5e7ed085 184 "repr(os) encoding failed for {code}"
5099ac24
FG
185 );
186 res
187 }
188
189 #[inline]
190 pub(super) fn new_simple(kind: ErrorKind) -> Self {
191 let utagged = ((kind as usize) << 32) | TAG_SIMPLE;
192 // Safety: `TAG_SIMPLE` is not zero, so the result of the `|` is not 0.
c620b35d
FG
193 let res = Self(
194 unsafe { NonNull::new_unchecked(ptr::without_provenance_mut(utagged)) },
195 PhantomData,
196 );
5099ac24 197 // quickly smoke-check we encoded the right thing (This generally will
9c376795 198 // only run in std's tests, unless the user uses -Zbuild-std)
5099ac24
FG
199 debug_assert!(
200 matches!(res.data(), ErrorData::Simple(k) if k == kind),
201 "repr(simple) encoding failed {:?}",
202 kind,
203 );
204 res
205 }
206
207 #[inline]
208 pub(super) const fn new_simple_message(m: &'static SimpleMessage) -> Self {
209 // Safety: References are never null.
210 Self(unsafe { NonNull::new_unchecked(m as *const _ as *mut ()) }, PhantomData)
211 }
212
213 #[inline]
214 pub(super) fn data(&self) -> ErrorData<&Custom> {
215 // Safety: We're a Repr, decode_repr is fine.
216 unsafe { decode_repr(self.0, |c| &*c) }
217 }
218
219 #[inline]
220 pub(super) fn data_mut(&mut self) -> ErrorData<&mut Custom> {
221 // Safety: We're a Repr, decode_repr is fine.
222 unsafe { decode_repr(self.0, |c| &mut *c) }
223 }
224
225 #[inline]
226 pub(super) fn into_data(self) -> ErrorData<Box<Custom>> {
227 let this = core::mem::ManuallyDrop::new(self);
228 // Safety: We're a Repr, decode_repr is fine. The `Box::from_raw` is
229 // safe because we prevent double-drop using `ManuallyDrop`.
230 unsafe { decode_repr(this.0, |p| Box::from_raw(p)) }
231 }
232}
233
234impl Drop for Repr {
235 #[inline]
236 fn drop(&mut self) {
237 // Safety: We're a Repr, decode_repr is fine. The `Box::from_raw` is
238 // safe because we're being dropped.
239 unsafe {
240 let _ = decode_repr(self.0, |p| Box::<Custom>::from_raw(p));
241 }
242 }
243}
244
245// Shared helper to decode a `Repr`'s internal pointer into an ErrorData.
246//
247// Safety: `ptr`'s bits should be encoded as described in the document at the
248// top (it should `some_repr.0`)
249#[inline]
250unsafe fn decode_repr<C, F>(ptr: NonNull<()>, make_custom: F) -> ErrorData<C>
251where
252 F: FnOnce(*mut Custom) -> C,
253{
5e7ed085 254 let bits = ptr.as_ptr().addr();
5099ac24
FG
255 match bits & TAG_MASK {
256 TAG_OS => {
9ffffee4 257 let code = ((bits as i64) >> 32) as RawOsError;
5099ac24
FG
258 ErrorData::Os(code)
259 }
260 TAG_SIMPLE => {
261 let kind_bits = (bits >> 32) as u32;
262 let kind = kind_from_prim(kind_bits).unwrap_or_else(|| {
263 debug_assert!(false, "Invalid io::error::Repr bits: `Repr({:#018x})`", bits);
264 // This means the `ptr` passed in was not valid, which violates
265 // the unsafe contract of `decode_repr`.
266 //
267 // Using this rather than unwrap meaningfully improves the code
268 // for callers which only care about one variant (usually
269 // `Custom`)
1f0639a9 270 unsafe { core::hint::unreachable_unchecked() };
5099ac24
FG
271 });
272 ErrorData::Simple(kind)
273 }
1f0639a9
FG
274 TAG_SIMPLE_MESSAGE => {
275 // SAFETY: per tag
276 unsafe { ErrorData::SimpleMessage(&*ptr.cast::<SimpleMessage>().as_ptr()) }
277 }
5099ac24 278 TAG_CUSTOM => {
f2b60f7d 279 // It would be correct for us to use `ptr::byte_sub` here (see the
5099ac24
FG
280 // comment above the `wrapping_add` call in `new_custom` for why),
281 // but it isn't clear that it makes a difference, so we don't.
f2b60f7d 282 let custom = ptr.as_ptr().wrapping_byte_sub(TAG_CUSTOM).cast::<Custom>();
5099ac24
FG
283 ErrorData::Custom(make_custom(custom))
284 }
285 _ => {
286 // Can't happen, and compiler can tell
287 unreachable!();
288 }
289 }
290}
291
292// This compiles to the same code as the check+transmute, but doesn't require
293// unsafe, or to hard-code max ErrorKind or its size in a way the compiler
294// couldn't verify.
295#[inline]
296fn kind_from_prim(ek: u32) -> Option<ErrorKind> {
297 macro_rules! from_prim {
298 ($prim:expr => $Enum:ident { $($Variant:ident),* $(,)? }) => {{
299 // Force a compile error if the list gets out of date.
300 const _: fn(e: $Enum) = |e: $Enum| match e {
301 $($Enum::$Variant => ()),*
302 };
303 match $prim {
304 $(v if v == ($Enum::$Variant as _) => Some($Enum::$Variant),)*
305 _ => None,
306 }
307 }}
308 }
309 from_prim!(ek => ErrorKind {
310 NotFound,
311 PermissionDenied,
312 ConnectionRefused,
313 ConnectionReset,
314 HostUnreachable,
315 NetworkUnreachable,
316 ConnectionAborted,
317 NotConnected,
318 AddrInUse,
319 AddrNotAvailable,
320 NetworkDown,
321 BrokenPipe,
322 AlreadyExists,
323 WouldBlock,
324 NotADirectory,
325 IsADirectory,
326 DirectoryNotEmpty,
327 ReadOnlyFilesystem,
328 FilesystemLoop,
329 StaleNetworkFileHandle,
330 InvalidInput,
331 InvalidData,
332 TimedOut,
333 WriteZero,
334 StorageFull,
335 NotSeekable,
336 FilesystemQuotaExceeded,
337 FileTooLarge,
338 ResourceBusy,
339 ExecutableFileBusy,
340 Deadlock,
341 CrossesDevices,
342 TooManyLinks,
343 InvalidFilename,
344 ArgumentListTooLong,
345 Interrupted,
346 Other,
347 UnexpectedEof,
348 Unsupported,
349 OutOfMemory,
350 Uncategorized,
351 })
352}
353
354// Some static checking to alert us if a change breaks any of the assumptions
355// that our encoding relies on for correctness and soundness. (Some of these are
356// a bit overly thorough/cautious, admittedly)
357//
9c376795 358// If any of these are hit on a platform that std supports, we should likely
5099ac24
FG
359// just use `repr_unpacked.rs` there instead (unless the fix is easy).
360macro_rules! static_assert {
361 ($condition:expr) => {
362 const _: () = assert!($condition);
363 };
364 (@usize_eq: $lhs:expr, $rhs:expr) => {
365 const _: [(); $lhs] = [(); $rhs];
366 };
367}
368
369// The bitpacking we use requires pointers be exactly 64 bits.
370static_assert!(@usize_eq: size_of::<NonNull<()>>(), 8);
371
372// We also require pointers and usize be the same size.
373static_assert!(@usize_eq: size_of::<NonNull<()>>(), size_of::<usize>());
374
375// `Custom` and `SimpleMessage` need to be thin pointers.
376static_assert!(@usize_eq: size_of::<&'static SimpleMessage>(), 8);
377static_assert!(@usize_eq: size_of::<Box<Custom>>(), 8);
378
379static_assert!((TAG_MASK + 1).is_power_of_two());
380// And they must have sufficient alignment.
381static_assert!(align_of::<SimpleMessage>() >= TAG_MASK + 1);
382static_assert!(align_of::<Custom>() >= TAG_MASK + 1);
383
9c376795
FG
384static_assert!(@usize_eq: TAG_MASK & TAG_SIMPLE_MESSAGE, TAG_SIMPLE_MESSAGE);
385static_assert!(@usize_eq: TAG_MASK & TAG_CUSTOM, TAG_CUSTOM);
386static_assert!(@usize_eq: TAG_MASK & TAG_OS, TAG_OS);
387static_assert!(@usize_eq: TAG_MASK & TAG_SIMPLE, TAG_SIMPLE);
5099ac24
FG
388
389// This is obviously true (`TAG_CUSTOM` is `0b01`), but in `Repr::new_custom` we
390// offset a pointer by this value, and expect it to both be within the same
391// object, and to not wrap around the address space. See the comment in that
392// function for further details.
393//
394// Actually, at the moment we use `ptr::wrapping_add`, not `ptr::add`, so this
395// check isn't needed for that one, although the assertion that we don't
396// actually wrap around in that wrapping_add does simplify the safety reasoning
397// elsewhere considerably.
398static_assert!(size_of::<Custom>() >= TAG_CUSTOM);
399
400// These two store a payload which is allowed to be zero, so they must be
401// non-zero to preserve the `NonNull`'s range invariant.
402static_assert!(TAG_OS != 0);
403static_assert!(TAG_SIMPLE != 0);
404// We can't tag `SimpleMessage`s, the tag must be 0.
405static_assert!(@usize_eq: TAG_SIMPLE_MESSAGE, 0);
406
407// Check that the point of all of this still holds.
408//
409// We'd check against `io::Error`, but *technically* it's allowed to vary,
410// as it's not `#[repr(transparent)]`/`#[repr(C)]`. We could add that, but
411// the `#[repr()]` would show up in rustdoc, which might be seen as a stable
412// commitment.
413static_assert!(@usize_eq: size_of::<Repr>(), 8);
414static_assert!(@usize_eq: size_of::<Option<Repr>>(), 8);
415static_assert!(@usize_eq: size_of::<Result<(), Repr>>(), 8);
416static_assert!(@usize_eq: size_of::<Result<usize, Repr>>(), 16);