New upstream version 1.44.1+dfsg1
# Type Layout

The layout of a type is its size, alignment, and the relative offsets of its
fields. For enums, how the discriminant is laid out and interpreted is also part
of type layout.

Type layout can be changed with each compilation. Instead of trying to document
exactly what is done, we only document what is guaranteed today.

## Size and Alignment

All values have an alignment and size.

The *alignment* of a value specifies what addresses are valid to store the value
at. A value of alignment `n` must only be stored at an address that is a
multiple of `n`. For example, a value with an alignment of 2 must be stored at an
even address, while a value with an alignment of 1 can be stored at any address.
Alignment is measured in bytes, must be at least 1, and is always a power of 2.
The alignment of a value can be checked with the [`align_of_val`] function.

The *size* of a value is the offset in bytes between successive elements in an
array with that item type, including alignment padding. The size of a value is
always a multiple of its alignment. The size of a value can be checked with the
[`size_of_val`] function.

Types where all values have the same size and alignment known at compile time
implement the [`Sized`] trait and can be checked with the [`size_of`] and
[`align_of`] functions. Types that are not [`Sized`] are known as [dynamically
sized types]. Since all values of a `Sized` type share the same size and
alignment, we refer to those shared values as the size of the type and the
alignment of the type respectively.
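
As a minimal sketch of the four functions named above, the following helpers
query the layout of a sized type and of dynamically sized values. The names
`layout_of`, `layout_of_val`, and `check_size_and_alignment` are illustrative
and not part of the standard library.

```rust
use std::mem::{align_of, align_of_val, size_of, size_of_val};

/// Returns `(size, alignment)` of a `Sized` type, known at compile time.
fn layout_of<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}

/// Returns `(size, alignment)` of a value, which may be dynamically sized.
fn layout_of_val<T: ?Sized>(value: &T) -> (usize, usize) {
    (size_of_val(value), align_of_val(value))
}

fn check_size_and_alignment() {
    // Alignment is a power of two, and size is a multiple of alignment.
    let (size, align) = layout_of::<(u8, u32)>();
    assert!(align.is_power_of_two());
    assert_eq!(size % align, 0);

    // `[u16]` is dynamically sized, so its size depends on the value.
    let short: &[u16] = &[1, 2];
    let long: &[u16] = &[1, 2, 3, 4, 5];
    assert_eq!(layout_of_val(short).0, 2 * size_of::<u16>());
    assert_eq!(layout_of_val(long).0, 5 * size_of::<u16>());
    // The alignment is the same for both values.
    assert_eq!(layout_of_val(short).1, layout_of_val(long).1);
}
```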

## Primitive Data Layout

The size of most primitives is given in this table.

| Type              | `size_of::<Type>()` |
|--                 |--                   |
| `bool`            | 1                   |
| `u8` / `i8`       | 1                   |
| `u16` / `i16`     | 2                   |
| `u32` / `i32`     | 4                   |
| `u64` / `i64`     | 8                   |
| `u128` / `i128`   | 16                  |
| `f32`             | 4                   |
| `f64`             | 8                   |
| `char`            | 4                   |

`usize` and `isize` have a size big enough to contain every address on the
target platform. For example, on a 32-bit target, this is 4 bytes and on a
64-bit target, this is 8 bytes.

Most primitives are generally aligned to their size, although this is
platform-specific behavior. In particular, on x86 `u64` and `f64` are only
aligned to 32 bits.
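
The table can be spot-checked with a short sketch. Because alignment is
platform-specific, the example below only asserts alignment properties that
follow from the rules stated earlier (size is a multiple of alignment). The
function name `check_primitive_layout` is illustrative.

```rust
use std::mem::{align_of, size_of};

fn check_primitive_layout() {
    // A few sizes taken from the table.
    assert_eq!(size_of::<bool>(), 1);
    assert_eq!(size_of::<u16>(), 2);
    assert_eq!(size_of::<i128>(), 16);
    assert_eq!(size_of::<f64>(), 8);
    assert_eq!(size_of::<char>(), 4);

    // `usize` and `isize` share one size, which matches a thin pointer.
    assert_eq!(size_of::<isize>(), size_of::<usize>());
    assert_eq!(size_of::<usize>(), size_of::<*const u8>());

    // The alignment of `u64` varies by target (4 or 8 are both seen),
    // but it always divides the size.
    assert_eq!(size_of::<u64>() % align_of::<u64>(), 0);
    assert_eq!(size_of::<f64>() % align_of::<f64>(), 0);
}
```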

## Pointers and References Layout

Pointers and references have the same layout. Mutability of the pointer or
reference does not change the layout.

Pointers to sized types have the same size and alignment as `usize`.

Pointers to unsized types are sized. Their size and alignment are guaranteed to
be at least equal to the size and alignment of a pointer.

> Note: Though you should not rely on this, all pointers to
> <abbr title="Dynamically Sized Types">DSTs</abbr> are currently twice the
> size of `usize` and have the same alignment.
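
A hedged sketch of these guarantees follows. The assertions cover only what is
promised; the helper `slice_pointer_words` (an illustrative name) reports the
current, non-guaranteed ratio for a slice reference instead of asserting it.

```rust
use std::mem::{align_of, size_of};

fn check_pointer_layout() {
    // Pointers and references to a sized type match `usize`,
    // regardless of mutability.
    assert_eq!(size_of::<&u32>(), size_of::<usize>());
    assert_eq!(size_of::<&mut u32>(), size_of::<usize>());
    assert_eq!(size_of::<*const u32>(), size_of::<usize>());
    assert_eq!(align_of::<*mut u32>(), align_of::<usize>());

    // Pointers to unsized types are at least as large as a thin pointer.
    assert!(size_of::<&[u32]>() >= size_of::<*const u32>());
    assert!(size_of::<&str>() >= size_of::<*const u8>());
    assert!(size_of::<Box<dyn std::fmt::Debug>>() >= size_of::<*const u8>());
}

/// Size of `&[u32]` measured in `usize` words. Not a guaranteed value.
fn slice_pointer_words() -> usize {
    size_of::<&[u32]>() / size_of::<usize>()
}
```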

## Array Layout

Arrays are laid out so that the zero-based `n`th element of the array is offset
from the start of the array by `n * size_of::<T>()` bytes. An array of type
`[T; N]` has a size of `size_of::<T>() * N` and the same alignment as `T`.
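
The offset rule can be observed with plain address arithmetic. This is a small
sketch for a fixed length of 4; `element_offset` and `check_array_layout` are
illustrative names.

```rust
use std::mem::{align_of, size_of};

/// Byte offset of element `index` from the start of `array`.
fn element_offset<T>(array: &[T; 4], index: usize) -> usize {
    let start = array.as_ptr() as usize;
    let element = &array[index] as *const T as usize;
    element - start
}

fn check_array_layout() {
    let values: [u32; 4] = [10, 20, 30, 40];
    assert_eq!(size_of::<[u32; 4]>(), 4 * size_of::<u32>());
    assert_eq!(align_of::<[u32; 4]>(), align_of::<u32>());
    for index in 0..4 {
        // Element `index` sits `index * size_of::<u32>()` bytes in.
        assert_eq!(element_offset(&values, index), index * size_of::<u32>());
    }
}
```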

## Slice Layout

Slices have the same layout as the section of the array they slice.

> Note: This is about the raw `[T]` type, not pointers (`&[T]`, `Box<[T]>`,
> etc.) to slices.

## `str` Layout

String slices are a UTF-8 representation of characters that have the same
layout as slices of type `[u8]`.
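
To illustrate, a `str` and its byte view report the same size, alignment, and
starting address. The function name `check_str_layout` is an assumption made
for this sketch.

```rust
use std::mem::{align_of_val, size_of_val};

fn check_str_layout(text: &str) {
    let bytes: &[u8] = text.as_bytes();
    // A `str` occupies exactly as many bytes as its UTF-8 encoding.
    assert_eq!(size_of_val(text), bytes.len());
    assert_eq!(size_of_val(text), size_of_val(bytes));
    assert_eq!(align_of_val(text), align_of_val(bytes));
    // Both views start at the same address.
    assert_eq!(text.as_ptr(), bytes.as_ptr());
}
```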

## Tuple Layout

Tuples do not have any guarantees about their layout.

The exception to this is the unit tuple (`()`), which is guaranteed as a
zero-sized type to have a size of 0 and an alignment of 1.
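
The unit guarantee is easy to confirm. In this sketch, `unit_layout` and
`unit_array_size` are illustrative helper names.

```rust
use std::mem::{align_of, size_of};

/// Returns `(size, alignment)` of the unit tuple.
fn unit_layout() -> (usize, usize) {
    (size_of::<()>(), align_of::<()>())
}

/// An array of zero-sized elements is itself zero-sized.
fn unit_array_size() -> usize {
    size_of::<[(); 100]>()
}
```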

## Trait Object Layout

Trait objects have the same layout as the value the trait object is of.

> Note: This is about the raw trait object types, not pointers (`&dyn Trait`,
> `Box<dyn Trait>`, etc.) to trait objects.
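
As a sketch, `size_of_val` and `align_of_val` on a `dyn Debug` report the
layout of the concrete value behind it. `Debug` is used only because it is a
convenient object-safe trait; `check_trait_object_layout` is an illustrative
name.

```rust
use std::fmt::Debug;
use std::mem::{align_of, align_of_val, size_of, size_of_val};

fn check_trait_object_layout() {
    let small: &dyn Debug = &7u8;
    let large: &dyn Debug = &[1u64, 2, 3];

    // The trait object has the layout of the underlying `u8`.
    assert_eq!(size_of_val(small), size_of::<u8>());
    assert_eq!(align_of_val(small), align_of::<u8>());

    // The same trait object type, backed by a `[u64; 3]`.
    assert_eq!(size_of_val(large), size_of::<[u64; 3]>());
    assert_eq!(align_of_val(large), align_of::<[u64; 3]>());
}
```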

## Closure Layout

Closures have no layout guarantees.

## Representations

All user-defined composite types (`struct`s, `enum`s, and `union`s) have a
*representation* that specifies what the layout is for the type. The possible
representations for a type are:

- [Default]
- [`C`]
- The [primitive representations]
- [`transparent`]

The representation of a type can be changed by applying the `repr` attribute
to it. The following example shows a struct with a `C` representation.

```rust
#[repr(C)]
struct ThreeInts {
    first: i16,
    second: i8,
    third: i32
}
```

The alignment may be raised or lowered with the `align` and `packed` modifiers
respectively. They alter the representation specified in the attribute.
If no representation is specified, the default one is altered.

```rust
// Default representation, alignment lowered to 2.
#[repr(packed(2))]
struct PackedStruct {
    first: i16,
    second: i8,
    third: i32
}

// C representation, alignment raised to 8.
#[repr(C, align(8))]
struct AlignedStruct {
    first: i16,
    second: i8,
    third: i32
}
```

> Note: As a consequence of the representation being an attribute on the item,
> the representation does not depend on generic parameters. Any two types with
> the same name have the same representation. For example, `Foo<Bar>` and
> `Foo<Baz>` both have the same representation.

The representation of a type can change the padding between fields, but does
not change the layout of the fields themselves. For example, a struct with a
`C` representation that contains a struct `Inner` with the default
representation will not change the layout of `Inner`.

### The Default Representation

Nominal types without a `repr` attribute have the default representation.
Informally, this representation is also called the `rust` representation.

There are no guarantees of data layout made by this representation.

### The `C` Representation

The `C` representation is designed for dual purposes. One purpose is for
creating types that are interoperable with the C language. The second purpose is
to create types that you can soundly perform operations on that rely on data
layout, such as reinterpreting values as a different type.

Because of this dual purpose, it is possible to create types that are not useful
for interfacing with the C programming language.

This representation can be applied to structs, unions, and enums.

#### \#[repr(C)] Structs

The alignment of the struct is the alignment of the most-aligned field in it.

The size of the struct and the offsets of its fields are determined by the
following algorithm.

Start with a current offset of 0 bytes.

For each field in declaration order in the struct, first determine the size and
alignment of the field. If the current offset is not a multiple of the field's
alignment, then add padding bytes to the current offset until it is a multiple
of the field's alignment. The offset for the field is what the current offset
is now. Then increase the current offset by the size of the field.

Finally, the size of the struct is the current offset rounded up to the nearest
multiple of the struct's alignment.

Here is this algorithm described in pseudocode.

<!-- ignore: pseudocode -->
```rust,ignore
/// Returns the amount of padding needed after `offset` to ensure that the
/// following address will be aligned to `alignment`.
fn padding_needed_for(offset: usize, alignment: usize) -> usize {
    let misalignment = offset % alignment;
    if misalignment > 0 {
        // round up to next multiple of `alignment`
        alignment - misalignment
    } else {
        // already a multiple of `alignment`
        0
    }
}

struct.alignment = struct.fields().map(|field| field.alignment).max();

let mut current_offset = 0;

for field in struct.fields_in_declaration_order() {
    // Increase the current offset so that it's a multiple of the alignment
    // of this field. For the first field, this will always be zero.
    // The skipped bytes are called padding bytes.
    current_offset += padding_needed_for(current_offset, field.alignment);

    struct[field].offset = current_offset;

    current_offset += field.size;
}

struct.size = current_offset + padding_needed_for(current_offset, struct.alignment);
```

<div class="warning">

Warning: This pseudocode uses a naive algorithm that ignores overflow issues for
the sake of clarity. To perform memory layout computations in actual code, use
[`Layout`].

</div>
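
A runnable counterpart to the pseudocode can be sketched with [`Layout`], whose
`extend` and `pad_to_align` methods perform the same steps with overflow
checks. The helper names `repr_c` and `three_ints_layout` are illustrative, and
the sketch assumes a toolchain where `std::alloc::LayoutError` is available
under that name.

```rust
use std::alloc::{Layout, LayoutError};

#[allow(dead_code)]
#[repr(C)]
struct ThreeInts {
    first: i16,
    second: i8,
    third: i32,
}

/// Computes the `repr(C)` layout of a struct from its field layouts.
/// Returns the struct layout and each field's offset, in declaration order.
fn repr_c(fields: &[Layout]) -> Result<(Layout, Vec<usize>), LayoutError> {
    let mut offsets = Vec::with_capacity(fields.len());
    // Start from an empty struct: size 0, alignment 1.
    let mut layout = Layout::from_size_align(0, 1)?;
    for &field in fields {
        // `extend` adds the padding needed to align `field`, then the field.
        let (extended, offset) = layout.extend(field)?;
        layout = extended;
        offsets.push(offset);
    }
    // `extend` leaves out trailing padding, so round the size up here.
    Ok((layout.pad_to_align(), offsets))
}

/// Field layouts of `ThreeInts`, fed through `repr_c`.
fn three_ints_layout() -> (Layout, Vec<usize>) {
    let fields = [
        Layout::new::<i16>(),
        Layout::new::<i8>(),
        Layout::new::<i32>(),
    ];
    repr_c(&fields).expect("three small fields cannot overflow")
}
```

On a target where `i32` is 4-byte aligned, this yields offsets 0, 2, and 4 with
a total size of 8, matching `size_of::<ThreeInts>()`.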

> Note: This algorithm can produce zero-sized structs. In C, an empty struct
> declaration like `struct Foo { }` is illegal. However, both gcc and clang
> support options to enable such structs, and assign them size zero. C++, in
> contrast, gives empty structs a size of 1, unless they are inherited from or
> they are fields that have the `[[no_unique_address]]` attribute, in which
> case they do not increase the overall size of the struct.

#### \#[repr(C)] Unions

A union declared with `#[repr(C)]` will have the same size and alignment as an
equivalent C union declaration in the C language for the target platform.
The union will have a size of the maximum size of all of its fields rounded up
to its alignment, and an alignment of the maximum alignment of all of its
fields. These maximums may come from different fields.

```rust
#[repr(C)]
union Union {
    f1: u16,
    f2: [u8; 4],
}

assert_eq!(std::mem::size_of::<Union>(), 4);  // From f2
assert_eq!(std::mem::align_of::<Union>(), 2); // From f1

#[repr(C)]
union SizeRoundedUp {
    a: u32,
    b: [u16; 3],
}

assert_eq!(std::mem::size_of::<SizeRoundedUp>(), 8);  // Size of 6 from b,
                                                      // rounded up to 8 from
                                                      // alignment of a.
assert_eq!(std::mem::align_of::<SizeRoundedUp>(), 4); // From a
```

#### \#[repr(C)] Enums

For [C-like enumerations], the `C` representation has the size and alignment of
the default `enum` size and alignment for the target platform's C ABI.

> Note: The enum representation in C is implementation defined, so this is
> really a "best guess". In particular, this may be incorrect when the C code
> of interest is compiled with certain flags.

<div class="warning">

Warning: There are crucial differences between an `enum` in the C language and
Rust's C-like enumerations with this representation. An `enum` in C is
mostly a `typedef` plus some named constants; in other words, an object of an
`enum` type can hold any integer value. For example, this is often used for
bitflags in C. In contrast, Rust's C-like enumerations can only legally hold
the discriminant values; everything else is undefined behavior. Therefore,
using a C-like enumeration in FFI to model a C `enum` is often wrong.

</div>
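
One defensive pattern, sketched here under the assumption that the foreign
side hands the value over as a plain `c_int`, is to validate the integer before
producing the Rust enum instead of transmuting it. The enum `Color` and the
method `from_raw` are hypothetical names used only for illustration.

```rust
use std::os::raw::c_int;

#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(C)]
enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

impl Color {
    /// Converts a raw value received from C, rejecting anything that is
    /// not a declared discriminant.
    fn from_raw(raw: c_int) -> Option<Color> {
        match raw {
            0 => Some(Color::Red),
            1 => Some(Color::Green),
            2 => Some(Color::Blue),
            // Any other value would be undefined behavior as a `Color`.
            _ => None,
        }
    }
}
```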

It is an error for [zero-variant enumerations] to have the `C` representation.

For all other enumerations, the layout is unspecified.

Likewise, when the `C` representation is combined with a primitive
representation, the layout is unspecified.

### Primitive representations

The *primitive representations* are the representations with the same names as
the primitive integer types. That is: `u8`, `u16`, `u32`, `u64`, `u128`,
`usize`, `i8`, `i16`, `i32`, `i64`, `i128`, and `isize`.

Primitive representations can only be applied to enumerations.

For [C-like enumerations], they set the size and alignment to be the same as the
primitive type of the same name. For example, a C-like enumeration with a `u8`
representation can only have discriminants between 0 and 255 inclusive.
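
A short sketch with a hypothetical `Opcode` enum shows the effect of a `u8`
representation on size, alignment, and discriminant casts.

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
#[derive(Clone, Copy)]
#[repr(u8)]
enum Opcode {
    Nop = 0,
    Load = 1,
    Halt = 255, // the largest discriminant a `u8` representation allows
}

fn check_primitive_repr() {
    // Size and alignment match the primitive type of the same name.
    assert_eq!(size_of::<Opcode>(), size_of::<u8>());
    assert_eq!(align_of::<Opcode>(), align_of::<u8>());
    // Casting yields the discriminant.
    assert_eq!(Opcode::Halt as u8, 255);
}
```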

It is an error for [zero-variant enumerations] to have a primitive
representation.

For all other enumerations, the layout is unspecified.

Likewise, combining two primitive representations together is unspecified.

### The alignment modifiers

The `align` and `packed` modifiers can be used to respectively raise or lower
the alignment of `struct`s and `union`s. `packed` may also alter the padding
between fields.

The alignment is specified as an integer parameter in the form of
`#[repr(align(x))]` or `#[repr(packed(x))]`. The alignment value must be a
power of two from 1 up to 2<sup>29</sup>. For `packed`, if no value is given,
as in `#[repr(packed)]`, then the value is 1.

For `align`, if the specified alignment is less than the alignment of the type
without the `align` modifier, then the alignment is unaffected.

For `packed`, if the specified alignment is greater than the type's alignment
without the `packed` modifier, then the alignment and layout are unaffected.
The alignment of each field, for the purpose of positioning fields, is the
smaller of the specified alignment and the alignment of the field's type.
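
These rules can be sketched with one pair of fields under several modifiers.
All type names are hypothetical. Where a value depends on the platform
alignment of `u32`, the assertion is phrased relative to the unmodified
`Plain` struct. On a target where `u32` is 4-byte aligned, the sizes are 8, 5,
6, and 16 for `Plain`, `Packed1`, `Packed2`, and `Aligned16` respectively.

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
#[repr(C)]
struct Plain { tag: u8, value: u32 }

#[allow(dead_code)]
#[repr(C, packed)] // same as packed(1)
struct Packed1 { tag: u8, value: u32 }

#[allow(dead_code)]
#[repr(C, packed(2))]
struct Packed2 { tag: u8, value: u32 }

#[allow(dead_code)]
#[repr(C, align(16))]
struct Aligned16 { tag: u8, value: u32 }

#[allow(dead_code)]
#[repr(C, align(1))] // lower than the natural alignment: no effect
struct Align1 { tag: u8, value: u32 }

fn check_alignment_modifiers() {
    // packed(1): no padding at all, alignment 1.
    assert_eq!(size_of::<Packed1>(), size_of::<u8>() + size_of::<u32>());
    assert_eq!(align_of::<Packed1>(), 1);

    // packed(2): each field is aligned to at most 2.
    assert_eq!(align_of::<Packed2>(), align_of::<Plain>().min(2));
    assert!(size_of::<Packed2>() <= size_of::<Plain>());

    // align(16): alignment raised, size rounded up to a multiple of it.
    assert_eq!(align_of::<Aligned16>(), 16);
    assert_eq!(size_of::<Aligned16>(), 16);

    // align(1): the request is below the natural alignment, so nothing changes.
    assert_eq!(align_of::<Align1>(), align_of::<Plain>());
    assert_eq!(size_of::<Align1>(), size_of::<Plain>());
}
```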

The `align` and `packed` modifiers cannot be applied on the same type and a
`packed` type cannot transitively contain another `align`ed type. `align` and
`packed` may only be applied to the [default] and [`C`] representations.

The `align` modifier can also be applied on an `enum`.
When it is, the effect on the `enum`'s alignment is the same as if the `enum`
was wrapped in a newtype `struct` with the same `align` modifier.
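
The newtype equivalence can be sketched as follows. `Signal`, `Inner`, and
`Wrapper` are hypothetical types; `Inner` is the same enum without the
modifier.

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
#[repr(align(8))]
enum Signal { Off, On }

#[allow(dead_code)]
enum Inner { Off, On }

#[allow(dead_code)]
#[repr(align(8))]
struct Wrapper(Inner);

fn check_enum_align() {
    // The aligned enum and the aligned newtype agree on alignment.
    assert_eq!(align_of::<Signal>(), 8);
    assert_eq!(align_of::<Signal>(), align_of::<Wrapper>());
    // Size stays a multiple of the raised alignment.
    assert_eq!(size_of::<Signal>() % 8, 0);
}
```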

<div class="warning">

***Warning:*** Dereferencing an unaligned pointer is [undefined behavior] and
it is possible to [safely create unaligned pointers to `packed` fields][27060].
Like all ways to create undefined behavior in safe Rust, this is a bug.

</div>
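
Two ways to read a `packed` field without creating an unaligned reference are
sketched below. The sketch assumes a toolchain that provides
`std::ptr::addr_of!`, which is newer than the compiler version this text
describes; `Header` and both functions are hypothetical names.

```rust
use std::ptr;

#[allow(dead_code)]
#[repr(C, packed)]
struct Header {
    kind: u8,
    length: u32, // stored at offset 1, so possibly unaligned
}

/// Copies the field out by value. No reference to the field is created.
fn read_length(header: &Header) -> u32 {
    header.length
}

/// Reads the field through a raw pointer.
fn read_length_raw(header: &Header) -> u32 {
    // `addr_of!` produces a raw pointer without an intermediate reference.
    let field: *const u32 = ptr::addr_of!(header.length);
    // SAFETY: `field` points into a live `Header`, and `read_unaligned`
    // places no alignment requirement on the pointer.
    unsafe { field.read_unaligned() }
}
```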

### The `transparent` Representation

The `transparent` representation can only be used on a [`struct`][structs]
or an [`enum`][enumerations] with a single variant that has:

- a single field with non-zero size, and
- any number of fields with size 0 and alignment 1 (e.g. [`PhantomData<T>`]).

Structs and enums with this representation have the same layout and ABI
as the single non-zero sized field.

This is different from the `C` representation because
a struct with the `C` representation will always have the ABI of a `C` `struct`
while, for example, a struct with the `transparent` representation with a
primitive field will have the ABI of the primitive field.

Because this representation delegates type layout to another type, it cannot be
used with any other representation.
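
As an illustration, the hypothetical `Quantity` wrapper below carries one
`f64` plus a zero-sized `PhantomData` marker and reports the same size and
alignment as `f64`. `Meters`, `wrap`, and `check_transparent_layout` are also
illustrative names.

```rust
use std::marker::PhantomData;
use std::mem::{align_of, size_of};

/// Marker type used only as a generic parameter.
#[allow(dead_code)]
struct Meters;

#[allow(dead_code)]
#[repr(transparent)]
struct Quantity<Unit> {
    value: f64,              // the single field with non-zero size
    unit: PhantomData<Unit>, // size 0, alignment 1
}

fn wrap(value: f64) -> Quantity<Meters> {
    Quantity { value, unit: PhantomData }
}

fn check_transparent_layout() {
    assert_eq!(size_of::<Quantity<Meters>>(), size_of::<f64>());
    assert_eq!(align_of::<Quantity<Meters>>(), align_of::<f64>());
}
```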

[`align_of_val`]: ../std/mem/fn.align_of_val.html
[`size_of_val`]: ../std/mem/fn.size_of_val.html
[`align_of`]: ../std/mem/fn.align_of.html
[`size_of`]: ../std/mem/fn.size_of.html
[`Sized`]: ../std/marker/trait.Sized.html
[dynamically sized types]: dynamically-sized-types.md
[C-like enumerations]: items/enumerations.md#custom-discriminant-values-for-fieldless-enumerations
[enumerations]: items/enumerations.md
[zero-variant enumerations]: items/enumerations.md#zero-variant-enums
[undefined behavior]: behavior-considered-undefined.md
[27060]: https://github.com/rust-lang/rust/issues/27060
[`PhantomData<T>`]: special-types-and-traits.md#phantomdatat
[Default]: #the-default-representation
[`C`]: #the-c-representation
[primitive representations]: #primitive-representations
[structs]: items/structs.md
[`transparent`]: #the-transparent-representation
[`Layout`]: ../std/alloc/struct.Layout.html