# repr(Rust)

First and foremost, all types have an alignment specified in bytes. The
alignment of a type specifies what addresses are valid to store the value at. A
value with alignment `n` must only be stored at an address that is a multiple of
`n`. So a value with alignment 2 must be stored at an even address, and a value
with alignment 1 can be stored anywhere. Alignment is at least 1, and always a
power of 2.

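The rule above can be checked directly. The following is a minimal sketch using `std::mem::align_of`; the helper `is_aligned_for` is a hypothetical name introduced here for illustration and is not part of the standard library.

```rust
use std::mem::align_of;

/// Hypothetical helper: returns true if the address of `ptr` is a
/// multiple of `align_of::<T>()`, i.e. a valid place to store a `T`.
fn is_aligned_for<T>(ptr: *const T) -> bool {
    (ptr as usize) % align_of::<T>() == 0
}

fn main() {
    let x: u32 = 7;
    let y: u8 = 1;

    // References always point at properly aligned values.
    assert!(is_aligned_for(&x as *const u32));
    assert!(is_aligned_for(&y as *const u8));

    // Alignment is at least 1 and always a power of 2.
    assert!(align_of::<u8>() >= 1);
    assert!(align_of::<u32>().is_power_of_two());

    println!("align_of::<u32>() = {}", align_of::<u32>());
}
```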
Primitives are usually aligned to their size, although this is
platform-specific behavior. For example, on x86 `u64` and `f64` are often
aligned to 4 bytes (32 bits).

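Since the alignment of primitives depends on the target, the simplest way to find out what a given platform does is to ask the compiler. This sketch prints the values for the current target; `size_and_align` is a hypothetical helper, and no particular output is implied because the numbers can differ between platforms.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: (size, alignment) of `T` in bytes on this target.
fn size_and_align<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}

fn main() {
    println!("u8:  {:?}", size_and_align::<u8>());
    println!("u16: {:?}", size_and_align::<u16>());
    println!("u32: {:?}", size_and_align::<u32>());
    println!("u64: {:?}", size_and_align::<u64>());
    println!("f64: {:?}", size_and_align::<f64>());

    // Whatever the target chooses, the size stays a multiple of the alignment.
    let (size, align) = size_and_align::<u64>();
    assert_eq!(size % align, 0);
}
```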
A type's size must always be a multiple of its alignment (zero being a valid size
for any alignment). This ensures that an array of that type may always be indexed
by offsetting by a multiple of its size. Note that the size and alignment of a
type may not be known statically in the case of [dynamically sized types][dst].

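To illustrate the relationship between size, alignment, and array indexing, here is a small sketch. The helper names `size_is_multiple_of_align` and `element_stride` are assumptions made for this example only.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: checks the "size is a multiple of alignment" rule.
fn size_is_multiple_of_align<T>() -> bool {
    size_of::<T>() % align_of::<T>() == 0
}

/// Hypothetical helper: distance in bytes between two consecutive
/// elements of the array.
fn element_stride(arr: &[u32; 4]) -> usize {
    let first = &arr[0] as *const u32 as usize;
    let second = &arr[1] as *const u32 as usize;
    second - first
}

fn main() {
    assert!(size_is_multiple_of_align::<u8>());
    assert!(size_is_multiple_of_align::<u64>());
    assert!(size_is_multiple_of_align::<(u8, u32)>());

    // Zero is a valid size for any alignment.
    assert_eq!(size_of::<()>(), 0);
    assert!(size_is_multiple_of_align::<()>());

    // Array elements sit exactly `size_of::<T>()` bytes apart.
    let arr = [1u32, 2, 3, 4];
    assert_eq!(element_stride(&arr), size_of::<u32>());
    assert_eq!(size_of::<[u32; 4]>(), 4 * size_of::<u32>());
}
```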
Rust gives you the following ways to lay out composite data:

* structs (named product types)
* tuples (anonymous product types)
* arrays (homogeneous product types)
* enums (named sum types -- tagged unions)
* unions (untagged unions)

An enum is said to be *field-less* if none of its variants have associated data.

By default, composite structures have an alignment equal to the maximum
of their fields' alignments. Rust will consequently insert padding where
necessary to ensure that all fields are properly aligned and that the overall
type's size is a multiple of its alignment. For instance:

```rust
struct A {
    a: u8,
    b: u32,
    c: u16,
}
```

will be 32-bit aligned on a target that aligns these primitives to their
respective sizes. The whole struct will therefore have a size that is a multiple
of 32 bits. It may become:

```rust
struct A {
    a: u8,
    _pad1: [u8; 3], // to align `b`
    b: u32,
    c: u16,
    _pad2: [u8; 2], // to make overall size multiple of 4
}
```

or maybe:

```rust
struct A {
    b: u32,
    c: u16,
    a: u8,
    _pad: u8,
}
```

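The two candidate layouts can be compared by measuring. This is a sketch under stated assumptions: `AInOrder` is a hypothetical twin of `A` that uses `#[repr(C)]`, an attribute not discussed in this section, to force declaration order so that the padded in-order layout becomes observable. The size of the default-repr `A` is deliberately only printed, because it is not guaranteed.

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
struct A {
    a: u8,
    b: u32,
    c: u16,
}

// Hypothetical twin of `A`; `#[repr(C)]` keeps the fields in declaration order.
#[allow(dead_code)]
#[repr(C)]
struct AInOrder {
    a: u8,
    b: u32,
    c: u16,
}

fn main() {
    println!(
        "default repr: size = {}, align = {}",
        size_of::<A>(),
        align_of::<A>()
    );
    println!(
        "repr(C):      size = {}, align = {}",
        size_of::<AInOrder>(),
        align_of::<AInOrder>()
    );

    // Both layouts must keep the size a multiple of the alignment.
    assert_eq!(size_of::<A>() % align_of::<A>(), 0);
    assert_eq!(size_of::<AInOrder>() % align_of::<AInOrder>(), 0);

    // On a target that aligns `u32` to 4 and `u16` to 2, the in-order
    // layout needs 3 + 2 bytes of padding: 1 + 3 + 4 + 2 + 2 = 12.
    if align_of::<u32>() == 4 && align_of::<u16>() == 2 {
        assert_eq!(size_of::<AInOrder>(), 12);
    }
}
```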
There is *no indirection* for these types; all data is stored within the struct,
as you would expect in C. However, with the exception of arrays (which are
densely packed and in-order), the layout of data is not specified by default.
Given the following two struct definitions:

```rust
struct A {
    a: i32,
    b: u64,
}

struct B {
    a: i32,
    b: u64,
}
```

Rust *does* guarantee that two instances of A have their data laid out in
exactly the same way. However, Rust *does not* currently guarantee that an
instance of A has the same field ordering or padding as an instance of B.

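One practical consequence is that code should not reinterpret the bytes of an `A` as a `B`. A minimal sketch of a layout-independent alternative is to convert field by field; the `From` implementation below is an illustration added here, not something the original text prescribes.

```rust
struct A {
    a: i32,
    b: u64,
}

struct B {
    a: i32,
    b: u64,
}

// Converting field by field works no matter how the compiler
// orders or pads the fields of `A` and `B`.
impl From<A> for B {
    fn from(value: A) -> Self {
        B {
            a: value.a,
            b: value.b,
        }
    }
}

fn main() {
    let a = A { a: -1, b: 42 };
    let b = B::from(a);
    assert_eq!((b.a, b.b), (-1, 42));
}
```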
With A and B as written, this point would seem to be pedantic, but several other
features of Rust make it desirable for the language to play with data layout in
complex ways.

For instance, consider this struct:

```rust
struct Foo<T, U> {
    count: u16,
    data1: T,
    data2: U,
}
```

Now consider the monomorphizations of `Foo<u32, u16>` and `Foo<u16, u32>`. If
Rust lays out the fields in the order specified, we expect it to pad the
values in the struct to satisfy their alignment requirements. So if Rust
didn't reorder fields, we would expect it to produce the following:

<!-- ignore: explanation code -->
```rust,ignore
struct Foo<u16, u32> {
    count: u16,
    data1: u16,
    data2: u32,
}

struct Foo<u32, u16> {
    count: u16,
    _pad1: u16,
    data1: u32,
    data2: u16,
    _pad2: u16,
}
```

The latter case quite simply wastes space. An optimal use of space
requires different monomorphizations to have *different field orderings*.

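The wasted space can be measured. In the sketch below, `FooInOrder` is a hypothetical copy of `Foo` marked `#[repr(C)]` (an attribute outside the scope of this section) so that fields stay in declaration order. The sizes of the default-repr `Foo` are only printed, since the compiler is free to choose them.

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct Foo<T, U> {
    count: u16,
    data1: T,
    data2: U,
}

// Hypothetical copy of `Foo` that forbids field reordering.
#[allow(dead_code)]
#[repr(C)]
struct FooInOrder<T, U> {
    count: u16,
    data1: T,
    data2: U,
}

fn main() {
    println!("in order, <u16, u32>: {}", size_of::<FooInOrder<u16, u32>>());
    println!("in order, <u32, u16>: {}", size_of::<FooInOrder<u32, u16>>());
    println!("default,  <u16, u32>: {}", size_of::<Foo<u16, u32>>());
    println!("default,  <u32, u16>: {}", size_of::<Foo<u32, u16>>());

    // The fields alone take 2 + 2 + 4 = 8 bytes, so no layout can be smaller.
    assert!(size_of::<Foo<u16, u32>>() >= 8);
    assert!(size_of::<Foo<u32, u16>>() >= 8);
}
```

On a target that aligns `u32` to 4 bytes and `u16` to 2 bytes, the in-order variants are expected to take 8 and 12 bytes respectively, matching the two padded layouts shown above.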
Enums make this consideration even more complicated. Naively, an enum such as:

```rust
enum Foo {
    A(u32),
    B(u64),
    C(u8),
}
```

might be laid out as:

```rust
struct FooRepr {
    data: u64, // this is either a u64, u32, or u8 based on `tag`
    tag: u8,   // 0 = A, 1 = B, 2 = C
}
```

And indeed this is approximately how it would be laid out (modulo the
size and position of `tag`).

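A quick way to see that the enum carries a tag next to its largest payload is to measure it. This is a sketch only: the exact size is printed rather than asserted because enum layout is unspecified, and `variant_name` is a hypothetical helper that reads the tag through an ordinary `match`.

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
enum Foo {
    A(u32),
    B(u64),
    C(u8),
}

/// Hypothetical helper: `match` inspects the tag to pick the active variant.
fn variant_name(foo: &Foo) -> &'static str {
    match foo {
        Foo::A(_) => "A",
        Foo::B(_) => "B",
        Foo::C(_) => "C",
    }
}

fn main() {
    println!(
        "size_of::<Foo>() = {}, align_of::<Foo>() = {}",
        size_of::<Foo>(),
        align_of::<Foo>()
    );

    // The enum must at least hold its largest payload, a `u64`.
    assert!(size_of::<Foo>() >= size_of::<u64>());
    assert_eq!(size_of::<Foo>() % align_of::<Foo>(), 0);

    assert_eq!(variant_name(&Foo::B(10)), "B");
}
```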
However, there are several cases where such a representation is inefficient. The
classic case of this is Rust's "null pointer optimization": an enum consisting
of a single outer unit variant (e.g. `None`) and a (potentially nested)
non-nullable pointer variant (e.g. `Some(&T)`) makes the tag unnecessary. A null
pointer can safely be interpreted as the unit (`None`) variant. The net
result is that, for example, `size_of::<Option<&T>>() == size_of::<&T>()`.

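The size equality stated above can be asserted directly. In this sketch, `option_is_free` is a hypothetical helper name; the `u32` case is included as a contrast, since every bit pattern of a `u32` is a valid value and `Option<u32>` therefore needs extra room for a tag.

```rust
use std::mem::size_of;

/// Hypothetical helper: true if wrapping `T` in `Option` costs no extra space.
fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

fn main() {
    // References and boxes are never null, so `None` can reuse the null value.
    assert!(option_is_free::<&u32>());
    assert!(option_is_free::<&mut u32>());
    assert!(option_is_free::<Box<u32>>());

    // A plain integer has no spare bit pattern, so a tag is required.
    assert!(!option_is_free::<u32>());
}
```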
There are many types in Rust that are, or contain, non-nullable pointers, such as
`Box<T>`, `Vec<T>`, `String`, `&T`, and `&mut T`. Similarly, one can imagine
nested enums pooling their tags into a single discriminant, as they are by
definition known to have a limited range of valid values. In principle, enums could
use fairly elaborate algorithms to store bits throughout nested types with
forbidden values. As such, it is *especially* desirable that
we leave enum layout unspecified today.

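The following sketch probes these cases on the compiler at hand. Apart from the pointer cases covered by the null pointer optimization, the equalities below are observations about current compilers rather than guarantees, which is exactly why the layout is left unspecified. The helper `same_size_as_option` is a hypothetical name.

```rust
use std::mem::size_of;

/// Hypothetical helper: true if `Option<T>` is no larger than `T`.
fn same_size_as_option<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

fn main() {
    // Types that contain a non-nullable pointer (observed, not guaranteed).
    println!("Option<Vec<u8>> is free: {}", same_size_as_option::<Vec<u8>>());
    println!("Option<String> is free:  {}", same_size_as_option::<String>());

    // Nested enums can pool their tags into the unused values of `bool`
    // (observed, not guaranteed).
    println!("size_of::<bool>() = {}", size_of::<bool>());
    println!(
        "size_of::<Option<Option<bool>>>() = {}",
        size_of::<Option<Option<bool>>>()
    );
}
```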
[dst]: exotic-sizes.html#dynamically-sized-types-dsts