Commit | Line | Data |
---|---|---|
4446d382 TG |
1 | Microarchitectural Data Sampling (MDS) mitigation |
2 | ================================================= | |
3 | ||
4 | .. _mds: | |
5 | ||
6 | Overview | |
7 | -------- | |
8 | ||
9 | Microarchitectural Data Sampling (MDS) is a family of side channel attacks | |
10 | on internal buffers in Intel CPUs. The variants are: | |
11 | ||
12 | - Microarchitectural Store Buffer Data Sampling (MSBDS) (CVE-2018-12126) | |
13 | - Microarchitectural Fill Buffer Data Sampling (MFBDS) (CVE-2018-12130) | |
14 | - Microarchitectural Load Port Data Sampling (MLPDS) (CVE-2018-12127) | |
15 | ||
16 | MSBDS leaks Store Buffer Entries which can be speculatively forwarded to a | |
17 | dependent load (store-to-load forwarding) as an optimization. The forward | |
18 | can also happen to a faulting or assisting load operation for a different | |
19 | memory address, which can be exploited under certain conditions. Store | |
20 | buffers are partitioned between Hyper-Threads so cross thread forwarding is | |
21 | not possible. But if a thread enters or exits a sleep state the store | |
22 | buffer is repartitioned which can expose data from one thread to the other. | |
23 | ||
24 | MFBDS leaks Fill Buffer Entries. Fill buffers are used internally to manage | |
25 | L1 miss situations and to hold data which is returned or sent in response | |
26 | to a memory or I/O operation. Fill buffers can forward data to a load | |
27 | operation and also write data to the cache. When the fill buffer is | |
28 | deallocated it can retain the stale data of the preceding operations which | |
29 | can then be forwarded to a faulting or assisting load operation, which can | |
30 | be exploited under certain conditions. Fill buffers are shared between | |
31 | Hyper-Threads so cross thread leakage is possible. | |
32 | ||
33 | MLPDS leaks Load Port Data. Load ports are used to perform load operations | |
34 | from memory or I/O. The received data is then forwarded to the register | |
35 | file or a subsequent operation. In some implementations the Load Port can | |
36 | contain stale data from a previous operation which can be forwarded to | |
37 | faulting or assisting loads under certain conditions, which can then be | |
38 | exploited. Load ports are shared between Hyper-Threads so cross | |
39 | thread leakage is possible. | |
40 | ||
41 | ||
42 | Exposure assumptions | |
43 | -------------------- | |
44 | ||
45 | It is assumed that attack code resides in user space or in a guest, with one | |
46 | exception. The rationale behind this assumption is that the code construct | |
47 | needed for exploiting MDS requires: | |
48 | ||
49 | - to control the load to trigger a fault or assist | |
50 | ||
51 | - to have a disclosure gadget which exposes the speculatively accessed | |
52 | data for consumption through a side channel. | |
53 | ||
54 | - to control the pointer through which the disclosure gadget exposes the | |
55 | data | |
56 | ||
57 | The existence of such a construct in the kernel cannot be excluded with | |
58 | 100% certainty, but the complexity involved makes it extremely unlikely. | |
59 | ||
60 | There is one exception, which is untrusted BPF. The functionality of | |
61 | untrusted BPF is limited, but it needs to be thoroughly investigated | |
62 | whether it can be used to create such a construct. | |
63 | ||
64 | ||
65 | Mitigation strategy | |
66 | ------------------- | |
67 | ||
68 | All variants have the same mitigation strategy at least for the single CPU | |
69 | thread case (SMT off): Force the CPU to clear the affected buffers. | |
70 | ||
71 | This is achieved by using the otherwise unused and obsolete VERW | |
72 | instruction in combination with a microcode update. The microcode clears | |
73 | the affected CPU buffers when the VERW instruction is executed. | |
74 | ||
75 | For virtualization there are two ways to achieve CPU buffer | |
76 | clearing: either via the modified VERW instruction or via the L1D Flush | |
77 | command. The latter is issued when L1TF mitigation is enabled so the extra | |
78 | VERW can be avoided. If the CPU is not affected by L1TF then VERW needs to | |
79 | be issued. | |
80 | ||
81 | If the VERW instruction with the supplied segment selector argument is | |
82 | executed on a CPU without the microcode update there is no side effect | |
83 | other than a small number of pointlessly wasted CPU cycles. | |
84 | ||
85 | This does not protect against cross Hyper-Thread attacks except for MSBDS | |
86 | which is only exploitable cross Hyper-Thread when one of the Hyper-Threads | |
87 | enters a C-state. | |
88 | ||
89 | The kernel provides a function to invoke the buffer clearing: | |
90 | ||
91 | mds_clear_cpu_buffers() | |
92 | ||
93 | The mitigation is invoked on kernel/userspace, hypervisor/guest and C-state | |
94 | (idle) transitions. | |
95 | ||
96 | According to current knowledge additional mitigations inside the kernel | |
97 | itself are not required because the necessary gadgets to expose the leaked | |
98 | data cannot be controlled in a way which allows exploitation from malicious | |
99 | user space or VM guests. | |
5ab15133 TG |
100 | |
101 | Mitigation points | |
102 | ----------------- | |
103 | ||
104 | 1. Return to user space | |
105 | ^^^^^^^^^^^^^^^^^^^^^^^ | |
106 | ||
107 | When transitioning from kernel to user space the CPU buffers are flushed | |
108 | on affected CPUs when the mitigation is not disabled on the kernel | |
109 | command line. The mitigation is enabled through the static key | |
110 | mds_user_clear. | |
111 | ||
112 | The mitigation is invoked in prepare_exit_to_usermode() which covers | |
113 | most of the kernel to user space transitions. There are a few exceptions | |
114 | which do not invoke prepare_exit_to_usermode() on return to user | |
115 | space. These exceptions use the paranoid exit code. | |
116 | ||
117 | - Non Maskable Interrupt (NMI): | |
118 | ||
119 | Access to sensitive data such as keys or credentials in the NMI context is | |
120 | mostly theoretical: The CPU can do prefetching or execute a | |
121 | misspeculated code path and thereby fetch data which might end up | |
122 | leaking through a buffer. | |
123 | ||
124 | But for mounting other attacks the kernel stack address of the task is | |
125 | already valuable information. So in full mitigation mode, the NMI is | |
126 | mitigated on the return from do_nmi() to provide almost complete | |
127 | coverage. | |
128 | ||
129 | - Double fault (#DF): | |
130 | ||
131 | A double fault is usually fatal, but the ESPFIX workaround, which can | |
132 | be triggered from user space through modify_ldt(2) is a recoverable | |
133 | double fault. #DF uses the paranoid exit path, so explicit mitigation | |
134 | in the double fault handler is required. | |
135 | ||
136 | - Machine Check Exception (#MC): | |
137 | ||
138 | Another corner case is a #MC which hits between the CPU buffer clear | |
139 | invocation and the actual return to user. As this still is in kernel | |
140 | space it takes the paranoid exit path which does not clear the CPU | |
141 | buffers. So the #MC handler repopulates the buffers to some | |
142 | extent. Machine checks are not reliably controllable and the window is | |
143 | extremely small so mitigation would just tick a checkbox that this | |
144 | theoretical corner case is covered. To keep the amount of special | |
145 | cases small, ignore #MC. | |
146 | ||
147 | - Debug Exception (#DB): | |
148 | ||
149 | This takes the paranoid exit path only when the INT1 breakpoint is in | |
150 | kernel space. #DB on a user space address takes the regular exit path, | |
151 | so no extra mitigation is required. |
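
The return-to-user cases described above can be summarized as a small decision model. The following is only an illustrative user space sketch in C, not kernel code: the enum, `mds_user_clear_enabled`, `model_clear_cpu_buffers()` and `model_return_to_user()` are hypothetical stand-ins for the static key mds_user_clear and for mds_clear_cpu_buffers(), and the model assumes that "enabled" corresponds to full mitigation mode, in which the NMI return is also covered.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the return paths discussed above; not kernel code. */
enum exit_path {
	EXIT_REGULAR,	/* via prepare_exit_to_usermode() */
	EXIT_NMI,	/* return from do_nmi() */
	EXIT_DF,	/* recoverable double fault (ESPFIX) */
	EXIT_MC,	/* machine check: deliberately left unmitigated */
	EXIT_DB_USER,	/* #DB on a user space address, regular exit path */
	EXIT_PATH_COUNT
};

/* Stand-in for the mds_user_clear static key (assumed: full mitigation). */
static bool mds_user_clear_enabled = true;

/* Counts how often the modelled buffer clear was invoked. */
static unsigned int clear_count;

/* Stand-in for mds_clear_cpu_buffers(); the real function executes VERW. */
static void model_clear_cpu_buffers(void)
{
	clear_count++;
}

/* Returns true when the modelled path clears CPU buffers before user space. */
static bool path_clears_buffers(enum exit_path path)
{
	assert(path < EXIT_PATH_COUNT);

	if (!mds_user_clear_enabled)
		return false;

	/* #MC is the one case the text chooses to ignore. */
	return path != EXIT_MC;
}

/* Models one return to user space through the given path. */
static void model_return_to_user(enum exit_path path)
{
	if (path_clears_buffers(path))
		model_clear_cpu_buffers();
}
```

Calling `model_return_to_user(EXIT_NMI)` increments `clear_count` while the flag is set, whereas `model_return_to_user(EXIT_MC)` never does, which mirrors the decision to ignore #MC. A #DB that hits in kernel space is not modelled, because it returns to the kernel rather than to user space.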