[[qm_pci_passthrough]]
PCI(e) Passthrough
------------------

PCI(e) passthrough is a mechanism to give a virtual machine control over
a PCI device from the host. This can have some advantages over using
virtualized hardware, for example lower latency, higher performance, or more
features (e.g., offloading).

But, if you pass through a device to a virtual machine, you cannot use that
device anymore on the host or in any other VM.
General Requirements
~~~~~~~~~~~~~~~~~~~~

Since passthrough is a feature which also needs hardware support, there are
some requirements to check and preparations to be done to make it work.


Hardware
^^^^^^^^
Your hardware needs to support `IOMMU` (*I*/*O* **M**emory **M**anagement
**U**nit) interrupt remapping; this includes the CPU and the mainboard.

Generally, Intel systems with VT-d and AMD systems with AMD-Vi support this.
But it is not guaranteed that everything will work out of the box, due
to bad hardware implementations and missing or low-quality drivers.

Further, server grade hardware often has better support than consumer grade
hardware, but even then, many modern systems can support this.

Please refer to your hardware vendor to check if they support this feature
under Linux for your specific setup.


Configuration
^^^^^^^^^^^^^

Once you have ensured that your hardware supports passthrough, you will need
to do some configuration to enable PCI(e) passthrough.


.IOMMU

The IOMMU has to be activated on the kernel command line. The easiest way is
to enable it through GRUB. Edit `/etc/default/grub` and add the following to
the 'GRUB_CMDLINE_LINUX_DEFAULT' variable:

* for Intel CPUs:
+
----
intel_iommu=on
----
* for AMD CPUs:
+
----
amd_iommu=on
----

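For example, on an Intel system the resulting line in `/etc/default/grub`
might then look like this (the `quiet` option is just an assumed pre-existing
default and may differ on your system):

----
# in /etc/default/grub -- "quiet" is only an assumed pre-existing default
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
----
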
[[qm_pci_passthrough_update_grub]]
To bring this change into effect, make sure you run:

----
# update-grub
----

.Kernel Modules

You have to make sure the following modules are loaded. This can be achieved
by adding them to `/etc/modules`:

----
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
----

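A quick way to later verify that these modules were actually loaded (for
example after the reboot described below) is:

----
# lsmod | grep vfio
----
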
[[qm_pci_passthrough_update_initramfs]]
After changing anything module-related, you need to refresh your
`initramfs`. On {pve} this can be done by executing:

----
# update-initramfs -u -k all
----

.Finish Configuration

Finally, reboot to bring the changes into effect and check that IOMMU support
is indeed enabled. The command

----
# dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
----

should display that `IOMMU`, `Directed I/O` or `Interrupt Remapping` is
enabled. Depending on hardware and kernel, the exact message can vary.

It is also important that the device(s) you want to pass through
are in a *separate* `IOMMU` group. This can be checked with:

----
# find /sys/kernel/iommu_groups/ -type l
----

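For a more readable overview, a small helper loop like the following can be
used; this is just a sketch and assumes `lspci` (package `pciutils`) is
installed:

----
#!/bin/sh
# Print each IOMMU group together with the lspci description of its devices.
for dev in /sys/kernel/iommu_groups/*/devices/*; do
    group="$(basename "$(dirname "$(dirname "$dev")")")"
    printf 'IOMMU group %s: ' "$group"
    lspci -nn -s "$(basename "$dev")"
done
----
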
It is okay if the device is in an `IOMMU` group together with its functions
(e.g. a GPU with the HDMI Audio device) or with its root port or PCI(e) bridge.

.PCI(e) slots
[NOTE]
====
Some platforms handle their physical PCI(e) slots differently. So, sometimes
it can help to put the card in another PCI(e) slot, if you do not get the
desired `IOMMU` group separation.
====

.Unsafe interrupts
[NOTE]
====
For some platforms, it may be necessary to allow unsafe interrupts.
For this, add the following line to a file ending with `.conf` in
*/etc/modprobe.d/*:

----
options vfio_iommu_type1 allow_unsafe_interrupts=1
----

Please be aware that this option can make your system unstable.
====

Notes for GPU passthrough
^^^^^^^^^^^^^^^^^^^^^^^^^

When passing through a GPU (be it a full device or a vGPU), if you want to use
it for displaying, you have to either physically connect a monitor to the
card (if possible), or configure remote desktop software (e.g., VNC, RDP)
inside the guest and use that.

It is not possible to display the output of the GPU via noVNC or SPICE on the
{pve} web interface.

If you want to use the GPU for things like OpenCL or CUDA, this is not an issue,
since the application should be able to choose the hardware without using
it as a display.

Host Device Passthrough
~~~~~~~~~~~~~~~~~~~~~~~

The most used variant of PCI(e) passthrough is to pass through a whole
PCI(e) card, for example a GPU or a network card.


Host Configuration
^^^^^^^^^^^^^^^^^^

In this case, the host must not use the card. There are two methods to achieve
this:

* pass the device IDs to the options of the 'vfio-pci' module by adding
+
----
options vfio-pci ids=1234:5678,4321:8765
----
+
to a .conf file in */etc/modprobe.d/*, where `1234:5678` and `4321:8765` are
the vendor and device IDs obtained by:
+
----
# lspci -nn
----

* blacklist the driver completely on the host, ensuring that it is free to bind
for passthrough, with
+
----
blacklist DRIVERNAME
----
+
in a .conf file in */etc/modprobe.d/* (see the combined sketch below).

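As a sketch, both approaches combined in one modprobe file might look like this
(the filename `vfio.conf`, the IDs and `DRIVERNAME` are just the placeholders
from above):

----
# /etc/modprobe.d/vfio.conf (hypothetical example)
# Bind these vendor:device IDs to vfio-pci instead of their normal driver:
options vfio-pci ids=1234:5678,4321:8765
# And/or keep the host driver from loading at all:
blacklist DRIVERNAME
----
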
For both methods you need to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`] again and
reboot after that.

[[qm_pci_passthrough_vm_config]]
VM Configuration
^^^^^^^^^^^^^^^^
To pass through the device, you need to set the *hostpciX* option in the VM
configuration, for example by executing:

----
# qm set VMID -hostpci0 00:02.0
----

If your device has multiple functions (e.g., `00:02.0` and `00:02.1`),
you can pass them all through together with the shortened syntax `00:02`.

There are some options which may be necessary, depending on the device
and guest OS:

* *x-vga=on|off* marks the PCI(e) device as the primary GPU of the VM.
With this enabled the *vga* configuration option will be ignored.

* *pcie=on|off* tells {pve} to use a PCIe or PCI port. Some guest/device
combinations require PCIe rather than PCI. PCIe is only available for 'q35'
machine types.

* *rombar=on|off* makes the firmware ROM visible to the guest. Default is on.
Some PCI(e) devices need this disabled.

* *romfile=<path>* is an optional path to a ROM file for the device to use.
This is a relative path under */usr/share/kvm/*.

.Example

An example of PCIe passthrough with a GPU set to primary:

----
# qm set VMID -hostpci0 02:00,pcie=on,x-vga=on
----

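If a device additionally needs a custom ROM image, a hypothetical invocation
could look like the following, where `vbios.bin` is just a placeholder name
for a file you would have to place under */usr/share/kvm/* yourself:

----
# qm set VMID -hostpci0 02:00,romfile=vbios.bin
----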

Other considerations
^^^^^^^^^^^^^^^^^^^^

When passing through a GPU, the best compatibility is reached when using
'q35' as the machine type, 'OVMF' ('EFI' for VMs) instead of SeaBIOS and PCIe
instead of PCI. Note that if you want to use 'OVMF' for GPU passthrough, the
GPU needs to have an EFI-capable ROM, otherwise use SeaBIOS instead.
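
A minimal sketch of switching an existing VM to these settings (note that
'OVMF' usually also needs an EFI disk for its variable store, which is not
shown here):

----
# qm set VMID -machine q35 -bios ovmf
----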

SR-IOV
~~~~~~

Another variant for passing through PCI(e) devices is to use the hardware
virtualization features of your devices, if available.

'SR-IOV' (**S**ingle-**R**oot **I**nput/**O**utput **V**irtualization) enables
a single device to provide multiple 'VF' (**V**irtual **F**unctions) to the
system. Each of those VFs can be used in a different VM, with full hardware
features and also better performance and lower latency than software
virtualized devices.

Currently, the most common use case for this are NICs (**N**etwork
**I**nterface **C**ards) with SR-IOV support, which can provide multiple VFs
per physical port. This allows features such as checksum offloading to be
used inside a VM, reducing the (host) CPU overhead.


Host Configuration
^^^^^^^^^^^^^^^^^^

Generally, there are two methods for enabling virtual functions on a device.

* Sometimes there is an option for the driver module, e.g. for some
Intel drivers:
+
----
max_vfs=4
----
+
which could be put in a file with a '.conf' ending under */etc/modprobe.d/*.
(Do not forget to update your initramfs after that.)
+
Please refer to your driver module documentation for the exact
parameters and options.

* The second, more generic, approach is using `sysfs`.
If a device and its driver support this, you can change the number of VFs on
the fly. For example, to set up 4 VFs on device 0000:01:00.0, execute:
+
----
# echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
----
+
To make this change persistent, you can use the `sysfsutils` Debian package.
After installation, configure it via */etc/sysfs.conf* or a `FILE.conf` in
*/etc/sysfs.d/* (see the sketch below).
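
As a sketch of such a persistent entry, using the example device and VF count
from above (the attribute path syntax is that of `sysfsutils`, relative to
`/sys`):

----
# /etc/sysfs.conf -- or a FILE.conf in /etc/sysfs.d/ (illustrative entry)
bus/pci/devices/0000:01:00.0/sriov_numvfs = 4
----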

VM Configuration
^^^^^^^^^^^^^^^^

After creating VFs, you should see them as separate PCI(e) devices when
listing them with `lspci`. Get their ID and pass them through like a
xref:qm_pci_passthrough_vm_config[normal PCI(e) device].
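
For example, assuming one of the new VFs showed up at the (hypothetical)
address `01:00.4`, it could be passed through with:

----
# qm set VMID -hostpci0 01:00.4
----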

Other considerations
^^^^^^^^^^^^^^^^^^^^

For this feature, platform support is especially important. It may be necessary
to enable this feature in the BIOS/EFI first, or to use a specific PCI(e) port
for it to work. If in doubt, consult the manual of the platform or contact its
vendor.

Mediated Devices (vGPU, GVT-g)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mediated devices are another method to reuse features and performance from
physical hardware for virtualized hardware. These are found most commonly in
virtualized GPU setups, such as Intel's GVT-g and NVIDIA's vGPUs used in their
GRID technology.

With this, a physical card is able to create virtual cards, similar to SR-IOV.
The difference is that mediated devices do not appear as PCI(e) devices in the
host, and are as such only suited for use in virtual machines.


Host Configuration
^^^^^^^^^^^^^^^^^^

In general, your card's driver must support this feature, otherwise it will
not work. So please refer to your vendor for compatible drivers and how to
configure them.

Intel's drivers for GVT-g are integrated in the kernel and should work
with 5th, 6th and 7th generation Intel Core processors, as well as E3 v4,
E3 v5 and E3 v6 Xeon processors.

To enable it for Intel Graphics, you have to make sure to load the module
'kvmgt' (for example via `/etc/modules`) and to enable it on the kernel
command line. For this, you can edit `/etc/default/grub` and add the following
to the 'GRUB_CMDLINE_LINUX_DEFAULT' variable:

----
i915.enable_gvt=1
----

After that remember to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`],
xref:qm_pci_passthrough_update_grub[update grub] and
reboot your host.

VM Configuration
^^^^^^^^^^^^^^^^

To use a mediated device, simply specify the `mdev` property on a `hostpciX`
VM configuration option.

You can get the supported devices via 'sysfs'. For example, to list the
supported types for the device '0000:00:02.0', you would simply execute:

----
# ls /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types
----

Each entry is a directory which contains the following important files:

* 'available_instances' contains the number of still available instances of
this type; each 'mdev' used in a VM reduces this.
* 'description' contains a short description of the capabilities of the type.
* 'create' is the endpoint to create such a device. {pve} does this
automatically for you, if a 'hostpciX' option with `mdev` is configured.

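For example, to inspect one of the listed types (the type name
`i915-GVTg_V5_4` is the one used in the example configuration below), one
might run:

----
# cat /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types/i915-GVTg_V5_4/description
# cat /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types/i915-GVTg_V5_4/available_instances
----
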
Example configuration with an `Intel GVT-g vGPU` (`Intel Skylake 6700k`):

----
# qm set VMID -hostpci0 00:02.0,mdev=i915-GVTg_V5_4
----

With this set, {pve} automatically creates such a device on VM start, and
cleans it up again when the VM stops.