[[qm_pci_passthrough]]
PCI(e) Passthrough
------------------
ifdef::wiki[]
:pve-toplevel:
endif::wiki[]

PCI(e) passthrough is a mechanism to give a virtual machine control over
a PCI device from the host. This can have some advantages over using
virtualized hardware, for example lower latency, higher performance, or more
features (e.g., offloading).

However, if you pass a device through to a virtual machine, you can no longer
use that device on the host or in any other VM.

General Requirements
~~~~~~~~~~~~~~~~~~~~

Since passthrough is a feature that also requires hardware support, there are
some requirements to check and preparations to make before it will work.


Hardware
^^^^^^^^
Your hardware needs to support `IOMMU` (*I*/*O* **M**emory **M**anagement
**U**nit) interrupt remapping; this includes the CPU and the mainboard.

Generally, Intel systems with VT-d and AMD systems with AMD-Vi support this.
But it is not guaranteed that everything will work out of the box, due
to bad hardware implementations and missing or low-quality drivers.

Further, server-grade hardware often has better support than consumer-grade
hardware, but even then, many modern systems can support this.

Please refer to your hardware vendor to check if they support this feature
under Linux for your specific setup.


Configuration
^^^^^^^^^^^^^

Once you have ensured that your hardware supports passthrough, you will need
to do some configuration to enable PCI(e) passthrough.


.IOMMU

First, IOMMU support has to be enabled in your BIOS/UEFI. Most often, that
option is named `IOMMU` or `VT-d`, but check your motherboard's manual for
the exact option you need to enable.

Then, the IOMMU might need to be activated on the
xref:sysboot_edit_kernel_cmdline[kernel commandline].
(On newer kernels, this should not be necessary.)

The command line parameters are:

* for Intel CPUs:
+
----
intel_iommu=on
----
* for AMD CPUs it should be enabled automatically.
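
How to add the parameter depends on how your host boots. As a minimal sketch,
assuming the host boots via GRUB, you could append it to
`GRUB_CMDLINE_LINUX_DEFAULT` in '/etc/default/grub':

----
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
----

and then run `update-grub`. See the referenced section on editing the kernel
commandline for the bootloader-specific details.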


If your hardware supports it, enabling IOMMU passthrough mode might increase
performance, because the VMs then bypass the (default) DMA translation
normally performed by the hypervisor and instead pass DMA requests directly
to the hardware IOMMU. You can enable it by adding

----
iommu.passthrough=1
----

or

----
iommu=pt
----

to the kernel commandline.

.Kernel Modules

You have to make sure the following modules are loaded. This can be achieved
by adding them to `/etc/modules`:

----
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
----
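
One minimal way to append them, assuming none of them are listed there yet:

----
# printf '%s\n' vfio vfio_iommu_type1 vfio_pci vfio_virqfd >> /etc/modules
----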

[[qm_pci_passthrough_update_initramfs]]
After changing anything related to modules, you need to refresh your
`initramfs`. On {pve} this can be done by executing:

----
# update-initramfs -u -k all
----

.Finish Configuration

Finally, reboot to bring the changes into effect and check that it is indeed
enabled.

----
# dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
----

should display that `IOMMU`, `Directed I/O` or `Interrupt Remapping` is
enabled; the exact message can vary depending on the hardware and kernel.
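
For example, on an Intel system with `intel_iommu=on` set, one line you might
see (kernel-version dependent, so treat this as an illustration only) is:

----
DMAR: IOMMU enabled
----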

It is also important that the device(s) you want to pass through
are in a *separate* `IOMMU` group. This can be checked with:

----
# find /sys/kernel/iommu_groups/ -type l
----

It is okay if the device is in an `IOMMU` group together with its functions
(e.g. a GPU with the HDMI Audio device) or with its root port or PCI(e) bridge.
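
For a more readable overview, mapping each group to the devices it contains,
a small helper script like the following sketch can be used (it relies only
on the `sysfs` layout shown above and on `lspci`):

----
#!/bin/bash
# print every IOMMU group together with the devices it contains
for dev in /sys/kernel/iommu_groups/*/devices/*; do
    group=${dev#/sys/kernel/iommu_groups/}; group=${group%%/*}
    printf 'IOMMU group %s: ' "$group"
    lspci -nns "${dev##*/}"
done
----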

.PCI(e) slots
[NOTE]
====
Some platforms handle their physical PCI(e) slots differently. So, sometimes
it can help to put the card in another PCI(e) slot, if you do not get the
desired `IOMMU` group separation.
====

.Unsafe interrupts
[NOTE]
====
For some platforms, it may be necessary to allow unsafe interrupts.
For this, add the following line to a file ending with '.conf' in
*/etc/modprobe.d/*:

----
options vfio_iommu_type1 allow_unsafe_interrupts=1
----

Please be aware that this option can make your system unstable.
====

GPU Passthrough Notes
^^^^^^^^^^^^^^^^^^^^^

It is not possible to display the frame buffer of the GPU via noVNC or SPICE
on the {pve} web interface.

When passing through a whole GPU or a vGPU and graphic output is wanted, one
has to either physically connect a monitor to the card, or configure remote
desktop software (for example, VNC or RDP) inside the guest.

If you want to use the GPU as a hardware accelerator, for example, for
programs using OpenCL or CUDA, this is not required.

Host Device Passthrough
~~~~~~~~~~~~~~~~~~~~~~~

The most common variant of PCI(e) passthrough is to pass through a whole
PCI(e) card, for example a GPU or a network card.


Host Configuration
^^^^^^^^^^^^^^^^^^

In this case, the host must not use the card. There are two methods to
achieve this:

* pass the device IDs to the options of the 'vfio-pci' module by adding
+
----
options vfio-pci ids=1234:5678,4321:8765
----
+
to a .conf file in */etc/modprobe.d/* where `1234:5678` and `4321:8765` are
the vendor and device IDs obtained by:
+
----
# lspci -nn
----

* blacklist the driver completely on the host, ensuring that it is free to
bind for passthrough, with
+
----
blacklist DRIVERNAME
----
+
in a .conf file in */etc/modprobe.d/* (see the combined sketch below).
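
Put together, a file such as '/etc/modprobe.d/vfio.conf' could look like the
following sketch. The IDs and the driver name are placeholders, so substitute
the values for your own device:

----
# bind this device (and e.g. its audio function) to vfio-pci
options vfio-pci ids=1234:5678,4321:8765
# keep the host driver from claiming the card
blacklist DRIVERNAME
----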

For both methods you need to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`] again and
reboot after that.

.Verify Configuration

To check if your changes were successful, you can use

----
# lspci -nnk
----

and check your device entry. If it says

----
Kernel driver in use: vfio-pci
----

or the 'in use' line is missing entirely, the device is ready to be used for
passthrough.

[[qm_pci_passthrough_vm_config]]
VM Configuration
^^^^^^^^^^^^^^^^
To pass through the device you need to set the *hostpciX* option in the VM
configuration, for example by executing:

----
# qm set VMID -hostpci0 00:02.0
----

If your device has multiple functions (e.g., `00:02.0` and `00:02.1`),
you can pass them all through together with the shortened syntax `00:02`.
This is equivalent to checking the 'All Functions' checkbox in the
web-interface.

There are some options which may be necessary, depending on the device
and guest OS:

* *x-vga=on|off* marks the PCI(e) device as the primary GPU of the VM.
With this enabled the *vga* configuration option will be ignored.

* *pcie=on|off* tells {pve} to use a PCIe or PCI port. Some guest/device
combinations require PCIe rather than PCI. PCIe is only available for 'q35'
machine types.

* *rombar=on|off* makes the firmware ROM visible for the guest. Default is on.
Some PCI(e) devices need this disabled.

* *romfile=<path>* is an optional path to a ROM file for the device to use.
This is a relative path under */usr/share/kvm/*.

.Example

An example of PCIe passthrough with a GPU set to primary:

----
# qm set VMID -hostpci0 02:00,pcie=on,x-vga=on
----

.PCI ID overrides

You can override the PCI vendor ID, device ID, and subsystem IDs that will be
seen by the guest. This is useful if your device is a variant with an ID that
your guest's drivers don't recognize, but you want to force those drivers to
be loaded anyway (e.g. if you know your device shares the same chipset as a
supported variant).

The available options are `vendor-id`, `device-id`, `sub-vendor-id`, and
`sub-device-id`. You can set any or all of these to override your device's
default IDs.

For example:

----
# qm set VMID -hostpci0 02:00,device-id=0x10f6,sub-vendor-id=0x0000
----


Other considerations
^^^^^^^^^^^^^^^^^^^^

When passing through a GPU, the best compatibility is reached when using
'q35' as the machine type, 'OVMF' ('EFI' for VMs) instead of SeaBIOS, and
PCIe instead of PCI. Note that if you want to use 'OVMF' for GPU passthrough,
the GPU needs to have an EFI-capable ROM, otherwise use SeaBIOS instead.
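
As a minimal sketch of setting those two VM properties on an existing VM
(assuming the guest is installed in a way that can still boot with them):

----
# qm set VMID -machine q35 -bios ovmf
----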

SR-IOV
~~~~~~

Another variant for passing through PCI(e) devices is to use the hardware
virtualization features of your devices, if available.

'SR-IOV' (**S**ingle-**R**oot **I**nput/**O**utput **V**irtualization) enables
a single device to provide multiple 'VF' (**V**irtual **F**unctions) to the
system. Each of those VFs can be used in a different VM, with full hardware
features and also better performance and lower latency than software
virtualized devices.

Currently, the most common use case for this are NICs (**N**etwork
**I**nterface **C**ards) with SR-IOV support, which can provide multiple VFs
per physical port. This allows features such as checksum offloading to be
used inside a VM, reducing the (host) CPU overhead.


Host Configuration
^^^^^^^^^^^^^^^^^^

Generally, there are two methods for enabling virtual functions on a device.

* sometimes there is an option for the driver module, e.g., for some
Intel drivers
+
----
max_vfs=4
----
+
which can be put in a file with a '.conf' ending under */etc/modprobe.d/*.
(Do not forget to update your initramfs after that.)
+
Please refer to your driver module documentation for the exact
parameters and options.

* The second, more generic, approach is using the `sysfs`.
If a device and driver support this, you can change the number of VFs on
the fly. For example, to set up 4 VFs on device 0000:01:00.0 execute:
+
----
# echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
----
+
To make this change persistent you can use the `sysfsutils` Debian package.
After installation, configure it via */etc/sysfs.conf* or a 'FILE.conf' in
*/etc/sysfs.d/*.
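
For example, to recreate the 4 VFs from above on every boot, a line like the
following sketch (using the `sysfsutils` attribute syntax, with the device
address from the example) could be placed in */etc/sysfs.conf*:

----
bus/pci/devices/0000:01:00.0/sriov_numvfs = 4
----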

VM Configuration
^^^^^^^^^^^^^^^^

After creating VFs, you should see them as separate PCI(e) devices when
listing them with `lspci`. Get their ID and pass them through like a
xref:qm_pci_passthrough_vm_config[normal PCI(e) device].
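
For example, assuming one of the new VFs showed up at '0000:01:10.0', the
following sketch would find and pass it through:

----
# lspci -nn | grep -i 'Virtual Function'
# qm set VMID -hostpci0 01:10.0
----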

Other considerations
^^^^^^^^^^^^^^^^^^^^

For this feature, platform support is especially important. It may be
necessary to enable this feature in the BIOS/EFI first, or to use a specific
PCI(e) port for it to work. If in doubt, consult the manual of the platform
or contact its vendor.

Mediated Devices (vGPU, GVT-g)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mediated devices are another method to reuse features and performance from
physical hardware for virtualized hardware. These are most commonly found in
virtualized GPU setups such as Intel's GVT-g and NVIDIA's vGPUs used in their
GRID technology.

With this, a physical card is able to create virtual cards, similar to SR-IOV.
The difference is that mediated devices do not appear as PCI(e) devices in
the host, and are thus only suited for use in virtual machines.


Host Configuration
^^^^^^^^^^^^^^^^^^

In general, your card's driver must support that feature, otherwise it will
not work. So please refer to your vendor for compatible drivers and how to
configure them.

Intel's drivers for GVT-g are integrated in the kernel and should work
with 5th, 6th and 7th generation Intel Core Processors, as well as E3 v4, E3
v5 and E3 v6 Xeon Processors.

To enable it for Intel Graphics, you have to make sure to load the module
'kvmgt' (for example via `/etc/modules`) and to enable it on the
xref:sysboot_edit_kernel_cmdline[Kernel commandline] by adding the following
parameter:

----
i915.enable_gvt=1
----
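
A minimal sketch of loading the module persistently, assuming it is not
listed in `/etc/modules` yet:

----
# echo kvmgt >> /etc/modules
----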

After that remember to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`],
and reboot your host.

VM Configuration
^^^^^^^^^^^^^^^^

To use a mediated device, simply specify the `mdev` property on a `hostpciX`
VM configuration option.

You can get the supported devices via the 'sysfs'. For example, to list the
supported types for the device '0000:00:02.0' you would simply execute:

----
# ls /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types
----

Each entry is a directory which contains the following important files:

* 'available_instances' contains the number of still available instances of
this type; each 'mdev' in use by a VM reduces this.
* 'description' contains a short description of the capabilities of the type.
* 'create' is the endpoint to create such a device; {pve} does this
automatically for you, if a 'hostpciX' option with `mdev` is configured.
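
For example, to inspect one of the listed types before using it (here with
the 'i915-GVTg_V5_4' type from the example below), you could read these files
directly:

----
# cat /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types/i915-GVTg_V5_4/description
# cat /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types/i915-GVTg_V5_4/available_instances
----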

Example configuration with an `Intel GVT-g vGPU` (`Intel Skylake 6700k`):

----
# qm set VMID -hostpci0 00:02.0,mdev=i915-GVTg_V5_4
----

With this set, {pve} automatically creates such a device on VM start, and
cleans it up again when the VM stops.

ifdef::wiki[]

See Also
~~~~~~~~

* link:/wiki/Pci_passthrough[PCI Passthrough Examples]

endif::wiki[]