[[qm_pci_passthrough]]
PCI(e) Passthrough
------------------
ifdef::wiki[]
:pve-toplevel:
endif::wiki[]

PCI(e) passthrough is a mechanism to give a virtual machine control over
a PCI device from the host. This can have some advantages over using
virtualized hardware, for example lower latency, higher performance, or more
features (e.g., offloading).

But, if you pass through a device to a virtual machine, you cannot use that
device anymore on the host or in any other VM.

Note that, while PCI passthrough is available for i440fx and q35 machines, PCIe
passthrough is only available on q35 machines. This does not mean that
PCIe capable devices that are passed through as PCI devices will only run at
PCI speeds. Passing through devices as PCIe just sets a flag for the guest to
tell it that the device is a PCIe device instead of a "really fast legacy PCI
device". Some guest applications benefit from this.

General Requirements
~~~~~~~~~~~~~~~~~~~~

Since passthrough is performed on real hardware, it needs to fulfill some
requirements. A brief overview of these requirements is given below. For more
information on specific devices, see
https://pve.proxmox.com/wiki/PCI_Passthrough[PCI Passthrough Examples].

Hardware
^^^^^^^^
Your hardware needs to support `IOMMU` (*I*/*O* **M**emory **M**anagement
**U**nit) interrupt remapping; this includes the CPU and the motherboard.

Generally, Intel systems with VT-d and AMD systems with AMD-Vi support this.
However, it is not guaranteed that everything will work out of the box, due
to poor hardware implementations and missing or low-quality drivers.

Further, server-grade hardware often has better support than consumer-grade
hardware, but even then, many modern systems can support this.

Please refer to your hardware vendor to check whether they support this feature
under Linux for your specific setup.

Determining PCI Card Address
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The easiest way is to use the GUI to add a device of type "Host PCI" in the VM's
hardware tab. Alternatively, you can use the command line.

You can locate your card using

----
 lspci
----
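
The output lists all PCI(e) devices together with their addresses. For a GPU,
an entry may look similar to the following illustrative line, where `01:00.0`
is the address you will need later:

----
01:00.0 VGA compatible controller: NVIDIA Corporation GP108 [GeForce GT 1030] (rev a1)
----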

Configuration
^^^^^^^^^^^^^

Once you have ensured that your hardware supports passthrough, you will need to
do some configuration to enable PCI(e) passthrough.

.IOMMU

First, you will have to enable IOMMU support in your BIOS/UEFI. Usually the
corresponding setting is called `IOMMU` or `VT-d`, but you should find the exact
option name in the manual of your motherboard.

For Intel CPUs, you also need to enable the IOMMU on the
xref:sysboot_edit_kernel_cmdline[kernel command line] by adding:

----
 intel_iommu=on
----

For AMD CPUs it should be enabled automatically.
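
As a concrete sketch for hosts that boot via GRUB (see the referenced section
above for other bootloaders, such as systemd-boot with `proxmox-boot-tool`,
which uses `/etc/kernel/cmdline`), the parameter is appended to
`GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub`:

----
 GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
----

Afterwards, apply the change with:

----
# update-grub
----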

.IOMMU Passthrough Mode

If your hardware supports IOMMU passthrough mode, enabling this mode might
increase performance.
This is because VMs then bypass the (default) DMA translation normally
performed by the hypervisor and instead pass DMA requests directly to the
hardware IOMMU. To enable passthrough mode, add:

----
 iommu=pt
----

to the xref:sysboot_edit_kernel_cmdline[kernel command line].
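
On an Intel system, the kernel command line would then, for example, contain
both parameters:

----
 intel_iommu=on iommu=pt
----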

.Kernel Modules

//TODO: remove `vfio_virqfd` stuff with eol of pve 7
You have to make sure the following modules are loaded. This can be achieved by
adding them to `/etc/modules`. In kernels newer than 6.2 ({pve} 8 and onward),
the 'vfio_virqfd' module is part of the 'vfio' module; therefore, loading
'vfio_virqfd' in {pve} 8 and newer is not necessary.

----
 vfio
 vfio_iommu_type1
 vfio_pci
 vfio_virqfd #not needed if on kernel 6.2 or newer
----

[[qm_pci_passthrough_update_initramfs]]
After changing anything module-related, you need to refresh your
`initramfs`. On {pve} this can be done by executing:

----
# update-initramfs -u -k all
----

To check if the modules are being loaded, the output of

----
# lsmod | grep vfio
----

should include the four modules from above (on kernel 6.2 or newer,
'vfio_virqfd' will not show up as a separate module).

.Finish Configuration

Finally, reboot to bring the changes into effect and check that IOMMU is indeed
enabled.

----
# dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
----

should display that `IOMMU`, `Directed I/O` or `Interrupt Remapping` is
enabled; depending on hardware and kernel, the exact message can vary.
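
On Intel hardware, for example, the output may contain lines similar to:

----
DMAR: IOMMU enabled
DMAR: Intel(R) Virtualization Technology for Directed I/O
----

On AMD hardware, look for `AMD-Vi` messages instead, for example
`AMD-Vi: Interrupt remapping enabled`.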

For notes on how to troubleshoot or verify if IOMMU is working as intended, please
see the https://pve.proxmox.com/wiki/PCI_Passthrough#Verifying_IOMMU_parameters[Verifying IOMMU Parameters]
section in our wiki.

It is also important that the device(s) you want to pass through
are in a *separate* `IOMMU` group. This can be checked with a call to the {pve}
API:

----
# pvesh get /nodes/{nodename}/hardware/pci --pci-class-blacklist ""
----

It is okay if the device is in an `IOMMU` group together with its functions
(e.g. a GPU with the HDMI Audio device) or with its root port or PCI(e) bridge.
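
Alternatively, the grouping can be inspected directly through sysfs; each
printed symlink corresponds to one device and the IOMMU group it belongs to:

----
# find /sys/kernel/iommu_groups/ -type l
----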

.PCI(e) slots
[NOTE]
====
Some platforms handle their physical PCI(e) slots differently. So, sometimes
it can help to put the card into another PCI(e) slot, if you do not get the
desired `IOMMU` group separation.
====

.Unsafe interrupts
[NOTE]
====
For some platforms, it may be necessary to allow unsafe interrupts.
To do so, add the following line to a file ending with `.conf' in
*/etc/modprobe.d/*:

----
 options vfio_iommu_type1 allow_unsafe_interrupts=1
----

Please be aware that this option can make your system unstable.
====

GPU Passthrough Notes
^^^^^^^^^^^^^^^^^^^^^

It is not possible to display the frame buffer of the GPU via NoVNC or SPICE on
the {pve} web interface.

When passing through a whole GPU or a vGPU and graphical output is wanted, one
has to either physically connect a monitor to the card, or configure remote
desktop software (for example, VNC or RDP) inside the guest.

If you want to use the GPU as a hardware accelerator, for example, for
programs using OpenCL or CUDA, this is not required.

Host Device Passthrough
~~~~~~~~~~~~~~~~~~~~~~~

The most common variant of PCI(e) passthrough is to pass through a whole
PCI(e) card, for example a GPU or a network card.


Host Configuration
^^^^^^^^^^^^^^^^^^

{pve} tries to automatically make the PCI(e) device unavailable for the host.
However, if this doesn't work, there are two things that can be done:

* pass the device IDs to the options of the 'vfio-pci' module by adding
+
----
 options vfio-pci ids=1234:5678,4321:8765
----
+
to a .conf file in */etc/modprobe.d/* where `1234:5678` and `4321:8765` are
the vendor and device IDs obtained by the following command (an example of its
output is shown after this list):
+
----
# lspci -nn
----

* blacklist the driver on the host completely, ensuring that it is free to bind
for passthrough, with
+
----
 blacklist DRIVERNAME
----
+
in a .conf file in */etc/modprobe.d/*.
+
To find the driver name, execute
+
----
# lspci -k
----
+
for example:
+
----
# lspci -k | grep -A 3 "VGA"
----
+
will output something similar to
+
----
01:00.0 VGA compatible controller: NVIDIA Corporation GP108 [GeForce GT 1030] (rev a1)
 Subsystem: Micro-Star International Co., Ltd. [MSI] GP108 [GeForce GT 1030]
 Kernel driver in use: <some-module>
 Kernel modules: <some-module>
----
+
Now we can blacklist the driver by writing its name into a .conf file:
+
----
echo "blacklist <some-module>" >> /etc/modprobe.d/blacklist.conf
----

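With `lspci -nn`, the numeric IDs are shown in square brackets at the end of
each device line. An illustrative example, where `1234:5678` would be the
vendor and device ID to use for 'vfio-pci':

----
01:00.0 VGA compatible controller [0300]: Vendor Name Device Name [1234:5678] (rev a1)
----
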
For both methods you need to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`] again and
reboot after that.

Should this not work, you might need to set a soft dependency to load the GPU
modules before loading 'vfio-pci'. This can be done with the 'softdep' flag;
see also the manpages on 'modprobe.d' for more information.

For example, if you are using drivers named <some-module>:

----
# echo "softdep <some-module> pre: vfio-pci" >> /etc/modprobe.d/<some-module>.conf
----


.Verify Configuration

To check if your changes were successful, you can use

----
# lspci -nnk
----

and check your device entry. If it says

----
Kernel driver in use: vfio-pci
----

or the 'in use' line is missing entirely, the device is ready to be used for
passthrough.

[[qm_pci_passthrough_vm_config]]
VM Configuration
^^^^^^^^^^^^^^^^
When passing through a GPU, the best compatibility is reached when using
'q35' as machine type, 'OVMF' ('UEFI' for VMs) instead of SeaBIOS and PCIe
instead of PCI. Note that if you want to use 'OVMF' for GPU passthrough, the
GPU needs to have a UEFI-capable ROM; otherwise, use SeaBIOS instead. To check if
the ROM is UEFI capable, see the
https://pve.proxmox.com/wiki/PCI_Passthrough#How_to_know_if_a_graphics_card_is_UEFI_.28OVMF.29_compatible[PCI Passthrough Examples]
wiki.

Furthermore, using OVMF, disabling VGA arbitration may be possible, reducing the
amount of legacy code needed to be run during boot. To disable VGA arbitration:

----
 echo "options vfio-pci ids=<vendor-id>,<device-id> disable_vga=1" > /etc/modprobe.d/vfio.conf
----

replacing the <vendor-id> and <device-id> with the ones obtained from:

----
# lspci -nn
----

PCI devices can be added in the web interface in the hardware section of the VM.
Alternatively, you can use the command line; set the *hostpciX* option in the VM
configuration, for example by executing:

----
# qm set VMID -hostpci0 00:02.0
----

or by adding a line to the VM configuration file:

----
 hostpci0: 00:02.0
----
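
To verify that the entry was added, you can display the resulting VM
configuration, for example:

----
# qm config VMID | grep hostpci
----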


If your device has multiple functions (e.g., ``00:02.0`' and ``00:02.1`'),
you can pass them all through together with the shortened syntax ``00:02`'.
This is equivalent to checking the ``All Functions`' checkbox in the
web interface.

There are some options which may be necessary, depending on the device
and guest OS:

* *x-vga=on|off* marks the PCI(e) device as the primary GPU of the VM.
With this enabled the *vga* configuration option will be ignored.

* *pcie=on|off* tells {pve} to use a PCIe or PCI port. Some guest/device
combinations require PCIe rather than PCI. PCIe is only available for 'q35'
machine types.

* *rombar=on|off* makes the firmware ROM visible for the guest. Default is on.
Some PCI(e) devices need this disabled.

* *romfile=<path>* is an optional path to a ROM file for the device to use.
This is a relative path under */usr/share/kvm/*.

.Example

An example of PCIe passthrough with a GPU set to primary:

----
# qm set VMID -hostpci0 02:00,pcie=on,x-vga=on
----
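
If a device needs a custom ROM, a dump placed under */usr/share/kvm/* can be
referenced with *romfile*. A minimal sketch, using a hypothetical file name
`vbios.bin`:

----
# qm set VMID -hostpci0 01:00.0,romfile=vbios.bin
----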

.PCI ID overrides

You can override the PCI vendor ID, device ID, and subsystem IDs that will be
seen by the guest. This is useful if your device is a variant with an ID that
your guest's drivers don't recognize, but you want to force those drivers to be
loaded anyway (e.g. if you know your device shares the same chipset as a
supported variant).

The available options are `vendor-id`, `device-id`, `sub-vendor-id`, and
`sub-device-id`. You can set any or all of these to override your device's
default IDs.

For example:

----
# qm set VMID -hostpci0 02:00,device-id=0x10f6,sub-vendor-id=0x0000
----

SR-IOV
~~~~~~

Another variant for passing through PCI(e) devices is to use the hardware
virtualization features of your devices, if available.

.Enabling SR-IOV
[NOTE]
====
To use SR-IOV, platform support is especially important. It may be necessary
to enable this feature in the BIOS/UEFI first, or to use a specific PCI(e) port
for it to work. If in doubt, consult the manual of the platform or contact its
vendor.
====

'SR-IOV' (**S**ingle-**R**oot **I**nput/**O**utput **V**irtualization) enables
a single device to provide multiple 'VF' (**V**irtual **F**unctions) to the
system. Each of those VFs can be used in a different VM, with full hardware
features and also better performance and lower latency than software
virtualized devices.

Currently, the most common use case for this is NICs (**N**etwork
**I**nterface **C**ards) with SR-IOV support, which can provide multiple VFs per
physical port. This allows features such as checksum offloading to be used
inside a VM, reducing the (host) CPU overhead.

Host Configuration
^^^^^^^^^^^^^^^^^^

Generally, there are two methods for enabling virtual functions on a device.

* Sometimes there is an option for the driver module, e.g. for some
Intel drivers:
+
----
 max_vfs=4
----
+
which could be put into a file with a '.conf' ending under */etc/modprobe.d/*,
as part of an 'options' line for the respective driver module.
(Do not forget to update your initramfs after that.)
+
Please refer to your driver module documentation for the exact
parameters and options.

* The second, more generic, approach is using `sysfs`.
If the device and driver support this, you can change the number of VFs on
the fly. For example, to set up 4 VFs on device 0000:01:00.0, execute:
+
----
# echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
----
+
To make this change persistent, you can use the `sysfsutils` Debian package.
After installation, configure it via */etc/sysfs.conf* or a `FILE.conf' in
*/etc/sysfs.d/* (see the example after this list).

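As a minimal sketch of such a persistent `sysfsutils` configuration, reusing
the device and VF count from the example above, an entry in a (hypothetical)
file */etc/sysfs.d/sriov.conf* could look like this:

----
bus/pci/devices/0000:01:00.0/sriov_numvfs = 4
----
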
VM Configuration
^^^^^^^^^^^^^^^^

After creating VFs, you should see them as separate PCI(e) devices in the
output of `lspci`. Get their ID and pass them through like a
xref:qm_pci_passthrough_vm_config[normal PCI(e) device].

Mediated Devices (vGPU, GVT-g)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mediated devices are another method to reuse features and performance from
physical hardware for virtualized hardware. These are most commonly found in
virtualized GPU setups such as Intel's GVT-g and NVIDIA's vGPUs used in their
GRID technology.

With this, a physical card is able to create virtual cards, similar to SR-IOV.
The difference is that mediated devices do not appear as PCI(e) devices in the
host, and are as such only suited for use in virtual machines.

Host Configuration
^^^^^^^^^^^^^^^^^^

In general, your card's driver must support this feature, otherwise it will
not work. So please refer to your vendor for compatible drivers and how to
configure them.

Intel's drivers for GVT-g are integrated in the kernel and should work
with 5th, 6th and 7th generation Intel Core Processors, as well as E3 v4, E3
v5 and E3 v6 Xeon Processors.

To enable it for Intel Graphics, you have to make sure to load the module
'kvmgt' (for example via `/etc/modules`) and to enable it on the
xref:sysboot_edit_kernel_cmdline[kernel command line] by adding the following
parameter:

----
 i915.enable_gvt=1
----
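
For example, assuming `/etc/modules` is used for loading the module, it could
be added with:

----
# echo "kvmgt" >> /etc/modules
----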

After that remember to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`],
and reboot your host.

VM Configuration
^^^^^^^^^^^^^^^^

To use a mediated device, simply specify the `mdev` property on a `hostpciX`
VM configuration option.

You can get the supported device types via 'sysfs'. For example, to list the
supported types for the device '0000:00:02.0', you would simply execute:

----
# ls /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types
----

Each entry is a directory which contains the following important files:

* 'available_instances' contains the number of still available instances of
this type; each 'mdev' used in a VM reduces this.
* 'description' contains a short description of the capabilities of the type.
* 'create' is the endpoint to create such a device; {pve} does this
automatically for you, if a 'hostpciX' option with `mdev` is configured.

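For example, to show the description and the remaining number of instances of
a type (using the 'i915-GVTg_V5_4' type from the configuration example below),
you could run:

----
# cat /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types/i915-GVTg_V5_4/description
# cat /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types/i915-GVTg_V5_4/available_instances
----
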
Example configuration with an `Intel GVT-g vGPU` (`Intel Skylake 6700k`):

----
# qm set VMID -hostpci0 00:02.0,mdev=i915-GVTg_V5_4
----

With this set, {pve} automatically creates such a device on VM start, and
cleans it up again when the VM stops.

Use in Clusters
~~~~~~~~~~~~~~~

It is also possible to map devices on a cluster level, so that they can be
properly used with HA, hardware changes are detected, and non-root users
can configure them. See xref:resource_mapping[Resource Mapping]
for details on that.

[[qm_pci_viommu]]
vIOMMU (emulated IOMMU)
~~~~~~~~~~~~~~~~~~~~~~~

vIOMMU is the emulation of a hardware IOMMU within a virtual machine, providing
improved memory access control and security for virtualized I/O devices. Using
the vIOMMU option also allows you to pass through PCI devices to level-2 VMs in
level-1 VMs via https://pve.proxmox.com/wiki/Nested_Virtualization[Nested Virtualization].
There are currently two vIOMMU implementations available: Intel and VirtIO.

Host requirement:

* Add `intel_iommu=on` or `amd_iommu=on`, depending on your CPU, to your kernel
command line.

Intel vIOMMU
^^^^^^^^^^^^

Intel vIOMMU specific VM requirements:

* Whether you are using an Intel or AMD CPU on your host, it is important to set
`intel_iommu=on` in the VM's kernel parameters.

* To use Intel vIOMMU you need to set *q35* as the machine type.

If all requirements are met, you can add `viommu=intel` to the machine parameter
in the configuration of the VM that should be able to pass through PCI devices.

----
# qm set VMID -machine q35,viommu=intel
----

https://wiki.qemu.org/Features/VT-d[QEMU documentation for VT-d]

VirtIO vIOMMU
^^^^^^^^^^^^^

This vIOMMU implementation is more recent and does not have as many limitations
as Intel vIOMMU, but is currently less used in production and less documented.

With VirtIO vIOMMU there is *no* need to set any kernel parameters. It is also
*not* necessary to use q35 as the machine type, but it is advisable if you want
to use PCIe.

----
# qm set VMID -machine q35,viommu=virtio
----

https://web.archive.org/web/20230804075844/https://michael2012z.medium.com/virtio-iommu-789369049443[Blog post by Michael Zhao explaining virtio-iommu]

ifdef::wiki[]

See Also
~~~~~~~~

* link:/wiki/Pci_passthrough[PCI Passthrough Examples]

endif::wiki[]