[[qm_pci_passthrough]]
PCI(e) Passthrough
------------------
ifdef::wiki[]
:pve-toplevel:
endif::wiki[]

PCI(e) passthrough is a mechanism to give a virtual machine control over
a PCI device from the host. This can have some advantages over using
virtualized hardware, for example lower latency, higher performance, or more
features (e.g., offloading).

But, if you pass through a device to a virtual machine, you cannot use that
device anymore on the host or in any other VM.

Note that, while PCI passthrough is available for i440fx and q35 machines, PCIe
passthrough is only available on q35 machines. This does not mean that
PCIe capable devices that are passed through as PCI devices will only run at
PCI speeds. Passing through devices as PCIe just sets a flag for the guest to
tell it that the device is a PCIe device instead of a "really fast legacy PCI
device". Some guest applications benefit from this.

General Requirements
~~~~~~~~~~~~~~~~~~~~

Since passthrough is performed on real hardware, it needs to fulfill some
requirements. A brief overview of these requirements is given below; for more
information on specific devices, see
https://pve.proxmox.com/wiki/PCI_Passthrough[PCI Passthrough Examples].

Hardware
^^^^^^^^
Your hardware needs to support `IOMMU` (*I*/*O* **M**emory **M**anagement
**U**nit) interrupt remapping; this includes the CPU and the motherboard.

Generally, Intel systems with VT-d and AMD systems with AMD-Vi support this.
But it is not guaranteed that everything will work out of the box, due
to bad hardware implementations and missing or low-quality drivers.

Further, server-grade hardware often has better support than consumer-grade
hardware, but even then, many modern systems can support this.

Please refer to your hardware vendor to check if they support this feature
under Linux for your specific setup.

Determining PCI Card Address
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The easiest way is to use the GUI to add a device of type "Host PCI" in the VM's
hardware tab. Alternatively, you can use the command line.

You can locate your card using

----
 lspci
----
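
To narrow down the output, you can filter by device class; the address in the
first column (in `bus:device.function` notation) is what you will later pass to
the VM. For example (the card shown here is purely illustrative):

```shell
# Filter for VGA-class devices; the match is case-insensitive.
# The output line below is illustrative, your hardware will differ.
lspci | grep -i vga
# 01:00.0 VGA compatible controller: NVIDIA Corporation GP108 [GeForce GT 1030] (rev a1)
```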

Configuration
^^^^^^^^^^^^^

Once you have ensured that your hardware supports passthrough, you will need to
do some configuration to enable PCI(e) passthrough.

.IOMMU

First, you will have to enable IOMMU support in your BIOS/UEFI. Usually the
corresponding setting is called `IOMMU` or `VT-d`, but you should find the exact
option name in the manual of your motherboard.

For Intel CPUs, you also need to enable the IOMMU on the
xref:sysboot_edit_kernel_cmdline[kernel command line] by adding:

----
 intel_iommu=on
----

For AMD CPUs it should be enabled automatically.

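How the parameter is added depends on your bootloader; as a sketch for the two
common cases on {pve} (the `quiet` value shown in the comment is only an
example of an existing setting):

```shell
# GRUB-based systems: prepend the parameter to the existing
# GRUB_CMDLINE_LINUX_DEFAULT="..." line, then regenerate the config.
sed -i 's/^GRUB_CMDLINE_LINUX_DEFAULT="/&intel_iommu=on /' /etc/default/grub
update-grub

# systemd-boot (e.g. ZFS installs): append to the single-line
# /etc/kernel/cmdline file instead, then refresh the boot entries.
# sed -i '$ s/$/ intel_iommu=on/' /etc/kernel/cmdline
# proxmox-boot-tool refresh
```
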
.IOMMU Passthrough Mode

If your hardware supports IOMMU passthrough mode, enabling this mode might
increase performance.
This is because VMs then bypass the (default) DMA translation normally
performed by the hypervisor and instead pass DMA requests directly to the
hardware IOMMU. To enable passthrough mode, add:

----
 iommu=pt
----

to the xref:sysboot_edit_kernel_cmdline[kernel command line].

.Kernel Modules

//TODO: remove `vfio_virqfd` stuff with eol of pve 7
You have to make sure the following modules are loaded. This can be achieved by
adding them to `/etc/modules`. In kernels newer than 6.2 ({pve} 8 and onward),
the 'vfio_virqfd' module is part of the 'vfio' module, so loading
'vfio_virqfd' in {pve} 8 and newer is not necessary.

----
 vfio
 vfio_iommu_type1
 vfio_pci
 vfio_virqfd #not needed if on kernel 6.2 or newer
----

[[qm_pci_passthrough_update_initramfs]]
After changing anything module-related, you need to refresh your
`initramfs`. On {pve} this can be done by executing:

----
# update-initramfs -u -k all
----

To check if the modules are being loaded, the output of

----
# lsmod | grep vfio
----

should include the four modules from above.

.Finish Configuration

Finally, reboot to bring the changes into effect and check that it is indeed
enabled.

----
# dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
----

should display that `IOMMU`, `Directed I/O` or `Interrupt Remapping` is
enabled; depending on hardware and kernel, the exact message can vary.

For notes on how to troubleshoot or verify if IOMMU is working as intended,
please see the
https://pve.proxmox.com/wiki/PCI_Passthrough#Verifying_IOMMU_parameters[Verifying IOMMU Parameters]
section in our wiki.

It is also important that the device(s) you want to pass through
are in a *separate* `IOMMU` group. This can be checked with a call to the {pve}
API:

----
# pvesh get /nodes/{nodename}/hardware/pci --pci-class-blacklist ""
----

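The grouping can also be inspected directly via sysfs, without going through
the API. A minimal sketch (it only walks `/sys/kernel/iommu_groups`; on a host
where the IOMMU is disabled, nothing is printed):

```shell
#!/bin/sh
# Print every IOMMU group and the PCI devices assigned to it.
for group in /sys/kernel/iommu_groups/*; do
    [ -d "$group/devices" ] || continue
    echo "IOMMU group ${group##*/}:"
    for dev in "$group"/devices/*; do
        # resolve the raw PCI address to a readable device name
        lspci -nns "${dev##*/}"
    done
done
```

Devices listed under the same group number cannot be passed through
independently of each other.
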
It is okay if the device is in an `IOMMU` group together with its functions
(e.g. a GPU with the HDMI Audio device) or with its root port or PCI(e) bridge.

.PCI(e) slots
[NOTE]
====
Some platforms handle their physical PCI(e) slots differently. So, sometimes
it can help to put the card in another PCI(e) slot, if you do not get the
desired `IOMMU` group separation.
====

.Unsafe interrupts
[NOTE]
====
For some platforms, it may be necessary to allow unsafe interrupts.
For this, add the following line to a file ending with `.conf` in
*/etc/modprobe.d/*:

----
 options vfio_iommu_type1 allow_unsafe_interrupts=1
----

Please be aware that this option can make your system unstable.
====

GPU Passthrough Notes
^^^^^^^^^^^^^^^^^^^^^

It is not possible to display the frame buffer of the GPU via NoVNC or SPICE on
the {pve} web interface.

When passing through a whole GPU or a vGPU and graphic output is wanted, one
has to either physically connect a monitor to the card, or configure a remote
desktop software (for example, VNC or RDP) inside the guest.

If you want to use the GPU as a hardware accelerator, for example, for
programs using OpenCL or CUDA, this is not required.

Host Device Passthrough
~~~~~~~~~~~~~~~~~~~~~~~

The most common variant of PCI(e) passthrough is to pass through a whole
PCI(e) card, for example a GPU or a network card.

Host Configuration
^^^^^^^^^^^^^^^^^^

{pve} tries to automatically make the PCI(e) device unavailable for the host.
However, if this doesn't work, there are two things that can be done:

* pass the device IDs to the options of the 'vfio-pci' module by adding
+
----
 options vfio-pci ids=1234:5678,4321:8765
----
+
to a .conf file in */etc/modprobe.d/*, where `1234:5678` and `4321:8765` are
the vendor and device IDs obtained by:
+
----
# lspci -nn
----

* blacklist the driver on the host completely, ensuring that it is free to bind
for passthrough, with
+
----
 blacklist DRIVERNAME
----
+
in a .conf file in */etc/modprobe.d/*.
+
To find the driver name, execute
+
----
# lspci -k
----
+
for example:
+
----
# lspci -k | grep -A 3 "VGA"
----
+
will output something similar to
+
----
01:00.0 VGA compatible controller: NVIDIA Corporation GP108 [GeForce GT 1030] (rev a1)
        Subsystem: Micro-Star International Co., Ltd. [MSI] GP108 [GeForce GT 1030]
        Kernel driver in use: <some-module>
        Kernel modules: <some-module>
----
+
Now we can blacklist the drivers by writing them into a .conf file:
+
----
echo "blacklist <some-module>" >> /etc/modprobe.d/blacklist.conf
----

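For the first method, the `vendor:device` pair is the bracketed value at the
end of each `lspci -nn` line. For example (illustrative output; the IDs shown
are examples only):

```shell
# Show the numeric IDs for VGA-class devices; the last bracketed
# field is the vendor:device pair for the vfio-pci "ids=" option.
lspci -nn | grep "VGA"
# 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP108 [GeForce GT 1030] [10de:1d01] (rev a1)
```

Here, `10de:1d01` would be the value to put into the `ids=` option.
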
For both methods you need to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`] again and
reboot after that.

Should this not work, you might need to set a soft dependency to load the GPU
modules before loading 'vfio-pci'. This can be done with the 'softdep' flag;
see also the manpages on 'modprobe.d' for more information.

For example, if you are using drivers named <some-module>:

----
# echo "softdep <some-module> pre: vfio-pci" >> /etc/modprobe.d/<some-module>.conf
----

.Verify Configuration

To check if your changes were successful, you can use

----
# lspci -nnk
----

and check your device entry. If it says

----
Kernel driver in use: vfio-pci
----

or the 'in use' line is missing entirely, the device is ready to be used for
passthrough.

[[qm_pci_passthrough_vm_config]]
VM Configuration
^^^^^^^^^^^^^^^^

When passing through a GPU, the best compatibility is reached when using
'q35' as machine type, 'OVMF' ('UEFI' for VMs) instead of SeaBIOS, and PCIe
instead of PCI. Note that if you want to use 'OVMF' for GPU passthrough, the
GPU needs to have a UEFI-capable ROM, otherwise use SeaBIOS instead. To check
if the ROM is UEFI-capable, see the
https://pve.proxmox.com/wiki/PCI_Passthrough#How_to_know_if_a_graphics_card_is_UEFI_.28OVMF.29_compatible[PCI Passthrough Examples]
wiki.

Furthermore, using OVMF, disabling VGA arbitration may be possible, reducing
the amount of legacy code needed to be run during boot. To disable VGA
arbitration:

----
 echo "options vfio-pci ids=<vendor-id>,<device-id> disable_vga=1" > /etc/modprobe.d/vfio.conf
----

replacing the <vendor-id> and <device-id> with the ones obtained from:

----
# lspci -nn
----

PCI devices can be added in the web interface in the hardware section of the VM.
Alternatively, you can use the command line; set the *hostpciX* option in the VM
configuration, for example by executing:

----
# qm set VMID -hostpci0 00:02.0
----

or by adding a line to the VM configuration file:

----
 hostpci0: 00:02.0
----

If your device has multiple functions (e.g., `00:02.0` and `00:02.1`),
you can pass them all through together with the shortened syntax `00:02`.
This is equivalent to checking the ``All Functions'' checkbox in the
web interface.

There are some options which may be necessary, depending on the device
and guest OS:

* *x-vga=on|off* marks the PCI(e) device as the primary GPU of the VM.
With this enabled, the *vga* configuration option will be ignored.

* *pcie=on|off* tells {pve} to use a PCIe or PCI port. Some guest/device
combinations require PCIe rather than PCI. PCIe is only available for 'q35'
machine types.

* *rombar=on|off* makes the firmware ROM visible to the guest. Default is on.
Some PCI(e) devices need this disabled.

* *romfile=<path>* is an optional path to a ROM file for the device to use.
This is a relative path under */usr/share/kvm/*.

.Example

An example of PCIe passthrough with a GPU set to primary:

----
# qm set VMID -hostpci0 02:00,pcie=on,x-vga=on
----

.PCI ID overrides

You can override the PCI vendor ID, device ID, and subsystem IDs that will be
seen by the guest. This is useful if your device is a variant with an ID that
your guest's drivers don't recognize, but you want to force those drivers to be
loaded anyway (e.g. if you know your device shares the same chipset as a
supported variant).

The available options are `vendor-id`, `device-id`, `sub-vendor-id`, and
`sub-device-id`. You can set any or all of these to override your device's
default IDs.

For example:

----
# qm set VMID -hostpci0 02:00,device-id=0x10f6,sub-vendor-id=0x0000
----

SR-IOV
~~~~~~

Another variant for passing through PCI(e) devices is to use the hardware
virtualization features of your devices, if available.

.Enabling SR-IOV
[NOTE]
====
To use SR-IOV, platform support is especially important. It may be necessary
to enable this feature in the BIOS/UEFI first, or to use a specific PCI(e) port
for it to work. If in doubt, consult the manual of the platform or contact its
vendor.
====

'SR-IOV' (**S**ingle-**R**oot **I**nput/**O**utput **V**irtualization) enables
a single device to provide multiple 'VF' (**V**irtual **F**unctions) to the
system. Each of those VFs can be used in a different VM, with full hardware
features and also better performance and lower latency than software-virtualized
devices.

Currently, the most common use case for this are NICs (**N**etwork
**I**nterface **C**ards) with SR-IOV support, which can provide multiple VFs
per physical port. This allows features such as checksum offloading to be used
inside a VM, reducing the (host) CPU overhead.

Host Configuration
^^^^^^^^^^^^^^^^^^

Generally, there are two methods for enabling virtual functions on a device.

* Sometimes there is an option for the driver module, e.g. for some
Intel drivers
+
----
 max_vfs=4
----
+
which could be put in a file with a '.conf' ending under */etc/modprobe.d/*.
(Do not forget to update your initramfs after that.)
+
Please refer to your driver module documentation for the exact
parameters and options.

* The second, more generic approach is using `sysfs`.
If a device and driver support this, you can change the number of VFs on
the fly. For example, to set up 4 VFs on device 0000:01:00.0, execute:
+
----
# echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
----
+
To make this change persistent, you can use the `sysfsutils` Debian package.
After installation, configure it via */etc/sysfs.conf* or a `FILE.conf` in
*/etc/sysfs.d/*.

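To persist the example above, an entry along the following lines could be used.
Note that `sysfsutils` expects attribute paths relative to `/sys`; the device
address and file name are illustrative:

```shell
# /etc/sysfs.d/sriov.conf -- applied by the sysfsutils service at boot
# (illustrative device address; adjust it to your NIC)
bus/pci/devices/0000:01:00.0/sriov_numvfs = 4
```
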
VM Configuration
^^^^^^^^^^^^^^^^

After creating VFs, you should see them as separate PCI(e) devices when
outputting them with `lspci`. Get their ID and pass them through like a
xref:qm_pci_passthrough_vm_config[normal PCI(e) device].

Mediated Devices (vGPU, GVT-g)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mediated devices are another method to reuse features and performance from
physical hardware for virtualized hardware. These are found most commonly in
virtualized GPU setups, such as Intel's GVT-g and NVIDIA's vGPUs used in their
GRID technology.

With this, a physical card is able to create virtual cards, similar to SR-IOV.
The difference is that mediated devices do not appear as PCI(e) devices in the
host, and are as such only suited for use in virtual machines.

Host Configuration
^^^^^^^^^^^^^^^^^^

In general, your card's driver must support this feature, otherwise it will
not work. So please refer to your vendor for compatible drivers and how to
configure them.

Intel's drivers for GVT-g are integrated in the kernel and should work
with 5th, 6th and 7th generation Intel Core Processors, as well as E3 v4, E3
v5 and E3 v6 Xeon Processors.

To enable it for Intel Graphics, you have to make sure to load the module
'kvmgt' (for example via `/etc/modules`) and to enable it on the
xref:sysboot_edit_kernel_cmdline[kernel command line] by adding the following
parameter:

----
 i915.enable_gvt=1
----

After that, remember to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`]
and reboot your host.

464VM Configuration
465^^^^^^^^^^^^^^^^
466
d25f097c
TL
467To use a mediated device, simply specify the `mdev` property on a `hostpciX`
468VM configuration option.
050192c5 469
d25f097c
TL
470You can get the supported devices via the 'sysfs'. For example, to list the
471supported types for the device '0000:00:02.0' you would simply execute:
050192c5
DC
472
473----
474# ls /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types
475----
476
477Each entry is a directory which contains the following important files:
478
d25f097c
TL
479* 'available_instances' contains the amount of still available instances of
480this type, each 'mdev' use in a VM reduces this.
050192c5 481* 'description' contains a short description about the capabilities of the type
d25f097c
TL
482* 'create' is the endpoint to create such a device, {pve} does this
483automatically for you, if a 'hostpciX' option with `mdev` is configured.
050192c5 484
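To get an overview of all types in one go, you could loop over that directory;
a sketch (the device address `0000:00:02.0` is only an example, adjust it to
your card):

```shell
#!/bin/sh
# Print each supported mdev type with its remaining instance count
# and its description. The device path below is an example.
base=/sys/bus/pci/devices/0000:00:02.0/mdev_supported_types
for type in "$base"/*; do
    [ -d "$type" ] || continue
    printf '%s: %s instance(s) left\n' "${type##*/}" "$(cat "$type/available_instances")"
    cat "$type/description"
done
```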
Example configuration with an `Intel GVT-g vGPU` (`Intel Skylake 6700k`):

----
# qm set VMID -hostpci0 00:02.0,mdev=i915-GVTg_V5_4
----

With this set, {pve} automatically creates such a device on VM start, and
cleans it up again when the VM stops.

Use in Clusters
~~~~~~~~~~~~~~~

It is also possible to map devices on a cluster level, so that they can be
properly used with HA, hardware changes are detected, and non-root users
can configure them. See xref:resource_mapping[Resource Mapping]
for details.

[[qm_pci_viommu]]
vIOMMU (emulated IOMMU)
~~~~~~~~~~~~~~~~~~~~~~~

vIOMMU is the emulation of a hardware IOMMU within a virtual machine, providing
improved memory access control and security for virtualized I/O devices. Using
the vIOMMU option also allows you to pass through PCI devices to level-2 VMs in
level-1 VMs via https://pve.proxmox.com/wiki/Nested_Virtualization[Nested Virtualization].
There are currently two vIOMMU implementations available: Intel and VirtIO.

Host requirement:

* Add `intel_iommu=on` or `amd_iommu=on`, depending on your CPU, to your kernel
command line.

Intel vIOMMU
^^^^^^^^^^^^

Intel vIOMMU specific VM requirements:

* Whether you are using an Intel or AMD CPU on your host, it is important to set
`intel_iommu=on` in the VM's kernel parameters.

* To use Intel vIOMMU, you need to set *q35* as the machine type.

If all requirements are met, you can add `viommu=intel` to the machine parameter
in the configuration of the VM that should be able to pass through PCI devices.

----
# qm set VMID -machine q35,viommu=intel
----

https://wiki.qemu.org/Features/VT-d[QEMU documentation for VT-d]

VirtIO vIOMMU
^^^^^^^^^^^^^

This vIOMMU implementation is more recent and does not have as many limitations
as Intel vIOMMU, but is currently less used in production and less documented.

With VirtIO vIOMMU there is *no* need to set any kernel parameters. It is also
*not* necessary to use q35 as the machine type, but it is advisable if you want
to use PCIe.

----
# qm set VMID -machine q35,viommu=virtio
----

https://web.archive.org/web/20230804075844/https://michael2012z.medium.com/virtio-iommu-789369049443[Blog post by Michael Zhao explaining virtio-iommu]

ifdef::wiki[]

See Also
~~~~~~~~

* link:/wiki/Pci_passthrough[PCI Passthrough Examples]

endif::wiki[]