[[qm_pci_passthrough]]
PCI(e) Passthrough
------------------
ifdef::wiki[]
:pve-toplevel:
endif::wiki[]

PCI(e) passthrough is a mechanism to give a virtual machine control over
a PCI device from the host. This can have some advantages over using
virtualized hardware, for example lower latency, higher performance, or more
features (e.g., offloading).

However, if you pass through a device to a virtual machine, you can no longer
use that device on the host or in any other VM.

General Requirements
~~~~~~~~~~~~~~~~~~~~

Since passthrough also requires hardware support, there are some requirements
to check and preparations to make before it works.


Hardware
^^^^^^^^
Your hardware needs to support `IOMMU` (*I*/*O* **M**emory **M**anagement
**U**nit) interrupt remapping. This includes the CPU and the mainboard.

Generally, Intel systems with VT-d and AMD systems with AMD-Vi support this.
However, it is not guaranteed that everything will work out of the box, due
to bad hardware implementations and missing or low-quality drivers.

Furthermore, server-grade hardware often has better support than
consumer-grade hardware, but even then, many modern systems can support this.

Please refer to your hardware vendor to check if they support this feature
under Linux for your specific setup.


Configuration
^^^^^^^^^^^^^

Once you have ensured that your hardware supports passthrough, you will need
to do some configuration to enable PCI(e) passthrough.

.IOMMU

First, you have to enable IOMMU support in your BIOS/UEFI. Usually the
corresponding setting is called `IOMMU` or `VT-d`, but you should find the
exact option name in the manual of your motherboard.

For Intel CPUs, you may also need to enable the IOMMU on the
xref:sysboot_edit_kernel_cmdline[kernel command line] for older (pre-5.15)
kernels by adding:

----
 intel_iommu=on
----

For AMD CPUs it should be enabled automatically.

.IOMMU Passthrough Mode

If your hardware supports IOMMU passthrough mode, enabling this mode might
increase performance.
This is because VMs then bypass the (default) DMA translation normally
performed by the hypervisor and instead pass DMA requests directly to the
hardware IOMMU. To enable this option, add:

----
 iommu=pt
----

to the xref:sysboot_edit_kernel_cmdline[kernel command line].
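
After a reboot, it can be useful to confirm that the parameters actually
reached the running kernel. The following is a minimal sketch, not part of
{pve}: the function name `has_cmdline_param` is made up for illustration, and
the optional second argument exists only so the check can be tried against a
sample file instead of `/proc/cmdline`.

```shell
# has_cmdline_param PARAM [FILE]   (hypothetical helper, not a PVE tool)
# Succeeds if PARAM appears as a whole word in the kernel command line.
# FILE defaults to /proc/cmdline.
has_cmdline_param() {
    # Put every word on its own line, then look for an exact, literal match.
    # Matching whole lines keeps "iommu=on" from matching "intel_iommu=on".
    tr -s ' \t' '\n\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}
```

For example, `has_cmdline_param iommu=pt && echo "passthrough mode requested"`
prints the message only if `iommu=pt` is present on the command line of the
running kernel.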

.Kernel Modules

You have to make sure the following modules are loaded. This can be achieved by
adding them to `/etc/modules`:

----
 vfio
 vfio_iommu_type1
 vfio_pci
 vfio_virqfd
----

[[qm_pci_passthrough_update_initramfs]]
After changing anything related to kernel modules, you need to refresh your
`initramfs`. On {pve} this can be done by executing:

----
# update-initramfs -u -k all
----
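
To see which of the modules listed above are still missing from
`/etc/modules`, a small helper along the following lines may help. It is only
a sketch: the name `missing_modules` is invented here, and it compares module
names literally, so a module written with hyphens instead of underscores would
be reported as missing.

```shell
# missing_modules FILE MODULE...   (hypothetical helper, not a PVE tool)
# Prints every MODULE that has no uncommented line of its own in FILE.
missing_modules() {
    file=$1
    shift
    for mod in "$@"; do
        # The name must be the first word on a line. Lines starting with
        # '#' are comments and therefore never match.
        grep -Eq "^[[:space:]]*${mod}([[:space:]]|\$)" "$file" || echo "$mod"
    done
}
```

Calling `missing_modules /etc/modules vfio vfio_iommu_type1 vfio_pci vfio_virqfd`
prints nothing once all four entries are in place.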

.Finish Configuration

Finally, reboot to bring the changes into effect and check that the IOMMU is
indeed enabled.

----
# dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
----

The output should show that `IOMMU`, `Directed I/O` or `Interrupt Remapping`
is enabled. The exact message can vary depending on hardware and kernel.

It is also important that the device(s) you want to pass through
are in a *separate* `IOMMU` group. This can be checked with:

----
# find /sys/kernel/iommu_groups/ -type l
----

It is okay if the device is in an `IOMMU` group together with its functions
(e.g. a GPU with the HDMI Audio device) or with its root port or PCI(e) bridge.
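
The raw `find` output can be hard to read on systems with many devices. As a
rough sketch, the same information can be printed as one `GROUP ADDRESS` pair
per line, sorted by group number. The function name `list_iommu_groups` and
its optional root argument are assumptions made for this example only.

```shell
# list_iommu_groups [ROOT]   (hypothetical helper, not a PVE tool)
# Prints "GROUP ADDRESS" for every device entry below ROOT, sorted by group.
# ROOT defaults to /sys/kernel/iommu_groups.
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        # Skip the unexpanded pattern if there are no groups at all.
        [ -e "$dev" ] || [ -L "$dev" ] || continue
        group=${dev#"$root"/}    # strip the leading ROOT/
        group=${group%%/*}       # keep only the group number
        printf '%s %s\n' "$group" "${dev##*/}"
    done | sort -n
}
```

Devices that share a group with the one you want to pass through then appear
on adjacent lines with the same group number.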

.PCI(e) slots
[NOTE]
====
Some platforms handle their physical PCI(e) slots differently. So, if you do
not get the desired `IOMMU` group separation, it can sometimes help to put the
card in another PCI(e) slot.
====

.Unsafe interrupts
[NOTE]
====
For some platforms, it may be necessary to allow unsafe interrupts.
To do this, add the following line to a file ending with `.conf` in
*/etc/modprobe.d/*:

----
 options vfio_iommu_type1 allow_unsafe_interrupts=1
----

Please be aware that this option can make your system unstable.
====

GPU Passthrough Notes
^^^^^^^^^^^^^^^^^^^^^

It is not possible to display the frame buffer of the GPU via NoVNC or SPICE on
the {pve} web interface.

When passing through a whole GPU or a vGPU and graphic output is wanted, one
has to either physically connect a monitor to the card, or configure a remote
desktop software (for example, VNC or RDP) inside the guest.

If you want to use the GPU as a hardware accelerator, for example, for
programs using OpenCL or CUDA, this is not required.

Host Device Passthrough
~~~~~~~~~~~~~~~~~~~~~~~

The most used variant of PCI(e) passthrough is to pass through a whole
PCI(e) card, for example a GPU or a network card.


Host Configuration
^^^^^^^^^^^^^^^^^^

In this case, the host must not use the card. There are two methods to achieve
this:

* pass the device IDs to the options of the 'vfio-pci' module by adding
+
----
 options vfio-pci ids=1234:5678,4321:8765
----
+
to a .conf file in */etc/modprobe.d/* where `1234:5678` and `4321:8765` are
the vendor and device IDs obtained by:
+
----
# lspci -nn
----

* blacklist the driver completely on the host, ensuring that it is free to bind
for passthrough, with
+
----
 blacklist DRIVERNAME
----
+
in a .conf file in */etc/modprobe.d/*.

For both methods you need to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`] again and
reboot after that.
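
For the first method, the `options` line can be derived from the `lspci -nn`
output instead of being typed by hand. The sketch below assumes that the
vendor and device ID is the last `[xxxx:xxxx]` field on each line, which is
how `lspci -nn` normally prints it. The function name `vfio_ids_line` is
hypothetical.

```shell
# vfio_ids_line SLOT   (hypothetical helper, not a PVE tool)
# Reads `lspci -nn` output on stdin and prints an "options vfio-pci ids=..."
# line covering every function whose address starts with SLOT, e.g. "01:00".
vfio_ids_line() {
    ids=$(awk -v slot="$1" 'index($0, slot) == 1' |
        # The greedy ".*" makes sed pick the last [vendor:device] pair.
        sed -n 's/.*\[\([0-9a-f]\{4\}:[0-9a-f]\{4\}\)\].*/\1/p' |
        sort -u | paste -sd, -)
    if [ -n "$ids" ]; then
        printf 'options vfio-pci ids=%s\n' "$ids"
    fi
}
```

It could be used as `lspci -nn | vfio_ids_line 01:00`, with the printed line
then reviewed and copied into a .conf file in */etc/modprobe.d/*.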

.Verify Configuration

To check if your changes were successful, you can use

----
# lspci -nnk
----

and check your device entry. If it says

----
Kernel driver in use: vfio-pci
----

or the 'in use' line is missing entirely, the device is ready to be used for
passthrough.
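
The same check can be scripted. The following sketch reads the `lspci -nnk`
entry of a single device on standard input and applies the rule described
above. Both function names are invented for this example.

```shell
# driver_in_use   (hypothetical helper, not a PVE tool)
# Reads the `lspci -nnk` entry of one device on stdin and prints the bound
# driver, or "none" if there is no "Kernel driver in use" line.
driver_in_use() {
    awk -F': ' '
        /Kernel driver in use/ { d = $2 }
        END { print (d == "" ? "none" : d) }'
}

# ready_for_passthrough   (hypothetical helper)
# Succeeds if the device is bound to vfio-pci or to no driver at all.
ready_for_passthrough() {
    case $(driver_in_use) in
        vfio-pci|none) return 0 ;;
        *) return 1 ;;
    esac
}
```

For a device at address `01:00.0`, this could be called as
`lspci -nnk -s 01:00.0 | ready_for_passthrough && echo ready`.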

[[qm_pci_passthrough_vm_config]]
VM Configuration
^^^^^^^^^^^^^^^^
To pass through the device you need to set the *hostpciX* option in the VM
configuration, for example by executing:

----
# qm set VMID -hostpci0 00:02.0
----

If your device has multiple functions (e.g., `00:02.0` and `00:02.1`),
you can pass them through all together with the shortened syntax `00:02`.
This is equivalent to checking the `All Functions` checkbox in the
web interface.

There are some options which may be necessary, depending on the device
and guest OS:

* *x-vga=on|off* marks the PCI(e) device as the primary GPU of the VM.
With this enabled the *vga* configuration option will be ignored.

* *pcie=on|off* tells {pve} to use a PCIe or PCI port. Some guest/device
combinations require PCIe rather than PCI. PCIe is only available for 'q35'
machine types.

* *rombar=on|off* makes the firmware ROM visible for the guest. Default is on.
Some PCI(e) devices need this disabled.

* *romfile=<path>* is an optional path to a ROM file for the device to use.
This is a relative path under */usr/share/kvm/*.

.Example

An example of PCIe passthrough with a GPU set to primary:

----
# qm set VMID -hostpci0 02:00,pcie=on,x-vga=on
----

.PCI ID overrides

You can override the PCI vendor ID, device ID, and subsystem IDs that will be
seen by the guest. This is useful if your device is a variant with an ID that
your guest's drivers don't recognize, but you want to force those drivers to be
loaded anyway (e.g. if you know your device shares the same chipset as a
supported variant).

The available options are `vendor-id`, `device-id`, `sub-vendor-id`, and
`sub-device-id`. You can set any or all of these to override your device's
default IDs.

For example:

----
# qm set VMID -hostpci0 02:00,device-id=0x10f6,sub-vendor-id=0x0000
----


Other considerations
^^^^^^^^^^^^^^^^^^^^

When passing through a GPU, the best compatibility is reached when using
'q35' as machine type, 'OVMF' ('EFI' for VMs) instead of SeaBIOS and PCIe
instead of PCI. Note that if you want to use 'OVMF' for GPU passthrough, the
GPU needs to have an EFI capable ROM, otherwise use SeaBIOS instead.

SR-IOV
~~~~~~

Another variant for passing through PCI(e) devices is to use the hardware
virtualization features of your devices, if available.

'SR-IOV' (**S**ingle-**R**oot **I**nput/**O**utput **V**irtualization) enables
a single device to provide multiple 'VF' (**V**irtual **F**unctions) to the
system. Each of those 'VF' can be used in a different VM, with full hardware
features and also better performance and lower latency than software
virtualized devices.

Currently, the most common use case for this are NICs (**N**etwork
**I**nterface **C**ard) with SR-IOV support, which can provide multiple VFs per
physical port. This allows features such as checksum offloading to be used
inside a VM, reducing the (host) CPU overhead.


Host Configuration
^^^^^^^^^^^^^^^^^^

Generally, there are two methods for enabling virtual functions on a device.

* sometimes there is an option for the driver module, e.g. for some
Intel drivers
+
----
 max_vfs=4
----
+
which could be put in a file with '.conf' ending under */etc/modprobe.d/*.
(Do not forget to update your initramfs after that)
+
Please refer to your driver module documentation for the exact
parameters and options.

* The second, more generic, approach is using the `sysfs`.
If a device and driver support this, you can change the number of VFs on
the fly. For example, to set up 4 VFs on device 0000:01:00.0 execute:
+
----
# echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
----
+
To make this change persistent you can use the `sysfsutils` Debian package.
After installation configure it via */etc/sysfs.conf* or a `FILE.conf` in
*/etc/sysfs.d/*.
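
For the `sysfs` approach, it can be worth checking the requested number
against the `sriov_totalvfs` attribute, which sits next to `sriov_numvfs` and
reports the maximum the device supports. The following is a minimal sketch of
such a guard. The function name `set_numvfs` is made up, and the device
directory is passed as an argument so that the function can be tried on a
scratch directory first.

```shell
# set_numvfs DEVDIR COUNT   (hypothetical helper, not a PVE tool)
# Writes COUNT to DEVDIR/sriov_numvfs, but only if DEVDIR/sriov_totalvfs
# exists and COUNT does not exceed the value stored there.
set_numvfs() {
    dev=$1
    count=$2
    if [ ! -r "$dev/sriov_totalvfs" ]; then
        echo "$dev: no SR-IOV attributes found" >&2
        return 1
    fi
    total=$(cat "$dev/sriov_totalvfs")
    if [ "$count" -gt "$total" ]; then
        echo "$dev: requested $count VFs, at most $total supported" >&2
        return 1
    fi
    echo "$count" > "$dev/sriov_numvfs"
}
```

A call such as `set_numvfs /sys/bus/pci/devices/0000:01:00.0 4` then has the
same effect as the `echo` command above. Note that on real hardware the kernel
may refuse to change an already non-zero VF count directly; in that case,
setting it to 0 first is the usual approach.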

VM Configuration
^^^^^^^^^^^^^^^^

After creating VFs, you should see them as separate PCI(e) devices when
outputting them with `lspci`. Get their ID and pass them through like a
xref:qm_pci_passthrough_vm_config[normal PCI(e) device].

Other considerations
^^^^^^^^^^^^^^^^^^^^

For this feature, platform support is especially important. It may be necessary
to enable this feature in the BIOS/EFI first, or to use a specific PCI(e) port
for it to work. If in doubt, consult the manual of the platform or contact its
vendor.

Mediated Devices (vGPU, GVT-g)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mediated devices are another method to reuse features and performance from
physical hardware for virtualized hardware. These are most commonly found in
virtualized GPU setups such as Intel's GVT-g and NVIDIA's vGPUs used in their
GRID technology.

With this, a physical card is able to create virtual cards, similar to SR-IOV.
The difference is that mediated devices do not appear as PCI(e) devices in the
host, and as such are only suited for use in virtual machines.


Host Configuration
^^^^^^^^^^^^^^^^^^

In general, your card's driver must support that feature, otherwise it will
not work. So please refer to your vendor for compatible drivers and how to
configure them.

Intel's drivers for GVT-g are integrated in the kernel and should work
with 5th, 6th and 7th generation Intel Core Processors, as well as E3 v4, E3
v5 and E3 v6 Xeon Processors.

To enable it for Intel Graphics, you have to make sure to load the module
'kvmgt' (for example via `/etc/modules`) and to enable it on the
xref:sysboot_edit_kernel_cmdline[kernel command line] by adding the following
parameter:

----
 i915.enable_gvt=1
----

After that remember to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`],
and reboot your host.

VM Configuration
^^^^^^^^^^^^^^^^

To use a mediated device, simply specify the `mdev` property on a `hostpciX`
VM configuration option.

You can get the supported devices via the 'sysfs'. For example, to list the
supported types for the device '0000:00:02.0' you would simply execute:

----
# ls /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types
----

Each entry is a directory which contains the following important files:

* 'available_instances' contains the number of still available instances of
this type; each 'mdev' used in a VM reduces this.
* 'description' contains a short description of the capabilities of the type
* 'create' is the endpoint to create such a device. {pve} does this
automatically for you if a 'hostpciX' option with `mdev` is configured.
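
To get an overview of all types and their remaining instances at once, the
files described above can be read in a loop. This is only a sketch, and the
function name `list_mdev_types` is an assumption made for this example.

```shell
# list_mdev_types DEVDIR   (hypothetical helper, not a PVE tool)
# Prints "TYPE AVAILABLE" for each entry in DEVDIR/mdev_supported_types.
list_mdev_types() {
    for type in "$1"/mdev_supported_types/*; do
        # Skip entries without a readable counter, including the unexpanded
        # pattern when the device offers no mediated types.
        [ -r "$type/available_instances" ] || continue
        printf '%s %s\n' "${type##*/}" "$(cat "$type/available_instances")"
    done
}
```

For the device used above, the call would be
`list_mdev_types /sys/bus/pci/devices/0000:00:02.0`. A type that shows 0
available instances cannot be assigned to another VM at that moment.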

Example configuration with an `Intel GVT-g vGPU` (`Intel Skylake 6700k`):

----
# qm set VMID -hostpci0 00:02.0,mdev=i915-GVTg_V5_4
----

With this set, {pve} automatically creates such a device on VM start, and
cleans it up again when the VM stops.

ifdef::wiki[]

See Also
~~~~~~~~

* link:/wiki/Pci_passthrough[PCI Passthrough Examples]

endif::wiki[]