[[qm_pci_passthrough]]
PCI(e) Passthrough
------------------
ifdef::wiki[]
:pve-toplevel:
endif::wiki[]

PCI(e) passthrough is a mechanism to give a virtual machine control over
a PCI device from the host. This can have some advantages over using
virtualized hardware, for example lower latency, higher performance, or more
features (e.g., offloading).

However, if you pass a device through to a virtual machine, you can no longer
use that device on the host or in any other VM.

General Requirements
~~~~~~~~~~~~~~~~~~~~

Because passthrough also depends on hardware support, there are some
requirements to check and some preparations to make before it works.


Hardware
^^^^^^^^
Your hardware needs to support `IOMMU` (*I*/*O* **M**emory **M**anagement
**U**nit) interrupt remapping. This includes the CPU and the mainboard.

Generally, Intel systems with VT-d, and AMD systems with AMD-Vi support this.
But it is not guaranteed that everything will work out of the box, due
to bad hardware implementations and missing or low-quality drivers.

Further, server-grade hardware often has better support than consumer-grade
hardware, but even then, many modern systems can support this.

Please refer to your hardware vendor to check if they support this feature
under Linux for your specific setup.


Configuration
^^^^^^^^^^^^^

Once you have ensured that your hardware supports passthrough, you will need
to do some configuration to enable PCI(e) passthrough.


.IOMMU

First, the IOMMU support has to be enabled in your BIOS/UEFI. Most often, that
option is named `IOMMU` or `VT-d`, but check the manual for your motherboard
for the exact option you need to enable.

Then, the IOMMU might need to be activated on the
xref:sysboot_edit_kernel_cmdline[kernel commandline].
(On newer kernels, this should not be necessary.)

The command line parameters are:

* for Intel CPUs:
+
----
 intel_iommu=on
----
* for AMD CPUs it should be enabled automatically.


If your hardware supports it, enabling IOMMU passthrough mode might increase
performance, because VMs then bypass the (default) DMA translation that is
normally performed by the hypervisor and instead pass DMA requests directly to
the hardware IOMMU. You can enable it by adding

----
 iommu.passthrough=1
----

or

----
 iommu=pt
----

to the kernel commandline.

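After a reboot, you can confirm that a parameter actually reached the running
kernel by looking at `/proc/cmdline`. The following is only a minimal sketch:
the function name `has_cmdline_param` and its optional file argument are
assumptions made for illustration and are not part of {pve}.

```shell
# Return success if the given parameter appears as a whole word on the
# kernel command line. The optional second argument (hypothetical, mainly
# useful for testing) overrides the file to read; default is /proc/cmdline.
has_cmdline_param() {
    # Put every parameter on its own line, then look for an exact match.
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}
```

For example, `has_cmdline_param iommu=pt && echo "passthrough mode requested"`
prints the message only if `iommu=pt` was passed to the running kernel.
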
.Kernel Modules

You have to make sure the following modules are loaded. This can be achieved by
adding them to `'/etc/modules'':

----
 vfio
 vfio_iommu_type1
 vfio_pci
 vfio_virqfd
----

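To see which of these modules are still missing from the file, a small helper
along the following lines could be used. This is a sketch under the assumption
that each module is listed as the first word on its own line; the function
name `missing_modules` is made up for this example.

```shell
# Print every module from the argument list that is not listed in the
# given modules file. Comment lines and blank lines never match.
# $1 = file to inspect (normally /etc/modules), remaining args = modules
missing_modules() {
    mm_file=$1
    shift
    for mm_name in "$@"; do
        # A module counts as listed if it is the first word on some line.
        if ! awk -v m="$mm_name" '$1 == m { found = 1 } END { exit !found }' "$mm_file"; then
            printf '%s\n' "$mm_name"
        fi
    done
}
```

Calling `missing_modules /etc/modules vfio vfio_iommu_type1 vfio_pci vfio_virqfd`
prints nothing when all four modules are already listed.
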
[[qm_pci_passthrough_update_initramfs]]
After changing anything related to kernel modules, you need to refresh your
`initramfs`. On {pve} this can be done by executing:

----
# update-initramfs -u -k all
----

.Finish Configuration

Finally, reboot to bring the changes into effect and check that the IOMMU is
indeed enabled. The output of

----
# dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
----

should show that `IOMMU`, `Directed I/O` or `Interrupt Remapping` is
enabled. The exact message varies depending on hardware and kernel.

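Because the kernel messages differ between platforms, a complementary check is
to see whether the kernel has created any IOMMU groups at all. The sketch
below assumes that a populated `/sys/kernel/iommu_groups` directory indicates
an active IOMMU; the function name `iommu_groups_present` is hypothetical.

```shell
# Return success if the IOMMU groups directory contains at least one group.
# The optional argument (hypothetical, mainly useful for testing) overrides
# the directory, which defaults to /sys/kernel/iommu_groups.
iommu_groups_present() {
    ig_dir=${1:-/sys/kernel/iommu_groups}
    for ig_entry in "$ig_dir"/*; do
        # An unmatched glob stays literal and is not a directory.
        [ -d "$ig_entry" ] && return 0
    done
    return 1
}
```

A possible use is `iommu_groups_present || echo "no IOMMU groups found"`.
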
It is also important that the device(s) you want to pass through
are in a *separate* `IOMMU` group. This can be checked with:

----
# find /sys/kernel/iommu_groups/ -type l
----

It is okay if the device is in an `IOMMU` group together with its functions
(e.g. a GPU with the HDMI Audio device) or with its root port or PCI(e) bridge.

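The raw `find` output can be hard to read on systems with many devices. As an
illustration, the following sketch prints one line per device, sorted by group
number. The function name `list_iommu_groups` and its optional directory
argument are assumptions for this example.

```shell
# Print one line per device in the form "group <N>: <PCI address>".
# The optional argument (hypothetical, mainly useful for testing) overrides
# the groups directory, which defaults to /sys/kernel/iommu_groups.
list_iommu_groups() {
    lg_root=${1:-/sys/kernel/iommu_groups}
    for lg_dev in "$lg_root"/*/devices/*; do
        [ -e "$lg_dev" ] || continue        # skip the unmatched glob pattern
        lg_group=${lg_dev%/devices/*}       # strip "/devices/<address>"
        printf 'group %s: %s\n' "${lg_group##*/}" "${lg_dev##*/}"
    done | sort -k2,2n -k3                  # by group number, then address
}
```

Devices that share a group number in this listing can only be passed through
together.
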
.PCI(e) slots
[NOTE]
====
Some platforms handle their physical PCI(e) slots differently. So, sometimes
it can help to put the card in another PCI(e) slot, if you do not get the
desired `IOMMU` group separation.
====

.Unsafe interrupts
[NOTE]
====
For some platforms, it may be necessary to allow unsafe interrupts.
To do so, add the following line to a file ending with `.conf' in
*/etc/modprobe.d/*:

----
 options vfio_iommu_type1 allow_unsafe_interrupts=1
----

Please be aware that this option can make your system unstable.
====

GPU Passthrough Notes
^^^^^^^^^^^^^^^^^^^^^

It is not possible to display the frame buffer of the GPU via NoVNC or SPICE on
the {pve} web interface.

When passing through a whole GPU or a vGPU and graphical output is wanted, one
has to either physically connect a monitor to the card, or configure remote
desktop software (for example, VNC or RDP) inside the guest.

If you want to use the GPU as a hardware accelerator, for example, for
programs using OpenCL or CUDA, this is not required.

Host Device Passthrough
~~~~~~~~~~~~~~~~~~~~~~~

The most used variant of PCI(e) passthrough is to pass through a whole
PCI(e) card, for example a GPU or a network card.


Host Configuration
^^^^^^^^^^^^^^^^^^

In this case, the host must not use the card. There are two methods to achieve
this:

* pass the device IDs to the options of the 'vfio-pci' module by adding
+
----
 options vfio-pci ids=1234:5678,4321:8765
----
+
to a .conf file in */etc/modprobe.d/* where `1234:5678` and `4321:8765` are
the vendor and device IDs obtained by:
+
----
# lspci -nn
----

* blacklist the driver completely on the host, ensuring that it is free to bind
for passthrough, with
+
----
 blacklist DRIVERNAME
----
+
in a .conf file in */etc/modprobe.d/*.

For both methods you need to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`] again and
reboot after that.

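For the first method, the `options` line can be derived from the `lspci -nn`
output instead of being typed by hand. The following sketch relies on
`lspci -nn` printing the vendor and device ID as `[vvvv:dddd]` on each device
line. The function name `vfio_ids_line` and the slot `01:00` are only
examples.

```shell
# Read `lspci -nn` output on stdin and print an "options vfio-pci" line
# covering every function whose address starts with the given prefix.
# $1 = address prefix, e.g. 01:00 to match 01:00.0 and 01:00.1
vfio_ids_line() {
    awk -v prefix="$1" '
        index($1, prefix) == 1 {
            # The class code such as [0300] has no colon, so the first
            # bracketed xxxx:xxxx pair on the line is the vendor:device ID.
            if (match($0, /\[[0-9a-f][0-9a-f][0-9a-f][0-9a-f]:[0-9a-f][0-9a-f][0-9a-f][0-9a-f]\]/)) {
                id = substr($0, RSTART + 1, 9)
                ids = (ids == "" ? id : ids "," id)
            }
        }
        END { if (ids != "") print "options vfio-pci ids=" ids }
    '
}
```

It could be used as `lspci -nn | vfio_ids_line 01:00`. Review the printed line
before placing it in a .conf file under */etc/modprobe.d/*.
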
.Verify Configuration

To check if your changes were successful, you can use

----
# lspci -nnk
----

and check your device entry. If it says

----
Kernel driver in use: vfio-pci
----

or the 'in use' line is missing entirely, the device is ready to be used for
passthrough.

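When many devices are installed, this check can be narrowed to a single slot.
The sketch below reads `lspci -nnk` output and prints the driver bound to one
device; the function name `driver_in_use` is an assumption for this example.
Empty output corresponds to the case where the 'in use' line is missing.

```shell
# Read `lspci -nnk` output on stdin and print the driver currently bound
# to the device at the given address. Prints nothing if no driver is bound.
# $1 = device address as printed by lspci, e.g. 01:00.0
driver_in_use() {
    awk -v slot="$1" '
        # Device lines start in column one; detail lines are indented.
        /^[0-9a-f]/ { current = $1 }
        current == slot && /Kernel driver in use:/ { print $NF }
    '
}
```

For example, `lspci -nnk | driver_in_use 01:00.0` should print `vfio-pci` once
the device has been claimed for passthrough.
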
[[qm_pci_passthrough_vm_config]]
VM Configuration
^^^^^^^^^^^^^^^^
To pass through the device you need to set the *hostpciX* option in the VM
configuration, for example by executing:

----
# qm set VMID -hostpci0 00:02.0
----

If your device has multiple functions (e.g., ``00:02.0`' and ``00:02.1`'),
you can pass them through all together with the shortened syntax ``00:02`'.
This is equivalent to checking the ``All Functions`' checkbox in the
web-interface.

There are some options which may be necessary, depending on the device
and guest OS:

* *x-vga=on|off* marks the PCI(e) device as the primary GPU of the VM.
With this enabled the *vga* configuration option will be ignored.

* *pcie=on|off* tells {pve} to use a PCIe or PCI port. Some guest/device
combinations require PCIe rather than PCI. PCIe is only available for 'q35'
machine types.

* *rombar=on|off* makes the firmware ROM visible for the guest. Default is on.
Some PCI(e) devices need this disabled.

* *romfile=<path>* is an optional path to a ROM file for the device to use.
This is a relative path under */usr/share/kvm/*.

.Example

An example of PCIe passthrough with a GPU set to primary:

----
# qm set VMID -hostpci0 02:00,pcie=on,x-vga=on
----

.PCI ID overrides

You can override the PCI vendor ID, device ID, and subsystem IDs that will be
seen by the guest. This is useful if your device is a variant with an ID that
your guest's drivers don't recognize, but you want to force those drivers to be
loaded anyway (e.g. if you know your device shares the same chipset as a
supported variant).

The available options are `vendor-id`, `device-id`, `sub-vendor-id`, and
`sub-device-id`. You can set any or all of these to override your device's
default IDs.

For example:

----
# qm set VMID -hostpci0 02:00,device-id=0x10f6,sub-vendor-id=0x0000
----


Other considerations
^^^^^^^^^^^^^^^^^^^^

When passing through a GPU, the best compatibility is reached when using
'q35' as machine type, 'OVMF' ('EFI' for VMs) instead of SeaBIOS and PCIe
instead of PCI. Note that if you want to use 'OVMF' for GPU passthrough, the
GPU needs to have an EFI-capable ROM. Otherwise, use SeaBIOS instead.

SR-IOV
~~~~~~

Another variant for passing through PCI(e) devices is to use the hardware
virtualization features of your devices, if available.

'SR-IOV' (**S**ingle-**R**oot **I**nput/**O**utput **V**irtualization) enables
a single device to provide multiple 'VF' (**V**irtual **F**unctions) to the
system. Each of those 'VF' can be used in a different VM, with full hardware
features and also better performance and lower latency than software
virtualized devices.

Currently, the most common use case for this is NICs (**N**etwork
**I**nterface **C**ard) with SR-IOV support, which can provide multiple VFs per
physical port. This allows features such as checksum offloading to be used
inside a VM, reducing the (host) CPU overhead.


Host Configuration
^^^^^^^^^^^^^^^^^^

Generally, there are two methods for enabling virtual functions on a device.

* sometimes there is an option for the driver module, e.g. for some
Intel drivers
+
----
 max_vfs=4
----
+
which can be put in a file with a '.conf' ending under */etc/modprobe.d/*.
(Do not forget to update your initramfs after that.)
+
Please refer to your driver module documentation for the exact
parameters and options.

* The second, more generic, approach is using the `sysfs`.
If a device and driver support this, you can change the number of VFs on
the fly. For example, to set up 4 VFs on device 0000:01:00.0 execute:
+
----
# echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
----
+
To make this change persistent you can use the `sysfsutils` Debian package.
After installation configure it via */etc/sysfs.conf* or a `FILE.conf' in
*/etc/sysfs.d/*.

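For the `sysfs` approach, it can help to validate the requested number against
what the device reports before writing it. The following sketch assumes that
the device exposes the standard `sriov_totalvfs` attribute next to
`sriov_numvfs`; the function name `set_numvfs` is made up for this example.

```shell
# Set the number of VFs for a device, after checking the requested count
# against the maximum the device reports in sriov_totalvfs.
# $1 = device directory, e.g. /sys/bus/pci/devices/0000:01:00.0
# $2 = requested number of VFs
set_numvfs() {
    sv_dev=$1
    sv_want=$2
    if [ ! -r "$sv_dev/sriov_totalvfs" ]; then
        echo "no SR-IOV attributes found in $sv_dev" >&2
        return 1
    fi
    sv_max=$(cat "$sv_dev/sriov_totalvfs")
    if [ "$sv_want" -gt "$sv_max" ]; then
        echo "requested $sv_want VFs, device supports at most $sv_max" >&2
        return 1
    fi
    # The kernel rejects a change from one non-zero value to another,
    # so reset the count to 0 first.
    echo 0 > "$sv_dev/sriov_numvfs" || return 1
    echo "$sv_want" > "$sv_dev/sriov_numvfs"
}
```

Run as root, `set_numvfs /sys/bus/pci/devices/0000:01:00.0 4` corresponds to
the `echo` command shown above. Note that resetting the count to 0 removes any
VFs that already exist, so do not run it while VMs are using them.
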
VM Configuration
^^^^^^^^^^^^^^^^

After creating VFs, you should see them as separate PCI(e) devices when
listing them with `lspci`. Get their ID and pass them through like a
xref:qm_pci_passthrough_vm_config[normal PCI(e) device].

Other considerations
^^^^^^^^^^^^^^^^^^^^

For this feature, platform support is especially important. It may be necessary
to enable this feature in the BIOS/EFI first, or to use a specific PCI(e) port
for it to work. If in doubt, consult the manual of the platform or contact its
vendor.

Mediated Devices (vGPU, GVT-g)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mediated devices are another method to reuse features and performance from
physical hardware for virtualized hardware. These are most commonly found in
virtualized GPU setups such as Intel's GVT-g and NVIDIA's vGPUs used in their
GRID technology.

With this, a physical card is able to create virtual cards, similar to SR-IOV.
The difference is that mediated devices do not appear as PCI(e) devices in the
host, and are as such only suited for use in virtual machines.


Host Configuration
^^^^^^^^^^^^^^^^^^

In general, your card's driver must support that feature, otherwise it will
not work. So please refer to your vendor for compatible drivers and how to
configure them.

Intel's drivers for GVT-g are integrated in the kernel and should work
with 5th, 6th and 7th generation Intel Core Processors, as well as E3 v4, E3
v5 and E3 v6 Xeon Processors.

To enable it for Intel Graphics, you have to make sure to load the module
'kvmgt' (for example via `/etc/modules`) and to add the following parameter to
the xref:sysboot_edit_kernel_cmdline[kernel commandline]:

----
 i915.enable_gvt=1
----

After that remember to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`],
and reboot your host.

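After the reboot, you may want to confirm that the module was actually loaded.
The sketch below looks for a module name in `/proc/modules`. The function name
`module_loaded` and its optional file argument are assumptions for this
example.

```shell
# Return success if the named module appears in the loaded-module list.
# The optional second argument (hypothetical, mainly useful for testing)
# overrides the file to read; it defaults to /proc/modules.
module_loaded() {
    # The module name is the first field of each line in /proc/modules.
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}
```

For example, `module_loaded kvmgt || echo "kvmgt is not loaded"`.
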
VM Configuration
^^^^^^^^^^^^^^^^

To use a mediated device, simply specify the `mdev` property on a `hostpciX`
VM configuration option.

You can get the supported devices via the 'sysfs'. For example, to list the
supported types for the device '0000:00:02.0' you would simply execute:

----
# ls /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types
----

Each entry is a directory which contains the following important files:

* 'available_instances' contains the number of instances of this type that are
still available; each 'mdev' in use by a VM reduces this.
* 'description' contains a short description of the capabilities of the type
* 'create' is the endpoint to create such a device, {pve} does this
automatically for you, if a 'hostpciX' option with `mdev` is configured.

Example configuration with an `Intel GVT-g vGPU` (`Intel Skylake 6700k`):

----
# qm set VMID -hostpci0 00:02.0,mdev=i915-GVTg_V5_4
----

With this set, {pve} automatically creates such a device on VM start, and
cleans it up again when the VM stops.

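To see at a glance which types still have free instances, the
'available_instances' files described above can be read in a loop. This is a
minimal sketch; the function name `list_mdev_types` is hypothetical.

```shell
# Print "<type> <available instances>" for every mediated device type
# that a device offers.
# $1 = device directory, e.g. /sys/bus/pci/devices/0000:00:02.0
list_mdev_types() {
    for mt_dir in "$1"/mdev_supported_types/*; do
        # Skip the unmatched glob pattern and entries without the file.
        [ -r "$mt_dir/available_instances" ] || continue
        printf '%s %s\n' "${mt_dir##*/}" "$(cat "$mt_dir/available_instances")"
    done
}
```

For the device from the example above, this would be called as
`list_mdev_types /sys/bus/pci/devices/0000:00:02.0`.
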
ifdef::wiki[]

See Also
~~~~~~~~

* link:/wiki/Pci_passthrough[PCI Passthrough Examples]

endif::wiki[]