[[qm_pci_passthrough]]
PCI(e) Passthrough
------------------

PCI(e) passthrough is a mechanism to give a virtual machine control over
a PCI device from the host. This can have some advantages over using
virtualized hardware, for example lower latency, higher performance, or more
features (e.g., offloading).

But, if you pass through a device to a virtual machine, you cannot use that
device anymore on the host or in any other VM.

General Requirements
~~~~~~~~~~~~~~~~~~~~

Since passthrough is a feature which also needs hardware support, there are
some requirements to check and preparations to be done to make it work.


Hardware
^^^^^^^^
Your hardware needs to support `IOMMU` (*I*/*O* **M**emory **M**anagement
**U**nit) interrupt remapping. This includes the CPU and the mainboard.

Generally, Intel systems with VT-d and AMD systems with AMD-Vi support this.
However, it is not guaranteed that everything will work out of the box, due
to bad hardware implementations and missing or low-quality drivers.

Further, server-grade hardware often has better support than consumer-grade
hardware, but even then, many modern systems can support this.

Please refer to your hardware vendor to check if they support this feature
under Linux for your specific setup.


Configuration
^^^^^^^^^^^^^

Once you have ensured that your hardware supports passthrough, you will need
to do some configuration to enable PCI(e) passthrough.


.IOMMU

The IOMMU has to be activated on the kernel command line. The easiest way is
to enable it through GRUB. Edit `/etc/default/grub` and add the following to
the 'GRUB_CMDLINE_LINUX_DEFAULT' variable:

* for Intel CPUs:
+
----
 intel_iommu=on
----
* for AMD CPUs:
+
----
 amd_iommu=on
----

[[qm_pci_passthrough_update_grub]]
To bring this change into effect, make sure you run:

----
# update-grub
----

.Kernel Modules

You have to make sure the following modules are loaded. This can be achieved by
adding them to `/etc/modules`:

----
 vfio
 vfio_iommu_type1
 vfio_pci
 vfio_virqfd
----

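To see which of the modules listed above are still missing from a modules
file, a small helper along the following lines can be used. This is only a
minimal sketch: the function name `missing_vfio_modules` is hypothetical and
not part of {pve}, and the file path is assumed to default to `/etc/modules`.

```shell
#!/bin/sh
# Hypothetical helper: print every required VFIO module that is not yet
# listed in the given modules file (default: /etc/modules).
missing_vfio_modules() {
    modules_file="${1:-/etc/modules}"
    for mod in vfio vfio_iommu_type1 vfio_pci vfio_virqfd; do
        # -x matches whole lines only; surrounding whitespace is tolerated
        if ! grep -Eqx "[[:space:]]*${mod}[[:space:]]*" "$modules_file" 2>/dev/null; then
            echo "$mod"
        fi
    done
}
```

Calling `missing_vfio_modules` without arguments checks `/etc/modules`. Empty
output means that all four modules are already listed.
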
[[qm_pci_passthrough_update_initramfs]]
After changing anything related to modules, you need to refresh your
`initramfs`. On {pve} this can be done by executing:

----
# update-initramfs -u -k all
----

.Finish Configuration

Finally, reboot to bring the changes into effect and check that it is indeed
enabled:

----
# dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
----

This should display that `IOMMU`, `Directed I/O` or `Interrupt Remapping` is
enabled. Depending on hardware and kernel, the exact message can vary.

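Besides the kernel log, the command line of the running kernel can be
inspected to confirm that the parameter was actually applied after the reboot.
The following is a minimal sketch under that assumption. The function name
`iommu_param_present` is hypothetical, and the file defaults to
`/proc/cmdline`.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel command line file
# (default: /proc/cmdline) contains intel_iommu=on or amd_iommu=on
# as a separate parameter.
iommu_param_present() {
    cmdline_file="${1:-/proc/cmdline}"
    # tr puts each parameter on its own line so that -x can match it exactly
    tr ' ' '\n' < "$cmdline_file" | grep -qxE '(intel|amd)_iommu=on'
}
```

For example, `iommu_param_present && echo "parameter set"` prints the message
only if one of the two parameters is present.
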
It is also important that the device(s) you want to pass through
are in a *separate* `IOMMU` group. This can be checked with:

----
# find /sys/kernel/iommu_groups/ -type l
----

It is okay if the device is in an `IOMMU` group together with its functions
(e.g. a GPU with the HDMI Audio device) or with its root port or PCI(e) bridge.

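The raw `find` output can be hard to read on systems with many devices. As a
minimal sketch, the following function prints one `GROUP ADDRESS` pair per
line, sorted by group number. The function name `list_iommu_groups` is
hypothetical, and the directory argument only exists so that the function can
be pointed at a location other than the real `sysfs` path.

```shell
#!/bin/sh
# Hypothetical helper: print "GROUP ADDRESS" for every device, sorted by
# group number. $1 is the iommu_groups directory (default: the sysfs path).
list_iommu_groups() {
    groups_dir="${1:-/sys/kernel/iommu_groups}"
    for dev in "$groups_dir"/*/devices/*; do
        # If nothing matched, the glob stays literal and is skipped here
        [ -e "$dev" ] || [ -L "$dev" ] || continue
        group=${dev#"$groups_dir"/}   # strip the leading directory
        group=${group%%/*}            # keep only the group number
        echo "$group ${dev##*/}"
    done | sort -n
}
```

Devices that share a group number in the output belong to the same `IOMMU`
group.
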
.PCI(e) slots
[NOTE]
====
Some platforms handle their physical PCI(e) slots differently. So, sometimes
it can help to put the card in another PCI(e) slot, if you do not get the
desired `IOMMU` group separation.
====

.Unsafe interrupts
[NOTE]
====
For some platforms, it may be necessary to allow unsafe interrupts.
For this, add the following line to a file ending with `.conf` in
*/etc/modprobe.d/*:

----
 options vfio_iommu_type1 allow_unsafe_interrupts=1
----

Please be aware that this option can make your system unstable.
====

Host Device Passthrough
~~~~~~~~~~~~~~~~~~~~~~~

The most used variant of PCI(e) passthrough is to pass through a whole
PCI(e) card, for example a GPU or a network card.


Host Configuration
^^^^^^^^^^^^^^^^^^

In this case, the host cannot use the card. There are two methods to achieve
this:

* pass the device IDs to the options of the 'vfio-pci' module by adding
+
----
 options vfio-pci ids=1234:5678,4321:8765
----
+
to a .conf file in */etc/modprobe.d/* where `1234:5678` and `4321:8765` are
the vendor and device IDs obtained by:
+
----
# lspci -nn
----

* blacklist the driver completely on the host, ensuring that it is free to bind
for passthrough, with
+
----
 blacklist DRIVERNAME
----
+
in a .conf file in */etc/modprobe.d/*.

For both methods you need to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`] again and
reboot after that.

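For the first method, the `options` line can be derived from the `lspci -nn`
output instead of being typed by hand. The following is a minimal sketch: the
function name `vfio_ids_line` is hypothetical, and it assumes that the
vendor and device IDs appear as `[xxxx:xxxx]` with lowercase hexadecimal
digits, as `lspci -nn` prints them.

```shell
#!/bin/sh
# Hypothetical helper: read `lspci -nn` output on stdin and print a matching
# "options vfio-pci ids=..." line for all devices found in it.
vfio_ids_line() {
    # Class codes such as [0300] contain no colon and are therefore skipped
    ids=$(grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | tr -d '[]' | sort -u | paste -sd, -)
    [ -n "$ids" ] && echo "options vfio-pci ids=$ids"
}
```

A possible invocation is `lspci -nn -s 01:00 | vfio_ids_line`, where `01:00`
stands for the address of the card to pass through. Review the printed line
before adding it to a .conf file in */etc/modprobe.d/*.
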
[[qm_pci_passthrough_vm_config]]
VM Configuration
^^^^^^^^^^^^^^^^
To pass through the device you need to set the *hostpciX* option in the VM
configuration, for example by executing:

----
# qm set VMID -hostpci0 00:02.0
----

If your device has multiple functions, you can pass them through all together
with the shortened syntax `00:02`.

There are some options which may be necessary, depending on the device
and guest OS:

* *x-vga=on|off* marks the PCI(e) device as the primary GPU of the VM.
With this enabled the *vga* configuration option will be ignored.

* *pcie=on|off* tells {pve} to use a PCIe or PCI port. Some guest/device
combinations require PCIe rather than PCI. PCIe is only available for 'q35'
machine types.

* *rombar=on|off* makes the firmware ROM visible for the guest. Default is on.
Some PCI(e) devices need this disabled.

* *romfile=<path>* is an optional path to a ROM file for the device to use.
This is a relative path under */usr/share/kvm/*.

.Example

An example of PCIe passthrough with a GPU set to primary:

----
# qm set VMID -hostpci0 02:00,pcie=on,x-vga=on
----


Other considerations
^^^^^^^^^^^^^^^^^^^^

When passing through a GPU, the best compatibility is reached when using
'q35' as machine type, 'OVMF' ('EFI' for VMs) instead of SeaBIOS and PCIe
instead of PCI. Note that if you want to use 'OVMF' for GPU passthrough, the
GPU needs to have an EFI capable ROM, otherwise use SeaBIOS instead.

SR-IOV
~~~~~~

Another variant for passing through PCI(e) devices is to use the hardware
virtualization features of your devices, if available.

'SR-IOV' (**S**ingle-**R**oot **I**nput/**O**utput **V**irtualization) enables
a single device to provide multiple 'VF' (**V**irtual **F**unctions) to the
system. Each of those 'VF' can be used in a different VM, with full hardware
features and also better performance and lower latency than software
virtualized devices.

Currently, the most common use case for this are NICs (**N**etwork
**I**nterface **C**ard) with SR-IOV support, which can provide multiple VFs per
physical port. This allows features such as checksum offloading to
be used inside a VM, reducing the (host) CPU overhead.


Host Configuration
^^^^^^^^^^^^^^^^^^

Generally, there are two methods for enabling virtual functions on a device.

* sometimes there is an option for the driver module, e.g. for some
Intel drivers
+
----
 max_vfs=4
----
+
which could be put in a file with a '.conf' ending under */etc/modprobe.d/*.
(Do not forget to update your initramfs after that.)
+
Please refer to your driver module documentation for the exact
parameters and options.

* The second, more generic, approach is using `sysfs`.
If a device and driver support this, you can change the number of VFs on
the fly. For example, to set up 4 VFs on device 0000:01:00.0 execute:
+
----
# echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
----
+
To make this change persistent you can use the `sysfsutils` Debian package.
After installation configure it via */etc/sysfs.conf* or a `FILE.conf` in
*/etc/sysfs.d/*.

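When using the `sysfs` approach, two details are easy to miss: the device
reports its upper limit in `sriov_totalvfs`, and the kernel only accepts a new
non-zero value in `sriov_numvfs` while the current value is 0. The following
minimal sketch takes both into account. The function name `set_numvfs` and its
optional third argument are hypothetical and only serve illustration.

```shell
#!/bin/sh
# Hypothetical helper: set the number of VFs for a PCI device, refusing
# values above the limit the device reports in sriov_totalvfs.
# $1: PCI address, $2: VF count, $3: PCI device directory (optional)
set_numvfs() {
    dev_dir="${3:-/sys/bus/pci/devices}/$1"
    total=$(cat "$dev_dir/sriov_totalvfs") || return 1
    if [ "$2" -gt "$total" ]; then
        echo "requested $2 VFs, but $1 supports at most $total" >&2
        return 1
    fi
    # A non-zero value can only be written while the current value is 0,
    # so reset the VF count first.
    echo 0 > "$dev_dir/sriov_numvfs"
    echo "$2" > "$dev_dir/sriov_numvfs"
}
```

For example, `set_numvfs 0000:01:00.0 4` corresponds to the `echo` command
above. For persistence, and assuming the `attribute = value` format used by
`sysfsutils` with paths relative to `/sys`, the equivalent entry in
*/etc/sysfs.conf* would be `bus/pci/devices/0000:01:00.0/sriov_numvfs = 4`.
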
VM Configuration
^^^^^^^^^^^^^^^^

After creating VFs, you should see them as separate PCI(e) devices when
outputting them with `lspci`. Get their ID and pass them through like a
xref:qm_pci_passthrough_vm_config[normal PCI(e) device].

Other considerations
^^^^^^^^^^^^^^^^^^^^

For this feature, platform support is especially important. It may be necessary
to enable this feature in the BIOS/EFI first, or to use a specific PCI(e) port
for it to work. When in doubt, consult the manual of the platform or contact
its vendor.

Mediated Devices (vGPU, GVT-g)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mediated devices are another method to reuse features and performance from
physical hardware for virtualized hardware. These are most commonly found in
virtualized GPU setups such as Intel's GVT-g and NVIDIA's vGPUs used in their
GRID technology.

With this, a physical card is able to create virtual cards, similar to SR-IOV.
The difference is that mediated devices do not appear as PCI(e) devices in the
host, and are as such only suited for use in virtual machines.


Host Configuration
^^^^^^^^^^^^^^^^^^

In general, your card's driver must support that feature, otherwise it will
not work. So please refer to your vendor for compatible drivers and how to
configure them.

Intel's drivers for GVT-g are integrated in the kernel and should work
with 5th, 6th and 7th generation Intel Core processors, as well as E3 v4, E3
v5 and E3 v6 Xeon processors.

To enable it for Intel Graphics, you have to make sure to load the module
'kvmgt' (for example via `/etc/modules`) and to enable it on the kernel
command line. For this you can edit `/etc/default/grub` and add the following
to the 'GRUB_CMDLINE_LINUX_DEFAULT' variable:

----
 i915.enable_gvt=1
----

After that remember to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`],
xref:qm_pci_passthrough_update_grub[update grub] and
reboot your host.

VM Configuration
^^^^^^^^^^^^^^^^

To use a mediated device, simply specify the `mdev` property on a `hostpciX`
VM configuration option.

You can get the supported devices via 'sysfs'. For example, to list the
supported types for the device '0000:00:02.0' you would simply execute:

----
# ls /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types
----

Each entry is a directory which contains the following important files:

* 'available_instances' contains the number of still available instances of
this type; each 'mdev' in use by a VM reduces this.
* 'description' contains a short description of the capabilities of the type.
* 'create' is the endpoint to create such a device. {pve} does this
automatically for you if a 'hostpciX' option with `mdev` is configured.

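To get an overview of all types and their remaining instances at once, the
files described above can be read in a loop. The following is a minimal
sketch: the function name `list_mdev_types` and its optional second argument
are hypothetical and only serve illustration.

```shell
#!/bin/sh
# Hypothetical helper: print "TYPE AVAILABLE" for every mediated device
# type of a PCI device.
# $1: PCI address, $2: PCI device directory (optional)
list_mdev_types() {
    types_dir="${2:-/sys/bus/pci/devices}/$1/mdev_supported_types"
    for type in "$types_dir"/*; do
        # Skip the literal glob if the device offers no mediated types
        [ -d "$type" ] || continue
        echo "${type##*/} $(cat "$type/available_instances")"
    done
}
```

For example, `list_mdev_types 0000:00:02.0` lists the types of the device used
in the `ls` command above. A type with 0 available instances cannot be
assigned to another VM.
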
Example configuration with an `Intel GVT-g vGPU` (`Intel Skylake 6700k`):

----
# qm set VMID -hostpci0 00:02.0,mdev=i915-GVTg_V5_4
----

With this set, {pve} automatically creates such a device on VM start, and
cleans it up again when the VM stops.