[[qm_pci_passthrough]]
PCI(e) Passthrough
------------------
ifdef::wiki[]
:pve-toplevel:
endif::wiki[]

PCI(e) passthrough is a mechanism to give a virtual machine control over
a PCI device from the host. This can have some advantages over using
virtualized hardware, for example lower latency, higher performance, or more
features (e.g., offloading).

But, if you pass through a device to a virtual machine, you cannot use that
device anymore on the host or in any other VM.

General Requirements
~~~~~~~~~~~~~~~~~~~~

Since passthrough is a feature which also needs hardware support, there are
some requirements to check and preparations to be done to make it work.


Hardware
^^^^^^^^
Your hardware needs to support `IOMMU` (*I*/*O* **M**emory **M**anagement
**U**nit) interrupt remapping; this includes the CPU and the mainboard.

Generally, Intel systems with VT-d and AMD systems with AMD-Vi support this.
But it is not guaranteed that everything will work out of the box, due
to bad hardware implementations and missing or low quality drivers.

Further, server grade hardware often has better support than consumer grade
hardware, but even then, many modern systems can support this.

Please refer to your hardware vendor to check if they support this feature
under Linux for your specific setup.


Configuration
^^^^^^^^^^^^^

Once you have ensured that your hardware supports passthrough, you will need to
do some configuration to enable PCI(e) passthrough.


.IOMMU

First, the IOMMU support has to be enabled in your BIOS/UEFI. Most often, that
option is named `IOMMU` or `VT-d`, but check the manual of your motherboard
for the exact option you need to enable.

Then, the IOMMU has to be activated on the
xref:sysboot_edit_kernel_cmdline[kernel commandline].

The command line parameters are:

* for Intel CPUs:
+
----
 intel_iommu=on
----
* for AMD CPUs it should be enabled automatically.

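How this parameter is added depends on your bootloader (see the
xref:sysboot_edit_kernel_cmdline[kernel commandline] section for details). As a
minimal sketch, assuming GRUB is used, you could append it to
`GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub`:

----
 GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
----

and then run `update-grub` before rebooting.
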
.Kernel Modules

You have to make sure the following modules are loaded. This can be achieved by
adding them to `/etc/modules`:

----
 vfio
 vfio_iommu_type1
 vfio_pci
 vfio_virqfd
----

[[qm_pci_passthrough_update_initramfs]]
After changing anything module related, you need to refresh your
`initramfs`. On {pve} this can be done by executing:

----
# update-initramfs -u -k all
----

.Finish Configuration

Finally reboot to bring the changes into effect and check that it is indeed
enabled.

----
# dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
----

should display that `IOMMU`, `Directed I/O` or `Interrupt Remapping` is
enabled; depending on hardware and kernel, the exact message can vary.

It is also important that the device(s) you want to pass through
are in a *separate* `IOMMU` group. This can be checked with:

----
# find /sys/kernel/iommu_groups/ -type l
----

It is okay if the device is in an `IOMMU` group together with its functions
(e.g. a GPU with the HDMI Audio device) or with its root port or PCI(e) bridge.

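To see at a glance which devices share a group, a small shell sketch like the
following can help (it only assumes `lspci` from the `pciutils` package is
installed):

----
# for g in /sys/kernel/iommu_groups/*; do
    echo "IOMMU group ${g##*/}:"
    for d in "$g"/devices/*; do
        lspci -nns "${d##*/}"
    done
  done
----
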
.PCI(e) slots
[NOTE]
====
Some platforms handle their physical PCI(e) slots differently. So, sometimes
it can help to put the card in another PCI(e) slot, if you do not get the
desired `IOMMU` group separation.
====

.Unsafe interrupts
[NOTE]
====
For some platforms, it may be necessary to allow unsafe interrupts.
For this, add the following line to a file ending with `.conf` in
*/etc/modprobe.d/*:

----
 options vfio_iommu_type1 allow_unsafe_interrupts=1
----

Please be aware that this option can make your system unstable.
====

GPU Passthrough Notes
^^^^^^^^^^^^^^^^^^^^^

It is not possible to display the frame buffer of the GPU via NoVNC or SPICE on
the {pve} web interface.

When passing through a whole GPU or a vGPU and graphic output is wanted, one
has to either physically connect a monitor to the card, or configure a remote
desktop software (for example, VNC or RDP) inside the guest.

If you want to use the GPU as a hardware accelerator, for example, for
programs using OpenCL or CUDA, this is not required.

Host Device Passthrough
~~~~~~~~~~~~~~~~~~~~~~~

The most used variant of PCI(e) passthrough is to pass through a whole
PCI(e) card, for example a GPU or a network card.


Host Configuration
^^^^^^^^^^^^^^^^^^

In this case, the host must not use the card. There are two methods to achieve
this:

* pass the device IDs to the options of the 'vfio-pci' module by adding
+
----
 options vfio-pci ids=1234:5678,4321:8765
----
+
to a .conf file in */etc/modprobe.d/* where `1234:5678` and `4321:8765` are
the vendor and device IDs obtained by:
+
----
# lspci -nn
----

* blacklist the driver completely on the host, ensuring that it is free to bind
for passthrough, with
+
----
 blacklist DRIVERNAME
----
+
in a .conf file in */etc/modprobe.d/*.

For both methods you need to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`] again and
reboot after that. A combined sketch of these steps is shown below.

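For example, using the first method, the whole procedure could look roughly
like this (a sketch only: the IDs are placeholders taken from above, and the
file name `vfio.conf` is arbitrary):

----
# echo "options vfio-pci ids=1234:5678,4321:8765" > /etc/modprobe.d/vfio.conf
# update-initramfs -u -k all
# reboot
----
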
.Verify Configuration

To check if your changes were successful, you can use

----
# lspci -nnk
----

and check your device entry. If it says

----
Kernel driver in use: vfio-pci
----

or the 'in use' line is missing entirely, the device is ready to be used for
passthrough.

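To limit the output to a single device, you can additionally pass its address
with `-s`, for example (assuming the device sits at `01:00.0`):

----
# lspci -nnk -s 01:00.0
----
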
[[qm_pci_passthrough_vm_config]]
VM Configuration
^^^^^^^^^^^^^^^^
To pass through the device you need to set the *hostpciX* option in the VM
configuration, for example by executing:

----
# qm set VMID -hostpci0 00:02.0
----

If your device has multiple functions (e.g., ``00:02.0'' and ``00:02.1''),
you can pass them all through together with the shortened syntax ``00:02''.
This is equivalent to checking the ``All Functions'' checkbox in the
web interface.

There are some options which may be necessary, depending on the device
and guest OS:

* *x-vga=on|off* marks the PCI(e) device as the primary GPU of the VM.
With this enabled the *vga* configuration option will be ignored.

* *pcie=on|off* tells {pve} to use a PCIe or PCI port. Some guest/device
combinations require PCIe rather than PCI. PCIe is only available for 'q35'
machine types.

* *rombar=on|off* makes the firmware ROM visible for the guest. Default is on.
Some PCI(e) devices need this disabled.

* *romfile=<path>* is an optional path to a ROM file for the device to use.
This is a relative path under */usr/share/kvm/*.

.Example

An example of PCIe passthrough with a GPU set to primary:

----
# qm set VMID -hostpci0 02:00,pcie=on,x-vga=on
----
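
The resulting entry in the VM configuration file
(*/etc/pve/qemu-server/VMID.conf*) would then look roughly like this (a sketch;
boolean options are usually stored as `1`):

----
hostpci0: 02:00,pcie=1,x-vga=1
----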

Other considerations
^^^^^^^^^^^^^^^^^^^^

When passing through a GPU, the best compatibility is reached when using
'q35' as machine type, 'OVMF' ('EFI' for VMs) instead of SeaBIOS and PCIe
instead of PCI. Note that if you want to use 'OVMF' for GPU passthrough, the
GPU needs to have an EFI capable ROM, otherwise use SeaBIOS instead.

SR-IOV
~~~~~~

Another variant for passing through PCI(e) devices is to use the hardware
virtualization features of your devices, if available.

'SR-IOV' (**S**ingle-**R**oot **I**nput/**O**utput **V**irtualization) enables
a single device to provide multiple 'VF' (**V**irtual **F**unctions) to the
system. Each of those 'VF' can be used in a different VM, with full hardware
features and also better performance and lower latency than software
virtualized devices.

Currently, the most common use case for this are NICs (**N**etwork
**I**nterface **C**ards) with SR-IOV support, which can provide multiple VFs
per physical port. This allows features such as checksum offloading to be used
inside a VM, reducing the (host) CPU overhead.


Host Configuration
^^^^^^^^^^^^^^^^^^

Generally, there are two methods for enabling virtual functions on a device.

* sometimes there is an option for the driver module, e.g., for some
Intel drivers
+
----
 max_vfs=4
----
+
which could be put into a file with '.conf' ending under */etc/modprobe.d/*.
(Do not forget to update your initramfs after that.)
+
Please refer to your driver module documentation for the exact
parameters and options.

* The second, more generic, approach is using `sysfs`.
If a device and driver supports this, you can change the number of VFs on
the fly. For example, to set up 4 VFs on device 0000:01:00.0, execute:
+
----
# echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
----
+
To make this change persistent, you can use the `sysfsutils` Debian package.
After installation, configure it via */etc/sysfs.conf* or a `FILE.conf` in
*/etc/sysfs.d/* (see the sketch after this list).

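As a minimal sketch of such a persistent configuration, assuming a hypothetical
file */etc/sysfs.d/sriov.conf*, the attribute path is given relative to `/sys`:

----
 bus/pci/devices/0000:01:00.0/sriov_numvfs = 4
----
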
VM Configuration
^^^^^^^^^^^^^^^^

After creating VFs, you should see them as separate PCI(e) devices when
outputting them with `lspci`. Get their ID and pass them through like a
xref:qm_pci_passthrough_vm_config[normal PCI(e) device].

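A hedged example, assuming one of the VFs shows up at the (hypothetical)
address `01:10.0`:

----
# lspci -nn | grep -i "virtual function"
# qm set VMID -hostpci0 01:10.0
----
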
Other considerations
^^^^^^^^^^^^^^^^^^^^

For this feature, platform support is especially important. It may be necessary
to enable this feature in the BIOS/EFI first, or to use a specific PCI(e) port
for it to work. If in doubt, consult the manual of the platform or contact its
vendor.

Mediated Devices (vGPU, GVT-g)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mediated devices are another method to reuse features and performance from
physical hardware for virtualized hardware. These are most commonly found in
virtualized GPU setups, such as Intel's GVT-g and NVIDIA's vGPUs used in their
GRID technology.

With this, a physical card is able to create virtual cards, similar to SR-IOV.
The difference is that mediated devices do not appear as PCI(e) devices in the
host, and are as such only suited for use in virtual machines.


Host Configuration
^^^^^^^^^^^^^^^^^^

In general, your card's driver must support that feature, otherwise it will
not work. So please refer to your vendor for compatible drivers and how to
configure them.

Intel's drivers for GVT-g are integrated in the Kernel and should work
with 5th, 6th and 7th generation Intel Core Processors, as well as E3 v4, E3
v5 and E3 v6 Xeon Processors.

To enable it for Intel Graphics, you have to make sure to load the module
'kvmgt' (for example via `/etc/modules`) and to enable it on the
xref:sysboot_edit_kernel_cmdline[Kernel commandline] with the following
parameter:

----
 i915.enable_gvt=1
----

After that remember to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`]
and reboot your host.

VM Configuration
^^^^^^^^^^^^^^^^

To use a mediated device, simply specify the `mdev` property on a `hostpciX`
VM configuration option.

You can get the supported devices via 'sysfs'. For example, to list the
supported types for the device '0000:00:02.0' you would simply execute:

----
# ls /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types
----

Each entry is a directory which contains the following important files:

* 'available_instances' contains the amount of still available instances of
this type; each 'mdev' use in a VM reduces this.
* 'description' contains a short description about the capabilities of the type
* 'create' is the endpoint to create such a device; {pve} does this
automatically for you, if a 'hostpciX' option with `mdev` is configured.

d25f097c 365Example configuration with an `Intel GVT-g vGPU` (`Intel Skylake 6700k`):
050192c5
DC
366
367----
368# qm set VMID -hostpci0 00:02.0,mdev=i915-GVTg_V5_4
369----
370
371With this set, {pve} automatically creates such a device on VM start, and
372cleans it up again when the VM stops.
e582833b
DC
373
ifdef::wiki[]

See Also
~~~~~~~~

* link:/wiki/Pci_passthrough[PCI Passthrough Examples]

endif::wiki[]