[[qm_pci_passthrough]]
PCI(e) Passthrough
------------------
ifdef::wiki[]
:pve-toplevel:
endif::wiki[]

PCI(e) passthrough is a mechanism to give a virtual machine control over
a PCI device from the host. This can have some advantages over using
virtualized hardware, for example lower latency, higher performance, or more
features (e.g., offloading).

But, if you pass through a device to a virtual machine, you cannot use that
device anymore on the host or in any other VM.

General Requirements
~~~~~~~~~~~~~~~~~~~~

Since passthrough is a feature which also needs hardware support, there are
some requirements to check and preparations to be done to make it work.


Hardware
^^^^^^^^
Your hardware needs to support `IOMMU` (*I*/*O* **M**emory **M**anagement
**U**nit) interrupt remapping; this includes the CPU and the mainboard.

Generally, Intel systems with VT-d and AMD systems with AMD-Vi support this.
But it is not guaranteed that everything will work out of the box, due to bad
hardware implementations and missing or low quality drivers.

Further, server grade hardware often has better support than consumer grade
hardware, but even then, many modern systems can support this.

Please refer to your hardware vendor to check if they support this feature
under Linux for your specific setup.


Configuration
^^^^^^^^^^^^^

Once you have ensured that your hardware supports passthrough, you will need
to do some configuration to enable PCI(e) passthrough.


.IOMMU

First, IOMMU support has to be enabled in your BIOS/UEFI. Most often, that
option is named `IOMMU` or `VT-d`, but check the manual of your motherboard
for the exact option you need to enable.

Then, the IOMMU might need to be activated on the
xref:sysboot_edit_kernel_cmdline[kernel commandline].
(On newer kernels, this should not be necessary.)

The command line parameters are:

* for Intel CPUs (see the GRUB example below):
+
----
intel_iommu=on
----
* for AMD CPUs it should be enabled automatically.
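
For example, on a host that boots via GRUB (hosts using systemd-boot are
configured differently, see the
xref:sysboot_edit_kernel_cmdline[kernel commandline] section), the parameter
is appended to the existing `GRUB_CMDLINE_LINUX_DEFAULT` line in
*/etc/default/grub*:

----
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
----

Afterwards, run `update-grub` so the new command line is written to the boot
configuration.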

.Kernel Modules

You have to make sure the following modules are loaded. This can be achieved by
adding them to `/etc/modules`:

----
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
----
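
If you prefer not to edit the file by hand, a small shell loop can append the
entries for you (a convenience sketch; it skips modules that are already
listed):

----
for m in vfio vfio_iommu_type1 vfio_pci vfio_virqfd; do
    grep -qx "$m" /etc/modules || echo "$m" >> /etc/modules
done
----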

[[qm_pci_passthrough_update_initramfs]]
After changing anything related to modules, you need to refresh your
`initramfs`. On {pve} this can be done by executing:

----
# update-initramfs -u -k all
----

.Finish Configuration

Finally, reboot to bring the changes into effect, and check that IOMMU is
indeed enabled.

----
# dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
----

should display that `IOMMU`, `Directed I/O` or `Interrupt Remapping` is
enabled; the exact message can vary depending on hardware and kernel.
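
As an illustration only (the lines differ between systems), on an Intel host
with VT-d active the output may contain messages similar to:

----
DMAR: IOMMU enabled
DMAR-IR: Enabled IRQ remapping in x2apic mode
----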

It is also important that the device(s) you want to pass through
are in a *separate* `IOMMU` group. This can be checked with:

----
# find /sys/kernel/iommu_groups/ -type l
----
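
The raw output only lists symlink paths. A small shell loop (a convenience
sketch, not something {pve} ships) can pair each group number with the
matching `lspci` description, which makes the grouping easier to review:

----
# print every IOMMU group together with a description of its devices
for dev in /sys/kernel/iommu_groups/*/devices/*; do
    group=${dev#/sys/kernel/iommu_groups/}
    printf 'IOMMU group %s: ' "${group%%/*}"
    lspci -nns "${dev##*/}"
done
----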

It is okay if the device is in an `IOMMU` group together with its functions
(e.g. a GPU with the HDMI Audio device) or with its root port or PCI(e) bridge.

.PCI(e) slots
[NOTE]
====
Some platforms handle their physical PCI(e) slots differently. So, sometimes
it can help to put the card in another PCI(e) slot, if you do not get the
desired `IOMMU` group separation.
====

.Unsafe interrupts
[NOTE]
====
For some platforms, it may be necessary to allow unsafe interrupts.
For this, add the following line to a file ending in `.conf` in
*/etc/modprobe.d/*:

----
options vfio_iommu_type1 allow_unsafe_interrupts=1
----

Please be aware that this option can make your system unstable.
====

GPU Passthrough Notes
^^^^^^^^^^^^^^^^^^^^^

It is not possible to display the frame buffer of the GPU via NoVNC or SPICE on
the {pve} web interface.

When passing through a whole GPU or a vGPU and graphic output is wanted, one
has to either physically connect a monitor to the card, or configure remote
desktop software (for example, VNC or RDP) inside the guest.

If you want to use the GPU as a hardware accelerator, for example, for
programs using OpenCL or CUDA, this is not required.

Host Device Passthrough
~~~~~~~~~~~~~~~~~~~~~~~

The most common variant of PCI(e) passthrough is to pass through a whole
PCI(e) card, for example a GPU or a network card.


Host Configuration
^^^^^^^^^^^^^^^^^^

In this case, the host must not use the card. There are two methods to achieve
this:

* pass the device IDs to the options of the 'vfio-pci' module by adding
+
----
options vfio-pci ids=1234:5678,4321:8765
----
+
to a .conf file in */etc/modprobe.d/* where `1234:5678` and `4321:8765` are
the vendor and device IDs obtained by:
+
----
# lspci -nn
----

* blacklist the driver completely on the host, ensuring that it is free to bind
for passthrough, with
+
----
blacklist DRIVERNAME
----
+
in a .conf file in */etc/modprobe.d/*.

For both methods you need to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`] again and
reboot after that.
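
Putting the first method together for a hypothetical GPU whose video and audio
functions report the IDs `10de:1b81` and `10de:10f0` in `lspci -nn` (the IDs
and the file name `vfio.conf` are placeholders, substitute your own):

----
# echo "options vfio-pci ids=10de:1b81,10de:10f0" > /etc/modprobe.d/vfio.conf
# update-initramfs -u -k all
----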

.Verify Configuration

To check if your changes were successful, you can use

----
# lspci -nnk
----

and check your device entry. If it says

----
Kernel driver in use: vfio-pci
----

or the 'in use' line is missing entirely, the device is ready to be used for
passthrough.

[[qm_pci_passthrough_vm_config]]
VM Configuration
^^^^^^^^^^^^^^^^
To pass through the device you need to set the *hostpciX* option in the VM
configuration, for example by executing:

----
# qm set VMID -hostpci0 00:02.0
----

If your device has multiple functions (e.g., ``00:02.0`' and ``00:02.1`'),
you can pass them all through together with the shortened syntax ``00:02`'.
This is equivalent to checking the ``All Functions`' checkbox in the
web-interface.

There are some options which may be necessary, depending on the device
and guest OS:

* *x-vga=on|off* marks the PCI(e) device as the primary GPU of the VM.
With this enabled, the *vga* configuration option will be ignored.

* *pcie=on|off* tells {pve} to use a PCIe or PCI port. Some guest/device
combinations require PCIe rather than PCI. PCIe is only available for 'q35'
machine types.

* *rombar=on|off* makes the firmware ROM visible to the guest. Default is on.
Some PCI(e) devices need this disabled.

* *romfile=<path>* is an optional path to a ROM file for the device to use.
This is a relative path under */usr/share/kvm/*.

.Example

An example of PCIe passthrough with a GPU set to primary:

----
# qm set VMID -hostpci0 02:00,pcie=on,x-vga=on
----

.PCI ID overrides

You can override the PCI vendor ID, device ID, and subsystem IDs that will be
seen by the guest. This is useful if your device is a variant with an ID that
your guest's drivers don't recognize, but you want to force those drivers to be
loaded anyway (e.g. if you know your device shares the same chipset as a
supported variant).

The available options are `vendor-id`, `device-id`, `sub-vendor-id`, and
`sub-device-id`. You can set any or all of these to override your device's
default IDs.

For example:

----
# qm set VMID -hostpci0 02:00,device-id=0x10f6,sub-vendor-id=0x0000
----


Other considerations
^^^^^^^^^^^^^^^^^^^^

When passing through a GPU, the best compatibility is reached when using
'q35' as machine type, 'OVMF' ('EFI' for VMs) instead of SeaBIOS and PCIe
instead of PCI. Note that if you want to use 'OVMF' for GPU passthrough, the
GPU needs to have an EFI capable ROM; otherwise use SeaBIOS instead.
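
For example, switching an existing VM to the 'q35' machine type and 'OVMF'
could look like this (VMID and the storage name `local-lvm` are placeholders;
OVMF also needs an EFI disk to store its variables, which can likewise be
added via the web interface):

----
# qm set VMID -machine q35 -bios ovmf
# qm set VMID -efidisk0 local-lvm:1
----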

SR-IOV
~~~~~~

Another variant for passing through PCI(e) devices is to use the hardware
virtualization features of your devices, if available.

'SR-IOV' (**S**ingle-**R**oot **I**nput/**O**utput **V**irtualization) enables
a single device to provide multiple 'VF' (**V**irtual **F**unctions) to the
system. Each of those VFs can be used in a different VM, with full hardware
features and also better performance and lower latency than software
virtualized devices.

Currently, the most common use case for this is NICs (**N**etwork
**I**nterface **C**ards) with SR-IOV support, which can provide multiple VFs
per physical port. This allows features such as checksum offloading to be used
inside a VM, reducing the (host) CPU overhead.


Host Configuration
^^^^^^^^^^^^^^^^^^

Generally, there are two methods for enabling virtual functions on a device
(a combined example follows the list):

* sometimes there is an option for the driver module, e.g. for some
Intel drivers
+
----
max_vfs=4
----
+
which could be put in a file with a '.conf' ending under */etc/modprobe.d/*.
(Do not forget to update your initramfs after that.)
+
Please refer to your driver module documentation for the exact
parameters and options.

* The second, more generic, approach is using `sysfs`.
If a device and driver support this, you can change the number of VFs on
the fly. For example, to set up 4 VFs on device 0000:01:00.0 execute:
+
----
# echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
----
+
To make this change persistent, you can use the `sysfsutils` Debian package.
After installation, configure it via */etc/sysfs.conf* or a `FILE.conf' in
*/etc/sysfs.d/*.
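
Both methods side by side, as a sketch for a hypothetical Intel NIC using the
'igb' driver (the driver name, file names and PCI address are examples only;
check your driver's documentation for the real parameter name):

----
# echo "options igb max_vfs=4" > /etc/modprobe.d/igb-sriov.conf
# echo "bus/pci/devices/0000:01:00.0/sriov_numvfs = 4" >> /etc/sysfs.conf
----

The first line corresponds to the module option method and requires an
xref:qm_pci_passthrough_update_initramfs[`initramfs` update]; the second makes
the `sysfs` method persistent via `sysfsutils`.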

VM Configuration
^^^^^^^^^^^^^^^^

After creating VFs, you should see them as separate PCI(e) devices when
listing them with `lspci`. Get their ID and pass them through like a
xref:qm_pci_passthrough_vm_config[normal PCI(e) device].
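
Many NICs report their VFs with a 'Virtual Function' suffix in the device
name, so a filtered listing similar to the following can help to find them
(the exact naming depends on the vendor):

----
# lspci -nn | grep -i "virtual function"
----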

Other considerations
^^^^^^^^^^^^^^^^^^^^

For this feature, platform support is especially important. It may be necessary
to enable this feature in the BIOS/EFI first, or to use a specific PCI(e) port
for it to work. If in doubt, consult the manual of the platform or contact its
vendor.

Mediated Devices (vGPU, GVT-g)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mediated devices are another method to reuse features and performance from
physical hardware for virtualized hardware. These are most commonly found in
virtualized GPU setups such as Intel's GVT-g and NVIDIA's vGPUs used in their
GRID technology.

With this, a physical card is able to create virtual cards, similar to SR-IOV.
The difference is that mediated devices do not appear as PCI(e) devices in the
host, and are as such only suited for use in virtual machines.


Host Configuration
^^^^^^^^^^^^^^^^^^

In general, your card's driver must support this feature; otherwise it will
not work. So please refer to your vendor for compatible drivers and how to
configure them.

Intel's drivers for GVT-g are integrated in the kernel and should work
with 5th, 6th and 7th generation Intel Core Processors, as well as E3 v4, E3
v5 and E3 v6 Xeon Processors.

To enable it for Intel Graphics, you have to make sure to load the module
'kvmgt' (for example via `/etc/modules`) and to enable it on the
xref:sysboot_edit_kernel_cmdline[kernel commandline] by adding the following parameter:

----
i915.enable_gvt=1
----

After that remember to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`],
and reboot your host.

VM Configuration
^^^^^^^^^^^^^^^^

To use a mediated device, simply specify the `mdev` property on a `hostpciX`
VM configuration option.

You can get the supported devices via 'sysfs'. For example, to list the
supported types for the device '0000:00:02.0' you would simply execute:

----
# ls /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types
----

Each entry is a directory which contains the following important files
(an example of reading them follows the list):

* 'available_instances' contains the number of still available instances of
this type; each 'mdev' used in a VM reduces this.
* 'description' contains a short description of the capabilities of the type
* 'create' is the endpoint to create such a device; {pve} does this
automatically for you, if a 'hostpciX' option with `mdev` is configured.
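
For example, to read the description and the remaining instance count of one
of the listed types (using the `i915-GVTg_V5_4` type from the configuration
example below):

----
# cat /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types/i915-GVTg_V5_4/description
# cat /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types/i915-GVTg_V5_4/available_instances
----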

Example configuration with an `Intel GVT-g vGPU` (`Intel Skylake 6700k`):

----
# qm set VMID -hostpci0 00:02.0,mdev=i915-GVTg_V5_4
----

With this set, {pve} automatically creates such a device on VM start, and
cleans it up again when the VM stops.

ifdef::wiki[]

See Also
~~~~~~~~

* link:/wiki/Pci_passthrough[PCI Passthrough Examples]

endif::wiki[]