[[qm_pci_passthrough]]
PCI(e) Passthrough
------------------
ifdef::wiki[]
:pve-toplevel:
endif::wiki[]

PCI(e) passthrough is a mechanism to give a virtual machine control over
a PCI device from the host. This can have some advantages over using
virtualized hardware, for example lower latency, higher performance, or more
features (e.g., offloading).

But, if you pass through a device to a virtual machine, you cannot use that
device anymore on the host or in any other VM.

General Requirements
~~~~~~~~~~~~~~~~~~~~

Since passthrough is a feature which also needs hardware support, there are
some requirements to check and preparations to be done to make it work.


Hardware
^^^^^^^^
Your hardware needs to support `IOMMU` (*I*/*O* **M**emory **M**anagement
**U**nit) interrupt remapping; this includes the CPU and the mainboard.

Generally, Intel systems with VT-d and AMD systems with AMD-Vi support this.
But it is not guaranteed that everything will work out of the box, due
to bad hardware implementations and missing or low quality drivers.

Further, server grade hardware often has better support than consumer grade
hardware, but even then, many modern systems can support this.

Please refer to your hardware vendor to check if they support this feature
under Linux for your specific setup.


Configuration
^^^^^^^^^^^^^

Once you have ensured that your hardware supports passthrough, you will need to
do some configuration to enable PCI(e) passthrough.


.IOMMU

First, the IOMMU support has to be enabled in your BIOS/UEFI. Most often, that
option is named `IOMMU` or `VT-d`, but check the manual of your motherboard
for the exact option you need to enable.

Then, the IOMMU has to be activated on the
xref:sysboot_edit_kernel_cmdline[kernel commandline]; a sketch of how to do
this follows the parameter list below.

The command line parameters are:

* for Intel CPUs:
+
----
intel_iommu=on
----
* for AMD CPUs it should be enabled automatically.

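For example, on a host that boots via GRUB, a minimal sketch (assuming the
default `GRUB_CMDLINE_LINUX_DEFAULT="quiet"` line; hosts using systemd-boot
keep their parameters in '/etc/kernel/cmdline' instead) would be to edit
'/etc/default/grub' like this:

----
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
----

Afterwards, run `update-grub` (or `proxmox-boot-tool refresh` on systemd-boot
setups) so the new commandline is actually applied on the next boot.
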
.Kernel Modules

You have to make sure the following modules are loaded. This can be achieved by
adding them to '/etc/modules':

----
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
----

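For instance, a quick sketch of appending them in one go (any text editor works
just as well):

----
# cat <<'EOF' >> /etc/modules
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
EOF
----
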
[[qm_pci_passthrough_update_initramfs]]
After changing anything related to kernel modules, you need to refresh your
`initramfs`. On {pve} this can be done by executing:

----
# update-initramfs -u -k all
----

.Finish Configuration

Finally, reboot to bring the changes into effect and check that it is indeed
enabled.

----
# dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
----

should display that `IOMMU`, `Directed I/O` or `Interrupt Remapping` is
enabled; depending on hardware and kernel, the exact message can vary.

It is also important that the device(s) you want to pass through
are in a *separate* `IOMMU` group. This can be checked with:

----
# find /sys/kernel/iommu_groups/ -type l
----

It is okay if the device is in an `IOMMU` group together with its functions
(e.g. a GPU with the HDMI Audio device) or with its root port or PCI(e) bridge.

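To get a more readable overview of which devices share a group, a small helper
like the following can be used (a sketch that only relies on the standard
`sysfs` layout and `lspci`):

----
#!/bin/sh
# list every IOMMU group together with the devices it contains
for dev in /sys/kernel/iommu_groups/*/devices/*; do
    group=${dev%/devices/*}
    printf 'IOMMU group %s: ' "${group##*/}"
    lspci -nns "${dev##*/}"
done
----
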
.PCI(e) slots
[NOTE]
====
Some platforms handle their physical PCI(e) slots differently. So, sometimes
it can help to put the card in another PCI(e) slot, if you do not get the
desired `IOMMU` group separation.
====

.Unsafe interrupts
[NOTE]
====
For some platforms, it may be necessary to allow unsafe interrupts.
For this, add the following line to a file ending in `.conf' in
*/etc/modprobe.d/*:

----
options vfio_iommu_type1 allow_unsafe_interrupts=1
----

Please be aware that this option can make your system unstable.
====

GPU Passthrough Notes
^^^^^^^^^^^^^^^^^^^^^

It is not possible to display the frame buffer of the GPU via NoVNC or SPICE on
the {pve} web interface.

When passing through a whole GPU or a vGPU and graphic output is wanted, one
has to either physically connect a monitor to the card, or configure remote
desktop software (for example, VNC or RDP) inside the guest.

If you want to use the GPU as a hardware accelerator, for example, for
programs using OpenCL or CUDA, this is not required.

Host Device Passthrough
~~~~~~~~~~~~~~~~~~~~~~~

The most used variant of PCI(e) passthrough is to pass through a whole
PCI(e) card, for example a GPU or a network card.


Host Configuration
^^^^^^^^^^^^^^^^^^

In this case, the host must not use the card. There are two methods to achieve
this:

* pass the device IDs to the options of the 'vfio-pci' module by adding
+
----
options vfio-pci ids=1234:5678,4321:8765
----
+
to a .conf file in */etc/modprobe.d/* where `1234:5678` and `4321:8765` are
the vendor and device IDs obtained by:
+
----
# lspci -nn
----

* blacklist the driver completely on the host, ensuring that it is free to bind
for passthrough, with
+
----
blacklist DRIVERNAME
----
+
in a .conf file in */etc/modprobe.d/*.

For both methods you need to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`] again and
reboot after that.

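As a sketch, reusing the placeholder IDs and driver name from above, such a
file (the name 'vfio.conf' is arbitrary, only the '.conf' suffix and the
location matter) could look like this:

----
# content of /etc/modprobe.d/vfio.conf (example file name)
options vfio-pci ids=1234:5678,4321:8765
blacklist DRIVERNAME
----

Both methods can also be combined in a single file as shown here; afterwards,
update the `initramfs` and reboot as described above.
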
.Verify Configuration

To check if your changes were successful, you can use

----
# lspci -nnk
----

and check your device entry. If it says

----
Kernel driver in use: vfio-pci
----

or the 'in use' line is missing entirely, the device is ready to be used for
passthrough.

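To limit the check to a single device, the slot address can be passed directly
(a sketch with a placeholder address):

----
# lspci -nnk -s 01:00.0
----
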
[[qm_pci_passthrough_vm_config]]
VM Configuration
^^^^^^^^^^^^^^^^
To pass through the device, you need to set the *hostpciX* option in the VM
configuration, for example by executing:

----
# qm set VMID -hostpci0 00:02.0
----

If your device has multiple functions (e.g., ``00:02.0`' and ``00:02.1`'),
you can pass them all through together with the shortened syntax ``00:02`'.
This is equivalent to checking the ``All Functions`' checkbox in the
web interface.

There are some options which may be necessary, depending on the device
and guest OS:

* *x-vga=on|off* marks the PCI(e) device as the primary GPU of the VM.
With this enabled the *vga* configuration option will be ignored.

* *pcie=on|off* tells {pve} to use a PCIe or PCI port. Some guest/device
combinations require PCIe rather than PCI. PCIe is only available for 'q35'
machine types.

* *rombar=on|off* makes the firmware ROM visible for the guest. Default is on.
Some PCI(e) devices need this disabled.

* *romfile=<path>* is an optional path to a ROM file for the device to use.
This is a relative path under */usr/share/kvm/*.

.Example

An example of PCIe passthrough with a GPU set to primary:

----
# qm set VMID -hostpci0 02:00,pcie=on,x-vga=on
----

.PCI ID overrides

You can override the PCI vendor ID, device ID, and subsystem IDs that will be
seen by the guest. This is useful if your device is a variant with an ID that
your guest's drivers don't recognize, but you want to force those drivers to be
loaded anyway (e.g. if you know your device shares the same chipset as a
supported variant).

The available options are `vendor-id`, `device-id`, `sub-vendor-id`, and
`sub-device-id`. You can set any or all of these to override your device's
default IDs.

For example:

----
# qm set VMID -hostpci0 02:00,device-id=0x10f6,sub-vendor-id=0x0000
----


Other considerations
^^^^^^^^^^^^^^^^^^^^

When passing through a GPU, the best compatibility is reached when using
'q35' as machine type, 'OVMF' ('EFI' for VMs) instead of SeaBIOS and PCIe
instead of PCI. Note that if you want to use 'OVMF' for GPU passthrough, the
GPU needs to have an EFI capable ROM, otherwise use SeaBIOS instead.

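As a sketch (with `VMID` as a placeholder, and assuming an EFI disk is added
separately, for example via the web interface), a VM can be switched to this
combination with:

----
# qm set VMID -machine q35 -bios ovmf
----
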
SR-IOV
~~~~~~

Another variant for passing through PCI(e) devices is to use the hardware
virtualization features of your devices, if available.

'SR-IOV' (**S**ingle-**R**oot **I**nput/**O**utput **V**irtualization) enables
a single device to provide multiple 'VF' (**V**irtual **F**unctions) to the
system. Each of those VFs can be used in a different VM, with full hardware
features and also better performance and lower latency than software
virtualized devices.

Currently, the most common use case for this is NICs (**N**etwork
**I**nterface **C**ards) with SR-IOV support, which can provide multiple VFs
per physical port. This allows features such as checksum offloading to be used
inside a VM, reducing the (host) CPU overhead.


Host Configuration
^^^^^^^^^^^^^^^^^^

Generally, there are two methods for enabling virtual functions on a device.

* sometimes there is an option for the driver module, e.g. for some
Intel drivers
+
----
max_vfs=4
----
+
which could be put in a file with a '.conf' ending under */etc/modprobe.d/*.
(Do not forget to update your initramfs after that.)
+
Please refer to your driver module documentation for the exact
parameters and options.

* The second, more generic, approach is using `sysfs`.
If the device and driver support this, you can change the number of VFs on
the fly. For example, to set up 4 VFs on device 0000:01:00.0 execute:
+
----
# echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
----
+
To make this change persistent you can use the `sysfsutils` Debian package.
After installation, configure it via */etc/sysfs.conf* or a `FILE.conf' in
*/etc/sysfs.d/* (see the sketch after this list).

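As a sketch, assuming the device and VF count from the example above, the
corresponding `sysfsutils` entry (the file name is arbitrary) could look like
this:

----
# content of /etc/sysfs.d/sriov.conf (example file name)
bus/pci/devices/0000:01:00.0/sriov_numvfs = 4
----
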
VM Configuration
^^^^^^^^^^^^^^^^

After creating VFs, you should see them as separate PCI(e) devices when
outputting them with `lspci`. Get their ID and pass them through like a
xref:qm_pci_passthrough_vm_config[normal PCI(e) device].

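For example (a sketch with placeholder values), a VF that shows up in `lspci`
can then be passed through just like any other device:

----
# lspci -nn | grep -i "virtual function"
# qm set VMID -hostpci0 01:10.0
----
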
Other considerations
^^^^^^^^^^^^^^^^^^^^

For this feature, platform support is especially important. It may be necessary
to enable this feature in the BIOS/EFI first, or to use a specific PCI(e) port
for it to work. If in doubt, consult the manual of the platform or contact its
vendor.

Mediated Devices (vGPU, GVT-g)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mediated devices are another method to reuse features and performance from
physical hardware for virtualized hardware. These are most commonly found in
virtualized GPU setups, such as Intel's GVT-g and NVIDIA's vGPUs used in their
GRID technology.

With this, a physical card is able to create virtual cards, similar to SR-IOV.
The difference is that mediated devices do not appear as PCI(e) devices in the
host, and are as such only suited for use in virtual machines.


Host Configuration
^^^^^^^^^^^^^^^^^^

In general, your card's driver must support that feature, otherwise it will
not work. So please refer to your vendor for compatible drivers and how to
configure them.

Intel's drivers for GVT-g are integrated in the kernel and should work
with 5th, 6th and 7th generation Intel Core Processors, as well as E3 v4, E3
v5 and E3 v6 Xeon Processors.

To enable it for Intel Graphics, you have to make sure to load the module
'kvmgt' (for example via `/etc/modules`) and to enable it on the
xref:sysboot_edit_kernel_cmdline[kernel commandline] by adding the following
parameter:

----
i915.enable_gvt=1
----

After that remember to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`]
and reboot your host.

VM Configuration
^^^^^^^^^^^^^^^^

To use a mediated device, simply specify the `mdev` property on a `hostpciX`
VM configuration option.

You can get the supported devices via 'sysfs'. For example, to list the
supported types for the device '0000:00:02.0' you would simply execute:

----
# ls /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types
----

Each entry is a directory which contains the following important files:

* 'available_instances' contains the number of still available instances of
this type; each 'mdev' in use by a VM reduces this.
* 'description' contains a short description of the capabilities of the type
* 'create' is the endpoint to create such a device; {pve} does this
automatically for you, if a 'hostpciX' option with `mdev` is configured.

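For example, to pick a suitable type, the descriptions and the remaining
instances can be inspected like this (a sketch; the available types depend on
the hardware and driver):

----
# cat /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types/*/description
# cat /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types/*/available_instances
----
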
Example configuration with an `Intel GVT-g vGPU` (`Intel Skylake 6700k`):

----
# qm set VMID -hostpci0 00:02.0,mdev=i915-GVTg_V5_4
----

With this set, {pve} automatically creates such a device on VM start, and
cleans it up again when the VM stops.

ifdef::wiki[]

See Also
~~~~~~~~

* link:/wiki/Pci_passthrough[PCI Passthrough Examples]

endif::wiki[]