[[qm_pci_passthrough]]
PCI(e) Passthrough
------------------
ifdef::wiki[]
:pve-toplevel:
endif::wiki[]

PCI(e) passthrough is a mechanism to give a virtual machine control over
a PCI device from the host. This can have some advantages over using
virtualized hardware, for example lower latency, higher performance, or more
features (e.g., offloading).

But, if you pass through a device to a virtual machine, you cannot use that
device anymore on the host or in any other VM.

General Requirements
~~~~~~~~~~~~~~~~~~~~

Since passthrough is a feature which also needs hardware support, there are
some requirements to check and preparations to be done to make it work.


Hardware
^^^^^^^^
Your hardware needs to support `IOMMU` (*I*/*O* **M**emory **M**anagement
**U**nit) interrupt remapping; this includes the CPU and the mainboard.

Generally, Intel systems with VT-d and AMD systems with AMD-Vi support this.
But it is not guaranteed that everything will work out of the box, due
to bad hardware implementations and missing or low quality drivers.

Further, server grade hardware often has better support than consumer grade
hardware, but even then, many modern systems support this.

Please refer to your hardware vendor to check if they support this feature
under Linux for your specific setup.


Configuration
^^^^^^^^^^^^^

Once you have ensured that your hardware supports passthrough, you will need to
do some configuration to enable PCI(e) passthrough.


.IOMMU

The IOMMU has to be activated on the
xref:sysboot_edit_kernel_cmdline[kernel commandline].

The command line parameters are:

* for Intel CPUs:
+
----
intel_iommu=on
----
* for AMD CPUs it should be enabled automatically.

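For example, on a host that boots via GRUB, the parameter could be appended to
the default kernel command line in `/etc/default/grub` and activated with
`update-grub` (a minimal sketch; see the xref above for details and for other
boot loaders such as `systemd-boot`):

----
# /etc/default/grub (excerpt)
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
----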
.Kernel Modules

You have to make sure the following modules are loaded. This can be achieved by
adding them to '/etc/modules':

----
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
----
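As a sketch, the modules could be appended with a small shell loop, skipping
any that are already listed (assuming the default, line-based format of
'/etc/modules'):

----
for m in vfio vfio_iommu_type1 vfio_pci vfio_virqfd; do
    grep -qx "$m" /etc/modules || echo "$m" >> /etc/modules
done
----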

[[qm_pci_passthrough_update_initramfs]]
After changing anything module-related, you need to refresh your
`initramfs`. On {pve} this can be done by executing:

----
# update-initramfs -u -k all
----

.Finish Configuration

Finally, reboot to bring the changes into effect and check that it is indeed
enabled.

----
# dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
----

should display that `IOMMU`, `Directed I/O` or `Interrupt Remapping` is
enabled; depending on hardware and kernel, the exact message can vary.

It is also important that the device(s) you want to pass through
are in a *separate* `IOMMU` group. This can be checked with:

----
# find /sys/kernel/iommu_groups/ -type l
----
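To see which devices share a group, the symlinks can also be resolved into a
per-group listing, for example with a small shell loop like this (a sketch):

----
# print each IOMMU group together with the devices it contains
for dev in /sys/kernel/iommu_groups/*/devices/*; do
    group="${dev%/devices/*}"
    printf 'IOMMU group %s: ' "${group##*/}"
    lspci -nns "${dev##*/}"
done
----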

It is okay if the device is in an `IOMMU` group together with its functions
(e.g. a GPU with the HDMI Audio device) or with its root port or PCI(e) bridge.

.PCI(e) slots
[NOTE]
====
Some platforms handle their physical PCI(e) slots differently. So, sometimes
it can help to put the card into another PCI(e) slot, if you do not get the
desired `IOMMU` group separation.
====

.Unsafe interrupts
[NOTE]
====
For some platforms, it may be necessary to allow unsafe interrupts.
For this, add the following line to a file ending with '.conf' in
*/etc/modprobe.d/*:

----
options vfio_iommu_type1 allow_unsafe_interrupts=1
----

Please be aware that this option can make your system unstable.
====

GPU Passthrough Notes
^^^^^^^^^^^^^^^^^^^^^

It is not possible to display the frame buffer of the GPU via NoVNC or SPICE on
the {pve} web interface.

When passing through a whole GPU or a vGPU and graphic output is wanted, one
has to either physically connect a monitor to the card, or configure remote
desktop software (for example, VNC or RDP) inside the guest.

If you want to use the GPU as a hardware accelerator, for example, for
programs using OpenCL or CUDA, this is not required.

Host Device Passthrough
~~~~~~~~~~~~~~~~~~~~~~~

The most used variant of PCI(e) passthrough is to pass through a whole
PCI(e) card, for example a GPU or a network card.


Host Configuration
^^^^^^^^^^^^^^^^^^

In this case, the host must not use the card. There are two methods to achieve
this:

* pass the device IDs to the options of the 'vfio-pci' module by adding
+
----
options vfio-pci ids=1234:5678,4321:8765
----
+
to a .conf file in */etc/modprobe.d/*, where `1234:5678` and `4321:8765` are
the vendor and device IDs obtained by:
+
----
# lspci -nn
----

* blacklist the driver completely on the host, ensuring that it is free to bind
for passthrough, with
+
----
blacklist DRIVERNAME
----
+
in a .conf file in */etc/modprobe.d/*.

For both methods you need to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`] again and
reboot after that.

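For example, both settings could go into a single file (the file name below is
an arbitrary choice, and the IDs and driver name are the placeholders from
above):

----
# /etc/modprobe.d/vfio.conf
# bind these devices to vfio-pci instead of their regular driver
options vfio-pci ids=1234:5678,4321:8765
# and/or keep the host driver from loading at all
blacklist DRIVERNAME
----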
.Verify Configuration

To check if your changes were successful, you can use

----
# lspci -nnk
----

and check your device entry. If it says

----
Kernel driver in use: vfio-pci
----

or the 'in use' line is missing entirely, the device is ready to be used for
passthrough.

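To inspect a single device instead of the whole list, `lspci` can also be
limited to one slot (the address `01:00.0` below is just a placeholder for
your device):

----
# lspci -nnk -s 01:00.0
----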
[[qm_pci_passthrough_vm_config]]
VM Configuration
^^^^^^^^^^^^^^^^
To pass through the device you need to set the *hostpciX* option in the VM
configuration, for example by executing:

----
# qm set VMID -hostpci0 00:02.0
----

If your device has multiple functions (e.g., `00:02.0` and `00:02.1`),
you can pass them all through together with the shortened syntax `00:02`.
This is equivalent to checking the `All Functions` checkbox in the
web-interface.

There are some options which may be necessary, depending on the device
and guest OS:

* *x-vga=on|off* marks the PCI(e) device as the primary GPU of the VM.
With this enabled the *vga* configuration option will be ignored.

* *pcie=on|off* tells {pve} to use a PCIe or PCI port. Some guest/device
combinations require PCIe rather than PCI. PCIe is only available for 'q35'
machine types.

* *rombar=on|off* makes the firmware ROM visible for the guest. Default is on.
Some PCI(e) devices need this disabled.

* *romfile=<path>* is an optional path to a ROM file for the device to use.
This is a relative path under */usr/share/kvm/*.

.Example

An example of PCIe passthrough with a GPU set to primary:

----
# qm set VMID -hostpci0 02:00,pcie=on,x-vga=on
----


Other considerations
^^^^^^^^^^^^^^^^^^^^

When passing through a GPU, the best compatibility is reached when using
'q35' as machine type, 'OVMF' ('EFI' for VMs) instead of SeaBIOS and PCIe
instead of PCI. Note that if you want to use 'OVMF' for GPU passthrough, the
GPU needs to have an EFI capable ROM, otherwise use SeaBIOS instead.

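As a sketch, these settings could be applied to an existing VM like this (VMID
is a placeholder; when switching to OVMF, an EFI disk should also be added,
for example via the web interface):

----
# qm set VMID -machine q35 -bios ovmf
----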
SR-IOV
~~~~~~

Another variant for passing through PCI(e) devices is to use the hardware
virtualization features of your devices, if available.

'SR-IOV' (**S**ingle-**R**oot **I**nput/**O**utput **V**irtualization) enables
a single device to provide multiple 'VF' (**V**irtual **F**unctions) to the
system. Each of those 'VF' can be used in a different VM, with full hardware
features and also better performance and lower latency than software
virtualized devices.

Currently, the most common use case for this are NICs (**N**etwork
**I**nterface **C**ards) with SR-IOV support, which can provide multiple VFs
per physical port. This allows features such as checksum offloading to be used
inside a VM, reducing the (host) CPU overhead.


Host Configuration
^^^^^^^^^^^^^^^^^^

Generally, there are two methods for enabling virtual functions on a device.

* sometimes there is an option for the driver module, e.g. for some
Intel drivers
+
----
max_vfs=4
----
+
which could be put into a file ending with '.conf' under */etc/modprobe.d/*.
(Do not forget to update your initramfs after that.)
+
Please refer to your driver module documentation for the exact
parameters and options.

* The second, more generic, approach is using the `sysfs`.
If a device and driver support this, you can change the number of VFs on
the fly. For example, to set up 4 VFs on device 0000:01:00.0 execute:
+
----
# echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
----
+
To make this change persistent you can use the `sysfsutils` Debian package.
After installation, configure it via */etc/sysfs.conf* or a 'FILE.conf' in
*/etc/sysfs.d/*.

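A minimal sketch of such a persistent configuration, reusing the device and VF
count from the example above (the attribute path is interpreted relative to
*/sys*, and the file name is an arbitrary choice):

----
# /etc/sysfs.d/sriov.conf
bus/pci/devices/0000:01:00.0/sriov_numvfs = 4
----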
VM Configuration
^^^^^^^^^^^^^^^^

After creating VFs, you should see them as separate PCI(e) devices when
outputting them with `lspci`. Get their IDs and pass them through like a
xref:qm_pci_passthrough_vm_config[normal PCI(e) device].

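For example, a VF of a typical SR-IOV capable NIC shows up with 'Virtual
Function' in its description and can then be assigned by its address (both the
grep pattern and the address below are illustrative placeholders):

----
# lspci -nn | grep -i "virtual function"
# qm set VMID -hostpci0 01:10.0
----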
Other considerations
^^^^^^^^^^^^^^^^^^^^

For this feature, platform support is especially important. It may be necessary
to enable this feature in the BIOS/EFI first, or to use a specific PCI(e) port
for it to work. If in doubt, consult the manual of the platform or contact its
vendor.

Mediated Devices (vGPU, GVT-g)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mediated devices are another method to reuse features and performance from
physical hardware for virtualized hardware. These are most commonly found in
virtualized GPU setups such as Intel's GVT-g and NVIDIA's vGPUs used in their
GRID technology.

With this, a physical card is able to create virtual cards, similar to SR-IOV.
The difference is that mediated devices do not appear as PCI(e) devices in the
host, and are thus only suited for use in virtual machines.


Host Configuration
^^^^^^^^^^^^^^^^^^

In general, your card's driver must support that feature, otherwise it will
not work. So please refer to your vendor for compatible drivers and how to
configure them.

Intel's drivers for GVT-g are integrated in the kernel and should work
with 5th, 6th and 7th generation Intel Core Processors, as well as E3 v4, E3
v5 and E3 v6 Xeon Processors.

To enable it for Intel Graphics, you have to make sure to load the module
'kvmgt' (for example via `/etc/modules`) and to add the following parameter to
the xref:sysboot_edit_kernel_cmdline[kernel commandline]:

----
i915.enable_gvt=1
----

After that remember to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`]
and reboot your host.

VM Configuration
^^^^^^^^^^^^^^^^

To use a mediated device, simply specify the `mdev` property on a `hostpciX`
VM configuration option.

You can get the supported devices via 'sysfs'. For example, to list the
supported types for the device '0000:00:02.0' you would simply execute:

----
# ls /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types
----

Each entry is a directory which contains the following important files:

* 'available_instances' contains the number of still available instances of
this type; each 'mdev' used in a VM reduces this.
* 'description' contains a short description of the capabilities of the type.
* 'create' is the endpoint to create such a device; {pve} does this
automatically for you, if a 'hostpciX' option with `mdev` is configured.

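For example, to check how many instances of a given type are still available
(the type name used here is the one from the example below; the available
types depend on your hardware):

----
# cat /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types/i915-GVTg_V5_4/available_instances
----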
Example configuration with an `Intel GVT-g vGPU` (`Intel Skylake 6700k`):

----
# qm set VMID -hostpci0 00:02.0,mdev=i915-GVTg_V5_4
----

With this set, {pve} automatically creates such a device on VM start, and
cleans it up again when the VM stops.

ifdef::wiki[]

See Also
~~~~~~~~

* link:/wiki/Pci_passthrough[PCI Passthrough Examples]

endif::wiki[]