[[qm_pci_passthrough]]
PCI(e) Passthrough
------------------
ifdef::wiki[]
:pve-toplevel:
endif::wiki[]

PCI(e) passthrough is a mechanism to give a virtual machine control over
a PCI device from the host. This can have some advantages over using
virtualized hardware, for example lower latency, higher performance, or more
features (e.g., offloading).

But, if you pass through a device to a virtual machine, you cannot use that
device anymore on the host or in any other VM.

General Requirements
~~~~~~~~~~~~~~~~~~~~

Since passthrough is a feature which also needs hardware support, there are
some requirements to check and preparations to be done to make it work.


Hardware
^^^^^^^^
Your hardware needs to support `IOMMU` (*I*/*O* **M**emory **M**anagement
**U**nit) interrupt remapping; this includes the CPU and the mainboard.

Generally, Intel systems with VT-d and AMD systems with AMD-Vi support this.
But it is not guaranteed that everything will work out of the box, due
to bad hardware implementations and missing or low-quality drivers.

Further, server-grade hardware often has better support than consumer-grade
hardware, but even then, many modern systems can support this.

Please refer to your hardware vendor to check if they support this feature
under Linux for your specific setup.


Configuration
^^^^^^^^^^^^^

Once you have ensured that your hardware supports passthrough, you will need
to do some configuration to enable PCI(e) passthrough.


.IOMMU

First, IOMMU support has to be enabled in your BIOS/UEFI. Most often, that
option is named `IOMMU` or `VT-d`, but check the manual for your motherboard
for the exact option you need to enable.

Then, the IOMMU has to be activated on the
xref:sysboot_edit_kernel_cmdline[kernel commandline].

The command line parameters are:

* for Intel CPUs:
+
----
 intel_iommu=on
----
* for AMD CPUs it should be enabled automatically.
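
After rebooting, you may want to confirm that the parameter actually reached
the running kernel. The following is a minimal sketch, not part of {pve}: the
function name `has_cmdline_param` is made up for illustration, and it only
assumes that the running kernel's command line can be read from
`/proc/cmdline`.

```shell
# has_cmdline_param PARAM [FILE]
# Succeeds if PARAM appears as a whole word in the kernel command line.
# FILE defaults to /proc/cmdline; the optional second argument only exists
# so the function can be tried against a sample file.
has_cmdline_param() {
    # Put every whitespace-separated word on its own line, then look for
    # an exact, literal, whole-line match.
    tr -s ' \t' '\n\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}
```

For example, `has_cmdline_param intel_iommu=on && echo "parameter is set"`
prints the message only if the exact word is present.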

.Kernel Modules

You have to make sure the following modules are loaded. This can be achieved by
adding them to `/etc/modules`:

----
 vfio
 vfio_iommu_type1
 vfio_pci
 vfio_virqfd
----

[[qm_pci_passthrough_update_initramfs]]
After changing anything related to kernel modules, you need to refresh your
`initramfs`. On {pve} this can be done by executing:

----
# update-initramfs -u -k all
----
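
Once the host has been rebooted, you can check whether the modules listed
above were loaded. The sketch below is only an illustration under the
assumption that loaded modules are listed in `/proc/modules` with the module
name in the first column; the helper name `missing_modules` is hypothetical.
Note that anything compiled directly into the kernel does not show up in that
file, so a module reported as missing is a hint to investigate, not proof of
a problem.

```shell
# missing_modules FILE MODULE...
# Prints every MODULE whose name is not found in the first column of FILE
# (FILE is expected to have the layout of /proc/modules).
missing_modules() {
    file=$1
    shift
    for mod in "$@"; do
        # awk exits with status 0 only if a line starts with the module name.
        awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$file" \
            || printf '%s\n' "$mod"
    done
}
```

A typical call would be
`missing_modules /proc/modules vfio vfio_iommu_type1 vfio_pci vfio_virqfd`;
no output means all four names were found.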

.Finish Configuration

Finally, reboot to bring the changes into effect and check that it is indeed
enabled.

----
# dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
----

The output should show that `IOMMU`, `Directed I/O` or `Interrupt Remapping` is
enabled; the exact message can vary depending on hardware and kernel.

It is also important that the device(s) you want to pass through
are in a *separate* `IOMMU` group. This can be checked with:

----
# find /sys/kernel/iommu_groups/ -type l
----

It is okay if the device is in an `IOMMU` group together with its functions
(e.g. a GPU with the HDMI Audio device) or with its root port or PCI(e) bridge.
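
The raw `find` output can be hard to read on systems with many devices. As a
rough sketch, the same information can be printed as one line per device,
sorted by group number. The function name `list_iommu_groups` is invented for
this example, and the code assumes the usual
`/sys/kernel/iommu_groups/<group>/devices/<address>` layout shown by the
`find` command above.

```shell
# list_iommu_groups [ROOT]
# Prints "group <N>: <PCI address>" for every device, sorted by group number.
# ROOT defaults to /sys/kernel/iommu_groups.
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        # Skip the literal pattern if the glob matched nothing.
        [ -e "$dev" ] || [ -L "$dev" ] || continue
        group=${dev%/devices/*}
        printf 'group %s: %s\n' "${group##*/}" "${dev##*/}"
    done | sort -k2,2n -k3
}
```

Devices that share a group number with the device you want to pass through
are the ones to look at more closely.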

.PCI(e) slots
[NOTE]
====
Some platforms handle their physical PCI(e) slots differently. So, sometimes
it can help to put the card in another PCI(e) slot, if you do not get the
desired `IOMMU` group separation.
====

.Unsafe interrupts
[NOTE]
====
For some platforms, it may be necessary to allow unsafe interrupts.
For this, add the following line to a file ending with `.conf` in
*/etc/modprobe.d/*:

----
 options vfio_iommu_type1 allow_unsafe_interrupts=1
----

Please be aware that this option can make your system unstable.
====

GPU Passthrough Notes
^^^^^^^^^^^^^^^^^^^^^

It is not possible to display the frame buffer of the GPU via NoVNC or SPICE on
the {pve} web interface.

When passing through a whole GPU or a vGPU and graphic output is wanted, one
has to either physically connect a monitor to the card, or configure a remote
desktop software (for example, VNC or RDP) inside the guest.

If you want to use the GPU as a hardware accelerator, for example, for
programs using OpenCL or CUDA, this is not required.

Host Device Passthrough
~~~~~~~~~~~~~~~~~~~~~~~

The most used variant of PCI(e) passthrough is to pass through a whole
PCI(e) card, for example a GPU or a network card.


Host Configuration
^^^^^^^^^^^^^^^^^^

In this case, the host must not use the card. There are two methods to achieve
this:

* pass the device IDs to the options of the 'vfio-pci' module by adding
+
----
 options vfio-pci ids=1234:5678,4321:8765
----
+
to a .conf file in */etc/modprobe.d/* where `1234:5678` and `4321:8765` are
the vendor and device IDs obtained by:
+
----
# lspci -nn
----

* blacklist the driver completely on the host, ensuring that it is free to bind
for passthrough, with
+
----
 blacklist DRIVERNAME
----
+
in a .conf file in */etc/modprobe.d/*.

For both methods you need to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`] again and
reboot after that.
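
For the first method, the vendor and device IDs have to be copied out of the
`lspci -nn` output by hand. The sketch below shows one way this could be
automated. It is an illustration only: the function name `vfio_ids_option` is
hypothetical, and it assumes that each input line ends with the
`[vendor:device]` pair in lowercase hexadecimal, as `lspci -nn` normally
prints it.

```shell
# vfio_ids_option
# Reads `lspci -nn` lines on stdin and prints a single
# "options vfio-pci ids=..." line with their vendor:device IDs.
# Returns non-zero if no ID was found in the input.
vfio_ids_option() {
    # The greedy ".*" makes sed pick the last [xxxx:xxxx] pair on each line,
    # which skips the four-digit class code such as [0300].
    ids=$(sed -n 's/.*\[\([0-9a-f]\{4\}:[0-9a-f]\{4\}\)\].*/\1/p' \
        | sort -u | paste -s -d, -)
    if [ -z "$ids" ]; then
        return 1
    fi
    printf 'options vfio-pci ids=%s\n' "$ids"
}
```

For a card at address `01:00`, running `lspci -nn -s 01:00 | vfio_ids_option`
prints a line that can be reviewed and then placed in a .conf file in
*/etc/modprobe.d/*.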

.Verify Configuration

To check if your changes were successful, you can use

----
# lspci -nnk
----

and check your device entry. If it says

----
Kernel driver in use: vfio-pci
----

or the 'in use' line is missing entirely, the device is ready to be used for
passthrough.
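
If you need to repeat this check for several devices, it can be scripted. The
following is a small sketch of the rule described above; the function names
`driver_in_use` and `ready_for_passthrough` are made up for this example, and
the input is assumed to be the `lspci -nnk` output for a single device.

```shell
# driver_in_use
# Reads `lspci -nnk` output for one device on stdin and prints the bound
# kernel driver, or nothing if there is no "in use" line.
driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use: //p'
}

# ready_for_passthrough
# Succeeds if the device on stdin is bound to vfio-pci or to no driver.
ready_for_passthrough() {
    drv=$(driver_in_use)
    [ -z "$drv" ] || [ "$drv" = "vfio-pci" ]
}
```

For example, `lspci -nnk -s 01:00.0 | ready_for_passthrough && echo ready`
prints `ready` only in the two cases named above.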

[[qm_pci_passthrough_vm_config]]
VM Configuration
^^^^^^^^^^^^^^^^
To pass through the device you need to set the *hostpciX* option in the VM
configuration, for example by executing:

----
# qm set VMID -hostpci0 00:02.0
----

If your device has multiple functions (e.g., `00:02.0` and `00:02.1`),
you can pass them through all together with the shortened syntax `00:02`.
This is equivalent to checking the 'All Functions' checkbox in the
web interface.

There are some options which may be necessary, depending on the device
and guest OS:

* *x-vga=on|off* marks the PCI(e) device as the primary GPU of the VM.
With this enabled the *vga* configuration option will be ignored.

* *pcie=on|off* tells {pve} to use a PCIe or PCI port. Some guest/device
combinations require PCIe rather than PCI. PCIe is only available for 'q35'
machine types.

* *rombar=on|off* makes the firmware ROM visible for the guest. Default is on.
Some PCI(e) devices need this disabled.

* *romfile=<path>* is an optional path to a ROM file for the device to use.
This is a relative path under */usr/share/kvm/*.

.Example

An example of PCIe passthrough with a GPU set to primary:

----
# qm set VMID -hostpci0 02:00,pcie=on,x-vga=on
----

.PCI ID overrides

You can override the PCI vendor ID, device ID, and subsystem IDs that will be
seen by the guest. This is useful if your device is a variant with an ID that
your guest's drivers don't recognize, but you want to force those drivers to be
loaded anyway (e.g. if you know your device shares the same chipset as a
supported variant).

The available options are `vendor-id`, `device-id`, `sub-vendor-id`, and
`sub-device-id`. You can set any or all of these to override your device's
default IDs.

For example:

----
# qm set VMID -hostpci0 02:00,device-id=0x10f6,sub-vendor-id=0x0000
----


Other considerations
^^^^^^^^^^^^^^^^^^^^

When passing through a GPU, the best compatibility is reached when using
'q35' as machine type, 'OVMF' ('EFI' for VMs) instead of SeaBIOS and PCIe
instead of PCI. Note that if you want to use 'OVMF' for GPU passthrough, the
GPU needs to have an EFI capable ROM, otherwise use SeaBIOS instead.

SR-IOV
~~~~~~

Another variant for passing through PCI(e) devices is to use the hardware
virtualization features of your devices, if available.

'SR-IOV' (**S**ingle-**R**oot **I**nput/**O**utput **V**irtualization) enables
a single device to provide multiple 'VF' (**V**irtual **F**unctions) to the
system. Each of those 'VF' can be used in a different VM, with full hardware
features and also better performance and lower latency than software
virtualized devices.

Currently, the most common use case for this are NICs (**N**etwork
**I**nterface **C**ard) with SR-IOV support, which can provide multiple VFs per
physical port. This allows features such as checksum offloading to
be used inside a VM, reducing the (host) CPU overhead.


Host Configuration
^^^^^^^^^^^^^^^^^^

Generally, there are two methods for enabling virtual functions on a device.

* sometimes there is an option for the driver module, e.g. for some
Intel drivers
+
----
 max_vfs=4
----
+
which could be put in a file with a '.conf' ending under */etc/modprobe.d/*.
(Do not forget to update your initramfs after that.)
+
Please refer to your driver module documentation for the exact
parameters and options.

* The second, more generic, approach is using the `sysfs`.
If a device and driver support this, you can change the number of VFs on
the fly. For example, to set up 4 VFs on device 0000:01:00.0 execute:
+
----
# echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
----
+
To make this change persistent you can use the `sysfsutils` Debian package.
After installation configure it via */etc/sysfs.conf* or a `FILE.conf` in
*/etc/sysfs.d/*.
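
When using the `sysfs` approach from a script, two details are worth
handling: the device reports its upper limit in the `sriov_totalvfs` file next
to `sriov_numvfs`, and the kernel rejects a change from one non-zero VF count
to another, so the count has to be reset to 0 first. The helper below is a
sketch under these assumptions; its name `set_numvfs` is hypothetical. Be
aware that resetting the count removes all existing VFs of the device, so do
not run it while VMs are using them.

```shell
# set_numvfs DEVDIR COUNT
# DEVDIR is a device directory such as /sys/bus/pci/devices/0000:01:00.0.
set_numvfs() {
    devdir=$1
    count=$2
    if [ ! -w "$devdir/sriov_numvfs" ]; then
        echo "no writable sriov_numvfs in $devdir" >&2
        return 1
    fi
    total=$(cat "$devdir/sriov_totalvfs")
    if [ "$count" -gt "$total" ]; then
        echo "requested $count VFs, device supports at most $total" >&2
        return 1
    fi
    # Reset to 0 first: a non-zero count cannot be changed directly.
    echo 0 > "$devdir/sriov_numvfs"
    echo "$count" > "$devdir/sriov_numvfs"
}
```

The call matching the example above would be
`set_numvfs /sys/bus/pci/devices/0000:01:00.0 4`.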

VM Configuration
^^^^^^^^^^^^^^^^

After creating VFs, you should see them as separate PCI(e) devices when
outputting them with `lspci`. Get their ID and pass them through like a
xref:qm_pci_passthrough_vm_config[normal PCI(e) device].

Other considerations
^^^^^^^^^^^^^^^^^^^^

For this feature, platform support is especially important. It may be necessary
to enable this feature in the BIOS/EFI first, or to use a specific PCI(e) port
for it to work. If in doubt, consult the manual of the platform or contact its
vendor.

Mediated Devices (vGPU, GVT-g)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mediated devices are another method to reuse features and performance from
physical hardware for virtualized hardware. These are most commonly found in
virtualized GPU setups such as Intel's GVT-g and NVIDIA's vGPUs used in their
GRID technology.

With this, a physical card is able to create virtual cards, similar to SR-IOV.
The difference is that mediated devices do not appear as PCI(e) devices in the
host, and as such are only suited for use in virtual machines.


Host Configuration
^^^^^^^^^^^^^^^^^^

In general your card's driver must support that feature, otherwise it will
not work. So please refer to your vendor for compatible drivers and how to
configure them.

Intel's drivers for GVT-g are integrated in the kernel and should work
with 5th, 6th and 7th generation Intel Core Processors, as well as E3 v4, E3
v5 and E3 v6 Xeon Processors.

To enable it for Intel Graphics, you have to make sure to load the module
'kvmgt' (for example via `/etc/modules`) and to enable it on the
xref:sysboot_edit_kernel_cmdline[Kernel commandline] by adding the following
parameter:

----
 i915.enable_gvt=1
----

After that remember to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`],
and reboot your host.

VM Configuration
^^^^^^^^^^^^^^^^

To use a mediated device, simply specify the `mdev` property on a `hostpciX`
VM configuration option.

You can get the supported devices via the 'sysfs'. For example, to list the
supported types for the device '0000:00:02.0' you would simply execute:

----
# ls /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types
----

Each entry is a directory which contains the following important files:

* 'available_instances' contains the number of still available instances of
this type; each 'mdev' used in a VM reduces this.
* 'description' contains a short description about the capabilities of the type
* 'create' is the endpoint to create such a device; {pve} does this
automatically for you, if a 'hostpciX' option with `mdev` is configured.

Example configuration with an `Intel GVT-g vGPU` (`Intel Skylake 6700k`):

----
# qm set VMID -hostpci0 00:02.0,mdev=i915-GVTg_V5_4
----

With this set, {pve} automatically creates such a device on VM start, and
cleans it up again when the VM stops.
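
Before choosing a type for the `mdev` property, it can help to see all
supported types together with their remaining instances. The sketch below
walks the `mdev_supported_types` directory described above; the function name
`list_mdev_types` is an assumption made for this example and not a {pve}
command.

```shell
# list_mdev_types DEVDIR
# DEVDIR is a device directory such as /sys/bus/pci/devices/0000:00:02.0.
# Prints "<type><TAB><available instances>" for every supported type.
list_mdev_types() {
    for t in "$1"/mdev_supported_types/*; do
        # Skip the literal pattern if the device has no mediated types.
        [ -d "$t" ] || continue
        printf '%s\t%s\n' "${t##*/}" "$(cat "$t/available_instances")"
    done
}
```

A type with `0` in the second column cannot be assigned to another VM until
an existing instance is released.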

ifdef::wiki[]

See Also
~~~~~~~~

* link:/wiki/Pci_passthrough[PCI Passthrough Examples]

endif::wiki[]