[[qm_pci_passthrough]]
PCI(e) Passthrough
------------------
ifdef::wiki[]
:pve-toplevel:
endif::wiki[]

PCI(e) passthrough is a mechanism to give a virtual machine control over
a PCI device from the host. This can have some advantages over using
virtualized hardware, for example lower latency, higher performance, or more
features (e.g., offloading).

But, if you pass through a device to a virtual machine, you cannot use that
device anymore on the host or in any other VM.

Note that, while PCI passthrough is available for i440fx and q35 machines, PCIe
passthrough is only available on q35 machines. This does not mean that
PCIe capable devices that are passed through as PCI devices will only run at
PCI speeds. Passing through devices as PCIe just sets a flag for the guest to
tell it that the device is a PCIe device instead of a "really fast legacy PCI
device". Some guest applications benefit from this.

General Requirements
~~~~~~~~~~~~~~~~~~~~

Since passthrough is performed on real hardware, it needs to fulfill some
requirements. A brief overview of these requirements is given below; for more
information on specific devices, see
https://pve.proxmox.com/wiki/PCI_Passthrough[PCI Passthrough Examples].

Hardware
^^^^^^^^
Your hardware needs to support `IOMMU` (*I*/*O* **M**emory **M**anagement
**U**nit) interrupt remapping; this includes the CPU and the motherboard.

Generally, Intel systems with VT-d and AMD systems with AMD-Vi support this.
But it is not guaranteed that everything will work out of the box, due
to bad hardware implementations and missing or low quality drivers.

Further, server grade hardware often has better support than consumer grade
hardware, but even then, many modern systems can support this.

Please refer to your hardware vendor to check if they support this feature
under Linux for your specific setup.

Determining PCI Card Address
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The easiest way is to use the GUI to add a device of type "Host PCI" in the VM's
hardware tab. Alternatively, you can use the command line.

You can locate your card using

----
lspci
----
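
The address is at the start of each output line, in `bus:device.function`
form. A discrete GPU, for example, might show up similar to this (illustrative
output, the exact devices and addresses will differ on your system):

----
01:00.0 VGA compatible controller: NVIDIA Corporation GP108 [GeForce GT 1030] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GP108 High Definition Audio Controller (rev a1)
----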

Configuration
^^^^^^^^^^^^^

Once you have ensured that your hardware supports passthrough, you will need to
do some configuration to enable PCI(e) passthrough.

.IOMMU

First, you will have to enable IOMMU support in your BIOS/UEFI. Usually the
corresponding setting is called `IOMMU` or `VT-d`, but you should find the exact
option name in the manual of your motherboard.

For Intel CPUs, you also need to enable the IOMMU on the
xref:sysboot_edit_kernel_cmdline[kernel command line] by adding:

----
intel_iommu=on
----

For AMD CPUs it should be enabled automatically.
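
How the kernel command line is edited depends on your bootloader; the section
linked above covers the details. As a rough sketch, on a system that boots via
GRUB, you would extend `GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub`, for
example:

----
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
----

and then run `update-grub`. Systems booting via `systemd-boot` (for example,
ZFS installations managed by `proxmox-boot-tool`) keep their parameters in
`/etc/kernel/cmdline` and apply them with `proxmox-boot-tool refresh` instead.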

.IOMMU Passthrough Mode

If your hardware supports IOMMU passthrough mode, enabling this mode might
increase performance.
This is because VMs then bypass the (default) DMA translation normally
performed by the hypervisor and instead pass DMA requests directly to the
hardware IOMMU. To enable these options, add:

----
iommu=pt
----

to the xref:sysboot_edit_kernel_cmdline[kernel command line].

.Kernel Modules

//TODO: remove `vfio_virqfd` stuff with eol of pve 7
You have to make sure the following modules are loaded. This can be achieved by
adding them to `/etc/modules`. In kernels newer than 6.2 ({pve} 8 and onward)
the 'vfio_virqfd' module is part of the 'vfio' module, therefore loading
'vfio_virqfd' in {pve} 8 and newer is not necessary.

----
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd #not needed if on kernel 6.2 or newer
----

[[qm_pci_passthrough_update_initramfs]]
After changing anything module-related, you need to refresh your
`initramfs`. On {pve} this can be done by executing:

----
# update-initramfs -u -k all
----

To check if the modules are being loaded, the output of

----
# lsmod | grep vfio
----

should include the four modules from above.

.Finish Configuration

Finally, reboot to bring the changes into effect and check that IOMMU is indeed
enabled. The output of

----
# dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
----

should display that `IOMMU`, `Directed I/O` or `Interrupt Remapping` is
enabled; depending on hardware and kernel, the exact message can vary.
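
For example, on an Intel system the output might contain lines similar to the
following (illustrative, the messages differ between platforms and kernel
versions):

----
DMAR: IOMMU enabled
DMAR-IR: Enabled IRQ remapping in x2apic mode
----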

For notes on how to troubleshoot or verify if IOMMU is working as intended, please
see the https://pve.proxmox.com/wiki/PCI_Passthrough#Verifying_IOMMU_parameters[Verifying IOMMU Parameters]
section in our wiki.

It is also important that the device(s) you want to pass through
are in a *separate* `IOMMU` group. This can be checked with a call to the {pve}
API:

----
# pvesh get /nodes/{nodename}/hardware/pci --pci-class-blacklist ""
----

It is okay if the device is in an `IOMMU` group together with its functions
(e.g. a GPU with the HDMI Audio device) or with its root port or PCI(e) bridge.

.PCI(e) slots
[NOTE]
====
Some platforms handle their physical PCI(e) slots differently. So, sometimes
it can help to put the card in another PCI(e) slot, if you do not get the
desired `IOMMU` group separation.
====

.Unsafe interrupts
[NOTE]
====
For some platforms, it may be necessary to allow unsafe interrupts.
For this, add the following line to a file ending with `.conf' in
*/etc/modprobe.d/*:

----
options vfio_iommu_type1 allow_unsafe_interrupts=1
----

Please be aware that this option can make your system unstable.
====

GPU Passthrough Notes
^^^^^^^^^^^^^^^^^^^^^

It is not possible to display the frame buffer of the GPU via NoVNC or SPICE on
the {pve} web interface.

When passing through a whole GPU or a vGPU and graphic output is wanted, one
has to either physically connect a monitor to the card, or configure remote
desktop software (for example, VNC or RDP) inside the guest.

If you want to use the GPU as a hardware accelerator, for example, for
programs using OpenCL or CUDA, this is not required.

Host Device Passthrough
~~~~~~~~~~~~~~~~~~~~~~~

The most used variant of PCI(e) passthrough is to pass through a whole
PCI(e) card, for example a GPU or a network card.

Host Configuration
^^^^^^^^^^^^^^^^^^

{pve} tries to automatically make the PCI(e) device unavailable for the host.
However, if this doesn't work, there are two things that can be done:

* pass the device IDs to the options of the 'vfio-pci' module by adding
+
----
options vfio-pci ids=1234:5678,4321:8765
----
+
to a .conf file in */etc/modprobe.d/* where `1234:5678` and `4321:8765` are
the vendor and device IDs obtained by:
+
----
# lspci -nn
----
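+
The IDs are the `[vendor:device]` pair in square brackets near the end of each
line. Illustrative output (the exact IDs depend on your device):
+
----
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP108 [GeForce GT 1030] [10de:1d01] (rev a1)
----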

* blacklist the driver on the host completely, ensuring that it is free to bind
for passthrough, with
+
----
blacklist DRIVERNAME
----
+
in a .conf file in */etc/modprobe.d/*.
+
To find the driver name, execute
+
----
# lspci -k
----
+
for example:
+
----
# lspci -k | grep -A 3 "VGA"
----
+
will output something similar to
+
----
01:00.0 VGA compatible controller: NVIDIA Corporation GP108 [GeForce GT 1030] (rev a1)
Subsystem: Micro-Star International Co., Ltd. [MSI] GP108 [GeForce GT 1030]
Kernel driver in use: <some-module>
Kernel modules: <some-module>
----
+
Now we can blacklist the drivers by writing them into a .conf file:
+
----
echo "blacklist <some-module>" >> /etc/modprobe.d/blacklist.conf
----

For both methods you need to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`] again and
reboot after that.

Should this not work, you might need to set a soft dependency to load the GPU
modules before loading 'vfio-pci'. This can be done with the 'softdep' flag, see
also the manpages on 'modprobe.d' for more information.

For example, if you are using drivers named <some-module>:

----
# echo "softdep <some-module> pre: vfio-pci" >> /etc/modprobe.d/<some-module>.conf
----

.Verify Configuration

To check if your changes were successful, you can use

----
# lspci -nnk
----

and check your device entry. If it says

----
Kernel driver in use: vfio-pci
----

or the 'in use' line is missing entirely, the device is ready to be used for
passthrough.

[[qm_pci_passthrough_vm_config]]
VM Configuration
^^^^^^^^^^^^^^^^
When passing through a GPU, the best compatibility is reached when using
'q35' as machine type, 'OVMF' ('UEFI' for VMs) instead of SeaBIOS and PCIe
instead of PCI. Note that if you want to use 'OVMF' for GPU passthrough, the
GPU needs to have a UEFI capable ROM, otherwise use SeaBIOS instead. To check if
the ROM is UEFI capable, see the
https://pve.proxmox.com/wiki/PCI_Passthrough#How_to_know_if_a_graphics_card_is_UEFI_.28OVMF.29_compatible[PCI Passthrough Examples]
wiki.

Furthermore, when using OVMF, it may be possible to disable VGA arbitration,
reducing the amount of legacy code that needs to run during boot. To disable
VGA arbitration:

----
echo "options vfio-pci ids=<vendor-id>,<device-id> disable_vga=1" > /etc/modprobe.d/vfio.conf
----

replacing the <vendor-id> and <device-id> with the ones obtained from:

----
# lspci -nn
----

PCI devices can be added in the web interface in the hardware section of the VM.
Alternatively, you can use the command line; set the *hostpciX* option in the VM
configuration, for example by executing:

----
# qm set VMID -hostpci0 00:02.0
----

or by adding a line to the VM configuration file:

----
hostpci0: 00:02.0
----

If your device has multiple functions (e.g., ``00:02.0`' and ``00:02.1`'),
you can pass them all through together with the shortened syntax ``00:02`'.
This is equivalent to checking the ``All Functions`' checkbox in the
web interface.
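
For example, to pass through both functions of the device above in one go:

----
# qm set VMID -hostpci0 00:02
----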

There are some options which may be necessary, depending on the device
and guest OS:

* *x-vga=on|off* marks the PCI(e) device as the primary GPU of the VM.
With this enabled the *vga* configuration option will be ignored.

* *pcie=on|off* tells {pve} to use a PCIe or PCI port. Some guest/device
combinations require PCIe rather than PCI. PCIe is only available for 'q35'
machine types.

* *rombar=on|off* makes the firmware ROM visible for the guest. Default is on.
Some PCI(e) devices need this disabled.

* *romfile=<path>* is an optional path to a ROM file for the device to use.
This is a relative path under */usr/share/kvm/* (see the example below).
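
As a sketch of the `romfile` option, assuming you have placed a ROM dump at
*/usr/share/kvm/vbios.bin* (the file name is just an example):

----
# qm set VMID -hostpci0 01:00,romfile=vbios.bin
----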

.Example

An example of PCIe passthrough with a GPU set to primary:

----
# qm set VMID -hostpci0 02:00,pcie=on,x-vga=on
----

.PCI ID overrides

You can override the PCI vendor ID, device ID, and subsystem IDs that will be
seen by the guest. This is useful if your device is a variant with an ID that
your guest's drivers don't recognize, but you want to force those drivers to be
loaded anyway (e.g. if you know your device shares the same chipset as a
supported variant).

The available options are `vendor-id`, `device-id`, `sub-vendor-id`, and
`sub-device-id`. You can set any or all of these to override your device's
default IDs.

For example:

----
# qm set VMID -hostpci0 02:00,device-id=0x10f6,sub-vendor-id=0x0000
----

SR-IOV
~~~~~~

Another variant for passing through PCI(e) devices is to use the hardware
virtualization features of your devices, if available.

.Enabling SR-IOV
[NOTE]
====
To use SR-IOV, platform support is especially important. It may be necessary
to enable this feature in the BIOS/UEFI first, or to use a specific PCI(e) port
for it to work. If in doubt, consult the manual of the platform or contact its
vendor.
====

'SR-IOV' (**S**ingle-**R**oot **I**nput/**O**utput **V**irtualization) enables
a single device to provide multiple 'VF' (**V**irtual **F**unctions) to the
system. Each of those 'VF' can be used in a different VM, with full hardware
features and also better performance and lower latency than software
virtualized devices.

Currently, the most common use cases for this are NICs (**N**etwork
**I**nterface **C**ards) with SR-IOV support, which can provide multiple VFs
per physical port. This allows features such as checksum offloading to be used
inside a VM, reducing the (host) CPU overhead.

Host Configuration
^^^^^^^^^^^^^^^^^^

Generally, there are two methods for enabling virtual functions on a device.

* Sometimes there is an option for the driver module, e.g. for some
Intel drivers:
+
----
max_vfs=4
----
+
which could be put in a file with a '.conf' ending under */etc/modprobe.d/*.
(Do not forget to update your initramfs after that.)
+
Please refer to your driver module documentation for the exact
parameters and options.

* The second, more generic, approach is using `sysfs`.
If the device and driver support this, you can change the number of VFs on
the fly. For example, to set up 4 VFs on device 0000:01:00.0 execute:
+
----
# echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
----
+
To make this change persistent, you can use the `sysfsutils` Debian package.
After installation, configure it via */etc/sysfs.conf* or a `FILE.conf' in
*/etc/sysfs.d/* (see the example below).
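
A minimal sketch of such a persistent setting, assuming the device from the
example above and the usual `sysfsutils` configuration format (attribute paths
relative to `/sys`, the file name is arbitrary):

----
# /etc/sysfs.d/sriov.conf
bus/pci/devices/0000:01:00.0/sriov_numvfs = 4
----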

VM Configuration
^^^^^^^^^^^^^^^^

After creating VFs, you should see them as separate PCI(e) devices when
listing them with `lspci`. Get their ID and pass them through like a
xref:qm_pci_passthrough_vm_config[normal PCI(e) device].

Mediated Devices (vGPU, GVT-g)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mediated devices are another method to reuse features and performance from
physical hardware for virtualized hardware. These are most commonly found in
virtualized GPU setups such as Intel's GVT-g and NVIDIA's vGPUs used in their
GRID technology.

With this, a physical card is able to create virtual cards, similar to SR-IOV.
The difference is that mediated devices do not appear as PCI(e) devices in the
host, and are as such only suited for use in virtual machines.

Host Configuration
^^^^^^^^^^^^^^^^^^

In general, your card's driver must support this feature, otherwise it will
not work. So please refer to your vendor for compatible drivers and how to
configure them.

Intel's drivers for GVT-g are integrated in the kernel and should work
with 5th, 6th and 7th generation Intel Core Processors, as well as E3 v4, E3
v5 and E3 v6 Xeon Processors.

To enable it for Intel Graphics, you have to make sure to load the module
'kvmgt' (for example via `/etc/modules`) and to enable it on the
xref:sysboot_edit_kernel_cmdline[kernel command line] by adding the following parameter:

----
i915.enable_gvt=1
----

After that, remember to
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`],
and reboot your host.

VM Configuration
^^^^^^^^^^^^^^^^

To use a mediated device, simply specify the `mdev` property on a `hostpciX`
VM configuration option.

You can get the supported devices via 'sysfs'. For example, to list the
supported types for the device '0000:00:02.0' you would simply execute:

----
# ls /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types
----

Each entry is a directory which contains the following important files:

* 'available_instances' contains the number of still available instances of
this type; each 'mdev' in use in a VM reduces this.
* 'description' contains a short description of the capabilities of the type.
* 'create' is the endpoint to create such a device; {pve} does this
automatically for you, if a 'hostpciX' option with `mdev` is configured.
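
These files can be inspected directly. For example, using the `i915-GVTg_V5_4`
type from the configuration example below:

----
# cat /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types/i915-GVTg_V5_4/description
# cat /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types/i915-GVTg_V5_4/available_instances
----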

Example configuration with an `Intel GVT-g vGPU` (`Intel Skylake 6700k`):

----
# qm set VMID -hostpci0 00:02.0,mdev=i915-GVTg_V5_4
----

With this set, {pve} automatically creates such a device on VM start, and
cleans it up again when the VM stops.

Use in Clusters
~~~~~~~~~~~~~~~

It is also possible to map devices on a cluster level, so that they can be
properly used with HA, hardware changes are detected, and non-root users
can configure them. See xref:resource_mapping[Resource Mapping]
for details on that.

[[qm_pci_viommu]]
vIOMMU (emulated IOMMU)
~~~~~~~~~~~~~~~~~~~~~~~

vIOMMU is the emulation of a hardware IOMMU within a virtual machine, providing
improved memory access control and security for virtualized I/O devices. Using
the vIOMMU option also allows you to pass through PCI devices to level-2 VMs in
level-1 VMs via https://pve.proxmox.com/wiki/Nested_Virtualization[Nested Virtualization].
There are currently two vIOMMU implementations available: Intel and VirtIO.

Host requirement:

* Add `intel_iommu=on` or `amd_iommu=on` depending on your CPU to your kernel
command line.

Intel vIOMMU
^^^^^^^^^^^^

Intel vIOMMU specific VM requirements:

* Whether you are using an Intel or AMD CPU on your host, it is important to set
`intel_iommu=on` in the VM's kernel parameters.

* To use Intel vIOMMU you need to set *q35* as the machine type.

If all requirements are met, you can add `viommu=intel` to the machine parameter
in the configuration of the VM that should be able to pass through PCI devices.

----
# qm set VMID -machine q35,viommu=intel
----

https://wiki.qemu.org/Features/VT-d[QEMU documentation for VT-d]

VirtIO vIOMMU
^^^^^^^^^^^^^

This vIOMMU implementation is more recent and does not have as many limitations
as Intel vIOMMU, but is currently less used in production and less documented.

With VirtIO vIOMMU there is *no* need to set any kernel parameters. It is also
*not* necessary to use q35 as the machine type, but it is advisable if you want
to use PCIe.

----
# qm set VMID -machine q35,viommu=virtio
----

https://web.archive.org/web/20230804075844/https://michael2012z.medium.com/virtio-iommu-789369049443[Blog-Post by Michael Zhao explaining virtio-iommu]

ifdef::wiki[]

See Also
~~~~~~~~

* link:/wiki/Pci_passthrough[PCI Passthrough Examples]

endif::wiki[]