it is usually a safe bet to set the number of *Total cores* to 2.
NOTE: It is perfectly safe if the _overall_ number of cores of all your VMs
-is greater than the number of cores on the server (e.g., 4 VMs with each 4
-cores on a machine with only 8 cores). In that case the host system will
-balance the Qemu execution threads between your server cores, just like if you
-were running a standard multi-threaded application. However, {pve} will prevent
-you from starting VMs with more virtual CPU cores than physically available, as
-this will only bring the performance down due to the cost of context switches.
+is greater than the number of cores on the server (for example, 4 VMs each with
+4 cores (= total 16) on a machine with only 8 cores). In that case the host
+system will balance the QEMU execution threads between your server cores, just
+as if you were running a standard multi-threaded application. However, {pve}
+will prevent you from starting VMs with more virtual CPU cores than physically
+available, as this will only bring the performance down due to the cost of
+context switches.
[[qm_cpu_resource_limits]]
Resource Limits
real host cores CPU time. But, if only 4 would do work they could still get
almost 100% of a real core each.
-NOTE: VMs can, depending on their configuration, use additional threads e.g.,
-for networking or IO operations but also live migration. Thus a VM can show up
-to use more CPU time than just its virtual CPUs could use. To ensure that a VM
-never uses more CPU time than virtual CPUs assigned set the *cpulimit* setting
-to the same value as the total core count.
+NOTE: VMs can, depending on their configuration, use additional threads, such
+as for networking or IO operations, but also live migration. Thus, a VM can
+show up to use more CPU time than just its virtual CPUs could use. To ensure
+that a VM never uses more CPU time than its virtual CPUs are assigned, set the
+*cpulimit* setting to the same value as the total core count.
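+
+For example, to cap a VM with 4 virtual CPUs (here using the example VMID
+`101`) so that all of its threads together never use more than 4 cores' worth
+of CPU time:
+
+----
+# qm set 101 --cpulimit 4
+----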
The second CPU resource limiting setting, *cpuunits* (nowadays often called CPU
-shares or CPU weight), controls how much CPU time a VM gets in regards to other
-VMs running. It is a relative weight which defaults to `1024`, if you increase
-this for a VM it will be prioritized by the scheduler in comparison to other
-VMs with lower weight. E.g., if VM 100 has set the default 1024 and VM 200 was
-changed to `2048`, the latter VM 200 would receive twice the CPU bandwidth than
-the first VM 100.
+shares or CPU weight), controls how much CPU time a VM gets compared to other
+running VMs. It is a relative weight which defaults to `100` (or `1024` if the
+host uses legacy cgroup v1). If you increase this for a VM it will be
+prioritized by the scheduler in comparison to other VMs with lower weight. For
+example, if VM 100 is set to the default `100` and VM 200 was changed to `200`,
+the latter VM 200 would receive twice the CPU bandwidth of the first VM 100.
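+
+For example, to give VM 200 twice the default weight:
+
+----
+# qm set 200 --cpuunits 200
+----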
For more information see `man systemd.resource-control`, here `CPUQuota`
-corresponds to `cpulimit` and `CPUShares` corresponds to our `cpuunits`
+corresponds to `cpulimit` and `CPUWeight` corresponds to our `cpuunits`
setting, visit its Notes section for references and implementation details.
+The third CPU resource limiting setting, *affinity*, controls what host cores
+the virtual machine will be permitted to execute on. For example, if an
+affinity value of `0-3,8-11` is provided, the virtual machine will be
+restricted to using the host cores `0`, `1`, `2`, `3`, `8`, `9`, `10`, and
+`11`. Valid *affinity* values are written in cpuset `List Format`, which is a
+comma-separated list of CPU numbers and ranges of numbers, in ASCII decimal.
+
+NOTE: CPU *affinity* uses the `taskset` command to restrict virtual machines to
+a given set of cores. This restriction will not take effect for some types of
+processes that may be created for IO. *CPU affinity is not a security feature.*
+
+For more information regarding *affinity* see `man cpuset`. Here the
+`List Format` corresponds to valid *affinity* values. Visit its `Formats`
+section for more examples.
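+
+For example, to pin a VM (here using the example VMID `101`) to the host cores
+`0-3` and `8-11`:
+
+----
+# qm set 101 --affinity 0-3,8-11
+----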
+
CPU Type
^^^^^^^^
Save this under /etc/udev/rules.d/ as a file ending in `.rules`.
-Note: CPU hot-remove is machine dependent and requires guest cooperation.
-The deletion command does not guarantee CPU removal to actually happen,
-typically it's a request forwarded to guest using target dependent mechanism,
-e.g., ACPI on x86/amd64.
+Note: CPU hot-remove is machine dependent and requires guest cooperation. The
+deletion command does not guarantee CPU removal to actually happen, typically
+it's a request forwarded to the guest OS using a target dependent mechanism,
+such as ACPI on x86/amd64.
[[qm_memory]]
VM, because it delivers useful information such as how much memory the guest
really uses.
In general, you should leave *ballooning* enabled, but if you want to disable
-it (e.g. for debugging purposes), simply uncheck
-*Ballooning Device* or set
+it (for example, for debugging purposes), simply uncheck *Ballooning Device*
+or set
balloon: 0
You can also skip adding a network device when creating a VM by selecting *No
network device*.
+You can overwrite the *MTU* setting for each VM network device. The option
+`mtu=1` represents a special case, in which the MTU value will be inherited
+from the underlying bridge.
+This option is only available for *VirtIO* network devices.
+
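+For example, to make a VirtIO network device (here the example device `net0`
+of VMID `101`, attached to the default bridge `vmbr0`) inherit the MTU from
+its bridge:
+
+----
+# qm set 101 --net0 virtio,bridge=vmbr0,mtu=1
+----
+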
.Multiqueue
If you are using the VirtIO driver, you can optionally activate the
*Multiqueue* option. This option allows the guest OS to process networking
* *cirrus*, this was once the default, it emulates a very old hardware module
with all its problems. This display type should only be used if really
necessary footnote:[https://www.kraxel.org/blog/2014/10/qemu-using-cirrus-considered-harmful/
-qemu: using cirrus considered harmful], e.g., if using Windows XP or earlier
+qemu: using cirrus considered harmful], for example, if using Windows XP or
+earlier
* *vmware*, is a VMWare SVGA-II compatible adapter.
* *qxl*, is the QXL paravirtualized graphics card. Selecting this also
enables https://www.spice-space.org/[SPICE] (a remote viewer protocol) for the
especially with SPICE/QXL.
As the memory is reserved by display device, selecting Multi-Monitor mode
-for SPICE (e.g., `qxl2` for dual monitors) has some implications:
+for SPICE (such as `qxl2` for dual monitors) has some implications:
* Windows needs a device for each monitor, so if your 'ostype' is some
version of Windows, {pve} gives the VM an extra device per monitor.
compatible implementation instead. In such cases, you must rather use *OVMF*,
which is an open-source UEFI implementation. footnote:[See the OVMF Project https://github.com/tianocore/tianocore.github.io/wiki/OVMF]
-There are other scenarios in which a BIOS is not a good firmware to boot from,
-e.g. if you want to do VGA passthrough. footnote:[Alex Williamson has a very
-good blog entry about this https://vfio.blogspot.co.at/2014/08/primary-graphics-assignment-without-vga.html]
+There are other scenarios in which SeaBIOS may not be the ideal firmware to
+boot from, for example if you want to do VGA passthrough. footnote:[Alex
+Williamson has a good blog entry about this
+https://vfio.blogspot.co.at/2014/08/primary-graphics-assignment-without-vga.html]
If you want to use OVMF, there are several things to consider:
encryption keys - securely and provides tamper-resistance functions for
validating system boot.
-Certain operating systems (e.g. Windows 11) require such a device to be attached
-to a machine (be it physical or virtual).
+Certain operating systems (such as Windows 11) require such a device to be
+attached to a machine (be it physical or virtual).
A TPM is added by specifying a *tpmstate* volume. This works similar to an
efidisk, in that it cannot be changed (only removed) once created. You can add
~~~~~~~~~~~~~~~~~
QEMU can tell the guest which devices it should boot from, and in which order.
-This can be specified in the config via the `boot` property, e.g.:
+This can be specified in the config via the `boot` property, for example:
----
boot: order=scsi0;net0;hostpci0
to other guest systems. For this you can use the following
parameters:
-* *Start/Shutdown order*: Defines the start order priority. E.g. set it to 1 if
+* *Start/Shutdown order*: Defines the start order priority. For example, set it
+to 1 if
you want the VM to be the first to be started. (We use the reverse startup
order for shutdown, so a machine with a start order of 1 would be the last to
be shut down). If multiple VMs have the same order defined on a host, they will
additionally be ordered by 'VMID' in ascending order.
* *Startup delay*: Defines the interval between this VM start and subsequent
-VMs starts . E.g. set it to 240 if you want to wait 240 seconds before starting
-other VMs.
+VMs starts. For example, set it to 240 if you want to wait 240 seconds before
+starting other VMs.
* *Shutdown timeout*: Defines the duration in seconds {pve} should wait
-for the VM to be offline after issuing a shutdown command.
-By default this value is set to 180, which means that {pve} will issue a
-shutdown request and wait 180 seconds for the machine to be offline. If
-the machine is still online after the timeout it will be stopped forcefully.
+for the VM to be offline after issuing a shutdown command. By default this
+value is set to 180, which means that {pve} will issue a shutdown request and
+wait 180 seconds for the machine to be offline. If the machine is still online
+after the timeout it will be stopped forcefully.
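+
+For example, to have a VM (here using the example VMID `101`) start first,
+wait 240 seconds before subsequent VMs are started, and allow 180 seconds for
+its shutdown:
+
+----
+# qm set 101 --startup order=1,up=240,down=180
+----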
NOTE: VMs managed by the HA stack do not follow the 'start on boot' and
'boot order' options currently. Those VMs will be skipped by the startup and
Online Migration
~~~~~~~~~~~~~~~~
-When your VM is running and it has no local resources defined (such as disks
-on local storage, passed through devices, etc.) you can initiate a live
-migration with the -online flag.
+If your VM is running and no locally bound resources are configured (such as
+passed-through devices), you can initiate a live migration with the `--online`
+flag of the `qm migrate` command. The web interface defaults to live migration
+when the VM is running.
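+
+For example, to live migrate a running VM (here using the example VMID `101`)
+to the cluster node `node2`:
+
+----
+# qm migrate 101 node2 --online
+----
+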
How it works
^^^^^^^^^^^^
-This starts a Qemu Process on the target host with the 'incoming' flag, which
-means that the process starts and waits for the memory data and device states
-from the source Virtual Machine (since all other resources, e.g. disks,
-are shared, the memory content and device state are the only things left
-to transmit).
-
-Once this connection is established, the source begins to send the memory
-content asynchronously to the target. If the memory on the source changes,
-those sections are marked dirty and there will be another pass of sending data.
-This happens until the amount of data to send is so small that it can
-pause the VM on the source, send the remaining data to the target and start
-the VM on the target in under a second.
+Online migration first starts a new QEMU process on the target host with the
+'incoming' flag, which performs only basic initialization with the guest vCPUs
+still paused and then waits for the guest memory and device state data streams
+of the source Virtual Machine.
+All other resources, such as disks, are either shared or were already sent
+before the runtime state migration begins; so only the memory content and
+device state remain to be transferred.
+
+Once this connection is established, the source begins asynchronously sending
+the memory content to the target. If the guest memory on the source changes,
+those sections are marked dirty and another pass is made to send the guest
+memory data.
+This loop is repeated until the data difference between the running source VM
+and the incoming target VM is small enough to be sent in a few milliseconds.
+At that point the source VM can be paused completely, without a user or
+program noticing the pause, the remaining data sent to the target, and the
+target VM's CPUs unpaused to make it the new running VM, all in well under a
+second.
Requirements
^^^^^^^^^^^^
For Live Migration to work, there are some things required:
-* The VM has no local resources (e.g. passed through devices, local disks, etc.)
-* The hosts are in the same {pve} cluster.
-* The hosts have a working (and reliable) network connection.
-* The target host must have the same or higher versions of the
- {pve} packages. (It *might* work the other way, but this is never guaranteed)
-* The hosts have CPUs from the same vendor. (It *might* work otherwise, but this
- is never guaranteed)
+* The VM has no local resources that cannot be migrated. For example,
+  PCI or USB devices that are passed through currently block live migration.
+  Local disks, on the other hand, can be migrated by sending them over to the
+  target host.
+* The hosts are located in the same {pve} cluster.
+* The hosts have a working (and reliable) network connection between them.
+* The target host must have the same or higher versions of the
+  {pve} packages. Although it can sometimes work the other way around, this
+  cannot be guaranteed.
+* The hosts have CPUs from the same vendor with similar capabilities. A
+  different vendor *might* work depending on the actual models and the
+  configured VM CPU type, but it cannot be guaranteed - so please test before
+  deploying such a setup in production.
Offline Migration
~~~~~~~~~~~~~~~~~
-If you have local resources, you can still offline migrate your VMs,
-as long as all disk are on storages, which are defined on both hosts.
-Then the migration will copy the disk over the network to the target host.
+If you have local resources, you can still migrate your VMs offline as long as
+all disks are on storage which is defined on both hosts.
+Migration then copies the disks to the target host over the network, as with
+online migration. Note that any hardware passthrough configuration may need to
+be adapted to the device location on the target host.
+
+// TODO: mention hardware map IDs as better way to solve that, once available
[[qm_copy_and_clone]]
Copies and Clones
To create and add a 'vmgenid' to an already existing VM one can pass the
special value `1' to let {pve} autogenerate one or manually set the 'UUID'
-footnote:[Online GUID generator http://guid.one/] by using it as value,
-e.g.:
+footnote:[Online GUID generator http://guid.one/] by using it as value, for
+example:
----
# qm set VMID -vmgenid 1
The most prominent use case for 'vmgenid' are newer Microsoft Windows
operating systems, which use it to avoid problems in time sensitive or
-replicate services (e.g., databases, domain controller
+replicated services (such as databases or domain controllers
footnote:[https://docs.microsoft.com/en-us/windows-server/identity/ad-ds/get-started/virtual-dc/virtualized-domain-controller-architecture])
on snapshot rollback, backup restore or a whole VM clone operation.
Locks
-----
-Online migrations, snapshots and backups (`vzdump`) set a lock to
-prevent incompatible concurrent actions on the affected VMs. Sometimes
-you need to remove such a lock manually (e.g., after a power failure).
+Online migrations, snapshots and backups (`vzdump`) set a lock to prevent
+incompatible concurrent actions on the affected VMs. Sometimes you need to
+remove such a lock manually (for example after a power failure).
----
# qm unlock <vmid>