X-Git-Url: https://git.proxmox.com/?p=pve-docs.git;a=blobdiff_plain;f=qm.adoc;h=bd10ca1e3c349671006cf5a1e20949da0d2bea7e;hp=14de2d14ce63bcfc38a35158fd3da5e89f0f2ff7;hb=effa481895e75d5f55be56c5da225d36e74dd671;hpb=a7f36905f70da1a2a26753a2e83f900657dd5776 diff --git a/qm.adoc b/qm.adoc index 14de2d1..bd10ca1 100644 --- a/qm.adoc +++ b/qm.adoc @@ -1,7 +1,8 @@ +[[chapter_virtual_machines]] ifdef::manvolnum[] -PVE({manvolnum}) -================ -include::attributes.txt[] +qm(1) +===== +:pve-toplevel: NAME ---- @@ -9,7 +10,7 @@ NAME qm - Qemu/KVM Virtual Machine Manager -SYNOPSYS +SYNOPSIS -------- include::qm.1-synopsis.adoc[] @@ -17,30 +18,1025 @@ include::qm.1-synopsis.adoc[] DESCRIPTION ----------- endif::manvolnum[] - ifndef::manvolnum[] Qemu/KVM Virtual Machines ========================= -include::attributes.txt[] +:pve-toplevel: endif::manvolnum[] +// deprecates +// http://pve.proxmox.com/wiki/Container_and_Full_Virtualization +// http://pve.proxmox.com/wiki/KVM +// http://pve.proxmox.com/wiki/Qemu_Server + +Qemu (short form for Quick Emulator) is an open source hypervisor that emulates a +physical computer. From the perspective of the host system where Qemu is +running, Qemu is a user program which has access to a number of local resources +like partitions, files, network cards which are then passed to an +emulated computer which sees them as if they were real devices. + +A guest operating system running in the emulated computer accesses these +devices, and runs as it were running on real hardware. For instance you can pass +an iso image as a parameter to Qemu, and the OS running in the emulated computer +will see a real CDROM inserted in a CD drive. + +Qemu can emulate a great variety of hardware from ARM to Sparc, but {pve} is +only concerned with 32 and 64 bits PC clone emulation, since it represents the +overwhelming majority of server hardware. The emulation of PC clones is also one +of the fastest due to the availability of processor extensions which greatly +speed up Qemu when the emulated architecture is the same as the host +architecture. + +NOTE: You may sometimes encounter the term _KVM_ (Kernel-based Virtual Machine). +It means that Qemu is running with the support of the virtualization processor +extensions, via the Linux kvm module. In the context of {pve} _Qemu_ and +_KVM_ can be used interchangeably as Qemu in {pve} will always try to load the kvm +module. + +Qemu inside {pve} runs as a root process, since this is required to access block +and PCI devices. + + +Emulated devices and paravirtualized devices +-------------------------------------------- + +The PC hardware emulated by Qemu includes a mainboard, network controllers, +scsi, ide and sata controllers, serial ports (the complete list can be seen in +the `kvm(1)` man page) all of them emulated in software. All these devices +are the exact software equivalent of existing hardware devices, and if the OS +running in the guest has the proper drivers it will use the devices as if it +were running on real hardware. This allows Qemu to runs _unmodified_ operating +systems. + +This however has a performance cost, as running in software what was meant to +run in hardware involves a lot of extra work for the host CPU. To mitigate this, +Qemu can present to the guest operating system _paravirtualized devices_, where +the guest OS recognizes it is running inside Qemu and cooperates with the +hypervisor. 
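+
+For illustration, this difference is visible directly in a VM configuration
+file. The following sketch contrasts an emulated IDE disk and Intel NIC with
+their paravirtualized counterparts (the VM ID, storage name and MAC addresses
+are placeholder values only):
+
+----
+# fully emulated devices
+ide0: local:vm-100-disk-1,size=32G
+net0: e1000=EE:D2:28:5F:B6:3E,bridge=vmbr0
+# paravirtualized (virtio) equivalents
+virtio0: local:vm-100-disk-2,size=32G
+net1: virtio=EE:D2:28:5F:B6:3F,bridge=vmbr0
+----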
+ +Qemu relies on the virtio virtualization standard, and is thus able to present +paravirtualized virtio devices, which includes a paravirtualized generic disk +controller, a paravirtualized network card, a paravirtualized serial port, +a paravirtualized SCSI controller, etc ... + +It is highly recommended to use the virtio devices whenever you can, as they +provide a big performance improvement. Using the virtio generic disk controller +versus an emulated IDE controller will double the sequential write throughput, +as measured with `bonnie++(8)`. Using the virtio network interface can deliver +up to three times the throughput of an emulated Intel E1000 network card, as +measured with `iperf(1)`. footnote:[See this benchmark on the KVM wiki +http://www.linux-kvm.org/page/Using_VirtIO_NIC] + + +[[qm_virtual_machines_settings]] +Virtual Machines Settings +------------------------- + +Generally speaking {pve} tries to choose sane defaults for virtual machines +(VM). Make sure you understand the meaning of the settings you change, as it +could incur a performance slowdown, or putting your data at risk. + + +[[qm_general_settings]] +General Settings +~~~~~~~~~~~~~~~~ + +[thumbnail="screenshot/gui-create-vm-general.png"] + +General settings of a VM include + +* the *Node* : the physical server on which the VM will run +* the *VM ID*: a unique number in this {pve} installation used to identify your VM +* *Name*: a free form text string you can use to describe the VM +* *Resource Pool*: a logical group of VMs + + +[[qm_os_settings]] +OS Settings +~~~~~~~~~~~ + +[thumbnail="screenshot/gui-create-vm-os.png"] + +When creating a VM, setting the proper Operating System(OS) allows {pve} to +optimize some low level parameters. For instance Windows OS expect the BIOS +clock to use the local time, while Unix based OS expect the BIOS clock to have +the UTC time. + + +[[qm_hard_disk]] +Hard Disk +~~~~~~~~~ + +Qemu can emulate a number of storage controllers: + +* the *IDE* controller, has a design which goes back to the 1984 PC/AT disk +controller. Even if this controller has been superseded by recent designs, +each and every OS you can think of has support for it, making it a great choice +if you want to run an OS released before 2003. You can connect up to 4 devices +on this controller. + +* the *SATA* (Serial ATA) controller, dating from 2003, has a more modern +design, allowing higher throughput and a greater number of devices to be +connected. You can connect up to 6 devices on this controller. + +* the *SCSI* controller, designed in 1985, is commonly found on server grade +hardware, and can connect up to 14 storage devices. {pve} emulates by default a +LSI 53C895A controller. ++ +A SCSI controller of type _VirtIO SCSI_ is the recommended setting if you aim for +performance and is automatically selected for newly created Linux VMs since +{pve} 4.3. Linux distributions have support for this controller since 2012, and +FreeBSD since 2014. For Windows OSes, you need to provide an extra iso +containing the drivers during the installation. +// https://pve.proxmox.com/wiki/Paravirtualized_Block_Drivers_for_Windows#During_windows_installation. +If you aim at maximum performance, you can select a SCSI controller of type +_VirtIO SCSI single_ which will allow you to select the *IO Thread* option. +When selecting _VirtIO SCSI single_ Qemu will create a new controller for +each disk, instead of adding all disks to the same controller. 
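++
+As a rough command line sketch (the VM ID, storage name and disk size are
+placeholder values), this controller type together with a disk using the
+*IO Thread* option could be configured with:
++
+ qm set 100 -scsihw virtio-scsi-single -scsi0 local-lvm:32,iothread=1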
+ +* The *VirtIO Block* controller, often just called VirtIO or virtio-blk, +is an older type of paravirtualized controller. It has been superseded by the +VirtIO SCSI Controller, in terms of features. + +[thumbnail="screenshot/gui-create-vm-hard-disk.png"] +On each controller you attach a number of emulated hard disks, which are backed +by a file or a block device residing in the configured storage. The choice of +a storage type will determine the format of the hard disk image. Storages which +present block devices (LVM, ZFS, Ceph) will require the *raw disk image format*, +whereas files based storages (Ext4, NFS, CIFS, GlusterFS) will let you to choose +either the *raw disk image format* or the *QEMU image format*. + + * the *QEMU image format* is a copy on write format which allows snapshots, and + thin provisioning of the disk image. + * the *raw disk image* is a bit-to-bit image of a hard disk, similar to what + you would get when executing the `dd` command on a block device in Linux. This + format does not support thin provisioning or snapshots by itself, requiring + cooperation from the storage layer for these tasks. It may, however, be up to + 10% faster than the *QEMU image format*. footnote:[See this benchmark for details + http://events.linuxfoundation.org/sites/events/files/slides/CloudOpen2013_Khoa_Huynh_v3.pdf] + * the *VMware image format* only makes sense if you intend to import/export the + disk image to other hypervisors. + +Setting the *Cache* mode of the hard drive will impact how the host system will +notify the guest systems of block write completions. The *No cache* default +means that the guest system will be notified that a write is complete when each +block reaches the physical storage write queue, ignoring the host page cache. +This provides a good balance between safety and speed. + +If you want the {pve} backup manager to skip a disk when doing a backup of a VM, +you can set the *No backup* option on that disk. + +If you want the {pve} storage replication mechanism to skip a disk when starting + a replication job, you can set the *Skip replication* option on that disk. +As of {pve} 5.0, replication requires the disk images to be on a storage of type +`zfspool`, so adding a disk image to other storages when the VM has replication +configured requires to skip replication for this disk image. + +If your storage supports _thin provisioning_ (see the storage chapter in the +{pve} guide), and your VM has a *SCSI* controller you can activate the *Discard* +option on the hard disks connected to that controller. With *Discard* enabled, +when the filesystem of a VM marks blocks as unused after removing files, the +emulated SCSI controller will relay this information to the storage, which will +then shrink the disk image accordingly. + +.IO Thread +The option *IO Thread* can only be used when using a disk with the +*VirtIO* controller, or with the *SCSI* controller, when the emulated controller + type is *VirtIO SCSI single*. +With this enabled, Qemu creates one I/O thread per storage controller, +instead of a single thread for all I/O, so it increases performance when +multiple disks are used and each disk has its own storage controller. +Note that backups do not currently work with *IO Thread* enabled. + + +[[qm_cpu]] +CPU +~~~ + +[thumbnail="screenshot/gui-create-vm-cpu.png"] + +A *CPU socket* is a physical slot on a PC motherboard where you can plug a CPU. +This CPU can then contain one or many *cores*, which are independent +processing units. 
Whether you have a single CPU socket with 4 cores, or two CPU +sockets with two cores is mostly irrelevant from a performance point of view. +However some software licenses depend on the number of sockets a machine has, +in that case it makes sense to set the number of sockets to what the license +allows you. + +Increasing the number of virtual cpus (cores and sockets) will usually provide a +performance improvement though that is heavily dependent on the use of the VM. +Multithreaded applications will of course benefit from a large number of +virtual cpus, as for each virtual cpu you add, Qemu will create a new thread of +execution on the host system. If you're not sure about the workload of your VM, +it is usually a safe bet to set the number of *Total cores* to 2. + +NOTE: It is perfectly safe if the _overall_ number of cores of all your VMs +is greater than the number of cores on the server (e.g., 4 VMs with each 4 +cores on a machine with only 8 cores). In that case the host system will +balance the Qemu execution threads between your server cores, just like if you +were running a standard multithreaded application. However, {pve} will prevent +you from assigning more virtual CPU cores than physically available, as this will +only bring the performance down due to the cost of context switches. + +[[qm_cpu_resource_limits]] +Resource Limits +^^^^^^^^^^^^^^^ + +In addition to the number of virtual cores, you can configure how much resources +a VM can get in relation to the host CPU time and also in relation to other +VMs. +With the *cpulimit* (``Host CPU Time'') option you can limit how much CPU time +the whole VM can use on the host. It is a floating point value representing CPU +time in percent, so `1.0` is equal to `100%`, `2.5` to `250%` and so on. If a +single process would fully use one single core it would have `100%` CPU Time +usage. If a VM with four cores utilizes all its cores fully it would +theoretically use `400%`. In reality the usage may be even a bit higher as Qemu +can have additional threads for VM peripherals besides the vCPU core ones. +This setting can be useful if a VM should have multiple vCPUs, as it runs a few +processes in parallel, but the VM as a whole should not be able to run all +vCPUs at 100% at the same time. Using a specific example: lets say we have a VM +which would profit from having 8 vCPUs, but at no time all of those 8 cores +should run at full load - as this would make the server so overloaded that +other VMs and CTs would get to less CPU. So, we set the *cpulimit* limit to +`4.0` (=400%). If all cores do the same heavy work they would all get 50% of a +real host cores CPU time. But, if only 4 would do work they could still get +almost 100% of a real core each. + +NOTE: VMs can, depending on their configuration, use additional threads e.g., +for networking or IO operations but also live migration. Thus a VM can show up +to use more CPU time than just its virtual CPUs could use. To ensure that a VM +never uses more CPU time than virtual CPUs assigned set the *cpulimit* setting +to the same value as the total core count. + +The second CPU resource limiting setting, *cpuunits* (nowadays often called CPU +shares or CPU weight), controls how much CPU time a VM gets in regards to other +VMs running. It is a relative weight which defaults to `1024`, if you increase +this for a VM it will be prioritized by the scheduler in comparison to other +VMs with lower weight. 
E.g., if VM 100 has set the default 1024 and VM 200 was +changed to `2048`, the latter VM 200 would receive twice the CPU bandwidth than +the first VM 100. + +For more information see `man systemd.resource-control`, here `CPUQuota` +corresponds to `cpulimit` and `CPUShares` corresponds to our `cpuunits` +setting, visit its Notes section for references and implementation details. + +CPU Type +^^^^^^^^ + +Qemu can emulate a number different of *CPU types* from 486 to the latest Xeon +processors. Each new processor generation adds new features, like hardware +assisted 3d rendering, random number generation, memory protection, etc ... +Usually you should select for your VM a processor type which closely matches the +CPU of the host system, as it means that the host CPU features (also called _CPU +flags_ ) will be available in your VMs. If you want an exact match, you can set +the CPU type to *host* in which case the VM will have exactly the same CPU flags +as your host system. + +This has a downside though. If you want to do a live migration of VMs between +different hosts, your VM might end up on a new system with a different CPU type. +If the CPU flags passed to the guest are missing, the qemu process will stop. To +remedy this Qemu has also its own CPU type *kvm64*, that {pve} uses by defaults. +kvm64 is a Pentium 4 look a like CPU type, which has a reduced CPU flags set, +but is guaranteed to work everywhere. + +In short, if you care about live migration and moving VMs between nodes, leave +the kvm64 default. If you don’t care about live migration or have a homogeneous +cluster where all nodes have the same CPU, set the CPU type to host, as in +theory this will give your guests maximum performance. + +Meltdown / Spectre related CPU flags +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +There are two CPU flags related to the Meltdown and Spectre vulnerabilities +footnote:[Meltdown Attack https://meltdownattack.com/] which need to be set +manually unless the selected CPU type of your VM already enables them by default. + +The first, called 'pcid', helps to reduce the performance impact of the Meltdown +mitigation called 'Kernel Page-Table Isolation (KPTI)', which effectively hides +the Kernel memory from the user space. Without PCID, KPTI is quite an expensive +mechanism footnote:[PCID is now a critical performance/security feature on x86 +https://groups.google.com/forum/m/#!topic/mechanical-sympathy/L9mHTbeQLNU]. + +The second CPU flag is called 'spec-ctrl', which allows an operating system to +selectively disable or restrict speculative execution in order to limit the +ability of attackers to exploit the Spectre vulnerability. + +There are two requirements that need to be fulfilled in order to use these two +CPU flags: + +* The host CPU(s) must support the feature and propagate it to the guest's virtual CPU(s) +* The guest operating system must be updated to a version which mitigates the + attacks and is able to utilize the CPU feature + +In order to use 'spec-ctrl', your CPU or system vendor also needs to provide a +so-called ``microcode update'' footnote:[You can use `intel-microcode' / +`amd-microcode' from Debian non-free if your vendor does not provide such an +update. Note that not all affected CPUs can be updated to support spec-ctrl.] +for your CPU. + +To check if the {pve} host supports PCID, execute the following command as root: + +---- +# grep ' pcid ' /proc/cpuinfo +---- + +If this does not return empty your host's CPU has support for 'pcid'. 
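+
+If the flag is available and your guest OS has the matching mitigation, a
+sketch of enabling it for a VM on the command line (assuming VM ID 100 and the
+default kvm64 CPU type) would be:
+
+----
+# qm set 100 -cpu kvm64,flags=+pcid
+----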
+ +To check if the {pve} host supports spec-ctrl, execute the following command as root: + +---- +# grep ' spec_ctrl ' /proc/cpuinfo +---- + +If this does not return empty your host's CPU has support for 'spec-ctrl'. + +If you use `host' or another CPU type which enables the desired flags by +default, and you updated your guest OS to make use of the associated CPU +features, you're already set. + +Otherwise you need to set the desired CPU flag of the virtual CPU, either by +editing the CPU options in the WebUI, or by setting the 'flags' property of the +'cpu' option in the VM configuration file. + +NUMA +^^^^ +You can also optionally emulate a *NUMA* +footnote:[https://en.wikipedia.org/wiki/Non-uniform_memory_access] architecture +in your VMs. The basics of the NUMA architecture mean that instead of having a +global memory pool available to all your cores, the memory is spread into local +banks close to each socket. +This can bring speed improvements as the memory bus is not a bottleneck +anymore. If your system has a NUMA architecture footnote:[if the command +`numactl --hardware | grep available` returns more than one node, then your host +system has a NUMA architecture] we recommend to activate the option, as this +will allow proper distribution of the VM resources on the host system. +This option is also required to hot-plug cores or RAM in a VM. + +If the NUMA option is used, it is recommended to set the number of sockets to +the number of sockets of the host system. + +vCPU hot-plug +^^^^^^^^^^^^^ + +Modern operating systems introduced the capability to hot-plug and, to a +certain extent, hot-unplug CPUs in a running systems. Virtualisation allows us +to avoid a lot of the (physical) problems real hardware can cause in such +scenarios. +Still, this is a rather new and complicated feature, so its use should be +restricted to cases where its absolutely needed. Most of the functionality can +be replicated with other, well tested and less complicated, features, see +xref:qm_cpu_resource_limits[Resource Limits]. + +In {pve} the maximal number of plugged CPUs is always `cores * sockets`. +To start a VM with less than this total core count of CPUs you may use the +*vpus* setting, it denotes how many vCPUs should be plugged in at VM start. + +Currently only this feature is only supported on Linux, a kernel newer than 3.10 +is needed, a kernel newer than 4.7 is recommended. + +You can use a udev rule as follow to automatically set new CPUs as online in +the guest: + +---- +SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}="1" +---- + +Save this under /etc/udev/rules.d/ as a file ending in `.rules`. + +Note: CPU hot-remove is machine dependent and requires guest cooperation. +The deletion command does not guarantee CPU removal to actually happen, +typically it's a request forwarded to guest using target dependent mechanism, +e.g., ACPI on x86/amd64. + + +[[qm_memory]] +Memory +~~~~~~ + +For each VM you have the option to set a fixed size memory or asking +{pve} to dynamically allocate memory based on the current RAM usage of the +host. + +.Fixed Memory Allocation +[thumbnail="screenshot/gui-create-vm-memory.png"] + +When setting memory and minimum memory to the same amount +{pve} will simply allocate what you specify to your VM. + +Even when using a fixed memory size, the ballooning device gets added to the +VM, because it delivers useful information such as how much memory the guest +really uses. 
+In general, you should leave *ballooning* enabled, but if you want to disable +it (e.g. for debugging purposes), simply uncheck +*Ballooning Device* or set + + balloon: 0 + +in the configuration. + +.Automatic Memory Allocation + +// see autoballoon() in pvestatd.pm +When setting the minimum memory lower than memory, {pve} will make sure that the +minimum amount you specified is always available to the VM, and if RAM usage on +the host is below 80%, will dynamically add memory to the guest up to the +maximum memory specified. + +When the host is running low on RAM, the VM will then release some memory +back to the host, swapping running processes if needed and starting the oom +killer in last resort. The passing around of memory between host and guest is +done via a special `balloon` kernel driver running inside the guest, which will +grab or release memory pages from the host. +footnote:[A good explanation of the inner workings of the balloon driver can be found here https://rwmj.wordpress.com/2010/07/17/virtio-balloon/] + +When multiple VMs use the autoallocate facility, it is possible to set a +*Shares* coefficient which indicates the relative amount of the free host memory +that each VM should take. Suppose for instance you have four VMs, three of them +running an HTTP server and the last one is a database server. To cache more +database blocks in the database server RAM, you would like to prioritize the +database VM when spare RAM is available. For this you assign a Shares property +of 3000 to the database VM, leaving the other VMs to the Shares default setting +of 1000. The host server has 32GB of RAM, and is currently using 16GB, leaving 32 +* 80/100 - 16 = 9GB RAM to be allocated to the VMs. The database VM will get 9 * +3000 / (3000 + 1000 + 1000 + 1000) = 4.5 GB extra RAM and each HTTP server will +get 1.5 GB. + +All Linux distributions released after 2010 have the balloon kernel driver +included. For Windows OSes, the balloon driver needs to be added manually and can +incur a slowdown of the guest, so we don't recommend using it on critical +systems. +// see https://forum.proxmox.com/threads/solved-hyper-threading-vs-no-hyper-threading-fixed-vs-variable-memory.20265/ + +When allocating RAM to your VMs, a good rule of thumb is always to leave 1GB +of RAM available to the host. + + +[[qm_network_device]] +Network Device +~~~~~~~~~~~~~~ + +[thumbnail="screenshot/gui-create-vm-network.png"] + +Each VM can have many _Network interface controllers_ (NIC), of four different +types: + + * *Intel E1000* is the default, and emulates an Intel Gigabit network card. + * the *VirtIO* paravirtualized NIC should be used if you aim for maximum +performance. Like all VirtIO devices, the guest OS should have the proper driver +installed. + * the *Realtek 8139* emulates an older 100 MB/s network card, and should +only be used when emulating older operating systems ( released before 2002 ) + * the *vmxnet3* is another paravirtualized device, which should only be used +when importing a VM from another hypervisor. + +{pve} will generate for each NIC a random *MAC address*, so that your VM is +addressable on Ethernet networks. + +The NIC you added to the VM can follow one of two different models: + + * in the default *Bridged mode* each virtual NIC is backed on the host by a +_tap device_, ( a software loopback device simulating an Ethernet NIC ). This +tap device is added to a bridge, by default vmbr0 in {pve}. In this mode, VMs +have direct access to the Ethernet LAN on which the host is located. 
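++
+In the VM configuration a bridged VirtIO NIC then typically looks like the
+following sketch (the MAC address is only an example):
++
+ net0: virtio=EE:D2:28:5F:B6:3E,bridge=vmbr0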
+ * in the alternative *NAT mode*, each virtual NIC will only communicate with +the Qemu user networking stack, where a built-in router and DHCP server can +provide network access. This built-in DHCP will serve addresses in the private +10.0.2.0/24 range. The NAT mode is much slower than the bridged mode, and +should only be used for testing. This mode is only available via CLI or the API, +but not via the WebUI. + +You can also skip adding a network device when creating a VM by selecting *No +network device*. + +.Multiqueue +If you are using the VirtIO driver, you can optionally activate the +*Multiqueue* option. This option allows the guest OS to process networking +packets using multiple virtual CPUs, providing an increase in the total number +of packets transferred. + +//http://blog.vmsplice.net/2011/09/qemu-internals-vhost-architecture.html +When using the VirtIO driver with {pve}, each NIC network queue is passed to the +host kernel, where the queue will be processed by a kernel thread spawned by the +vhost driver. With this option activated, it is possible to pass _multiple_ +network queues to the host kernel for each NIC. + +//https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Tuning_and_Optimization_Guide/sect-Virtualization_Tuning_Optimization_Guide-Networking-Techniques.html#sect-Virtualization_Tuning_Optimization_Guide-Networking-Multi-queue_virtio-net +When using Multiqueue, it is recommended to set it to a value equal +to the number of Total Cores of your guest. You also need to set in +the VM the number of multi-purpose channels on each VirtIO NIC with the ethtool +command: + +`ethtool -L ens1 combined X` + +where X is the number of the number of vcpus of the VM. + +You should note that setting the Multiqueue parameter to a value greater +than one will increase the CPU load on the host and guest systems as the +traffic increases. We recommend to set this option only when the VM has to +process a great number of incoming connections, such as when the VM is running +as a router, reverse proxy or a busy HTTP server doing long polling. + + +[[qm_usb_passthrough]] +USB Passthrough +~~~~~~~~~~~~~~~ + +There are two different types of USB passthrough devices: + +* Host USB passthrough +* SPICE USB passthrough + +Host USB passthrough works by giving a VM a USB device of the host. +This can either be done via the vendor- and product-id, or +via the host bus and port. + +The vendor/product-id looks like this: *0123:abcd*, +where *0123* is the id of the vendor, and *abcd* is the id +of the product, meaning two pieces of the same usb device +have the same id. + +The bus/port looks like this: *1-2.3.4*, where *1* is the bus +and *2.3.4* is the port path. This represents the physical +ports of your host (depending of the internal order of the +usb controllers). + +If a device is present in a VM configuration when the VM starts up, +but the device is not present in the host, the VM can boot without problems. +As soon as the device/port is available in the host, it gets passed through. + +WARNING: Using this kind of USB passthrough means that you cannot move +a VM online to another host, since the hardware is only available +on the host the VM is currently residing. + +The second type of passthrough is SPICE USB passthrough. This is useful +if you use a SPICE client which supports it. 
If you add a SPICE USB port +to your VM, you can passthrough a USB device from where your SPICE client is, +directly to the VM (for example an input device or hardware dongle). + + +[[qm_bios_and_uefi]] +BIOS and UEFI +~~~~~~~~~~~~~ + +In order to properly emulate a computer, QEMU needs to use a firmware. +By default QEMU uses *SeaBIOS* for this, which is an open-source, x86 BIOS +implementation. SeaBIOS is a good choice for most standard setups. + +There are, however, some scenarios in which a BIOS is not a good firmware +to boot from, e.g. if you want to do VGA passthrough. footnote:[Alex Williamson has a very good blog entry about this. +http://vfio.blogspot.co.at/2014/08/primary-graphics-assignment-without-vga.html] +In such cases, you should rather use *OVMF*, which is an open-source UEFI implementation. footnote:[See the OVMF Project http://www.tianocore.org/ovmf/] + +If you want to use OVMF, there are several things to consider: + +In order to save things like the *boot order*, there needs to be an EFI Disk. +This disk will be included in backups and snapshots, and there can only be one. + +You can create such a disk with the following command: + + qm set -efidisk0 :1,format= + +Where ** is the storage where you want to have the disk, and +** is a format which the storage supports. Alternatively, you can +create such a disk through the web interface with 'Add' -> 'EFI Disk' in the +hardware section of a VM. + +When using OVMF with a virtual display (without VGA passthrough), +you need to set the client resolution in the OVMF menu(which you can reach +with a press of the ESC button during boot), or you have to choose +SPICE as the display type. + +[[qm_startup_and_shutdown]] +Automatic Start and Shutdown of Virtual Machines +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +After creating your VMs, you probably want them to start automatically +when the host system boots. For this you need to select the option 'Start at +boot' from the 'Options' Tab of your VM in the web interface, or set it with +the following command: + + qm set -onboot 1 -qm is a script to manage virtual machines with Qemu/Kvm. You can +.Start and Shutdown Order + +[thumbnail="screenshot/gui-qemu-edit-start-order.png"] + +In some case you want to be able to fine tune the boot order of your +VMs, for instance if one of your VM is providing firewalling or DHCP +to other guest systems. For this you can use the following +parameters: + +* *Start/Shutdown order*: Defines the start order priority. E.g. set it to 1 if +you want the VM to be the first to be started. (We use the reverse startup +order for shutdown, so a machine with a start order of 1 would be the last to +be shut down). If multiple VMs have the same order defined on a host, they will +additionally be ordered by 'VMID' in ascending order. +* *Startup delay*: Defines the interval between this VM start and subsequent +VMs starts . E.g. set it to 240 if you want to wait 240 seconds before starting +other VMs. +* *Shutdown timeout*: Defines the duration in seconds {pve} should wait +for the VM to be offline after issuing a shutdown command. +By default this value is set to 180, which means that {pve} will issue a +shutdown request and wait 180 seconds for the machine to be offline. If +the machine is still online after the timeout it will be stopped forcefully. + +NOTE: VMs managed by the HA stack do not follow the 'start on boot' and +'boot order' options currently. 
Those VMs will be skipped by the startup and +shutdown algorithm as the HA manager itself ensures that VMs get started and +stopped. + +Please note that machines without a Start/Shutdown order parameter will always +start after those where the parameter is set. Further, this parameter can only +be enforced between virtual machines running on the same host, not +cluster-wide. + + +[[qm_migration]] +Migration +--------- + +[thumbnail="screenshot/gui-qemu-migrate.png"] + +If you have a cluster, you can migrate your VM to another host with + + qm migrate + +There are generally two mechanisms for this + +* Online Migration (aka Live Migration) +* Offline Migration + +Online Migration +~~~~~~~~~~~~~~~~ + +When your VM is running and it has no local resources defined (such as disks +on local storage, passed through devices, etc.) you can initiate a live +migration with the -online flag. + +How it works +^^^^^^^^^^^^ + +This starts a Qemu Process on the target host with the 'incoming' flag, which +means that the process starts and waits for the memory data and device states +from the source Virtual Machine (since all other resources, e.g. disks, +are shared, the memory content and device state are the only things left +to transmit). + +Once this connection is established, the source begins to send the memory +content asynchronously to the target. If the memory on the source changes, +those sections are marked dirty and there will be another pass of sending data. +This happens until the amount of data to send is so small that it can +pause the VM on the source, send the remaining data to the target and start +the VM on the target in under a second. + +Requirements +^^^^^^^^^^^^ + +For Live Migration to work, there are some things required: + +* The VM has no local resources (e.g. passed through devices, local disks, etc.) +* The hosts are in the same {pve} cluster. +* The hosts have a working (and reliable) network connection. +* The target host must have the same or higher versions of the + {pve} packages. (It *might* work the other way, but this is never guaranteed) + +Offline Migration +~~~~~~~~~~~~~~~~~ + +If you have local resources, you can still offline migrate your VMs, +as long as all disk are on storages, which are defined on both hosts. +Then the migration will copy the disk over the network to the target host. + +[[qm_copy_and_clone]] +Copies and Clones +----------------- + +[thumbnail="screenshot/gui-qemu-full-clone.png"] + +VM installation is usually done using an installation media (CD-ROM) +from the operation system vendor. Depending on the OS, this can be a +time consuming task one might want to avoid. + +An easy way to deploy many VMs of the same type is to copy an existing +VM. We use the term 'clone' for such copies, and distinguish between +'linked' and 'full' clones. + +Full Clone:: + +The result of such copy is an independent VM. The +new VM does not share any storage resources with the original. ++ + +It is possible to select a *Target Storage*, so one can use this to +migrate a VM to a totally different storage. You can also change the +disk image *Format* if the storage driver supports several formats. ++ + +NOTE: A full clone need to read and copy all VM image data. This is +usually much slower than creating a linked clone. ++ + +Some storage types allows to copy a specific *Snapshot*, which +defaults to the 'current' VM data. This also means that the final copy +never includes any additional snapshots from the original VM. 
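++
+As a command line sketch (the VM IDs, name and target storage are placeholder
+values), a full clone of VM 100 could be created with:
++
+ qm clone 100 123 --full --name webmail-copy --storage local-lvm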
+ + +Linked Clone:: + +Modern storage drivers supports a way to generate fast linked +clones. Such a clone is a writable copy whose initial contents are the +same as the original data. Creating a linked clone is nearly +instantaneous, and initially consumes no additional space. ++ + +They are called 'linked' because the new image still refers to the +original. Unmodified data blocks are read from the original image, but +modification are written (and afterwards read) from a new +location. This technique is called 'Copy-on-write'. ++ + +This requires that the original volume is read-only. With {pve} one +can convert any VM into a read-only <>). Such +templates can later be used to create linked clones efficiently. ++ + +NOTE: You cannot delete the original template while linked clones +exists. ++ + +It is not possible to change the *Target storage* for linked clones, +because this is a storage internal feature. + + +The *Target node* option allows you to create the new VM on a +different node. The only restriction is that the VM is on shared +storage, and that storage is also available on the target node. + +To avoid resource conflicts, all network interface MAC addresses gets +randomized, and we generate a new 'UUID' for the VM BIOS (smbios1) +setting. + + +[[qm_templates]] +Virtual Machine Templates +------------------------- + +One can convert a VM into a Template. Such templates are read-only, +and you can use them to create linked clones. + +NOTE: It is not possible to start templates, because this would modify +the disk images. If you want to change the template, create a linked +clone and modify that. + +VM Generation ID +---------------- + +{pve} supports Virtual Machine Generation ID ('vmgedid') footnote:[Official +'vmgenid' Specification +https://docs.microsoft.com/en-us/windows/desktop/hyperv_v2/virtual-machine-generation-identifier] +for virtual machines. +This can be used by the guest operating system to detect any event resulting +in a time shift event, for example, restoring a backup or a snapshot rollback. + +When creating new VMs, a 'vmgenid' will be automatically generated and saved +in its configuration file. + +To create and add a 'vmgenid' to an already existing VM one can pass the +special value `1' to let {pve} autogenerate one or manually set the 'UUID' +footnote:[Online GUID generator http://guid.one/] by using it as value, +e.g.: + +---- + qm set VMID -vmgenid 1 + qm set VMID -vmgenid 00000000-0000-0000-0000-000000000000 +---- + +In the rare case the 'vmgenid' mechanism is not wanted one can pass `0' for +its value on VM creation, or retroactively delete the property in the +configuration with: + +---- + qm set VMID -delete vmgenid +---- + +The most prominent use case for 'vmgenid' are newer Microsoft Windows +operating systems, which use it to avoid problems in time sensitive or +replicate services (e.g., databases, domain controller) on snapshot +rollback, backup restore or a whole VM clone operation. + +Importing Virtual Machines and disk images +------------------------------------------ + +A VM export from a foreign hypervisor takes usually the form of one or more disk + images, with a configuration file describing the settings of the VM (RAM, + number of cores). + +The disk images can be in the vmdk format, if the disks come from +VMware or VirtualBox, or qcow2 if the disks come from a KVM hypervisor. 
+The most popular configuration format for VM exports is the OVF standard, but in +practice interoperation is limited because many settings are not implemented in +the standard itself, and hypervisors export the supplementary information +in non-standard extensions. + +Besides the problem of format, importing disk images from other hypervisors +may fail if the emulated hardware changes too much from one hypervisor to +another. Windows VMs are particularly concerned by this, as the OS is very +picky about any changes of hardware. This problem may be solved by +installing the MergeIDE.zip utility available from the Internet before exporting +and choosing a hard disk type of *IDE* before booting the imported Windows VM. + +Finally there is the question of paravirtualized drivers, which improve the +speed of the emulated system and are specific to the hypervisor. +GNU/Linux and other free Unix OSes have all the necessary drivers installed by +default and you can switch to the paravirtualized drivers right after importing +the VM. For Windows VMs, you need to install the Windows paravirtualized +drivers by yourself. + +GNU/Linux and other free Unix can usually be imported without hassle. Note +that we cannot guarantee a successful import/export of Windows VMs in all +cases due to the problems above. + +Step-by-step example of a Windows OVF import +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Microsoft provides +https://developer.microsoft.com/en-us/windows/downloads/virtual-machines/[Virtual Machines downloads] + to get started with Windows development.We are going to use one of these +to demonstrate the OVF import feature. + +Download the Virtual Machine zip +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +After getting informed about the user agreement, choose the _Windows 10 +Enterprise (Evaluation - Build)_ for the VMware platform, and download the zip. + +Extract the disk image from the zip +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Using the `unzip` utility or any archiver of your choice, unpack the zip, +and copy via ssh/scp the ovf and vmdk files to your {pve} host. + +Import the Virtual Machine +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This will create a new virtual machine, using cores, memory and +VM name as read from the OVF manifest, and import the disks to the +local-lvm+ + storage. You have to configure the network manually. + + qm importovf 999 WinDev1709Eval.ovf local-lvm + +The VM is ready to be started. + +Adding an external disk image to a Virtual Machine +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +You can also add an existing disk image to a VM, either coming from a +foreign hypervisor, or one that you created yourself. + +Suppose you created a Debian/Ubuntu disk image with the 'vmdebootstrap' tool: + + vmdebootstrap --verbose \ + --size 10GiB --serial-console \ + --grub --no-extlinux \ + --package openssh-server \ + --package avahi-daemon \ + --package qemu-guest-agent \ + --hostname vm600 --enable-dhcp \ + --customize=./copy_pub_ssh.sh \ + --sparse --image vm600.raw + +You can now create a new target VM for this image. + + qm create 600 --net0 virtio,bridge=vmbr0 --name vm600 --serial0 socket \ + --bootdisk scsi0 --scsihw virtio-scsi-pci --ostype l26 + +Add the disk image as +unused0+ to the VM, using the storage +pvedir+: + + qm importdisk 600 vm600.raw pvedir + +Finally attach the unused disk to the SCSI controller of the VM: + + qm set 600 --scsi0 pvedir:600/vm-600-disk-1.raw + +The VM is ready to be started. 
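+
+Before the first boot you may want to double-check that the disk was attached
+as expected, for example:
+
+ qm config 600 | grep scsi0
+
+and then start the VM with `qm start 600`.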
+ + +ifndef::wiki[] +include::qm-cloud-init.adoc[] +endif::wiki[] + + + +Managing Virtual Machines with `qm` +------------------------------------ + +qm is the tool to manage Qemu/Kvm virtual machines on {pve}. You can create and destroy virtual machines, and control execution (start/stop/suspend/resume). Besides that, you can use qm to set parameters in the associated config file. It is also possible to create and delete virtual disks. +CLI Usage Examples +~~~~~~~~~~~~~~~~~~ + +Using an iso file uploaded on the 'local' storage, create a VM +with a 4 GB IDE disk on the 'local-lvm' storage + + qm create 300 -ide0 local-lvm:4 -net0 e1000 -cdrom local:iso/proxmox-mailgateway_2.1.iso + +Start the new VM + + qm start 300 + +Send a shutdown request, then wait until the VM is stopped. + + qm shutdown 300 && qm wait 300 + +Same as above, but only wait for 40 seconds. + + qm shutdown 300 && qm wait 300 -timeout 40 + + +[[qm_configuration]] Configuration ------------- -All configuration files consists of lines in the form +VM configuration files are stored inside the Proxmox cluster file +system, and can be accessed at `/etc/pve/qemu-server/.conf`. +Like other files stored inside `/etc/pve/`, they get automatically +replicated to all other cluster nodes. + +NOTE: VMIDs < 100 are reserved for internal purposes, and VMIDs need to be +unique cluster wide. + +.Example VM Configuration +---- +cores: 1 +sockets: 1 +memory: 512 +name: webmail +ostype: l26 +bootdisk: virtio0 +net0: e1000=EE:D2:28:5F:B6:3E,bridge=vmbr0 +virtio0: local:vm-100-disk-1,size=32G +---- + +Those configuration files are simple text files, and you can edit them +using a normal text editor (`vi`, `nano`, ...). This is sometimes +useful to do small corrections, but keep in mind that you need to +restart the VM to apply such changes. + +For that reason, it is usually better to use the `qm` command to +generate and modify those files, or do the whole thing using the GUI. +Our toolkit is smart enough to instantaneously apply most changes to +running VM. This feature is called "hot plug", and there is no +need to restart the VM in that case. + + +File Format +~~~~~~~~~~~ + +VM configuration files use a simple colon separated key/value +format. Each line has the following format: + +----- +# this is a comment +OPTION: value +----- + +Blank lines in those files are ignored, and lines starting with a `#` +character are treated as comments and are also ignored. + - PARAMETER: value +[[qm_snapshots]] +Snapshots +~~~~~~~~~ -Configuration files are stored inside the Proxmox cluster file -system, and can be access at '/etc/pve/qemu-server/.conf'. +When you create a snapshot, `qm` stores the configuration at snapshot +time into a separate snapshot section within the same configuration +file. For example, after creating a snapshot called ``testsnapshot'', +your configuration file will look like this: +.VM configuration with snapshot +---- +memory: 512 +swap: 512 +parent: testsnaphot +... + +[testsnaphot] +memory: 512 +swap: 512 +snaptime: 1457170803 +... +---- + +There are a few snapshot related properties like `parent` and +`snaptime`. The `parent` property is used to store the parent/child +relationship between snapshots. `snaptime` is the snapshot creation +time stamp (Unix epoch). + + +[[qm_options]] Options ~~~~~~~ @@ -50,32 +1046,35 @@ include::qm.conf.5-opts.adoc[] Locks ----- -Online migrations and backups ('vzdump') set a lock to prevent incompatible -concurrent actions on the affected VMs. 
Sometimes you need to remove such a -lock manually (e.g., after a power failure). +Online migrations, snapshots and backups (`vzdump`) set a lock to +prevent incompatible concurrent actions on the affected VMs. Sometimes +you need to remove such a lock manually (e.g., after a power failure). qm unlock -Examples --------- +CAUTION: Only do that if you are sure the action which set the lock is +no longer running. -Create a new VM with 4 GB IDE disk. - qm create 300 -ide0 4 -net0 e1000 -cdrom proxmox-mailgateway_2.1.iso +ifdef::wiki[] -Start the new VM +See Also +~~~~~~~~ - qm start 300 +* link:/wiki/Cloud-Init_Support[Cloud-Init Support] -Send a shutdown request, then wait until the VM is stopped. +endif::wiki[] - qm shutdown 300 && qm wait 300 -Same as above, but only wait for 40 seconds. +ifdef::manvolnum[] - qm shutdown 300 && qm wait 300 -timeout 40 +Files +------ + +`/etc/pve/qemu-server/.conf`:: + +Configuration file for the VM ''. -ifdef::manvolnum[] include::pve-copyright.adoc[] endif::manvolnum[]