Qemu/KVM Virtual Machines
=========================
include::attributes.txt[]
endif::manvolnum[]

// deprecates
// http://pve.proxmox.com/wiki/Container_and_Full_Virtualization
// http://pve.proxmox.com/wiki/KVM
// http://pve.proxmox.com/wiki/Qemu_Server

Qemu (short form for Quick Emulator) is an open source hypervisor that emulates a
physical computer. From the perspective of the host system where Qemu is
running, Qemu is a user program which has access to a number of local resources
like partitions, files and network cards, which are then passed to an
emulated computer that sees them as if they were real devices.

A guest operating system running in the emulated computer accesses these
devices, and runs as if it were running on real hardware. For instance, you can
pass an ISO image as a parameter to Qemu, and the OS running in the emulated
computer will see a real CD-ROM inserted into a CD drive.

Qemu can emulate a great variety of hardware from ARM to Sparc, but {pve} is
only concerned with 32 and 64 bit PC clone emulation, since it represents the
overwhelming majority of server hardware. The emulation of PC clones is also one
of the fastest thanks to processor extensions which greatly speed up Qemu when
the emulated architecture is the same as the host architecture.

NOTE: You may sometimes encounter the term _KVM_ (Kernel-based Virtual Machine).
It means that Qemu is running with the support of the virtualization processor
extensions, via the Linux kvm module. In the context of {pve}, _Qemu_ and _KVM_
can be used interchangeably, as Qemu in {pve} will always try to load the kvm
module.

Qemu inside {pve} runs as a root process, since this is required to access block
and PCI devices.


Emulated devices and paravirtualized devices
--------------------------------------------

The PC hardware emulated by Qemu includes a mainboard, network controllers,
SCSI, IDE and SATA controllers, serial ports (the complete list can be found in
the `kvm(1)` man page), all of them emulated in software. All these devices
are the exact software equivalent of existing hardware devices, and if the OS
running in the guest has the proper drivers, it will use the devices as if it
were running on real hardware. This allows Qemu to run _unmodified_ operating
systems.

This however has a performance cost, as running in software what was meant to
run in hardware involves a lot of extra work for the host CPU. To mitigate this,
Qemu can present to the guest operating system _paravirtualized devices_, where
the guest OS recognizes it is running inside Qemu and cooperates with the
hypervisor.

Qemu relies on the virtio virtualization standard, and is thus able to present
paravirtualized virtio devices, which include a paravirtualized generic disk
controller, a paravirtualized network card, a paravirtualized serial port,
a paravirtualized SCSI controller, and so on.

It is highly recommended to use the virtio devices whenever you can, as they
provide a big performance improvement. Using the virtio generic disk controller
versus an emulated IDE controller will double the sequential write throughput,
as measured with `bonnie++(8)`. Using the virtio network interface can deliver
up to three times the throughput of an emulated Intel E1000 network card, as
measured with `iperf(1)`. footnote:[See this benchmark on the KVM wiki
http://www.linux-kvm.org/page/Using_VirtIO_NIC]
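As a minimal sketch, an existing VM can be switched to paravirtualized devices
from the command line. The VM ID 100 and the bridge vmbr0 are assumptions, the
guest must have virtio drivers installed, and replacing a network device this
way generates a new MAC address unless you specify one explicitly:

 # assumption: VM 100 already has SCSI disks; use the paravirtualized VirtIO SCSI controller for them
 qm set 100 -scsihw virtio-scsi-pci

 # assumption: bridge vmbr0 exists; replace the emulated NIC with a paravirtualized VirtIO NIC
 qm set 100 -net0 virtio,bridge=vmbr0
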
Virtual Machine Settings
------------------------

Generally speaking, {pve} tries to choose sane defaults for virtual machines
(VMs). Make sure you understand the meaning of the settings you change, as a
wrong choice could incur a performance slowdown or put your data at risk.


General Settings
~~~~~~~~~~~~~~~~

General settings of a VM include

* the *Node*: the physical server on which the VM will run
* the *VM ID*: a unique number in this {pve} installation used to identify your VM
* *Name*: a free form text string you can use to describe the VM
* *Resource Pool*: a logical group of VMs


OS Settings
~~~~~~~~~~~

When creating a VM, setting the proper operating system (OS) allows {pve} to
optimize some low level parameters. For instance, Windows OSes expect the BIOS
clock to use the local time, while Unix based OSes expect the BIOS clock to be
set to UTC.
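On the command line, these settings map to `qm create` options. A sketch with
assumed values: the VM ID 300, the name db01 and the pool 'production' are
examples (the pool must already exist), and `l26` is the OS type used for
recent Linux kernels:

 # create an empty VM with a descriptive name, a Linux OS type and a resource pool
 qm create 300 -name db01 -ostype l26 -pool production
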
Hard Disk
~~~~~~~~~

Qemu can emulate a number of storage controllers:

* the *IDE* controller has a design which goes back to the 1984 PC/AT disk
controller. Even if this controller has been superseded by more recent designs,
each and every OS you can think of has support for it, making it a great choice
if you want to run an OS released before 2003. You can connect up to 4 devices
to this controller.

* the *SATA* (Serial ATA) controller, dating from 2003, has a more modern
design, allowing higher throughput and a greater number of devices to be
connected. You can connect up to 6 devices to this controller.

* the *SCSI* controller, designed in 1985, is commonly found on server grade
hardware, and can connect up to 14 storage devices. By default {pve} emulates an
LSI 53C895A controller.

A SCSI controller of type _Virtio_ is the recommended setting if you aim for
performance, and it is automatically selected for newly created Linux VMs since
{pve} 4.3. Linux distributions have support for this controller since 2012, and
FreeBSD since 2014. For Windows OSes, you need to provide an extra ISO
containing the drivers during the installation.
// https://pve.proxmox.com/wiki/Paravirtualized_Block_Drivers_for_Windows#During_windows_installation.

* the *Virtio* controller, also called virtio-blk to distinguish it from
the Virtio SCSI controller, is an older type of paravirtualized controller
which has been superseded in features by the Virtio SCSI controller.

On each controller you can attach a number of emulated hard disks, which are
backed by a file or a block device residing in the configured storage. The
choice of a storage type will determine the format of the hard disk image.
Storages which present block devices (LVM, ZFS, Ceph) will require the *raw disk
image format*, whereas file based storages (Ext4, NFS, GlusterFS) will let you
choose either the *raw disk image format* or the *QEMU image format*.

* the *QEMU image format* is a copy on write format which allows snapshots and
thin provisioning of the disk image.
* the *raw disk image* is a bit-to-bit image of a hard disk, similar to what
you would get when executing the `dd` command on a block device in Linux. This
format does not support thin provisioning or snapshotting by itself, requiring
cooperation from the storage layer for these tasks. It is however 10% faster
than the *QEMU image format*. footnote:[See this benchmark for details
http://events.linuxfoundation.org/sites/events/files/slides/CloudOpen2013_Khoa_Huynh_v3.pdf]
* the *VMware image format* only makes sense if you intend to import or export
the disk image to other hypervisors.

Setting the *Cache* mode of the hard drive will impact how the host system
notifies the guest system of block write completions. The *No cache* default
means that the guest system will be notified that a write is complete when each
block reaches the physical storage write queue, ignoring the host page cache.
This provides a good balance between safety and speed.

If you want the {pve} backup manager to skip a disk when doing a backup of a VM,
you can set the *No backup* option on that disk.

If your storage supports _thin provisioning_ (see the storage chapter in the
{pve} guide), and your VM has a *SCSI* controller, you can activate the *Discard*
option on the hard disks connected to that controller. With *Discard* enabled,
when the filesystem of a VM marks blocks as unused after removing files, the
emulated SCSI controller will relay this information to the storage, which will
then shrink the disk image accordingly.

.IO Thread
The option *IO Thread* can only be enabled when using a disk with the *VirtIO*
controller, or with the *SCSI* controller when the emulated controller type is
*VirtIO SCSI*. With this enabled, Qemu uses one thread per disk, instead of one
thread for all disks, so it should increase performance when using multiple
disks. Note that backups do not currently work with *IO Thread* enabled.
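As an illustration, the following adds a disk with *Discard* enabled from the
command line. The VM ID 300 and the storage name 'local-lvm' are assumptions;
the storage must actually support thin provisioning, and the controller type
has to be VirtIO SCSI for *Discard* to take effect:

 # use the VirtIO SCSI controller type for VM 300
 qm set 300 -scsihw virtio-scsi-pci

 # add a new 32 GB disk on storage 'local-lvm', with Discard enabled and the default No cache mode
 qm set 300 -scsi0 local-lvm:32,discard=on,cache=none
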
CPU
~~~

A *CPU socket* is a physical slot on a PC motherboard where you can plug a CPU.
This CPU can then contain one or many *cores*, which are independent processing
units. Whether you have a single CPU socket with 4 cores, or two CPU sockets
with two cores each, is mostly irrelevant from a performance point of view.
However some software is licensed depending on the number of sockets you have in
your machine; in that case it makes sense to set the number of sockets to what
the license allows you, and increase the number of cores.

Increasing the number of virtual CPUs (cores and sockets) will usually provide a
performance improvement, though this heavily depends on the workload of the VM.
Multithreaded applications will of course benefit from a large number of
virtual CPUs, as for each virtual CPU you add, Qemu will create a new thread of
execution on the host system. If you're not sure about the workload of your VM,
it is usually a safe bet to set the number of *Total cores* to 2.

NOTE: It is perfectly safe to set the _overall_ number of total cores in all
your VMs to be greater than the number of cores on your server (e.g. 4 VMs with
4 total cores each, running on an 8 core machine). In that case the host system
will balance the Qemu execution threads between your server cores, just as if
you were running a standard multithreaded application. However, {pve} will
prevent you from allocating more virtual CPUs to a _single_ VM than are
physically available, as this would only bring the performance down due to the
cost of context switches.

Qemu can emulate a number of different *CPU types*, from 486 to the latest Xeon
processors. Each new processor generation adds new features, like hardware
assisted 3d rendering, random number generation, memory protection, and so on.
Usually you should select for your VM a processor type which closely matches the
CPU of the host system, as it means that the host CPU features (also called _CPU
flags_) will be available in your VMs. If you want an exact match, you can set
the CPU type to *host*, in which case the VM will have exactly the same CPU
flags as your host system.

This has a downside though. If you want to do a live migration of VMs between
different hosts, your VM might end up on a new system with a different CPU type.
If the CPU flags passed to the guest are missing, the qemu process will stop. To
remedy this, Qemu also has its own CPU type *kvm64*, which {pve} uses by
default. kvm64 is a Pentium 4 look-alike CPU type with a reduced CPU flag set,
but it is guaranteed to work everywhere.

In short, if you care about live migration and moving VMs between nodes, leave
the kvm64 default. If you don't care about live migration, set the CPU type to
host, as in theory this will give your guests maximum performance.

You can also optionally emulate a *NUMA* architecture in your VMs. The basics of
the NUMA architecture mean that instead of having a global memory pool available
to all your cores, the memory is spread into local banks close to each socket.
This can bring speed improvements as the memory bus is no longer a bottleneck.
If your system has a NUMA architecture footnote:[if the command
`numactl --hardware | grep available` returns more than one node, then your host
system has a NUMA architecture] we recommend activating the option, as this
will allow proper distribution of the VM resources on the host system. This
option is also required in {pve} to allow hotplugging of cores and RAM to a VM.

If the NUMA option is used, it is recommended to set the number of sockets to
the number of sockets of the host system.
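For example, the following sketch matches a VM to a dual socket, quad core host
and enables NUMA; the VM ID 300 is again just an example. Only switch to
`-cpu host` if you do not rely on live migration between hosts with different
CPUs:

 # 2 sockets x 4 cores = 8 virtual CPUs, with NUMA enabled (keeps the kvm64 default CPU type)
 qm set 300 -sockets 2 -cores 4 -numa 1

 # optionally expose all host CPU flags to the guest (hinders live migration to different CPUs)
 qm set 300 -cpu host
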
Memory
~~~~~~

For each VM you have the option to set a fixed amount of memory, or to ask
{pve} to dynamically allocate memory based on the current RAM usage of the
host.

When choosing a *fixed size memory*, {pve} will simply allocate what you
specify to your VM.

// see autoballoon() in pvestatd.pm
When choosing to *automatically allocate memory*, {pve} will make sure that the
minimum amount you specified is always available to the VM, and if RAM usage on
the host is below 80%, will dynamically add memory to the guest up to the
maximum memory specified.

When the host is running short on RAM, the VM will then release some memory
back to the host, swapping running processes if needed and starting the oom
killer as a last resort. The passing around of memory between host and guest is
done via a special `balloon` kernel driver running inside the guest, which will
grab or release memory pages from the host.
footnote:[A good explanation of the inner workings of the balloon driver can be found here https://rwmj.wordpress.com/2010/07/17/virtio-balloon/]

When multiple VMs use the autoallocate facility, it is possible to set a
*Shares* coefficient which indicates the relative amount of the free host memory
that each VM should take. Suppose for instance you have four VMs, three of them
running an HTTP server and the last one a database server. To cache more
database blocks in the database server RAM, you would like to prioritize the
database VM when spare RAM is available. For this you assign a Shares property
of 3000 to the database VM, leaving the other VMs at the Shares default setting
of 1000. The host server has 32GB of RAM, and is currently using 16GB, leaving
32 * 80/100 - 16 = 9.6GB of RAM to be allocated to the VMs. The database VM will
get 9.6 * 3000 / (3000 + 1000 + 1000 + 1000) = 4.8GB of extra RAM, and each HTTP
server will get 1.6GB.

All Linux distributions released after 2010 have the balloon kernel driver
included. For Windows OSes, the balloon driver needs to be added manually and
can incur a slowdown of the guest, so we don't recommend using it on critical
systems.
// see https://forum.proxmox.com/threads/solved-hyper-threading-vs-no-hyper-threading-fixed-vs-variable-memory.20265/

When allocating RAM to your VMs, a good rule of thumb is to always leave 1GB
of RAM available to the host.
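The corresponding CLI options are sketched below; the VM ID and the values are
examples. `-memory` sets the maximum amount, `-balloon` the guaranteed minimum
(0 disables ballooning), and `-shares` the priority when spare RAM is handed
out:

 # automatic allocation: up to 4 GB, at least 2 GB, prioritized when spare RAM is distributed
 qm set 300 -memory 4096 -balloon 2048 -shares 3000

 # fixed size memory instead: always allocate 4 GB, no ballooning
 qm set 300 -memory 4096 -balloon 0
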
Network Device
~~~~~~~~~~~~~~

Each VM can have many _Network interface controllers_ (NICs), of four different
types:

* *Intel E1000* is the default, and emulates an Intel Gigabit network card.
* the *VirtIO* paravirtualized NIC should be used if you aim for maximum
performance. Like all VirtIO devices, the guest OS should have the proper
driver installed.
* the *Realtek 8139* emulates an older 100 Mbit/s network card, and should
only be used when emulating older operating systems (released before 2002).
* the *vmxnet3* is another paravirtualized device, which should only be used
when importing a VM from another hypervisor.

{pve} will generate for each NIC a random *MAC address*, so that your VM is
addressable on Ethernet networks.

The NIC you added to the VM can follow one of two different models:

* in the default *Bridged mode* each virtual NIC is backed on the host by a
_tap device_ (a software loopback device simulating an Ethernet NIC). This
tap device is added to a bridge, by default vmbr0 in {pve}. In this mode, VMs
have direct access to the Ethernet LAN on which the host is located.
* in the alternative *NAT mode*, each virtual NIC will only communicate with
the Qemu user networking stack, where a built-in router and DHCP server can
provide network access. This built-in DHCP server will serve addresses in the
private 10.0.2.0/24 range. The NAT mode is much slower than the bridged mode,
and should only be used for testing.

You can also skip adding a network device when creating a VM by selecting *No
network device*.

.Multiqueue
If you are using the VirtIO driver, you can optionally activate the
*Multiqueue* option. This option allows the guest OS to process networking
packets using multiple virtual CPUs, providing an increase in the total number
of packets transferred.

//http://blog.vmsplice.net/2011/09/qemu-internals-vhost-architecture.html
When using the VirtIO driver with {pve}, each NIC network queue is passed to the
host kernel, where the queue will be processed by a kernel thread spawned by the
vhost driver. With this option activated, it is possible to pass _multiple_
network queues to the host kernel for each NIC.

//https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Tuning_and_Optimization_Guide/sect-Virtualization_Tuning_Optimization_Guide-Networking-Techniques.html#sect-Virtualization_Tuning_Optimization_Guide-Networking-Multi-queue_virtio-net
When using Multiqueue, it is recommended to set it to a value equal to the
number of total cores of your guest. You also need to set the number of
multi-purpose channels on each VirtIO NIC inside the VM with the ethtool
command:

`ethtool -L eth0 combined X`

where X is the number of virtual CPUs of the VM.

You should note that setting the Multiqueue parameter to a value greater
than one will increase the CPU load on the host and guest systems as the
traffic increases. We recommend setting this option only when the VM has to
process a great number of incoming connections, such as when the VM is running
as a router, reverse proxy or a busy HTTP server doing long polling.
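Putting this together on the command line, a sketch with assumed values (VM
300, the default vmbr0 bridge, a guest with 4 total cores and an interface
named eth0 inside the guest):

 # paravirtualized NIC on bridge vmbr0, with 4 network queues for Multiqueue
 qm set 300 -net0 virtio,bridge=vmbr0,queues=4

 # inside the guest, enable the matching number of channels on the NIC
 ethtool -L eth0 combined 4
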
USB Passthrough
~~~~~~~~~~~~~~~

There are two different types of USB passthrough devices:

* Host USB passthrough
* SPICE USB passthrough

Host USB passthrough works by giving a VM a USB device of the host.
This can either be done via the vendor and product id, or via the host bus and
port.

The vendor/product id looks like this: *0123:abcd*, where *0123* is the id of
the vendor, and *abcd* is the id of the product, meaning two pieces of the same
USB device have the same id.

The bus/port looks like this: *1-2.3.4*, where *1* is the bus and *2.3.4* is
the port path. This represents the physical ports of your host (depending on
the internal order of the USB controllers).

If a device is present in a VM configuration when the VM starts up, but the
device is not present in the host, the VM can boot without problems. As soon
as the device/port is available in the host, it gets passed through.

WARNING: Using this kind of USB passthrough means that you cannot move a VM
online to another host, since the hardware is only available on the host the
VM is currently residing on.

The second type of passthrough is SPICE USB passthrough. This is useful if you
use a SPICE client which supports it. If you add a SPICE USB port to your VM,
you can pass through a USB device from where your SPICE client is running,
directly to the VM (for example an input device or a hardware dongle).


BIOS and UEFI
~~~~~~~~~~~~~

In order to properly emulate a computer, QEMU needs to use a firmware.
By default QEMU uses *SeaBIOS* for this, which is an open-source, x86 BIOS
implementation. SeaBIOS is a good choice for most standard setups.

There are, however, some scenarios in which a BIOS is not a good firmware
to boot from, e.g. if you want to do VGA passthrough. footnote:[Alex Williamson
has a very good blog entry about this
http://vfio.blogspot.co.at/2014/08/primary-graphics-assignment-without-vga.html]
In such cases, you should rather use *OVMF*, which is an open-source UEFI
implementation. footnote:[See the OVMF Project http://www.tianocore.org/ovmf/]

If you want to use OVMF, there are several things to consider:

In order to save things like the *boot order*, there needs to be an EFI Disk.
This disk will be included in backups and snapshots, and there can only be one.

You can create such a disk with the following command:

 qm set <vmid> -efidisk0 <storage>:1,format=<format>

Where *<storage>* is the storage where you want to have the disk, and
*<format>* is a format which the storage supports. Alternatively, you can
create such a disk through the web interface with 'Add' -> 'EFI Disk' in the
hardware section of a VM.

When using OVMF with a virtual display (without VGA passthrough), you need to
set the client resolution in the OVMF menu (which you can reach by pressing the
ESC key during boot), or you have to choose SPICE as the display type.


Managing Virtual Machines with `qm`
-----------------------------------

`qm` is the tool to manage Qemu/KVM virtual machines on {pve}. You can
create and destroy virtual machines, and control execution
(start/stop/suspend/resume). Besides that, you can use `qm` to set
parameters in the associated config file. It is also possible to
create and delete virtual disks.


CLI Usage Examples
~~~~~~~~~~~~~~~~~~

Create a new VM with 4 GB IDE disk.

 qm create 300 -ide0 4 -net0 e1000 -cdrom proxmox-mailgateway_2.1.iso

Send a shutdown request, then wait until the VM is stopped.

 qm shutdown 300 && qm wait 300

Same as above, but only wait for 40 seconds.

 qm shutdown 300 && qm wait 300 -timeout 40


Configuration
-------------

All configuration files consist of lines in the form

 PARAMETER: value

Configuration files are stored inside the Proxmox cluster file
system, and can be accessed at `/etc/pve/qemu-server/<VMID>.conf`.


Options
~~~~~~~

include::qm.conf.5-opts.adoc[]


Locks
-----

Online migrations and backups (`vzdump`) set a lock to prevent incompatible
concurrent actions on the affected VMs. Sometimes you need to remove such a
lock manually (e.g., after a power failure).

 qm unlock <vmid>


ifdef::manvolnum[]
include::pve-copyright.adoc[]