[[chapter_virtual_machines]]
ifdef::manvolnum[]
qm(1)
=====
:pve-toplevel:

NAME
----

qm - Qemu/KVM Virtual Machine Manager


SYNOPSIS
--------

include::qm.1-synopsis.adoc[]

DESCRIPTION
-----------
endif::manvolnum[]
ifndef::manvolnum[]
Qemu/KVM Virtual Machines
=========================
:pve-toplevel:
endif::manvolnum[]

// deprecates
// http://pve.proxmox.com/wiki/Container_and_Full_Virtualization
// http://pve.proxmox.com/wiki/KVM
// http://pve.proxmox.com/wiki/Qemu_Server

Qemu (short form for Quick Emulator) is an open source hypervisor that emulates a
physical computer. From the perspective of the host system where Qemu is
running, Qemu is a user program which has access to a number of local resources
like partitions, files, and network cards, which are then passed to an
emulated computer which sees them as if they were real devices.

A guest operating system running in the emulated computer accesses these
devices, and runs as if it were running on real hardware. For instance, you can pass
an iso image as a parameter to Qemu, and the OS running in the emulated computer
will see a real CDROM inserted in a CD drive.

Qemu can emulate a great variety of hardware from ARM to Sparc, but {pve} is
only concerned with 32 and 64-bit PC clone emulation, since it represents the
overwhelming majority of server hardware. The emulation of PC clones is also one
of the fastest due to the availability of processor extensions which greatly
speed up Qemu when the emulated architecture is the same as the host
architecture.

NOTE: You may sometimes encounter the term _KVM_ (Kernel-based Virtual Machine).
It means that Qemu is running with the support of the virtualization processor
extensions, via the Linux kvm module. In the context of {pve}, _Qemu_ and
_KVM_ can be used interchangeably, as Qemu in {pve} will always try to load the kvm
module.

Qemu inside {pve} runs as a root process, since this is required to access block
and PCI devices.


Emulated devices and paravirtualized devices
--------------------------------------------

The PC hardware emulated by Qemu includes a mainboard, network controllers,
scsi, ide and sata controllers, and serial ports (the complete list can be seen in
the `kvm(1)` man page), all of them emulated in software. All these devices
are the exact software equivalent of existing hardware devices, and if the OS
running in the guest has the proper drivers it will use the devices as if it
were running on real hardware. This allows Qemu to run _unmodified_ operating
systems.

This however has a performance cost, as running in software what was meant to
run in hardware involves a lot of extra work for the host CPU. To mitigate this,
Qemu can present to the guest operating system _paravirtualized devices_, where
the guest OS recognizes it is running inside Qemu and cooperates with the
hypervisor.

Qemu relies on the virtio virtualization standard, and is thus able to present
paravirtualized virtio devices, which include a paravirtualized generic disk
controller, a paravirtualized network card, a paravirtualized serial port,
a paravirtualized SCSI controller, etc.

It is highly recommended to use the virtio devices whenever you can, as they
provide a big performance improvement. Using the virtio generic disk controller
versus an emulated IDE controller will double the sequential write throughput,
as measured with `bonnie++(8)`. Using the virtio network interface can deliver
up to three times the throughput of an emulated Intel E1000 network card, as
measured with `iperf(1)`. footnote:[See this benchmark on the KVM wiki
http://www.linux-kvm.org/page/Using_VirtIO_NIC]


[[qm_virtual_machines_settings]]
Virtual Machines Settings
-------------------------

Generally speaking {pve} tries to choose sane defaults for virtual machines
(VM). Make sure you understand the meaning of the settings you change, as
changing them could incur a performance slowdown or put your data at risk.


[[qm_general_settings]]
General Settings
~~~~~~~~~~~~~~~~

[thumbnail="screenshot/gui-create-vm-general.png"]

General settings of a VM include

* the *Node*: the physical server on which the VM will run
* the *VM ID*: a unique number in this {pve} installation used to identify your VM
* *Name*: a free form text string you can use to describe the VM
* *Resource Pool*: a logical group of VMs


[[qm_os_settings]]
OS Settings
~~~~~~~~~~~

[thumbnail="screenshot/gui-create-vm-os.png"]

When creating a VM, setting the proper Operating System (OS) allows {pve} to
optimize some low level parameters. For instance, Windows OSes expect the BIOS
clock to use the local time, while Unix-based OSes expect the BIOS clock to have
the UTC time.


[[qm_hard_disk]]
Hard Disk
~~~~~~~~~

Qemu can emulate a number of storage controllers:

* the *IDE* controller has a design which goes back to the 1984 PC/AT disk
controller. Even if this controller has been superseded by recent designs,
each and every OS you can think of has support for it, making it a great choice
if you want to run an OS released before 2003. You can connect up to 4 devices
on this controller.

* the *SATA* (Serial ATA) controller, dating from 2003, has a more modern
design, allowing higher throughput and a greater number of devices to be
connected. You can connect up to 6 devices on this controller.

* the *SCSI* controller, designed in 1985, is commonly found on server grade
hardware, and can connect up to 14 storage devices. {pve} emulates by default an
LSI 53C895A controller.
+
A SCSI controller of type _VirtIO SCSI_ is the recommended setting if you aim for
performance and is automatically selected for newly created Linux VMs since
{pve} 4.3. Linux distributions have supported this controller since 2012, and
FreeBSD since 2014. For Windows OSes, you need to provide an extra iso
containing the drivers during the installation.
// https://pve.proxmox.com/wiki/Paravirtualized_Block_Drivers_for_Windows#During_windows_installation.
If you aim at maximum performance, you can select a SCSI controller of type
_VirtIO SCSI single_ which will allow you to select the *IO Thread* option.
When selecting _VirtIO SCSI single_ Qemu will create a new controller for
each disk, instead of adding all disks to the same controller.

* The *VirtIO Block* controller, often just called VirtIO or virtio-blk,
is an older type of paravirtualized controller. It has been superseded by the
VirtIO SCSI Controller in terms of features.

[thumbnail="screenshot/gui-create-vm-hard-disk.png"]
On each controller you attach a number of emulated hard disks, which are backed
by a file or a block device residing in the configured storage. The choice of
a storage type will determine the format of the hard disk image. Storages which
present block devices (LVM, ZFS, Ceph) will require the *raw disk image format*,
whereas file-based storages (Ext4, NFS, CIFS, GlusterFS) will let you choose
either the *raw disk image format* or the *QEMU image format*.

 * the *QEMU image format* is a copy on write format which allows snapshots, and
 thin provisioning of the disk image.
 * the *raw disk image* is a bit-to-bit image of a hard disk, similar to what
 you would get when executing the `dd` command on a block device in Linux. This
 format does not support thin provisioning or snapshots by itself, requiring
 cooperation from the storage layer for these tasks. It may, however, be up to
 10% faster than the *QEMU image format*. footnote:[See this benchmark for details
 http://events.linuxfoundation.org/sites/events/files/slides/CloudOpen2013_Khoa_Huynh_v3.pdf]
 * the *VMware image format* only makes sense if you intend to import/export the
 disk image to other hypervisors.

Setting the *Cache* mode of the hard drive will impact how the host system will
notify the guest systems of block write completions. The *No cache* default
means that the guest system will be notified that a write is complete when each
block reaches the physical storage write queue, ignoring the host page cache.
This provides a good balance between safety and speed.

If you want the {pve} backup manager to skip a disk when doing a backup of a VM,
you can set the *No backup* option on that disk.

If you want the {pve} storage replication mechanism to skip a disk when starting
a replication job, you can set the *Skip replication* option on that disk.
As of {pve} 5.0, replication requires the disk images to be on a storage of type
`zfspool`, so adding a disk image to other storages when the VM has replication
configured requires skipping replication for this disk image.

If your storage supports _thin provisioning_ (see the storage chapter in the
{pve} guide), you can activate the *Discard* option on a drive. With *Discard*
set and a _TRIM_-enabled guest OS footnote:[TRIM, UNMAP, and discard
https://en.wikipedia.org/wiki/Trim_%28computing%29], when the VM's filesystem
marks blocks as unused after deleting files, the controller will relay this
information to the storage, which will then shrink the disk image accordingly.
For the guest to be able to issue _TRIM_ commands, you must either use a
*VirtIO SCSI* (or *VirtIO SCSI Single*) controller or set the *SSD emulation*
option on the drive. Note that *Discard* is not supported on *VirtIO Block*
drives.

If you would like a drive to be presented to the guest as a solid-state drive
rather than a rotational hard disk, you can set the *SSD emulation* option on
that drive. There is no requirement that the underlying storage actually be
backed by SSDs; this feature can be used with physical media of any type.
Note that *SSD emulation* is not supported on *VirtIO Block* drives.

.IO Thread
The option *IO Thread* can only be used when using a disk with the
*VirtIO* controller, or with the *SCSI* controller when the emulated controller
type is *VirtIO SCSI single*.
With this enabled, Qemu creates one I/O thread per storage controller,
instead of a single thread for all I/O, so it increases performance when
multiple disks are used and each disk has its own storage controller.
Note that backups do not currently work with *IO Thread* enabled.
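
As a rough sketch of how the drive options described above combine, the
following snippet assembles an option string for a single SCSI drive with
*IO Thread*, *Discard* and *SSD emulation* enabled. The VM ID `100`, the storage
`local-lvm` and the volume name are hypothetical placeholders; the `qm set`
calls are only shown as comments, so the snippet can be tried on any machine.

```shell
#!/bin/sh
# Hypothetical values, adjust them to your own setup.
vmid=100
volume="local-lvm:vm-${vmid}-disk-0"

# One I/O thread for this drive's controller, relay TRIM to the storage,
# and present the drive to the guest as an SSD.
drive_opts="${volume},iothread=1,discard=on,ssd=1"
echo "$drive_opts"

# On a PVE host, this could then be applied with something like:
#   qm set 100 --scsihw virtio-scsi-single
#   qm set 100 --scsi0 "$drive_opts"
```

Keep the backup limitation mentioned above in mind before enabling
`iothread` on a production VM.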


[[qm_cpu]]
CPU
~~~

[thumbnail="screenshot/gui-create-vm-cpu.png"]

A *CPU socket* is a physical slot on a PC motherboard where you can plug a CPU.
This CPU can then contain one or many *cores*, which are independent
processing units. Whether you have a single CPU socket with 4 cores, or two CPU
sockets with two cores is mostly irrelevant from a performance point of view.
However, some software licenses depend on the number of sockets a machine has;
in that case it makes sense to set the number of sockets to what the license
allows you.

Increasing the number of virtual cpus (cores and sockets) will usually provide a
performance improvement, though that is heavily dependent on the use of the VM.
Multithreaded applications will of course benefit from a large number of
virtual cpus, as for each virtual cpu you add, Qemu will create a new thread of
execution on the host system. If you're not sure about the workload of your VM,
it is usually a safe bet to set the number of *Total cores* to 2.

NOTE: It is perfectly safe if the _overall_ number of cores of all your VMs
is greater than the number of cores on the server (e.g., 4 VMs with 4
cores each on a machine with only 8 cores). In that case the host system will
balance the Qemu execution threads between your server cores, just as if you
were running a standard multithreaded application. However, {pve} will prevent
you from assigning more virtual CPU cores to a single VM than are physically
available, as this would only bring the performance down due to the cost of
context switches.

[[qm_cpu_resource_limits]]
Resource Limits
^^^^^^^^^^^^^^^

In addition to the number of virtual cores, you can configure how many resources
a VM can get in relation to the host CPU time and also in relation to other
VMs.

With the *cpulimit* (``Host CPU Time'') option you can limit how much CPU time
the whole VM can use on the host. It is a floating point value representing CPU
time in percent, so `1.0` is equal to `100%`, `2.5` to `250%` and so on. If a
single process fully used one single core, it would have `100%` CPU Time
usage. If a VM with four cores utilizes all its cores fully, it would
theoretically use `400%`. In reality the usage may be even a bit higher, as Qemu
can have additional threads for VM peripherals besides the vCPU core ones.

This setting can be useful if a VM should have multiple vCPUs, as it runs a few
processes in parallel, but the VM as a whole should not be able to run all
vCPUs at 100% at the same time. Using a specific example: let's say we have a VM
which would profit from having 8 vCPUs, but at no time should all of those 8 cores
run at full load, as this would make the server so overloaded that
other VMs and CTs would get too little CPU. So, we set the *cpulimit* to
`4.0` (=400%). If all cores do the same heavy work, they would each get 50% of a
real host core's CPU time. But if only 4 of them do work, they could still get
almost 100% of a real core each.
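
The arithmetic of this example can be sketched in a few lines of shell. The
helper `per_vcpu_percent` is a hypothetical name used only for illustration. It
assumes an idealized scheduler that splits the limit evenly between busy vCPUs
and ignores the additional Qemu threads mentioned above.

```shell
#!/bin/sh
# per_vcpu_percent CPULIMIT BUSY_VCPUS
# Prints the share of one host core (in percent) that each busy vCPU can
# get, assuming the limit is split evenly and one vCPU never exceeds 100%.
per_vcpu_percent() {
    awk -v limit="$1" -v busy="$2" 'BEGIN {
        share = limit * 100 / busy
        if (share > 100) share = 100
        printf "%d\n", share
    }'
}

per_vcpu_percent 4.0 8   # all 8 vCPUs busy: prints 50
per_vcpu_percent 4.0 4   # only 4 vCPUs busy: prints 100
```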

NOTE: VMs can, depending on their configuration, use additional threads, e.g.,
for networking or IO operations but also live migration. Thus a VM can show up
as using more CPU time than just its virtual CPUs could use. To ensure that a VM
never uses more CPU time than its assigned virtual CPUs, set the *cpulimit* setting
to the same value as the total core count.

The second CPU resource limiting setting, *cpuunits* (nowadays often called CPU
shares or CPU weight), controls how much CPU time a VM gets in relation to other
running VMs. It is a relative weight which defaults to `1024`. If you increase
this for a VM, it will be prioritized by the scheduler in comparison to other
VMs with lower weight. E.g., if VM 100 has the default of `1024` and VM 200 was
changed to `2048`, VM 200 would receive twice the CPU bandwidth of
VM 100.
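
To make the relative weights concrete, the following sketch computes the
fraction of CPU time each VM would get if both VMs from the example were
competing for the same cores. The function name `cpu_share_percent` is
hypothetical, and the result only applies under contention; an idle host lets
either VM use more.

```shell
#!/bin/sh
# cpu_share_percent WEIGHT TOTAL_WEIGHT
# Prints the percentage of contended CPU time a VM with WEIGHT receives
# when the weights of all competing VMs sum up to TOTAL_WEIGHT.
cpu_share_percent() {
    awk -v w="$1" -v total="$2" 'BEGIN { printf "%.1f\n", w * 100 / total }'
}

total=$((1024 + 2048))
cpu_share_percent 1024 "$total"   # VM 100: prints 33.3
cpu_share_percent 2048 "$total"   # VM 200: prints 66.7
```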

For more information see `man systemd.resource-control`. There, `CPUQuota`
corresponds to `cpulimit` and `CPUShares` corresponds to our `cpuunits`
setting; see its Notes section for references and implementation details.

CPU Type
^^^^^^^^

Qemu can emulate a number of different *CPU types* from 486 to the latest Xeon
processors. Each new processor generation adds new features, like hardware
assisted 3d rendering, random number generation, memory protection, etc.
Usually you should select for your VM a processor type which closely matches the
CPU of the host system, as it means that the host CPU features (also called _CPU
flags_) will be available in your VMs. If you want an exact match, you can set
the CPU type to *host*, in which case the VM will have exactly the same CPU flags
as your host system.

This has a downside though. If you want to do a live migration of VMs between
different hosts, your VM might end up on a new system with a different CPU type.
If the CPU flags passed to the guest are missing, the qemu process will stop. To
remedy this, Qemu also has its own CPU type *kvm64*, which {pve} uses by default.
kvm64 is a Pentium 4 lookalike CPU type, which has a reduced CPU flags set,
but is guaranteed to work everywhere.

In short, if you care about live migration and moving VMs between nodes, leave
the kvm64 default. If you don’t care about live migration or have a homogeneous
cluster where all nodes have the same CPU, set the CPU type to host, as in
theory this will give your guests maximum performance.

Meltdown / Spectre related CPU flags
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

There are several CPU flags related to the Meltdown and Spectre vulnerabilities
footnote:[Meltdown Attack https://meltdownattack.com/] which need to be set
manually unless the selected CPU type of your VM already enables them by default.

There are two requirements that need to be fulfilled in order to use these
CPU flags:

* The host CPU(s) must support the feature and propagate it to the guest's virtual CPU(s)
* The guest operating system must be updated to a version which mitigates the
 attacks and is able to utilize the CPU feature

If the selected CPU type does not already enable a flag, you need to set the
desired CPU flag of the virtual CPU, either by editing the CPU options in the
WebUI, or by setting the 'flags' property of the 'cpu' option in the VM
configuration file.
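
As a minimal sketch, the value of the 'cpu' option can be assembled from a CPU
model and a list of flags as shown below. The helper `cpu_option` and the VM ID
`100` are hypothetical; each flag is prefixed with `+` to enable it, and flags
are separated by `;`.

```shell
#!/bin/sh
# cpu_option MODEL [FLAG...]
# Joins a CPU model and a list of flags to enable into one option value.
cpu_option() {
    model="$1"; shift
    flags=""
    for f in "$@"; do
        # Prepend ';' only if there is already a flag in the list.
        flags="${flags:+$flags;}+$f"
    done
    echo "${model}${flags:+,flags=$flags}"
}

cpu_option kvm64 pcid spec-ctrl   # prints kvm64,flags=+pcid;+spec-ctrl

# On a PVE host, the result could be applied with something like:
#   qm set 100 --cpu 'kvm64,flags=+pcid;+spec-ctrl'
```

Quote the value on the command line, as the `;` separator would otherwise be
interpreted by the shell.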

For Spectre v1,v2,v4 fixes, your CPU or system vendor also needs to provide a
so-called ``microcode update'' footnote:[You can use `intel-microcode' /
`amd64-microcode' from Debian non-free if your vendor does not provide such an
update. Note that not all affected CPUs can be updated to support spec-ctrl.]
for your CPU.


To check if the {pve} host is vulnerable, execute the following command as root:

----
for f in /sys/devices/system/cpu/vulnerabilities/*; do echo "${f##*/} -" $(cat "$f"); done
----

A community script is also available to detect if the host is still vulnerable.
footnote:[spectre-meltdown-checker https://meltdown.ovh/]

Intel processors
^^^^^^^^^^^^^^^^

* 'pcid'
+
This reduces the performance impact of the Meltdown (CVE-2017-5754) mitigation
called 'Kernel Page-Table Isolation (KPTI)', which effectively hides
the Kernel memory from the user space. Without PCID, KPTI is quite an expensive
mechanism footnote:[PCID is now a critical performance/security feature on x86
https://groups.google.com/forum/m/#!topic/mechanical-sympathy/L9mHTbeQLNU].
+
To check if the {pve} host supports PCID, execute the following command as root:
+
----
# grep ' pcid ' /proc/cpuinfo
----
+
If the output is not empty, your host's CPU has support for 'pcid'.

* 'spec-ctrl'
+
Required to enable the Spectre v1 (CVE-2017-5753) and Spectre v2 (CVE-2017-5715) fix,
in cases where retpolines are not sufficient.
Included by default in Intel CPU models with -IBRS suffix.
Must be explicitly turned on for Intel CPU models without -IBRS suffix.
Requires an updated host CPU microcode (intel-microcode >= 20180425).

* 'ssbd'
+
Required to enable the Spectre v4 (CVE-2018-3639) fix. Not included by default in any Intel CPU model.
Must be explicitly turned on for all Intel CPU models.
Requires an updated host CPU microcode (intel-microcode >= 20180703).


AMD processors
^^^^^^^^^^^^^^

* 'ibpb'
+
Required to enable the Spectre v1 (CVE-2017-5753) and Spectre v2 (CVE-2017-5715) fix,
in cases where retpolines are not sufficient.
Included by default in AMD CPU models with -IBPB suffix.
Must be explicitly turned on for AMD CPU models without -IBPB suffix.
Requires the host CPU microcode to support this feature before it can be used for guest CPUs.

* 'virt-ssbd'
+
Required to enable the Spectre v4 (CVE-2018-3639) fix.
Not included by default in any AMD CPU model.
Must be explicitly turned on for all AMD CPU models.
This should be provided to guests, even if amd-ssbd is also provided, for maximum guest compatibility.
Note that this must be explicitly enabled when using the "host" cpu model,
because this is a virtual feature which does not exist in the physical CPUs.

* 'amd-ssbd'
+
Required to enable the Spectre v4 (CVE-2018-3639) fix.
Not included by default in any AMD CPU model. Must be explicitly turned on for all AMD CPU models.
This provides higher performance than virt-ssbd, so a host supporting it should always expose it to guests if possible.
virt-ssbd should nonetheless also be exposed for maximum guest compatibility, as some kernels only know about virt-ssbd.

* 'amd-no-ssb'
+
Recommended to indicate the host is not vulnerable to Spectre v4 (CVE-2018-3639).
Not included by default in any AMD CPU model.
Future hardware generations of CPU will not be vulnerable to CVE-2018-3639,
and thus the guest should be told not to enable its mitigations, by exposing amd-no-ssb.
This is mutually exclusive with virt-ssbd and amd-ssbd.


NUMA
^^^^
You can also optionally emulate a *NUMA*
footnote:[https://en.wikipedia.org/wiki/Non-uniform_memory_access] architecture
in your VMs. The basics of the NUMA architecture mean that instead of having a
global memory pool available to all your cores, the memory is spread into local
banks close to each socket.
This can bring speed improvements as the memory bus is not a bottleneck
anymore. If your system has a NUMA architecture footnote:[if the command
`numactl --hardware | grep available` returns more than one node, then your host
system has a NUMA architecture] we recommend activating the option, as this
will allow proper distribution of the VM resources on the host system.
This option is also required to hot-plug cores or RAM in a VM.

If the NUMA option is used, it is recommended to set the number of sockets to
the number of nodes of the host system.
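
If `numactl` is not installed, the number of host nodes can also be estimated
from sysfs. The sketch below counts the `nodeN` directories under
`/sys/devices/system/node`; the function name `count_numa_nodes` is
hypothetical, and the optional directory argument exists only to make the
function easy to try against a test directory.

```shell
#!/bin/sh
# count_numa_nodes [NODE_DIR]
# Counts the nodeN directories below NODE_DIR (default: the Linux sysfs path).
count_numa_nodes() {
    dir="${1:-/sys/devices/system/node}"
    count=0
    for n in "$dir"/node[0-9]*; do
        [ -d "$n" ] && count=$((count + 1))
    done
    echo "$count"
}

nodes=$(count_numa_nodes)
if [ "$nodes" -gt 1 ]; then
    echo "NUMA host with $nodes nodes: consider enabling NUMA with $nodes sockets"
else
    echo "no NUMA topology detected"
fi
```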

vCPU hot-plug
^^^^^^^^^^^^^

Modern operating systems introduced the capability to hot-plug and, to a
certain extent, hot-unplug CPUs in a running system. Virtualisation allows us
to avoid a lot of the (physical) problems real hardware can cause in such
scenarios.
Still, this is a rather new and complicated feature, so its use should be
restricted to cases where it's absolutely needed. Most of the functionality can
be replicated with other, well tested and less complicated, features, see
xref:qm_cpu_resource_limits[Resource Limits].

In {pve} the maximal number of plugged CPUs is always `cores * sockets`.
To start a VM with fewer than this total core count of CPUs, you may use the
*vcpus* setting, which denotes how many vCPUs should be plugged in at VM start.
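
The relation between these settings can be sketched as follows. The values and
the VM ID `100` are hypothetical; the `qm set` call is only shown as a comment.

```shell
#!/bin/sh
# Hypothetical example values.
sockets=2
cores=4
vcpus=2   # vCPUs plugged in at VM start

# The upper bound for plugged vCPUs is always cores * sockets.
max_vcpus=$((cores * sockets))
if [ "$vcpus" -gt "$max_vcpus" ]; then
    echo "vcpus ($vcpus) exceeds cores * sockets ($max_vcpus)" >&2
    exit 1
fi
echo "start with $vcpus of $max_vcpus vCPUs, $((max_vcpus - vcpus)) left to hot-plug"

# On a PVE host, this could be configured with something like:
#   qm set 100 --sockets 2 --cores 4 --vcpus 2
```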

Currently this feature is only supported on Linux. A kernel newer than 3.10
is needed; a kernel newer than 4.7 is recommended.

You can use a udev rule as follows to automatically set new CPUs as online in
the guest:

----
SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}="1"
----

Save this under /etc/udev/rules.d/ as a file ending in `.rules`.

NOTE: CPU hot-remove is machine dependent and requires guest cooperation.
The deletion command does not guarantee that CPU removal actually happens;
typically it is a request forwarded to the guest using a target dependent mechanism,
e.g., ACPI on x86/amd64.


[[qm_memory]]
Memory
~~~~~~

For each VM you have the option to set a fixed size memory or to ask
{pve} to dynamically allocate memory based on the current RAM usage of the
host.

.Fixed Memory Allocation
[thumbnail="screenshot/gui-create-vm-memory.png"]

When setting memory and minimum memory to the same amount
{pve} will simply allocate what you specify to your VM.

Even when using a fixed memory size, the ballooning device gets added to the
VM, because it delivers useful information such as how much memory the guest
really uses.
In general, you should leave *ballooning* enabled, but if you want to disable
it (e.g. for debugging purposes), simply uncheck
*Ballooning Device* or set

 balloon: 0

in the configuration.

.Automatic Memory Allocation

// see autoballoon() in pvestatd.pm
When setting the minimum memory lower than memory, {pve} will make sure that the
minimum amount you specified is always available to the VM, and if RAM usage on
the host is below 80%, will dynamically add memory to the guest up to the
maximum memory specified.

When the host is running low on RAM, the VM will then release some memory
back to the host, swapping running processes if needed and starting the oom
killer as a last resort. The passing around of memory between host and guest is
done via a special `balloon` kernel driver running inside the guest, which will
grab or release memory pages from the host.
footnote:[A good explanation of the inner workings of the balloon driver can be found here https://rwmj.wordpress.com/2010/07/17/virtio-balloon/]

When multiple VMs use the autoallocate facility, it is possible to set a
*Shares* coefficient which indicates the relative amount of the free host memory
that each VM should take. Suppose for instance you have four VMs, three of them
running an HTTP server and the last one running a database server. To cache more
database blocks in the database server RAM, you would like to prioritize the
database VM when spare RAM is available. For this you assign a Shares property
of 3000 to the database VM, leaving the other VMs to the Shares default setting
of 1000. The host server has 32GB of RAM, and is currently using 16GB, leaving 32
* 80/100 - 16 = 9.6GB RAM to be allocated to the VMs. The database VM will get 9.6 *
3000 / (3000 + 1000 + 1000 + 1000) = 4.8 GB extra RAM and each HTTP server will
get 1.6 GB.
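
The calculation can be reproduced with a small shell function. This is only a
sketch of the arithmetic described above, not of the actual implementation;
the name `extra_ram_gb` is hypothetical. With 32 GB of host RAM and 16 GB in
use, the 80% threshold leaves 32 * 80/100 - 16 = 9.6 GB to distribute.

```shell
#!/bin/sh
# extra_ram_gb HOST_TOTAL_GB HOST_USED_GB VM_SHARES TOTAL_SHARES
# Prints the extra RAM (in GB) a VM would receive from the spare host memory
# below the 80% usage threshold, proportional to its Shares value.
extra_ram_gb() {
    awk -v total="$1" -v used="$2" -v s="$3" -v sum="$4" 'BEGIN {
        spare = total * 80 / 100 - used
        if (spare < 0) spare = 0
        printf "%.1f\n", spare * s / sum
    }'
}

sum=$((3000 + 1000 + 1000 + 1000))
extra_ram_gb 32 16 3000 "$sum"   # database VM: prints 4.8
extra_ram_gb 32 16 1000 "$sum"   # each HTTP VM: prints 1.6
```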

All Linux distributions released after 2010 have the balloon kernel driver
included. For Windows OSes, the balloon driver needs to be added manually and can
incur a slowdown of the guest, so we don't recommend using it on critical
systems.
// see https://forum.proxmox.com/threads/solved-hyper-threading-vs-no-hyper-threading-fixed-vs-variable-memory.20265/

When allocating RAM to your VMs, a good rule of thumb is always to leave 1GB
of RAM available to the host.


[[qm_network_device]]
Network Device
~~~~~~~~~~~~~~

[thumbnail="screenshot/gui-create-vm-network.png"]

Each VM can have many _Network interface controllers_ (NIC), of four different
types:

 * *Intel E1000* is the default, and emulates an Intel Gigabit network card.
 * the *VirtIO* paravirtualized NIC should be used if you aim for maximum
performance. Like all VirtIO devices, the guest OS should have the proper driver
installed.
 * the *Realtek 8139* emulates an older 100 Mbit/s network card, and should
only be used when emulating older operating systems (released before 2002)
 * the *vmxnet3* is another paravirtualized device, which should only be used
when importing a VM from another hypervisor.

{pve} will generate for each NIC a random *MAC address*, so that your VM is
addressable on Ethernet networks.

The NIC you added to the VM can follow one of two different models:

 * in the default *Bridged mode* each virtual NIC is backed on the host by a
_tap device_ (a software loopback device simulating an Ethernet NIC). This
tap device is added to a bridge, by default vmbr0 in {pve}. In this mode, VMs
have direct access to the Ethernet LAN on which the host is located.
 * in the alternative *NAT mode*, each virtual NIC will only communicate with
the Qemu user networking stack, where a built-in router and DHCP server can
provide network access. This built-in DHCP will serve addresses in the private
10.0.2.0/24 range. The NAT mode is much slower than the bridged mode, and
should only be used for testing. This mode is only available via CLI or the API,
but not via the WebUI.

You can also skip adding a network device when creating a VM by selecting *No
network device*.

.Multiqueue
If you are using the VirtIO driver, you can optionally activate the
*Multiqueue* option. This option allows the guest OS to process networking
packets using multiple virtual CPUs, providing an increase in the total number
of packets transferred.

//http://blog.vmsplice.net/2011/09/qemu-internals-vhost-architecture.html
When using the VirtIO driver with {pve}, each NIC network queue is passed to the
host kernel, where the queue will be processed by a kernel thread spawned by the
vhost driver. With this option activated, it is possible to pass _multiple_
network queues to the host kernel for each NIC.

//https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Tuning_and_Optimization_Guide/sect-Virtualization_Tuning_Optimization_Guide-Networking-Techniques.html#sect-Virtualization_Tuning_Optimization_Guide-Networking-Multi-queue_virtio-net
When using Multiqueue, it is recommended to set it to a value equal
to the number of Total Cores of your guest. You also need to set in
the VM the number of multi-purpose channels on each VirtIO NIC with the ethtool
command:

`ethtool -L ens1 combined X`

where X is the number of vcpus of the VM.
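
As a rough sketch, the matching host and guest commands can be derived from the
VM's core count as shown below. The VM ID `100`, the bridge `vmbr0` and the
guest interface name `ens1` are assumptions taken from the examples in this
chapter. The commands are only printed, since setting `net0` this way replaces
the existing NIC definition.

```shell
#!/bin/sh
# Hypothetical example values.
sockets=1
cores=4

# Recommended queue count: the number of Total Cores of the guest.
queues=$((sockets * cores))

echo "host:  qm set 100 --net0 virtio,bridge=vmbr0,queues=$queues"
echo "guest: ethtool -L ens1 combined $queues"
```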

You should note that setting the Multiqueue parameter to a value greater
than one will increase the CPU load on the host and guest systems as the
traffic increases. We recommend setting this option only when the VM has to
process a great number of incoming connections, such as when the VM is running
as a router, reverse proxy or a busy HTTP server doing long polling.

[[qm_display]]
Display
~~~~~~~

QEMU can virtualize a few types of VGA hardware. Some examples are:

* *std*, the default, emulates a card with Bochs VBE extensions.
* *cirrus*, this was once the default. It emulates a very old hardware module
with all its problems. This display type should only be used if really
necessary footnote:[https://www.kraxel.org/blog/2014/10/qemu-using-cirrus-considered-harmful/
qemu: using cirrus considered harmful], e.g., if using Windows XP or earlier.
* *vmware*, is a VMware SVGA-II compatible adapter.
* *qxl*, is the QXL paravirtualized graphics card. Selecting this also
enables SPICE for the VM.

You can edit the amount of memory given to the virtual GPU, by setting
the 'memory' option. This can enable higher resolutions inside the VM,
especially with SPICE/QXL.

As the memory is reserved by the display device, selecting Multi-Monitor mode
for SPICE (e.g., `qxl2` for dual monitors) has some implications:

* Windows needs a device for each monitor, so if your 'ostype' is some
version of Windows, {pve} gives the VM an extra device per monitor.
Each device gets the specified amount of memory.

* Linux VMs can always enable more virtual monitors, but selecting
a Multi-Monitor mode multiplies the memory given to the device by
the number of monitors.

Selecting `serialX` as display 'type' disables the VGA output, and redirects
the Web Console to the selected serial port. A configured display 'memory'
setting will be ignored in that case.

637[[qm_usb_passthrough]]
638USB Passthrough
639~~~~~~~~~~~~~~~
640
641There are two different types of USB passthrough devices:
642
643* Host USB passthrough
644* SPICE USB passthrough
645
646Host USB passthrough works by giving a VM a USB device of the host.
647This can either be done via the vendor- and product-id, or
648via the host bus and port.
649
650The vendor/product-id looks like this: *0123:abcd*,
651where *0123* is the id of the vendor, and *abcd* is the id
of the product, meaning two devices of the same model
have the same id.
654
655The bus/port looks like this: *1-2.3.4*, where *1* is the bus
656and *2.3.4* is the port path. This represents the physical
ports of your host (depending on the internal order of the
USB controllers).
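
To make the two notations concrete, here is a minimal sketch that tells them
apart. The helper `classify_usb` and the regular expressions are
illustrative assumptions based only on the examples above (*0123:abcd* and
*1-2.3.4*). They are not the validation {pve} itself performs.

```python
import re

# Four hex digits, a colon, four hex digits, e.g. 0123:abcd
VENDOR_PRODUCT = re.compile(r"^([0-9a-fA-F]{4}):([0-9a-fA-F]{4})$")
# Bus number, a dash, then a dot-separated port path, e.g. 1-2.3.4
BUS_PORT = re.compile(r"^(\d+)-(\d+(?:\.\d+)*)$")


def classify_usb(spec: str) -> dict:
    """Classify a host USB specification as vendor/product id or bus/port."""
    match = VENDOR_PRODUCT.match(spec)
    if match:
        return {
            "kind": "vendor-product",
            "vendor": match.group(1).lower(),
            "product": match.group(2).lower(),
        }
    match = BUS_PORT.match(spec)
    if match:
        return {
            "kind": "bus-port",
            "bus": int(match.group(1)),
            "port_path": [int(p) for p in match.group(2).split(".")],
        }
    raise ValueError(f"unrecognized USB specification: {spec!r}")


print(classify_usb("0123:abcd"))
print(classify_usb("1-2.3.4"))
```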
659
660If a device is present in a VM configuration when the VM starts up,
661but the device is not present in the host, the VM can boot without problems.
662As soon as the device/port is available in the host, it gets passed through.
663
664WARNING: Using this kind of USB passthrough means that you cannot move
665a VM online to another host, since the hardware is only available
on the host where the VM currently resides.
667
668The second type of passthrough is SPICE USB passthrough. This is useful
669if you use a SPICE client which supports it. If you add a SPICE USB port
to your VM, you can pass through a USB device from where your SPICE client is,
671directly to the VM (for example an input device or hardware dongle).
672
673
674[[qm_bios_and_uefi]]
675BIOS and UEFI
676~~~~~~~~~~~~~
677
In order to properly emulate a computer, QEMU needs to use firmware.
679By default QEMU uses *SeaBIOS* for this, which is an open-source, x86 BIOS
680implementation. SeaBIOS is a good choice for most standard setups.
681
682There are, however, some scenarios in which a BIOS is not a good firmware
683to boot from, e.g. if you want to do VGA passthrough. footnote:[Alex Williamson has a very good blog entry about this.
684http://vfio.blogspot.co.at/2014/08/primary-graphics-assignment-without-vga.html]
685In such cases, you should rather use *OVMF*, which is an open-source UEFI implementation. footnote:[See the OVMF Project http://www.tianocore.org/ovmf/]
686
687If you want to use OVMF, there are several things to consider:
688
689In order to save things like the *boot order*, there needs to be an EFI Disk.
690This disk will be included in backups and snapshots, and there can only be one.
691
692You can create such a disk with the following command:
693
694 qm set <vmid> -efidisk0 <storage>:1,format=<format>
695
696Where *<storage>* is the storage where you want to have the disk, and
697*<format>* is a format which the storage supports. Alternatively, you can
698create such a disk through the web interface with 'Add' -> 'EFI Disk' in the
699hardware section of a VM.
700
701When using OVMF with a virtual display (without VGA passthrough),
you need to set the client resolution in the OVMF menu (which you can reach
703with a press of the ESC button during boot), or you have to choose
704SPICE as the display type.
705
706[[qm_ivshmem]]
707Inter-VM shared memory
708~~~~~~~~~~~~~~~~~~~~~~
709
You can add an Inter-VM shared memory device (`ivshmem`) to be able to
711share memory between the host and a guest, or between multiple guests.
712
713To add such a device, you can use `qm`:
714
715 qm set <vmid> -ivshmem size=32,name=foo
716
717Where the size is in MiB. The file will be located under
718`/dev/shm/pve-shm-$name` (the default name is the vmid).
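
On the host side, the backing file can be mapped like any other file. The
following is a minimal sketch, assuming the path scheme described above. The
helpers `shm_path` and `map_shared` are hypothetical names, and the layout
of the data inside the region is entirely up to the applications sharing it.

```python
import mmap
import os


def shm_path(name: str) -> str:
    """Build the backing file path for an ivshmem device with the given name."""
    return f"/dev/shm/pve-shm-{name}"


def map_shared(path: str, size_mib: int) -> mmap.mmap:
    """Map the shared memory file read/write; the size must not exceed the file size."""
    fd = os.open(path, os.O_RDWR)
    try:
        # The mapping stays valid after the descriptor is closed.
        return mmap.mmap(fd, size_mib * 1024 * 1024)
    finally:
        os.close(fd)


print(shm_path("foo"))
```

With the `qm set` example above, `map_shared(shm_path("foo"), 32)` would map
the whole 32 MiB region once the VM is running.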
719
A use case for such a device is Looking Glass
721footnote:[Looking Glass: https://looking-glass.hostfission.com/]
722which enables high performance, low-latency display mirroring between
723host and guest.
724
725[[qm_startup_and_shutdown]]
726Automatic Start and Shutdown of Virtual Machines
727~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
728
729After creating your VMs, you probably want them to start automatically
730when the host system boots. For this you need to select the option 'Start at
731boot' from the 'Options' Tab of your VM in the web interface, or set it with
732the following command:
733
734 qm set <vmid> -onboot 1
735
736.Start and Shutdown Order
737
738[thumbnail="screenshot/gui-qemu-edit-start-order.png"]
739
In some cases you want to be able to fine tune the boot order of your
VMs, for instance if one of your VMs is providing firewalling or DHCP
742to other guest systems. For this you can use the following
743parameters:
744
745* *Start/Shutdown order*: Defines the start order priority. E.g. set it to 1 if
746you want the VM to be the first to be started. (We use the reverse startup
747order for shutdown, so a machine with a start order of 1 would be the last to
748be shut down). If multiple VMs have the same order defined on a host, they will
749additionally be ordered by 'VMID' in ascending order.
* *Startup delay*: Defines the interval between this VM's start and subsequent
VM starts. E.g. set it to 240 if you want to wait 240 seconds before starting
other VMs.
753* *Shutdown timeout*: Defines the duration in seconds {pve} should wait
754for the VM to be offline after issuing a shutdown command.
755By default this value is set to 180, which means that {pve} will issue a
756shutdown request and wait 180 seconds for the machine to be offline. If
757the machine is still online after the timeout it will be stopped forcefully.
758
759NOTE: VMs managed by the HA stack do not follow the 'start on boot' and
760'boot order' options currently. Those VMs will be skipped by the startup and
761shutdown algorithm as the HA manager itself ensures that VMs get started and
762stopped.
763
764Please note that machines without a Start/Shutdown order parameter will always
765start after those where the parameter is set. Further, this parameter can only
766be enforced between virtual machines running on the same host, not
767cluster-wide.
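
The ordering rules can be summarized in a short sketch. It is an
illustration, not the actual startup code. The `Guest` type and both
functions are hypothetical, and ordering guests without an order value by
VMID among themselves is an assumption that the text above does not state.

```python
from typing import List, NamedTuple, Optional


class Guest(NamedTuple):
    vmid: int
    order: Optional[int] = None  # Start/Shutdown order; None if unset


def startup_sequence(guests: List[Guest]) -> List[Guest]:
    """Guests with an order start first, sorted by (order, VMID); unset ones follow."""
    return sorted(
        guests,
        key=lambda g: (g.order is None, g.order if g.order is not None else 0, g.vmid),
    )


def shutdown_sequence(guests: List[Guest]) -> List[Guest]:
    """Shutdown uses the reverse of the startup order."""
    return list(reversed(startup_sequence(guests)))


guests = [Guest(105), Guest(101, order=2), Guest(103, order=1), Guest(102, order=1)]
print([g.vmid for g in startup_sequence(guests)])
print([g.vmid for g in shutdown_sequence(guests)])
```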
768
769
770[[qm_migration]]
771Migration
772---------
773
774[thumbnail="screenshot/gui-qemu-migrate.png"]
775
776If you have a cluster, you can migrate your VM to another host with
777
778 qm migrate <vmid> <target>
779
There are generally two mechanisms for this:
781
782* Online Migration (aka Live Migration)
783* Offline Migration
784
785Online Migration
786~~~~~~~~~~~~~~~~
787
788When your VM is running and it has no local resources defined (such as disks
789on local storage, passed through devices, etc.) you can initiate a live
migration with the `-online` flag.
791
792How it works
793^^^^^^^^^^^^
794
795This starts a Qemu Process on the target host with the 'incoming' flag, which
796means that the process starts and waits for the memory data and device states
797from the source Virtual Machine (since all other resources, e.g. disks,
798are shared, the memory content and device state are the only things left
799to transmit).
800
801Once this connection is established, the source begins to send the memory
802content asynchronously to the target. If the memory on the source changes,
803those sections are marked dirty and there will be another pass of sending data.
804This happens until the amount of data to send is so small that it can
805pause the VM on the source, send the remaining data to the target and start
806the VM on the target in under a second.
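
The iterative copy described above can be modeled with a toy calculation.
This sketch is not how Qemu is implemented. It assumes a constant dirty rate
and a constant bandwidth, and the function name, the parameters and the
one-second default are chosen for illustration only.

```python
from typing import List


def precopy_rounds(memory_mib: float, dirty_rate_mib_s: float,
                   bandwidth_mib_s: float, max_downtime_s: float = 1.0,
                   max_rounds: int = 50) -> List[float]:
    """Return the amount of data (MiB) sent in each pass.

    The last entry is the final pass, sent while the VM is paused. If
    max_rounds is reached first, the final pass may exceed max_downtime_s.
    """
    if dirty_rate_mib_s >= bandwidth_mib_s:
        raise ValueError("memory is dirtied faster than it can be sent")
    sent = []
    remaining = float(memory_mib)
    for _ in range(max_rounds):
        # Stop iterating once the remainder fits into the allowed downtime.
        if remaining / bandwidth_mib_s <= max_downtime_s:
            break
        sent.append(remaining)
        transfer_time = remaining / bandwidth_mib_s
        # Memory dirtied while this pass was in flight has to be sent again.
        remaining = dirty_rate_mib_s * transfer_time
    sent.append(remaining)
    return sent


print(precopy_rounds(memory_mib=4096, dirty_rate_mib_s=100, bandwidth_mib_s=1000))
```

The model also shows why a reliable network matters: if the guest dirties
memory as fast as the link can carry it, the loop never converges.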
807
808Requirements
809^^^^^^^^^^^^
810
811For Live Migration to work, there are some things required:
812
813* The VM has no local resources (e.g. passed through devices, local disks, etc.)
814* The hosts are in the same {pve} cluster.
815* The hosts have a working (and reliable) network connection.
816* The target host must have the same or higher versions of the
817 {pve} packages. (It *might* work the other way, but this is never guaranteed)
818
819Offline Migration
820~~~~~~~~~~~~~~~~~
821
If you have local resources, you can still migrate your VMs offline,
as long as all disks are on storages that are defined on both hosts.
The migration then copies the disks over the network to the target host.
825
826[[qm_copy_and_clone]]
827Copies and Clones
828-----------------
829
830[thumbnail="screenshot/gui-qemu-full-clone.png"]
831
VM installation is usually done using installation media (CD-ROM)
from the operating system vendor. Depending on the OS, this can be a
time consuming task one might want to avoid.
835
836An easy way to deploy many VMs of the same type is to copy an existing
837VM. We use the term 'clone' for such copies, and distinguish between
838'linked' and 'full' clones.
839
840Full Clone::
841
The result of such a copy is an independent VM. The
843new VM does not share any storage resources with the original.
844+
845
846It is possible to select a *Target Storage*, so one can use this to
847migrate a VM to a totally different storage. You can also change the
848disk image *Format* if the storage driver supports several formats.
849+
850
851NOTE: A full clone needs to read and copy all VM image data. This is
852usually much slower than creating a linked clone.
853+
854
Some storage types allow copying a specific *Snapshot*, which
856defaults to the 'current' VM data. This also means that the final copy
857never includes any additional snapshots from the original VM.
858
859
860Linked Clone::
861
862Modern storage drivers support a way to generate fast linked
863clones. Such a clone is a writable copy whose initial contents are the
864same as the original data. Creating a linked clone is nearly
865instantaneous, and initially consumes no additional space.
866+
867
868They are called 'linked' because the new image still refers to the
original. Unmodified data blocks are read from the original image, but
modifications are written to (and afterwards read from) a new
871location. This technique is called 'Copy-on-write'.
872+
873
874This requires that the original volume is read-only. With {pve} one
can convert any VM into a read-only <<qm_templates, Template>>. Such
876templates can later be used to create linked clones efficiently.
877+
878
879NOTE: You cannot delete an original template while linked clones
880exist.
881+
882
883It is not possible to change the *Target storage* for linked clones,
884because this is a storage internal feature.
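
The copy-on-write behavior of a linked clone can be shown with a toy model.
Real storage drivers implement this at the block or file system level. The
classes `BaseImage` and `LinkedClone` below are purely illustrative.

```python
class BaseImage:
    """Read-only template image, modeled as a fixed sequence of blocks."""

    def __init__(self, blocks):
        self._blocks = tuple(blocks)

    def read(self, index: int) -> bytes:
        return self._blocks[index]


class LinkedClone:
    """Writable view on a base image that stores only modified blocks."""

    def __init__(self, base: BaseImage):
        self.base = base
        self.overlay = {}  # block index -> modified data

    def read(self, index: int) -> bytes:
        # Modified blocks come from the overlay, all others from the original.
        if index in self.overlay:
            return self.overlay[index]
        return self.base.read(index)

    def write(self, index: int, data: bytes) -> None:
        # Writes never touch the base image.
        self.overlay[index] = data


template = BaseImage([b"boot", b"root", b"data"])
clone = LinkedClone(template)
clone.write(2, b"changed")
print(clone.read(0), clone.read(2), template.read(2))
```

A fresh clone has an empty overlay, which is why it is created almost
instantly and initially needs no additional space.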
885
886
887The *Target node* option allows you to create the new VM on a
888different node. The only restriction is that the VM is on shared
889storage, and that storage is also available on the target node.
890
891To avoid resource conflicts, all network interface MAC addresses get
892randomized, and we generate a new 'UUID' for the VM BIOS (smbios1)
893setting.
894
895
896[[qm_templates]]
897Virtual Machine Templates
898-------------------------
899
900One can convert a VM into a Template. Such templates are read-only,
901and you can use them to create linked clones.
902
903NOTE: It is not possible to start templates, because this would modify
904the disk images. If you want to change the template, create a linked
905clone and modify that.
906
907VM Generation ID
908----------------
909
910{pve} supports Virtual Machine Generation ID ('vmgenid') footnote:[Official
911'vmgenid' Specification
912https://docs.microsoft.com/en-us/windows/desktop/hyperv_v2/virtual-machine-generation-identifier]
913for virtual machines.
914This can be used by the guest operating system to detect any event resulting
915in a time shift event, for example, restoring a backup or a snapshot rollback.
916
917When creating new VMs, a 'vmgenid' will be automatically generated and saved
918in its configuration file.
919
920To create and add a 'vmgenid' to an already existing VM one can pass the
921special value `1' to let {pve} autogenerate one or manually set the 'UUID'
922footnote:[Online GUID generator http://guid.one/] by using it as value,
923e.g.:
924
925----
926 qm set VMID -vmgenid 1
927 qm set VMID -vmgenid 00000000-0000-0000-0000-000000000000
928----
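
If you prefer to generate the 'UUID' locally, a random one can be produced
with a few lines of Python. This is only a sketch: the helper
`vmgenid_command` is a hypothetical name, and it merely builds the argument
list for the `qm set` call shown above without executing it.

```python
import uuid
from typing import List


def vmgenid_command(vmid: int, value: str = "") -> List[str]:
    """Build the `qm set` arguments; an empty value yields a random UUID."""
    if value:
        # Validate and normalize a user supplied UUID.
        value = str(uuid.UUID(value))
    else:
        value = str(uuid.uuid4())
    return ["qm", "set", str(vmid), "-vmgenid", value]


print(" ".join(vmgenid_command(100)))
```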
929
NOTE: The initial addition of a 'vmgenid' device to an existing VM may have
the same effects as a snapshot rollback, backup restore, etc., because
the VM can interpret this as a generation change.
933
934In the rare case the 'vmgenid' mechanism is not wanted one can pass `0' for
935its value on VM creation, or retroactively delete the property in the
936configuration with:
937
938----
939 qm set VMID -delete vmgenid
940----
941
The most prominent use case for 'vmgenid' is newer Microsoft Windows
operating systems, which use it to avoid problems in time sensitive or
replicated services (e.g., databases, domain controller
945footnote:[https://docs.microsoft.com/en-us/windows-server/identity/ad-ds/get-started/virtual-dc/virtualized-domain-controller-architecture])
946on snapshot rollback, backup restore or a whole VM clone operation.
947
948Importing Virtual Machines and disk images
949------------------------------------------
950
A VM export from a foreign hypervisor usually takes the form of one or more disk
952 images, with a configuration file describing the settings of the VM (RAM,
953 number of cores). +
954The disk images can be in the vmdk format, if the disks come from
955VMware or VirtualBox, or qcow2 if the disks come from a KVM hypervisor.
956The most popular configuration format for VM exports is the OVF standard, but in
957practice interoperation is limited because many settings are not implemented in
958the standard itself, and hypervisors export the supplementary information
959in non-standard extensions.
960
961Besides the problem of format, importing disk images from other hypervisors
962may fail if the emulated hardware changes too much from one hypervisor to
another. Windows VMs are particularly affected by this, as the OS is very
964picky about any changes of hardware. This problem may be solved by
965installing the MergeIDE.zip utility available from the Internet before exporting
966and choosing a hard disk type of *IDE* before booting the imported Windows VM.
967
968Finally there is the question of paravirtualized drivers, which improve the
969speed of the emulated system and are specific to the hypervisor.
970GNU/Linux and other free Unix OSes have all the necessary drivers installed by
971default and you can switch to the paravirtualized drivers right after importing
972the VM. For Windows VMs, you need to install the Windows paravirtualized
973drivers by yourself.
974
975GNU/Linux and other free Unix can usually be imported without hassle. Note
976that we cannot guarantee a successful import/export of Windows VMs in all
977cases due to the problems above.
978
979Step-by-step example of a Windows OVF import
980~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
981
982Microsoft provides
983https://developer.microsoft.com/en-us/windows/downloads/virtual-machines/[Virtual Machines downloads]
to get started with Windows development. We are going to use one of these
985to demonstrate the OVF import feature.
986
987Download the Virtual Machine zip
988^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
989
990After getting informed about the user agreement, choose the _Windows 10
991Enterprise (Evaluation - Build)_ for the VMware platform, and download the zip.
992
993Extract the disk image from the zip
994^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
995
996Using the `unzip` utility or any archiver of your choice, unpack the zip,
997and copy via ssh/scp the ovf and vmdk files to your {pve} host.
998
999Import the Virtual Machine
1000^^^^^^^^^^^^^^^^^^^^^^^^^^
1001
1002This will create a new virtual machine, using cores, memory and
1003VM name as read from the OVF manifest, and import the disks to the +local-lvm+
1004 storage. You have to configure the network manually.
1005
1006 qm importovf 999 WinDev1709Eval.ovf local-lvm
1007
1008The VM is ready to be started.
1009
1010Adding an external disk image to a Virtual Machine
1011~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1012
1013You can also add an existing disk image to a VM, either coming from a
1014foreign hypervisor, or one that you created yourself.
1015
1016Suppose you created a Debian/Ubuntu disk image with the 'vmdebootstrap' tool:
1017
1018 vmdebootstrap --verbose \
1019 --size 10GiB --serial-console \
1020 --grub --no-extlinux \
1021 --package openssh-server \
1022 --package avahi-daemon \
1023 --package qemu-guest-agent \
1024 --hostname vm600 --enable-dhcp \
1025 --customize=./copy_pub_ssh.sh \
1026 --sparse --image vm600.raw
1027
1028You can now create a new target VM for this image.
1029
1030 qm create 600 --net0 virtio,bridge=vmbr0 --name vm600 --serial0 socket \
1031 --bootdisk scsi0 --scsihw virtio-scsi-pci --ostype l26
1032
1033Add the disk image as +unused0+ to the VM, using the storage +pvedir+:
1034
1035 qm importdisk 600 vm600.raw pvedir
1036
1037Finally attach the unused disk to the SCSI controller of the VM:
1038
1039 qm set 600 --scsi0 pvedir:600/vm-600-disk-1.raw
1040
1041The VM is ready to be started.
1042
1043
1044ifndef::wiki[]
1045include::qm-cloud-init.adoc[]
1046endif::wiki[]
1047
1048ifndef::wiki[]
1049include::qm-pci-passthrough.adoc[]
1050endif::wiki[]
1051
1052Hookscripts
1053~~~~~~~~~~~
1054
1055You can add a hook script to VMs with the config property `hookscript`.
1056
1057 qm set 100 -hookscript local:snippets/hookscript.pl
1058
It will be called during various phases of the guest's lifetime.
1060For an example and documentation see the example script under
1061`/usr/share/pve-docs/examples/guest-example-hookscript.pl`.
1062
1063Managing Virtual Machines with `qm`
1064------------------------------------
1065
qm is the tool to manage Qemu/KVM virtual machines on {pve}. You can
1067create and destroy virtual machines, and control execution
1068(start/stop/suspend/resume). Besides that, you can use qm to set
1069parameters in the associated config file. It is also possible to
1070create and delete virtual disks.
1071
1072CLI Usage Examples
1073~~~~~~~~~~~~~~~~~~
1074
1075Using an iso file uploaded on the 'local' storage, create a VM
with a 4 GB IDE disk on the 'local-lvm' storage:
1077
1078 qm create 300 -ide0 local-lvm:4 -net0 e1000 -cdrom local:iso/proxmox-mailgateway_2.1.iso
1079
Start the new VM:
1081
1082 qm start 300
1083
1084Send a shutdown request, then wait until the VM is stopped.
1085
1086 qm shutdown 300 && qm wait 300
1087
1088Same as above, but only wait for 40 seconds.
1089
1090 qm shutdown 300 && qm wait 300 -timeout 40
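
When driving `qm` from a script, the shutdown-and-wait sequence can be
wrapped as follows. This is a sketch under the assumption that `qm` is
available on the host where the script runs. The function names are
hypothetical, and only the `qm` invocations shown above are used.

```python
import subprocess
from typing import List, Optional


def shutdown_and_wait_cmds(vmid: int, timeout: Optional[int] = None) -> List[List[str]]:
    """Build the two commands: request a shutdown, then wait for the VM to stop."""
    wait = ["qm", "wait", str(vmid)]
    if timeout is not None:
        wait += ["-timeout", str(timeout)]
    return [["qm", "shutdown", str(vmid)], wait]


def run_all(cmds: List[List[str]]) -> None:
    """Run the commands in order and stop at the first failure, like `&&`."""
    for cmd in cmds:
        subprocess.run(cmd, check=True)


for cmd in shutdown_and_wait_cmds(300, timeout=40):
    print(" ".join(cmd))
```

Calling `run_all(shutdown_and_wait_cmds(300, timeout=40))` on a {pve} host
would then execute both steps.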
1091
1092
1093[[qm_configuration]]
1094Configuration
1095-------------
1096
1097VM configuration files are stored inside the Proxmox cluster file
1098system, and can be accessed at `/etc/pve/qemu-server/<VMID>.conf`.
1099Like other files stored inside `/etc/pve/`, they get automatically
1100replicated to all other cluster nodes.
1101
1102NOTE: VMIDs < 100 are reserved for internal purposes, and VMIDs need to be
1103unique cluster wide.
1104
1105.Example VM Configuration
1106----
1107cores: 1
1108sockets: 1
1109memory: 512
1110name: webmail
1111ostype: l26
1112bootdisk: virtio0
1113net0: e1000=EE:D2:28:5F:B6:3E,bridge=vmbr0
1114virtio0: local:vm-100-disk-1,size=32G
1115----
1116
1117Those configuration files are simple text files, and you can edit them
1118using a normal text editor (`vi`, `nano`, ...). This is sometimes
1119useful to do small corrections, but keep in mind that you need to
1120restart the VM to apply such changes.
1121
1122For that reason, it is usually better to use the `qm` command to
1123generate and modify those files, or do the whole thing using the GUI.
Our toolkit is smart enough to instantaneously apply most changes to
a running VM. This feature is called "hot plug", and there is no
1126need to restart the VM in that case.
1127
1128
1129File Format
1130~~~~~~~~~~~
1131
1132VM configuration files use a simple colon separated key/value
1133format. Each line has the following format:
1134
1135-----
1136# this is a comment
1137OPTION: value
1138-----
1139
1140Blank lines in those files are ignored, and lines starting with a `#`
1141character are treated as comments and are also ignored.
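
A reader for this format fits in a few lines. The following sketch follows
only the rules stated above. The function name `parse_flat_config` is
hypothetical, and the sketch does not validate option names or values.

```python
def parse_flat_config(text: str) -> dict:
    """Parse `OPTION: value` lines, skipping blank lines and comments."""
    options = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        # Split at the first colon only, since values may contain colons.
        key, sep, value = stripped.partition(":")
        if not sep:
            raise ValueError(f"not a key/value line: {line!r}")
        options[key.strip()] = value.strip()
    return options


example = """\
# this is a comment
memory: 512
net0: e1000=EE:D2:28:5F:B6:3E,bridge=vmbr0

virtio0: local:vm-100-disk-1,size=32G
"""
print(parse_flat_config(example))
```

Splitting at the first colon matters because values such as MAC addresses
and volume names contain colons themselves.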
1142
1143
1144[[qm_snapshots]]
1145Snapshots
1146~~~~~~~~~
1147
1148When you create a snapshot, `qm` stores the configuration at snapshot
1149time into a separate snapshot section within the same configuration
1150file. For example, after creating a snapshot called ``testsnapshot'',
1151your configuration file will look like this:
1152
1153.VM configuration with snapshot
1154----
1155memory: 512
1156swap: 512
parent: testsnapshot
...

[testsnapshot]
1161memory: 512
1162swap: 512
1163snaptime: 1457170803
1164...
1165----
1166
1167There are a few snapshot related properties like `parent` and
1168`snaptime`. The `parent` property is used to store the parent/child
1169relationship between snapshots. `snaptime` is the snapshot creation
1170time stamp (Unix epoch).
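
The `parent` links can be followed to list the snapshots a VM state is based
on. This is a minimal sketch and not the parser {pve} uses. It assumes that
section headers have the form `[name]`, as in the example above, and the
function names and the snapshot names in the sample are made up.

```python
def parse_sections(text: str) -> dict:
    """Split a configuration into the current state ('') and snapshot sections."""
    sections = {"": {}}
    current = sections[""]
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if stripped.startswith("[") and stripped.endswith("]"):
            # A section header starts a new snapshot section.
            current = sections.setdefault(stripped[1:-1], {})
            continue
        key, _, value = stripped.partition(":")
        current[key.strip()] = value.strip()
    return sections


def snapshot_chain(sections: dict) -> list:
    """Follow `parent` from the current state back to the oldest snapshot."""
    chain = []
    name = sections[""].get("parent")
    while name and name in sections and name not in chain:
        chain.append(name)
        name = sections[name].get("parent")
    return chain


sample = """\
memory: 512
parent: second

[first]
memory: 256
snaptime: 1457170803

[second]
memory: 512
parent: first
snaptime: 1457174403
"""
print(snapshot_chain(parse_sections(sample)))
```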
1171
1172
1173[[qm_options]]
1174Options
1175~~~~~~~
1176
1177include::qm.conf.5-opts.adoc[]
1178
1179
1180Locks
1181-----
1182
1183Online migrations, snapshots and backups (`vzdump`) set a lock to
1184prevent incompatible concurrent actions on the affected VMs. Sometimes
1185you need to remove such a lock manually (e.g., after a power failure).
1186
1187 qm unlock <vmid>
1188
1189CAUTION: Only do that if you are sure the action which set the lock is
1190no longer running.
1191
1192
1193ifdef::wiki[]
1194
1195See Also
1196~~~~~~~~
1197
1198* link:/wiki/Cloud-Init_Support[Cloud-Init Support]
1199
1200endif::wiki[]
1201
1202
1203ifdef::manvolnum[]
1204
1205Files
1206------
1207
1208`/etc/pve/qemu-server/<VMID>.conf`::
1209
1210Configuration file for the VM '<VMID>'.
1211
1212
1213include::pve-copyright.adoc[]
1214endif::manvolnum[]