X-Git-Url: https://git.proxmox.com/?p=pve-docs.git;a=blobdiff_plain;f=qm.adoc;h=06e88e3ae975b3ac0cbba4b3f257f358238c6ceb;hp=44d45f6152b875889369ea1381b39d4aa90e8a1a;hb=445822a94613be87eb68f126f21d56249d0e88ca;hpb=eb01c5cfee72747e7e2669c28b6a499fb61b0bf4

diff --git a/qm.adoc b/qm.adoc
index 44d45f6..06e88e3 100644
--- a/qm.adoc
+++ b/qm.adoc
@@ -130,7 +130,7 @@ Hard Disk
 Qemu can emulate a number of storage controllers:
 
 * the *IDE* controller, has a design which goes back to the 1984 PC/AT disk
-controller. Even if this controller has been superseded by more more designs,
+controller. Even though this controller has been superseded by more recent designs,
 each and every OS you can think of has support for it, making it a great choice
 if you want to run an OS released before 2003. You can connect up to 4 devices
 on this controller.
@@ -154,25 +154,25 @@ _VirtIO SCSI single_ which will allow you to select the *IO Thread* option.
 When selecting _VirtIO SCSI single_ Qemu will create a new controller for each
 disk, instead of adding all disks to the same controller.
 
-* The *Virtio* controller, also called virtio-blk to distinguish from
-the VirtIO SCSI controller, is an older type of paravirtualized controller
-which has been superseded in features by the Virtio SCSI Controller.
+* The *VirtIO Block* controller, often just called VirtIO or virtio-blk,
+is an older type of paravirtualized controller. It has been superseded, in
+terms of features, by the VirtIO SCSI Controller.
 
 [thumbnail="gui-create-vm-hard-disk.png"]
 On each controller you attach a number of emulated hard disks, which are backed
 by a file or a block device residing in the configured storage. The choice of a
 storage type will determine the format of the hard disk image. Storages which
 present block devices (LVM, ZFS, Ceph) will require the *raw disk image format*,
-whereas files based storages (Ext4, NFS, GlusterFS) will let you to choose
+whereas file based storages (Ext4, NFS, CIFS, GlusterFS) will let you choose
 either the *raw disk image format* or the *QEMU image format*.
 
 * the *QEMU image format* is a copy on write format which allows snapshots, and
   thin provisioning of the disk image.
 * the *raw disk image* is a bit-to-bit image of a hard disk, similar to what
   you would get when executing the `dd` command on a block device in Linux. This
-  format do not support thin provisioning or snapshotting by itself, requiring
-  cooperation from the storage layer for these tasks. It is however 10% faster
-  than the *QEMU image format*. footnote:[See this benchmark for details
+  format does not support thin provisioning or snapshots by itself, requiring
+  cooperation from the storage layer for these tasks. It may, however, be up to
+  10% faster than the *QEMU image format*. footnote:[See this benchmark for details
   http://events.linuxfoundation.org/sites/events/files/slides/CloudOpen2013_Khoa_Huynh_v3.pdf]
 * the *VMware image format* only makes sense if you intend to import/export the
   disk image to other hypervisors.
@@ -219,9 +219,9 @@ A *CPU socket* is a physical slot on a PC motherboard where you can plug a CPU.
 This CPU can then contain one or many *cores*, which are independent processing
 units. Whether you have a single CPU socket with 4 cores, or two CPU sockets
 with two cores is mostly irrelevant from a performance point of view.
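+
+As a minimal sketch of how the options above map to the command line (assuming
+a hypothetical VM with ID 100 and the default +local-lvm+ storage), the disk
+controller, a new disk and the core/socket count could be set with `qm`:
+
+----
+# allocate a new 32 GiB volume on local-lvm and attach it as scsi0,
+# served by a VirtIO SCSI controller
+qm set 100 --scsihw virtio-scsi-pci --scsi0 local-lvm:32
+# one CPU socket with four cores
+qm set 100 --sockets 1 --cores 4
+----
+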
-However some software is licensed depending on the number of sockets you have in
-your machine, in that case it makes sense to set the number of of sockets to
-what the license allows you, and increase the number of cores.
+However, some software licenses depend on the number of sockets a machine has;
+in that case it makes sense to set the number of sockets to what the license
+allows.
 
 Increasing the number of virtual cpus (cores and sockets) will usually provide a
 performance improvement though that is heavily dependent on the use of the VM.
@@ -230,14 +230,58 @@ virtual cpus, as for each virtual cpu you add, Qemu will create a new thread of
 execution on the host system. If you're not sure about the workload of your VM,
 it is usually a safe bet to set the number of *Total cores* to 2.
 
-NOTE: It is perfectly safe to set the _overall_ number of total cores in all
-your VMs to be greater than the number of of cores you have on your server (i.e.
-4 VMs with each 4 Total cores running in a 8 core machine is OK) In that case
-the host system will balance the Qemu execution threads between your server
-cores just like if you were running a standard multithreaded application.
-However {pve} will prevent you to allocate on a _single_ machine more vcpus than
-physically available, as this will only bring the performance down due to the
-cost of context switches.
+NOTE: It is perfectly safe if the _overall_ number of cores of all your VMs
+is greater than the number of cores on the server (e.g., 4 VMs with each 4
+cores on a machine with only 8 cores). In that case the host system will
+balance the Qemu execution threads between your server cores, just as if you
+were running a standard multithreaded application. However, {pve} will prevent
+you from assigning more virtual CPU cores than physically available, as this will
+only bring the performance down due to the cost of context switches.
+
+[[qm_cpu_resource_limits]]
+Resource Limits
+^^^^^^^^^^^^^^^
+
+In addition to the number of virtual cores, you can configure how much of the
+host's resources a VM may use, both in relation to the host CPU time and in
+relation to other VMs.
+With the *cpulimit* (``Host CPU Time'') option you can limit how much CPU time
+the whole VM can use on the host. It is a floating point value representing CPU
+time in percent, so `1.0` is equal to `100%`, `2.5` to `250%` and so on. If a
+single process fully used one single core it would have `100%` CPU time
+usage. If a VM with four cores utilizes all its cores fully it would
+theoretically use `400%`. In reality the usage may be even a bit higher as Qemu
+can have additional threads for VM peripherals besides the vCPU core ones.
+This setting can be useful if a VM should have multiple vCPUs, as it runs a few
+processes in parallel, but the VM as a whole should not be able to run all
+vCPUs at 100% at the same time. Using a specific example: let's say we have a
+VM which would profit from having 8 vCPUs, but which should never run all of
+those 8 cores at full load - as this would make the server so overloaded that
+other VMs and CTs would get too little CPU time. So, we set the *cpulimit* to
+`4.0` (=400%). If all cores did the same heavy work, each would get 50% of a
+real host core's CPU time. But if only four of them did work, they could still
+get almost 100% of a real core each.
+
+NOTE: VMs can, depending on their configuration, use additional threads, e.g.
+for networking or IO operations, but also for live migration.
+Thus a VM can use more CPU time than its virtual CPUs alone would account for.
+To ensure that a VM never uses more CPU time than its assigned virtual CPUs,
+set the *cpulimit* setting to the same value as the total core count.
+
+The second CPU resource limiting setting, *cpuunits* (nowadays often called CPU
+shares or CPU weight), controls how much CPU time a VM gets compared to other
+running VMs. It is a relative weight which defaults to `1024`; if you increase
+this for a VM it will be prioritized by the scheduler in comparison to other
+VMs with lower weight. E.g., if VM 100 has the default `1024` and VM 200 was
+changed to `2048`, the latter VM 200 would receive twice the CPU bandwidth of
+the first VM 100.
+
+For more information see `man systemd.resource-control`: there, `CPUQuota`
+corresponds to `cpulimit` and `CPUShares` corresponds to our `cpuunits`
+setting. Its Notes section has references and implementation details.
+
+CPU Type
+^^^^^^^^
 
 Qemu can emulate a number different of *CPU types* from 486 to the latest Xeon
 processors. Each new processor generation adds new features, like hardware
@@ -256,22 +300,114 @@ kvm64 is a Pentium 4 look a like CPU type, which has a reduced CPU flags set,
 but is guaranteed to work everywhere.
 
 In short, if you care about live migration and moving VMs between nodes, leave
-the kvm64 default. If you don’t care about live migration, set the CPU type to
-host, as in theory this will give your guests maximum performance.
+the kvm64 default. If you don’t care about live migration or have a homogeneous
+cluster where all nodes have the same CPU, set the CPU type to host, as in
+theory this will give your guests maximum performance.
 
-You can also optionally emulate a *NUMA* architecture in your VMs. The basics of
-the NUMA architecture mean that instead of having a global memory pool available
-to all your cores, the memory is spread into local banks close to each socket.
+Meltdown / Spectre related CPU flags
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+There are two CPU flags related to the Meltdown and Spectre vulnerabilities
+footnote:[Meltdown Attack https://meltdownattack.com/] which need to be set
+manually unless the selected CPU type of your VM already enables them by default.
+
+The first, called 'pcid', helps to reduce the performance impact of the Meltdown
+mitigation called 'Kernel Page-Table Isolation (KPTI)', which effectively hides
+the Kernel memory from the user space. Without PCID, KPTI is quite an expensive
+mechanism footnote:[PCID is now a critical performance/security feature on x86
+https://groups.google.com/forum/m/#!topic/mechanical-sympathy/L9mHTbeQLNU].
+
+The second CPU flag is called 'spec-ctrl', which allows an operating system to
+selectively disable or restrict speculative execution in order to limit the
+ability of attackers to exploit the Spectre vulnerability.
+
+There are two requirements that need to be fulfilled in order to use these two
+CPU flags:
+
+* The host CPU(s) must support the feature and propagate it to the guest's virtual CPU(s)
+* The guest operating system must be updated to a version which mitigates the
+ attacks and is able to utilize the CPU feature
+
+In order to use 'spec-ctrl', your CPU or system vendor also needs to provide a
+so-called ``microcode update'' footnote:[You can use `intel-microcode' /
+`amd-microcode' from Debian non-free if your vendor does not provide such an
+update. Note that not all affected CPUs can be updated to support spec-ctrl.]
+for your CPU.
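+
+As a rough sketch, assuming a Debian-based {pve} host with the non-free
+repository component enabled in its APT sources, such a microcode package
+could be installed like this (for AMD CPUs, use the corresponding package
+mentioned in the footnote above):
+
+----
+# refresh the package index and install the Intel microcode package
+apt-get update
+apt-get install intel-microcode
+# reboot the host afterwards so the updated microcode gets loaded
+----
+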
+
+To check if the {pve} host supports PCID, execute the following command as root:
+
+----
+# grep ' pcid ' /proc/cpuinfo
+----
+
+If this does not return empty, your host's CPU has support for 'pcid'.
+
+To check if the {pve} host supports spec-ctrl, execute the following command as root:
+
+----
+# grep ' spec_ctrl ' /proc/cpuinfo
+----
+
+If this does not return empty, your host's CPU has support for 'spec-ctrl'.
+
+If you use `host' or another CPU type which enables the desired flags by
+default, and you updated your guest OS to make use of the associated CPU
+features, you're already set.
+
+Otherwise you need to set the desired CPU flag of the virtual CPU, either by
+editing the CPU options in the WebUI, or by setting the 'flags' property of the
+'cpu' option in the VM configuration file.
+
+NUMA
+^^^^
+You can also optionally emulate a *NUMA*
+footnote:[https://en.wikipedia.org/wiki/Non-uniform_memory_access] architecture
+in your VMs. The basics of the NUMA architecture mean that instead of having a
+global memory pool available to all your cores, the memory is spread into local
+banks close to each socket.
 This can bring speed improvements as the memory bus is not a bottleneck
 anymore. If your system has a NUMA architecture footnote:[if the command
 `numactl --hardware | grep available` returns more than one node, then your host
 system has a NUMA architecture] we recommend to activate the option, as this
-will allow proper distribution of the VM resources on the host system. This
-option is also required in {pve} to allow hotplugging of cores and RAM to a VM.
+will allow proper distribution of the VM resources on the host system.
+This option is also required to hot-plug cores or RAM in a VM.
 
 If the NUMA option is used, it is recommended to set the number of sockets to
 the number of sockets of the host system.
 
+vCPU hot-plug
+^^^^^^^^^^^^^
+
+Modern operating systems introduced the capability to hot-plug and, to a
+certain extent, hot-unplug CPUs in a running system. Virtualisation allows us
+to avoid a lot of the (physical) problems real hardware can cause in such
+scenarios.
+Still, this is a rather new and complicated feature, so its use should be
+restricted to cases where it's absolutely needed. Most of the functionality can
+be replicated with other, well tested and less complicated, features; see
+xref:qm_cpu_resource_limits[Resource Limits].
+
+In {pve} the maximum number of plugged-in CPUs is always `cores * sockets`.
+To start a VM with fewer than this total core count of CPUs you may use the
+*vcpus* setting; it denotes how many vCPUs should be plugged in at VM start.
+
+Currently, this feature is only supported on Linux; a kernel newer than 3.10
+is needed, and a kernel newer than 4.7 is recommended.
+
+You can use a udev rule as follows to automatically set new CPUs as online in
+the guest:
+
+----
+SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}="1"
+----
+
+Save this under /etc/udev/rules.d/ as a file ending in `.rules`.
+
+NOTE: CPU hot-remove is machine dependent and requires guest cooperation. The
+deletion command does not guarantee that CPU removal actually happens;
+typically it is a request forwarded to the guest using a target-dependent
+mechanism, e.g., ACPI on x86/amd64.
+
 [[qm_memory]]
 Memory
 ------
 
@@ -282,27 +418,26 @@ For each VM you have the option to set a fixed size memory or asking
 host.
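+
+Both allocation modes described below map onto the `memory` and `balloon`
+options. The following is a minimal sketch, assuming a hypothetical VM with
+ID 100 and values in MiB:
+
+----
+# fixed allocation: minimum memory (balloon) equals memory
+qm set 100 --memory 2048 --balloon 2048
+# automatic allocation: the guest gets between 1024 and 2048 MiB
+qm set 100 --memory 2048 --balloon 1024
+# disable the ballooning device entirely
+qm set 100 --balloon 0
+----
+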
.Fixed Memory Allocation -[thumbnail="gui-create-vm-memory-fixed.png"] +[thumbnail="gui-create-vm-memory.png"] -When choosing a *fixed size memory* {pve} will simply allocate what you -specify to your VM. +When setting memory and minimum memory to the same amount +{pve} will simply allocate what you specify to your VM. Even when using a fixed memory size, the ballooning device gets added to the VM, because it delivers useful information such as how much memory the guest really uses. In general, you should leave *ballooning* enabled, but if you want to disable it (e.g. for debugging purposes), simply uncheck -*Ballooning* or set +*Ballooning Device* or set balloon: 0 in the configuration. .Automatic Memory Allocation -[thumbnail="gui-create-vm-memory-dynamic.png", float="left"] // see autoballoon() in pvestatd.pm -When choosing to *automatically allocate memory*, {pve} will make sure that the +When setting the minimum memory lower than memory, {pve} will make sure that the minimum amount you specified is always available to the VM, and if RAM usage on the host is below 80%, will dynamically add memory to the guest up to the maximum memory specified. @@ -367,7 +502,8 @@ have direct access to the Ethernet LAN on which the host is located. the Qemu user networking stack, where a built-in router and DHCP server can provide network access. This built-in DHCP will serve addresses in the private 10.0.2.0/24 range. The NAT mode is much slower than the bridged mode, and -should only be used for testing. +should only be used for testing. This mode is only available via CLI or the API, +but not via the WebUI. You can also skip adding a network device when creating a VM by selecting *No network device*. @@ -493,15 +629,16 @@ parameters: * *Start/Shutdown order*: Defines the start order priority. E.g. set it to 1 if you want the VM to be the first to be started. (We use the reverse startup order for shutdown, so a machine with a start order of 1 would be the last to -be shut down) +be shut down). If multiple VMs have the same order defined on a host, they will +additionally be ordered by 'VMID' in ascending order. * *Startup delay*: Defines the interval between this VM start and subsequent VMs starts . E.g. set it to 240 if you want to wait 240 seconds before starting other VMs. * *Shutdown timeout*: Defines the duration in seconds {pve} should wait for the VM to be offline after issuing a shutdown command. -By default this value is set to 60, which means that {pve} will issue a -shutdown request, wait 60s for the machine to be offline, and if after 60s -the machine is still online will notify that the shutdown action failed. +By default this value is set to 180, which means that {pve} will issue a +shutdown request and wait 180 seconds for the machine to be offline. If +the machine is still online after the timeout it will be stopped forcefully. NOTE: VMs managed by the HA stack do not follow the 'start on boot' and 'boot order' options currently. Those VMs will be skipped by the startup and @@ -509,8 +646,8 @@ shutdown algorithm as the HA manager itself ensures that VMs get started and stopped. Please note that machines without a Start/Shutdown order parameter will always -start after those where the parameter is set, and this parameter only -makes sense between the machines running locally on a host, and not +start after those where the parameter is set. Further, this parameter can only +be enforced between virtual machines running on the same host, not cluster-wide. 
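+
+As a sketch of how these options are combined on the command line (hypothetical
+VMID 101; the `up` and `down` values are given in seconds):
+
+----
+# start this VM automatically at boot, first in the start order,
+# wait 240s before starting the next VM and allow 180s for its shutdown
+qm set 101 --onboot 1 --startup order=1,up=240,down=180
+----
+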
@@ -651,8 +788,8 @@ NOTE: It is not possible to start templates, because this would modify the disk images. If you want to change the template, create a linked clone and modify that. -Importing Virtual Machines from foreign hypervisors ---------------------------------------------------- +Importing Virtual Machines and disk images +------------------------------------------ A VM export from a foreign hypervisor takes usually the form of one or more disk images, with a configuration file describing the settings of the VM (RAM, @@ -682,44 +819,77 @@ GNU/Linux and other free Unix can usually be imported without hassle. Note that we cannot guarantee a successful import/export of Windows VMs in all cases due to the problems above. -Step-by-step example of a Windows disk image import -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Step-by-step example of a Windows OVF import +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Microsoft provides -https://developer.microsoft.com/en-us/microsoft-edge/tools/vms/[Virtual Machines exports] - in different formats for browser testing. We are going to use one of these to - demonstrate a VMDK import. +https://developer.microsoft.com/en-us/windows/downloads/virtual-machines/[Virtual Machines downloads] + to get started with Windows development.We are going to use one of these +to demonstrate the OVF import feature. -Download the export zip -^^^^^^^^^^^^^^^^^^^^^^^ +Download the Virtual Machine zip +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -After getting informed about the user agreement, choose the _Microsoft Edge on -Windows 10 Virtual Machine_ for the VMware platform, and download the zip. +After getting informed about the user agreement, choose the _Windows 10 +Enterprise (Evaluation - Build)_ for the VMware platform, and download the zip. Extract the disk image from the zip ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -Using the unzip utility or any archiver of your choice, unpack the zip, -and copy via ssh/scp the vmdk file to your {pve} host. +Using the `unzip` utility or any archiver of your choice, unpack the zip, +and copy via ssh/scp the ovf and vmdk files to your {pve} host. + +Import the Virtual Machine +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This will create a new virtual machine, using cores, memory and +VM name as read from the OVF manifest, and import the disks to the +local-lvm+ + storage. You have to configure the network manually. + + qm importovf 999 WinDev1709Eval.ovf local-lvm + +The VM is ready to be started. + +Adding an external disk image to a Virtual Machine +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +You can also add an existing disk image to a VM, either coming from a +foreign hypervisor, or one that you created yourself. -Create a new virtual machine and import the disk -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Suppose you created a Debian/Ubuntu disk image with the 'vmdebootstrap' tool: -Create a virtual machine with 2 cores, 2GB RAM, and one NIC on the default -+vmbr0+ bridge: + vmdebootstrap --verbose \ + --size 10GiB --serial-console \ + --grub --no-extlinux \ + --package openssh-server \ + --package avahi-daemon \ + --package qemu-guest-agent \ + --hostname vm600 --enable-dhcp \ + --customize=./copy_pub_ssh.sh \ + --sparse --image vm600.raw - qm create 999 -net0 e1000,bridge=vmbr0 -name Win10 -memory 2048 -bootdisk sata0 +You can now create a new target VM for this image. 
-Import the disk image to the +local-lvm+ storage: + qm create 600 --net0 virtio,bridge=vmbr0 --name vm600 --serial0 socket \ + --bootdisk scsi0 --scsihw virtio-scsi-pci --ostype l26 - qm importdisk 999 "MSEdge - Win10_preview.vmdk" local-lvm +Add the disk image as +unused0+ to the VM, using the storage +pvedir+: + + qm importdisk 600 vm600.raw pvedir + +Finally attach the unused disk to the SCSI controller of the VM: + + qm set 600 --scsi0 pvedir:600/vm-600-disk-1.raw -The disk will be marked as *Unused* in the VM 999 configuration. -After that you can go in the GUI, in the VM *Hardware*, *Edit* the unused disk -and set the *Bus/Device* to *SATA/0*. The VM is ready to be started. +ifndef::wiki[] +include::qm-cloud-init.adoc[] +endif::wiki[] + + + Managing Virtual Machines with `qm` ------------------------------------ @@ -850,6 +1020,16 @@ CAUTION: Only do that if you are sure the action which set the lock is no longer running. +ifdef::wiki[] + +See Also +~~~~~~~~ + +* link:/wiki/Cloud-Init_Support[Cloud-Init Support] + +endif::wiki[] + + ifdef::manvolnum[] Files