[[chapter_virtual_machines]]
ifdef::manvolnum[]
qm(1)
=====
:pve-toplevel:

NAME
----

qm - QEMU/KVM Virtual Machine Manager


SYNOPSIS
--------

include::qm.1-synopsis.adoc[]

DESCRIPTION
-----------
endif::manvolnum[]
ifndef::manvolnum[]
QEMU/KVM Virtual Machines
=========================
:pve-toplevel:
endif::manvolnum[]

// deprecates
// http://pve.proxmox.com/wiki/Container_and_Full_Virtualization
// http://pve.proxmox.com/wiki/KVM
// http://pve.proxmox.com/wiki/Qemu_Server

QEMU (short for Quick Emulator) is an open source hypervisor that emulates a
physical computer. From the perspective of the host system where QEMU is
running, QEMU is a user program which has access to a number of local resources
like partitions, files and network cards, which are then passed to an
emulated computer which sees them as if they were real devices.

A guest operating system running in the emulated computer accesses these
devices, and runs as if it were running on real hardware. For instance, you can pass
an ISO image as a parameter to QEMU, and the OS running in the emulated computer
will see a real CD-ROM inserted into a CD drive.

QEMU can emulate a great variety of hardware from ARM to Sparc, but {pve} is
only concerned with 32 and 64 bit PC clone emulation, since it represents the
overwhelming majority of server hardware. The emulation of PC clones is also one
of the fastest due to the availability of processor extensions which greatly
speed up QEMU when the emulated architecture is the same as the host
architecture.

NOTE: You may sometimes encounter the term _KVM_ (Kernel-based Virtual Machine).
This means that QEMU is running with the support of the virtualization processor
extensions, via the Linux KVM module. In the context of {pve}, _QEMU_ and
_KVM_ can be used interchangeably, as QEMU in {pve} will always try to load the KVM
module.

QEMU inside {pve} runs as a root process, since this is required to access block
and PCI devices.


Emulated devices and paravirtualized devices
--------------------------------------------

The PC hardware emulated by QEMU includes a motherboard, network controllers,
SCSI, IDE and SATA controllers, serial ports (the complete list can be seen in
the `kvm(1)` man page), all of them emulated in software. All these devices
are the exact software equivalent of existing hardware devices, and if the OS
running in the guest has the proper drivers it will use the devices as if it
were running on real hardware. This allows QEMU to run _unmodified_ operating
systems.

This however has a performance cost, as running in software what was meant to
run in hardware involves a lot of extra work for the host CPU. To mitigate this,
QEMU can present to the guest operating system _paravirtualized devices_, where
the guest OS recognizes it is running inside QEMU and cooperates with the
hypervisor.

QEMU relies on the virtio virtualization standard, and is thus able to present
paravirtualized virtio devices, which include a paravirtualized generic disk
controller, a paravirtualized network card, a paravirtualized serial port,
a paravirtualized SCSI controller, and so on.

TIP: It is *highly recommended* to use the virtio devices whenever you can, as
they provide a big performance improvement and are generally better maintained.
Using the virtio generic disk controller versus an emulated IDE controller will
double the sequential write throughput, as measured with `bonnie++(8)`. Using
the virtio network interface can deliver up to three times the throughput of an
emulated Intel E1000 network card, as measured with `iperf(1)`. footnote:[See
this benchmark on the KVM wiki https://www.linux-kvm.org/page/Using_VirtIO_NIC]


[[qm_virtual_machines_settings]]
Virtual Machines Settings
-------------------------

Generally speaking {pve} tries to choose sane defaults for virtual machines
(VM). Make sure you understand the meaning of the settings you change, as
changing them could incur a performance slowdown or put your data at risk.


[[qm_general_settings]]
General Settings
~~~~~~~~~~~~~~~~

[thumbnail="screenshot/gui-create-vm-general.png"]

General settings of a VM include

* the *Node*: the physical server on which the VM will run
* the *VM ID*: a unique number in this {pve} installation used to identify your VM
* *Name*: a free form text string you can use to describe the VM
* *Resource Pool*: a logical group of VMs

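These settings map directly to `qm create` parameters. A minimal sketch,
assuming a resource pool named `production` already exists and VM ID `120` is
free (the node is implied by the host on which the command is run):

----
# qm create 120 --name webserver --pool production
----
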

[[qm_os_settings]]
OS Settings
~~~~~~~~~~~

[thumbnail="screenshot/gui-create-vm-os.png"]

When creating a virtual machine (VM), setting the proper Operating System (OS)
allows {pve} to optimize some low level parameters. For instance, a Windows OS
expects the BIOS clock to use the local time, while Unix based OS expect the
BIOS clock to have the UTC time.

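The selected OS type is stored in the `ostype` option of the VM configuration
and can also be changed later on the command line, for example for a recent
Windows guest (VM ID assumed):

----
# qm set 120 --ostype win11
----
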
[[qm_system_settings]]
System Settings
~~~~~~~~~~~~~~~

On VM creation you can change some basic system components of the new VM. You
can specify which xref:qm_display[display type] you want to use.
[thumbnail="screenshot/gui-create-vm-system.png"]
Additionally, the xref:qm_hard_disk[SCSI controller] can be changed.
If you plan to install the QEMU Guest Agent, or if your selected ISO image
already ships and installs it automatically, you may want to tick the 'QEMU
Agent' box, which lets {pve} know that it can use its features to show some
more information, and complete some actions (for example, shutdown or
snapshots) more intelligently.

{pve} allows you to boot VMs with different firmware and machine types, namely
xref:qm_bios_and_uefi[SeaBIOS and OVMF]. In most cases you want to switch from
the default SeaBIOS to OVMF only if you plan to use
xref:qm_pci_passthrough[PCIe passthrough].

[[qm_machine_type]]

Machine Type
^^^^^^^^^^^^

A VM's 'Machine Type' defines the hardware layout of the VM's virtual
motherboard. You can choose between the default
https://en.wikipedia.org/wiki/Intel_440FX[Intel 440FX] or the
https://ark.intel.com/content/www/us/en/ark/products/31918/intel-82q35-graphics-and-memory-controller.html[Q35]
chipset, which also provides a virtual PCIe bus, and thus may be
desired if you want to pass through PCIe hardware.

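The chipset can be selected in the GUI under 'System', or on the command line
(VM ID assumed):

----
# qm set 120 --machine q35
----
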
Machine Version
+++++++++++++++

Each machine type is versioned in QEMU and a given QEMU binary supports many
machine versions. New versions might bring support for new features, fixes or
general improvements. However, they also change properties of the virtual
hardware. To avoid sudden changes from the guest's perspective and ensure
compatibility of the VM state, live-migration and snapshots with RAM will keep
using the same machine version in the new QEMU instance.

For Windows guests, the machine version is pinned during creation, because
Windows is sensitive to changes in the virtual hardware - even between cold
boots. For example, the enumeration of network devices might be different with
different machine versions. Other OSes like Linux can usually deal with such
changes just fine. For those, the 'Latest' machine version is used by default.
This means that after a fresh start, the newest machine version supported by the
QEMU binary is used (e.g. the newest machine version QEMU 8.1 supports is
version 8.1 for each machine type).

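To pin a VM to a specific machine version, the version can be appended to the
machine type; a sketch with an assumed VM ID and version:

----
# qm set 120 --machine pc-q35-8.1
----
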
[[qm_machine_update]]

Update to a Newer Machine Version
+++++++++++++++++++++++++++++++++

Very old machine versions might become deprecated in QEMU. For example, this is
the case for versions 1.4 to 1.7 for the i440fx machine type. It is expected
that support for these machine versions will be dropped at some point. If you
see a deprecation warning, you should change the machine version to a newer one.
Be sure to have a working backup first and be prepared for changes to how the
guest sees hardware. In some scenarios, re-installing certain drivers might be
required. You should also check for snapshots with RAM that were taken with
these machine versions (i.e. the `runningmachine` configuration entry).
Unfortunately, there is no way to change the machine version of a snapshot, so
you'd need to load the snapshot to salvage any data from it.

[[qm_hard_disk]]
Hard Disk
~~~~~~~~~

[[qm_hard_disk_bus]]
Bus/Controller
^^^^^^^^^^^^^^
QEMU can emulate a number of storage controllers:

TIP: It is highly recommended to use the *VirtIO SCSI* or *VirtIO Block*
controller for performance reasons and because they are better maintained.

* the *IDE* controller has a design which goes back to the 1984 PC/AT disk
controller. Even though this controller has been superseded by recent designs,
each and every OS you can think of has support for it, making it a great choice
if you want to run an OS released before 2003. You can connect up to 4 devices
on this controller.

* the *SATA* (Serial ATA) controller, dating from 2003, has a more modern
design, allowing higher throughput and a greater number of devices to be
connected. You can connect up to 6 devices on this controller.

* the *SCSI* controller, designed in 1985, is commonly found on server grade
hardware, and can connect up to 14 storage devices. {pve} emulates by default a
LSI 53C895A controller.
+
A SCSI controller of type _VirtIO SCSI single_ and enabling the
xref:qm_hard_disk_iothread[IO Thread] setting for the attached disks is
recommended if you aim for performance. This is the default for newly created
Linux VMs since {pve} 7.3. Each disk will have its own _VirtIO SCSI_ controller,
and QEMU will handle the disks' IO in a dedicated thread. Linux distributions
have support for this controller since 2012, and FreeBSD since 2014. For Windows
OSes, you need to provide an extra ISO containing the drivers during the
installation.
// https://pve.proxmox.com/wiki/Paravirtualized_Block_Drivers_for_Windows#During_windows_installation.

* The *VirtIO Block* controller, often just called VirtIO or virtio-blk,
is an older type of paravirtualized controller. It has been superseded by the
VirtIO SCSI Controller in terms of features.

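For example, attaching a new 32 GiB disk to the recommended _VirtIO SCSI
single_ controller with IO Thread enabled could look like this (VM ID and
storage name assumed):

----
# qm set 120 --scsihw virtio-scsi-single --scsi0 local-lvm:32,iothread=1
----
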
[thumbnail="screenshot/gui-create-vm-hard-disk.png"]

[[qm_hard_disk_formats]]
Image Format
^^^^^^^^^^^^
On each controller you attach a number of emulated hard disks, which are backed
by a file or a block device residing in the configured storage. The choice of
a storage type will determine the format of the hard disk image. Storages which
present block devices (LVM, ZFS, Ceph) will require the *raw disk image format*,
whereas file based storages (Ext4, NFS, CIFS, GlusterFS) will let you choose
either the *raw disk image format* or the *QEMU image format*.

* the *QEMU image format* is a copy on write format which allows snapshots, and
thin provisioning of the disk image.
* the *raw disk image* is a bit-to-bit image of a hard disk, similar to what
you would get when executing the `dd` command on a block device in Linux. This
format does not support thin provisioning or snapshots by itself, requiring
cooperation from the storage layer for these tasks. It may, however, be up to
10% faster than the *QEMU image format*. footnote:[See this benchmark for details
https://events.static.linuxfound.org/sites/events/files/slides/CloudOpen2013_Khoa_Huynh_v3.pdf]
* the *VMware image format* only makes sense if you intend to import/export the
disk image to other hypervisors.

[[qm_hard_disk_cache]]
Cache Mode
^^^^^^^^^^
Setting the *Cache* mode of the hard drive will impact how the host system will
notify the guest systems of block write completions. The *No cache* default
means that the guest system will be notified that a write is complete when each
block reaches the physical storage write queue, ignoring the host page cache.
This provides a good balance between safety and speed.

If you want the {pve} backup manager to skip a disk when doing a backup of a VM,
you can set the *No backup* option on that disk.

If you want the {pve} storage replication mechanism to skip a disk when starting
a replication job, you can set the *Skip replication* option on that disk.
As of {pve} 5.0, replication requires the disk images to be on a storage of type
`zfspool`, so adding a disk image to other storages when the VM has replication
configured requires skipping replication for this disk image.

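These are all per-disk options; for example, adding a new disk that uses
writeback caching and is excluded from backups could look like this (VM ID and
storage name assumed):

----
# qm set 120 --scsi1 local-lvm:32,cache=writeback,backup=0
----
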
[[qm_hard_disk_discard]]
Trim/Discard
^^^^^^^^^^^^
If your storage supports _thin provisioning_ (see the storage chapter in the
{pve} guide), you can activate the *Discard* option on a drive. With *Discard*
set and a _TRIM_-enabled guest OS footnote:[TRIM, UNMAP, and discard
https://en.wikipedia.org/wiki/Trim_%28computing%29], when the VM's filesystem
marks blocks as unused after deleting files, the controller will relay this
information to the storage, which will then shrink the disk image accordingly.
For the guest to be able to issue _TRIM_ commands, you must enable the *Discard*
option on the drive. Some guest operating systems may also require the
*SSD Emulation* flag to be set. Note that *Discard* on *VirtIO Block* drives is
only supported on guests using Linux Kernel 5.0 or higher.

If you would like a drive to be presented to the guest as a solid-state drive
rather than a rotational hard disk, you can set the *SSD emulation* option on
that drive. There is no requirement that the underlying storage actually be
backed by SSDs; this feature can be used with physical media of any type.
Note that *SSD emulation* is not supported on *VirtIO Block* drives.

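Both flags are set per drive; a sketch with an assumed VM ID and existing disk
volume (re-specifying a drive like this replaces its whole option string, so
keep any other options the drive already uses):

----
# qm set 120 --scsi0 local-lvm:vm-120-disk-0,discard=on,ssd=1
----
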

[[qm_hard_disk_iothread]]
IO Thread
^^^^^^^^^
The option *IO Thread* can only be used when using a disk with the *VirtIO*
controller, or with the *SCSI* controller, when the emulated controller type is
*VirtIO SCSI single*. With *IO Thread* enabled, QEMU creates one I/O thread per
storage controller rather than handling all I/O in the main event loop or vCPU
threads. One benefit is better work distribution and utilization of the
underlying storage. Another benefit is reduced latency (hangs) in the guest for
very I/O-intensive host workloads, since neither the main thread nor a vCPU
thread can be blocked by disk I/O.

[[qm_cpu]]
CPU
~~~

[thumbnail="screenshot/gui-create-vm-cpu.png"]

A *CPU socket* is a physical slot on a PC motherboard where you can plug a CPU.
This CPU can then contain one or many *cores*, which are independent
processing units. Whether you have a single CPU socket with 4 cores, or two CPU
sockets with two cores is mostly irrelevant from a performance point of view.
However some software licenses depend on the number of sockets a machine has,
in that case it makes sense to set the number of sockets to what the license
allows you to use.

Increasing the number of virtual CPUs (cores and sockets) will usually provide a
performance improvement though that is heavily dependent on the use of the VM.
Multi-threaded applications will of course benefit from a large number of
virtual CPUs, as for each virtual CPU you add, QEMU will create a new thread of
execution on the host system. If you're not sure about the workload of your VM,
it is usually a safe bet to set the number of *Total cores* to 2.

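For example, on the command line (VM ID assumed):

----
# qm set 120 --sockets 1 --cores 2
----
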
NOTE: It is perfectly safe if the _overall_ number of cores of all your VMs
is greater than the number of cores on the server (for example, 4 VMs each with
4 cores (= total 16) on a machine with only 8 cores). In that case the host
system will balance the QEMU execution threads between your server cores, just
as if you were running a standard multi-threaded application. However, {pve}
will prevent you from starting VMs with more virtual CPU cores than physically
available, as this will only bring the performance down due to the cost of
context switches.

[[qm_cpu_resource_limits]]
Resource Limits
^^^^^^^^^^^^^^^

*cpulimit*

In addition to the number of virtual cores, the total available ``Host CPU
Time'' for the VM can be set with the *cpulimit* option. It is a floating point
value representing CPU time in percent, so `1.0` is equal to `100%`, `2.5` to
`250%` and so on. If a single process would fully use one single core it would
have `100%` CPU Time usage. If a VM with four cores utilizes all its cores
fully it would theoretically use `400%`. In reality the usage may be even a bit
higher as QEMU can have additional threads for VM peripherals besides the vCPU
core ones.

This setting can be useful when a VM should have multiple vCPUs because it is
running some processes in parallel, but the VM as a whole should not be able to
run all vCPUs at 100% at the same time.

For example, suppose you have a virtual machine that would benefit from having 8
virtual CPUs, but you don't want the VM to be able to max out all 8 cores
running at full load - because that would overload the server and leave other
virtual machines and containers with too little CPU time. To solve this, you
could set *cpulimit* to `4.0` (=400%). This means that if the VM fully utilizes
all 8 virtual CPUs by running 8 processes simultaneously, each vCPU will receive
a maximum of 50% CPU time from the physical cores. However, if the VM workload
only fully utilizes 4 virtual CPUs, it could still receive up to 100% CPU time
from a physical core, for a total of 400%.

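On the command line, this corresponds to (VM ID assumed):

----
# qm set 120 --cpulimit 4
----
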
NOTE: VMs can, depending on their configuration, use additional threads, such
as for networking or IO operations but also live migration. Thus a VM can show
up as using more CPU time than just its virtual CPUs could use. To ensure that a
VM never uses more CPU time than its assigned vCPUs, set the *cpulimit* to
the same value as the total core count.

*cpuunits*

With the *cpuunits* option, nowadays often called CPU shares or CPU weight, you
can control how much CPU time a VM gets compared to other running VMs. It is a
relative weight which defaults to `100` (or `1024` if the host uses legacy
cgroup v1). If you increase this for a VM it will be prioritized by the
scheduler in comparison to other VMs with lower weight.

For example, if VM 100 has set the default `100` and VM 200 was changed to
`200`, the latter VM 200 would receive twice the CPU bandwidth of the first
VM 100.

For more information see `man systemd.resource-control`, here `CPUQuota`
corresponds to `cpulimit` and `CPUWeight` to our `cpuunits` setting. Visit its
Notes section for references and implementation details.

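For example, to give a VM twice the default weight (VM ID assumed):

----
# qm set 200 --cpuunits 200
----
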
*affinity*

With the *affinity* option, you can specify the physical CPU cores that are used
to run the VM's vCPUs. Peripheral VM processes, such as those for I/O, are not
affected by this setting. Note that the *CPU affinity is not a security
feature*.

Forcing a CPU *affinity* can make sense in certain cases but is accompanied by
an increase in complexity and maintenance effort, for example if you want to
add more VMs later or migrate VMs to nodes with fewer CPU cores. It can also
easily lead to asynchronous and therefore limited system performance if some
CPUs are fully utilized while others are almost idle.

The *affinity* is set through the `taskset` CLI tool. It accepts the host CPU
numbers (see `lscpu`) in the `List Format` from `man cpuset`. This ASCII decimal
list can contain numbers but also number ranges. For example, the *affinity*
`0-1,8-11` (expanded `0, 1, 8, 9, 10, 11`) would allow the VM to run on only
these six specific host cores.

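The same example expressed as a command (VM ID assumed):

----
# qm set 120 --affinity 0-1,8-11
----
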
CPU Type
^^^^^^^^

QEMU can emulate a number of different *CPU types* from 486 to the latest Xeon
processors. Each new processor generation adds new features, like hardware
assisted 3d rendering, random number generation, memory protection, etc. Also,
a current generation can be upgraded through
xref:chapter_firmware_updates[microcode update] with bug or security fixes.

Usually you should select for your VM a processor type which closely matches the
CPU of the host system, as it means that the host CPU features (also called _CPU
flags_) will be available in your VMs. If you want an exact match, you can set
the CPU type to *host* in which case the VM will have exactly the same CPU flags
as your host system.

This has a downside though. If you want to do a live migration of VMs between
different hosts, your VM might end up on a new system with a different CPU type
or a different microcode version.
If the CPU flags passed to the guest are missing, the QEMU process will stop. To
remedy this, QEMU also has its own virtual CPU types, which {pve} uses by default.

The backend default is 'kvm64' which works on essentially all x86_64 host CPUs
and the UI default when creating a new VM is 'x86-64-v2-AES', which requires a
host CPU starting from Westmere for Intel or at least a fourth generation
Opteron for AMD.

In short:

If you don't care about live migration or have a homogeneous cluster where all
nodes have the same CPU and same microcode version, set the CPU type to host, as
in theory this will give your guests maximum performance.

If you care about live migration and security, and you have only Intel CPUs or
only AMD CPUs, choose the lowest generation CPU model of your cluster.

If you care about live migration without security, or have a mixed Intel/AMD
cluster, choose the lowest compatible virtual QEMU CPU type.

NOTE: Live migrations between Intel and AMD host CPUs have no guarantee to work.

See also
xref:chapter_qm_vcpu_list[List of AMD and Intel CPU Types as Defined in QEMU].

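For example, to select the UI default type explicitly on the command line
(VM ID assumed):

----
# qm set 120 --cpu x86-64-v2-AES
----
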
QEMU CPU Types
^^^^^^^^^^^^^^

QEMU also provides virtual CPU types, compatible with both Intel and AMD host
CPUs.

NOTE: To mitigate the Spectre vulnerability for virtual CPU types, you need to
add the relevant CPU flags, see
xref:qm_meltdown_spectre[Meltdown / Spectre related CPU flags].

Historically, {pve} had the 'kvm64' CPU model, with CPU flags at the level of
Pentium 4 enabled, so performance was not great for certain workloads.

In the summer of 2020, AMD, Intel, Red Hat, and SUSE collaborated to define
three x86-64 microarchitecture levels on top of the x86-64 baseline, with modern
flags enabled. For details, see the
https://gitlab.com/x86-psABIs/x86-64-ABI[x86-64-ABI specification].

NOTE: Some newer distributions like CentOS 9 are now built with 'x86-64-v2'
flags as a minimum requirement.

* 'kvm64 (x86-64-v1)': Compatible with Intel CPU >= Pentium 4, AMD CPU >=
Phenom.
+
* 'x86-64-v2': Compatible with Intel CPU >= Nehalem, AMD CPU >= Opteron_G3.
Added CPU flags compared to 'x86-64-v1': '+cx16', '+lahf-lm', '+popcnt', '+pni',
'+sse4.1', '+sse4.2', '+ssse3'.
+
* 'x86-64-v2-AES': Compatible with Intel CPU >= Westmere, AMD CPU >= Opteron_G4.
Added CPU flags compared to 'x86-64-v2': '+aes'.
+
* 'x86-64-v3': Compatible with Intel CPU >= Broadwell, AMD CPU >= EPYC. Added
CPU flags compared to 'x86-64-v2-AES': '+avx', '+avx2', '+bmi1', '+bmi2',
'+f16c', '+fma', '+movbe', '+xsave'.
+
* 'x86-64-v4': Compatible with Intel CPU >= Skylake, AMD CPU >= EPYC v4 Genoa.
Added CPU flags compared to 'x86-64-v3': '+avx512f', '+avx512bw', '+avx512cd',
'+avx512dq', '+avx512vl'.

Custom CPU Types
^^^^^^^^^^^^^^^^

You can specify custom CPU types with a configurable set of features. These are
maintained in the configuration file `/etc/pve/virtual-guest/cpu-models.conf` by
an administrator. See `man cpu-models.conf` for format details.

Specified custom types can be selected by any user with the `Sys.Audit`
privilege on `/nodes`. When configuring a custom CPU type for a VM via the CLI
or API, the name needs to be prefixed with 'custom-'.

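For example, assuming an administrator has defined a model named `mycpu` in
`cpu-models.conf`, assigning it to a VM could look like this (VM ID assumed):

----
# qm set 120 --cpu custom-mycpu
----
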
[[qm_meltdown_spectre]]
Meltdown / Spectre related CPU flags
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

There are several CPU flags related to the Meltdown and Spectre vulnerabilities
footnote:[Meltdown Attack https://meltdownattack.com/] which need to be set
manually unless the selected CPU type of your VM already enables them by default.

There are two requirements that need to be fulfilled in order to use these
CPU flags:

* The host CPU(s) must support the feature and propagate it to the guest's virtual CPU(s)
* The guest operating system must be updated to a version which mitigates the
attacks and is able to utilize the CPU feature

Otherwise you need to set the desired CPU flag of the virtual CPU, either by
editing the CPU options in the web UI, or by setting the 'flags' property of the
'cpu' option in the VM configuration file.

For Spectre v1, v2 and v4 fixes, your CPU or system vendor also needs to provide a
so-called ``microcode update'' for your CPU, see
xref:chapter_firmware_updates[chapter Firmware Updates]. Note that not all
affected CPUs can be updated to support spec-ctrl.

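As a sketch, enabling the 'spec-ctrl' and 'pcid' flags on top of the 'kvm64'
type would look like this in the VM configuration file (flags are separated by
semicolons):

----
cpu: kvm64,flags=+spec-ctrl;+pcid
----
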

To check if the {pve} host is vulnerable, execute the following command as root:

----
for f in /sys/devices/system/cpu/vulnerabilities/*; do echo "${f##*/} -" $(cat "$f"); done
----

A community script is also available to detect if the host is still vulnerable.
footnote:[spectre-meltdown-checker https://meltdown.ovh/]

Intel processors
^^^^^^^^^^^^^^^^

* 'pcid'
+
This reduces the performance impact of the Meltdown (CVE-2017-5754) mitigation
called 'Kernel Page-Table Isolation (KPTI)', which effectively hides
the Kernel memory from the user space. Without PCID, KPTI is quite an expensive
mechanism footnote:[PCID is now a critical performance/security feature on x86
https://groups.google.com/forum/m/#!topic/mechanical-sympathy/L9mHTbeQLNU].
+
To check if the {pve} host supports PCID, execute the following command as root:
+
----
# grep ' pcid ' /proc/cpuinfo
----
+
If this does not return empty, your host's CPU has support for 'pcid'.

* 'spec-ctrl'
+
Required to enable the Spectre v1 (CVE-2017-5753) and Spectre v2 (CVE-2017-5715) fix,
in cases where retpolines are not sufficient.
Included by default in Intel CPU models with -IBRS suffix.
Must be explicitly turned on for Intel CPU models without -IBRS suffix.
Requires an updated host CPU microcode (intel-microcode >= 20180425).
+
* 'ssbd'
+
Required to enable the Spectre V4 (CVE-2018-3639) fix. Not included by default in any Intel CPU model.
Must be explicitly turned on for all Intel CPU models.
Requires an updated host CPU microcode (intel-microcode >= 20180703).


AMD processors
^^^^^^^^^^^^^^

* 'ibpb'
+
Required to enable the Spectre v1 (CVE-2017-5753) and Spectre v2 (CVE-2017-5715) fix,
in cases where retpolines are not sufficient.
Included by default in AMD CPU models with -IBPB suffix.
Must be explicitly turned on for AMD CPU models without -IBPB suffix.
Requires the host CPU microcode to support this feature before it can be used for guest CPUs.


* 'virt-ssbd'
+
Required to enable the Spectre v4 (CVE-2018-3639) fix.
Not included by default in any AMD CPU model.
Must be explicitly turned on for all AMD CPU models.
This should be provided to guests, even if amd-ssbd is also provided, for maximum guest compatibility.
Note that this must be explicitly enabled when using the "host" cpu model,
because this is a virtual feature which does not exist in the physical CPUs.


* 'amd-ssbd'
+
Required to enable the Spectre v4 (CVE-2018-3639) fix.
Not included by default in any AMD CPU model. Must be explicitly turned on for all AMD CPU models.
This provides higher performance than virt-ssbd, therefore a host supporting this should always expose this to guests if possible.
virt-ssbd should nonetheless also be exposed for maximum guest compatibility as some kernels only know about virt-ssbd.


* 'amd-no-ssb'
+
Recommended to indicate the host is not vulnerable to Spectre V4 (CVE-2018-3639).
Not included by default in any AMD CPU model.
Future hardware generations of CPU will not be vulnerable to CVE-2018-3639,
and thus the guest should be told not to enable its mitigations, by exposing amd-no-ssb.
This is mutually exclusive with virt-ssbd and amd-ssbd.


NUMA
^^^^
You can also optionally emulate a *NUMA*
footnote:[https://en.wikipedia.org/wiki/Non-uniform_memory_access] architecture
in your VMs. The basics of the NUMA architecture mean that instead of having a
global memory pool available to all your cores, the memory is spread into local
banks close to each socket.
This can bring speed improvements as the memory bus is not a bottleneck
anymore. If your system has a NUMA architecture footnote:[if the command
`numactl --hardware | grep available` returns more than one node, then your host
system has a NUMA architecture] we recommend activating the option, as this
will allow proper distribution of the VM resources on the host system.
This option is also required to hot-plug cores or RAM in a VM.

If the NUMA option is used, it is recommended to set the number of sockets to
the number of nodes of the host system.

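NUMA is a simple on/off option per VM; on the command line (VM ID assumed):

----
# qm set 120 --numa 1
----
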
vCPU hot-plug
^^^^^^^^^^^^^

Modern operating systems introduced the capability to hot-plug and, to a
certain extent, hot-unplug CPUs in a running system. Virtualization allows us
to avoid a lot of the (physical) problems real hardware can cause in such
scenarios.
Still, this is a rather new and complicated feature, so its use should be
restricted to cases where it's absolutely needed. Most of the functionality can
be replicated with other, well tested and less complicated, features, see
xref:qm_cpu_resource_limits[Resource Limits].

In {pve} the maximal number of plugged CPUs is always `cores * sockets`.
To start a VM with less than this total core count of CPUs you may use the
*vcpus* setting; it denotes how many vCPUs should be plugged in at VM start.

Currently this feature is only supported on Linux; a kernel newer than 3.10
is needed, and a kernel newer than 4.7 is recommended.

You can use a udev rule as follows to automatically set new CPUs as online in
the guest:

----
SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}="1"
----

Save this under /etc/udev/rules.d/ as a file ending in `.rules`.

Note: CPU hot-remove is machine dependent and requires guest cooperation. The
deletion command does not guarantee CPU removal to actually happen, typically
it's a request forwarded to the guest OS using a target dependent mechanism, such as
ACPI on x86/amd64.

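For example, a VM configured with 4 cores but starting with only 2 plugged
vCPUs (VM ID assumed):

----
# qm set 120 --cores 4 --vcpus 2
----
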

[[qm_memory]]
Memory
~~~~~~

For each VM you have the option to set a fixed size memory or ask
{pve} to dynamically allocate memory based on the current RAM usage of the
host.

.Fixed Memory Allocation
[thumbnail="screenshot/gui-create-vm-memory.png"]

When setting memory and minimum memory to the same amount,
{pve} will simply allocate what you specify to your VM.

Even when using a fixed memory size, the ballooning device gets added to the
VM, because it delivers useful information such as how much memory the guest
really uses.
In general, you should leave *ballooning* enabled, but if you want to disable
it (like for debugging purposes), simply uncheck *Ballooning Device* or set

 balloon: 0

in the configuration.

.Automatic Memory Allocation

// see autoballoon() in pvestatd.pm
When setting the minimum memory lower than memory, {pve} will make sure that the
minimum amount you specified is always available to the VM, and if RAM usage on
the host is below 80%, will dynamically add memory to the guest up to the
maximum memory specified.

When the host is running low on RAM, the VM will then release some memory
back to the host, swapping running processes if needed and starting the oom
killer as a last resort. The passing around of memory between host and guest is
done via a special `balloon` kernel driver running inside the guest, which will
grab or release memory pages from the host.
footnote:[A good explanation of the inner workings of the balloon driver can be found here https://rwmj.wordpress.com/2010/07/17/virtio-balloon/]

When multiple VMs use the autoallocate facility, it is possible to set a
*Shares* coefficient which indicates the relative amount of the free host memory
that each VM should take. Suppose for instance you have four VMs, three of them
running an HTTP server and the last one is a database server. To cache more
database blocks in the database server RAM, you would like to prioritize the
database VM when spare RAM is available. For this you assign a Shares property
of 3000 to the database VM, leaving the other VMs to the Shares default setting
of 1000. The host server has 32GB of RAM, and is currently using 16GB, leaving
32 * 80/100 - 16 = 9.6GB RAM to be allocated to the VMs on top of their configured
minimum memory amount. The database VM will benefit from 9.6 * 3000 / (3000 +
1000 + 1000 + 1000) = 4.8 GB extra RAM and each HTTP server from 1.6 GB.

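A sketch of the database VM from the example above, with assumed VM ID and
sizes (16GB maximum memory, 4GB minimum memory, Shares of 3000):

----
# qm set 120 --memory 16384 --balloon 4096 --shares 3000
----
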
All Linux distributions released after 2010 have the balloon kernel driver
included. For Windows OSes, the balloon driver needs to be added manually and can
incur a slowdown of the guest, so we don't recommend using it on critical
systems.
// see https://forum.proxmox.com/threads/solved-hyper-threading-vs-no-hyper-threading-fixed-vs-variable-memory.20265/

When allocating RAM to your VMs, a good rule of thumb is always to leave 1GB
of RAM available to the host.


[[qm_network_device]]
Network Device
~~~~~~~~~~~~~~

[thumbnail="screenshot/gui-create-vm-network.png"]

Each VM can have many _Network interface controllers_ (NIC), of four different
types:

* *Intel E1000* is the default, and emulates an Intel Gigabit network card.
* the *VirtIO* paravirtualized NIC should be used if you aim for maximum
performance. Like all VirtIO devices, the guest OS should have the proper driver
installed.
* the *Realtek 8139* emulates an older 100 MB/s network card, and should
only be used when emulating older operating systems (released before 2002).
* the *vmxnet3* is another paravirtualized device, which should only be used
when importing a VM from another hypervisor.

{pve} will generate for each NIC a random *MAC address*, so that your VM is
addressable on Ethernet networks.

The NIC you added to the VM can follow one of two different models:

* in the default *Bridged mode* each virtual NIC is backed on the host by a
_tap device_ (a software loopback device simulating an Ethernet NIC). This
tap device is added to a bridge, by default vmbr0 in {pve}. In this mode, VMs
have direct access to the Ethernet LAN on which the host is located.
* in the alternative *NAT mode*, each virtual NIC will only communicate with
the QEMU user networking stack, where a built-in router and DHCP server can
provide network access. This built-in DHCP will serve addresses in the private
10.0.2.0/24 range. The NAT mode is much slower than the bridged mode, and
should only be used for testing. This mode is only available via CLI or the API,
but not via the web UI.

You can also skip adding a network device when creating a VM by selecting *No
network device*.

You can overwrite the *MTU* setting for each VM network device. The option
`mtu=1` represents a special case, in which the MTU value will be inherited
from the underlying bridge.
This option is only available for *VirtIO* network devices.

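For example, adding a VirtIO NIC on bridge vmbr0 that inherits the bridge MTU
could look like this (VM ID assumed):

----
# qm set 120 --net0 virtio,bridge=vmbr0,mtu=1
----
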
.Multiqueue
If you are using the VirtIO driver, you can optionally activate the
*Multiqueue* option. This option allows the guest OS to process networking
packets using multiple virtual CPUs, providing an increase in the total number
of packets transferred.

//http://blog.vmsplice.net/2011/09/qemu-internals-vhost-architecture.html
When using the VirtIO driver with {pve}, each NIC network queue is passed to the
host kernel, where the queue will be processed by a kernel thread spawned by the
vhost driver. With this option activated, it is possible to pass _multiple_
network queues to the host kernel for each NIC.

//https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Tuning_and_Optimization_Guide/sect-Virtualization_Tuning_Optimization_Guide-Networking-Techniques.html#sect-Virtualization_Tuning_Optimization_Guide-Networking-Multi-queue_virtio-net
When using Multiqueue, it is recommended to set it to a value equal to the
number of vCPUs of your guest. Remember that the number of vCPUs is the number
of sockets times the number of cores configured for the VM. You also need to set
the number of multi-purpose channels on each VirtIO NIC in the VM with this
ethtool command:

`ethtool -L ens1 combined X`

where X is the number of vCPUs of the VM.

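The queue count itself is part of the NIC definition; a sketch for a guest
with 4 vCPUs (VM ID assumed):

----
# qm set 120 --net0 virtio,bridge=vmbr0,queues=4
----
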
To configure a Windows guest for Multiqueue install the
https://pve.proxmox.com/wiki/Windows_VirtIO_Drivers[Redhat VirtIO Ethernet
Adapter drivers], then adapt the NIC's configuration as follows. Open the
device manager, right click the NIC under "Network adapters", and select
"Properties". Then open the "Advanced" tab and select "Receive Side Scaling"
from the list on the left. Make sure it is set to "Enabled". Next, navigate to
"Maximum number of RSS Queues" in the list and set it to the number of vCPUs of
your VM. Once you have verified that the settings are correct, click "OK" to confirm
them.

You should note that setting the Multiqueue parameter to a value greater
than one will increase the CPU load on the host and guest systems as the
traffic increases. We recommend to set this option only when the VM has to
process a great number of incoming connections, such as when the VM is running
as a router, reverse proxy or a busy HTTP server doing long polling.

[[qm_display]]
Display
~~~~~~~

QEMU can virtualize a few types of VGA hardware. Some examples are:

* *std*, the default, emulates a card with Bochs VBE extensions.
* *cirrus*, this was once the default, it emulates a very old hardware module
with all its problems. This display type should only be used if really
necessary footnote:[https://www.kraxel.org/blog/2014/10/qemu-using-cirrus-considered-harmful/
qemu: using cirrus considered harmful], for example, if using Windows XP or
earlier.
* *vmware*, is a VMWare SVGA-II compatible adapter.
* *qxl*, is the QXL paravirtualized graphics card. Selecting this also
enables https://www.spice-space.org/[SPICE] (a remote viewer protocol) for the
VM.
* *virtio-gl*, often named VirGL, is a virtual 3D GPU for use inside VMs that
can offload workloads to the host GPU without requiring special (expensive)
models and drivers, and without binding the host GPU completely, allowing
reuse between multiple guests and/or the host.
+
NOTE: VirGL support needs some extra libraries that aren't installed by
default due to being relatively big and also not available as open source for
all GPU models/vendors. For most setups you'll just need to do:
`apt install libgl1 libegl1`

You can edit the amount of memory given to the virtual GPU, by setting
the 'memory' option. This can enable higher resolutions inside the VM,
especially with SPICE/QXL.

As the memory is reserved by the display device, selecting Multi-Monitor mode
for SPICE (such as `qxl2` for dual monitors) has some implications:

* Windows needs a device for each monitor, so if your 'ostype' is some
version of Windows, {pve} gives the VM an extra device per monitor.
Each device gets the specified amount of memory.

* Linux VMs can always enable more virtual monitors, but selecting
a Multi-Monitor mode multiplies the memory given to the device with
the number of monitors.

Selecting `serialX` as display 'type' disables the VGA output, and redirects
the Web Console to the selected serial port. A configured display 'memory'
setting will be ignored in that case.

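For example, selecting the SPICE/QXL display with 32 MiB of video memory
(VM ID assumed):

----
# qm set 120 --vga qxl,memory=32
----
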
.VNC clipboard
You can enable the VNC clipboard by setting `clipboard` to `vnc`.

----
# qm set <vmid> -vga <displaytype>,clipboard=vnc
----

In order to use the clipboard feature, you must first install the
SPICE guest tools. On Debian-based distributions, this can be achieved
by installing `spice-vdagent`. For other Operating Systems search for it
in the official repositories or see: https://www.spice-space.org/download.html

Once you have installed the spice guest tools, you can use the VNC clipboard
function (e.g. in the noVNC console panel). However, if you're using
SPICE, virtio or virgl, you'll need to choose which clipboard to use.
This is because the default *SPICE* clipboard will be replaced by the
*VNC* clipboard, if `clipboard` is set to `vnc`.

[[qm_usb_passthrough]]
USB Passthrough
~~~~~~~~~~~~~~~

There are two different types of USB passthrough devices:

* Host USB passthrough
* SPICE USB passthrough

Host USB passthrough works by giving a VM a USB device of the host.
This can either be done via the vendor- and product-id, or
via the host bus and port.

The vendor/product-id looks like this: *0123:abcd*,
where *0123* is the id of the vendor, and *abcd* is the id
of the product, meaning two pieces of the same usb device
have the same id.

The bus/port looks like this: *1-2.3.4*, where *1* is the bus
and *2.3.4* is the port path. This represents the physical
ports of your host (depending on the internal order of the
usb controllers).

If a device is present in a VM configuration when the VM starts up,
but the device is not present in the host, the VM can boot without problems.
As soon as the device/port is available in the host, it gets passed through.

WARNING: Using this kind of USB passthrough means that you cannot move
a VM online to another host, since the hardware is only available
on the host the VM is currently residing on.

The second type of passthrough is SPICE USB passthrough. If you add one or more
SPICE USB ports to your VM, you can dynamically pass a local USB device from
your SPICE client through to the VM. This can be useful to redirect an input
device or hardware dongle temporarily.

It is also possible to map devices on a cluster level, so that they can be
properly used with HA and hardware changes are detected and non root users
can configure them. See xref:resource_mapping[Resource Mapping]
for details on that.

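For example, passing a device through either by vendor/product-id or by
bus/port (VM ID and ids assumed):

----
# qm set 120 --usb0 host=0123:abcd
# qm set 120 --usb1 host=1-2.3.4
----
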
903 | [[qm_bios_and_uefi]] | |
076d60ae DC |
904 | BIOS and UEFI |
905 | ~~~~~~~~~~~~~ | |
906 | ||
907 | In order to properly emulate a computer, QEMU needs to use a firmware. | |
55ce3375 TL |
908 | Which, on common PCs often known as BIOS or (U)EFI, is executed as one of the |
909 | first steps when booting a VM. It is responsible for doing basic hardware | |
910 | initialization and for providing an interface to the firmware and hardware for | |
911 | the operating system. By default QEMU uses *SeaBIOS* for this, which is an | |
912 | open-source, x86 BIOS implementation. SeaBIOS is a good choice for most | |
913 | standard setups. | |
076d60ae | 914 | |
8e5720fd | 915 | Some operating systems (such as Windows 11) may require use of an UEFI |
58e695ca | 916 | compatible implementation. In such cases, you must use *OVMF* instead, |
8e5720fd SR |
917 | which is an open-source UEFI implementation. footnote:[See the OVMF Project https://github.com/tianocore/tianocore.github.io/wiki/OVMF] |
918 | ||
d6466262 TL |
919 | There are other scenarios in which the SeaBIOS may not be the ideal firmware to |
920 | boot from, for example if you want to do VGA passthrough. footnote:[Alex | |
921 | Williamson has a good blog entry about this | |
922 | https://vfio.blogspot.co.at/2014/08/primary-graphics-assignment-without-vga.html] | |
076d60ae DC |
923 | |
924 | If you want to use OVMF, there are several things to consider: | |
925 | ||
926 | In order to save things like the *boot order*, there needs to be an EFI Disk. | |
927 | This disk will be included in backups and snapshots, and there can only be one. | |
928 | ||
929 | You can create such a disk with the following command: | |
930 | ||
32e8b5b2 AL |
931 | ---- |
932 | # qm set <vmid> -efidisk0 <storage>:1,format=<format>,efitype=4m,pre-enrolled-keys=1 | |
933 | ---- | |
076d60ae DC |
934 | |
935 | Where *<storage>* is the storage where you want to have the disk, and | |
936 | *<format>* is a format which the storage supports. Alternatively, you can | |
937 | create such a disk through the web interface with 'Add' -> 'EFI Disk' in the | |
938 | hardware section of a VM. | |
939 | ||
8e5720fd SR |
940 | The *efitype* option specifies which version of the OVMF firmware should be |
941 | used. For new VMs, this should always be '4m', as it supports Secure Boot and | |
942 | has more space allocated to support future development (this is the default in | |
943 | the GUI). | |
944 | ||
945 | *pre-enroll-keys* specifies if the efidisk should come pre-loaded with | |
946 | distribution-specific and Microsoft Standard Secure Boot keys. It also enables | |
947 | Secure Boot by default (though it can still be disabled in the OVMF menu within | |
948 | the VM). | |
949 | ||
950 | NOTE: If you want to start using Secure Boot in an existing VM (that still uses | |
951 | a '2m' efidisk), you need to recreate the efidisk. To do so, delete the old one | |
952 | (`qm set <vmid> -delete efidisk0`) and add a new one as described above. This | |
953 | will reset any custom configurations you have made in the OVMF menu! | |
954 | ||
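For example, the following sketch switches an existing VM to a '4m' efidisk
with pre-enrolled keys; `<storage>` is a placeholder and, as noted above, the
old EFI variables are discarded:

----
# qm set <vmid> -delete efidisk0
# qm set <vmid> -efidisk0 <storage>:1,efitype=4m,pre-enrolled-keys=1
----
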
076d60ae | 955 | When using OVMF with a virtual display (without VGA passthrough), |
8e5720fd | 956 | you need to set the client resolution in the OVMF menu (which you can reach |
076d60ae DC |
957 | with a press of the ESC button during boot), or you have to choose |
958 | SPICE as the display type. | |
959 | ||
95e8e1b7 SR |
960 | [[qm_tpm]] |
961 | Trusted Platform Module (TPM) | |
962 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
963 | ||
964 | A *Trusted Platform Module* is a device which stores secret data - such as | |
965 | encryption keys - securely and provides tamper-resistance functions for | |
966 | validating system boot. | |
967 | ||
d6466262 TL |
968 | Certain operating systems (such as Windows 11) require such a device to be |
969 | attached to a machine (be it physical or virtual). | |
95e8e1b7 SR |
970 | |
A TPM is added by specifying a *tpmstate* volume. This works similarly to an
972 | efidisk, in that it cannot be changed (only removed) once created. You can add | |
973 | one via the following command: | |
974 | ||
32e8b5b2 AL |
975 | ---- |
976 | # qm set <vmid> -tpmstate0 <storage>:1,version=<version> | |
977 | ---- | |
95e8e1b7 SR |
978 | |
979 | Where *<storage>* is the storage you want to put the state on, and *<version>* | |
980 | is either 'v1.2' or 'v2.0'. You can also add one via the web interface, by | |
981 | choosing 'Add' -> 'TPM State' in the hardware section of a VM. | |
982 | ||
983 | The 'v2.0' TPM spec is newer and better supported, so unless you have a specific | |
984 | implementation that requires a 'v1.2' TPM, it should be preferred. | |
985 | ||
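For example, to add a 'v2.0' TPM state volume on a storage named `local-lvm`
(the storage name is only an assumption, adjust it to your setup):

----
# qm set <vmid> -tpmstate0 local-lvm:1,version=v2.0
----
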
986 | NOTE: Compared to a physical TPM, an emulated one does *not* provide any real | |
987 | security benefits. The point of a TPM is that the data on it cannot be modified | |
988 | easily, except via commands specified as part of the TPM spec. Since with an | |
989 | emulated device the data storage happens on a regular volume, it can potentially | |
990 | be edited by anyone with access to it. | |
991 | ||
0ad30983 DC |
992 | [[qm_ivshmem]] |
993 | Inter-VM shared memory | |
994 | ~~~~~~~~~~~~~~~~~~~~~~ | |
995 | ||
8861c7ad TL |
You can add an Inter-VM shared memory device (`ivshmem`), which allows one to
share memory between the host and a guest, or between multiple guests.
0ad30983 DC |
998 | |
999 | To add such a device, you can use `qm`: | |
1000 | ||
32e8b5b2 AL |
1001 | ---- |
1002 | # qm set <vmid> -ivshmem size=32,name=foo | |
1003 | ---- | |
0ad30983 DC |
1004 | |
1005 | Where the size is in MiB. The file will be located under | |
1006 | `/dev/shm/pve-shm-$name` (the default name is the vmid). | |
1007 | ||
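For example, after adding a named device and starting the VM, the backing file
can be inspected on the host (the name `foo` is just an example):

----
# qm set 100 -ivshmem size=32,name=foo
# qm start 100
# ls -lh /dev/shm/pve-shm-foo
----
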
4d1a19eb TL |
NOTE: Currently the device gets deleted as soon as any VM using it is shut
down or stopped. Open connections will still persist, but new connections
to the exact same device cannot be made anymore.
1011 | ||
8861c7ad | 1012 | A use case for such a device is the Looking Glass |
451bb75f SR |
1013 | footnote:[Looking Glass: https://looking-glass.io/] project, which enables high |
1014 | performance, low-latency display mirroring between host and guest. | |
0ad30983 | 1015 | |
ca8c3009 AL |
1016 | [[qm_audio_device]] |
1017 | Audio Device | |
1018 | ~~~~~~~~~~~~ | |
1019 | ||
1020 | To add an audio device run the following command: | |
1021 | ||
1022 | ---- | |
1023 | qm set <vmid> -audio0 device=<device> | |
1024 | ---- | |
1025 | ||
1026 | Supported audio devices are: | |
1027 | ||
1028 | * `ich9-intel-hda`: Intel HD Audio Controller, emulates ICH9 | |
1029 | * `intel-hda`: Intel HD Audio Controller, emulates ICH6 | |
1030 | * `AC97`: Audio Codec '97, useful for older operating systems like Windows XP | |
1031 | ||
cf41761d AL |
1032 | There are two backends available: |
1033 | ||
1034 | * 'spice' | |
1035 | * 'none' | |
1036 | ||
1037 | The 'spice' backend can be used in combination with xref:qm_display[SPICE] while | |
1038 | the 'none' backend can be useful if an audio device is needed in the VM for some | |
1039 | software to work. To use the physical audio device of the host use device | |
1040 | passthrough (see xref:qm_pci_passthrough[PCI Passthrough] and | |
1041 | xref:qm_usb_passthrough[USB Passthrough]). Remote protocols like Microsoft’s RDP | |
1042 | have options to play sound. | |
1043 | ||
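For example, adding an emulated ICH9 HD Audio controller with the 'spice'
backend (the backend is selected via the `driver` sub-option):

----
# qm set <vmid> -audio0 device=ich9-intel-hda,driver=spice
----
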
ca8c3009 | 1044 | |
adb2c91d SR |
1045 | [[qm_virtio_rng]] |
1046 | VirtIO RNG | |
1047 | ~~~~~~~~~~ | |
1048 | ||
1049 | A RNG (Random Number Generator) is a device providing entropy ('randomness') to | |
1050 | a system. A virtual hardware-RNG can be used to provide such entropy from the | |
1051 | host system to a guest VM. This helps to avoid entropy starvation problems in | |
1052 | the guest (a situation where not enough entropy is available and the system may | |
slow down or run into problems), especially during the guest's boot process.
1054 | ||
1055 | To add a VirtIO-based emulated RNG, run the following command: | |
1056 | ||
1057 | ---- | |
1058 | qm set <vmid> -rng0 source=<source>[,max_bytes=X,period=Y] | |
1059 | ---- | |
1060 | ||
1061 | `source` specifies where entropy is read from on the host and has to be one of | |
1062 | the following: | |
1063 | ||
1064 | * `/dev/urandom`: Non-blocking kernel entropy pool (preferred) | |
1065 | * `/dev/random`: Blocking kernel pool (not recommended, can lead to entropy | |
1066 | starvation on the host system) | |
1067 | * `/dev/hwrng`: To pass through a hardware RNG attached to the host (if multiple | |
1068 | are available, the one selected in | |
1069 | `/sys/devices/virtual/misc/hw_random/rng_current` will be used) | |
1070 | ||
1071 | A limit can be specified via the `max_bytes` and `period` parameters, they are | |
1072 | read as `max_bytes` per `period` in milliseconds. However, it does not represent | |
1073 | a linear relationship: 1024B/1000ms would mean that up to 1 KiB of data becomes | |
1074 | available on a 1 second timer, not that 1 KiB is streamed to the guest over the | |
1075 | course of one second. Reducing the `period` can thus be used to inject entropy | |
1076 | into the guest at a faster rate. | |
1077 | ||
1078 | By default, the limit is set to 1024 bytes per 1000 ms (1 KiB/s). It is | |
1079 | recommended to always use a limiter to avoid guests using too many host | |
1080 | resources. If desired, a value of '0' for `max_bytes` can be used to disable | |
1081 | all limits. | |
1082 | ||
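For example, to add an RNG device fed from `/dev/urandom` with the default
limit written out explicitly:

----
# qm set <vmid> -rng0 source=/dev/urandom,max_bytes=1024,period=1000
----
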
777cf894 | 1083 | [[qm_bootorder]] |
8cd6f474 TL |
1084 | Device Boot Order |
1085 | ~~~~~~~~~~~~~~~~~ | |
777cf894 SR |
1086 | |
1087 | QEMU can tell the guest which devices it should boot from, and in which order. | |
d6466262 | 1088 | This can be specified in the config via the `boot` property, for example: |
777cf894 SR |
1089 | |
1090 | ---- | |
1091 | boot: order=scsi0;net0;hostpci0 | |
1092 | ---- | |
1093 | ||
1094 | [thumbnail="screenshot/gui-qemu-edit-bootorder.png"] | |
1095 | ||
This way, the guest would first attempt to boot from the disk `scsi0`. If that
fails, it would go on to attempt network boot from `net0`, and if that fails
too, it would finally attempt to boot from a passed-through PCIe device (seen
as a disk in case of NVMe, otherwise it tries to launch into an option ROM).
1100 | ||
1101 | On the GUI you can use a drag-and-drop editor to specify the boot order, and use | |
1102 | the checkbox to enable or disable certain devices for booting altogether. | |
1103 | ||
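On the CLI, the same order can be set with `qm set`; quote the value so that
the shell does not interpret the semicolons:

----
# qm set <vmid> --boot 'order=scsi0;net0;hostpci0'
----
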
1104 | NOTE: If your guest uses multiple disks to boot the OS or load the bootloader, | |
1105 | all of them must be marked as 'bootable' (that is, they must have the checkbox | |
1106 | enabled or appear in the list in the config) for the guest to be able to boot. | |
1107 | This is because recent SeaBIOS and OVMF versions only initialize disks if they | |
1108 | are marked 'bootable'. | |
1109 | ||
In any case, even devices that do not appear in the list or have the checkmark
disabled will still be available to the guest once its operating system has
booted and initialized them. The 'bootable' flag only affects the guest BIOS and
bootloader.
1114 | ||
1115 | ||
288e3f46 EK |
1116 | [[qm_startup_and_shutdown]] |
1117 | Automatic Start and Shutdown of Virtual Machines | |
1118 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
1119 | ||
1120 | After creating your VMs, you probably want them to start automatically | |
1121 | when the host system boots. For this you need to select the option 'Start at | |
1122 | boot' from the 'Options' Tab of your VM in the web interface, or set it with | |
1123 | the following command: | |
1124 | ||
32e8b5b2 AL |
1125 | ---- |
1126 | # qm set <vmid> -onboot 1 | |
1127 | ---- | |
288e3f46 | 1128 | |
4dbeb548 DM |
1129 | .Start and Shutdown Order |
1130 | ||
1ff5e4e8 | 1131 | [thumbnail="screenshot/gui-qemu-edit-start-order.png"] |
4dbeb548 DM |
1132 | |
In some cases you want to be able to fine tune the boot order of your
VMs, for instance if one of your VMs is providing firewalling or DHCP
to other guest systems. For this you can use the following
parameters:
288e3f46 | 1137 | |
d6466262 | 1138 | * *Start/Shutdown order*: Defines the start order priority. For example, set it |
5afa9371 FG |
1139 | to 1 if you want the VM to be the first to be started. (We use the reverse |
1140 | startup order for shutdown, so a machine with a start order of 1 would be the | |
1141 | last to be shut down). If multiple VMs have the same order defined on a host, | |
1142 | they will additionally be ordered by 'VMID' in ascending order. | |
288e3f46 | 1143 | * *Startup delay*: Defines the interval between this VM start and subsequent |
d6466262 TL |
1144 | VMs starts. For example, set it to 240 if you want to wait 240 seconds before |
1145 | starting other VMs. | |
288e3f46 | 1146 | * *Shutdown timeout*: Defines the duration in seconds {pve} should wait |
d6466262 TL |
1147 | for the VM to be offline after issuing a shutdown command. By default this |
1148 | value is set to 180, which means that {pve} will issue a shutdown request and | |
1149 | wait 180 seconds for the machine to be offline. If the machine is still online | |
1150 | after the timeout it will be stopped forcefully. | |
288e3f46 | 1151 | |
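These settings map to the `startup` VM property. For example, a sketch setting
a start order of 1, a 30 second startup delay and a 60 second shutdown timeout
on the CLI:

----
# qm set <vmid> -startup order=1,up=30,down=60
----
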
2b2c6286 TL |
1152 | NOTE: VMs managed by the HA stack do not follow the 'start on boot' and |
1153 | 'boot order' options currently. Those VMs will be skipped by the startup and | |
1154 | shutdown algorithm as the HA manager itself ensures that VMs get started and | |
1155 | stopped. | |
1156 | ||
288e3f46 | 1157 | Please note that machines without a Start/Shutdown order parameter will always |
7eed72d8 | 1158 | start after those where the parameter is set. Further, this parameter can only |
d750c851 | 1159 | be enforced between virtual machines running on the same host, not |
288e3f46 | 1160 | cluster-wide. |
076d60ae | 1161 | |
0f7778ac DW |
1162 | If you require a delay between the host boot and the booting of the first VM, |
1163 | see the section on xref:first_guest_boot_delay[Proxmox VE Node Management]. | |
1164 | ||
c0f039aa AL |
1165 | |
1166 | [[qm_qemu_agent]] | |
c730e973 | 1167 | QEMU Guest Agent |
c0f039aa AL |
1168 | ~~~~~~~~~~~~~~~~ |
1169 | ||
c730e973 | 1170 | The QEMU Guest Agent is a service which runs inside the VM, providing a |
c0f039aa AL |
1171 | communication channel between the host and the guest. It is used to exchange |
1172 | information and allows the host to issue commands to the guest. | |
1173 | ||
1174 | For example, the IP addresses in the VM summary panel are fetched via the guest | |
1175 | agent. | |
1176 | ||
1177 | Or when starting a backup, the guest is told via the guest agent to sync | |
1178 | outstanding writes via the 'fs-freeze' and 'fs-thaw' commands. | |
1179 | ||
1180 | For the guest agent to work properly the following steps must be taken: | |
1181 | ||
1182 | * install the agent in the guest and make sure it is running | |
1183 | * enable the communication via the agent in {pve} | |
1184 | ||
1185 | Install Guest Agent | |
1186 | ^^^^^^^^^^^^^^^^^^^ | |
1187 | ||
1188 | For most Linux distributions, the guest agent is available. The package is | |
1189 | usually named `qemu-guest-agent`. | |
1190 | ||
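For example, on a Debian or Ubuntu based guest it can usually be installed and
started like this (package and service names may differ on other
distributions):

----
# apt install qemu-guest-agent
# systemctl enable --now qemu-guest-agent
----
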
1191 | For Windows, it can be installed from the | |
1192 | https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/stable-virtio/virtio-win.iso[Fedora | |
1193 | VirtIO driver ISO]. | |
1194 | ||
80df0d2e | 1195 | [[qm_qga_enable]] |
c0f039aa AL |
1196 | Enable Guest Agent Communication |
1197 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
1198 | ||
1199 | Communication from {pve} with the guest agent can be enabled in the VM's | |
1200 | *Options* panel. A fresh start of the VM is necessary for the changes to take | |
1201 | effect. | |
1202 | ||
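Alternatively, the agent can be enabled on the CLI:

----
# qm set <vmid> --agent enabled=1
----
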
80df0d2e TL |
1203 | [[qm_qga_auto_trim]] |
1204 | Automatic TRIM Using QGA | |
1205 | ^^^^^^^^^^^^^^^^^^^^^^^^ | |
1206 | ||
c0f039aa AL |
1207 | It is possible to enable the 'Run guest-trim' option. With this enabled, |
1208 | {pve} will issue a trim command to the guest after the following | |
1209 | operations that have the potential to write out zeros to the storage: | |
1210 | ||
1211 | * moving a disk to another storage | |
1212 | * live migrating a VM to another node with local storage | |
1213 | ||
1214 | On a thin provisioned storage, this can help to free up unused space. | |
1215 | ||
95117b6c FE |
1216 | NOTE: There is a caveat with ext4 on Linux, because it uses an in-memory |
1217 | optimization to avoid issuing duplicate TRIM requests. Since the guest doesn't | |
1218 | know about the change in the underlying storage, only the first guest-trim will | |
1219 | run as expected. Subsequent ones, until the next reboot, will only consider | |
1220 | parts of the filesystem that changed since then. | |
1221 | ||
80df0d2e | 1222 | [[qm_qga_fsfreeze]] |
62bf5d75 CH |
1223 | Filesystem Freeze & Thaw on Backup |
1224 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
1225 | ||
1226 | By default, guest filesystems are synced via the 'fs-freeze' QEMU Guest Agent | |
1227 | Command when a backup is performed, to provide consistency. | |
1228 | ||
1229 | On Windows guests, some applications might handle consistent backups themselves | |
1230 | by hooking into the Windows VSS (Volume Shadow Copy Service) layer, a | |
1231 | 'fs-freeze' then might interfere with that. For example, it has been observed | |
1232 | that calling 'fs-freeze' with some SQL Servers triggers VSS to call the SQL | |
1233 | Writer VSS module in a mode that breaks the SQL Server backup chain for | |
1234 | differential backups. | |
1235 | ||
1236 | For such setups you can configure {pve} to not issue a freeze-and-thaw cycle on | |
266dd87d CH |
1237 | backup by setting the `freeze-fs-on-backup` QGA option to `0`. This can also be |
1238 | done via the GUI with the 'Freeze/thaw guest filesystems on backup for | |
1239 | consistency' option. | |
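For example (note that setting the `agent` property replaces the whole option
string, so include any other agent sub-options you already use):

----
# qm set <vmid> --agent enabled=1,freeze-fs-on-backup=0
----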
62bf5d75 | 1240 | |
80df0d2e | 1241 | IMPORTANT: Disabling this option can potentially lead to backups with inconsistent |
62bf5d75 CH |
1242 | filesystems and should therefore only be disabled if you know what you are |
1243 | doing. | |
1244 | ||
c0f039aa AL |
1245 | Troubleshooting |
1246 | ^^^^^^^^^^^^^^^ | |
1247 | ||
1248 | .VM does not shut down | |
1249 | ||
1250 | Make sure the guest agent is installed and running. | |
1251 | ||
1252 | Once the guest agent is enabled, {pve} will send power commands like | |
1253 | 'shutdown' via the guest agent. If the guest agent is not running, commands | |
1254 | cannot get executed properly and the shutdown command will run into a timeout. | |
1255 | ||
22a0091c AL |
1256 | [[qm_spice_enhancements]] |
1257 | SPICE Enhancements | |
1258 | ~~~~~~~~~~~~~~~~~~ | |
1259 | ||
1260 | SPICE Enhancements are optional features that can improve the remote viewer | |
1261 | experience. | |
1262 | ||
1263 | To enable them via the GUI go to the *Options* panel of the virtual machine. Run | |
1264 | the following command to enable them via the CLI: | |
1265 | ||
1266 | ---- | |
1267 | qm set <vmid> -spice_enhancements foldersharing=1,videostreaming=all | |
1268 | ---- | |
1269 | ||
1270 | NOTE: To use these features the <<qm_display,*Display*>> of the virtual machine | |
1271 | must be set to SPICE (qxl). | |
1272 | ||
1273 | Folder Sharing | |
1274 | ^^^^^^^^^^^^^^ | |
1275 | ||
1276 | Share a local folder with the guest. The `spice-webdavd` daemon needs to be | |
1277 | installed in the guest. It makes the shared folder available through a local | |
1278 | WebDAV server located at http://localhost:9843. | |
1279 | ||
1280 | For Windows guests the installer for the 'Spice WebDAV daemon' can be downloaded | |
1281 | from the | |
1282 | https://www.spice-space.org/download.html#windows-binaries[official SPICE website]. | |
1283 | ||
1284 | Most Linux distributions have a package called `spice-webdavd` that can be | |
1285 | installed. | |
1286 | ||
1287 | To share a folder in Virt-Viewer (Remote Viewer) go to 'File -> Preferences'. | |
1288 | Select the folder to share and then enable the checkbox. | |
1289 | ||
1290 | NOTE: Folder sharing currently only works in the Linux version of Virt-Viewer. | |
1291 | ||
0dcd22f5 AL |
1292 | CAUTION: Experimental! Currently this feature does not work reliably. |
1293 | ||
22a0091c AL |
1294 | Video Streaming |
1295 | ^^^^^^^^^^^^^^^ | |
1296 | ||
1297 | Fast refreshing areas are encoded into a video stream. Two options exist: | |
1298 | ||
1299 | * *all*: Any fast refreshing area will be encoded into a video stream. | |
1300 | * *filter*: Additional filters are used to decide if video streaming should be | |
1301 | used (currently only small window surfaces are skipped). | |
1302 | ||
A general recommendation on whether video streaming should be enabled, and
which option to choose, cannot be given. Your mileage may vary depending on the
specific circumstances.
1306 | ||
1307 | Troubleshooting | |
1308 | ^^^^^^^^^^^^^^^ | |
1309 | ||
19a58e02 | 1310 | .Shared folder does not show up |
22a0091c AL |
1311 | |
1312 | Make sure the WebDAV service is enabled and running in the guest. On Windows it | |
is called 'Spice webdav proxy'. In Linux the name is 'spice-webdavd' but it can be
1314 | different depending on the distribution. | |
1315 | ||
1316 | If the service is running, check the WebDAV server by opening | |
1317 | http://localhost:9843 in a browser in the guest. | |
1318 | ||
1319 | It can help to restart the SPICE session. | |
c73c190f DM |
1320 | |
1321 | [[qm_migration]] | |
1322 | Migration | |
1323 | --------- | |
1324 | ||
1ff5e4e8 | 1325 | [thumbnail="screenshot/gui-qemu-migrate.png"] |
e4bcef0a | 1326 | |
c73c190f DM |
1327 | If you have a cluster, you can migrate your VM to another host with |
1328 | ||
32e8b5b2 AL |
1329 | ---- |
1330 | # qm migrate <vmid> <target> | |
1331 | ---- | |
c73c190f | 1332 | |
8df8cfb7 DC |
1333 | There are generally two mechanisms for this |
1334 | ||
1335 | * Online Migration (aka Live Migration) | |
1336 | * Offline Migration | |
1337 | ||
1338 | Online Migration | |
1339 | ~~~~~~~~~~~~~~~~ | |
1340 | ||
If your VM is running and no locally bound resources are configured (such as
devices that are passed through), you can initiate a live migration with the
`--online` flag in the `qm migrate` command invocation. The web interface
defaults to live migration when the VM is running.
c73c190f | 1345 | |
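For example, to live migrate a running VM to another node in the cluster:

----
# qm migrate <vmid> <target> --online
----
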
8df8cfb7 DC |
1346 | How it works |
1347 | ^^^^^^^^^^^^ | |
1348 | ||
27780834 TL |
1349 | Online migration first starts a new QEMU process on the target host with the |
1350 | 'incoming' flag, which performs only basic initialization with the guest vCPUs | |
1351 | still paused and then waits for the guest memory and device state data streams | |
1352 | of the source Virtual Machine. | |
All other resources, such as disks, are either shared or were already sent
before the runtime state migration of the VM begins; so only the memory content
and device state remain to be transferred.
1356 | ||
1357 | Once this connection is established, the source begins asynchronously sending | |
1358 | the memory content to the target. If the guest memory on the source changes, | |
1359 | those sections are marked dirty and another pass is made to send the guest | |
1360 | memory data. | |
1361 | This loop is repeated until the data difference between running source VM | |
1362 | and incoming target VM is small enough to be sent in a few milliseconds, | |
1363 | because then the source VM can be paused completely, without a user or program | |
1364 | noticing the pause, so that the remaining data can be sent to the target, and | |
then unpause the target VM's CPU to make it the new running VM in well under a
1366 | second. | |
8df8cfb7 DC |
1367 | |
1368 | Requirements | |
1369 | ^^^^^^^^^^^^ | |
1370 | ||
1371 | For Live Migration to work, there are some things required: | |
1372 | ||
27780834 TL |
1373 | * The VM has no local resources that cannot be migrated. For example, |
1374 | PCI or USB devices that are passed through currently block live-migration. | |
1375 | Local Disks, on the other hand, can be migrated by sending them to the target | |
1376 | just fine. | |
1377 | * The hosts are located in the same {pve} cluster. | |
1378 | * The hosts have a working (and reliable) network connection between them. | |
1379 | * The target host must have the same, or higher versions of the | |
1380 | {pve} packages. Although it can sometimes work the other way around, this | |
1381 | cannot be guaranteed. | |
* The hosts have CPUs from the same vendor with similar capabilities. A
  different vendor *might* work depending on the actual models and the
  configured CPU type of the VM, but it cannot be guaranteed - so please test
  before deploying such a setup in production.
8df8cfb7 DC |
1386 | |
1387 | Offline Migration | |
1388 | ~~~~~~~~~~~~~~~~~ | |
1389 | ||
27780834 TL |
If you have local resources, you can still migrate your VMs offline as long as
all disks are on storages which are defined on both hosts.
1392 | Migration then copies the disks to the target host over the network, as with | |
9632a85d | 1393 | online migration. Note that any hardware passthrough configuration may need to |
27780834 TL |
1394 | be adapted to the device location on the target host. |
1395 | ||
1396 | // TODO: mention hardware map IDs as better way to solve that, once available | |
c73c190f | 1397 | |
eeb87f95 DM |
1398 | [[qm_copy_and_clone]] |
1399 | Copies and Clones | |
1400 | ----------------- | |
9e55c76d | 1401 | |
1ff5e4e8 | 1402 | [thumbnail="screenshot/gui-qemu-full-clone.png"] |
9e55c76d DM |
1403 | |
VM installation is usually done using an installation medium (CD-ROM)
from the operating system vendor. Depending on the OS, this can be a
time-consuming task one might want to avoid.
1407 | ||
1408 | An easy way to deploy many VMs of the same type is to copy an existing | |
1409 | VM. We use the term 'clone' for such copies, and distinguish between | |
1410 | 'linked' and 'full' clones. | |
1411 | ||
1412 | Full Clone:: | |
1413 | ||
1414 | The result of such copy is an independent VM. The | |
1415 | new VM does not share any storage resources with the original. | |
1416 | + | |
707e37a2 | 1417 | |
9e55c76d DM |
1418 | It is possible to select a *Target Storage*, so one can use this to |
1419 | migrate a VM to a totally different storage. You can also change the | |
1420 | disk image *Format* if the storage driver supports several formats. | |
1421 | + | |
707e37a2 | 1422 | |
730fbca4 | 1423 | NOTE: A full clone needs to read and copy all VM image data. This is |
9e55c76d | 1424 | usually much slower than creating a linked clone. |
707e37a2 DM |
1425 | + |
1426 | ||
Some storage types allow copying a specific *Snapshot*, which
defaults to the 'current' VM data. This also means that the final copy
never includes any additional snapshots from the original VM.
1430 | ||
9e55c76d DM |
1431 | |
1432 | Linked Clone:: | |
1433 | ||
730fbca4 | 1434 | Modern storage drivers support a way to generate fast linked |
9e55c76d DM |
1435 | clones. Such a clone is a writable copy whose initial contents are the |
1436 | same as the original data. Creating a linked clone is nearly | |
1437 | instantaneous, and initially consumes no additional space. | |
1438 | + | |
707e37a2 | 1439 | |
9e55c76d DM |
1440 | They are called 'linked' because the new image still refers to the |
1441 | original. Unmodified data blocks are read from the original image, but | |
modifications are written (and afterwards read) from a new
1443 | location. This technique is called 'Copy-on-write'. | |
1444 | + | |
707e37a2 DM |
1445 | |
1446 | This requires that the original volume is read-only. With {pve} one | |
can convert any VM into a read-only <<qm_templates, Template>>. Such
1448 | templates can later be used to create linked clones efficiently. | |
1449 | + | |
1450 | ||
730fbca4 OB |
1451 | NOTE: You cannot delete an original template while linked clones |
1452 | exist. | |
9e55c76d | 1453 | + |
707e37a2 DM |
1454 | |
1455 | It is not possible to change the *Target storage* for linked clones, | |
1456 | because this is a storage internal feature. | |
9e55c76d DM |
1457 | |
1458 | ||
1459 | The *Target node* option allows you to create the new VM on a | |
1460 | different node. The only restriction is that the VM is on shared | |
1461 | storage, and that storage is also available on the target node. | |
1462 | ||
730fbca4 | 1463 | To avoid resource conflicts, all network interface MAC addresses get |
9e55c76d DM |
1464 | randomized, and we generate a new 'UUID' for the VM BIOS (smbios1) |
1465 | setting. | |
1466 | ||
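Clones can also be created on the CLI; a sketch creating a full clone of VM 100
as VM 201 (the IDs and name are example values):

----
# qm clone 100 201 --name webserver-clone --full
----
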
1467 | ||
707e37a2 DM |
1468 | [[qm_templates]] |
1469 | Virtual Machine Templates | |
1470 | ------------------------- | |
1471 | ||
1472 | One can convert a VM into a Template. Such templates are read-only, | |
1473 | and you can use them to create linked clones. | |
1474 | ||
1475 | NOTE: It is not possible to start templates, because this would modify | |
1476 | the disk images. If you want to change the template, create a linked | |
1477 | clone and modify that. | |
1478 | ||
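Converting a VM into a template can also be done on the CLI (the conversion
cannot be undone):

----
# qm template <vmid>
----
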
319d5325 DC |
1479 | VM Generation ID |
1480 | ---------------- | |
1481 | ||
941ff8d3 | 1482 | {pve} supports Virtual Machine Generation ID ('vmgenid') footnote:[Official |
effa4818 TL |
1483 | 'vmgenid' Specification |
1484 | https://docs.microsoft.com/en-us/windows/desktop/hyperv_v2/virtual-machine-generation-identifier] | |
1485 | for virtual machines. | |
This can be used by the guest operating system to detect any event resulting
in a time shift, for example, restoring a backup or a snapshot rollback.
319d5325 | 1488 | |
effa4818 TL |
1489 | When creating new VMs, a 'vmgenid' will be automatically generated and saved |
1490 | in its configuration file. | |
319d5325 | 1491 | |
effa4818 TL |
1492 | To create and add a 'vmgenid' to an already existing VM one can pass the |
1493 | special value `1' to let {pve} autogenerate one or manually set the 'UUID' | |
d6466262 TL |
1494 | footnote:[Online GUID generator http://guid.one/] by using it as value, for |
1495 | example: | |
319d5325 | 1496 | |
effa4818 | 1497 | ---- |
32e8b5b2 AL |
1498 | # qm set VMID -vmgenid 1 |
1499 | # qm set VMID -vmgenid 00000000-0000-0000-0000-000000000000 | |
effa4818 | 1500 | ---- |
319d5325 | 1501 | |
cfd48f55 TL |
NOTE: The initial addition of a 'vmgenid' device to an existing VM may have the
same effects as a snapshot rollback or backup restore, since the VM can
interpret this as a generation change.
1505 | ||
effa4818 TL |
1506 | In the rare case the 'vmgenid' mechanism is not wanted one can pass `0' for |
1507 | its value on VM creation, or retroactively delete the property in the | |
1508 | configuration with: | |
319d5325 | 1509 | |
effa4818 | 1510 | ---- |
32e8b5b2 | 1511 | # qm set VMID -delete vmgenid |
effa4818 | 1512 | ---- |
319d5325 | 1513 | |
effa4818 TL |
The most prominent use case for 'vmgenid' are newer Microsoft Windows
operating systems, which use it to avoid problems in time-sensitive or
replicated services (such as databases or domain controllers
footnote:[https://docs.microsoft.com/en-us/windows-server/identity/ad-ds/get-started/virtual-dc/virtualized-domain-controller-architecture])
on snapshot rollback, backup restore or a whole VM clone operation.
319d5325 | 1519 | |
c069256d EK |
1520 | Importing Virtual Machines and disk images |
1521 | ------------------------------------------ | |
56368da8 EK |
1522 | |
A VM export from a foreign hypervisor usually takes the form of one or more disk
images, with a configuration file describing the settings of the VM (RAM,
number of cores). +
1526 | The disk images can be in the vmdk format, if the disks come from | |
59552707 DM |
1527 | VMware or VirtualBox, or qcow2 if the disks come from a KVM hypervisor. |
1528 | The most popular configuration format for VM exports is the OVF standard, but in | |
1529 | practice interoperation is limited because many settings are not implemented in | |
1530 | the standard itself, and hypervisors export the supplementary information | |
56368da8 EK |
1531 | in non-standard extensions. |
1532 | ||
1533 | Besides the problem of format, importing disk images from other hypervisors | |
1534 | may fail if the emulated hardware changes too much from one hypervisor to | |
1535 | another. Windows VMs are particularly concerned by this, as the OS is very | |
1536 | picky about any changes of hardware. This problem may be solved by | |
1537 | installing the MergeIDE.zip utility available from the Internet before exporting | |
1538 | and choosing a hard disk type of *IDE* before booting the imported Windows VM. | |
1539 | ||
59552707 | 1540 | Finally there is the question of paravirtualized drivers, which improve the |
56368da8 EK |
1541 | speed of the emulated system and are specific to the hypervisor. |
1542 | GNU/Linux and other free Unix OSes have all the necessary drivers installed by | |
1543 | default and you can switch to the paravirtualized drivers right after importing | |
59552707 | 1544 | the VM. For Windows VMs, you need to install the Windows paravirtualized |
56368da8 EK |
1545 | drivers by yourself. |
1546 | ||
GNU/Linux and other free Unix OSes can usually be imported without hassle. Note
eb01c5cf | 1548 | that we cannot guarantee a successful import/export of Windows VMs in all |
56368da8 EK |
1549 | cases due to the problems above. |
1550 | ||
c069256d EK |
1551 | Step-by-step example of a Windows OVF import |
1552 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
56368da8 | 1553 | |
59552707 | 1554 | Microsoft provides |
c069256d | 1555 | https://developer.microsoft.com/en-us/windows/downloads/virtual-machines/[Virtual Machines downloads] |
144d5ede | 1556 | to get started with Windows development.We are going to use one of these |
c069256d | 1557 | to demonstrate the OVF import feature. |
56368da8 | 1558 | |
c069256d EK |
1559 | Download the Virtual Machine zip |
1560 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
56368da8 | 1561 | |
144d5ede | 1562 | After getting informed about the user agreement, choose the _Windows 10 |
c069256d | 1563 | Enterprise (Evaluation - Build)_ for the VMware platform, and download the zip. |
56368da8 EK |
1564 | |
1565 | Extract the disk image from the zip | |
1566 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
1567 | ||
c069256d EK |
1568 | Using the `unzip` utility or any archiver of your choice, unpack the zip, |
1569 | and copy via ssh/scp the ovf and vmdk files to your {pve} host. | |
56368da8 | 1570 | |
c069256d EK |
1571 | Import the Virtual Machine |
1572 | ^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
56368da8 | 1573 | |
c069256d EK |
1574 | This will create a new virtual machine, using cores, memory and |
1575 | VM name as read from the OVF manifest, and import the disks to the +local-lvm+ | |
1576 | storage. You have to configure the network manually. | |
56368da8 | 1577 | |
32e8b5b2 AL |
1578 | ---- |
1579 | # qm importovf 999 WinDev1709Eval.ovf local-lvm | |
1580 | ---- | |
56368da8 | 1581 | |
c069256d | 1582 | The VM is ready to be started. |
56368da8 | 1583 | |
c069256d EK |
1584 | Adding an external disk image to a Virtual Machine |
1585 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
56368da8 | 1586 | |
144d5ede | 1587 | You can also add an existing disk image to a VM, either coming from a |
c069256d EK |
1588 | foreign hypervisor, or one that you created yourself. |
1589 | ||
1590 | Suppose you created a Debian/Ubuntu disk image with the 'vmdebootstrap' tool: | |
1591 | ||
1592 | vmdebootstrap --verbose \ | |
67d59a35 | 1593 | --size 10GiB --serial-console \ |
c069256d EK |
1594 | --grub --no-extlinux \ |
1595 | --package openssh-server \ | |
1596 | --package avahi-daemon \ | |
1597 | --package qemu-guest-agent \ | |
1598 | --hostname vm600 --enable-dhcp \ | |
1599 | --customize=./copy_pub_ssh.sh \ | |
1600 | --sparse --image vm600.raw | |
1601 | ||
10a2a4aa FE |
1602 | You can now create a new target VM, importing the image to the storage `pvedir` |
1603 | and attaching it to the VM's SCSI controller: | |
c069256d | 1604 | |
32e8b5b2 AL |
1605 | ---- |
1606 | # qm create 600 --net0 virtio,bridge=vmbr0 --name vm600 --serial0 socket \ | |
10a2a4aa FE |
1607 | --boot order=scsi0 --scsihw virtio-scsi-pci --ostype l26 \ |
1608 | --scsi0 pvedir:0,import-from=/path/to/dir/vm600.raw | |
32e8b5b2 | 1609 | ---- |
c069256d EK |
1610 | |
1611 | The VM is ready to be started. | |
707e37a2 | 1612 | |
7eb69fd2 | 1613 | |
16b4185a | 1614 | ifndef::wiki[] |
7eb69fd2 | 1615 | include::qm-cloud-init.adoc[] |
16b4185a DM |
1616 | endif::wiki[] |
1617 | ||
6e4c46c4 DC |
1618 | ifndef::wiki[] |
1619 | include::qm-pci-passthrough.adoc[] | |
1620 | endif::wiki[] | |
16b4185a | 1621 | |
c2c8eb89 | 1622 | Hookscripts |
91f416b7 | 1623 | ----------- |
c2c8eb89 DC |
1624 | |
1625 | You can add a hook script to VMs with the config property `hookscript`. | |
1626 | ||
32e8b5b2 AL |
1627 | ---- |
1628 | # qm set 100 --hookscript local:snippets/hookscript.pl | |
1629 | ---- | |
c2c8eb89 DC |
1630 | |
It will be called during various phases of the guest's lifetime.
1632 | For an example and documentation see the example script under | |
1633 | `/usr/share/pve-docs/examples/guest-example-hookscript.pl`. | |
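Note that the script must reside on a storage with the 'snippets' content type
enabled. Assuming the default 'local' directory storage has snippets enabled,
the shipped example could be copied and registered like this (paths are
assumptions about a default setup):

----
# cp /usr/share/pve-docs/examples/guest-example-hookscript.pl /var/lib/vz/snippets/
# qm set 100 --hookscript local:snippets/guest-example-hookscript.pl
----
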
7eb69fd2 | 1634 | |
88a31964 DC |
1635 | [[qm_hibernate]] |
1636 | Hibernation | |
1637 | ----------- | |
1638 | ||
1639 | You can suspend a VM to disk with the GUI option `Hibernate` or with | |
1640 | ||
32e8b5b2 AL |
1641 | ---- |
1642 | # qm suspend ID --todisk | |
1643 | ---- | |
88a31964 DC |
1644 | |
1645 | That means that the current content of the memory will be saved onto disk | |
1646 | and the VM gets stopped. On the next start, the memory content will be | |
1647 | loaded and the VM can continue where it was left off. | |
1648 | ||
1649 | [[qm_vmstatestorage]] | |
1650 | .State storage selection | |
1651 | If no target storage for the memory is given, it will be automatically | |
1652 | chosen, the first of: | |
1653 | ||
1654 | 1. The storage `vmstatestorage` from the VM config. | |
1655 | 2. The first shared storage from any VM disk. | |
1656 | 3. The first non-shared storage from any VM disk. | |
1657 | 4. The storage `local` as a fallback. | |
1658 | ||
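The storage from step 1 can be set explicitly with the `vmstatestorage`
property, for example:

----
# qm set <vmid> --vmstatestorage <storage>
----
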
e2a867b2 DC |
1659 | [[resource_mapping]] |
1660 | Resource Mapping | |
bd0cc33d | 1661 | ---------------- |
e2a867b2 | 1662 | |
481a0ee4 DC |
1663 | [thumbnail="screenshot/gui-datacenter-resource-mappings.png"] |
1664 | ||
e2a867b2 DC |
When using or referencing local resources (e.g. the address of a PCI device),
using the raw address or ID is sometimes problematic, for example:
1667 | ||
1668 | * when using HA, a different device with the same id or path may exist on the | |
1669 | target node, and if one is not careful when assigning such guests to HA | |
1670 | groups, the wrong device could be used, breaking configurations. | |
1671 | ||
1672 | * changing hardware can change ids and paths, so one would have to check all | |
1673 | assigned devices and see if the path or id is still correct. | |
1674 | ||
1675 | To handle this better, one can define cluster wide resource mappings, such that | |
1676 | a resource has a cluster unique, user selected identifier which can correspond | |
1677 | to different devices on different hosts. With this, HA won't start a guest with | |
1678 | a wrong device, and hardware changes can be detected. | |
1679 | ||
1680 | Creating such a mapping can be done with the {pve} web GUI under `Datacenter` | |
1681 | in the relevant tab in the `Resource Mappings` category, or on the cli with | |
1682 | ||
1683 | ---- | |
d772991e | 1684 | # pvesh create /cluster/mapping/<type> <options> |
e2a867b2 DC |
1685 | ---- |
1686 | ||
4657b9ff TL |
1687 | [thumbnail="screenshot/gui-datacenter-mapping-pci-edit.png"] |
1688 | ||
d772991e TL |
1689 | Where `<type>` is the hardware type (currently either `pci` or `usb`) and |
1690 | `<options>` are the device mappings and other configuration parameters. | |
e2a867b2 DC |
1691 | |
1692 | Note that the options must include a map property with all identifying | |
1693 | properties of that hardware, so that it's possible to verify the hardware did | |
1694 | not change and the correct device is passed through. | |
1695 | ||
1696 | For example to add a PCI device as `device1` with the path `0000:01:00.0` that | |
1697 | has the device id `0001` and the vendor id `0002` on the node `node1`, and | |
1698 | `0000:02:00.0` on `node2` you can add it with: | |
1699 | ||
1700 | ---- | |
1701 | # pvesh create /cluster/mapping/pci --id device1 \ | |
1702 | --map node=node1,path=0000:01:00.0,id=0002:0001 \ | |
1703 | --map node=node2,path=0000:02:00.0,id=0002:0001 | |
1704 | ---- | |
1705 | ||
1706 | You must repeat the `map` parameter for each node where that device should have | |
1707 | a mapping (note that you can currently only map one USB device per node per | |
1708 | mapping). | |
1709 | ||
1710 | Using the GUI makes this much easier, as the correct properties are | |
1711 | automatically picked up and sent to the API. | |
1712 | ||
481a0ee4 DC |
1713 | [thumbnail="screenshot/gui-datacenter-mapping-usb-edit.png"] |
1714 | ||
e2a867b2 DC |
1715 | It's also possible for PCI devices to provide multiple devices per node with |
1716 | multiple map properties for the nodes. If such a device is assigned to a guest, | |
1717 | the first free one will be used when the guest is started. The order of the | |
1718 | paths given is also the order in which they are tried, so arbitrary allocation | |
1719 | policies can be implemented. | |
1720 | ||
This is useful for devices with SR-IOV, since sometimes it is not important
1722 | which exact virtual function is passed through. | |
1723 | ||
1724 | You can assign such a device to a guest either with the GUI or with | |
1725 | ||
1726 | ---- | |
d772991e | 1727 | # qm set ID -hostpci0 <name> |
e2a867b2 DC |
1728 | ---- |
1729 | ||
1730 | for PCI devices, or | |
1731 | ||
1732 | ---- | |
d772991e | 1733 | # qm set <vmid> -usb0 <name> |
e2a867b2 DC |
1734 | ---- |
1735 | ||
1736 | for USB devices. | |
1737 | ||
Where `<vmid>` is the guest's ID and `<name>` is the chosen name for the created
e2a867b2 DC |
1739 | mapping. All usual options for passing through the devices are allowed, such as |
1740 | `mdev`. | |
1741 | ||
d772991e TL |
1742 | To create mappings `Mapping.Modify` on `/mapping/<type>/<name>` is necessary |
1743 | (where `<type>` is the device type and `<name>` is the name of the mapping). | |
e2a867b2 | 1744 | |
d772991e TL |
1745 | To use these mappings, `Mapping.Use` on `/mapping/<type>/<name>` is necessary |
1746 | (in addition to the normal guest privileges to edit the configuration). | |
e2a867b2 | 1747 | |
8c1189b6 | 1748 | Managing Virtual Machines with `qm` |
dd042288 | 1749 | ------------------------------------ |
f69cfd23 | 1750 | |
c730e973 | 1751 | qm is the tool to manage QEMU/KVM virtual machines on {pve}. You can |
f69cfd23 DM |
1752 | create and destroy virtual machines, and control execution |
1753 | (start/stop/suspend/resume). Besides that, you can use qm to set | |
1754 | parameters in the associated config file. It is also possible to | |
1755 | create and delete virtual disks. | |
1756 | ||
dd042288 EK |
1757 | CLI Usage Examples |
1758 | ~~~~~~~~~~~~~~~~~~ | |
1759 | ||
b01b1f2c EK |
1760 | Using an iso file uploaded on the 'local' storage, create a VM |
1761 | with a 4 GB IDE disk on the 'local-lvm' storage | |
dd042288 | 1762 | |
32e8b5b2 AL |
1763 | ---- |
1764 | # qm create 300 -ide0 local-lvm:4 -net0 e1000 -cdrom local:iso/proxmox-mailgateway_2.1.iso | |
1765 | ---- | |
dd042288 EK |
1766 | |
1767 | Start the new VM | |
1768 | ||
32e8b5b2 AL |
1769 | ---- |
1770 | # qm start 300 | |
1771 | ---- | |
dd042288 EK |
1772 | |
1773 | Send a shutdown request, then wait until the VM is stopped. | |
1774 | ||
32e8b5b2 AL |
1775 | ---- |
1776 | # qm shutdown 300 && qm wait 300 | |
1777 | ---- | |
dd042288 EK |
1778 | |
1779 | Same as above, but only wait for 40 seconds. | |
1780 | ||
32e8b5b2 AL |
1781 | ---- |
1782 | # qm shutdown 300 && qm wait 300 -timeout 40 | |
1783 | ---- | |
dd042288 | 1784 | |
87927c65 DJ |
Destroying a VM always removes it from Access Control Lists and it always
removes the firewall configuration of the VM. You have to activate
'--purge' if you want to additionally remove the VM from replication jobs,
backup jobs and HA resource configurations.
1789 | ||
32e8b5b2 AL |
1790 | ---- |
1791 | # qm destroy 300 --purge | |
1792 | ---- | |
87927c65 | 1793 | |
66aecccb AL |
1794 | Move a disk image to a different storage. |
1795 | ||
32e8b5b2 AL |
1796 | ---- |
1797 | # qm move-disk 300 scsi0 other-storage | |
1798 | ---- | |
66aecccb AL |
1799 | |
Reassign a disk image to a different VM. This will remove the disk `scsi1` from
the source VM and attach it as `scsi3` to the target VM. In the background
the disk image is renamed so that the name matches the new owner.
1803 | ||
32e8b5b2 AL |
1804 | ---- |
1805 | # qm move-disk 300 scsi1 --target-vmid 400 --target-disk scsi3 | |
1806 | ---- | |
87927c65 | 1807 | |
f0a8ab95 DM |
1808 | |
1809 | [[qm_configuration]] | |
f69cfd23 DM |
1810 | Configuration |
1811 | ------------- | |
1812 | ||
f0a8ab95 DM |
1813 | VM configuration files are stored inside the Proxmox cluster file |
1814 | system, and can be accessed at `/etc/pve/qemu-server/<VMID>.conf`. | |
1815 | Like other files stored inside `/etc/pve/`, they get automatically | |
1816 | replicated to all other cluster nodes. | |
f69cfd23 | 1817 | |
f0a8ab95 DM |
1818 | NOTE: VMIDs < 100 are reserved for internal purposes, and VMIDs need to be |
1819 | unique cluster wide. | |
1820 | ||
1821 | .Example VM Configuration | |
1822 | ---- | |
777cf894 | 1823 | boot: order=virtio0;net0 |
f0a8ab95 DM |
1824 | cores: 1 |
1825 | sockets: 1 | |
1826 | memory: 512 | |
1827 | name: webmail | |
1828 | ostype: l26 | |
f0a8ab95 DM |
1829 | net0: e1000=EE:D2:28:5F:B6:3E,bridge=vmbr0 |
1830 | virtio0: local:vm-100-disk-1,size=32G | |
1831 | ---- | |
1832 | ||
1833 | Those configuration files are simple text files, and you can edit them | |
1834 | using a normal text editor (`vi`, `nano`, ...). This is sometimes | |
1835 | useful to do small corrections, but keep in mind that you need to | |
1836 | restart the VM to apply such changes. | |
1837 | ||
1838 | For that reason, it is usually better to use the `qm` command to | |
1839 | generate and modify those files, or do the whole thing using the GUI. | |
1840 | Our toolkit is smart enough to instantaneously apply most changes to | |
a running VM. This feature is called "hot plug", and there is no
1842 | need to restart the VM in that case. | |
1843 | ||
1844 | ||
1845 | File Format | |
1846 | ~~~~~~~~~~~ | |
1847 | ||
1848 | VM configuration files use a simple colon separated key/value | |
1849 | format. Each line has the following format: | |
1850 | ||
1851 | ----- | |
1852 | # this is a comment | |
1853 | OPTION: value | |
1854 | ----- | |
1855 | ||
1856 | Blank lines in those files are ignored, and lines starting with a `#` | |
1857 | character are treated as comments and are also ignored. | |
1858 | ||
1859 | ||
1860 | [[qm_snapshots]] | |
1861 | Snapshots | |
1862 | ~~~~~~~~~ | |
1863 | ||
1864 | When you create a snapshot, `qm` stores the configuration at snapshot | |
1865 | time into a separate snapshot section within the same configuration | |
1866 | file. For example, after creating a snapshot called ``testsnapshot'', | |
1867 | your configuration file will look like this: | |
1868 | ||
1869 | .VM configuration with snapshot | |
1870 | ---- | |
1871 | memory: 512 | |
1872 | swap: 512 | |
parent: testsnapshot
1874 | ... | |
1875 | ||
[testsnapshot]
1877 | memory: 512 | |
1878 | swap: 512 | |
1879 | snaptime: 1457170803 | |
1880 | ... | |
1881 | ---- | |
1882 | ||
1883 | There are a few snapshot related properties like `parent` and | |
1884 | `snaptime`. The `parent` property is used to store the parent/child | |
1885 | relationship between snapshots. `snaptime` is the snapshot creation | |
1886 | time stamp (Unix epoch). | |
f69cfd23 | 1887 | |
88a31964 DC |
1888 | You can optionally save the memory of a running VM with the option `vmstate`. |
1889 | For details about how the target storage gets chosen for the VM state, see | |
1890 | xref:qm_vmstatestorage[State storage selection] in the chapter | |
1891 | xref:qm_hibernate[Hibernation]. | |
f69cfd23 | 1892 | |
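Snapshots can also be managed on the CLI; a brief sketch using the example
VMID 100 and the snapshot name from above (`--vmstate 1` additionally saves the
RAM of the running VM):

----
# qm snapshot 100 testsnapshot --vmstate 1
# qm rollback 100 testsnapshot
# qm delsnapshot 100 testsnapshot
----
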
80c0adcb | 1893 | [[qm_options]] |
a7f36905 DM |
1894 | Options |
1895 | ~~~~~~~ | |
1896 | ||
1897 | include::qm.conf.5-opts.adoc[] | |
1898 | ||
f69cfd23 DM |
1899 | |
1900 | Locks | |
1901 | ----- | |
1902 | ||
d6466262 TL |
1903 | Online migrations, snapshots and backups (`vzdump`) set a lock to prevent |
1904 | incompatible concurrent actions on the affected VMs. Sometimes you need to | |
1905 | remove such a lock manually (for example after a power failure). | |
f69cfd23 | 1906 | |
32e8b5b2 AL |
1907 | ---- |
1908 | # qm unlock <vmid> | |
1909 | ---- | |
f69cfd23 | 1910 | |
0bcc62dd DM |
1911 | CAUTION: Only do that if you are sure the action which set the lock is |
1912 | no longer running. | |
1913 | ||
16b4185a DM |
1914 | ifdef::wiki[] |
1915 | ||
1916 | See Also | |
1917 | ~~~~~~~~ | |
1918 | ||
1919 | * link:/wiki/Cloud-Init_Support[Cloud-Init Support] | |
1920 | ||
1921 | endif::wiki[] | |
1922 | ||
1923 | ||
f69cfd23 | 1924 | ifdef::manvolnum[] |
704f19fb DM |
1925 | |
1926 | Files | |
1927 | ------ | |
1928 | ||
1929 | `/etc/pve/qemu-server/<VMID>.conf`:: | |
1930 | ||
1931 | Configuration file for the VM '<VMID>'. | |
1932 | ||
1933 | ||
f69cfd23 DM |
1934 | include::pve-copyright.adoc[] |
1935 | endif::manvolnum[] |