Explicitly close leftover connections in the destructor,
otherwise the IO::Multiplex instance can be leaked, causing
the QMP connection to never be closed.
This could occur, for instance, when cancelling vzdump via
ctrl+c with extremely unlucky timing...
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
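A minimal sketch of the idea (the 'fhs' registry and field names are
illustrative, not the actual qemu-server internals):

    sub DESTROY {
        my ($self) = @_;
        my $mux = $self->{mux} or return;   # the IO::Multiplex instance
        # explicitly close any leftover connection handles so nothing
        # keeps the QMP socket alive
        for my $fh (values %{ $self->{fhs} // {} }) {
            $mux->close($fh);
        }
    }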
Thomas Lamprecht [Tue, 29 Oct 2019 18:04:01 +0000 (19:04 +0100)]
cleanup do_import, s/optional/params/ and move skiplock into params
Mixes indentation changes with a whole lot of other changes that
should normally not be mixed together too much, but this is all a bit
tangled and I'm not sure if splitting it into two or three parts
would help anybody.. just use "-w" (ignore whitespace changes) when
looking at the diff..
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Dominic Jäger [Mon, 28 Oct 2019 11:47:34 +0000 (12:47 +0100)]
Import OVF: Lock config with "lock" property
Previously a VMID conflict was possible when creating a VM on another node
between locking the config with lock_config_full and writing to it for the
first time with write_config.
Using create_and_lock_config eliminates this possibility. This means that now
the "lock" property is set in the config instead of using flock only.
$param was empty when it was assigned the three values "name", "memory" and
"cores" before being assigned to $conf later on. Assigning those values
directly to $conf avoids confusion about what the two variables contain.
Dominic Jäger [Mon, 28 Oct 2019 11:47:32 +0000 (12:47 +0100)]
replace remaining vm_destroy call-sites with destroy_vm
This function was only used in one place, into which we inlined its
functionality. Removing it avoids confusion between vm_destroy and destroy_vm.
The whole $importfn is executed in a lock_config_full.
As a consequence, for the inlined code:
1. lock_config is redundant
2. it is not possible that the VM has been started (check_running) in
the meantime
3. it is not possible that the "lock" property has been written into
the VM's config file (check_lock) in the meantime
Add a warning after the eval so that a failure does not go unnoticed
if it ever triggers, roughly as sketched below.
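Roughly (a sketch, argument list abbreviated):

    eval { PVE::QemuServer::destroy_vm($storecfg, $vmid) };
    warn $@ if $@;    # surface the error instead of dropping it silently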
Stefan Reiter [Mon, 28 Oct 2019 13:30:41 +0000 (14:30 +0100)]
hugepages: fix memory size checking
The codepath for "any" hugepages did not check if the memory size was
even, leading to the code below trying to allocate half a hugepage
(e.g. a VM with 2049MiB RAM would lead to 1024.5 2MB hugepages).
Also improve error message for systems with only 1GB hugepages enabled.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
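A sketch of the added check, assuming memory is tracked in MiB
(variable names illustrative):

    my $memory = $conf->{memory};    # VM memory in MiB
    die "memory size '$memory' MiB is not a multiple of 2 MiB\n"
        if $memory % 2;
    my $pages_2mb = $memory / 2;     # e.g. 2048 MiB -> 1024 pages, never 1024.5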
Dominik Csapak [Fri, 25 Oct 2019 12:36:06 +0000 (14:36 +0200)]
fix #2434: extend machine regex
With QEMU 4.0.1 there is now a machine type pc-q35-4.0.1, which does
not fit our regex.
This broke live migration of q35 machines, as we pass the machine
type (incl. version info) to 'qm start' on the target node, which
checks it against the JSONSchema.
To fix this, extend the regex to allow any number of version levels,
for q35, i440fx and virt (to be more future-proof).
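The extended pattern might look roughly like this (a sketch; the real
schema regex differs in detail):

    # -\d+(\.\d+)+ accepts one or more dotted version levels,
    # so both pc-q35-4.0 and pc-q35-4.0.1 match
    my $machine_re = qr/(?:pc|pc(?:-i440fx)?-\d+(?:\.\d+)+|q35|pc-q35-\d+(?:\.\d+)+|virt(?:-\d+(?:\.\d+)+)?)/;
    die "unexpected machine type\n" if 'pc-q35-4.0.1' !~ m/^$machine_re$/;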
Dominik Csapak [Wed, 23 Oct 2019 09:39:53 +0000 (11:39 +0200)]
fix reverting for non-existing configs
Reverting a non-existing option did not work with the latest changes
in pve-guest-common, because we no longer delete the pending option
in 'add_to_pending_delete'.
This had the effect that the config ended up containing:
[pending]
option: pendingvalue
delete: option
which would run both the deletion code and the pending-add code
(e.g. delete the pending cloudinit drive and create it again).
To avoid that situation, we need to remove the option from the
pending hash in the 'delete loop', as sketched below.
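A hedged sketch of the fix (loop shape illustrative):

    foreach my $opt (keys %$pending_delete_hash) {
        # ... run the actual deletion code for $opt ...
        # drop it from [pending] so the add path never sees it again
        delete $conf->{pending}->{$opt};
    }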
Stefan Reiter [Tue, 22 Oct 2019 15:25:48 +0000 (17:25 +0200)]
fix #2408, #2355, #2380: use scsi-hd backend for iSCSI as well
As mentioned in #2408, live-migrating a VM between storages that use
different scsi backends (scsi-hd, scsi-generic, scsi-block) breaks.
To fix this, from QEMU 4.1 machine types onward (to not break current
behaviour any more), only use scsi-hd, as in recent versions there is
almost no difference between scsi-hd and scsi-generic anyway.
scsi-block (which potentially also breaks) requires a flag to be
manually set on the disk, so we can assume the user knows what they're
doing.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Suggested-by: Daniel Berteaud <daniel@firewall-services.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
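The version gate might look roughly like this (helper name and call
shape are assumptions, not the exact code):

    my $device = 'scsi-generic';
    # only switch on 4.1+ machine types to not change existing VMs
    $device = 'scsi-hd'
        if qemu_machine_feature_enabled($machine_type, $kvmver, 4, 1);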
Oguz Bektas [Tue, 22 Oct 2019 10:34:27 +0000 (12:34 +0200)]
pending apply/hotplug: don't hard code force to true
Each pending option has a hash value which has the 'force'
information encoded as an entry. But this can be { force => 1 } or
{ force => 0 }, so we actually need to check the value and not just
assign the hash to force directly, as otherwise force is always
truthy..
This fixes a bug where 'detach' caused disks to be destroyed
immediately, because the $force parameter was always true, since a
hash reference is truthy.
Signed-off-by: Oguz Bektas <o.bektas@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
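In code, the difference is roughly (sketch):

    # before (buggy): $force got the whole hash ref, which is always truthy
    #   my $force = $pending->{$opt};
    # after: read the actual flag out of the hash
    my $force = $pending->{$opt}->{force};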
Mira Limbeck [Fri, 27 Sep 2019 13:13:30 +0000 (15:13 +0200)]
cloudinit: fix vm start hanging with disk on ZFS
With the changes to pve-storage in commit 56362cf the startup hangs for
5 minutes on ZFS if the cloudinit disk does not exist. Instead of
calling activate_volume followed by file_size_info we now call
volume_size_info. This should work reliably on all storages that support
cloudinit disks.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
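Roughly the new call (a sketch, arguments abbreviated):

    # let the storage plugin report the size; works even when the
    # cloudinit volume has no stat()-able path yet (e.g. on ZFS)
    my $size = PVE::Storage::volume_size_info($storecfg, $volid, 5);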
Mira Limbeck [Fri, 27 Sep 2019 14:22:01 +0000 (16:22 +0200)]
fix #2344: ignore cloudinit in replication check
When adding a cloudinit disk, it does not contain media=cdrom until
it is actually created. This means the check in check_replication
fails to detect cloudinit and it is recognized as a normal disk. Then
parse_volname fails because it does not match the vm-$vmid-XYZ
format. To fix this, we now check explicitly whether the volname
matches cloudinit and, if so, return early.
Additionally, two small cleanups replace the remaining cloudinit
regexes with the same volname-based cloudinit check.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
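The early return might look like this (pattern illustrative):

    # cloudinit volumes don't follow the vm-$vmid-disk-N naming that
    # parse_volname expects, so skip them before that call
    return if $volname =~ m/vm-\d+-cloudinit/;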
Christian Ebner [Tue, 15 Oct 2019 11:00:25 +0000 (13:00 +0200)]
fix #1291: add option purge for vm_destroy api call
When destroying a VM, we intentionally did not remove all related
configs, such as backup or replication jobs.
The intention of this flag is to allow removing references to the
destroyed VM from such configs on destroy.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
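A sketch of the API parameter (description paraphrased):

    purge => {
        type => 'boolean',
        optional => 1,
        description => "Remove references to the VM from related"
            . " configs (backup and replication jobs) as well.",
    },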
Thomas Lamprecht [Fri, 18 Oct 2019 09:21:58 +0000 (11:21 +0200)]
destroy_vm: use write_config from our Config module to set an "empty" config
This brings us more in line with what we do in pve-container; it's
also good not to use file_set_contents directly when we have all
those nice wrapper interface methods to do things in a safe and
guaranteed way.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
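I.e., roughly (sketch):

    # instead of PVE::Tools::file_set_contents($conffile, $rawconfig):
    PVE::QemuConfig->write_config($vmid, $conf);    # the safe wrapper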
Dominic Jäger [Tue, 15 Oct 2019 10:17:41 +0000 (12:17 +0200)]
Fix #2412: Missing VMs in pools
Between calling vm_destroy and removing the ID from user.cfg
(remove_vm_access), creating a new VM with this ID was possible. VMs
could go missing from pools as a consequence.
Adding a lock solves this for clones from the same node. Additionally,
unlinking the config must happen at the very end of the deletion
process to avoid other nodes using the ID in the meantime.
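A heavily simplified sketch of the intended ordering (function names
abbreviated and partly illustrative):

    PVE::QemuConfig->lock_config($vmid, sub {
        # destroy disks etc. while the config still reserves the VMID
        destroy_vm($storecfg, $vmid);
        # drop the VM from pools/ACLs in user.cfg
        PVE::AccessControl::remove_vm_access($vmid);
        # only now remove the config file, freeing the VMID for reuse
        unlink $conffile or warn "failed to remove config: $!\n";
    });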
Thomas Lamprecht [Thu, 17 Oct 2019 17:13:01 +0000 (19:13 +0200)]
Fix #2171: vm_start: volid based statefiles were not activated
While we could just make this a special case before the
config_to_command call and set $conf->{vmstate} to the statefile for
the case where it's a valid volume ID, the special-case handling gets
much easier when we do this outside of that method.
So it's basically a trade-off, and after looking far too long at all
the nice revisions Alwin made for me and Fabian's requests, and even
trying out different approaches, none was perfect.
But having slight code duplication felt like the slightly nicer
trade-off over the movement mess I proposed (as I did not have the
full picture then, sorry Alwin). As everything worked, I just use
this one now: it has very clear semantics, it is easy to understand,
and that three lines are now duplicated is IMO irrelevant.
Co-developed-by: Alwin Antreich <a.antreich@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
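The duplicated lines boil down to something like this (sketch):

    # if the statefile is a volume ID rather than a plain path, make
    # sure the underlying volume is activated before QEMU reads it
    if (PVE::Storage::parse_volume_id($statefile, 1)) {
        PVE::Storage::activate_volumes($storecfg, [$statefile]);
    }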
Stefan Reiter [Thu, 10 Oct 2019 10:18:41 +0000 (12:18 +0200)]
fix #2402: allow 1GB hugepages if 2MB is unavailable
As reported in bug #2402, a system started with "default_hugepagesz=1G
hugepagesz=1G" does not have a /sys/kernel/mm/hugepages/hugepages-2048kB
directory.
To fix, ignore the missing directory in hugepages_mount (since it might
not be needed anyway), and correctly check if the requested hugepage
size is available in hugepages_size instead.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
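The availability check boils down to a sysfs lookup, roughly (sketch;
the helper name is illustrative):

    sub hugepages_size_available {
        my ($size_kb) = @_;
        # a hugepage size is only usable if the kernel exposes it here
        return -d "/sys/kernel/mm/hugepages/hugepages-${size_kb}kB";
    }
    # e.g. fall back to 1 GiB pages when 2 MiB pages are absent
    my $size_kb = hugepages_size_available(2048) ? 2048 : 1048576;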
test: cfg2cmd: do NOT sort expected/actual commands
In general it matters where a command line option is positioned
inside a QEMU command, so we want to actually also check the order in
the cfg2cmd test.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
cfg2cmd: sort PCI bridges when adding them for stability
In general it matters where a command line option is positioned
inside a QEMU command, so we want to also check the order in the
cfg2cmd test; to do so, we need to avoid false positives from
nondeterministic ordering, like the PCI bridge order made stable
here.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
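The stabilization is essentially a deterministic iteration order
(sketch):

    # emit bridges in sorted order so cfg2cmd output is reproducible
    for my $k (sort keys %$bridges) {
        # ... add the -device pci-bridge,... option for bridge $k ...
    }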
Aaron Lauterer [Tue, 8 Oct 2019 15:56:15 +0000 (17:56 +0200)]
cfg2cmd: fix serial-bus for spice foldersharing
Thanks to Gilberto Nunes for finding a bug where the VM would not
start with foldersharing enabled and the QEMU guest agent option
disabled [0].
The cause was that the device org.spice-space.webdav.0 could not find
a virtio-serial-bus in this situation.
Since we always create a virtio-serial-bus for the SPICE vdagent, it
seems sensible to use that also for the foldersharing device, by
moving it in front of the other SPICE devices.
Mira Limbeck [Wed, 25 Sep 2019 16:12:17 +0000 (18:12 +0200)]
fix #2217: don't copy cloudinit disk on clone
This removes the cloudinit disk from the list of drives to clone. As the
cloudinit disk is recreated on every VM start, it's not necessary to
clone it.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
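Conceptually (a sketch; the helper name is an assumption):

    # cloudinit is regenerated on every start, so don't clone it
    next if drive_is_cloudinit($drive);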
Thomas Lamprecht [Thu, 26 Sep 2019 08:54:05 +0000 (10:54 +0200)]
cfg2cmd: support USB 3 SPICE ports with 4.0 machine feature
The reason why we did not do this in the first place was the fact
that the "usb3" flag could already be set in older qemu-server
versions; we just ignored it but did not filter it out of the
config..
That means there can be VMs out there which would now get a different
HW layout, an issue for migration and live-snapshot restore.
But, actually, while the "usb3" property could be set, the VM would
only start if an additional USB2 device was added to the VM, or the
VM used a "q35" based machine - as otherwise no "ehci" controller was
available, and thus the "ignored" USB3 SPICE port could not get
attached anywhere -> QEMU chickened out.
And if a user had a configuration where this could start, we are
still a bit lucky: live-migration was not possible, as the "can't
migrate VM which uses local devices:" check still hit - in
qemu-server older than 6.0-8 we explicitly checked for "spice" when
determining which USB devices were not local, so a "spice,usb3=X"
device was always (luckily) wrongly detected as a local device ->
migration was blocked.
So we only have one case left: restoring a live-snapshot. Here,
sadly, there seems to be no way out; it was possible to take one with
a "spice,usb3=1" USB device, and thus all snapshots taken on such VMs
after a clean restart on PVE 6 (to have a machine version >= 4.0) are
broken - but they can easily be fixed by removing the "usb3=1" from
the problematic snapshot config.
As restoring a snapshot can be repeated more than once, even on
failure, without rendering the snapshot or VM permanently unusable,
this should be a reasonable compromise.
I strongly believe that the chance of anybody being affected in
practice is very small, and the property description mentioned that
it was not supported. If anybody is affected on snapshot restore, we
can help them on a case-by-case basis.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Mira Limbeck [Wed, 25 Sep 2019 13:30:12 +0000 (15:30 +0200)]
fix #2382: delete cloudinit disk before restoring
The fix introduced in commit bf4a933 did not work as intended: we're
iterating over $oldconf, not over $virtdev_hash. This means
$drive->{is_cloudinit} is always undefined. Instead, use the
$exclude_cloudinit parameter of drive_is_cdrom().
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
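Given drive_is_cdrom($drive, $exclude_cloudinit), the cloudinit drive
can be detected like this (sketch):

    # a drive that counts as a cdrom, but no longer does once cloudinit
    # is excluded, must be the cloudinit drive itself
    my $is_cloudinit = drive_is_cdrom($drive, 0) && !drive_is_cdrom($drive, 1);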
Thomas Lamprecht [Tue, 24 Sep 2019 16:08:48 +0000 (18:08 +0200)]
move qmeventd to own directory
It's really not nice if so many files - source code, meta-files, … -
linger around in the top-level directory..
Also, clean up the build a bit, i.e., use LDFLAGS, as
dpkg-buildpackage can set some LDFLAGS, so it'd be nice if both
CFLAGS and LDFLAGS have the same (related) ones.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
usb: Fix local resource check of Spice USB devices
The check relied on the fact that, in a regular use case, a USB
device of type spice would not have any other settings like
`usb3=1`, so an exact string match worked fine.
This patch changes the behaviour and makes it possible to migrate VMs
with Spice USB devices that have additional USB options set.
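A hedged sketch of the relaxed check (the real code may parse the
property string instead; names illustrative):

    # match a value of 'spice' regardless of extra options such as
    # usb3=1, instead of requiring an exact string match
    my $is_spice = $usb_value =~ m/^spice(?:,|$)/;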
Thomas Lamprecht [Sat, 21 Sep 2019 11:41:06 +0000 (13:41 +0200)]
cfg2cmd: mock SPICE/VNC port allocation methods
to make them independent of the environment; otherwise other running
VMs or quickly sequential test runs may result in false negative test
results, as the port differs.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
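For instance, with Test::MockModule (a sketch; the mocked function
name is an assumption):

    use Test::MockModule;

    my $tools = Test::MockModule->new('PVE::Tools');
    # always hand out the same port so the expected command is stable
    $tools->mock(next_spice_port => sub { return 61000 });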
To not change current behaviour and thus break live migration, USB3
for a Spice USB device requires QEMU v4.1.
The old behavior was that, even though it was technically possible to
set `usb3=1`, the setting was ignored and the bus was hardcoded to
ehci. If another USB2 device was added or the machine type was set to
Q35, an ehci controller was present and the VM was able to boot.
With this patch the behaviour changes: the bus is set to xhci if USB3
is set for the Spice USB device and the VM is running under QEMU
v4.1.
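The bus selection then looks roughly like this (helper and variable
names illustrative):

    my $bus = 'ehci';    # the old hardcoded default
    # xhci only for usb3 Spice ports on new enough machines, keeping
    # the hardware layout of existing VMs stable
    $bus = 'xhci'
        if $device->{usb3} && qemu_machine_feature_enabled($machine_type, $kvmver, 4, 1);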