We now get rid of all the PVE::CLIHandler baggage, which
reduces the code a lot. PVE::CLIHandler is also not compatible with the
new lxc.hook.version=1 method of hooks!
The new helper is specific to lxc hooks and supports both
current `lxc.hook.version`s.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Stefan Reiter [Mon, 28 Oct 2019 11:59:14 +0000 (12:59 +0100)]
setup: do host architecture translation ourselves
This was done by the PVE::Tools-backed get_host_arch method, but as we
were the only user of that specific translation and it's quite LXC
related, it makes more sense to do it here. This also allows reuse of
the PVE::Tools function.
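A minimal sketch of such a translation step, written in Python purely for illustration (the real code is Perl). The mapping table and the name `translate_host_arch` are assumptions; the commit message does not list the actual entries.

```python
import platform

# Assumed mapping from raw kernel machine names to the Debian-style names
# used in container configs; the commit does not spell out the real table.
ARCH_TRANSLATION = {
    "x86_64": "amd64",
    "aarch64": "arm64",
}


def translate_host_arch(machine: str) -> str:
    """Translate a raw machine name, falling back to the untranslated value."""
    return ARCH_TRANSLATION.get(machine, machine)


if __name__ == "__main__":
    # platform.machine() plays the role of the untranslated host arch here.
    print(translate_host_arch(platform.machine()))
```

Keeping the table next to its only consumer leaves the generic helper free to return the raw value for other callers.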
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Christian Ebner [Tue, 15 Oct 2019 11:00:24 +0000 (13:00 +0200)]
fix #1291: add option purge for destroy_vm api call
When destroying a CT, we intentionally did not remove all related
configs such as backup or replication jobs.
The intention of this flag is to allow the removal of references to
the VM being removed from such configs on destroy.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Oguz Bektas [Mon, 14 Oct 2019 08:28:51 +0000 (10:28 +0200)]
implement pending changes
previous behaviour directly applied the possible config changes, and
died when there was something which can't be applied while CT is
running.
instead, we now write all the changes directly into the config pending
section, and then apply or hotplug the changes depending on whether CT
is running. the non-hotpluggable changes are left as pending changes.
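The flow described above could be sketched roughly as follows, in Python for illustration only. The `HOTPLUGGABLE` set and the function name `update_config` are hypothetical; the real list of hotpluggable options is not given in this message.

```python
# Hypothetical set of options that may be changed while the CT is running.
HOTPLUGGABLE = {"memory", "swap", "cpulimit"}


def update_config(conf: dict, changes: dict, running: bool) -> dict:
    """Queue all changes as pending, then apply whatever can be applied now."""
    pending = conf.setdefault("pending", {})
    pending.update(changes)

    for key in list(pending):
        if running and key not in HOTPLUGGABLE:
            # Not hotpluggable: leave it in the pending section.
            continue
        conf[key] = pending.pop(key)
    return conf
```

Calling it again with `running=False` and no new changes applies everything that was left pending.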
Oguz Bektas [Mon, 14 Oct 2019 08:28:49 +0000 (10:28 +0200)]
add vmconfig_hotplug_pending and vmconfig_apply_pending
vmconfig_hotplug_pending is responsible for checking whether a key/value pair
in the pending section can be hotplugged; if so, it performs a generic
replace or, for the special cases, specific hotplug actions.
vmconfig_apply_pending is only supposed to be called when the CT isn't live.
Oguz Bektas [Mon, 14 Oct 2019 08:28:46 +0000 (10:28 +0200)]
skip pending changes while taking backup
we can only clone the current state of the container (without pending
changes), as otherwise the on-disk state might not match the
configuration. this also makes it more consistent with qemu-server
behavior.
Oguz Bektas [Mon, 14 Oct 2019 08:28:44 +0000 (10:28 +0200)]
api: config: use shared guesthelpers in GET call
since containers can also have pending changes now, we need a method to
get the current applied config as well as the one with the pending
changes inside. this makes the GET config api more consistent with
qemu-server's by reusing load_current_config and load_snapshot_config from
AbstractConfig.
to decide which method to call, we look at the parameters.
Oguz Bektas [Mon, 14 Oct 2019 08:28:41 +0000 (10:28 +0200)]
adapt CT config parser for pending changes
config parser can now read/write [pve:pending] section. this was named
such, instead of [PENDING], after on- and offline discussion regarding
namespacing the pending section and snapshots.
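A simplified sketch of reading such a section, in Python for illustration only. It assumes plain `key: value` lines and ignores snapshot sections entirely; `parse_config` is a hypothetical name, not the real parser.

```python
def parse_config(raw: str) -> dict:
    """Split 'key: value' lines into the current config and a pending section."""
    conf = {"pending": {}}
    target = conf
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line == "[pve:pending]":
            # Everything after this header belongs to the pending section.
            target = conf["pending"]
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"unparsable line: {line!r}")
        target[key.strip()] = value.strip()
    return conf
```

The section header is matched before the `key: value` split, since the header itself contains a colon.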
Between calling destroy_lxc_container and removing the ID from
user.cfg (remove_vm_access) creating a new CT with this ID was
possible. CTs could go missing from pools as a consequence.
Unlinking must happen at the very end of the deletion
process to avoid other nodes using the ID in the meantime.
Further, lock the config after the VM was destroyed with a config lock
named, well, destroyed. This way it's easy to know that the CT was
destroyed but still has the config skeleton and possibly leftover FW,
access, etc. entries.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Oguz Bektas [Mon, 14 Oct 2019 08:28:47 +0000 (10:28 +0200)]
prepend underscores for is_volume_in_use private helper
this helper was defined twice, once as 'my $is_volume_in_use' sub and
once as a helper sub. as with our other helpers of a similar structure,
it is better to prepend two underscores to the variable sub.
Oguz Bektas [Fri, 13 Sep 2019 10:35:57 +0000 (12:35 +0200)]
fix issue where ttys aren't correctly set after restore
restore from unpriv to priv causes a problem with the log-in from web
console, since the /etc/securetty file isn't modified after a restore to
reflect that change (/dev/lxc/tty1 and so on).
template_fixup is normally called in post_create_hook, but we have no
$password or $ssh_keys to call the hook with during the restore. instead
we call template_fixup by itself to fix the ttys on some distributions.
Signed-off-by: Oguz Bektas <o.bektas@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Oguz Bektas [Mon, 26 Aug 2019 14:06:32 +0000 (16:06 +0200)]
don't leave fstrim lock if mount_all fails
when a container has a mountpoint which can't be mounted for some
reason, mount_all dies and the fstrim lock stays. prevent this by
moving the call into eval, warn if any error occurs.
Still try to unmount all already mounted MPs so that nothing blocking
is left behind.
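The pattern can be sketched like this, in Python for illustration only (the real code uses a Perl eval). The callables `mount_all`, `run_fstrim` and `unmount_all` as well as the dict-based lock are hypothetical stand-ins.

```python
import sys


def fstrim_locked(conf, mount_all, run_fstrim, unmount_all):
    """Run fstrim under a config lock that is released on every path."""
    conf["lock"] = "fstrim"
    try:
        mount_all()
        run_fstrim()
    except Exception as err:
        # Warn instead of dying, so the cleanup below is always reached.
        print(f"warning: fstrim failed: {err}", file=sys.stderr)
    finally:
        try:
            # Mount points mounted before the failure must not stay behind.
            unmount_all()
        except Exception as err:
            print(f"warning: unmount failed: {err}", file=sys.stderr)
        conf.pop("lock", None)
```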
Signed-off-by: Oguz Bektas <o.bektas@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Tue, 27 Aug 2019 16:49:01 +0000 (18:49 +0200)]
setup: allow CentOS 5 and CentOS 8
One is in the extended support phase, it should not be used but
people report that the CentOS 6 code path works just fine, so why
not...
The other is for the upcoming CentOS 8, while not fully testable for
compatibility yet, CentOS 7 code path should do the trick, else
we'll need to adapt it anyway, so see this as experimental
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
mountpoints: create parent dirs with correct owner
otherwise unprivileged containers might end up with directories that
they cannot modify since they are owned by the user root in the host
namespace, instead of root inside the container.
note: the problematic behaviour is only exhibited when an intermediate
directory needs to be created, e.g. a mountpoint /test/mp gets mounted,
and /test does not yet exist.
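A rough sketch of the idea, in Python for illustration only: walk the path below the container root and hand every directory that had to be created to the container's root user. The function name, the injectable `chown` parameter and the example IDs are assumptions.

```python
import os


def mkdir_parents_owned(base, relpath, uid, gid, chown=os.chown):
    """Create missing directories below base and chown only the new ones."""
    created = []
    current = base
    for part in relpath.strip("/").split("/"):
        current = os.path.join(current, part)
        if not os.path.isdir(current):
            os.mkdir(current)
            # uid/gid are the host-side IDs that map to root in the CT.
            chown(current, uid, gid)
            created.append(current)
    return created
```

For a mountpoint /test/mp where /test does not exist yet, both /test and /test/mp would be created and chowned; already existing directories keep their owner.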
Thomas Lamprecht [Fri, 19 Jul 2019 13:42:13 +0000 (15:42 +0200)]
debian: bump compat to 12 and don't restart container.slice
since compat 10 the restart is default, as I want to use
'dh_installsystemd' (vs 'dh_systemd_start') I need at least compat
level 11, so go for the now recommended compat level 12.
diffoscope tells me that the main change is the wanted one:
./postinst
> @@ -1,10 +1,15 @@
> #!/bin/sh
> set -e
> -# Automatically added by dh_systemd_start/12.1.1
> +# Automatically added by dh_installsystemd/12.1.1
> if [ "$1" = "configure" ] || [ "$1" = "abort-upgrade" ] || [ "$1" = "abort-deconfigure" ] || [ "$1" = "abort-remove" ] ; then
> if [ -d /run/systemd/system ]; then
> systemctl --system daemon-reload >/dev/null || true
> - if [ -n "$2" ]; then
> - _dh_action=restart
> - else
> - _dh_action=start
> - fi
> - deb-systemd-invoke $_dh_action 'system-pve\x2dcontainer.slice' >/dev/null || true
> + deb-systemd-invoke start 'system-pve\x2dcontainer.slice' >/dev/null || true
> fi
> fi
> # End automatically added section
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 17 Jul 2019 10:07:40 +0000 (12:07 +0200)]
setup getty: ensure the getty.target is not masked
some distro templates have this masked by default, it makes sense to
always ensure that it can work, a CT admin can still prevent this by
using the .pve-ignore.$file mechanism.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Acked-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Thomas Lamprecht [Thu, 18 Jul 2019 15:17:17 +0000 (17:17 +0200)]
setup getty: drop now obsolete setup_systemd_console
The setup_container_getty_service can now also handle the old
getty@.service if the newer container-getty@.service is not
available. So drop it, and convert the two remaining users to calling
the now compatible setup_container_getty_service.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Acked-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Some recent distributions running as an LXC container eat up the relatively
low default limits very fast. Thus increase all those
(semi-related) limits by a factor of 512. This was chosen by using
one of our bigger known CT setups (~1500 CTs per host) and the fact
that I can have only a very low count (circa 5 - 7) of running
"inotify watch hungry" CTs (e.g., ones with a recent systemd > 240).
So, as 5 * 512 = 2560 is well above 1500, we can assume with confidence
that this allows most reasonable and existing setups by default.
As with the kernel commit d46eb14b735b11927d4bdc2d1854c311af19de6d
"fs: fsnotify: account fsnotify metadata to kmemcg" [0] the memory
usage from the watch and queue overhead is accounted to the user's
respective memory CGroup (i.e., for LXC containers their memory
limit), we can do this without too much fear of negative implications.
Don't change the hardcoded kernel default values directly though,
ship a sysctl.d configuration file, which is a bit more transparent
about what happens and can be shipped by the component needing this
(i.e., pve-container).
Follow the considerations of `man 5 sysctl.d` for shipping:
> Packages should install their configuration files in /lib/. Files
> in /etc/ are reserved for the local administrator, who may use this
> logic to override the configuration files installed by vendor
> packages. All configuration files are sorted by their filename in
> lexicographic order, regardless of which of the directories they
> reside in. If multiple files specify the same option, the entry in
> the file with the lexicographically latest name will take
> precedence. It is recommended to prefix all filenames with a
> two-digit number and a dash, to simplify the ordering of the files.
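The precedence rule quoted above can be illustrated with a small Python sketch. It only models the "lexicographically latest file name wins" part and assumes same-named files in different directories were already resolved; the file names and values are made up.

```python
def effective_sysctls(files: dict) -> dict:
    """Merge per-file settings; later file names override earlier ones."""
    merged = {}
    for name in sorted(files):  # lexicographic order of the file names
        merged.update(files[name])
    return merged


if __name__ == "__main__":
    # Hypothetical vendor file plus a local admin override.
    files = {
        "10-vendor.conf": {"fs.inotify.max_user_instances": "1024"},
        "99-local.conf": {"fs.inotify.max_user_instances": "256"},
    }
    print(effective_sysctls(files))
```

This is why a low two-digit prefix suits a vendor-shipped file: any later-sorting file from the local administrator still takes precedence.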
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Stefan Reiter [Tue, 9 Jul 2019 15:20:57 +0000 (17:20 +0200)]
fix #2270: allow custom lxc options to be restored as root
Seems to be a regression introduced with f360d7f16b094fa258cf82d2557d06f3284435e4 (related to #2028).
$conf->{'lxc'} would always be defined, hence we never replaced it with
the restored options.
Co-developed-by: Oguz Bektas <o.bektas@proxmox.com>
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Oguz Bektas [Fri, 5 Jul 2019 11:27:05 +0000 (13:27 +0200)]
fix #1451: allow one to add mount options to CT mountpoints
for now allows the following non-problematic ones:
* noexec - Do not permit execution of binaries on the mounted FS
* noatime - Do not update inode access times on this filesystem
* nosuid - Do not allow suid or sgid bits to take effect
* nodev - Do not interpret character or block devices on the FS
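A minimal sketch of validating against such an allow-list, in Python for illustration only. The `;` separator and the function name are assumptions; only the four option names come from the list above.

```python
# The four options named in the commit message.
ALLOWED_MOUNT_OPTIONS = ("noatime", "nodev", "noexec", "nosuid")


def parse_mount_options(value: str) -> list:
    """Split an option string and reject anything outside the allow-list."""
    options = [opt for opt in value.split(";") if opt]
    for opt in options:
        if opt not in ALLOWED_MOUNT_OPTIONS:
            raise ValueError(f"mount option not allowed: {opt!r}")
    return options
```

An allow-list keeps the door closed by default, so further options can be added one by one once they are known to be non-problematic.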
Dominic Jäger [Wed, 12 Jun 2019 10:04:57 +0000 (12:04 +0200)]
Fix #576: Fix dangling files for Move Disk
When Move Disk is called for a container rsync starts copying it to a
new destination. This initial rsync process gets killed when the Stop
button gets pressed. At this moment the destination file is not fully
copied and useless as a consequence. Our code already tries to remove
it. However, rsync has forked and those forks are still accessing the
destination file for some time. Thus, the attempt to remove it fails.
With the patch we wait for other processes to release the destination
files. As we are in a mount namespace and protected by a config lock,
those other processes should be children of rsync only. The waiting
time was less than a second when I tried it. Afterwards, the existing
remove procedure is carried out.
Co-developed-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Dominic Jäger <d.jaeger@proxmox.com>
Alwin Antreich [Thu, 23 May 2019 07:13:40 +0000 (09:13 +0200)]
Fix: check if compression_map format is undefined
We want to check for a supported compression type, but the check was
not correct: it only works if both sides are scalars, whereas an
assignment to an array is always "truthy". So explicitly check
whether the compression type is supported beforehand.
Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
Co-authored-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
This allows having the same semantics as qemu-server:
* immediate hard-kill
* shutdown with kill after timeout
* shutdown without kill after timeout
And thus we can finally move the vm_shutdown API call to correct
semantics, i.e., do not immediately hard-kill if forceStop is not passed
but rather see it as a stop-after-timeout knob.
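The three cases can be summarized in a small Python sketch, for illustration only. The parameter names `shutdown` and `kill_after_timeout` and the step labels are hypothetical, not the real vm_stop signature.

```python
def stop_plan(shutdown: bool, kill_after_timeout: bool) -> list:
    """Return the ordered steps for one of the three stop semantics."""
    if not shutdown:
        # Immediate hard-kill, no clean shutdown attempt.
        return ["kill"]
    steps = ["shutdown", "wait-for-timeout"]
    if kill_after_timeout:
        # Only relevant if the CT is still running once the timeout expired.
        steps.append("kill-if-still-running")
    return steps
```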
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
No call-site used this parameter, and thus it was dead code.
Remove it not only for cleanup's sake but also to make space for a new
"nokill-after-timeout" parameter, coming in a future patch.
This code was always dead since it was introduced with the addition
of vm_stop in commit b1bad293c4f7a6024bbd363b6784b3875ca5d098
so pretty safe to remove anyway.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Stoiko Ivanov [Mon, 6 May 2019 14:27:44 +0000 (16:27 +0200)]
raise supported fedora version to 30
Tested by installing a fedora 29 container and upgrading it via dnf [0].
The upgraded container boots, but in order to get networking running (and far
fewer warnings and errors in the journal) 'nesting' needs to be activated both
for privileged and unprivileged containers.
Christian Ebner [Wed, 17 Apr 2019 14:38:28 +0000 (16:38 +0200)]
fix: #1075: Correctly restore CT templates from backup
Restoring a backup from a CT template wrongly resulted in a CT with the template
flag set in the config.
This makes sure the CT template backup gets restored to a CT and only if the
storage supports templates, the resulting CT is converted to a template.
Otherwise the backup restores simply to a CT.
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
Unconditionally add a '--bwlimit' parameter to the rsync invocation, defaulting
to an argument of '0' (= unlimited - see `man rsync`).
Normally this is a rate per second, with a passed unit. With no unit
passed rsync assumes "K", which is exactly what our units are in, so
make our life easy and omit it.
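A simplified sketch of building such an invocation, in Python for illustration only. The function name and the reduced flag set are assumptions; the real invocation passes more options.

```python
def rsync_command(src: str, dst: str, bwlimit=None) -> list:
    """Build an rsync argv that always carries an explicit --bwlimit."""
    # 0 means unlimited; no unit suffix is appended, so rsync assumes "K".
    limit = 0 if bwlimit is None else int(bwlimit)
    return ["rsync", "--archive", f"--bwlimit={limit}", src, dst]


if __name__ == "__main__":
    print(rsync_command("/mnt/src/", "/mnt/dst/"))
    print(rsync_command("/mnt/src/", "/mnt/dst/", bwlimit=10240))
```

Always passing the parameter avoids a separate code path for the unlimited case.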
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>