there were two helpers that were not handling this correctly:
ct_make_path
since this never gets called with $opts, and there also is no 'owner'
and 'group' in $self, the previous logic could never work, sometimes
leaving nobody:nogroup files around for unprivileged containers.
since only the centos and suse plugins use this helper, the issue was
fairly limited.
ct_symlink
could create symlinks owned by nobody:nogroup. since symlinks are
created 777 by default, this just meant they were not modifiable inside
the container, but reading/dereferencing was no problem so it went
unnoticed so far.
Markus Frank [Fri, 11 Mar 2022 11:59:57 +0000 (12:59 +0100)]
fix #3917: Ignore fstrim failure in pct fstrim
With "noerr => 1" the function does not abort, when one of the
mountpoints is not fstrim compatible like zfs (has its own trim).
I do not think it is necessary to warn or error, because fstrim
tells when something is not trimmable and aborts.
make it more explicit (the whole call to the plugin's cleanup sub is
wrapped in an eval + warn anyway), so that future extensions can be
added after this point if they don't rely on snapshot removal being
successful.
Fabian Ebner [Wed, 23 Feb 2022 12:03:58 +0000 (13:03 +0100)]
fix #3424: api: snapshot delete: wait for active replication
A to-be-deleted snapshot might be actively used by replication,
resulting in a not (or only partially) removed snapshot and locked
(snapshot-delete) container. Simply wait a few seconds for any ongoing
replication.
Fabian Ebner [Wed, 23 Feb 2022 12:03:57 +0000 (13:03 +0100)]
partially fix #3424: vzdump: cleanup: wait for active replication
As replication and backup can happen at the same time, the vzdump
snapshot might be actively used by replication when backup tries
to cleanup, resulting in a not (or only partially) removed snapshot
and locked (snapshot-delete) container.
Wait up to 10 minutes for any ongoing replication. If replication
doesn't finish in time, the fact that there is no attempt to remove
the snapshot means that there's no risk for the container to end up in
a locked state. And the beginning of the next backup will force remove
the left-over snapshot, which will very likely succeed even at the
storage layer, because the replication really should be done by then
(subsequent replications shouldn't matter as they don't need to
re-transfer the vzdump snapshot).
Distro detection is done heuristically through the presence of a
`/nix/store` folder.
NixOS typically uses a script-based network configuration system that
isn't easy to configure from the outside, while the configuration
snippets would be simple to generate, bringing them in effect isn't.
LXC templates generated for proxmox are instead expected to use
systemd-networkd.
Signed-off-by: Harikrishnan R <rharikrishnan95@gmail.com>
[ Thomas: update/reword commit ] Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
the config is now updated anyway because of target-storage support, so
volume renaming is both 'free' and improves the chances of migration
with and without changing storages actually works successfully.
Dominik Csapak [Fri, 22 Oct 2021 06:44:13 +0000 (08:44 +0200)]
fix #3635: fix overly-strict pool permission check on create
we do not need Permissions.Modify on the pool as the actual required
check for 'VM.Allocate' for that pool is already handled below, so
remove it like we did in qemu-server 4fc5242 ("fix pool permission
checks on create")
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com> Tested-by: Mira Limbeck <m.limbeck@proxmox.com> Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Fabian Ebner [Thu, 13 Jan 2022 11:04:04 +0000 (12:04 +0100)]
config: parse_volume: don't die when noerr is set
AFAICT, the only existing callers using noerr=1 are in
__snapshot_delete_remove_drive, and in AbstractConfig's
foreach_volume_full. The former should not be affected, as unknown
keys should never make their way in there. For the latter, it makes
iterating with
$opts = { extra_keys => ['vmstate'] }
possible while being agnostic of guest type. Previously, it would die
for LXC configs, but now the unknown key is simply skipped there.
Oguz Bektas [Wed, 1 Dec 2021 15:17:56 +0000 (16:17 +0100)]
config: allow 'lazytime' mount option for containers
worked fine here in ubuntu container.
root@CT1022:/# mount | grep lazy
/var/lib/pve/local-btrfs/images/1022/vm-1022-disk-0/disk.raw on / type ext4 (rw,relatime,lazytime)
/var/lib/pve/local-btrfs/images/1022/vm-1022-disk-0/disk.raw on /snap type ext4 (rw,relatime,lazytime)
requested in community forum [0]
[0]: https://forum.proxmox.com/threads/100454/
Tested-by: Dylan Whyte <d.whyte@proxmox.com> Signed-off-by: Oguz Bektas <o.bektas@proxmox.com>
with `storage` being optional (and not allowed for reassign operations),
the ACL path in the schema can end up as `/storage/-`, which is wrong.
replace it with an explicit check:
- target `storage` for move mp
- storage from source disk for reassign mp (we only rename here, but
it's still a new volume on that storage after all)
Aaron Lauterer [Tue, 9 Nov 2021 14:55:39 +0000 (15:55 +0100)]
api: move-volume: add move to another container
The goal of this is to expand the move-volume API endpoint to make it
possible to move a container volume / mountpoint to another container.
For unused volumes, the API parameters have been changed to allow them
as well. This means, additional checks had to be introduced to avoid
migration of an unusedX volume to another storage. Some follow up work
is needed for that to work properly.
Moving the rootfs from or to another container is prohibited.
Oguz Bektas [Wed, 13 Oct 2021 12:31:53 +0000 (14:31 +0200)]
api: clone_vm: don't include snapshot properties
apparently this caused a weird[0] bug... when a container with a snapshot was
cloned, it would take 'parent: foo' from the original container. if you
add a new snapshot 'bar', and then another one 'foo', this causes the
snapshots to become parents of each other (thus not parsed correctly in
the tree view of GUI nor with 'pct listsnapshot CTID')
we also drop these properties for VMs, so it makes sense to do the same
here as well.
Fabian Ebner [Thu, 7 Oct 2021 10:48:03 +0000 (12:48 +0200)]
setup: also set contents of /etc/timezone
Some distributions like CentOS 8 and Gentoo don't have the file, so
only update if it already existed.
A slight change in behavior in set_timezone is that the warning will
now trigger if /etc/localtime is a link to $tz_path, but $tz_path does
not exist. Previously, it would return early if the link matched.
Programs that rely on /etc/timezone within the container will now see
the configured timezone too. While that is more correct, it's still a
change that might be unexpected.
Reported in the community forum:
https://forum.proxmox.com/threads/pct-create-command-with-timezone-host-option-fails-to-create-a-container.97538/
Oguz Bektas [Tue, 5 Oct 2021 08:09:51 +0000 (10:09 +0200)]
setup: fix creating unmanaged containers
ssh_host_key_generate did not explicitly return in unmanaged plugin,
causing the post_create_hook to fail because of an invalid hash
reference (cannot use "1" as a HASH ref, "1" was likely being returned
implicitly as the scalar value of 'my ($self) = @_;')
setup: alpine: also use CIDR format for newer versions
original ifupdown supports this since years and ifupdown-ng, the
ifupdown replacement from the Alpine Linux corner, does too
https://github.com/ifupdown-ng/ifupdown-ng/blob/main/doc/ADMIN-GUIDE.md
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Wed, 29 Sep 2021 17:45:23 +0000 (19:45 +0200)]
setup: drop copying binfmt qemu-static executable
The binfmt-support and qemu-user-static package setup the
`/proc/sys/fs/binfmt_misc/' entry with the "fix binary" `F` flag:
> The usual behaviour of binfmt_misc is to spawn the binary lazily
> when the misc format file is invoked. However, this doesn't work
> very well in the face of mount namespaces and changeroots, so the F
> mode opens the binary as soon as the emulation is installed and
> uses the opened image to spawn the emulator, meaning it is always
> available once installed, regardless of how the environment
> changes.
--
https://www.kernel.org/doc/html/latest/admin-guide/binfmt-misc.html
which seems to be enough to make it work. binfmt-support's changelog
has some indication that it can use the `F` flag since the version
shipped in Debian Buster (PVE 6), and this support was added before
that, which would explain the earlier need for it..
Drop it now and slowly roll it out, if somebody really is using this
obscure PVE feature and yells we can always revert/workaround it.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
if a volume is only referenced in the pending section of a config it was
previously not removed when removing the CT, unless the non-default
'remove unreferenced disks' option was enabled.
keeping track of volume IDs which we attempt to remove gets rid of false
warnings in case a volume is referenced both in the config and the
pending section, or multiple times in the config for other reasons.
Thomas Lamprecht [Tue, 28 Sep 2021 12:52:26 +0000 (14:52 +0200)]
setup: avoid one-argument bless
> Normally, bless takes two arguments: a reference to the referent
> that is to become the object, and a string naming the desired class
> of that object. However, the second argument is actually optional,
> and defaults to the current package name.
-- page 365 of Perl Best Practice, Convay.
That means that a inheriting module would get the wrong class due to
that, we do not really have that issue with Setup now, but copy-is-my
hobby would allow that error to infect other code ;-)
If one wants the default behavior should say so explicitly, e.g.:
bless { ... }, __PACKAGE__
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Dominik Csapak [Wed, 4 Aug 2021 10:51:07 +0000 (12:51 +0200)]
add old config and unprivileged to check_ct_modify_config_perm
we'll need that for checking the features more granularly
for it to work correctly, we have to move the permission checks
into the 'lock_config' sub, since we now also need to check the current
config and it could change between the permission check and the lock
fix #3478: abort container creation on arch detection timeout
increased the timeout for detect_arch from 5 to 10 seconds.
until now, on any error detect_architecture would fall back to amd64.
to avoid falling back due to an timeout error this function now dies
on timeout errors.
additionally minor changes to the error messages have been made.
this is what's actually applied to the container (although
the container may be imposing an even stricter limit, but
that's not what we want to see...)
also, the v2 cpuset list may be empty (and often is for
unprivileged+nesting containers), which currently fails to
parse
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>