An rw/ro race occurs when a container contains the same bind
mount twice and another container contains a bind mount
containing the first container's destination. If the double
bind mounts are both meant to be read-only then the second
container could theoretically swap them out between the
mount and the read-only remount call, then swap them back
for the test. So to verify this we use the same file
descriptor we use for the dev/inode check and perform an
faccessat() call and expect it to return EROFS and nothing
else.
Also include O_NOFOLLOW in the checks' openat() calls.
While the container itself might not be running and cannot
influence the mounting between check_mount_path() and
mount(), this is a possibility when multiple containers
have write access to the same recursive bind mount
hierarchy.
This patch adds a walk_tree_nofollow() function performing
two things: it walks a path from a starting point without
following symlinks, erroring if it encounters one, and, if
requested, it creates all missing directories.
This replaces both the combination of check_mount_path() and
mkpath(), and the check_mount_path() in bindmount() while
giving the latter the ability to also access the "last
component" of the path via openat() a second time after
mounting (as an alternative to also including an fstatat()
syscall) in order to verify the path which was ultimately
mounted is indeed the path walked in the first check.
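The walk described above can be sketched roughly as follows.
This is a minimal Python illustration, not the code from the
patch: the SymlinkEncountered exception, the _open_component()
helper and the rejection of "." and ".." are assumptions made
for this example. It relies on openat()-style lookups
(os.open() with dir_fd) using O_NOFOLLOW | O_DIRECTORY for
every component.

```python
import errno
import os


class SymlinkEncountered(Exception):
    """Hypothetical error: a component is a symlink or not a directory."""


# O_NOFOLLOW makes the open fail if the component is a symlink,
# O_DIRECTORY rejects anything that is not a directory.
DIR_FLAGS = os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW


def _open_component(parent_fd, name):
    """Open one path component relative to parent_fd (openat semantics)."""
    try:
        return os.open(name, DIR_FLAGS, dir_fd=parent_fd)
    except OSError as err:
        # Depending on the platform a symlink shows up as ELOOP or ENOTDIR.
        if err.errno in (errno.ELOOP, errno.ENOTDIR):
            raise SymlinkEncountered(name) from err
        raise


def walk_tree_nofollow(start, subdir, mkdir=False):
    """Walk subdir below start without following symlinks.

    Returns an open directory fd for the last component. With
    mkdir=True, missing directories are created on the way.
    """
    fd = os.open(start, os.O_RDONLY | os.O_DIRECTORY)
    try:
        for name in filter(None, subdir.split("/")):
            if name in (".", ".."):
                # Extra precaution in this sketch only.
                raise ValueError(f"refusing path component {name!r}")
            try:
                nxt = _open_component(fd, name)
            except FileNotFoundError:
                if not mkdir:
                    raise
                try:
                    os.mkdir(name, 0o755, dir_fd=fd)
                except FileExistsError:
                    pass  # created concurrently; re-checked by the open below
                nxt = _open_component(fd, name)
            os.close(fd)
            fd = nxt
        return fd
    except BaseException:
        os.close(fd)
        raise
```

The returned descriptor can be kept open across the mount so
that the last component can be opened once more afterwards and
compared against what was walked.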
if the mountpoint configuration is not explicitly provided
via -rootfs or -mpX, recreate all volume mountpoints on the
provided -storage, restore the archive and then add
non-volume mountpoints back to the configuration.
if -rootfs (and optionally any -mpX) are provided, the old
configuration contained in the backup archive is ignored.
this allows reconfiguration of mountpoints when restoring a
backup, e.g. to change mountpoint paths or other options. in
this mode restoring backups to bind or device mountpoints is
also possible.
LXC::Config: make mountpoint_backup_enabled a class method
All subs in there are defined as class methods, and the commit
which introduced this one also called it as one in
PVE::VZDump::LXC, which broke vzdump on containers because
the function received the wrong parameters.
Close #999: gentoo: hostname is in /etc/conf.d/hostname
This is also a shell-style assignment but should only
contain the hostname and nothing else. Although if someone
complains we could switch to ct_modify_file there as well.
remove_gateway_scripts() for Debian containers can easily
match user-created scripts since it's not strict enough,
which, now that we always clean up gateways, appears to be an
issue for some.
To deal with this we update this portion to also use
BEGIN/END markers like we did with other files. (Note that
this means we explicitly include the BEGIN/END comment lines
in the 'attribute' list while parsing sections.)
In order to not break existing container configurations we
now remove either a begin/end marked section in
remove_gateway_scripts(), or if none was found fall back to
a much stricter variant of the old matching algorithm which
only triggers if the gateway setup lines are complete and
unmodified.
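A rough Python sketch of the two-stage removal described
above. The marker strings and the four legacy "ip route"
lines are assumptions for illustration; the exact text
written by the setup code is not shown in this message.

```python
import re

# Assumed marker comments; the real strings may differ.
BEGIN = "# --- BEGIN PVE ---"
END = "# --- END PVE ---"

# Assumed shape of the old, unmarked gateway setup. All four lines must
# be present, in order, and agree on gateway and device.
LEGACY = re.compile(
    r"post-up ip route add (?P<gw>\S+) dev (?P<dev>\S+)\n"
    r"post-up ip route add default via (?P=gw)\n"
    r"pre-down ip route del default via (?P=gw)\n"
    r"pre-down ip route del (?P=gw) dev (?P=dev)"
)


def remove_gateway_scripts(lines):
    """Return the section's attribute lines without the gateway setup."""
    stripped = [line.strip() for line in lines]

    # Preferred: drop everything between the BEGIN/END markers, inclusive.
    if BEGIN in stripped:
        start = stripped.index(BEGIN)
        if END in stripped[start + 1:]:
            end = stripped.index(END, start + 1)
            return lines[:start] + lines[end + 1:]

    # Fallback: only remove a complete, unmodified legacy block.
    for i in range(len(stripped) - 3):
        if LEGACY.fullmatch("\n".join(stripped[i:i + 4])):
            return lines[:i] + lines[i + 4:]

    return list(lines)
```

Anything a user has edited, even by appending to a single
line, no longer matches the fallback and is left alone.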
With DHCPv6 we're in a similar boat as with Alpine: The
default templates only ship busybox's udhcpc, which doesn't do
IPv6 at all, and even if dhclient/dhcpcd are installed there'd
still be no way to configure ONLY DHCPv6 without including
DHCPv4. (We could just dump in a dhclientv6.sh script to
/lib64/netifrc/net/dhclientv6.sh if someone absolutely needs
it...)
Also the fact that the network configuration can in *theory*
be a full blown bash script is a bit inconvenient.
Thomas Lamprecht [Wed, 11 May 2016 15:06:14 +0000 (17:06 +0200)]
setup: check if securetty exists
If securetty does not exist yet (e.g. in some Alpine 3.2
templates), this leads to a die on CT creation, although we do
not need an existing securetty file, as all login devices/ttys
are already allowed if it does not exist.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
It seems busybox has some problems with links, so the tty dev
detection doesn't work. As a workaround, also add the
lxc/tty[1-4] and lxc/console devs to securetty to allow root
login over the console.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
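The securetty handling from the two commits above could look
roughly like this in Python. The function name
allow_lxc_ttys() and its return value are made up for this
sketch; only the entry names (lxc/console, lxc/tty1-4) and
the "skip if the file is missing" rule come from the
messages.

```python
import os

# Entries named in the commit message: lxc/console and lxc/tty1..4.
EXTRA_TTYS = ["lxc/console"] + [f"lxc/tty{i}" for i in range(1, 5)]


def allow_lxc_ttys(securetty_path):
    """Append the LXC tty entries to an existing securetty file.

    Returns False without creating anything if the file does not exist,
    since a missing securetty already allows all login devices.
    """
    if not os.path.exists(securetty_path):
        return False

    with open(securetty_path) as fh:
        lines = fh.read().splitlines()

    # Only add what is not listed yet, so repeated calls change nothing.
    missing = [tty for tty in EXTRA_TTYS if tty not in lines]
    if missing:
        with open(securetty_path, "w") as fh:
            fh.write("\n".join(lines + missing) + "\n")
    return True
```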
remove the calls to "ip route" in post-up and pre-down even
if no new gateway was defined for an interface, otherwise
those hooks will remain until manually removed.
CentOS needs these in route6-$iface, not route-$iface.
It also seems to make sense to not include the
IPV6_DEFAULTGW when a route6-$iface file is used containing
the default gateway.
$conf->{rootfs} is supposed to be the property string value,
not the parsed property string. since this method is called
only twice (once for retrieving the rootfs information only,
once for retrieving the config only) and the second call
never needs the 'rootfs' part of the configuration, we can
safely not set it instead of introducing ugly workarounds
(e.g. setting a fake volume path or worse).
fix #942: restore ACL and other rootfs options from backup
unless overridden by explicitly setting the rootfs
parameter, restoring from a backup will now copy the rootfs
properties from the backup archive, except for 'volume' and
'ro' (for obvious reasons).
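As an illustration of the rule above, a small Python sketch
that takes a rootfs property string from a backup and keeps
everything except 'volume' and 'ro'. The helper names and
the simplified "key=value,..." parsing are assumptions; the
real property string parser handles more cases.

```python
def parse_property_string(value, default_key="volume"):
    """Parse 'vol,key=val,...' into a dict (simplified, no validation)."""
    props = {}
    for part in value.split(","):
        key, sep, val = part.partition("=")
        if sep:
            props[key] = val
        else:
            # A bare entry without '=' is the default key (the volume).
            props[default_key] = part
    return props


def restored_rootfs_options(backup_rootfs):
    """Options carried over from the backup: all but 'volume' and 'ro'."""
    props = parse_property_string(backup_rootfs)
    return {k: v for k, v in props.items() if k not in ("volume", "ro")}
```

For example, a backed-up value of
"local:101/vm-101-disk-1.raw,size=8G,acl=1,ro=1" would carry
over only size and acl.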
Unfortunately it can still happen that LXC's network link
deletion netlink messages get dropped/ignored. This is the
same issue as initially reported on the forums by sigxcpu in
October, however, it seems that some users hit this problem
more reliably currently.
debian: support containers upgraded to use systemd
These otherwise spawn consoles at /dev/pts%I and cause
errors in the logs about the container-getty@ services.
This happens for instance when dist-upgrading from wheezy to
jessie.
Read the container root password from stdin when creating a
container with 'pct create ... -password', instead of
providing it as command line argument. This is consistent
with 'pveum adduser' and pvesh, as described in #737 and #777.
Fix #918: add /dev/mapper symlinks for dm-* devices
Mount canonicalizes paths unless the -c option is used. This
is mostly fine but for device-mapper nodes (/dev/dm-*) it'll
fetch the /dev/mapper/* path and pass that to the mount
system call resulting in /proc/mounts showing the
/dev/mapper path. This is neither the one we provided (since
we use /dev/$vg/$lv), nor the one userspace tools will find
in /dev currently.
Since the dm-* paths are rather inconvenient to look at we
decided to keep mount's behavior and compensate by providing
the /dev/mapper symlinks for devices via the autodev hook.
Add force parameter for migration with bind/dev mp
Add a new 'force' parameter that allows forcing the
migration of a container despite configured bind or device
mountpoints, which will be ignored/skipped.
this allows setting the rootfs to <storage>:<size>,
automatically creating an empty volume of the specified
size on the specified storage, like for non-rootfs mps.
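A minimal Python sketch of how a <storage>:<size> value might
be told apart from an existing volume ID. The regular
expression and the function name are assumptions for
illustration, and the unit of the size is left to the
caller.

```python
import re

# Assumed, simplified pattern: a storage ID, a colon, and a plain number.
NEW_VOLUME = re.compile(
    r"(?P<storage>[A-Za-z][A-Za-z0-9_.-]*):(?P<size>\d+(?:\.\d+)?)"
)


def parse_new_volume_spec(volume):
    """Return (storage, size) for '<storage>:<size>', else None.

    None means the value is something else, e.g. an existing volume ID,
    and no new volume should be allocated for it.
    """
    match = NEW_VOLUME.fullmatch(volume)
    if match is None:
        return None
    return match.group("storage"), float(match.group("size"))
```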