Dwight Engen [Wed, 19 Feb 2014 21:44:19 +0000 (16:44 -0500)]
fix mounts not propagating back to root mntns during create and clone
Systems based on systemd mount the root shared by default. We don't want
mounts done during creation by templates nor those done internally by
bdev during rsync based clones to propagate to the root mntns.
The create case already had the right check, but the mount call was
missing "/", so it was failing.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Stéphane Graber [Tue, 18 Feb 2014 22:33:51 +0000 (17:33 -0500)]
Set a reasonable fallback for get_rundir
If get_rundir can't find XDG_RUNTIME_DIR in the environment, it'll
attempt to build a path using ~/.cache/lxc/run/. Should that fail
because of missing $HOME in the environment, it'll then return NULL and
all callers will fail in that case.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
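
The lookup order described in the get_rundir entry above can be sketched as follows. This is a minimal Python illustration, not the C implementation in LXC; the `env` parameter is an assumption added so the function can be exercised without touching the real environment.

```python
import os


def get_rundir(env=None):
    """Return a runtime directory path, or None if none can be derived."""
    # `env` is a hypothetical parameter for this sketch; default to the
    # process environment.
    env = os.environ if env is None else env

    # Preferred: the XDG runtime directory, when the session provides one.
    rundir = env.get("XDG_RUNTIME_DIR")
    if rundir:
        return rundir

    # Fallback: build a per-user path under $HOME.
    home = env.get("HOME")
    if home:
        return os.path.join(home, ".cache", "lxc", "run") + "/"

    # Neither variable is set: callers have to treat None as failure.
    return None
```

Callers would check for `None` before using the result, mirroring the NULL check the commit message mentions.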
Serge Hallyn [Tue, 18 Feb 2014 21:12:52 +0000 (15:12 -0600)]
Fix unprivileged networking
If we are unprivileged and have asked for a veth device, then create
a pipe over which to pass the veth names.
Network-related todos:
1. set mtu on the container side of veth device
2. set mtu in lxc-user-nic. Note that this probably requires an
update to the /etc/lxc/lxc-usernet file :(
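
As a rough illustration of passing veth names over a pipe, the sketch below has a child process write a pair of names to an inherited pipe fd and the parent read them back. The child script, the `container-name:host-veth` wire format, and the interface names are all assumptions made for this example; they are not taken from the LXC sources.

```python
import os
import subprocess
import sys

# Hypothetical child: writes "<container nic>:<host-side veth>" to the pipe fd
# it receives as argv[1]. The format is an assumption for this sketch.
CHILD = (
    "import os, sys\n"
    "fd = int(sys.argv[1])\n"
    "os.write(fd, b'eth0:vethAB12CD\\n')\n"
    "os.close(fd)\n"
)


def read_veth_names_from_child():
    """Create a pipe, let a child write the veth names, and read them back."""
    r, w = os.pipe()
    with os.fdopen(r) as reader:
        try:
            # pass_fds keeps the write end open in the child.
            subprocess.run(
                [sys.executable, "-c", CHILD, str(w)],
                pass_fds=(w,),
                check=True,
            )
        finally:
            # Close our copy of the write end so read() sees end-of-file.
            os.close(w)
        line = reader.read().strip()
    container_name, host_veth = line.split(":", 1)
    return container_name, host_veth
```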
Serge Hallyn [Tue, 18 Feb 2014 21:01:38 +0000 (15:01 -0600)]
cache whether 'optional' was in mntopts
After commit 4e4ca16158f91ac1271495638a4e62881169474e we were
checking for 'optional' in mntopts after forcibly removing it.
Cache whether it was present before removing it.
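
The ordering fix can be illustrated with a small sketch: record whether 'optional' is present first, then strip it. This is Python for illustration only; the function name and return shape are hypothetical and do not mirror the C code.

```python
def strip_optional(mntopts):
    """Return (cleaned_opts, was_optional) for a comma-separated option string."""
    opts = [o for o in mntopts.split(",") if o]
    # Cache the answer before the option is removed.
    was_optional = "optional" in opts
    cleaned = ",".join(o for o in opts if o != "optional")
    return cleaned, was_optional
```

A caller would pass `cleaned` to the mount call and consult `was_optional` only if that call fails.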
Serge Hallyn [Mon, 17 Feb 2014 18:47:35 +0000 (12:47 -0600)]
attach: try to use the container's seccomp policy
We can't get the actual policy (in the case where the policy file
has changed) from the container, but at least we can use the
seccomp policy file listed in the container config file.
(If anyone wants to further improve this, it may be better to get
the seccomp policy over the cmd api; not sure that's what we want,
and this seems simpler to hook into the existing code, so I went
this way for now)
Stéphane Graber [Mon, 17 Feb 2014 15:51:53 +0000 (10:51 -0500)]
download: Support nested containers in unpriv
This adds detection for the case where we are root in an unprivileged
container and then run LXC from there. In this case, we want to download
to the system location, ignore the missing uid/gid ranges and run
templates that are userns-ready.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
S.Çağlar Onur [Sun, 16 Feb 2014 21:20:48 +0000 (16:20 -0500)]
fill missing netdev fields for unprivileged containers
lxc-user-nic now returns the names of the interfaces and
unpriv_assign_nic function parses that information to fill
missing netdev->veth_attr.pair and netdev->name.
With this patch, get_running_config_item now reports the correct
information.
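
A sketch of the fill-in step, assuming the helper prints one line of the form `container-name:host-veth`. That output format, the `Netdev` class, and the function name are assumptions for illustration; the real code fills netdev->name and netdev->veth_attr.pair in C.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Netdev:
    # Simplified stand-in for the netdev fields named in the commit message.
    name: Optional[str] = None
    veth_pair: Optional[str] = None


def assign_nic_names(netdev, helper_output):
    """Fill missing fields of `netdev` from the helper's (assumed) output."""
    fields = helper_output.strip().split(":")
    if len(fields) != 2 or not all(fields):
        raise ValueError("unexpected helper output: %r" % helper_output)
    container_name, host_pair = fields
    # Only fill what is missing; values already set are left alone.
    if netdev.name is None:
        netdev.name = container_name
    if netdev.veth_pair is None:
        netdev.veth_pair = host_pair
    return netdev
```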
Dwight Engen [Thu, 13 Feb 2014 21:13:03 +0000 (16:13 -0500)]
create fd, stdin, stdout, stderr symlinks in /dev
The kernel's Documentation/devices.txt says that these symlinks should
exist in /dev (they are listed in the "Compulsory" section). I'm not
currently adding nfsd and X0R since they are only required for iBCS, but
they can be easily added to the array later if need be.
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Acked-by: Michael H. Warfield <mhw@WittsEnd.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
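
The array-driven approach mentioned above might look roughly like this. It is a Python sketch rather than the C code; the link targets are assumed from the devices.txt table, and `dev_dir` is a parameter introduced here so the function can be tried on a scratch directory.

```python
import os

# (target, link name) pairs; targets assumed from Documentation/devices.txt.
DEV_SYMLINKS = [
    ("/proc/self/fd", "fd"),
    ("fd/0", "stdin"),
    ("fd/1", "stdout"),
    ("fd/2", "stderr"),
]


def setup_dev_symlinks(dev_dir):
    """Create the compulsory symlinks under dev_dir, skipping existing names."""
    for target, name in DEV_SYMLINKS:
        link = os.path.join(dev_dir, name)
        # lexists() is true even for dangling symlinks, so nothing is clobbered.
        if os.path.lexists(link):
            continue
        os.symlink(target, link)
```

Adding further entries later only requires appending to the list.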
Stéphane Graber [Thu, 13 Feb 2014 17:42:21 +0000 (12:42 -0500)]
lxc-start-ephemeral: Use attach
With this change, systems that support it will use attach to run any
provided command.
This doesn't change the default behaviour of attaching to tty1, but it
does make it much easier to script or even get a quick shell with:
lxc-start-ephemeral -o p1 -n p2 -- /bin/bash
I'm doing the setgid,initgroups,setuid,setenv magic in python rather
than using the attach_wait parameters as I need access to the pwd module
in the target namespace to grab the required information.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
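
The setgid, initgroups, setuid, setenv sequence could be sketched as below. This is an illustration, not the code from lxc-start-ephemeral: the function name, the `env` parameter, and the exact set of environment variables are assumptions. It would have to run inside the target namespace so that pwd reads the container's user database.

```python
import os
import pwd


def become_user(username, env=None):
    """Drop to `username` and set a basic login environment."""
    # `env` is a hypothetical parameter; default to the process environment.
    env = os.environ if env is None else env
    entry = pwd.getpwnam(username)

    # Order matters: group changes must happen while still privileged.
    os.setgid(entry.pw_gid)
    os.initgroups(entry.pw_name, entry.pw_gid)
    os.setuid(entry.pw_uid)

    # Which variables the real script sets is an assumption here.
    env["HOME"] = entry.pw_dir
    env["SHELL"] = entry.pw_shell
    env["USER"] = entry.pw_name
    env["LOGNAME"] = entry.pw_name
    return entry
```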
Stéphane Graber [Thu, 13 Feb 2014 16:17:48 +0000 (11:17 -0500)]
coverity: Do chdir following chroot
We used to do chdir(path), chroot(path). That's correct but not properly
handled by coverity, so do chroot(path), chdir("/") instead as that's the
recommended way.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
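
For reference, the recommended ordering looks like this in a Python sketch (the actual change is in C; `enter_root` is a name invented here). Calling it requires the privilege to chroot.

```python
import os


def enter_root(path):
    """Change root to `path`, then move the working directory inside it."""
    os.chroot(path)
    # After chroot, "/" refers to the new root, so the cwd cannot point outside.
    os.chdir("/")
```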
Serge Hallyn [Thu, 13 Feb 2014 06:52:52 +0000 (00:52 -0600)]
overlayfs_clonepaths: if unpriv then rsync in a userns
This allows lxc-snapshot and lxc-clone -s from an overlayfs container
to work unprivileged. (lxc-clone -s from a directory backed container
already did work)
Stéphane Graber [Wed, 12 Feb 2014 22:46:06 +0000 (17:46 -0500)]
Fix some configure.ac issues
- Run on distro without lsb_release
- Don't try and interpret with_runtime_path as a command
- Don't print stuff on screen while in the middle of a check
Stéphane Graber [Wed, 12 Feb 2014 22:30:12 +0000 (17:30 -0500)]
travis: Build using the daily PPA
Now that we depend on seccomp2, the backport currently in precise is too
old to allow for a successful build, so instead use ppa:ubuntu-lxc/daily
which contains recent versions of all needed build-dependencies.
Serge Hallyn [Wed, 12 Feb 2014 21:50:20 +0000 (15:50 -0600)]
seccomp: introduce v2 policy (v2)
v2 allows specifying system calls by name, and specifying
architecture. A policy looks like:
2
whitelist
open
read
write
close
mount
[x86]
open
read
Also use SCMP_ACT_KILL by default rather than SCMP_ACT_ERRNO(31) -
which confusingly returns 'EMLINK' on x86_64. Note this change
is also done for v1 as I think it is worthwhile.
With this patch, I can in fact use a seccomp policy like:
2
blacklist
mknod errno 0
after which 'sudo mknod null c 1 3' silently succeeds without
creating the null device.
changelog v2:
add blacklist support
support default action
support per-rule action
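
To make the v2 layout concrete, here is a small Python sketch that reads the policy text shown above into a dictionary. It is not LXC's parser: the handling of a default action on the whitelist/blacklist line, the `"all"` key for rules outside any `[arch]` section, and treating everything after the syscall name as the action string are assumptions made for this example.

```python
def parse_policy_v2(text):
    """Parse a v2-style policy into {'type', 'default_action', 'rules'}."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or lines[0] != "2":
        raise ValueError("not a version 2 policy")
    if len(lines) < 2:
        raise ValueError("missing whitelist/blacklist line")

    # Second line: policy type, optionally followed by a default action
    # (the placement of the default action is an assumption).
    kind, *default = lines[1].split()
    if kind not in ("whitelist", "blacklist"):
        raise ValueError("unknown policy type: %r" % kind)

    policy = {
        "type": kind,
        "default_action": " ".join(default) or None,
        "rules": {"all": []},  # "all" = rules outside any [arch] section
    }
    arch = "all"
    for ln in lines[2:]:
        if ln.startswith("[") and ln.endswith("]"):
            # Architecture header: later rules apply to this arch only.
            arch = ln[1:-1]
            policy["rules"].setdefault(arch, [])
            continue
        # A rule is a syscall name, optionally followed by a per-rule action.
        name, *action = ln.split()
        policy["rules"][arch].append((name, " ".join(action) or None))
    return policy
```

Feeding it the blacklist example from the commit message yields a single rule `("mknod", "errno 0")` under `"all"`.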
Stéphane Graber [Wed, 12 Feb 2014 16:58:15 +0000 (11:58 -0500)]
lxc-start-ephemeral: Allow unprivileged run
This allows running lxc-start-ephemeral using overlayfs. aufs remains
blocked as it hasn't been looked at and patched to work in the kernel at
this point (not sure if it ever will).
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Serge Hallyn [Wed, 12 Feb 2014 04:20:03 +0000 (22:20 -0600)]
check for access to lxcpath
The previous check for access to rootfs->path failed in the case of
overlayfs or loop backing stores. Instead just check early on for
access to lxcpath.
Instead force a copy clone. Else if the user makes a change
to the original container, the snapshot will be affected.
The user should first create a snapshot clone, then use
and snapshot that clone while leaving the original container
untouched.
TAMUKI Shoichi [Sat, 8 Feb 2014 09:15:40 +0000 (18:15 +0900)]
lxc-plamo: various small changes
- Change redirection of fd 200 to 9 (greater than 9 may conflict with
fd the shell uses internally)
- Replace numeric line addressing of ed to regular expression to avoid
correcting the line addressing at each modification of init scripts
- Correct the option order (trivial)
Serge Hallyn [Fri, 7 Feb 2014 19:00:50 +0000 (13:00 -0600)]
add_device_node: act in a chroot
The goal is to avoid an absolute symlink in the guest redirecting
us to the host's /dev. Thanks to the libvirt team for considering
that possibility!
We want to work on kernels which do not support setns, so we simply
chroot into the container before doing any rm/mknod. If /dev/vda5
is a symlink to /XXX, or /dev is a symlink to /etc, this is now
correctly resolved locally in the chroot.
We would have preferred to use realpath() to check that the resolved
path is not changed, but realpath across /proc/pid/root does not
work as expected.
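
To illustrate why resolving inside the chroot matters, the sketch below computes, from the host side, where a path would land if symlinks were followed the way a process chrooted into the container sees them. It is only a model of the behaviour described above, not what LXC does (LXC simply chroots); the function name and the link limit are assumptions.

```python
import os


def resolve_in_root(root, path, max_links=40):
    """Resolve `path` as a process chrooted into `root` would see it."""
    root = os.path.realpath(root)
    pending = [p for p in path.split("/") if p]
    resolved = []  # path components below `root`
    links = 0
    while pending:
        part = pending.pop(0)
        if part == ".":
            continue
        if part == "..":
            # ".." at the chroot's "/" stays at "/".
            if resolved:
                resolved.pop()
            continue
        candidate = os.path.join(root, *resolved, part)
        if os.path.islink(candidate):
            links += 1
            if links > max_links:
                raise OSError("too many levels of symbolic links")
            target = os.readlink(candidate)
            if target.startswith("/"):
                # An absolute target restarts at the chroot's "/", not the host's.
                resolved = []
            pending = [p for p in target.split("/") if p] + pending
        else:
            resolved.append(part)
    return os.path.join(root, *resolved)
```

With a guest whose /dev/vda5 points at /XXX, the result stays under the container root instead of naming the host's /XXX.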