Tycho Andersen [Tue, 15 Mar 2016 18:01:36 +0000 (12:01 -0600)]
build: fix build on android (and ppc)
The problem here is that dev_t on most platforms is `long unsigned`, but on
android (and ppc?) it's `long long unsigned`. Let's just upcast to `long
long unsigned` and use that format string to keep the compilers happy.
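A minimal sketch of the upcast described above, not the actual patch. format_dev() is a hypothetical helper name. The point is only that casting dev_t to `unsigned long long` and printing with `%llu` compiles cleanly whatever width dev_t has on the platform.

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not an LXC function): print a device number without
 * depending on the platform's width for dev_t. The value is upcast to
 * unsigned long long so that %llu is always the matching format.
 */
static int format_dev(char *buf, size_t len, dev_t dev)
{
	return snprintf(buf, len, "%llu", (unsigned long long)dev);
}
```

The cast only widens the value, so it is safe on platforms where dev_t is already `long long unsigned`.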
Tycho Andersen [Sat, 12 Mar 2016 01:10:40 +0000 (18:10 -0700)]
c/r: drop lxc.console=none config requirement
There are a few things going on in this patch.
1. /dev/console is an external mount since it is bind mounted from the
host. However, we don't want to use criu's --ext-mount-map auto handling
here, because that will bind mount exactly the same path from the host
on restore, but if the pts device is different on the target host, we'll
bind mount the wrong one.
2. We need to tell CRIU how to restore the TTY. Since we declare the tty as
--external, we need to provide it via --inherit-fd (even though we've
already fixed up the environment).
Tycho Andersen [Sat, 12 Mar 2016 02:01:43 +0000 (19:01 -0700)]
criu: hide more stuff in criu.c
Various other functions/structures are now only used in criu.c, so let's
hide stuff there so as not to pollute headers.
This commit also bumps the required CRIU version to 2.0. While we don't
*require* any features that aren't in 1.8 patchlevel 21 or above, 2.0 is a
vast improvement, and so we should use that instead.
Tycho Andersen [Thu, 10 Mar 2016 18:10:14 +0000 (11:10 -0700)]
cgroup: cgroup_escape takes no arguments
cgroup_escape() is a slight abuse of the cgroup code: what we really want
here is to escape the *current* process, whether it happens to be the LXC
monitor or not, into the / cgroups.
In the case of dump, we can't do an lxc_init(), because:
We don't want to make this a command to send to the handler, because again,
cgroup_escape() is intended to escape the *current* task to the root
cgroups.
So, let's just have cgroup_escape() build its own handler when required.
Serge Hallyn [Wed, 9 Mar 2016 07:04:46 +0000 (23:04 -0800)]
cgfsng: fix real bug and fake libc realloc bug
read_file was using the wrong value for the string length. Also,
realloc on i386 is wonky with small sizes - so use a batch size
to avoid small reallocs.
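A rough sketch of the batching idea, under assumptions: read_all() and BATCH_SIZE are illustrative names and values, not the ones used in cgfsng. The buffer grows by a fixed batch instead of by a few bytes at a time, and the string length is taken from the byte counts returned by read() rather than computed separately.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

#define BATCH_SIZE 50 /* assumed value, chosen only for illustration */

/*
 * Hypothetical helper: read everything from fd into a NUL-terminated heap
 * buffer, growing it BATCH_SIZE bytes at a time. The number of bytes read
 * is stored in *len. Returns NULL on failure; the caller frees the result.
 */
static char *read_all(int fd, size_t *len)
{
	char *buf = NULL;
	size_t used = 0, cap = 0;

	for (;;) {
		ssize_t n;

		/* Keep room for at least one more byte plus the trailing NUL. */
		if (used + 1 >= cap) {
			char *tmp = realloc(buf, cap + BATCH_SIZE);

			if (!tmp) {
				free(buf);
				return NULL;
			}
			buf = tmp;
			cap += BATCH_SIZE;
		}

		n = read(fd, buf + used, cap - used - 1);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			free(buf);
			return NULL;
		}
		if (n == 0)
			break;
		used += (size_t)n;
	}

	buf[used] = '\0';
	*len = used;
	return buf;
}
```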
Serge Hallyn [Tue, 8 Mar 2016 03:10:58 +0000 (19:10 -0800)]
prevent containers from reading /sys/kernel/debug
Unprivileged containers cannot read it anyway, but also prevent root
owned containers from doing so. Sadly upstart's mountall won't run
if we try to prevent it from being mounted at all.
Serge Hallyn [Mon, 7 Mar 2016 19:16:43 +0000 (11:16 -0800)]
cgfsng - remove the code checking whether devices cgroup lines are already done
We may need to revert this, but I *think* we no longer need this
with default configs. The idea iirc was that if caller cannot
write to devices.allow (i.e. is in a user namespace), then ignore
permission failures if the cgroups are already sufficiently set up.
Serge Hallyn [Thu, 3 Mar 2016 18:31:23 +0000 (10:31 -0800)]
cgfsng: next generation filesystem-backed cgroup implementation
This makes simplifying assumptions: all usable cgroups must be
mounted under /sys/fs/cgroup/controller or /sys/fs/cgroup/contr1,contr2.
Currently this will only work with cgroup namespaces, because
lxc.mount.auto = cgroup is not implemented. So cgfsng_ops_init()
returns NULL if cgroup namespaces are not enabled.
lxc-attach -n a -- sh -c 'echo ERR >&2' > /dev/null
There seems to be no easy way to discern when we need to write to stderr
instead of stdout when we receive an event on the master fd of an allocated
pty. So we're using a "trick"/"hack". We write to STDOUT_FILENO if it refers to
a pty. If STDOUT_FILENO does not refer to a pty we check whether STDERR_FILENO
refers to a pty and if so write to it.
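The selection logic above could look roughly like the sketch below. pick_output_fd() is a hypothetical name, and isatty() is used here as an approximation: it reports any terminal, not only a pty, so the real check may differ.

```c
#include <unistd.h>

/*
 * Hypothetical helper: choose where to forward data received on the pty
 * master. Prefer out_fd if it refers to a terminal, otherwise fall back to
 * err_fd if that one does. Returns -1 if neither is a terminal.
 */
static int pick_output_fd(int out_fd, int err_fd)
{
	if (isatty(out_fd))
		return out_fd;
	if (isatty(err_fd))
		return err_fd;
	return -1;
}
```

A caller would pass STDOUT_FILENO and STDERR_FILENO. With stdout redirected to /dev/null, as in the lxc-attach example above, the first check fails and stderr is chosen if it is still a terminal.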
Signed-off-by: Christian Brauner <christian.brauner@mailbox.org>
Execute script lxc-devsetup also with sysvinit and upstart.
* This script sets /dev/.lxc which is needed for autodev containers.
* Previously it was only executed with systemd. Execute it also with
the other init systems (sysvinit and upstart).
Signed-off-by: Carlos Alberto Lopez Perez <clopez@igalia.com>
Serge Hallyn [Thu, 3 Mar 2016 00:11:14 +0000 (16:11 -0800)]
cgfs: don't try to remove cgroups we haven't created
info_ptr->created_paths_count can be 0, so don't blindly dereference
info_ptr->created_paths[ created_paths_count - 1]. Apparently we never
used to have 0 at the cleanup_name_on_this_level before, but now that
we can fail with -EPERM and not just -EEXIST, we do.
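A simplified sketch of the guard, assuming a cut-down structure: struct cgroup_info_sketch and last_created_path() are hypothetical and only mirror the two fields named above. With an unsigned count of 0, `count - 1` wraps around, so the index must be checked before use.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-in for the real cgfs bookkeeping struct. */
struct cgroup_info_sketch {
	char **created_paths;
	size_t created_paths_count;
};

/*
 * Return the most recently created path, or NULL when nothing has been
 * created yet, instead of indexing created_paths[count - 1] blindly.
 */
static char *last_created_path(const struct cgroup_info_sketch *info)
{
	if (!info || info->created_paths_count == 0)
		return NULL;
	return info->created_paths[info->created_paths_count - 1];
}
```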
lxc should not reboot the container when lxc.hook.post-stop fails. It should
simply shut down. This makes the behavior of lxc.hook.post-stop and
lxc.hook.pre-start consistent. When lxc.hook.pre-start fails, the container
does not start.
Signed-off-by: Christian Brauner <christian.brauner@mailbox.org>
lxc-attach: always allocate current controlling pty
lxc-attach uses lxc_console_create() to allocate a pty on the host.
lxc_console_create() in turn calls lxc_console_peer_default() which either
makes the current controlling pty our controlling pty for the container, or
uses whatever the user gave us (e.g. /dev/tty2 etc.). For lxc-attach we always
want the current controlling pty to be used. This commit ensures that we're in
fact always using the current controlling pty. The commit also fixes a segfault
when the user specified lxc.console.path = none.
Signed-off-by: Christian Brauner <christian.brauner@mailbox.org>
Serge Hallyn [Fri, 26 Feb 2016 20:03:09 +0000 (20:03 +0000)]
fix cgfs failure for unpriv users
Cgmanager was taught a while ago that only some cgroup controllers are
crucial. Teach cgfs the same thing.
This patch needs improvement, but will fix failure of lxc without cgmanager
for unprivileged users for now. In particular, needed improvements include:
1. the check for crucial subsystems needs to include lxc.use
2. we should keep a list of the actually used subsystems so we don't keep
trying to chmod and enter after create has found we couldn't use a particular
subsystem
This fixes unprivileged lxc use. It does not appear to suffice to fix
nested unprivileged lxd usage.