Serge Hallyn [Sun, 18 Mar 2012 23:31:40 +0000 (00:31 +0100)]
ubuntu templates cleanups
1. fix inconsistent use of '--auth-key' (not --auth_key) which broke their
usage
2. add --debug option to lxc-ubuntu (which does set -x to show what broke)
(idea from lifeless and benji)
3. fix incorrect assumption about group with -b option. User's default group
may not be the same as username.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Serge Hallyn [Sun, 18 Mar 2012 23:31:40 +0000 (00:31 +0100)]
do check for utmp checking at the right time
We were doing the check for whether we need to watch utmp from a
thread cloned from that which will actually do the utmp watching.
As a result, the utmp file was always being watched, even if it
didn't need to be.
Serge Hallyn [Mon, 5 Mar 2012 22:53:14 +0000 (23:53 +0100)]
cgroups: fix broken support for deprecated ns cgroup
when using ns cgroup, use /cgroup/<init-cgroup> rather than
/cgroup/<init-cgroup>/lxc
At least lxc-start, lxc-stop, lxc-cgroup, lxc-console and lxc-ls work
with this patch. I've tested this in a 2.6.35 kernel with ns cgroup,
and in a 3.2 kernel without ns cgroup.
Note also that because of the check for container reboot support,
if we're using the ns cgroup we now end up with a /cgroup/<container>/2
cgroup created, empty, by the clone(CLONE_NEWPID). I'm really not
sure how much time we want to spend cleaning such things up since
ns cgroup is deprecated in kernel.
Signed-off-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Serge Hallyn [Thu, 16 Feb 2012 20:14:13 +0000 (14:14 -0600)]
update ubuntu templates to provide macaddr and more
Add a macaddr if precisely one veth is specified but no hwaddr. Allow
specifying ssh authkeys. In cloud template, copy locales by default and allow
a tarball to be specified.
Signed-off-by: Ben Howard <ben.howard@canonical.com> Signed-off-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Serge Hallyn [Thu, 16 Feb 2012 20:13:26 +0000 (14:13 -0600)]
lxc-ubuntu: fix obscure arguments
1. --path is meant to be passed by lxc-create, but should not be passed
in by users. Don't advertise it in --help.
2. --clean syntax ends up not making much sense. Get rid of it, and
add '--flush-cache' option instead.
Signed-off-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Serge Hallyn [Thu, 16 Feb 2012 20:01:20 +0000 (14:01 -0600)]
ubuntu template changes
Author: Stéphane Graber <stgraber@ubuntu.com>
Use ubuntu/ubuntu instead of root/root by default. Stop
removing tty[56].conf in Precise. Stop messing with dhclient.conf.
Set devttydir on Precise to /dev/lxc to allow for clean upgrades.
Serge Hallyn [Tue, 7 Feb 2012 15:08:37 +0000 (09:08 -0600)]
if lxc-init can't mount /dev/shm, don't fail.
The 'lxc-init' (a lightweight init process used by lxc-execute in place of
upstart etc) tries to mount /dev/shm during startup. If that fails (for
instance /dev/shm does not exist) then it aborts execution and returns -1. This
is unreasonable as very few applications actually need /dev/shm.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Serge Hallyn [Tue, 7 Feb 2012 15:01:41 +0000 (09:01 -0600)]
Don't raise error if container didn't sys_reboot
Don't call it an error if a container exits without calling sys_reboot.
Particularly since that will almost always be the case with lxc-execute.
This fixes a regression introduced in commit
"49296e2ebfe7c5f9d6ebafbb54f5c5e56a0cc085: support proper container
reboot"
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Serge Hallyn [Fri, 3 Feb 2012 15:29:14 +0000 (09:29 -0600)]
lxc-ubuntu: Support for building a container of a foreign architecture
Support building a container of a foreign architecture if
qemu-user-static is installed. This is done by installing some packages
of the host architecture in the container using multi-arch.
Serge Hallyn [Thu, 2 Feb 2012 21:48:17 +0000 (15:48 -0600)]
fix lxc-netstat for nested cgroups
Use the correct path for the container's cgroup task file.
Also exit out early and cleanly if the container is not running,
and bind-mount /proc/$pid/net with '-n' to keep the entry out
of mtab, else the mtab entry will never go away.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Serge Hallyn [Wed, 25 Jan 2012 04:05:28 +0000 (22:05 -0600)]
support proper container reboot
This patch looks for Daniel's kernel patch allowing the lxc monitor
to tell container reboot from shutdown based on the exit signal. If
that patch is not there, utmp monitoring is used. Otherwise, it only
looks for the signal. Note that the 'conf->need_utmp_watch' is
technically not necessary, as there is no harm in watching the utmp
file.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Serge Hallyn [Mon, 23 Jan 2012 19:25:11 +0000 (13:25 -0600)]
add lvm support to lxc-create
1. Some templates copy the cached pristine rootfs using 'cp a b' where b is
$lxc_path/$name/rootfs. That doesn't do the right thing if rootfs already
exists, as it will when it is an lvm or other mount. So switch to
'rsync a/ b/'. (cp can be made to work too of course).
2. Update lxc-create to support backing stores. For now only lvm is
implemented.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Serge Hallyn [Mon, 23 Jan 2012 18:07:44 +0000 (12:07 -0600)]
Support nested cgroups
With this patch, I can start a container 'o1' inside another container 'o1'.
(Of course, the containers must be on a different subnet)
Detail:
1. Create cgroups for containers under /lxc.
2. Support nested lxc: respect init's cgroup:
Create cgroups under init's cgroup. So if we start a container c2
inside a container 'c1', we'll use /sys/fs/cgroup/freezer/lxc/c1/lxc/c2
instead of /sys/fs/cgroup/freezer/c2. This allows a container c1
to be created inside container c1. It also allows a container's limits
to be enforced on all a container's children (which a MAC policy could
already enforce, in which case current lxc code would be unable to nest
altogether).
3. Finally, if a container's cgroup already exists, rename it rather than
failing to start the container. Try to WARN the user so they might go
clean the old cgroup up.
Whereas without this patch, container o1's cgroup would be
/sys/fs/cgroup/<subsys>/o1,
it now becomes
/sys/fs/cgroup/<subsys>/<initcgroup>/lxc/o1
so if init is in cgroup '/' then o1's freezer cgroup would be:
/sys/fs/cgroup/freezer/lxc/o1
Changelog:
. make lxc-ps work with separate mtab. If cgroups were mounted with -n,
and mtab is not linked to /proc/self/mounts, then 'mount -t cgroup' won't
show these mounts. So make lxc-ps not use it, but rather use
/proc/self/mounts directly.
. lxc-ls in the past assumed that a container's cgroup was just '/<name>'.
Now it is '/<host-init-cgroup>/lxc/<name>'. Handle that.
. first version of this patch was setting clone_children on
<path-to-cpusets-cgroup>/<init-cgroup>/lxc, not the parent of that dir.
That failed to initialize that cgroup, so tasks could not enter it.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
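The path layout described in "Support nested cgroups" above can be sketched as a small helper. This is only an illustration of the naming scheme, not code from lxc; the function name container_cgroup_path and the default root /sys/fs/cgroup are assumptions made for the example.

```python
import posixpath


def container_cgroup_path(subsystem, init_cgroup, name, root="/sys/fs/cgroup"):
    """Build <root>/<subsystem>/<init_cgroup>/lxc/<name> (hypothetical helper)."""
    # init's cgroup is reported with a leading '/', so strip it before joining;
    # otherwise posixpath.join would discard everything before it.
    relative = init_cgroup.strip("/")
    parts = [root, subsystem]
    if relative:
        parts.append(relative)
    parts += ["lxc", name]
    return posixpath.join(*parts)
```

With init in cgroup '/', container o1 lands in /sys/fs/cgroup/freezer/lxc/o1; a container c2 started from inside c1 (whose init is in /lxc/c1) lands in /sys/fs/cgroup/freezer/lxc/c1/lxc/c2.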
Serge Hallyn [Mon, 23 Jan 2012 18:05:40 +0000 (12:05 -0600)]
lxc-ubuntu: use release-updates and release-security
Particularly for LTS releases, which many people will want to use in
their containers, it is unwise to omit -security and -updates.
Furthermore the fix allowing ssh to allow the container to shut down
is in lucid-updates only.
With this patch, after debootstrapping a container, we add -updates
and -security to sources.list and do an apt-get upgrade under chroot.
Unfortunately we need to do this because debootstrap doesn't know how
to.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Serge Hallyn [Mon, 23 Jan 2012 17:57:59 +0000 (11:57 -0600)]
drop mac_admin and mac_override
mac_admin stops the container from loading LSM policy. Neither
selinux nor apparmor currently will do well with automatic namespacing
of policy (though it's coming in apparmor, after which we can re-enable
this).
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Kevin Cernekee [Sat, 25 Feb 2012 23:49:48 +0000 (00:49 +0100)]
Add MIPS as a supported architecture
The issue is similar to what was fixed in commit e7eb632c for ARM:
the "configure" script errors out because it is unable to set
LINUX_SRCARCH. Fix is to add MIPS to the list.
Signed-off-by: Kevin Cernekee <cernekee@gmail.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Christian Seiler [Thu, 23 Feb 2012 08:57:14 +0000 (09:57 +0100)]
lxc-attach: Drop privileges when attaching to container unless requested otherwise
lxc-attach will now put the process that is attached to the container into
the correct cgroups corresponding to the container, set the correct
personality and drop the privileges.
The information is extracted from entries in /proc of the init process of
the container. Note that this relies on the (reasonable) assumption that the
init process does not in fact drop additional capabilities from its bounding
set.
Additionally, 2 command line options are added to lxc-attach: One to prevent
the capabilities from being dropped and the process from being put into the
cgroup (-e, --elevated-privileges) and a second one to explicitly state the
architecture which the process will see, (-a, --arch) which defaults to the
container's current architecture.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Christian Seiler [Thu, 23 Feb 2012 08:57:14 +0000 (09:57 +0100)]
Move lxc_attach from namespace.c to attach.c and rename it to lxc_attach_to_ns
Since lxc-attach helper functions now have their own source file, lxc_attach is
moved from namespace.c to attach.c and is renamed to lxc_attach_to_ns,
because that better reflects what the function does (attaching to a
container can also contain the setting of the process's personality, adding
it to the corresponding cgroups and dropping specific capabilities).
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Christian Seiler [Thu, 23 Feb 2012 08:57:14 +0000 (09:57 +0100)]
Add attach.[ch]: Helper functions for lxc-attach
The following helper functions for lxc-attach are added to a new file
attach.c:
- lxc_proc_get_context_info: Get cgroup memberships, personality and
capability bounding set from /proc for a given process.
- lxc_proc_free_context_info: Free the data structure responsible
- lxc_attach_proc_to_cgroups: Add the process specified by the pid
parameter to the cgroups given by the ctx parameter.
- lxc_attach_drop_privs: Drop capabilities to the capability mask given in
the ctx parameter.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Christian Seiler [Thu, 23 Feb 2012 08:57:14 +0000 (09:57 +0100)]
Add lxc_config_parse_arch to parse architecture strings
Add the function lxc_config_parse_arch that parses an architecture string
(x86, i686, x86_64, amd64) and returns the corresponding personality. This
is required for lxc-attach, which accepts architectures independently of
lxc.arch. The parsing of lxc.arch now also uses the same function to ensure
consistency.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
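A minimal sketch of the mapping that the lxc_config_parse_arch commit above describes, written in Python for brevity. The function name parse_arch and the choice to return None for unknown strings are assumptions; the PER_LINUX and PER_LINUX32 values are the ones defined in the kernel's linux/personality.h.

```python
# Personality values from linux/personality.h.
PER_LINUX = 0x0000
PER_LINUX32 = 0x0008

# The four architecture strings named in the commit message.
ARCH_TO_PERSONALITY = {
    "x86": PER_LINUX32,
    "i686": PER_LINUX32,
    "x86_64": PER_LINUX,
    "amd64": PER_LINUX,
}


def parse_arch(arch):
    """Return the personality for an architecture string, or None if unknown.

    Hypothetical stand-in for lxc_config_parse_arch; the real function's
    error convention may differ.
    """
    return ARCH_TO_PERSONALITY.get(arch.strip())
```

Keeping the table in one place is what lets both lxc.arch parsing and the lxc-attach --arch option agree on the accepted spellings.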
Christian Seiler [Thu, 23 Feb 2012 08:57:14 +0000 (09:57 +0100)]
cgroup: Make cgroup_attach a public function
lxc-attach needs to be able to attach a process to specific cgroup, so
cgroup_attach is renamed to lxc_cgroup_attach and now also defined in the
header file.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Christian Seiler [Thu, 23 Feb 2012 08:57:13 +0000 (09:57 +0100)]
Enable get_cgroup_mount to search for mount points satisfying multiple subsystems at once
lxc-attach functionality reads /proc/init_pid/cgroup to determine the cgroup
of the container for a given subsystem. However, since subsystems may be
mounted together, we want to be on the safe side and be sure that we really
find the correct mount point, so we allow get_cgroup_mount to check for
*all* the subsystems; the subsystem parameter may now be a comma-separated
list.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
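The multi-subsystem lookup described in the get_cgroup_mount commit above could look roughly like this. It is a sketch under assumptions: find_cgroup_mount is a hypothetical name, and the mount table is passed in as text (in the format of /proc/mounts) rather than read from the system.

```python
def find_cgroup_mount(mounts_text, subsystems):
    """Return the first cgroup mount point whose options include every
    subsystem in the comma-separated list, or None if there is no match."""
    wanted = {s for s in subsystems.split(",") if s}
    if not wanted:
        return None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] != "cgroup":
            continue
        options = set(fields[3].split(","))
        if wanted <= options:  # all requested subsystems are co-mounted here
            return fields[1]
    return None
```

Requiring the whole set, rather than the first subsystem only, avoids picking a hierarchy where just one of the co-mounted controllers happens to appear.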
Christian Seiler [Thu, 23 Feb 2012 08:57:13 +0000 (09:57 +0100)]
Accept numeric values for capabilities to drop
lxc.cap.drop now also accepts numeric values for capabilities. This allows
the user to specify capabilities LXC doesn't know about yet or capabilities
that were not part of the kernel headers LXC was compiled against.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
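To illustrate the numeric fallback for lxc.cap.drop described above, here is a sketch that resolves a token either by name or as a plain decimal number. The function name parse_cap and the small name table are assumptions for the example (the table lists only a few capabilities, with their numbers from linux/capability.h); it is not the lxc parser.

```python
# A deliberately incomplete name table; numbers are from linux/capability.h.
KNOWN_CAPS = {
    "chown": 0,
    "sys_module": 16,
    "sys_admin": 21,
    "mac_override": 32,
    "mac_admin": 33,
}


def parse_cap(token):
    """Return the capability number for a name or a decimal string, else None."""
    token = token.strip()
    if token in KNOWN_CAPS:
        return KNOWN_CAPS[token]
    # Unknown name: accept it only if it is a plain ASCII decimal number, so a
    # capability newer than the table can still be dropped by value.
    if token.isascii() and token.isdigit():
        return int(token)
    return None
```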
Christian Seiler [Thu, 23 Feb 2012 08:57:13 +0000 (09:57 +0100)]
Add function to determine CAP_LAST_CAP of the current kernel dynamically
The function lxc_caps_last_cap() determines CAP_LAST_CAP of the current kernel
dynamically. It first tries to read /proc/sys/kernel/cap_last_cap. If that
fails, because the kernel does not support this interface yet, it loops
through all capabilities and tries to determine whether the current capability
is part of the bounding set. The first capability for which prctl() fails is
considered to be CAP_LAST_CAP.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
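The two-step detection described for lxc_caps_last_cap() above can be sketched as follows. This is an illustration, not the lxc implementation: the function names and the limit parameter are assumptions, and the sketch returns the last capability for which the probe succeeded (one below the first failing value), which is one reading of the commit message. PR_CAPBSET_READ is 23 in linux/prctl.h.

```python
import ctypes

PR_CAPBSET_READ = 23  # from linux/prctl.h
CAP_LAST_CAP_PATH = "/proc/sys/kernel/cap_last_cap"


def prctl_knows_cap(cap):
    """True if prctl(PR_CAPBSET_READ, cap) does not fail (Linux only)."""
    libc = ctypes.CDLL(None, use_errno=True)
    return libc.prctl(PR_CAPBSET_READ, cap, 0, 0, 0) >= 0


def last_cap(path=CAP_LAST_CAP_PATH, probe=prctl_knows_cap, limit=1024):
    """Return the highest capability number the running kernel knows about."""
    # Step 1: newer kernels export the value directly.
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        pass
    # Step 2: walk upwards until the probe fails; 'limit' is a safety bound
    # added for this sketch so a misbehaving probe cannot loop forever.
    cap = 0
    while cap < limit and probe(cap):
        cap += 1
    return cap - 1
```

The probe is a parameter so the fallback loop can be exercised without a real kernel, for example last_cap(path="/nonexistent", probe=lambda c: c <= 36).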
Greg Kurz [Thu, 5 Jan 2012 22:34:46 +0000 (23:34 +0100)]
lxc: line buffered output for lxc-monitor
A typical usage is to start lxc-monitor in popen() and parse the output.
Unfortunately, glibc defaults to block buffering for pipes and you may
have to wait several lines before anything is written to stdout... this
prevents the use of lxc-monitor to implement automatons. Let's go line
buffered!
Signed-off-by: Greg Kurz <gkurz@fr.ibm.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Serge Hallyn [Thu, 5 Jan 2012 22:34:46 +0000 (23:34 +0100)]
ubuntu template: use -updates and -security (v3)
Particularly for LTS releases, which many people will want to use in
their containers, it is unwise to omit release-security and
release-updates. Furthermore the fix allowing ssh to allow the container
to shut down is in lucid-updates only.
With this patch, after debootstrapping a container, we add -updates and
-security to sources.list and do an upgrade under chroot. Unfortunately
we need to do this because debootstrap doesn't know how to.
Changelog:
Nov 14: as Stéphane Graber suggested, make sure no daemons start on
the host while doing dist-upgrade from chroot.
Nov 15: use security.ubuntu.com, not mirror. (stgraber)
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Christian Seiler [Tue, 15 Nov 2011 17:53:53 +0000 (18:53 +0100)]
Set high byte of mac addresses for host veth devices to 0xfe
When used in conjunction with a bridge, veth devices with random addresses
may change the mac address of the bridge itself if the mac address of the
interface newly added is numerically lower than the previous mac address
of the bridge. This is documented kernel behavior. To avoid changing the
host's mac address back and forth when starting and/or stopping containers,
this patch ensures that the high byte of the mac address of the veth
interface visible from the host side is set to 0xfe.
A similar logic is also implemented in libvirt.
Fixes SF bug #3411497
See also: <http://thread.gmane.org/gmane.linux.kernel.containers.lxc.general/2709>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
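A small sketch of the idea behind the 0xfe commit above: generate a random host-side MAC with the high byte forced to 0xfe, and model the "bridge takes the numerically lowest port address" rule the message refers to. The helper names are hypothetical and the example address 00:16:3e:aa:bb:cc is made up for illustration.

```python
import random


def host_veth_mac(rng=random):
    """Return a random MAC address string whose high (first) byte is 0xfe."""
    octets = [0xFE] + [rng.randrange(256) for _ in range(5)]
    return ":".join(f"{octet:02x}" for octet in octets)


def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to an integer for numeric comparison."""
    return int(mac.replace(":", ""), 16)


def lowest_mac(macs):
    """Model of the bridge rule from the commit: the lowest address wins."""
    return min(macs, key=mac_to_int)
```

Because 0xfe is close to the top of the range, an existing port such as 00:16:3e:aa:bb:cc stays the lowest address when a veth generated this way is added, so the bridge address does not change.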
Greg Kurz [Thu, 10 Nov 2011 08:41:46 +0000 (09:41 +0100)]
lxc: use -iquote instead of -I
To avoid name collisions between local and system header
files. For example, if you try to include the <pty.h>
system file, you end up including the one from lxc...
Signed-off-by: Greg Kurz <gkurz@fr.ibm.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Tuomas Suutari [Fri, 28 Oct 2011 21:55:38 +0000 (23:55 +0200)]
lxc-fedora.in: Fix fetching of the fedora-release rpm
The hardcoded URL seems to be broken and the 404 error was not
checked. Now the mirror is selected from mirrorlist (instead of
hardcoding to funet.fi) and fetch errors are checked.
Also added a retry loop (with 3 tries) to find a working mirror, since
some of the mirrors are not OK.
Signed-off-by: Tuomas Suutari <tuomas.suutari@gmail.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
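The retry behaviour described in the lxc-fedora.in commit above (up to 3 tries over mirrors taken from a mirrorlist, with fetch errors checked) can be sketched like this. The template itself is a shell script; this Python version only illustrates the control flow, and fetch_with_retries and its fetch callback are hypothetical names.

```python
def fetch_with_retries(mirrors, fetch, tries=3):
    """Try up to 'tries' mirrors in order and return the first successful result.

    'fetch' is a caller-supplied function that takes a mirror and either
    returns the downloaded data or raises OSError (for example on a 404).
    """
    last_error = None
    for mirror in mirrors[:tries]:
        try:
            return fetch(mirror)
        except OSError as error:
            # Remember the failure and move on to the next mirror.
            last_error = error
    raise RuntimeError(f"no working mirror after {tries} tries") from last_error
```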
Greg Kurz [Mon, 24 Oct 2011 12:56:30 +0000 (14:56 +0200)]
lxc: introduce lxc_execute()
This patch allows application containers to be created with liblxc.so directly.
Some code cleanups on the way:
- separate ops for lxc_execute() and lxc_start(): the factorisation is wrong
here as we may have specific things to do if we're running an application
container. It deserves separate ops.
- lxc_arguments_dup() is merged in the pre-exec operation: this is a first
use for the execute op introduced just above. It's better to build the
arguments to execvp() where they're really used.
Signed-off-by: Greg Kurz <gkurz@fr.ibm.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com>
Rainer Weikusat [Mon, 24 Oct 2011 12:47:58 +0000 (14:47 +0200)]
Don't list containers w/ active console sessions multiple times
The lxc-ls shell script uses netstat -xa to get a listing of AF_UNIX
sockets it then parses in order to determine the names of presently
running containers. This is wrong because it will list the
listening socket and all sockets created by accepting connections on
that. This causes the script to display the names of containers with
active lxc-console sessions 1 + n times, n being the number of active
console sessions. The patch below fixes this by using netstat -xl
instead which only displays the listening sockets.
Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Serge E. Hallyn [Mon, 24 Oct 2011 12:38:30 +0000 (14:38 +0200)]
Accurately detect whether a system supports clone_children
If multiple cgroups are mounted under /sys/fs/cgroup, then the
original check ends up looking for /sys/fs/cgroup/cgroup.clone_children,
which does not exist because that is just a tmpfs.
So make sure to check an actual cgroupfs.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Serge E. Hallyn [Mon, 24 Oct 2011 12:38:30 +0000 (14:38 +0200)]
Let sshd template work on ubuntu systems.
/dev/shm is a symlink to /run/shm, so we need /run/shm
to exist in the container rootfs. Also, /dev/mqueue does
not exist on the host, and can't be created by the container.
But we don't really need it so ignore that.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Serge E. Hallyn [Mon, 24 Oct 2011 12:38:30 +0000 (14:38 +0200)]
ubuntu template: disallow cap_sys_module (by popular demand)
This isn't particularly reassuring, and will be moot with user
namespaces, but as people are asking for it, turn off sys_module.
While we're at it, turn off mac_admin and mac_override.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
We think it's better for now to only warn the user about a fd leaking into
the container. Also remove the call to readlink() as it isn't really useful
now: since the container will start anyway, the user can look into /proc/../fd
or use lsof or whatever.
Signed-off-by: Greg Kurz <gkurz@fr.ibm.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Greg Kurz [Tue, 13 Sep 2011 13:08:04 +0000 (15:08 +0200)]
fixes for rpmbuild
This patch fixes some makefile/specfile issues when running
rpmbuild with the distributed lxc specfile:
- fixes usage of installation directories for config files,
rootfs, templates and lxc-init so that they're calculated
at make time instead of configure time. Thanks to this,
all installed items go under $RPM_BUILD_ROOT when running
rpmbuild
- introduce --disable-rpath option to configure to avoid
check-rpaths errors when building non-root.
- introduce a lxc-libs package in the default spec file
to allow concurrent installation of 32 bit and 64 bit
libraries.
v2: - fix circular reference in lxc.pc
- ship lxc.pc with lxc-devel
Signed-off-by: Greg Kurz <gkurz@fr.ibm.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
For veth and macvlan networks, this can look up the host address on the
bridge (link) interface and add a default route on the guest to that
address. This facilitates a typical setup where guests are bridged
together.
syntax:
lxc.ipv4.gateway = auto
lxc.ipv6.gateway = auto
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>