Serge Hallyn [Fri, 22 Nov 2013 20:39:37 +0000 (14:39 -0600)]
lxcapi_destroy: run in a namespace if we are unprivileged
This is necessary to have the rights to remove files owned by our subuids.
Also update lxc_rmdir_onedev to return 0 on success, -1 on failure.
Callers were not consistent in using it correctly, and this is more
in keeping with the rest of our code.
Serge Hallyn [Fri, 22 Nov 2013 04:11:43 +0000 (22:11 -0600)]
remove HAVE_NEWUIDMAP and NEWUIDMAP
Always build lxc-usernsexec. Else we require having uidmap
installed on the build host for no good reason. And we never
actually used the NEWUIDMAP path we detected.
Added a file "lxc.service" for a systemd service file.
Added a file "lxc-devsetup" to setup /dev/ on startup to support autodev
in containers.
Service file references lxc-devsetup as an ExecStartPre command. The
lxc-devsetup script is not dependent on systemd or Fedora and can
be used at bootup on any system.
Modified lxc.spec.in to install the two new files on Fedora. The systemd
specific code in the lxc.spec file may need some review and should be made
conditional so that it is skipped on non-systemd rpm-based systems.
Signed-off-by: Michael H. Warfield <mhw@WittsEnd.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Serge Hallyn [Thu, 21 Nov 2013 15:05:59 +0000 (09:05 -0600)]
lxcapi_clone: set the right environment variable for mounted fs
If the container is dir-backed, we don't actually mount it (to
support unprivileged use). So always set the LXC_ROOTFS_MOUNT
to bdev->dest, not to the rootfs path specified in the container
configuration.
If autodev is not specifically set to 0 or 1, attempts to determine if
systemd is being utilized and forces autodev=1 to prevent host system
conflicts and collisions.
If autodev is enabled and the host /dev is mounted with devtmpfs
or /dev/.lxc is mounted with another file system...
Each container created by a privileged user gets a /dev directory
mapped off the host /dev here:
/dev/.lxc/${name}.$( hash $lxcpath/$name )
Each container created by a non-privileged user gets a /dev directory
mapped off the host /dev here:
/dev/.lxc/user/${name}.$( hash $lxcpath/$name )
The /dev/.lxc/user is mode 1777 to allow unpriv access.
The /dev/.lxc/{containerdev} is bind mounted into the container /dev.
Fallback on failure is to mount tmpfs into the container /dev.
A symlink is created from $lxcpath/$name/rootfs.dev back to the /dev
relative directory to provide a code consistent reference for updating
container devs.
Signed-off-by: Michael H. Warfield <mhw@WittsEnd.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Nikola Kotur [Wed, 20 Nov 2013 15:07:37 +0000 (16:07 +0100)]
lxc-attach: elevate specific privileges
There are scenarios in which we want to execute a process with specific
privileges elevated.
An example for this might be executing a process inside the container
securely, with capabilities dropped, but not in container's cgroup so
that we can have per process restrictions inside single container.
Similar to namespaces, privileges to be elevated can be OR'd:
lxc-attach --elevated-privileges='CAP|CGROUP' ...
Backward compatibility with previous versions is retained. In case no
privileges are specified behaviour is the same as before: all of them
are elevated.
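
A minimal sketch of how such an OR'd specification could be turned into a
bitmask, assuming only the two tokens shown above (CAP and CGROUP). The flag
names and the function parse_elevated_privileges() are hypothetical and are
not taken from the lxc source; an empty specification maps to "all", which
mirrors the backward-compatible default described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag names; the real lxc constants may differ. */
enum {
    ELEVATE_CGROUP = 1 << 0,
    ELEVATE_CAP    = 1 << 1,
    ELEVATE_ALL    = ELEVATE_CGROUP | ELEVATE_CAP
};

/*
 * Parse a spec such as "CAP|CGROUP" into a bitmask.
 * A NULL or empty spec means "elevate everything" (the old behaviour).
 * Returns -1 on an unknown token, an empty token list, or allocation failure.
 */
int parse_elevated_privileges(const char *spec)
{
    if (spec == NULL || *spec == '\0')
        return ELEVATE_ALL;

    char *copy = strdup(spec);   /* strtok_r modifies its input */
    if (copy == NULL)
        return -1;

    int flags = 0;
    char *saveptr = NULL;
    for (char *tok = strtok_r(copy, "|", &saveptr); tok != NULL;
         tok = strtok_r(NULL, "|", &saveptr)) {
        if (strcmp(tok, "CAP") == 0) {
            flags |= ELEVATE_CAP;
        } else if (strcmp(tok, "CGROUP") == 0) {
            flags |= ELEVATE_CGROUP;
        } else {
            free(copy);
            return -1;
        }
    }
    free(copy);

    /* Only known bits may be set at this point. */
    assert((flags & ~ELEVATE_ALL) == 0);
    return flags != 0 ? flags : -1;
}
```

A caller would test individual bits, for example
(flags & ELEVATE_CGROUP), before deciding whether to stay outside the
container's cgroup.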
Signed-off-by: Nikola Kotur <kotnick@gmail.com>
Acked-By: Christian Seiler <christian@iwakd.de>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Dwight Engen [Mon, 18 Nov 2013 17:28:14 +0000 (12:28 -0500)]
oracle template: prevent mingetty from calling vhangup(2)
This is needed when using the user namespace since the kernel check does
not allow user_ns root to successfully call vhangup(2), and mingetty will
quit in this case.
S.Çağlar Onur [Fri, 15 Nov 2013 04:56:04 +0000 (23:56 -0500)]
fix memory leaks reported by cppcheck in src/lxc/lxc_monitor.c
Since this is a CLI tool the leaks don't really matter, but fixing them
might silence some warnings when debugging.
Stéphane Graber [Fri, 15 Nov 2013 20:14:00 +0000 (15:14 -0500)]
lxc-info: Rework based on mailinglist thread
So this implements the changes we discussed yesterday:
- Only one container may be queried at a time
- -n is now required once again
- -H + a single filter only returns the value
- -t/--is-state is now removed
Note that -S is considered as more than a single filter, so -H in that
case only affects the formatting of the values.
For the same reason, I haven't yet implemented the -H + multiple filters
case which we said should return a simple "key: value" output as it
wasn't trivial to re-arrange the stats code to print a different format
(for the other options, it's just a two lines change in the print
functions).
setup_netdev: re-read ifindex in LXC_NET_PHYS case
When moving an interface from the host netns to a container's,
the ifindex might not remain the same. This happens when the
index of the host interface is already assigned to another interface
in the new netns.
For veth/vlan/macvlan, virtual interfaces are first created on the host,
and then moved in the container. Since they are created after all other
interfaces are discovered, there is no chance for its assigned ifindex
to be already present in a freshly created netns, because it's a greater
number.
However, when moving a physical interface, there is a chance that its
ifindex in the host netns is not free in the new netns. The patch
forces ifindex re-read for the LXC_NET_PHYS case to update the
lxc_netdev structure.
pc-wurm [Fri, 8 Nov 2013 11:45:51 +0000 (12:45 +0100)]
Update lxc_create.c corrected argument usage example for -t
I think '-t timeout' was mistakenly written, so I corrected it to '-t
template', since the -t argument is used for setting templates, not
timeout as far as I know.
Dwight Engen [Tue, 12 Nov 2013 19:04:45 +0000 (14:04 -0500)]
fix multithreaded create()
We were calling save_config() twice within the create() flow, each
from a different process. Depending on order of scheduling, sometimes
the data from the first save_config() (which was just the stuff from
LXC_DEFAULT_CONFIG) would overwrite the config we wanted (the full
config), causing a truncated config file which would then cause lxc
to segfault once it read it back in because no rootfs.path was set.
This fixes it by only calling save_config() once in the create()
flow. A rejected alternative was to call fsync(fileno(fout)) before
the fclose in save_config.
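
For reference, the rejected alternative would look roughly like the sketch
below: flush the stdio buffer, fsync(2) the descriptor, then close. The
function name write_config_durably() is hypothetical and the real
save_config() takes different arguments. The commit message does not say why
this approach was rejected.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write contents to path and force them to stable storage before closing.
 * Returns 0 on success, -1 on any failure.
 */
int write_config_durably(const char *path, const char *contents)
{
    assert(path != NULL && contents != NULL);

    FILE *fout = fopen(path, "w");
    if (fout == NULL)
        return -1;

    int ret = 0;
    if (fputs(contents, fout) == EOF)
        ret = -1;
    /* fflush moves stdio's buffer to the kernel; fsync moves it to disk. */
    if (fflush(fout) != 0)
        ret = -1;
    if (fsync(fileno(fout)) != 0)
        ret = -1;
    if (fclose(fout) != 0)
        ret = -1;
    return ret;
}
```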
Serge Hallyn [Mon, 11 Nov 2013 18:32:14 +0000 (12:32 -0600)]
lxc_abstract_unix_connect: accommodate containers started before Oct 28
commit aae93dd3dd20dd12c6b8f9f0490e2fb877ee3f09 fixed the command socket
name to use the right pathlen instead of always passing in the max
socket namelen. However, this breaks lxc-info/lxc-list/etc for
containers started before that commit. So if the correct command
sock name doesn't work, try the preexisting one.
Note we can probably undo this "after a while". Maybe in August 2014.
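
A sketch of the fallback, under the assumption that the two names differ only
in the address length passed to connect(2): for an abstract socket every byte
up to the given length is part of the name, so an address passed with the
full sizeof(struct sockaddr_un) names a different socket than one passed with
the exact length. The function names here are hypothetical, not the ones in
the lxc source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Exact address length for an abstract name: leading NUL + name bytes. */
socklen_t abstract_addr_len(const char *name)
{
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + strlen(name));
}

/*
 * Connect to an abstract unix socket, trying the exact-length name first
 * and the old NUL-padded, maximum-length name second.
 * Returns a connected fd, or -1 if neither name accepts the connection.
 */
int abstract_connect_compat(const char *name)
{
    assert(name != NULL);

    struct sockaddr_un addr;
    if (strlen(name) >= sizeof(addr.sun_path))
        return -1;   /* no room for the leading NUL plus the name */

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    /* sun_path[0] stays '\0', which marks the address as abstract. */
    memcpy(&addr.sun_path[1], name, strlen(name));

    const socklen_t lens[2] = { abstract_addr_len(name), (socklen_t)sizeof(addr) };
    for (int i = 0; i < 2; i++) {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0)
            return -1;
        if (connect(fd, (struct sockaddr *)&addr, lens[i]) == 0)
            return fd;
        close(fd);   /* use a fresh socket for the next attempt */
    }
    return -1;
}
```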
Serge Hallyn [Fri, 25 Oct 2013 23:03:57 +0000 (18:03 -0500)]
lxc-user-nic: rename nic inside container to desired name
To do so we do a quick setns into the container's netns. This
(unexpectedly) turns out cleaner than trying to rename it from
lxc_setup(), because we don't know the original nic name in
the container until we created it which we do in the parent
after the init has been cloned.
Serge Hallyn [Fri, 1 Nov 2013 20:27:49 +0000 (15:27 -0500)]
create_run_template: tell the template what caller's uid was mapped to
conf.c/conf.h: replace bool hostid_is_mapped() with int mapped_hostid()
which returns the mapped uid for the caller's uid on the host, or -1 if
none
create_run_template: pass caller's uid into template.
lxc-ubuntu-cloud:
1. accept --mapped-uid argument
2. don't write to devices cgroup - not allowed.
3. if running in userns, use $HOME/.cache
4. chown cached files to the uid to which our caller was
mapped
5. ignore /dev when extracting rootfs in a userns
Dwight Engen [Tue, 5 Nov 2013 18:17:02 +0000 (13:17 -0500)]
fix leak in list_active_containers()
Found by running the lxc-test-list test with valgrind. The names were
put into a local array, and never freed in the success case where the
caller didn't want the names returned and in the early out failure case.
Note we don't need to check the return from remove_from_array() because
we just successfully added the name above.
Dwight Engen [Mon, 4 Nov 2013 22:35:15 +0000 (17:35 -0500)]
allow lxcapi_get_cgroup_item() on lxc-execute containers
Containers started with lxc-execute may not have a conf, but
nothing in the implementation of lxcapi_get_cgroup_item()
actually needs/uses it, and it can be useful to get items out
of the containers' cgroup items.
Dwight Engen [Thu, 31 Oct 2013 20:38:30 +0000 (16:38 -0400)]
lua: fix stats collection using get_cgroup_item
Previously, the lua stats collection was building its own paths to the
cgroup files, which could be wrong depending on what --with-cgroup-pattern
was passed to configure. Fix it to use the get_cgroup_item api so it
always finds the files.
Remove cgroup_path_get since it is not used anymore.
S.Çağlar Onur [Fri, 1 Nov 2013 20:16:10 +0000 (16:16 -0400)]
valgrind drd tool shows conflicting stores happening at lxc_global_config_value@src/lxc/utils.c (v2)
The conflict occurs between the following lines
[...]
269 if (values[i])
270 return values[i];
[...]
and
[...]
309 /* could not find value, use default */
310 values[i] = (*ptr)[1];
[...]
Fix it using a lock dedicated to that problem, as Serge suggested.
Also introduce a new autoconf parameter (--enable-mutex-debugging) to convert mutexes to error reporting type and to provide a stacktrace when locking fails.
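
The two ideas can be sketched together: a dedicated mutex serialises the read
and the default-filling store on the cache, and the mutex is created with the
POSIX error-checking type so that misuse is reported through a return code
rather than deadlocking. All names and default strings below are
illustrative only, and the stack trace part of --enable-mutex-debugging is
not shown.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_KEYS 2

/* Illustrative defaults, not the real lxc configuration defaults. */
static const char *defaults[NUM_KEYS] = { "default-0", "default-1" };
static const char *values[NUM_KEYS];

static pthread_mutex_t values_lock;
static pthread_once_t values_lock_once = PTHREAD_ONCE_INIT;

/* Create the lock as an error-checking mutex, exactly once. */
static void init_values_lock(void)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&values_lock, &attr);
    pthread_mutexattr_destroy(&attr);
}

/*
 * Return the cached value for key i, filling it from the defaults on first
 * use. Both the check and the store happen with the lock held, so two
 * threads can no longer race on values[i]. Returns NULL if locking fails.
 */
const char *cached_config_value(size_t i)
{
    assert(i < NUM_KEYS);
    pthread_once(&values_lock_once, init_values_lock);

    if (pthread_mutex_lock(&values_lock) != 0)
        return NULL;
    if (values[i] == NULL)
        values[i] = defaults[i];
    const char *result = values[i];
    pthread_mutex_unlock(&values_lock);
    return result;
}
```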
Serge Hallyn [Fri, 1 Nov 2013 17:17:52 +0000 (12:17 -0500)]
always remount / rslave before running creation template (if root)
If we're not root, our mounts in private userns won't get pushed
back anyway. If we are root, we need to make sure that anything
the template does gets cleaned up.
Dwight Engen [Tue, 29 Oct 2013 20:46:16 +0000 (16:46 -0400)]
fix cgpath test
Commit 1ea59ad28 sets memory.use_hierarchy, which means that this test
cannot use memory.swappiness as its dummy cgroup item to set/unset since
writing to it with use_hierarchy set gets -EINVAL. Change test to use
memory.soft_limit_in_bytes instead.
Dwight Engen [Tue, 29 Oct 2013 18:38:00 +0000 (14:38 -0400)]
fix free() of args to startl
Coverity 1076328 marked this as "Use after free", which it isn't really;
it's actually just free()ing the wrong 2nd, 3rd, etc... pointers. Test by
passing two or more args to startl; without this change you get a segfault
when free()ing the second pointer/arg.
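
The pattern involved can be sketched as follows: a NULL-terminated variadic
list is copied into a heap array, and cleanup frees each element through the
array index before freeing the array itself. collect_args(), count_args() and
free_args() are hypothetical helpers for illustration and do not reproduce
the actual startl code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Free every string in a NULL-terminated array, then the array itself. */
void free_args(char **argv)
{
    if (argv == NULL)
        return;
    for (size_t i = 0; argv[i] != NULL; i++)
        free(argv[i]);
    free(argv);
}

/* Number of strings before the terminating NULL. */
size_t count_args(char **argv)
{
    assert(argv != NULL);
    size_t n = 0;
    while (argv[n] != NULL)
        n++;
    return n;
}

/*
 * Copy a NULL-terminated variadic list of strings into a freshly allocated,
 * NULL-terminated array. Returns NULL on allocation failure.
 */
char **collect_args(const char *first, ...)
{
    size_t n = 0;
    char **argv = malloc(sizeof(*argv));
    if (argv == NULL)
        return NULL;
    argv[0] = NULL;

    va_list ap;
    va_start(ap, first);
    for (const char *arg = first; arg != NULL; arg = va_arg(ap, const char *)) {
        /* Room for the new element plus the terminating NULL. */
        char **grown = realloc(argv, (n + 2) * sizeof(*argv));
        if (grown == NULL)
            goto fail;
        argv = grown;
        argv[n] = strdup(arg);
        if (argv[n] == NULL)
            goto fail;
        argv[++n] = NULL;
    }
    va_end(ap);
    return argv;

fail:
    va_end(ap);
    free_args(argv);   /* argv is still NULL-terminated at index n */
    return NULL;
}
```

Calling collect_args("a", "b", (char *)NULL) followed by free_args() on the
result exercises the two-or-more-arguments case mentioned above.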
Serge Hallyn [Tue, 29 Oct 2013 17:48:46 +0000 (12:48 -0500)]
rpm spec: fix version numbering when building alpha, beta, rc
We want to ensure smooth upgrades when doing rpm -U throughout the
release cycle so this change implements the scheme documented at:
http://fedoraproject.org/wiki/Packaging%3aNamingGuidelines#NonNumericRelease
Dwight Engen [Tue, 29 Oct 2013 13:24:29 +0000 (09:24 -0400)]
coverity: ifr_name buffer not NULL terminated
The kernel (net/core/dev_ioctl.c:dev_ioctl()) is going to NULL terminate
this name after the copy-in of the ifr, so even though this is a fixed
sized array the last byte isn't usable as part of the name. All the ioctls
we're using go through this code path.
Use the ifr name in the DEBUG message in case it was possibly truncated.
Changes since v1:
* check the length of passed-in string
Changes since v2:
* remove non-abstract socket code path to simplify functions
* rename lxc_af_unix_* family to lxc_abstract_unix_*
On this system list_active_containers returns 14 containers while only 10 containers are running.
The following patch:
* Introduces array_contains function to do a binary search on given array,
* Starts to sort arrays inside the add_to_clist and add_to_names functions,
* Consumes array_contains in list_active_containers to eliminate duplicates,
* Replaces the linear search code in lxcapi_get_interfaces with the new function.
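
The lookup described above can be sketched with the standard qsort(3) and
bsearch(3) functions on an array of names. sort_names() and names_contain()
are hypothetical helpers; the real array_contains may have a different
signature and may keep the array sorted on insertion instead of sorting
afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Compare two array elements, each of which is a pointer to a string. */
static int cmp_names(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Sort the array so that binary search can be used on it. */
void sort_names(const char **names, size_t n)
{
    qsort(names, n, sizeof(*names), cmp_names);
}

/* Binary search for name in an array previously sorted by sort_names(). */
bool names_contain(const char **sorted, size_t n, const char *name)
{
    assert(name != NULL);
    if (n == 0)
        return false;
    return bsearch(&name, sorted, n, sizeof(*sorted), cmp_names) != NULL;
}
```

Checking names_contain() before appending a name is one way to keep
duplicates out of the result list.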
Changes since v1:
* Do not load containers if a container list is not passed in
* Fix possible memory leaks in lxcapi_get_ips and lxcapi_get_interfaces if realloc fails