In the case the container has a console with a valid slave pty file descriptor
we duplicate std{in,out,err} to the slave file descriptor so console logging
works correctly. When the container does not have a valid slave pty file
descriptor for its console and is started daemonized we should dup to
/dev/null.
Closes #1646.
Signed-off-by: Li Feng <lifeng68@huawei.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Previously, we made std{err,in,out} a duplicate of the slave file descriptor of the console
if it existed. This meant we also duplicated all of them when we executed
application containers in the foreground even if some std{err,in,out} file
descriptor did not refer to a {p,t}ty. This blocked use cases such as:
echo foo | lxc-execute -n -- cat
which are very valid and common with application containers but less common
with system containers where we don't have to care about this. So my suggestion
is to unconditionally duplicate std{err,in,out} to the console file descriptor
if we are either running daemonized - this ensures that daemonized application
containers with a single bash shell keep on working - or when we are not
running an application container. In other cases we only duplicate those file
descriptors that actually refer to a {p,t}ty. This logic is similar to what we
do for lxc-attach already.
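
A minimal sketch of the proposed logic, assuming a hypothetical helper dup_to_console() (the name and parameters are illustrative and not taken from liblxc):

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Hypothetical helper: duplicate std{in,out,err} to the console.
 * - daemonized or not an application container: duplicate all three
 * - foreground application container: only touch descriptors that refer
 *   to a terminal, so pipes and redirections set up by the caller survive
 */
static int dup_to_console(int console_fd, bool daemonized, bool app_container)
{
	bool force = daemonized || !app_container;

	assert(console_fd >= 0);

	for (int fd = STDIN_FILENO; fd <= STDERR_FILENO; fd++) {
		if (!force && !isatty(fd))
			continue;

		if (dup2(console_fd, fd) < 0)
			return -1;
	}

	return 0;
}
```

With this shape, a stdin that is the read end of a pipe is left untouched for a foreground application container, which is what the piped lxc-execute use case above relies on.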
Refers to #1690.
Closes #2028.
Reported-by: Felix Abecassis <fabecassis@nvidia.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
When I first solved this problem I went for a fork() + setns() + clone() model.
This works fine but has unnecessary overhead for a couple of reasons:
- doing a full fork() including copying file descriptor table and virtual
memory
- using pipes to retrieve the pid of the second child (the actual container
process)
This can all be avoided by being a little smart in how we employ the clone()
syscall:
- using CLONE_VM will let us get rid of using pipes since we can simply write
to the handler because we share the memory with our parent
- using CLONE_VFORK will also let us get rid of using pipes since the execution
of the parent is suspended until the child returns
- using CLONE_VM will not cause virtual memory to be copied
- using CLONE_FILES will not cause the file descriptor table to be copied
Note that the intermediate clone() is used with CLONE_VM. Some glibc versions
used to reset the pid/tid to -1 when CLONE_VM was used without CLONE_THREAD.
But since the memory between parent and child is shared on CLONE_VM this would
invalidate the getpid() cache that glibc used to maintain and so getpid() in
the child would return the parent's pid. This is all fixed in newer glibc
versions where the getpid() cache is removed and the pid/tid is not reset
anymore. However, if for whatever reason you - dear committer - somehow need to
get the pid of the dummy intermediate process for do_share_ns() you need to
call syscall(__NR_getpid) directly. The next lxc_clone() call does not employ
CLONE_VM and will be fine.
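
The idea can be illustrated with a small standalone sketch. It is not the liblxc implementation: struct shared_state stands in for the handler, and intermediate() and spawn_and_record() are hypothetical names. The child writes its pid straight into memory shared with the parent, and CLONE_VFORK keeps the parent suspended until the child is done, so no pipe is needed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the handler shared between parent and child. */
struct shared_state {
	pid_t child_pid;
};

static int intermediate(void *arg)
{
	struct shared_state *st = arg;

	/* Call the syscall directly to bypass any pid caching in libc. */
	st->child_pid = (pid_t)syscall(__NR_getpid);
	return 0;
}

/* Returns the pid of the (already exited) child, or -1 on error. */
static pid_t spawn_and_record(struct shared_state *st)
{
	const size_t stack_size = 64 * 1024;
	char *stack;
	pid_t pid;
	int status;

	assert(st);

	stack = malloc(stack_size);
	if (!stack)
		return -1;

	/* The stack grows downwards on common architectures: pass its top. */
	pid = clone(intermediate, stack + stack_size,
		    CLONE_VM | CLONE_VFORK | CLONE_FILES | SIGCHLD, st);
	if (pid < 0) {
		free(stack);
		return -1;
	}

	/* With CLONE_VFORK the child has exited by the time clone() returns. */
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status) ||
	    WEXITSTATUS(status) != 0)
		pid = -1;

	free(stack);
	return pid;
}
```

After spawn_and_record() returns, st->child_pid holds the value the child stored, without any explicit communication channel.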
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Avoid NULL-pointer dereference. Apparently monitor.{c,h} calls
lxc_check_inherited() with NULL passed for the config. This isn't really a big
issue since monitor.{c,h} is effectively dead for all liblxc versions that have
the state client patch. Also, the patch that introduces the relevant lines into
lxc_check_inherited() is only in master and yet unreleased.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
It doesn't make sense to error out when an app container doesn't pass explicit
arguments through c->start{l}(). This is especially true since we implemented
lxc.execute.cmd. However, even before we could have always relied on
lxc.init.cmd and errored out after that.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Callers can then make a decision whether they want to consider the peer closing
the connection an error or not. For example, a c->wait(c, "STOPPED", -1) call
can then consider a ECONNRESET not an error but rather see it - correctly - as
a container exiting before being able to register a state client.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Take the lock on the list after we've done all necessary work and check state.
If we are in requested state, do cleanup and return without adding the state
client to the state client list.
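
As a sketch of this ordering (the struct and function names below are hypothetical and much simpler than the real liblxc types): the allocation happens before the lock is taken, the state is checked under the lock, and the client is freed instead of added when the requested state has already been reached.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical, simplified state client and handler. */
struct state_client {
	int wanted_state;
	struct state_client *next;
};

struct handler_sketch {
	int state;
	pthread_mutex_t list_lock;
	struct state_client *clients;
};

/*
 * Returns 1 if the handler is already in the requested state (nothing is
 * added), 0 if the client was added to the list, -1 on allocation failure.
 */
static int add_state_client(struct handler_sketch *h, int wanted_state)
{
	struct state_client *cl;

	assert(h);

	/* Do all the costly work before taking the lock. */
	cl = calloc(1, sizeof(*cl));
	if (!cl)
		return -1;
	cl->wanted_state = wanted_state;

	pthread_mutex_lock(&h->list_lock);
	if (h->state == wanted_state) {
		/* Already there: clean up and do not touch the list. */
		pthread_mutex_unlock(&h->list_lock);
		free(cl);
		return 1;
	}

	cl->next = h->clients;
	h->clients = cl;
	pthread_mutex_unlock(&h->list_lock);

	return 0;
}
```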
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
This adds reboot2() as a new API extension. This function properly waits until a
reboot succeeded. It takes a timeout argument. When set to > 0 reboot2() will
block until the timeout is reached, if timeout is set to zero reboot2() will
not block, if set to -1 reboot2() will block indefinitely.
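
The three timeout cases map naturally onto poll(). The following is only a sketch of that convention, assuming the timeout is given in seconds; wait_readable() is a hypothetical helper and not part of the liblxc API.

```c
#include <assert.h>
#include <limits.h>
#include <poll.h>

/*
 * Hypothetical helper: wait until fd becomes readable.
 *   timeout_sec  > 0: block for at most that many seconds
 *   timeout_sec == 0: do not block
 *   timeout_sec == -1: block indefinitely
 * Returns 1 if fd is ready, 0 on timeout, -1 on error.
 */
static int wait_readable(int fd, int timeout_sec)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int timeout_ms, ret;

	assert(timeout_sec >= -1);

	if (timeout_sec < 0)
		timeout_ms = -1;
	else if (timeout_sec > INT_MAX / 1000)
		timeout_ms = INT_MAX;
	else
		timeout_ms = timeout_sec * 1000;

	ret = poll(&pfd, 1, timeout_ms);
	if (ret < 0)
		return -1;

	return ret > 0;
}
```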
The struct state_client gets renamed to lxc_state_client since it's more in line
with other declarations. It also gets moved from the lxc_handler to the
lxc_conf struct so that the state clients waiting for reboots don't get
deallocated on reboot since the handler is deallocated on reboot.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
- setting the handler->state value is atomic on any POSIX implementation since
we're dealing with an integer (enum/lxc_state_t)
- while the state clients are served it is not possible for lxc_set_state() to
transition to the next state anyway so there's no danger in moving to the
next state with clients missing it
- we only care about the list being modified
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
There are multiple reasons why this is not required:
- every command is transactional
- we only care about the list being modified not the memory allocation and
other costly operations
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
We're dealing with an integer (lxc_state_t which is an enum). Any POSIX
implementation makes those operations atomic so there's no need to lock
this.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Abbas Ally [Sun, 3 Dec 2017 05:51:44 +0000 (05:51 +0000)]
Add bash completion to list backing store types for lxc-create -B
- Backing Store types are hard-coded (Not sure how to get programmatically)
- Closes #1236
CC-Hsu [Sat, 2 Dec 2017 11:27:34 +0000 (19:27 +0800)]
Add new dependency to Slackware template
I followed the [changelog of Slackware-current]<http://www.slackware.com/changelog/>
and found that Slackware-current split the hostname utility out of the util-linux package on Nov 17, 2017.
So I added the new package to the template.
Change conf.c to export function write_id_mapping, which will now be
called inside main function of lxc_unshare.c.
This is required because the setuid syscall only permits a process in a new
userns to set a new uid if the uid passed as parameter is mapped inside the ns
via the uid_map file[1]. So, just after the clone invocation, map the uid passed
as parameter into the newly created user namespace, and use the current uid
as the ID-outside-ns. After the mapping is done, the setuid call succeeds.
Felix Abecassis [Wed, 29 Nov 2017 04:27:53 +0000 (20:27 -0800)]
confile_utils: simplify lxc_config_net_hwaddr
In addition to the memory corruption fixed in ee3e84df78424d26fc6c90862fbe0fa92a686b0d,
this function was also performing invalid memory accesses for the following inputs:
- `lxc.net`
- `lxc.net.`
- `lxc.net.0.`
- `lxc.network`
- `lxc.network.0.`
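
A defensive way to handle such keys is to validate every component before indexing into the string. The sketch below only covers new-style `lxc.net.<index>.<subkey>` keys, and net_subkey() is a hypothetical helper, not the code in confile_utils.c.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper: return a pointer to the subkey of a key of the form
 * "lxc.net.<index>.<subkey>", or NULL if any part is missing.
 */
static const char *net_subkey(const char *key)
{
	static const char prefix[] = "lxc.net.";
	const char *p;

	assert(key);

	if (strncmp(key, prefix, sizeof(prefix) - 1) != 0)
		return NULL;

	/* At least one digit must follow the prefix. */
	p = key + sizeof(prefix) - 1;
	if (!isdigit((unsigned char)*p))
		return NULL;

	while (isdigit((unsigned char)*p))
		p++;

	/* The index must be followed by '.' and a non-empty subkey. */
	if (*p != '.' || p[1] == '\0')
		return NULL;

	return p + 1;
}
```

All five inputs listed above are rejected by this check, while a complete key such as `lxc.net.0.hwaddr` yields the subkey `hwaddr`.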
Signed-off-by: Felix Abecassis <fabecassis@nvidia.com>