git.proxmox.com Git - mirror_lxc.git/log
Christian Brauner [Wed, 13 Sep 2017 02:01:41 +0000 (04:01 +0200)]
start: set environment variables correctly

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Tue, 12 Sep 2017 19:23:17 +0000 (21:23 +0200)]
start: move env setup before container setup

The hooks (e.g. lxc.hook.mount) should have the environment variables the user
gave us available.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian von Roques [Tue, 12 Sep 2017 10:31:23 +0000 (12:31 +0200)]
fix regex-typo in lxc-monitor.sgml.in

To match names beginning with the letters "f" or "b" one can use
the regular expression "[fb].*" or "(f|b).*", but not "[f|b].*",
which would match strings beginning with "f", "|", or "b".
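
The difference between the three patterns can be checked with POSIX extended regular expressions. The following is a minimal sketch; the helper name `matches_at_start()` is hypothetical, and how lxc-monitor itself anchors and compiles its patterns is not shown in this commit.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if `pattern` (POSIX ERE) matches `name`
 * starting at its first character. */
static bool matches_at_start(const char *pattern, const char *name)
{
	regex_t re;
	regmatch_t m;
	bool ok;

	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return false;

	/* regexec() reports the leftmost match; require it to begin at 0. */
	ok = regexec(&re, name, 1, &m, 0) == 0 && m.rm_so == 0;
	regfree(&re);
	return ok;
}
```

With this helper, "[fb].*" and "(f|b).*" accept "foo" and "bar" but reject "|oo", while "[f|b].*" also accepts "|oo" because "|" is an ordinary character inside a bracket expression.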

Signed-off-by: Christian von Roques <roques@z12.ch>
KATOH Yasufumi [Tue, 12 Sep 2017 06:29:34 +0000 (15:29 +0900)]
plamo: Delete unnecessary process during container shutdown

Since some remounts/umounts are executed in the plamo shutdown script,
the filesystem on which a container resides might be mounted
read-only. This patch deletes some mounts and umounts from the shutdown
script. It also deletes the hwclock setting process.

It also deletes an unnecessary output.

Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
Christian Brauner [Mon, 11 Sep 2017 01:30:00 +0000 (03:30 +0200)]
storage: avoid segfault

When the "lxc.rootfs.path" property is not set and users request a container
copy we would segfault since strstr() would be called on a NULL pointer.
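
The kind of guard that avoids this crash can be sketched as follows. The helper name `rootfs_path_contains()` is hypothetical and is not taken from the LXC sources; it only illustrates checking for NULL before calling strstr().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if `path` is set and contains `needle`.
 * Passing NULL to strstr() is undefined behaviour, so an unset
 * property is treated as "no match" instead. */
static bool rootfs_path_contains(const char *path, const char *needle)
{
	if (!path || !needle)
		return false;

	return strstr(path, needle) != NULL;
}
```

A caller would pass the value of "lxc.rootfs.path" (possibly NULL) and fall back to an error path when the helper returns false.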

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Mon, 11 Sep 2017 01:16:06 +0000 (03:16 +0200)]
start: switch ids at last possible instance

This is technically not necessary but it is a privilege sensitive operation.
Meaning if anyone wants to do something that requires privilege it should be
done before the id switch. So let's move the id switch immediately before the
exec so that it's called at the last possible moment.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sun, 10 Sep 2017 11:49:18 +0000 (13:49 +0200)]
execute: enable console & standard /dev symlinks

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sun, 10 Sep 2017 07:38:57 +0000 (09:38 +0200)]
confile: preserve newlines

Users were confused when the config file created during cloning or copying a
container was suddenly missing all newlines. Let's keep them.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sun, 10 Sep 2017 06:23:59 +0000 (08:23 +0200)]
network: remove dead assignments

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sun, 10 Sep 2017 06:23:36 +0000 (08:23 +0200)]
lxc-user-nic: remove double initialization

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sun, 10 Sep 2017 06:09:52 +0000 (08:09 +0200)]
utils: lxc_popen() remove dead assignments

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sun, 10 Sep 2017 06:09:05 +0000 (08:09 +0200)]
tests: avoid NULL pointer dereference

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sun, 10 Sep 2017 06:06:26 +0000 (08:06 +0200)]
tests: remove dead assignments

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sun, 10 Sep 2017 06:03:06 +0000 (08:03 +0200)]
lxc_usernsexec: remove dead assignments

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sun, 10 Sep 2017 06:01:31 +0000 (08:01 +0200)]
lxc-unshare: do not pass NULL pointer

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sun, 10 Sep 2017 06:00:50 +0000 (08:00 +0200)]
confile: parse_idmaps() remove dead assignments

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sun, 10 Sep 2017 05:04:34 +0000 (07:04 +0200)]
overlay: fix use after free()

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sun, 10 Sep 2017 04:42:10 +0000 (06:42 +0200)]
utils: do not write to 0 sized buffer

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sat, 9 Sep 2017 17:29:53 +0000 (19:29 +0200)]
storage/overlay: do not write to invalid memory

Closes #1802.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sat, 9 Sep 2017 16:45:47 +0000 (18:45 +0200)]
criu: use correct check initialization check

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sat, 9 Sep 2017 09:23:55 +0000 (11:23 +0200)]
start: remove dead variable

non-functional changes

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sat, 9 Sep 2017 09:23:34 +0000 (11:23 +0200)]
monitor: remove dead assignment

non-functional changes

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sat, 9 Sep 2017 09:23:14 +0000 (11:23 +0200)]
console: remove dead assignments

non-functional changes

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sat, 9 Sep 2017 09:22:44 +0000 (11:22 +0200)]
storage: use userns_exec_full()

Closes #1800.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sat, 9 Sep 2017 09:21:51 +0000 (11:21 +0200)]
start: userns_exec_full()

Closes #1800.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sat, 9 Sep 2017 09:21:16 +0000 (11:21 +0200)]
lxccontainer: use userns_exec_full()

Closes #1800.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sat, 9 Sep 2017 09:20:57 +0000 (11:20 +0200)]
conf: add userns_exec_full()

Closes #1800.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Wed, 6 Sep 2017 10:33:19 +0000 (12:33 +0200)]
tools: fix lxc-update-config

- replace lxc.network.[i].ipv4 with lxc.net.[i].ipv4.address
- remove lxc.rootfs.backend lines

Closes #1790.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
KATOH Yasufumi [Wed, 6 Sep 2017 10:17:00 +0000 (19:17 +0900)]
doc: Add lxc.cgroup.dir to Japanese lxc.container.conf(5)

* and fix a typo in the English man page

Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
KATOH Yasufumi [Wed, 6 Sep 2017 10:00:19 +0000 (19:00 +0900)]
doc: Translate lxc(7) into Japanese

* Update for commit 594d6e30d6c86f55c340bf49f0aa15b761d7e627
* and some improvements

Signed-off-by: KATOH Yasufumi <karma@jazz.email.ne.jp>
LiFeng [Tue, 5 Sep 2017 15:16:50 +0000 (23:16 +0800)]
console: clean tty state + return 0 on peer exit

In the past, if the console client exited, lxc_console_cb_con returned 1, so
lxc_poll exited and the process waited at waitpid. At that point the process
could not handle any commands (for example, getting the container state with
LXC_CMD_GET_STATE or stopping the container with LXC_CMD_STOP).

I think we should clean the tty_state and return 0 in this case. Then we can use
lxc-console to connect to the console of the container, we do not exit the
function lxc_poll, and we can still handle commands via lxc_cmd_process.

Reproducer prior to this commit:
- open a new terminal and get the tty device name with the tty command, e.g.
  /dev/pts/6
- set lxc.console.path = /dev/pts/6
- start the container; the output will be printed to /dev/pts/6
- close /dev/pts/6
- try an operation, e.g. getting the state with lxc-ls; lxc-ls will hang

Closes #1787.

Signed-off-by: LiFeng <lifeng68@huawei.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Wolfgang Bumiller [Wed, 6 Sep 2017 09:45:03 +0000 (11:45 +0200)]
cleanup: remove unnecessary zeroing

The entire netdev is zeroed via memset() already. Unions and
all.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Wolfgang Bumiller [Wed, 6 Sep 2017 09:51:03 +0000 (11:51 +0200)]
network: add missing checks for empty links

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Stéphane Graber [Wed, 6 Sep 2017 02:31:20 +0000 (22:31 -0400)]
change version to 2.1.0 in configure.ac

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Stéphane Graber [Tue, 5 Sep 2017 23:19:45 +0000 (19:19 -0400)]
Merge pull request #1789 from brauner/2017-09-06/fix_documentation

doc: adapt + update

Christian Brauner [Tue, 5 Sep 2017 22:43:05 +0000 (00:43 +0200)]
doc: bugfixes

- lxc.id_map -> lxc.idmap
- document lxc.cgroup.dir

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Tue, 5 Sep 2017 22:30:40 +0000 (00:30 +0200)]
doc: lxc.sgml.in

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Stéphane Graber [Tue, 5 Sep 2017 20:55:27 +0000 (16:55 -0400)]
Minimal kernel version is now 3.10

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Serge Hallyn [Tue, 5 Sep 2017 17:51:57 +0000 (12:51 -0500)]
Merge pull request #1788 from brauner/2017-09-05/fix_tty_creation

conf: bugfixes

Christian Brauner [Tue, 5 Sep 2017 15:43:31 +0000 (17:43 +0200)]
conf: fix userns_exec_1()

A bit of context:
userns_exec_1() is only used to operate based on privileges for the user's own
{g,u}id on the host and for the container root's unmapped {g,u}id. This means
we only need to establish a mapping from:
- the container root {g,u}id as seen from the host -> user's host {g,u}id
- the container root -> some sub{g,u}id

This function however was buggy. It relied on some pointer pointing to the same
memory, namely specific idmap entries in the idmap list in the container's
in-memory configuration. However, due to a stupid mistake of mine, the pointers
to be compared pointed to freshly allocated memory. They were never pointing to
the intended memory locations. To reproduce what I'm talking about prior to
this commit simply place:

    chb:999:1000000000
    chb:999:1
    chb:1000:1

in /etc/sub{g,u}id then create a container which requests the following
idmappings:

    lxc.idmap = u 0 999 999
    lxc.idmap = g 0 999 1000000000

and start the container. What we *would expect* is for liblxc to establish the
following mapping:

    newuidmap <pid> 0 999 999
    newgidmap <pid> 0 999 1000000000

since all required mappings are present. Due to the buggy pointer comparisons
what happened was:

    newuidmap <pid> 0 999 999 0 999 999
    newgidmap <pid> 0 999 1000000000 0 999 1000000000

Let's fix this.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Tue, 5 Sep 2017 11:46:53 +0000 (13:46 +0200)]
conf: do not log uninitialized memory

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Tue, 5 Sep 2017 10:41:30 +0000 (12:41 +0200)]
conf: fix tty creation

We allocate pty {master,slave} file descriptors in the child's namespaces after
we have set up devpts. After we have sent the pty file descriptors to the parent
and set up the pty file descriptors under /dev/tty* and before we exec the init
binary we need to delete these file descriptors in the child. However, one of
my commits made the deletion occur before setting up the file descriptors under
/dev/tty*. This caused failures when trying to attach to the container's ttys
since they weren't actually configured although the file descriptors were
available in the in-memory configuration of the parent.
This commit reworks setting up ttys such that deletion occurs after all setup
has been performed. The commit is actually minimal but also needs to move all
the functions into one place since they will now be called from
"lxc_create_ttys()".

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Tue, 5 Sep 2017 10:19:28 +0000 (12:19 +0200)]
conf: non-functional changes

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Stéphane Graber [Tue, 5 Sep 2017 00:18:25 +0000 (20:18 -0400)]
Merge pull request #1785 from brauner/2017-09-05/record_idmap_in_log

conf: record idmap that gets written

Christian Brauner [Tue, 5 Sep 2017 00:00:29 +0000 (02:00 +0200)]
conf: record idmap that gets written

This will serve us well in the future!

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Stéphane Graber [Mon, 4 Sep 2017 22:45:32 +0000 (18:45 -0400)]
Merge pull request #1784 from brauner/2017-09-05/document_handler_fields

start: document all handler fields

Christian Brauner [Mon, 4 Sep 2017 22:29:01 +0000 (00:29 +0200)]
start: document all handler fields

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Stéphane Graber [Mon, 4 Sep 2017 19:52:44 +0000 (15:52 -0400)]
Merge pull request #1783 from brauner/2017-09-04/criu_version

criu: add cmp_version()

Federico Briata [Mon, 4 Sep 2017 10:16:35 +0000 (12:16 +0200)]
criu: add cmp_version()

We cannot use strcmp(). Otherwise we incorrectly report e.g. that criu 2.12.1
is less than 2.8.
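
Why strcmp() gets this wrong, and what a numeric comparison looks like, can be sketched as follows. This is an illustrative comparison of dotted version strings, not the cmp_version() added by this commit; the name `cmp_dotted_version()` is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical sketch: compare dotted numeric versions component by
 * component. Returns <0, 0 or >0 like strcmp(). Missing components
 * count as 0, so "2.8" equals "2.8.0". */
static int cmp_dotted_version(const char *a, const char *b)
{
	for (;;) {
		char *end_a, *end_b;
		unsigned long na = strtoul(a, &end_a, 10);
		unsigned long nb = strtoul(b, &end_b, 10);

		if (na != nb)
			return na < nb ? -1 : 1;

		/* Stop once neither string has a further ".<number>" part. */
		if (*end_a != '.' && *end_b != '.')
			return 0;

		a = (*end_a == '.') ? end_a + 1 : end_a;
		b = (*end_b == '.') ? end_b + 1 : end_b;
	}
}
```

strcmp("2.12.1", "2.8") is negative because the character '1' sorts before '8', whereas the numeric comparison sees 12 > 8 in the second component.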

Signed-off-by: Federico Briata <federico-pietro.briata@cnhind.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Stéphane Graber [Mon, 4 Sep 2017 16:54:59 +0000 (12:54 -0400)]
Merge pull request #1771 from brauner/2017-08-30/remove_executable_bit_from_console.c

console: non-functional change

Stéphane Graber [Mon, 4 Sep 2017 15:54:23 +0000 (11:54 -0400)]
Merge pull request #1782 from brauner/2017-09-04/fix_tty_sending

conf: don't send ttys when none are configured

Christian Brauner [Mon, 4 Sep 2017 12:48:37 +0000 (14:48 +0200)]
start: don't let data_sock users close the fd

It is bad style to close an fd inside a function which didn't create it. Let's
rather close it transparently in start.c.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Mon, 4 Sep 2017 12:35:02 +0000 (14:35 +0200)]
conf: don't send ttys when none are configured

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Serge Hallyn [Mon, 4 Sep 2017 02:17:43 +0000 (21:17 -0500)]
Merge pull request #1773 from brauner/2017-08-31/ensure_lxc_user_nic_tests_privilege_over_netns

network: improvements + bugfixes

Christian Brauner [Sun, 3 Sep 2017 23:27:30 +0000 (01:27 +0200)]
start: switch from SOCK_DGRAM to SOCK_STREAM

Writes < PIPE_BUF will be atomic. PIPE_BUF is guaranteed to be at least 512 by
POSIX and Linux guarantees 4096. Nothing we send around goes over this limit.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sun, 3 Sep 2017 23:27:04 +0000 (01:27 +0200)]
conf: send ttys in batches of 2

I thought we could send all ttys at once but this limits the number of ttys
users can use because of iovec_len restrictions. So let's send them in batches
of 2.
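
The batching idea can be sketched independently of the actual transport. In the sketch below all names (`send_fds_in_batches()`, `send_batch_fn`, `count_batch()`) are hypothetical, and what exactly one batch contains in liblxc is an assumption; the callback stands in for whatever sends one batch over the socket.

```c
#include <stddef.h>

/* Hypothetical callback type: sends `nfds` descriptors in one message. */
typedef int (*send_batch_fn)(const int *fds, size_t nfds, void *data);

/* Hand `fds` to `send_batch` in chunks of at most `batch_size`, so no
 * single message grows with the total number of descriptors. */
static int send_fds_in_batches(const int *fds, size_t nfds, size_t batch_size,
			       send_batch_fn send_batch, void *data)
{
	size_t i;

	if (batch_size == 0)
		return -1;

	for (i = 0; i < nfds; i += batch_size) {
		size_t n = nfds - i < batch_size ? nfds - i : batch_size;

		if (send_batch(&fds[i], n, data) < 0)
			return -1;
	}

	return 0;
}

/* Example callback that only counts how many batches it was handed. */
static int count_batch(const int *fds, size_t nfds, void *data)
{
	(void)fds;
	(void)nfds;
	(*(size_t *)data)++;
	return 0;
}
```

With five descriptors and a batch size of 2 the callback is invoked three times, the last time with a single descriptor.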

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sun, 3 Sep 2017 18:49:54 +0000 (20:49 +0200)]
lxc-user-nic: simplify

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sun, 3 Sep 2017 18:37:21 +0000 (20:37 +0200)]
network: remove allocation from lxc_mkifname()

lxc_mkifname() really doesn't need to allocate any memory.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sun, 3 Sep 2017 15:08:23 +0000 (17:08 +0200)]
network: fix grammar

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sun, 3 Sep 2017 14:51:54 +0000 (16:51 +0200)]
network: use send()/recv()

Also move all functions to network.{c,h}.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sun, 3 Sep 2017 14:44:41 +0000 (16:44 +0200)]
handler: root -> am_root

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sun, 3 Sep 2017 14:40:11 +0000 (16:40 +0200)]
lxc-user-nic: bugfixes

Since find_line() was changed earlier, count_entries() started counting lines
incorrectly. It would report that the maximum was reached before you actually
reached your allotted maximum.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sun, 3 Sep 2017 14:35:48 +0000 (16:35 +0200)]
utils: add lxc_nic_exists()

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sat, 2 Sep 2017 17:44:10 +0000 (19:44 +0200)]
lxc-user-nic: keep lines from other {users,links}

Assume the db contained the following entries:

    chb veth lxcbr0 veth1
    chb veth lxcbr0 veth2
    chb veth lxdbr0 veth3
    chb veth lxdbr0 veth2
    didi veth lxcbr0 veth4

And you request

    cull_entries("chb", "veth", "lxdbr0", "veth3");

lxc-user-nic would wipe any entries that did not match irrespective of whether
they existed or not. Let's fix that.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Sat, 2 Sep 2017 00:26:28 +0000 (02:26 +0200)]
lxc-user-nic: fix adding database entries

The code before inserted \0-bytes after every new line which made the db
basically unusable.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Fri, 1 Sep 2017 20:33:21 +0000 (22:33 +0200)]
network: remove netpipe

We use data_sock for all things we need to send around between parent and child
now. It doesn't make sense to have so many different pipes and sockets if one
will do just fine.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Dimitri John Ledkov [Thu, 31 Aug 2017 11:40:58 +0000 (12:40 +0100)]
Check that there is a netplan binary, rather than just a config directory.

Signed-off-by: Dimitri John Ledkov <xnox@ubuntu.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Dimitri John Ledkov [Wed, 30 Aug 2017 12:45:27 +0000 (13:45 +0100)]
templates/ubuntu: support netplan in newer releases by default

If netplan is present in the container, configure default networking
with netplan instead of ifupdown. Also, do not install ifupdown when
bootstrapping the minbase variant, unless using currently supported
non-netplan releases (trusty, xenial, zesty).

Signed-off-by: Dimitri John Ledkov <xnox@ubuntu.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
Christian Brauner [Fri, 1 Sep 2017 17:34:43 +0000 (19:34 +0200)]
network: use correct network device name

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Fri, 1 Sep 2017 14:44:46 +0000 (16:44 +0200)]
network: stop recording saved physical net devices

liblxc will now correctly log any network device names and ifindices in their
respective network namespaces. So there's no need to record physical network
devices any more. This spares us heap allocations and memory we need to have
lying around until the container is shut down.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Fri, 1 Sep 2017 13:30:28 +0000 (15:30 +0200)]
network: retrieve correct names and ifindices

On privileged network creation we only retrieved the names and ifindices of
network devices in the host's network namespace. This meant that the monitor
process was acting on possibly incorrect information. With this commit we have
the child send back the correct device names and ifindices in the container's
network namespace.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Fri, 1 Sep 2017 11:04:00 +0000 (13:04 +0200)]
start: non-functional changes

This renames the socketpair() variable "ttysock" to "data_sock" since we will
use it to send arbitrary data around, not just ttys anymore.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Fri, 1 Sep 2017 10:54:43 +0000 (12:54 +0200)]
network: non-functional changes

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Thu, 31 Aug 2017 22:23:30 +0000 (00:23 +0200)]
network: use static memory for net device names

All network device names can only be of size < IFNAMSIZ. So let's spare the
useless heap allocations and use static memory.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Thu, 31 Aug 2017 21:13:44 +0000 (23:13 +0200)]
lxc-user-nic: initialize vars to silence gcc-7

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Thu, 31 Aug 2017 21:08:28 +0000 (23:08 +0200)]
lxc-user-nic: free memory and check for error

- check for error on ifindex retrieval
- free allocated memory

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Thu, 31 Aug 2017 21:01:46 +0000 (23:01 +0200)]
start: non-functional changes

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Thu, 31 Aug 2017 20:58:30 +0000 (22:58 +0200)]
network: retrieve the host's veth device ifindex

- Retrieve the host's veth device ifindex in the host's network namespace.
- Add a note why we retrieve the container's veth device ifindex in the host's
  network namespace.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Serge Hallyn [Thu, 31 Aug 2017 17:15:22 +0000 (12:15 -0500)]
Merge pull request #1772 from brauner/2017-08-31/ensure_lxc_user_nic_tests_privilege_over_netns

lxc-user-nic: test privilege over netns on delete

Christian Brauner [Thu, 31 Aug 2017 13:30:39 +0000 (15:30 +0200)]
network: rework network creation

- On unprivileged veth network creation have lxc-user-nic send the names of the
  veth devices and their respective ifindices. The advantage of retrieving this
  information from lxc-user-nic is that we spare ourselves sending around more
  stuff via the netpipe in start.c. Also, lxc-user-nic operates in both
  namespaces (the container's namespace and the host's namespace) via setns()
  and so is guaranteed to retrieve the correct ifindex via if_nametoindex()
  which is a network namespace aware ioctl() call. While I'm pretty sure the
  ifindices for veth devices are identical across network namespaces I'm wary
  of relying on this. We need the ifindices to guarantee safe deletion of
  unprivileged network devices via lxc-user-nic later on since we use them to
  identify the network devices in their corresponding network namespaces.
- Move the network device logging from the child to the parent. The child does
  not have all of the information about the network devices available, only the
  few bits it actually needs to know. The monitor process is the only process
  that needs all this information.
- The network creation code for privileged and unprivileged networks was
  previously mangled into one single function but at the same time some of the
  privileged code had additional functions that were called in other places in
  start.c. Let's divide and conquer and split out the privileged and
  unprivileged network creation into completely separate functions. This makes
  what's happening way more clear. This will also have no performance impact
  since either you are privileged and only execute the privileged network
  creation functions or you are unprivileged and only execute the unprivileged
  network creation functions.
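
Looking up an ifindex in the caller's current network namespace can be sketched with if_nametoindex(). The wrapper name `get_ifindex()` is hypothetical; the setns() step that lxc-user-nic performs before the lookup is not shown.

```c
#include <net/if.h>

/* Hypothetical wrapper: resolve `name` to its ifindex in the network
 * namespace the calling process is currently in. if_nametoindex()
 * returns 0 when no such interface exists. */
static int get_ifindex(const char *name, unsigned int *ifindex)
{
	unsigned int idx = if_nametoindex(name);

	if (idx == 0)
		return -1;

	*ifindex = idx;
	return 0;
}
```

The result depends on the namespace the process has entered, which is why the lookup has to happen after switching into the namespace of interest.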

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Thu, 31 Aug 2017 13:25:16 +0000 (15:25 +0200)]
network: log ifindex for host side veth device

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Thu, 31 Aug 2017 11:23:18 +0000 (13:23 +0200)]
network: document all fields in struct lxc_netdev

This is menial work but I'll thank myself later... a lot.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Thu, 31 Aug 2017 11:19:33 +0000 (13:19 +0200)]
network: add ifindex field for host veth device

We should not just record the ifindex for the container's veth device but also
for the host's veth device. This is useful when {configuring,deconfiguring}
veth devices and becomes crucial when calling our lxc-user-nic setuid helper
where we rely on the ifindex to make decisions about whether we are licensed to
perform certain operations on the veth device in question.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Thu, 31 Aug 2017 11:17:11 +0000 (13:17 +0200)]
network: log veth_attr.pair and veth_attr.veth1

If the user specified lxc.net.[i].veth.pair attribute to request that the host
side of a veth pair be given a specific name let's log it at the trace level.
Otherwise, if the user did not specify lxc.net.[i].veth.pair, veth_attr.veth1
will contain the name of the host side veth device.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Wed, 30 Aug 2017 23:32:39 +0000 (01:32 +0200)]
lxc-user-nic: test privilege over netns on delete

When lxc-user-nic is called with the "delete" subcommand we need to make sure
that we are actually privileged over the network namespace for which we are
supposed to delete devices on the host. To this end we require that the path to the
affected network namespace is passed. We then setns() to the network namespace
and drop privilege to the caller's real user id. Then we try to delete the
loopback interface which is not possible. If we are privileged over the network
namespace this operation will fail with ENOTSUP. If we are not privileged over
the network namespace we will get EPERM.
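
The decision made from the error code can be sketched as a small classifier. The enum and function names are hypothetical, and the netlink request that actually attempts to delete the loopback interface is omitted; only the errno interpretation described above is shown.

```c
#include <errno.h>

/* Hypothetical result type for the privilege probe. */
enum netns_privilege {
	NETNS_PRIVILEGED,   /* delete was refused as unsupported */
	NETNS_UNPRIVILEGED, /* delete was refused for lack of permission */
	NETNS_UNKNOWN       /* any other outcome: do not trust the caller */
};

/* Interpret the errno left behind by the failed attempt to delete the
 * loopback device after setns() and dropping to the real user id. */
static enum netns_privilege classify_loopback_delete(int err)
{
	switch (err) {
	case ENOTSUP:
		return NETNS_PRIVILEGED;
	case EPERM:
		return NETNS_UNPRIVILEGED;
	default:
		return NETNS_UNKNOWN;
	}
}
```

Treating every unexpected error as "unknown" rather than "privileged" keeps the setuid helper on the conservative side; that default is a design assumption of this sketch.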

This is the first part of the commit. As of now nothing guarantees that the
caller does not just give us a random path to a network namespace it is
privileged over.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Wed, 30 Aug 2017 14:45:45 +0000 (16:45 +0200)]
configure: remove slash from cgroup pattern

This is the cause of the unnecessary extraneous slashes when creating cgroups.
Our lxc.system.conf page also clearly shows "lxc/%n" as example, not "/lxc%n".

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Wed, 30 Aug 2017 14:37:22 +0000 (16:37 +0200)]
console: non-functional change

Remove executable bit.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Stéphane Graber [Wed, 30 Aug 2017 14:35:06 +0000 (10:35 -0400)]
Merge pull request #1769 from brauner/2017-08-30/improve_empty_cgroup_deletion

Revert "cgfsng: try to delete parent cgroups"

Christian Brauner [Wed, 30 Aug 2017 10:26:42 +0000 (12:26 +0200)]
confile: remove unnecessary cleanup code

set_config_string_item() already free()s before setting the new value.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Christian Brauner [Wed, 30 Aug 2017 10:26:10 +0000 (12:26 +0200)]
Revert "cgfsng: try to delete parent cgroups"

This reverts commit 92c590ae1ea40bc094603ab49c20b785cc88bb1d.

Problem:

    Commit 92c590ae1ea40bc094603ab49c20b785cc88bb1d introduced the following
    behavior:

    > cgfsng: try to delete parent cgroups
    >
    > Say we have
    >
    >     lxc.uts.name = c1
    >     lxc.cgroup.dir = lxd/a/b/c
    >
    > the path for the container's cgroup would be
    >
    >     lxd/a/b/c/c1
    >
    > When the container is shutdown we should not just try to delete "c1" we
    > should also try to delete "c", "b", "a", and "lxd". This is to ensure
    > that we don't leave empty cgroups around thereby increasing the chance
    > that we run into trouble with cgroup limits. The algorithm for this isn't
    > too costly since we can simply stop walking upwards at the first rmdir()
    > failure.

    The algorithm employs recursive_destroy(), which opens each directory
    specified in lxc.cgroup.dir and tries to delete each directory within that
    directory. For example, assume "/sys/fs/cgroup/memory/lxd/a/b/c" only
    contains the cgroup "c1" for container "c1". Assume that "c1" calls
    recursive_destroy() to clean up its cgroups. It will first delete "c1" and
    anything underneath it. This is perfectly fine since anything underneath
    that cgroup is under its control. The new algorithm will then tell it to
    "recurse upwards". So recursive_destroy() will try to delete
    "/sys/fs/cgroup/memory/lxd/a/b/c" next. Now assume that a second container
    "c2" has "lxc.cgroup.dir = lxd/a/b/c" set in its config file and calls
    cgroup_create(). This will create the *empty* cgroup
    "/sys/fs/cgroup/memory/lxd/a/b/c/c2". Now assume that, after "c2" has been
    created, container "c1"'s call to recursive_destroy() reaches
    "/sys/fs/cgroup/memory/lxd/a/b/c/c2" before it is populated. Then the
    cgroup "c2" will be removed. Now "c2" calls cgroup_enter() to enter the
    cgroup it created. This will fail since "c1" deleted the cgroup "c2". (As a
    side note: this is one of the few race conditions that are actually easy
    to describe.)

Possible Solution:

    Instead of calling recursive_destroy() on all cgroups specified in
    lxc.cgroup.dir we only call recursive_destroy() on the container's own
    cgroup "/sys/fs/cgroup/memory/lxd/a/b/c/c1". When we start to recurse
    upwards we only call unlinkat(AT_FDCWD, path, AT_REMOVEDIR). This should
    avoid the race described above. My argument is as follows. Assume that the
    container c1 has created the cgroup "/sys/fs/cgroup/memory/lxd/a/b/c/c1"
    for itself. Now c1 calls cgroup_destroy(). First, recursive_destroy() will
    be called on the cgroup "c1" which will delete any empty cgroup directories
    underneath "c1" and finally "c1" itself. This is fine since everything
    under "c1" is container c1's sole property. Now container c1 will call
    unlinkat() on "/sys/fs/cgroup/memory/lxd/a/b/c":
    - Assume that in the meantime container c2 has created the cgroup
      "/sys/fs/cgroup/memory/lxd/a/b/c/c2". Then c1's unlinkat() will fail.
      This will stop c1 from recursing upwards. So c2's cgroup_enter() call
      will find all its cgroups intact and well. unlinkat() will come with the
      appropriate in-kernel locking which will stop it from racing with
      mkdir().
    - There's still a subtle race left. c2 might be calling an implementation
      of mkdir -p to try and create e.g. the cgroup
      "/sys/fs/cgroup/memory/lxd/a/b". Let's assume "b" exists. Then c2 will
      receive EEXIST on "b" and move on to create "c". Let's further assume c1
      has already deleted "c". c1 will now be able to delete
      "/sys/fs/cgroup/memory/lxd/a/b/" and c2's call to create "c" will fail.

The latter subtle race makes me rethink this approach. For now we'll just leave
empty cgroups behind since I don't want to start locking stuff.
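
For illustration only, the "possible solution" described above can be sketched
in Python against an ordinary directory tree. This is not liblxc code and not
the change that was reverted: the helper name destroy_and_prune() and its
stop_at parameter are hypothetical, os.rmdir() stands in for
unlinkat(AT_FDCWD, path, AT_REMOVEDIR), and shutil.rmtree() stands in for
recursive_destroy(). A real cgroup hierarchy behaves differently from a plain
directory tree, so treat this purely as a model of the walk.

```python
import os
import shutil
import tempfile


def destroy_and_prune(own_cgroup, stop_at):
    """Remove own_cgroup recursively, then prune empty parents below stop_at.

    Hypothetical helper: own_cgroup plays the role of ".../lxd/a/b/c/c1" and
    stop_at the role of the controller mount point, which is never removed.
    """
    # Everything underneath the container's own cgroup belongs to the
    # container, so a recursive removal is acceptable here.
    shutil.rmtree(own_cgroup)

    removed = []
    parent = os.path.dirname(own_cgroup)
    while parent != stop_at and parent.startswith(stop_at + os.sep):
        try:
            # Non-recursive removal: fails if a sibling such as "c2" exists.
            os.rmdir(parent)
        except OSError:
            break  # the first failure ends the upward walk
        removed.append(parent)
        parent = os.path.dirname(parent)
    return removed
```

With a sibling "c2" present, the walk stops at "c" and leaves "c2" untouched.
The sketch does nothing about the remaining mkdir -p race described in the
second bullet point.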

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
6 years agoMerge pull request #1761 from brauner/2017-08-10/further_lxc_2.1_preparations
Serge Hallyn [Tue, 29 Aug 2017 19:57:18 +0000 (14:57 -0500)]
Merge pull request #1761 from brauner/2017-08-10/further_lxc_2.1_preparations

further lxc 2.1 preparations

6 years agoMerge pull request #1767 from xnox/upstart-ssh
Christian Brauner [Tue, 29 Aug 2017 14:52:35 +0000 (16:52 +0200)]
Merge pull request #1767 from xnox/upstart-ssh

templates/ubuntu: conditionally move upstart ssh job, as it is now op…

6 years agotemplates/ubuntu: conditionally move upstart ssh job, as it is now optional.
Dimitri John Ledkov [Tue, 29 Aug 2017 14:11:55 +0000 (15:11 +0100)]
templates/ubuntu: conditionally move upstart ssh job, as it is now optional.

Mimic the code from the debian template.

Signed-off-by: Dimitri John Ledkov <xnox@ubuntu.com>
6 years agonetwork: non-functional changes
Christian Brauner [Mon, 28 Aug 2017 10:23:29 +0000 (12:23 +0200)]
network: non-functional changes

This moves all of the network handling code into network.{c,h}. This makes what
is going on much clearer. Also it's easier to find relevant code if it is all
in one place.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
6 years agoconf: increase lxc-user-nic buffer
Christian Brauner [Sun, 27 Aug 2017 12:48:52 +0000 (14:48 +0200)]
conf: increase lxc-user-nic buffer

This will allow us to log more detailed failures.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
6 years agolxc-user-nic: check db before trying to delete
Christian Brauner [Sun, 27 Aug 2017 07:17:10 +0000 (09:17 +0200)]
lxc-user-nic: check db before trying to delete

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
6 years agolxc-user-nic: non-functional changes
Christian Brauner [Sun, 27 Aug 2017 13:03:16 +0000 (15:03 +0200)]
lxc-user-nic: non-functional changes

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
6 years agonetwork: delete ovs for unprivileged networks
Christian Brauner [Sun, 27 Aug 2017 03:02:23 +0000 (05:02 +0200)]
network: delete ovs for unprivileged networks

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
6 years agoMerge pull request #1763 from brauner/2017-08-28/lxc_2.1_upgrade_script
Stéphane Graber [Mon, 28 Aug 2017 16:00:07 +0000 (12:00 -0400)]
Merge pull request #1763 from brauner/2017-08-28/lxc_2.1_upgrade_script

lxc-update-config: handle legacy networks

6 years agolxc-update-config: handle legacy networks
Christian Brauner [Mon, 28 Aug 2017 14:34:07 +0000 (16:34 +0200)]
lxc-update-config: handle legacy networks

Older instances of liblxc allowed networks to be specified like this:

lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = lxdbr0
lxc.network.name = eth0

lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = lxdbr0
lxc.network.name = eth1

Each occurrence of "lxc.network.type" indicated the definition of a new
network. This syntax is not allowed in newer liblxc instances. Instead, each
network must carry an index. So in new liblxc these two networks would be
translated to:

lxc.net.0.type = veth
lxc.net.0.flags = up
lxc.net.0.link = lxdbr0
lxc.net.0.name = eth0

lxc.net.1.type = veth
lxc.net.1.flags = up
lxc.net.1.link = lxdbr0
lxc.net.1.name = eth1

The update script did not handle this case correctly. It should now.
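
As a rough illustration of the translation described above, here is a minimal
Python sketch. It is not the lxc-update-config implementation: the function
name index_legacy_networks() and the LEGACY_KEY pattern are hypothetical. It
assumes the input uses only un-indexed "lxc.network.*" keys and that every
"lxc.network.type" line opens a new network. Any other key changes the real
script performs are out of scope.

```python
import re

# Un-indexed legacy keys such as "lxc.network.name= eth0". The key suffix must
# not start with a digit, so indexed keys are passed through unchanged.
LEGACY_KEY = re.compile(r"^(\s*)lxc\.network\.([A-Za-z_][A-Za-z0-9_.]*)(\s*=.*)$")


def index_legacy_networks(lines):
    """Rewrite un-indexed "lxc.network.*" keys as "lxc.net.<idx>.*"."""
    index = -1
    out = []
    for line in lines:
        match = LEGACY_KEY.match(line)
        if match is None:
            out.append(line)  # comments, blank lines, unrelated keys
            continue
        indent, suffix, rest = match.groups()
        if suffix == "type":
            index += 1  # each "type" key starts a new network definition
        # Assumption: keys seen before any "type" key belong to network 0.
        out.append(f"{indent}lxc.net.{max(index, 0)}.{suffix}{rest}")
    return out
```

Feeding it the eight legacy lines from the example yields the "lxc.net.0.*"
and "lxc.net.1.*" lines shown, with the original spacing around "=" preserved.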

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>