CGroup Namespaces

A CGroup Namespace provides a mechanism to virtualize the view of the
/proc/<pid>/cgroup file. The CLONE_NEWCGROUP clone-flag can be used with
the clone() and unshare() syscalls to create a new cgroup namespace.
A process running inside the cgroup namespace will have its /proc/<pid>/cgroup
output restricted to cgroupns-root. cgroupns-root is the cgroup of the process
at the time of creation of the cgroup namespace.

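As a rough illustration of the unshare() path described above, the following
Python sketch calls unshare(CLONE_NEWCGROUP) through ctypes. It assumes a Linux
system whose C library exports unshare() and a caller with sufficient privilege
(typically CAP_SYS_ADMIN). The helper name unshare_cgroup_namespace is
hypothetical; the flag value is the one defined in <linux/sched.h>.

```python
import ctypes
import os

# Value of CLONE_NEWCGROUP as defined in <linux/sched.h>.
CLONE_NEWCGROUP = 0x02000000


def unshare_cgroup_namespace():
    """Try to move the calling process into a new cgroup namespace.

    Hypothetical helper for illustration. Returns True on success and
    False if the kernel refuses the request, for example because the
    caller lacks privilege or the kernel has no cgroup namespace support.
    """
    # CDLL(None) exposes the symbols of the already-loaded C library.
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(CLONE_NEWCGROUP) != 0:
        err = ctypes.get_errno()
        print("unshare(CLONE_NEWCGROUP) failed:", os.strerror(err))
        return False
    return True
```

On success, the cgroup the caller is in at that moment becomes the
cgroupns-root of the new namespace.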
Prior to CGroup Namespaces, the /proc/<pid>/cgroup file used to show the
complete path of the cgroup of a process. In a container setup (where a set of
cgroups and namespaces are intended to isolate processes), the
/proc/<pid>/cgroup file may leak system-level information to the isolated
processes.

For example:
  $ cat /proc/self/cgroup
  0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/batchjobs/container_id1

The path '/batchjobs/container_id1' can generally be considered system data,
and it is desirable not to expose it to the isolated process.

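Each line of /proc/<pid>/cgroup, such as the one shown above, has three
colon-separated fields: hierarchy ID, controller list, and cgroup path. A
minimal Python sketch of splitting such a line follows; CgroupEntry and
parse_cgroup_line are hypothetical names invented for this example, not part
of any kernel or library interface.

```python
from typing import List, NamedTuple


class CgroupEntry(NamedTuple):
    """One line of /proc/<pid>/cgroup (hypothetical container type)."""
    hierarchy_id: int
    controllers: List[str]
    path: str


def parse_cgroup_line(line):
    """Split 'hierarchy-ID:controller-list:cgroup-path' into its fields."""
    # Split on the first two colons only, so a ':' inside the cgroup
    # path does not break the parse.
    hierarchy_id, controllers, path = line.rstrip("\n").split(":", 2)
    # An empty controller list yields an empty Python list.
    names = controllers.split(",") if controllers else []
    return CgroupEntry(int(hierarchy_id), names, path)
```

The path field of the result is the part that a cgroup namespace virtualizes;
the other two fields are unaffected.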
CGroup Namespaces can be used to restrict visibility of this path.
For example:
  # Before creating cgroup namespace
  $ ls -l /proc/self/ns/cgroup
  lrwxrwxrwx 1 root root 0 2014-07-15 10:37 /proc/self/ns/cgroup -> cgroup:[4026531835]
  $ cat /proc/self/cgroup
  0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/batchjobs/container_id1

  # unshare(CLONE_NEWCGROUP) and exec /bin/bash
  $ ~/unshare -c
  [ns]$ ls -l /proc/self/ns/cgroup
  lrwxrwxrwx 1 root root 0 2014-07-15 10:35 /proc/self/ns/cgroup -> cgroup:[4026532183]
  # From within new cgroupns, process sees that it is in the root cgroup
  [ns]$ cat /proc/self/cgroup
  0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/

  # From global cgroupns:
  $ cat /proc/<pid>/cgroup
  0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/batchjobs/container_id1

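The /proc/self/ns/cgroup link targets in the listing above differ before and
after the unshare, which is how two processes can be told apart by namespace.
The sketch below, assuming the link target always has the form type:[inode],
parses such a target and compares two of them. parse_ns_link and
same_namespace are hypothetical helper names.

```python
import re

# Matches link targets such as "cgroup:[4026531835]".
_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Return (namespace type, inode number) for a /proc/<pid>/ns/* target."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError("unexpected namespace link target: %r" % (target,))
    return match.group("kind"), int(match.group("inode"))


def same_namespace(link_a, link_b):
    """True if both link targets name the same namespace."""
    return parse_ns_link(link_a) == parse_ns_link(link_b)
```

On a live system the input would typically come from
os.readlink("/proc/<pid>/ns/cgroup") for the processes being compared.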
  # Unshare cgroupns along with userns and mountns
  # The following calls unshare(CLONE_NEWCGROUP|CLONE_NEWUSER|CLONE_NEWNS), then
  # sets up uid/gid map and execs /bin/bash
  $ ~/unshare -c -u -m
  # Originally, we were in /batchjobs/container_id1 cgroup. Mount our own cgroup
  # hierarchy.
  [ns]$ mount -t cgroup cgroup /tmp/cgroup
  [ns]$ ls -l /tmp/cgroup
  total 0
  -r--r--r-- 1 root root 0 2014-10-13 09:32 cgroup.controllers
  -r--r--r-- 1 root root 0 2014-10-13 09:32 cgroup.populated
  -rw-r--r-- 1 root root 0 2014-10-13 09:25 cgroup.procs
  -rw-r--r-- 1 root root 0 2014-10-13 09:32 cgroup.subtree_control

The cgroupns-root (/batchjobs/container_id1 in the above example) becomes the
filesystem root for the namespace-specific cgroupfs mount.

The virtualization of the /proc/self/cgroup file, combined with restricting
the view of the cgroup hierarchy by a namespace-private cgroupfs mount,
should provide a completely isolated cgroup view inside the container.

In its current form, the cgroup namespaces patchset provides the following
behavior:

(1) The 'cgroupns-root' for a cgroup namespace is the cgroup in which
    the process calling unshare is running.
    For example, if a process in the /batchjobs/container_id1 cgroup calls
    unshare, cgroup /batchjobs/container_id1 becomes the cgroupns-root.
    For the init_cgroup_ns, this is the real root ('/') cgroup
    (identified in code as cgrp_dfl_root.cgrp).

(2) The cgroupns-root cgroup does not change even if the namespace
    creator process later moves to a different cgroup.
    $ ~/unshare -c # unshare cgroupns in some cgroup
    [ns]$ cat /proc/self/cgroup
    0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/
    [ns]$ mkdir sub_cgrp_1
    [ns]$ echo 0 > sub_cgrp_1/cgroup.procs
    [ns]$ cat /proc/self/cgroup
    0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/sub_cgrp_1

(3) Each process gets its cgroupns-specific view of /proc/<pid>/cgroup
    (a) Processes running inside the cgroup namespace will be able to see
        cgroup paths (in /proc/self/cgroup) only inside their root cgroup
        [ns]$ sleep 100000 & # From within unshared cgroupns
        [1] 7353
        [ns]$ echo 7353 > sub_cgrp_1/cgroup.procs
        [ns]$ cat /proc/7353/cgroup
        0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/sub_cgrp_1

    (b) From global cgroupns, the real cgroup path will be visible:
        $ cat /proc/7353/cgroup
        0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/batchjobs/container_id1/sub_cgrp_1

    (c) From a sibling cgroupns (cgroupns rooted at a different cgroup), cgroup
        path relative to its own cgroupns-root will be shown:
        # ns2's cgroupns-root is at '/batchjobs/container_id2'
        [ns2]$ cat /proc/7353/cgroup
        0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/../container_id1/sub_cgrp_1

    Note that the relative path always starts with '/' to indicate that it is
    relative to the cgroupns-root of the caller.

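The views in (3) follow a single rule: the path is printed relative to the
cgroupns-root of the reader, with a leading '/'. A minimal Python sketch of
that rule is given below. It is not the kernel's implementation; it leans on
posixpath.relpath, and cgroup_path_seen_from is a hypothetical name.

```python
import posixpath


def cgroup_path_seen_from(cgroup_path, ns_root):
    """Return cgroup_path as it would be shown to a reader rooted at ns_root.

    Both arguments are absolute cgroup paths in the global hierarchy.
    """
    rel = posixpath.relpath(cgroup_path, ns_root)
    # relpath() returns "." when the two paths are equal; the reader then
    # sees the cgroup as its root.
    if rel == ".":
        return "/"
    # The leading '/' marks the path as relative to the reader's root.
    return "/" + rel


# Reader inside the namespace rooted at /batchjobs/container_id1:
#   cgroup_path_seen_from("/batchjobs/container_id1/sub_cgrp_1",
#                         "/batchjobs/container_id1")  gives "/sub_cgrp_1"
# Reader in the global namespace (root "/"):
#   the full path "/batchjobs/container_id1/sub_cgrp_1" is returned.
```

A cgroup outside the reader's root produces a path with '..' components,
for example "/../container_id2" when a reader rooted at
/batchjobs/container_id1 looks at /batchjobs/container_id2.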
(4) Processes inside a cgroupns can move in-and-out of the cgroupns-root
    (if they have proper access to external cgroups).
    # From inside cgroupns (with cgroupns-root at /batchjobs/container_id1), and
    # assuming that the global hierarchy is still accessible inside cgroupns:
    $ cat /proc/7353/cgroup
    0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/sub_cgrp_1
    $ echo 7353 > batchjobs/container_id2/cgroup.procs
    $ cat /proc/7353/cgroup
    0:cpuset,cpu,cpuacct,memory,devices,freezer,hugetlb:/../container_id2

    Note that this kind of setup is not encouraged. A task inside cgroupns
    should only be exposed to its own cgroupns hierarchy. Otherwise it makes
    the virtualization of /proc/<pid>/cgroup less useful.

(5) Setns to another cgroup namespace is allowed when:
    (a) the process has CAP_SYS_ADMIN in its current userns, and
    (b) the process has CAP_SYS_ADMIN in the target cgroupns' userns
    No implicit cgroup changes happen with attaching to another cgroupns. It
    is expected that someone moves the attaching process under the target
    cgroupns-root.

(6) When some thread from a multi-threaded process unshares its
    cgroup-namespace, the new cgroupns gets applied to the entire process (all
    the threads). For the unified-hierarchy this is expected as it only allows
    process-level containerization. For the legacy hierarchies this may be
    unexpected. So all the threads in the process will have the same cgroup.

(7) The cgroup namespace is alive as long as there is at least one
    process inside it. When the last process exits, the cgroup
    namespace is destroyed. The cgroupns-root and the actual cgroups
    remain though.

(8) A namespace-specific cgroup hierarchy can be mounted by a process running
    inside cgroupns:
    $ mount -t cgroup -o __DEVEL__sane_behavior cgroup $MOUNT_POINT

    This will mount the unified cgroup hierarchy with cgroupns-root as the
    filesystem root. The process needs CAP_SYS_ADMIN in its userns and mntns.