# lxcfs

## Introduction
LXCFS is a small FUSE filesystem written with the intention of making Linux
containers feel more like a virtual machine. It started as a side-project of
`LXC` but is usable by any runtime.

LXCFS makes sure that the information provided by crucial files in `procfs`
and `sysfs` such as:

```
/proc/cpuinfo
/proc/diskstats
/proc/meminfo
/proc/stat
/proc/swaps
/proc/uptime
/proc/slabinfo
/sys/devices/system/cpu/online
```

is container aware, so that the values displayed (e.g. in `/proc/uptime`)
really reflect how long the container has been running and not how long the
host has been running.
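
One way to observe this from a shell is to compare the uptime reported on the host with the uptime reported inside a container. The helper below is a minimal sketch; the function name `uptime_seconds` is an assumption made for illustration and is not part of LXCFS.

```shell
# uptime_seconds [FILE]
# Print the first field of an uptime-style file (seconds since boot),
# truncated to whole seconds. FILE defaults to /proc/uptime.
# The function name is hypothetical and only used for illustration.
uptime_seconds() {
    awk '{ print int($1) }' "${1:-/proc/uptime}"
}
```

Running `uptime_seconds` on the host prints the host uptime. Inside a container that has the LXCFS-provided file mounted over `/proc/uptime`, the same call should print the (usually much smaller) container uptime.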

Prior to the implementation of cgroup namespaces by Serge Hallyn `LXCFS` also
provided a container aware `cgroupfs` tree. It took care that the container
only had access to cgroups underneath its own cgroups and thus provided
additional safety. For systems without support for cgroup namespaces `LXCFS`
will still provide this feature but it is mostly considered deprecated.

## Upgrading `LXCFS` without restart

`LXCFS` is split into a shared library (a libtool module, to be precise)
`liblxcfs` and a simple binary `lxcfs`. When upgrading to a newer version of
`LXCFS` the `lxcfs` binary will not be restarted. Instead it will detect that
a new version of the shared library is available and will reload it using
`dlclose(3)` and `dlopen(3)`. This design was chosen so that the fuse main loop
that `LXCFS` uses will not need to be restarted. If it were then all containers
using `LXCFS` would need to be restarted since they would otherwise be left
with broken fuse mounts.

To force a reload of the shared library at the next possible opportunity simply
send `SIGUSR1` to the pid of the running `LXCFS` process. This can be as simple
as doing:

    rm /usr/lib64/lxcfs/liblxcfs.so # the old library file MUST be deleted first
    cp liblxcfs.so /usr/lib64/lxcfs/liblxcfs.so # place the new library file
    kill -s USR1 $(pidof lxcfs) # reload
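
The same three steps can be wrapped in a small function that refuses to delete the installed library when there is nothing to replace it with. This is only a sketch: the name `reload_lxcfs` is hypothetical, and the default library path is simply the one used in the commands above, which may differ on your distribution.

```shell
# reload_lxcfs NEW_LIB [INSTALLED_LIB]
# Hypothetical helper wrapping the manual steps: delete the old library,
# copy the new one into place, then send SIGUSR1 to lxcfs.
reload_lxcfs() {
    new_lib=$1
    installed_lib=${2:-/usr/lib64/lxcfs/liblxcfs.so}

    # Refuse to remove the installed library unless a replacement exists.
    if [ ! -f "$new_lib" ]; then
        echo "reload_lxcfs: $new_lib not found" >&2
        return 1
    fi

    # Without a running lxcfs there is nothing to signal.
    pid=$(pidof lxcfs) || {
        echo "reload_lxcfs: lxcfs is not running" >&2
        return 1
    }

    rm -f "$installed_lib" || return 1          # delete the old library first
    cp "$new_lib" "$installed_lib" || return 1  # place the new library file
    kill -s USR1 $pid                           # ask lxcfs to reload
}
```

It would typically be run as root, for example `reload_lxcfs ./liblxcfs.so`.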

### musl

To achieve smooth upgrades through shared library reloads `LXCFS` also relies
on the fact that when `dlclose(3)` drops the last reference to the shared
library destructors are run and when `dlopen(3)` is called constructors are
run. While this is true for `glibc` it is not true for `musl` (see the section
[Unloading libraries](https://wiki.musl-libc.org/functional-differences-from-glibc.html)).
So users of `LXCFS` on `musl` are advised to restart `LXCFS` completely and all
containers making use of it.

## Building

In order to build LXCFS install fuse and the fuse development headers according
to your distro. LXCFS prefers `fuse3` but does work with new enough `fuse2`
versions:

    git clone https://github.com/lxc/lxcfs
    cd lxcfs
    meson setup -Dinit-script=systemd --prefix=/usr build/
    meson compile -C build/
    sudo meson install -C build/

To build with sanitizers you have to pass the `-Db_sanitize=...` option to `meson setup`.
For example, to enable ASAN and UBSAN:

    meson setup -Dinit-script=systemd --prefix=/usr build/ -Db_sanitize=address,undefined
    meson compile -C build/

## Usage
The recommended command to run lxcfs is:

    sudo mkdir -p /var/lib/lxcfs
    sudo lxcfs /var/lib/lxcfs

A container runtime wishing to use `LXCFS` should then bind mount the
appropriate files into the correct places on container startup.
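
As a rough illustration of what such a runtime does, the sketch below prints one `mount --bind` command per LXCFS-provided `proc` file. The function name `lxcfs_bind_mounts` and the `ROOTFS` argument are assumptions made for this example. It also assumes that `ROOTFS/proc` is already the container's mounted `procfs` (for instance when run from a mount hook), and it only prints the commands rather than executing them.

```shell
# lxcfs_bind_mounts ROOTFS [LXCFS_DIR]
# Print one "mount --bind" command per LXCFS-provided proc file.
# Hypothetical helper: ROOTFS is the container's root filesystem,
# LXCFS_DIR defaults to the mount point used in the Usage section.
lxcfs_bind_mounts() {
    rootfs=$1
    lxcfs_dir=${2:-/var/lib/lxcfs}

    for f in cpuinfo diskstats meminfo stat swaps uptime slabinfo; do
        echo "mount --bind $lxcfs_dir/proc/$f $rootfs/proc/$f"
    done
}
```

Calling `lxcfs_bind_mounts /path/to/rootfs` lists the commands so they can be reviewed before being run as root. The `sysfs` entries are left out of this sketch.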

### LXC
In order to use lxcfs with systemd-based containers, you can either use
LXC 1.1 in which case it should work automatically, or otherwise, copy
the `lxc.mount.hook` and `lxc.reboot.hook` files (once built) from this tree to
`/usr/share/lxcfs`, make sure they are executable, then add the
following lines to your container configuration:
```
lxc.mount.auto = cgroup:mixed
lxc.autodev = 1
lxc.kmsg = 0
lxc.include = /usr/share/lxc/config/common.conf.d/00-lxcfs.conf
```

## Using with Docker

```
docker run -it -m 256m --memory-swap 256m \
      -v /var/lib/lxcfs/proc/cpuinfo:/proc/cpuinfo:rw \
      -v /var/lib/lxcfs/proc/diskstats:/proc/diskstats:rw \
      -v /var/lib/lxcfs/proc/meminfo:/proc/meminfo:rw \
      -v /var/lib/lxcfs/proc/stat:/proc/stat:rw \
      -v /var/lib/lxcfs/proc/swaps:/proc/swaps:rw \
      -v /var/lib/lxcfs/proc/uptime:/proc/uptime:rw \
      -v /var/lib/lxcfs/proc/slabinfo:/proc/slabinfo:rw \
      -v /var/lib/lxcfs/sys/devices/system/cpu:/sys/devices/system/cpu:rw \
      ubuntu:18.04 /bin/bash
```

On a system with swap enabled, the `-u` parameter can be used to set all swap-related values in `meminfo` to 0:

    sudo lxcfs -u /var/lib/lxcfs

## Swap handling
If you notice LXCFS not showing any SWAP in your container despite
having SWAP on your system, please read this section carefully and look
for instructions on how to enable SWAP accounting for your distribution.

Swap cgroup handling on Linux is very confusing and there just isn't a
perfect way for LXCFS to handle it.

Terminology used below:
 - RAM refers to `memory.usage_in_bytes` and `memory.limit_in_bytes`
 - RAM+SWAP refers to `memory.memsw.usage_in_bytes` and `memory.memsw.limit_in_bytes`

The main issues are:
 - SWAP accounting is often opt-in, requiring a special kernel boot
   time option (`swapaccount=1`) and/or special kernel build options
   (`CONFIG_MEMCG_SWAP`).

 - Both a RAM limit and a RAM+SWAP limit can be set. The delta however
   isn't the available SWAP space as the kernel is still free to SWAP as
   much of the RAM as it feels like. This makes it impossible to render
   a SWAP device size as using the delta between RAM and RAM+SWAP for that
   wouldn't account for the kernel swapping more pages, leading to swap
   usage exceeding swap total.

 - It's impossible to disable SWAP in a given container. The closest
   that can be done is setting swappiness down to 0 which severely limits
   the risk of swapping pages but doesn't eliminate it.
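
Whether SWAP accounting is active can usually be probed by checking for the RAM+SWAP files named in the terminology above. The sketch below assumes a cgroup v1 memory controller, where those files are expected to be absent when accounting is disabled. The function name `has_swap_accounting` and the directory passed to it are illustrative assumptions.

```shell
# has_swap_accounting CGROUP_DIR
# Succeed if the given memory cgroup directory exposes the RAM+SWAP
# counters. Hypothetical helper; the directory layout (for example
# /sys/fs/cgroup/memory) is an assumption and varies between systems.
has_swap_accounting() {
    [ -e "$1/memory.memsw.usage_in_bytes" ] &&
        [ -e "$1/memory.memsw.limit_in_bytes" ]
}
```

For example, `has_swap_accounting /sys/fs/cgroup/memory && echo enabled` prints `enabled` only when both files exist in that directory.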

As a result, LXCFS had to make some compromises, which are as follows:
 - When SWAP accounting isn't enabled, no SWAP space is reported at all.
   This is simply because there is no way to know the SWAP consumption.
   The container may very much be using some SWAP though, there's just
   no way to know how much of it and showing a SWAP device would require
   some kind of SWAP usage to be reported. Showing the host value would be
   completely wrong, showing a 0 value would be equally wrong.

 - Because SWAP usage for a given container can exceed the delta between
   RAM and RAM+SWAP, the SWAP size is always reported to be the smaller of
   the RAM+SWAP limit or the host SWAP device itself. This ensures that at no
   point SWAP usage will be allowed to exceed the SWAP size.

 - If the swappiness is set to 0 and there is no SWAP usage, no SWAP is reported.
   However if there is SWAP usage, then a SWAP device of the size of the
   usage (100% full) is reported. This provides adequate reporting of
   the memory consumption while preventing applications from assuming more
   SWAP is available.
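
The three rules above can be restated as a small decision function. This is a sketch of the rules as described in this section, not LXCFS's actual code; the function name `reported_swap_total` and its argument order are assumptions, and all sizes are taken to be in bytes.

```shell
# reported_swap_total ACCOUNTING MEMSW_LIMIT HOST_SWAP SWAPPINESS SWAP_USAGE
# Print the SWAP size a container would be shown under the rules above.
# ACCOUNTING is 1 when SWAP accounting is enabled, 0 otherwise.
# All names here are hypothetical and only used for illustration.
reported_swap_total() {
    accounting=$1
    memsw_limit=$2
    host_swap=$3
    swappiness=$4
    swap_usage=$5

    # No SWAP accounting: no SWAP is reported at all.
    if [ "$accounting" -ne 1 ]; then
        echo 0
        return
    fi

    # Swappiness 0: report a device exactly as large as the current
    # usage, which is 0 (no SWAP) when nothing has been swapped.
    if [ "$swappiness" -eq 0 ]; then
        echo "$swap_usage"
        return
    fi

    # Otherwise: the smaller of the RAM+SWAP limit and the host SWAP size.
    if [ "$memsw_limit" -lt "$host_swap" ]; then
        echo "$memsw_limit"
    else
        echo "$host_swap"
    fi
}
```

For instance, with accounting enabled, a 1 GiB RAM+SWAP limit and 2 GiB of host SWAP, `reported_swap_total 1 1073741824 2147483648 60 0` prints `1073741824`.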