# lxcfs

## Introduction
LXCFS is a small FUSE filesystem written with the intention of making Linux
containers feel more like a virtual machine. It started as a side-project of
`LXC` but is usable by any runtime.

LXCFS ensures that the information provided by crucial files in `procfs` and
`sysfs`, such as:

```
/proc/cpuinfo
/proc/diskstats
/proc/meminfo
/proc/stat
/proc/swaps
/proc/uptime
/proc/slabinfo
/sys/devices/system/cpu
/sys/devices/system/cpu/online
```

is container-aware, so that the values displayed (e.g. in `/proc/uptime`)
reflect how long the container has been running rather than how long the host
has been running.

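A quick way to see the effect is to compare the uptime reported inside a container with the one reported on the host. The following is a minimal sketch; `uptime_seconds` is a hypothetical helper name, and it only assumes the usual `/proc/uptime` layout (uptime in seconds as the first field).

```shell
# Print the whole seconds of uptime recorded in an uptime-formatted file.
#   $1: file to read (defaults to /proc/uptime)
uptime_seconds() (
    # The first field is the uptime in seconds, with a fractional part.
    read -r up idle < "${1:-/proc/uptime}" || exit 1
    echo "${up%%.*}"
)

# Run `uptime_seconds` once on the host and once inside a container that has
# the LXCFS file mounted over /proc/uptime; the container value should be the
# smaller one.
```
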
Prior to the implementation of cgroup namespaces by Serge Hallyn, `LXCFS` also
provided a container-aware `cgroupfs` tree. It ensured that the container
only had access to cgroups underneath its own cgroups and thus provided
additional safety. For systems without support for cgroup namespaces, `LXCFS`
still provides this feature, but it is mostly considered deprecated.

## Upgrading `LXCFS` without restart

`LXCFS` is split into a shared library (a libtool module, to be precise)
`liblxcfs` and a simple binary `lxcfs`. When upgrading to a newer version of
`LXCFS` the `lxcfs` binary will not be restarted. Instead it will detect that
a new version of the shared library is available and will reload it using
`dlclose(3)` and `dlopen(3)`. This design was chosen so that the FUSE main loop
that `LXCFS` uses does not need to be restarted. If it were restarted, all
containers using `LXCFS` would need to be restarted as well, since they would
otherwise be left with broken FUSE mounts.

To force a reload of the shared library at the next possible opportunity, simply
send `SIGUSR1` to the pid of the running `LXCFS` process. This can be as simple
as doing:

    rm /usr/lib64/lxcfs/liblxcfs.so # the old library file MUST be deleted first
    cp liblxcfs.so /usr/lib64/lxcfs/liblxcfs.so # place the new library file
    kill -s USR1 $(pidof lxcfs) # reload

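When scripting these steps, it can help to check that the new library exists before the old one is deleted, so that a typo does not leave the running process without a library to reload. The sketch below keeps the delete-then-copy order shown above; `lxcfs_swap_library` is a hypothetical helper name, not something shipped with LXCFS.

```shell
# Replace an installed liblxcfs.so using the delete-then-copy order.
#   $1: newly built library
#   $2: installed library path that the running lxcfs loads
lxcfs_swap_library() (
    new_lib=$1
    installed_lib=$2
    # Refuse to delete anything unless the replacement is actually there.
    [ -f "$new_lib" ] || { echo "missing $new_lib" >&2; exit 1; }
    # The old file must be removed first, then the new one is copied in.
    rm -f "$installed_lib" || exit 1
    cp "$new_lib" "$installed_lib"
)

# Usage (as root), followed by the reload signal:
#   lxcfs_swap_library liblxcfs.so /usr/lib64/lxcfs/liblxcfs.so &&
#       kill -s USR1 "$(pidof lxcfs)"
```
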
### musl

To achieve smooth upgrades through shared library reloads, `LXCFS` also relies
on the fact that when `dlclose(3)` drops the last reference to the shared
library, destructors are run, and when `dlopen(3)` is called, constructors are
run. While this is true for `glibc`, it is not true for `musl` (see the section
[Unloading libraries](https://wiki.musl-libc.org/functional-differences-from-glibc.html)).
Users of `LXCFS` on `musl` are therefore advised to restart `LXCFS` completely,
along with all containers making use of it.

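An upgrade script may want to choose between the two procedures automatically. The sketch below is a heuristic only: it assumes the banner printed by `ldd --version` mentions `musl` on musl systems and `GNU libc` or `GLIBC` on glibc systems, which may not hold everywhere, and both function names are hypothetical. Anything that is not clearly glibc falls back to a full restart.

```shell
# Classify a libc from the banner text printed by `ldd --version`.
# Heuristic: the exact banner wording is an assumption and varies by distro.
libc_flavor() {
    case "$1" in
        *musl*) echo musl ;;
        *"GNU libc"*|*GLIBC*|*glibc*) echo glibc ;;
        *) echo unknown ;;
    esac
}

# Decide how an LXCFS upgrade should be rolled out for that banner.
lxcfs_upgrade_strategy() {
    case "$(libc_flavor "$1")" in
        glibc) echo reload ;;        # replace the library and send SIGUSR1
        *)     echo full-restart ;;  # restart lxcfs and all containers using it
    esac
}

# Example: lxcfs_upgrade_strategy "$(ldd --version 2>&1)"
```
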
## Building

To build LXCFS, install fuse and the fuse development headers for your
distro. LXCFS prefers `fuse3` but also works with sufficiently recent `fuse2`
versions:

    git clone https://github.com/lxc/lxcfs
    cd lxcfs
    meson setup -Dinit-script=systemd --prefix=/usr build/
    meson compile -C build/
    sudo meson install -C build/

To build with sanitizers, pass the `-Db_sanitize=...` option to `meson setup`.
For example, to enable ASAN and UBSAN:

    meson setup -Dinit-script=systemd --prefix=/usr build/ -Db_sanitize=address,undefined
    meson compile -C build/

## Usage
The recommended command to run lxcfs is:

    sudo mkdir -p /var/lib/lxcfs
    sudo lxcfs /var/lib/lxcfs

A container runtime wishing to use `LXCFS` should then bind mount the
appropriate files into the correct places on container startup.

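For a runtime without built-in support, the bind mounts could be derived from the file list in the introduction. The following is a rough sketch rather than a reference implementation: `lxcfs_bind_plan` is a hypothetical helper, the container rootfs path is whatever the runtime uses, and paths are assumed to contain no whitespace.

```shell
# Print one "source target" pair per LXCFS file that should be bind mounted.
#   $1: LXCFS mount point on the host (e.g. /var/lib/lxcfs)
#   $2: container root filesystem (path chosen by the runtime)
lxcfs_bind_plan() (
    lxcfs_root=$1
    rootfs=$2
    for rel in proc/cpuinfo proc/diskstats proc/meminfo proc/stat \
               proc/swaps proc/uptime proc/slabinfo \
               sys/devices/system/cpu; do
        # Skip entries that this LXCFS mount does not provide.
        [ -e "$lxcfs_root/$rel" ] || continue
        printf '%s %s\n' "$lxcfs_root/$rel" "$rootfs/$rel"
    done
)

# Apply the plan (needs root, and the targets must already exist):
#   lxcfs_bind_plan /var/lib/lxcfs /path/to/rootfs |
#       while read -r src dst; do mount --bind "$src" "$dst"; done
```
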
### LXC
In order to use lxcfs with systemd-based containers, you can either use
LXC 1.1, in which case it should work automatically, or copy
the `lxc.mount.hook` and `lxc.reboot.hook` files (once built) from this tree to
`/usr/share/lxcfs`, make sure they are executable, and then add the
following lines to your container configuration:
```
lxc.mount.auto = cgroup:mixed
lxc.autodev = 1
lxc.kmsg = 0
lxc.include = /usr/share/lxc/config/common.conf.d/00-lxcfs.conf
```

## Using with Docker

```
docker run -it -m 256m --memory-swap 256m \
      -v /var/lib/lxcfs/proc/cpuinfo:/proc/cpuinfo:rw \
      -v /var/lib/lxcfs/proc/diskstats:/proc/diskstats:rw \
      -v /var/lib/lxcfs/proc/meminfo:/proc/meminfo:rw \
      -v /var/lib/lxcfs/proc/stat:/proc/stat:rw \
      -v /var/lib/lxcfs/proc/swaps:/proc/swaps:rw \
      -v /var/lib/lxcfs/proc/uptime:/proc/uptime:rw \
      -v /var/lib/lxcfs/proc/slabinfo:/proc/slabinfo:rw \
      -v /var/lib/lxcfs/sys/devices/system/cpu:/sys/devices/system/cpu:rw \
      ubuntu:18.04 /bin/bash
```

On a system with swap enabled, the `-u` option can be used to set all swap-related values in `meminfo` to 0.

    sudo lxcfs -u /var/lib/lxcfs

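To confirm from inside a container that swap is hidden, the swap fields of `/proc/meminfo` can be read directly. This is a small sketch with hypothetical helper names; it assumes the standard `Name: value kB` layout of `meminfo` and checks only `SwapTotal` and `SwapFree`.

```shell
# Print the value (in kB) of one field from a meminfo-formatted file.
#   $1: field name without the colon, e.g. SwapTotal
#   $2: file to read (defaults to /proc/meminfo)
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2; found = 1; exit }
                      END { if (!found) exit 1 }' "${2:-/proc/meminfo}"
}

# Succeed if the given meminfo file reports no swap at all.
swap_hidden() {
    [ "$(meminfo_kb SwapTotal "$1")" = 0 ] && [ "$(meminfo_kb SwapFree "$1")" = 0 ]
}

# Example, inside the container:
#   swap_hidden /proc/meminfo && echo "no swap visible"
```
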
## Swap handling
If you noticed LXCFS not showing any SWAP in your container despite
having SWAP on your system, please read this section carefully and look
for instructions on how to enable SWAP accounting for your distribution.

Swap cgroup handling on Linux is very confusing and there just isn't a
perfect way for LXCFS to handle it.

Terminology used below:
 - RAM refers to `memory.usage_in_bytes` and `memory.limit_in_bytes`
 - RAM+SWAP refers to `memory.memsw.usage_in_bytes` and `memory.memsw.limit_in_bytes`

The main issues are:
 - SWAP accounting is often opt-in, requiring a special kernel boot
   time option (`swapaccount=1`) and/or special kernel build options
   (`CONFIG_MEMCG_SWAP`).

 - Both a RAM limit and a RAM+SWAP limit can be set. The delta, however,
   isn't the available SWAP space, as the kernel is still free to SWAP as
   much of the RAM as it feels like. This makes it impossible to render
   a SWAP device size, as using the delta between RAM and RAM+SWAP for that
   wouldn't account for the kernel swapping more pages, leading to swap
   usage exceeding swap total.

 - It's impossible to disable SWAP in a given container. The closest
   that can be done is setting swappiness down to 0, which severely limits
   the risk of swapping pages but doesn't eliminate it.

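Whether SWAP accounting is active can usually be checked by looking for the RAM+SWAP files named in the terminology above. The sketch below assumes a cgroup v1 memory hierarchy in which those files are absent when accounting is off; `swap_accounting_enabled` and the example directory are hypothetical.

```shell
# Succeed if the given cgroup v1 memory directory exposes RAM+SWAP accounting.
#   $1: cgroup directory (example: /sys/fs/cgroup/memory/mycontainer)
swap_accounting_enabled() {
    [ -r "$1/memory.memsw.usage_in_bytes" ] &&
        [ -r "$1/memory.memsw.limit_in_bytes" ]
}

# Example:
#   swap_accounting_enabled /sys/fs/cgroup/memory ||
#       echo "SWAP accounting appears to be off (see swapaccount=1)"
```
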
As a result, LXCFS had to make some compromises, which are as follows:
 - When SWAP accounting isn't enabled, no SWAP space is reported at all.
   This is simply because there is no way to know the SWAP consumption.
   The container may very well be using some SWAP, but there is
   no way to know how much, and showing a SWAP device would require
   some kind of SWAP usage to be reported. Showing the host value would be
   completely wrong; showing a 0 value would be equally wrong.

 - Because SWAP usage for a given container can exceed the delta between
   RAM and RAM+SWAP, the SWAP size is always reported to be the smaller of
   the RAM+SWAP limit or the host SWAP device itself. This ensures that at no
   point will SWAP usage be allowed to exceed the SWAP size.

 - If the swappiness is set to 0 and there is no SWAP usage, no SWAP is reported.
   However, if there is SWAP usage, then a SWAP device of the size of the
   usage (100% full) is reported. This provides adequate reporting of
   the memory consumption while preventing applications from assuming more
   SWAP is available.
164 usage (100% full) is reported. This provides adequate reporting of
165 the memory consumption while preventing applications from assuming more
166 SWAP is available.