LXCFS is a small FUSE filesystem written with the intention of making Linux
containers feel more like a virtual machine. It started as a side-project of
`LXC` but is usable by any runtime.

LXCFS will take care that the information provided by crucial files in `procfs`
and `sysfs` (for example `/sys/devices/system/cpu/online`) is container aware,
so that the values displayed (e.g. in `/proc/uptime`) really reflect how long
the container has been running and not how long the host has been running.

Prior to the implementation of cgroup namespaces by Serge Hallyn, `LXCFS` also
provided a container aware `cgroupfs` tree. It took care that the container
only had access to cgroups underneath its own cgroups and thus provided
additional safety. For systems without support for cgroup namespaces, `LXCFS`
will still provide this feature, but it is mostly considered deprecated.

## Upgrading `LXCFS` without restart

`LXCFS` is split into a shared library (a libtool module, to be precise)
`liblxcfs` and a simple binary `lxcfs`. When upgrading to a newer version of
`LXCFS`, the `lxcfs` binary will not be restarted. Instead, it will detect that
a new version of the shared library is available and will reload it using
`dlclose(3)` and `dlopen(3)`. This design was chosen so that the fuse main loop
that `LXCFS` uses will not need to be restarted. If it were, then all containers
using `LXCFS` would need to be restarted, since they would otherwise be left
with broken fuse mounts.

To force a reload of the shared library at the next possible instance, simply
send `SIGUSR1` to the pid of the running `LXCFS` process. This can be as simple
as doing:

    kill -s USR1 $(pidof lxcfs)

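
The reload-on-signal design can be illustrated with a minimal Python sketch. It is an analogy rather than LXCFS code (LXCFS is written in C and uses `dlclose(3)` and `dlopen(3)`): the class `ReloadableService`, its methods, and the `load` callback are hypothetical names. The sketch assumes that the signal handler only records the request and that the implementation is swapped at the next safe point, so the long-running loop itself never restarts.

```python
import signal


class ReloadableService:
    """Long-running service whose implementation can be swapped in place."""

    def __init__(self, load):
        # `load` is a hypothetical callback standing in for dlopen(3):
        # every call returns a fresh implementation (here, a callable).
        self._load = load
        self._impl = load()
        self._reload_requested = False

    def install_handler(self, signum=signal.SIGUSR1):
        # Python only allows this to be called from the main thread.
        signal.signal(signum, self._on_signal)

    def _on_signal(self, signum, frame):
        # Only record the request; the swap happens later at a safe point.
        self._reload_requested = True

    def handle_request(self, *args):
        # Safe point: nothing is in flight, so swap the implementation now.
        if self._reload_requested:
            self._impl = self._load()
            self._reload_requested = False
        return self._impl(*args)
```

A process using this sketch would call `install_handler()` once at startup. Sending `SIGUSR1` to it then makes the next `handle_request()` call use the newly loaded implementation, while the process and its main loop keep running.
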
To achieve smooth upgrades through shared library reloads, `LXCFS` also relies
on the fact that when `dlclose(3)` drops the last reference to the shared
library, destructors are run, and when `dlopen(3)` is called, constructors are
run. While this is true for `glibc`, it is not true for `musl` (see the section
[Unloading libraries](https://wiki.musl-libc.org/functional-differences-from-glibc.html)).
So users of `LXCFS` on `musl` are advised to restart `LXCFS` completely, along
with all containers making use of it.

Build lxcfs as follows:

    yum install fuse fuse-lib fuse-devel
    git clone https://github.com/lxc/lxcfs

The recommended command to run lxcfs is:

    sudo mkdir -p /var/lib/lxcfs
    sudo lxcfs /var/lib/lxcfs

A container runtime wishing to use `LXCFS` should then bind mount the
appropriate files into the correct places on container startup.

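
As a rough sketch of what such a runtime might do, the Python snippet below computes the (source, target) pairs for the bind mounts. The function name `bind_mount_plan`, the container root filesystem path, and the default file list are assumptions made for illustration. The actual mounting (for example with `mount --bind`) is left to the runtime.

```python
import posixpath

# Files exposed under the LXCFS mount point. This particular selection is an
# assumption for illustration; adjust it to what the runtime needs.
DEFAULT_FILES = (
    "proc/cpuinfo",
    "proc/diskstats",
    "proc/meminfo",
    "proc/stat",
    "proc/swaps",
    "proc/uptime",
)


def bind_mount_plan(lxcfs_root, container_rootfs, files=DEFAULT_FILES):
    """Return (source, target) pairs for bind mounting LXCFS files."""
    plan = []
    for rel in files:
        source = posixpath.join(lxcfs_root, rel)
        target = posixpath.join(container_rootfs, rel)
        plan.append((source, target))
    return plan


if __name__ == "__main__":
    # "/path/to/rootfs" is a placeholder, not a real location.
    for source, target in bind_mount_plan("/var/lib/lxcfs", "/path/to/rootfs"):
        # Print the commands instead of running them.
        print("mount --bind", source, target)
```

Keeping the plan separate from the mount call makes it easy to log or review what will be mounted before touching the container.
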
In order to use lxcfs with systemd-based containers, you can either use
LXC 1.1, in which case it should work automatically, or copy
the `lxc.mount.hook` and `lxc.reboot.hook` files (once built) from this tree to
`/usr/share/lxcfs`, make sure they are executable, then add the
following lines to your container configuration:

    lxc.mount.auto = cgroup:mixed
    lxc.include = /usr/share/lxc/config/common.conf.d/00-lxcfs.conf

With Docker, the files provided by lxcfs can be bind mounted into the container
as volumes:

    docker run -it -m 256m --memory-swap 256m \
          -v /var/lib/lxcfs/proc/cpuinfo:/proc/cpuinfo:rw \
          -v /var/lib/lxcfs/proc/diskstats:/proc/diskstats:rw \
          -v /var/lib/lxcfs/proc/meminfo:/proc/meminfo:rw \
          -v /var/lib/lxcfs/proc/stat:/proc/stat:rw \
          -v /var/lib/lxcfs/proc/swaps:/proc/swaps:rw \
          -v /var/lib/lxcfs/proc/uptime:/proc/uptime:rw \
          ubuntu:18.04 /bin/bash

On a system with swap enabled, the `-u` parameter can be used to set all values in `meminfo` that refer to swap to 0:

    sudo lxcfs -u /var/lib/lxcfs

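
To check the effect from inside a container, one could parse `/proc/meminfo` and look at the swap-related fields. The helper below is a hedged sketch: `swap_fields` is a hypothetical name, and it assumes that the swap-related entries are the ones whose key starts with `Swap` (such as `SwapTotal` and `SwapFree`).

```python
def swap_fields(meminfo_text):
    """Return {field: value_in_kB} for the swap-related lines of meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep or not key.startswith("Swap"):
            continue
        parts = rest.split()
        if parts:
            # The first token is the number; a unit such as "kB" may follow.
            fields[key] = int(parts[0])
    return fields
```

Inside a container whose `/proc/meminfo` comes from an lxcfs started with `-u`, passing the file contents to `swap_fields` would be expected to yield only zeros.
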
If you notice LXCFS not showing any SWAP in your container despite
having SWAP on your system, please read this section carefully and look
for instructions on how to enable SWAP accounting for your distribution.

Swap cgroup handling on Linux is very confusing and there just isn't a
perfect way for LXCFS to handle it.

Terminology used below:
- RAM refers to `memory.usage_in_bytes` and `memory.limit_in_bytes`
- RAM+SWAP refers to `memory.memsw.usage_in_bytes` and `memory.memsw.limit_in_bytes`

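
These values can be read directly from the memory cgroup. The following sketch assumes that the path of a cgroup v1 memory controller directory is passed in; `read_memory_limits` and `_read_int` are hypothetical helpers. It returns `None` for the RAM+SWAP values when the `memory.memsw.*` files are missing, which is assumed here to mean that SWAP accounting is not enabled.

```python
import os


def _read_int(path):
    """Return the integer stored in a cgroup file, or None if it is absent."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except FileNotFoundError:
        return None


def read_memory_limits(cgroup_dir):
    """Read RAM and RAM+SWAP usage/limit from a cgroup v1 memory directory."""
    def value(name):
        return _read_int(os.path.join(cgroup_dir, name))

    return {
        "ram_usage": value("memory.usage_in_bytes"),
        "ram_limit": value("memory.limit_in_bytes"),
        # Assumed to be None when SWAP accounting is disabled.
        "ramswap_usage": value("memory.memsw.usage_in_bytes"),
        "ramswap_limit": value("memory.memsw.limit_in_bytes"),
    }
```
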
The main issues are:
- SWAP accounting is often opt-in, requiring a special kernel boot
  time option (`swapaccount=1`) and/or special kernel build options
  (`CONFIG_MEMCG_SWAP`).

- Both a RAM limit and a RAM+SWAP limit can be set. The delta, however,
  isn't the available SWAP space, as the kernel is still free to SWAP as
  much of the RAM as it feels like. This makes it impossible to render
  a SWAP device size, as using the delta between RAM and RAM+SWAP for that
  wouldn't account for the kernel swapping more pages, leading to swap
  usage exceeding swap total.

- It's impossible to disable SWAP in a given container. The closest
  that can be done is setting swappiness down to 0, which severely limits
  the risk of swapping pages but doesn't eliminate it.

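
The problem with using the delta as a SWAP size can be made concrete with a little arithmetic. The sketch below uses made-up numbers and hypothetical helper names; it only assumes that SWAP usage is the difference between RAM+SWAP usage and RAM usage.

```python
MIB = 1024 * 1024


def swap_delta(ram_limit, ramswap_limit):
    """Naive SWAP size: RAM+SWAP limit minus RAM limit."""
    return ramswap_limit - ram_limit


def swap_usage(ram_usage, ramswap_usage):
    """SWAP in use, assuming RAM+SWAP usage = RAM usage + SWAP usage."""
    return ramswap_usage - ram_usage


# Made-up example: 256 MiB RAM limit, 512 MiB RAM+SWAP limit.
delta = swap_delta(256 * MIB, 512 * MIB)

# The container uses 400 MiB in total, but the kernel chose to keep only
# 100 MiB of it in RAM and swapped out the rest.
used = swap_usage(100 * MIB, 400 * MIB)

# prints: 256 300 True
print(delta // MIB, used // MIB, used > delta)
```

Both limits are respected in this example, yet 300 MiB of SWAP is in use while the delta is only 256 MiB, so a SWAP device sized from the delta would appear more than 100% full.
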
As a result, LXCFS had to make some compromises, which are as follows:
- When SWAP accounting isn't enabled, no SWAP space is reported at all.
  This is simply because there is no way to know the SWAP consumption.
  The container may very well be using some SWAP, but there's just
  no way to know how much of it, and showing a SWAP device would require
  some kind of SWAP usage to be reported. Showing the host value would be
  completely wrong, and showing a 0 value would be equally wrong.

- Because SWAP usage for a given container can exceed the delta between
  RAM and RAM+SWAP, the SWAP size is always reported to be the smaller of
  the RAM+SWAP limit or the host SWAP device itself. This ensures that at no
  point SWAP usage will be allowed to exceed the SWAP size.

- If the swappiness is set to 0 and there is no SWAP usage, no SWAP is reported.
  However, if there is SWAP usage, then a SWAP device of the size of the
  usage (100% full) is reported. This provides adequate reporting of
  the memory consumption while preventing applications from assuming more