git.proxmox.com Git - mirror_lxcfs.git/log
5 weeks ago Release LXCFS 6.0.0 main stable-6.0 v6.0.0
Stéphane Graber [Thu, 28 Mar 2024 02:44:27 +0000 (22:44 -0400)]
Release LXCFS 6.0.0

Signed-off-by: Stéphane Graber <stgraber@stgraber.org>
5 weeks ago Merge pull request #633 from mihalicyn/cgroup_walkup_to_root_fixup
Christian Brauner [Wed, 27 Mar 2024 14:03:36 +0000 (15:03 +0100)]
Merge pull request #633 from mihalicyn/cgroup_walkup_to_root_fixup

cgroup_utils: explicitly check for cgroup2 FDs in cgroup_walkup_to_root

5 weeks ago cgroup_utils: explicitly check for cgroup2 FDs in cgroup_walkup_to_root
Alexander Mikhalitsyn [Wed, 27 Mar 2024 13:47:55 +0000 (14:47 +0100)]
cgroup_utils: explicitly check for cgroup2 FDs in cgroup_walkup_to_root

See:
https://github.com/lxc/lxcfs/pull/617#discussion_r1533524372

Suggested-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
5 weeks ago Merge pull request #617 from alexhudspith/cgroup2-swap-534
Stéphane Graber [Wed, 27 Mar 2024 13:03:19 +0000 (09:03 -0400)]
Merge pull request #617 from alexhudspith/cgroup2-swap-534

Fix swap handling for cgroups v2

5 weeks ago proc: Fix swap handling for cgroups v2 (zero limits)
Alex Hudspith [Mon, 6 Nov 2023 09:17:38 +0000 (09:17 +0000)]
proc: Fix swap handling for cgroups v2 (zero limits)

Since memory.swap.max = 0 is valid under v2, limits of 0 must not be
treated differently. Instead, use UINT64_MAX as the default limit. This aligns
with cgroups v1 behaviour anyway since 'limit_in_bytes' files contain a large
number for unspecified limits (2^63).
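
The zero-limit handling described above can be sketched roughly as follows. This is only an illustration: parse_limit() and swap_allowed() are hypothetical helpers, not functions from LXCFS, and the treatment of unparsable input as "no limit" is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from LXCFS): parse the contents of a
 * cgroup limit file such as memory.swap.max.
 * "max" or an unparsable value maps to UINT64_MAX ("no limit"),
 * while "0" is kept as a genuine limit of zero bytes.
 */
static uint64_t parse_limit(const char *s)
{
	char *end = NULL;
	unsigned long long v;

	assert(s != NULL);

	if (strncmp(s, "max", 3) == 0)
		return UINT64_MAX;

	errno = 0;
	v = strtoull(s, &end, 10);
	if (errno != 0 || end == s)
		return UINT64_MAX;

	return (uint64_t)v;
}

/* A limit of 0 means "swap disabled"; it is not a placeholder for "unset". */
static int swap_allowed(uint64_t swap_limit)
{
	return swap_limit != 0;
}
```

With this split, "unset" and "zero" can no longer be confused: only UINT64_MAX stands for "no limit configured".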

Resolves: #534
Signed-off-by: Alex Hudspith <alex@hudspith.io>
5 weeks ago proc: Fix swap handling for cgroups v2 (can_use_swap)
Alex Hudspith [Mon, 6 Nov 2023 09:17:38 +0000 (09:17 +0000)]
proc: Fix swap handling for cgroups v2 (can_use_swap)

On cgroups v2, there are no swap current/max files at the cgroup root, so
can_use_swap must look lower in the hierarchy to determine if swap accounting
is enabled. To also account for memory accounting being turned off at some
level, walk the hierarchy upwards from lxcfs' own cgroup.
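
A minimal sketch of such an upward walk, assuming a caller-supplied probe that reports whether swap accounting files are visible in a given cgroup directory. walk_up_for_swap() and demo_probe() are hypothetical names used for illustration; the real code works on file descriptors rather than path strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Caller-supplied probe: does this cgroup directory expose swap accounting? */
typedef bool (*swap_probe_fn)(const char *cgroup_path);

/*
 * Hypothetical sketch: starting at `cgroup` (e.g. "/a/b/c"), run the probe
 * at each level and move one path component up until only the root is left.
 * The root itself is never probed, because cgroup2 has no swap files there.
 */
static bool walk_up_for_swap(const char *cgroup, swap_probe_fn probe)
{
	char path[4096];
	size_t len;

	assert(cgroup != NULL && probe != NULL);

	len = strlen(cgroup);
	if (len == 0 || len >= sizeof(path) || cgroup[0] != '/')
		return false;
	memcpy(path, cgroup, len + 1);

	while (strcmp(path, "/") != 0) {
		char *slash;

		if (probe(path))
			return true;

		slash = strrchr(path, '/');
		if (slash == path)
			break;        /* the parent is the root: stop */
		*slash = '\0';        /* drop the last path component */
	}

	return false;
}

/* Example probe for illustration: pretend only "/lxc" has swap files. */
static bool demo_probe(const char *cgroup_path)
{
	return strcmp(cgroup_path, "/lxc") == 0;
}
```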

Signed-off-by: Alex Hudspith <alex@hudspith.io>
[ added check cgroup pointer is not NULL in lxcfs_init() ]
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
5 weeks ago proc_fuse: Fix get_swap_info typo swtotal == 0 -> *swtotal == 0
Alex Hudspith [Mon, 6 Nov 2023 09:17:38 +0000 (09:17 +0000)]
proc_fuse: Fix get_swap_info typo swtotal == 0 -> *swtotal == 0

Signed-off-by: Alex Hudspith <alex@hudspith.io>
5 weeks ago Merge pull request #631 from stgraber/main
Alexander Mikhalitsyn [Tue, 26 Mar 2024 14:23:51 +0000 (15:23 +0100)]
Merge pull request #631 from stgraber/main

Revert "github: workaround CI issue with ASAN"

5 weeks ago Revert "github: workaround CI issue with ASAN"
Stéphane Graber [Tue, 26 Mar 2024 14:21:07 +0000 (10:21 -0400)]
Revert "github: workaround CI issue with ASAN"

This reverts commit 351775512350bfb45c6486f39a7aa7cc76f690c7.

Signed-off-by: Stéphane Graber <stgraber@stgraber.org>
6 weeks ago Merge pull request #630 from mihalicyn/sys_write_forbid
Christian Brauner [Mon, 18 Mar 2024 11:51:07 +0000 (12:51 +0100)]
Merge pull request #630 from mihalicyn/sys_write_forbid

lxcfs: tighten policy about write() syscall

6 weeks ago github: workaround CI issue with ASAN
Alexander Mikhalitsyn [Mon, 18 Mar 2024 10:41:01 +0000 (11:41 +0100)]
github: workaround CI issue with ASAN

https://github.com/actions/runner-images/issues/9491
https://github.com/google/fuzztest/commit/7b4f288ce94dd92a48a638301bb7ec5df0cf8c94

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
6 weeks ago tests: use --enable-cgroup for tests
Alexander Mikhalitsyn [Fri, 15 Mar 2024 16:13:44 +0000 (17:13 +0100)]
tests: use --enable-cgroup for tests

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
6 weeks ago lxcfs: introduce new option --enable-cgroup
Alexander Mikhalitsyn [Fri, 15 Mar 2024 15:49:47 +0000 (16:49 +0100)]
lxcfs: introduce new option --enable-cgroup

During a private discussion, Stéphane proposed
adding a new option, --enable-cgroup, to explicitly
enable the old cgroup emulation code.

It's worth mentioning that the cgroup code in LXCFS
is not widely used: it was written before the
cgroup namespace era and is no longer relevant these days.

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
6 weeks ago sysfs: forbid write()
Alexander Mikhalitsyn [Fri, 15 Mar 2024 15:47:57 +0000 (16:47 +0100)]
sysfs: forbid write()

It's just dangerous to allow passthrough of the write()
syscall anywhere under the emulated sysfs subtree.

Let's forbid it.

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
3 months ago Merge pull request #626 from stgraber/main
Serge Hallyn [Mon, 15 Jan 2024 02:06:31 +0000 (20:06 -0600)]
Merge pull request #626 from stgraber/main

lxc.mount.hook: Skip cpu sysfs logic if missing target

3 months ago lxc.mount.hook: Skip cpu sysfs logic if missing target
Stéphane Graber [Sun, 14 Jan 2024 21:30:22 +0000 (16:30 -0500)]
lxc.mount.hook: Skip cpu sysfs logic if missing target

Closes #625

Suggested-by: amateur80lvl
Signed-off-by: Stéphane Graber <stgraber@stgraber.org>
3 months ago Merge pull request #622 from zhaixiaojuan/main
Stéphane Graber [Sun, 14 Jan 2024 21:27:01 +0000 (22:27 +0100)]
Merge pull request #622 from zhaixiaojuan/main

Add macro pivot&bpf for loongarch64

3 months ago Add macro pivot&bpf for loongarch64
zhaixiaojuan [Tue, 5 Dec 2023 07:06:10 +0000 (02:06 -0500)]
Add macro pivot&bpf for loongarch64

Signed-off-by: zhaixiaojuan <zhaixiaojuan@loongson.cn>
4 months ago Merge pull request #624 from peppaJoeng/main
Stéphane Graber [Thu, 14 Dec 2023 14:36:34 +0000 (09:36 -0500)]
Merge pull request #624 from peppaJoeng/main

typofix: fix incorrect printing in lxcfs help interface

4 months ago typofix: fix incorrect printing in lxcfs help interface
vegbir [Thu, 14 Dec 2023 07:27:07 +0000 (07:27 +0000)]
typofix: fix incorrect printing in lxcfs help interface

Signed-off-by: vegbir <yangjiaqi16@huawei.com>
5 months ago Merge pull request #620 from tych0/mkdir-lxcfs-target-dir
Stéphane Graber [Wed, 29 Nov 2023 18:58:43 +0000 (13:58 -0500)]
Merge pull request #620 from tych0/mkdir-lxcfs-target-dir

systemd: mkdir -p the target mount dir

5 months ago systemd: mkdir -p the target mount dir
Tycho Andersen [Wed, 29 Nov 2023 18:49:55 +0000 (11:49 -0700)]
systemd: mkdir -p the target mount dir

This is probably in a postinst for a debian package or a snap somewhere,
but we're repackaging it somewhere and I have an ugly sed to fix it up.
Let's do it here instead.

Signed-off-by: Tycho Andersen <tycho@tycho.pizza>
6 months ago Merge pull request #615 from kyeongy/main
Stéphane Graber [Wed, 4 Oct 2023 16:07:40 +0000 (12:07 -0400)]
Merge pull request #615 from kyeongy/main

proc: fix MemAvailable in /proc/meminfo to exclude tmpfs files

6 months ago proc: fix MemAvailable in /proc/meminfo to exclude tmpfs files
Kyeong Yoo [Tue, 3 Oct 2023 03:36:51 +0000 (16:36 +1300)]
proc: fix MemAvailable in /proc/meminfo to exclude tmpfs files

The "total_cache" from memory.stat of cgroup includes
the memory used by tmpfs files ("total_shmem"). Considering
it as available memory is wrong because files created
on a tmpfs file system cannot be simply reclaimed.

So the available memory is calculated with the sum of:
 * Memory the kernel knows is free
 * Memory that is contained in the kernel active file LRU,
   and can be reclaimed if necessary
 * Memory that is contained in the kernel inactive file
   LRU, and can be reclaimed if necessary

Signed-off-by: Kyeong Yoo <kyeong.yoo@alliedtelesis.co.nz>
7 months ago Merge pull request #612 from mihalicyn/load_daemon_signature_v2
Stéphane Graber [Fri, 29 Sep 2023 16:15:13 +0000 (12:15 -0400)]
Merge pull request #612 from mihalicyn/load_daemon_signature_v2

loadavg: make cleanup of start_loadavg

7 months ago Merge pull request #614 from stgraber/main
Christian Brauner [Fri, 29 Sep 2023 16:12:05 +0000 (18:12 +0200)]
Merge pull request #614 from stgraber/main

lxcfs: Add startup message

7 months ago Merge pull request #613 from mihalicyn/cpuview_debug_print_fix
Stéphane Graber [Fri, 29 Sep 2023 16:07:36 +0000 (12:07 -0400)]
Merge pull request #613 from mihalicyn/cpuview_debug_print_fix

cpuview: pass a correct argument to lxcfs_debug

7 months ago lxcfs: Add startup message
Stéphane Graber [Fri, 29 Sep 2023 16:06:45 +0000 (12:06 -0400)]
lxcfs: Add startup message

Closes #560

Signed-off-by: Stéphane Graber <stgraber@stgraber.org>
7 months ago cpuview: pass a correct argument to lxcfs_debug
Alexander Mikhalitsyn [Fri, 29 Sep 2023 15:28:46 +0000 (17:28 +0200)]
cpuview: pass a correct argument to lxcfs_debug

struct cg_proc_stat *cur;
...
lxcfs_debug("Removing stat node for %s\n", cur);

should be:

lxcfs_debug("Removing stat node for %s\n", cur->cg);

Only reproducible when DEBUG macro is defined.

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
7 months ago loadavg: make cleanup of start_loadavg
Alexander Mikhalitsyn [Fri, 29 Sep 2023 14:34:05 +0000 (16:34 +0200)]
loadavg: make cleanup of start_loadavg

Clean up the start_loadavg code:
- add a new external symbol load_daemon_v2 with a pthread_create-like signature
- make hacky casts of pthread_t to int (and back) unnecessary for new API users

Related to: #610

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
7 months ago Merge pull request #611 from mihalicyn/cgroup2_devices_cleanup
Stéphane Graber [Fri, 29 Sep 2023 14:18:05 +0000 (10:18 -0400)]
Merge pull request #611 from mihalicyn/cgroup2_devices_cleanup

cgroups: cleanup and remove unused cgroup2_devices code

7 months ago cgroups: cleanup and remove unused cgroup2_devices code
Alexander Mikhalitsyn [Fri, 29 Sep 2023 13:31:38 +0000 (15:31 +0200)]
cgroups: cleanup and remove unused cgroup2_devices code

The cgroup2 driver code was imported from LXC, but the cgroup2_devices
part is not used in LXCFS. Let's remove it.

Fixes: #607
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
7 months ago Merge pull request #610 from listout/clang16-build-fix
Stéphane Graber [Fri, 15 Sep 2023 15:17:24 +0000 (11:17 -0400)]
Merge pull request #610 from listout/clang16-build-fix

proc_loadavg.c: Fix incompatible integer to pointer conversion

7 months ago Merge pull request #609 from gibmat/fix-arm32-cpuinfo
Stéphane Graber [Fri, 15 Sep 2023 15:16:50 +0000 (11:16 -0400)]
Merge pull request #609 from gibmat/fix-arm32-cpuinfo

proc: Fix /proc/cpuinfo not respecting personality

7 months ago proc_loadavg.c: Fix incompatible integer to pointer conversion
Brahmajit Das [Tue, 5 Sep 2023 04:15:06 +0000 (04:15 +0000)]
proc_loadavg.c: Fix incompatible integer to pointer conversion

Newer compilers such as Clang 16 and GCC 14 enable certain errors by
default, namely -Werror=incompatible-function-pointer-types, which
results in build errors such as:

proc_loadavg.c:606:10: error: incompatible integer to pointer conversion returning int from a function with result type pthread_t

My patch suppresses the error for now, but a proper fix would be better.
First discovered on Gentoo Linux (bug #894348).

Bug: https://bugs.gentoo.org/894348
Closes: https://github.com/lxc/lxcfs/issues/561
Signed-off-by: Brahmajit Das <brahmajit.xyz@gmail.com>
7 months ago proc: Fix /proc/cpuinfo not respecting personality
Mathias Gibbens [Mon, 4 Sep 2023 00:13:57 +0000 (00:13 +0000)]
proc: Fix /proc/cpuinfo not respecting personality

It was found that the personality within the container was not being
properly respected, which for large numbers of CPUs would break
reporting of /proc/cpuinfo in arm32 containers running on an arm64 host.

Signed-off-by: Mathias Gibbens <gibmat@debian.org>
8 months ago Merge pull request #606 from mihalicyn/loadavg_deadlock_fix
Christian Brauner [Thu, 10 Aug 2023 06:51:28 +0000 (08:51 +0200)]
Merge pull request #606 from mihalicyn/loadavg_deadlock_fix

proc_loadavg: fix ABBA deadlock between read/refresh

8 months ago proc_loadavg: fix ABBA deadlock between read/refresh
Alexander Mikhalitsyn [Wed, 9 Aug 2023 16:39:46 +0000 (18:39 +0200)]
proc_loadavg: fix ABBA deadlock between read/refresh

The idea of this fix is to always take nested locks in
the same order.

At the same time, we are adding an extra check to insert_node()
that prevents adding a new load_node with the same cgroup
(->cg field) value. Such a duplicate is theoretically possible because
we don't hold .rilock/.lock when we call insert_node().

It looks like we have had this issue since the initial
implementation of loadavg virtualization, and it is hard to
reproduce, which is why we weren't able to notice it.
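
The duplicate check can be illustrated with a simplified list insert. This is a sketch only: struct node and insert_unique() are hypothetical stand-ins for load_node and insert_node(), and the locking that the real code needs around the list is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the real load_node; only the cgroup name matters. */
struct node {
	const char *cg;
	struct node *next;
};

/*
 * Insert `n` at the head of the list unless a node with the same cgroup
 * is already present. Returns the node that represents `n->cg` afterwards,
 * so a caller that lost the race can drop its own copy.
 * In real code the caller would hold the list lock while this runs.
 */
static struct node *insert_unique(struct node **head, struct node *n)
{
	struct node *it;

	assert(head != NULL && n != NULL && n->cg != NULL);

	for (it = *head; it != NULL; it = it->next) {
		if (strcmp(it->cg, n->cg) == 0)
			return it; /* duplicate: keep the existing node */
	}

	n->next = *head;
	*head = n;
	return n;
}
```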

Fixes: #605
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
9 months ago Merge pull request #594 from mihalicyn/fuse_interrupt
Stéphane Graber [Tue, 1 Aug 2023 19:53:22 +0000 (15:53 -0400)]
Merge pull request #594 from mihalicyn/fuse_interrupt

Partial support for FUSE_INTERRUPT

9 months ago Merge pull request #604 from stgraber/main
Christian Brauner [Mon, 24 Jul 2023 15:41:31 +0000 (17:41 +0200)]
Merge pull request #604 from stgraber/main

github: Update for main branch

9 months ago github: Update for main branch
Stéphane Graber [Mon, 24 Jul 2023 15:28:04 +0000 (11:28 -0400)]
github: Update for main branch

Signed-off-by: Stéphane Graber <stgraber@stgraber.org>
12 months ago cpuview: start to use interruptible lock primitives
Alexander Mikhalitsyn [Wed, 12 Apr 2023 22:16:32 +0000 (00:16 +0200)]
cpuview: start to use interruptible lock primitives

Let's start using fuse-interruptible locks in cpuview.
It's better to start from one place instead of converting everything
at once to prevent global degradations.

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
12 months ago lxcfs: add fuse interruptible locks
Alexander Mikhalitsyn [Wed, 12 Apr 2023 21:38:01 +0000 (23:38 +0200)]
lxcfs: add fuse interruptible locks

Adds a few helper functions which are FUSE-interruptible
versions of the classical pthread locking primitives:
extern int mutex_lock_interruptible(pthread_mutex_t *l);
extern int rwlock_rdlock_interruptible(pthread_rwlock_t *l);
extern int rwlock_wrlock_interruptible(pthread_rwlock_t *l);
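
One way such a helper might work is to wait for the lock in short slices and re-check for an interrupt between slices. The sketch below is an assumption about the approach, not the LXCFS implementation: request_interrupted is a stand-in flag for whatever interrupt query the FUSE layer provides, and the one-second slice is arbitrary.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Stand-in for the FUSE interrupt query; set by demo or test code. */
static volatile int request_interrupted;

/*
 * Returns 0 once the mutex is held, -EINTR if the request was interrupted
 * while waiting, or another negative errno value on failure.
 */
static int mutex_lock_interruptible_sketch(pthread_mutex_t *l)
{
	assert(l != NULL);

	for (;;) {
		struct timespec deadline;
		int ret;

		if (request_interrupted)
			return -EINTR;

		/* Wait in short slices so the flag is re-checked regularly. */
		if (clock_gettime(CLOCK_REALTIME, &deadline) != 0)
			return -errno;
		deadline.tv_sec += 1;

		ret = pthread_mutex_timedlock(l, &deadline);
		if (ret == 0)
			return 0;
		if (ret != ETIMEDOUT)
			return -ret;
	}
}
```

A caller that gets -EINTR back does not hold the mutex and must not unlock it.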

Does not change behavior.

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
12 months ago lxcfs: preparation for FUSE_INTERRUPT support
Alexander Mikhalitsyn [Wed, 12 Apr 2023 18:29:38 +0000 (20:29 +0200)]
lxcfs: preparation for FUSE_INTERRUPT support

This commit prepares lxcfs for FUSE_INTERRUPT support by
enabling interrupt on libfuse side and setting dummy signal handler
with SA_RESTART flag.

SA_RESTART is very important here, otherwise we can break something
accidentally.
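
A minimal sketch of installing a no-op handler with SA_RESTART using plain POSIX sigaction(). The helper names are hypothetical, and which signal the FUSE library actually uses for interrupts is not stated here, so the signal number is left as a parameter.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Intentionally empty: the signal only has to be delivered, not acted on. */
static void dummy_handler(int sig)
{
	(void)sig;
}

/*
 * Install the no-op handler for `signo`. With SA_RESTART, system calls
 * interrupted by the signal are restarted instead of failing with EINTR.
 * Returns 0 on success, -1 on error (errno is set by sigaction()).
 */
static int install_dummy_handler(int signo)
{
	struct sigaction sa;

	assert(signo > 0);

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = dummy_handler;
	sa.sa_flags = SA_RESTART;
	sigemptyset(&sa.sa_mask);

	return sigaction(signo, &sa, NULL);
}
```

Without SA_RESTART, unrelated blocking calls in the same thread could start returning EINTR, which is the kind of accidental breakage the message warns about.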

This commit has no user-visible effects.

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
12 months ago Merge pull request #591 from yowenter/master
Stéphane Graber [Sat, 15 Apr 2023 02:25:57 +0000 (22:25 -0400)]
Merge pull request #591 from yowenter/master

cpuset cgroup path may be different from cpu cgroup path in kubernetes

12 months ago cpuview: resolve cpu cgroup path separately from cpuset
yowenter [Tue, 28 Mar 2023 06:32:07 +0000 (14:32 +0800)]
cpuview: resolve cpu cgroup path separately from cpuset
The cgroup path is different in Kubernetes with the containerd runtime.

Signed-off-by: yowenter <wenter.wu@gmail.com>
12 months ago Merge pull request #595 from mihalicyn/github_actions_2204
Stéphane Graber [Sat, 15 Apr 2023 02:22:16 +0000 (22:22 -0400)]
Merge pull request #595 from mihalicyn/github_actions_2204

github: start using ubuntu-22.04 image

12 months ago tests: adapt for cgroup2
Alexander Mikhalitsyn [Thu, 13 Apr 2023 09:54:59 +0000 (11:54 +0200)]
tests: adapt for cgroup2

Make tests work with non-hybrid cgroup2 configuration.
- skip cgroupfs emulation tests (the emulation is obsolete and doesn't emulate cgroup2 properly)
- adapt the other tests to cgroup2 (tasks -> cgroup.procs, and so on)

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
12 months ago github: remove ubuntu-18.04
Alexander Mikhalitsyn [Thu, 13 Apr 2023 09:32:27 +0000 (11:32 +0200)]
github: remove ubuntu-18.04

Unfortunately, it's deprecated and not working properly:
https://github.blog/changelog/2022-08-09-github-actions-the-ubuntu-18-04-actions-runner-image-is-being-deprecated-and-will-be-removed-by-12-1-22/

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
12 months ago github: start using ubuntu-22.04 image
Alexander Mikhalitsyn [Wed, 12 Apr 2023 22:31:20 +0000 (00:31 +0200)]
github: start using ubuntu-22.04 image

We've moved snap from core20 to core22, let's
test LXCFS on Ubuntu 22.04 too.

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
13 months ago Merge pull request #590 from mihalicyn/diskstats_fix
Stéphane Graber [Mon, 20 Mar 2023 14:05:52 +0000 (10:05 -0400)]
Merge pull request #590 from mihalicyn/diskstats_fix

proc: fix /proc/diskstats output format

13 months ago proc: fix /proc/diskstats output format
Alexander Mikhalitsyn [Mon, 20 Mar 2023 10:17:48 +0000 (11:17 +0100)]
proc: fix /proc/diskstats output format

We've lost the 15th column (discard) in the /proc/diskstats output.

After this fix, the /proc/diskstats format is in full agreement with the 4.18 kernel.
In 5.5+ kernels, two new fields were introduced: flush_req/flush_time.
Unfortunately, we can't add support for them as cgroup doesn't provide us
with this stat info.

See also:
https://github.com/torvalds/linux/commit/b6866318657717c8914673a6394894d12bc9ff5e

Fixes #589
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
15 months ago Merge pull request #586 from stgraber/master
Christian Brauner [Tue, 31 Jan 2023 08:38:29 +0000 (09:38 +0100)]
Merge pull request #586 from stgraber/master

Revert "init: respect --prefix when installing systemd unit files"

15 months ago Revert "init: respect --prefix when installing systemd unit files"
Stéphane Graber [Tue, 31 Jan 2023 01:27:15 +0000 (20:27 -0500)]
Revert "init: respect --prefix when installing systemd unit files"

This reverts commit 9bfc897a0c6b010464dc490e955dbc1d1a66e4c3.

Reported to be causing issues in #555

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
15 months ago Merge pull request #584 from Blub/jinja-trailing-newlines
Stéphane Graber [Thu, 19 Jan 2023 16:31:32 +0000 (11:31 -0500)]
Merge pull request #584 from Blub/jinja-trailing-newlines

build: tools: keep trailing newline in jinja2 renderer

15 months ago build: tools: keep trailing newline in jinja2 renderer
Wolfgang Bumiller [Thu, 19 Jan 2023 10:13:46 +0000 (11:13 +0100)]
build: tools: keep trailing newline in jinja2 renderer

Otherwise /usr/share/lxc/config/common.conf.d/00-lxcfs.conf
loses its trailing newline

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
15 months ago Merge pull request #579 from mihalicyn/cpuview_deadlock master
Christian Brauner [Mon, 16 Jan 2023 12:15:02 +0000 (13:15 +0100)]
Merge pull request #579 from mihalicyn/cpuview_deadlock

cpuview: fix ABBA deadlock in find_proc_stat_node

15 months ago cpuview: fix ABBA deadlock in find_proc_stat_node
Alexander Mikhalitsyn [Mon, 16 Jan 2023 11:34:52 +0000 (12:34 +0100)]
cpuview: fix ABBA deadlock in find_proc_stat_node

Thanks to a detailed report from Nikhil, it was discovered
that on some workloads reads from lxcfs get stuck.

Analysis of a kernel crashdump showed that many
"mtail" processes were waiting on read() from the /proc/stat file.

The first suspect was my last commit, which fixes a use-after-free
but unfortunately also adds an ABBA deadlock.

Thread 1                                      Thread 2

find_proc_stat_node():
rwlock_read                                   rwlock_read
mutex_lock(some_node) [taken]                 mutex_lock(some_node) [wait T1]
rwlock_unlock
rwlock_wrlock (prune_proc_stat_history call) [wait T2]

BOOM. That's a deadlock.

The fix is simple: let's just move the prune_proc_stat_history call
before taking the mutex on the cg_proc_stat node.

Fixes: 54db3e71b ("cpuview: fix possible use-after-free in find_proc_stat_node")
Issue #471

Reported-by: Nikhil Kshirsagar <nikhil.kshirsagar@canonical.com>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
15 months ago Merge pull request #577 from deleriux/file_info_api
Christian Brauner [Thu, 12 Jan 2023 16:48:56 +0000 (17:48 +0100)]
Merge pull request #577 from deleriux/file_info_api

lxcfs: handle NULL path in lxcfs_releasedir/lxcfs_release

15 months ago lxcfs: handle NULL path in lxcfs_releasedir/lxcfs_release
Matthew Ife [Wed, 11 Jan 2023 14:30:06 +0000 (14:30 +0000)]
lxcfs: handle NULL path in lxcfs_releasedir/lxcfs_release

Implement a file_info_type function.
Create a series of defines for the file type.

Inspect the file type for each file handler and perform the relevant
release code to free any memory.

If the type cannot be determined, print an error and return -EINVAL.

This should also be slightly faster than a strcmp(), but the difference
is not likely to be measurable.

This code could be used elsewhere in the process to reduce the strcmp
requirements, but for now just handle the release/releasedir case.

This fixes a SEGV in lxcfs_release/releasedir when path=NULL is passed
to strcmp().
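
The idea of dispatching on a stored type instead of comparing the path can be sketched as below. All names here (enum file_type_sketch, struct file_info_sketch, release_by_type()) are hypothetical and only illustrate the pattern; the real handlers and type constants differ.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical type tags recorded when the file is opened. */
enum file_type_sketch { FT_UNKNOWN = 0, FT_PROC, FT_SYS, FT_CGROUP };

struct file_info_sketch {
	enum file_type_sketch type;
	void *buf; /* per-file buffer owned by the handle */
};

/*
 * Release a file handle based on its recorded type.
 * `path` may be NULL and is never dereferenced, so no strcmp() on it.
 */
static int release_by_type(const char *path, struct file_info_sketch *f)
{
	(void)path;
	assert(f != NULL);

	switch (f->type) {
	case FT_PROC:
	case FT_SYS:
	case FT_CGROUP:
		/* Real code would call the matching release handler here. */
		free(f->buf);
		f->buf = NULL;
		f->type = FT_UNKNOWN;
		return 0;
	default:
		return -EINVAL; /* type could not be determined */
	}
}
```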

Signed-off-by: Matthew Ife <matthewi@mustardsystems.com>
15 months ago Merge pull request #557 from tych0/dont-mask-system-cpu
Stéphane Graber [Fri, 6 Jan 2023 22:16:50 +0000 (17:16 -0500)]
Merge pull request #557 from tych0/dont-mask-system-cpu

sysfs: don't mask cpus in /sys/devices/system/cpu

15 months ago Merge pull request #558 from tych0/cpu-num-proc-stat
Stéphane Graber [Wed, 4 Jan 2023 17:20:15 +0000 (12:20 -0500)]
Merge pull request #558 from tych0/cpu-num-proc-stat

/proc/stat: render physical cpu number in non-view mode

15 months ago sysfs: don't mask cpus in /sys/devices/system/cpu
Tycho Andersen [Thu, 27 Oct 2022 16:23:08 +0000 (10:23 -0600)]
sysfs: don't mask cpus in /sys/devices/system/cpu

The kernel does not mask the cpu%d dirs when they are offlined:

(root) /sys/devices/system/cpu # cat online
0-7
(root) /sys/devices/system/cpu # chcpu -d 4
CPU 4 disabled
(root) /sys/devices/system/cpu # cat online
0-3,5-7
(root) /sys/devices/system/cpu # cat offline
4
(root) /sys/devices/system/cpu # ls -al
total 0
drwxr-xr-x 16 root root    0 Oct 25 20:42 .
drwxr-xr-x 10 root root    0 Oct 25 20:42 ..
drwxr-xr-x  7 root root    0 Oct 25 20:42 cpu0
drwxr-xr-x  7 root root    0 Oct 25 20:42 cpu1
drwxr-xr-x  7 root root    0 Oct 25 20:42 cpu2
drwxr-xr-x  7 root root    0 Oct 25 20:42 cpu3
drwxr-xr-x  5 root root    0 Oct 25 20:42 cpu4
drwxr-xr-x  7 root root    0 Oct 25 20:42 cpu5
drwxr-xr-x  7 root root    0 Oct 25 20:42 cpu6
drwxr-xr-x  7 root root    0 Oct 25 20:42 cpu7
drwxr-xr-x  2 root root    0 Oct 25 20:43 cpufreq
drwxr-xr-x  2 root root    0 Oct 26 15:19 cpuidle
drwxr-xr-x  2 root root    0 Oct 26 15:19 hotplug
-r--r--r--  1 root root 4096 Oct 25 20:42 isolated
-r--r--r--  1 root root 4096 Oct 25 20:43 kernel_max
-r--r--r--  1 root root 4096 Oct 26 15:19 modalias
-r--r--r--  1 root root 4096 Oct 26 15:19 offline
-r--r--r--  1 root root 4096 Oct 25 20:42 online
-r--r--r--  1 root root 4096 Oct 25 20:43 possible
drwxr-xr-x  2 root root    0 Oct 26 15:19 power
-r--r--r--  1 root root 4096 Oct 25 20:43 present
drwxr-xr-x  2 root root    0 Oct 26 15:19 smt
-rw-r--r--  1 root root 4096 Oct 25 20:42 uevent
drwxr-xr-x  2 root root    0 Oct 26 15:19 vulnerabilities

Let's not mask them in lxcfs either. In particular, we have observed this
causing problems with some JVMs' implementation of
Runtime.getRuntime().availableProcessors().

This is a bit of a strange patch: it seems masking this dir was always
incorrect, so we could go back to just not offering it as an lxcfs
endpoint, and having people use sysfs' implementation directly. But maybe
people are expecting it now, so I've left it as a proxy. Perhaps a more
appropriate patch is to just delete it entirely and add an API extension
note?

Signed-off-by: Tycho Andersen <tycho@tycho.pizza>
15 months ago /proc/stat: render physical cpu number in non-view mode
Tycho Andersen [Fri, 28 Oct 2022 20:24:54 +0000 (14:24 -0600)]
/proc/stat: render physical cpu number in non-view mode

When the kernel has an offline CPU, it only renders the online CPUs in
/proc/stat.

When in non-use_view mode, /sys/devices/system/cpu/online shows the CPU
numbers as they actually are on the physical system, but /proc/stat used
"virtual" (i.e. always zero-indexed) numbers, which causes confusion for
some applications. Let's use the same use_view logic in /proc/stat as well.

(root) ~ # chcpu -d 4
CPU 4 disabled
(root) ~ # cat /sys/devices/system/cpu/online
0-3,5-7
(root) ~ # cat /proc/stat
cpu  5599257 116799 1924319 150675607 147630 0 51038 2454 0 0
cpu0 664385 19911 221078 19009148 14812 0 9832 2238 0 0
cpu1 783128 16478 282310 18416428 14325 0 6287 27 0 0
cpu2 822527 15344 275820 18355637 11215 0 5762 26 0 0
cpu3 807309 15631 277152 18399668 9697 0 7623 27 0 0
cpu5 712191 13680 245495 20759989 13959 0 6968 46 0 0
cpu6 728409 14828 247631 20794157 40037 0 5624 29 0 0
cpu7 709679 14321 251154 20785870 18872 0 6226 32 0 0
intr 10891106673 0 10 0 0 418 0 0 0 0 1 0 0 3 0 0 0 0 0 0 0 0 0 0 0 8 8 406552 434812 1863708 2335638 258448 3878047 1907518 1500768 1860249 1506066 1743056 1527121 1564894 258398 779468 275941 275383 277085 271838 271729 273211 271677 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
ctxt 814064094
btime 1666730552
processes 5152733
procs_running 1
procs_blocked 0
softirq 772324623 4 31396133 263035 39649581 20 4 1593903 442777055 51 256644837
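
To label the per-CPU lines with physical numbers, the online list first has to be expanded into individual ids. The sketch below shows one way to parse a list such as "0-3,5-7"; expand_cpu_list() is a hypothetical helper written for illustration and is not the parser LXCFS uses.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Expand a kernel CPU list such as "0-3,5-7" into individual CPU numbers.
 * Returns the number of ids written to `out`, or -1 on malformed input
 * or when `out` (capacity `max`) is too small.
 */
static int expand_cpu_list(const char *list, int *out, int max)
{
	const char *p = list;
	int n = 0;

	assert(list != NULL && out != NULL);

	while (*p != '\0' && *p != '\n') {
		char *end;
		long lo, hi, c;

		lo = strtol(p, &end, 10);
		if (end == p || lo < 0)
			return -1;
		hi = lo;

		if (*end == '-') { /* a range "lo-hi" */
			p = end + 1;
			hi = strtol(p, &end, 10);
			if (end == p || hi < lo)
				return -1;
		}

		for (c = lo; c <= hi; c++) {
			if (n >= max)
				return -1;
			out[n++] = (int)c;
		}

		if (*end == ',')
			end++;
		else if (*end != '\0' && *end != '\n')
			return -1;
		p = end;
	}

	return n;
}
```

Each returned id could then be used directly in the "cpuN" label, which keeps /proc/stat consistent with /sys/devices/system/cpu/online.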

Signed-off-by: Tycho Andersen <tycho@tycho.pizza>
16 months ago Merge pull request #571 from mihalicyn/libfuse3_direct_IO
Stéphane Graber [Fri, 16 Dec 2022 15:32:23 +0000 (10:32 -0500)]
Merge pull request #571 from mihalicyn/libfuse3_direct_IO

Libfuse3 direct io

16 months ago lxcfs: fix copypaste typo in error message
Alexander Mikhalitsyn [Fri, 16 Dec 2022 15:24:44 +0000 (16:24 +0100)]
lxcfs: fix copypaste typo in error message

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
16 months ago lxcfs: explicitly enable direct_IO for libfuse3
Alexander Mikhalitsyn [Fri, 16 Dec 2022 15:22:35 +0000 (16:22 +0100)]
lxcfs: explicitly enable direct_IO for libfuse3

It was discovered that with libfuse3 we lost the FOPEN_DIRECT_IO flag
on (struct fuse_file)->open_flags. I'm sure that this is the reason
for all the strange bugs that our users ran into recently.

Fixes:
https://github.com/lxc/lxcfs/issues/565
https://discuss.linuxcontainers.org/t/number-of-cpus-reported-by-proc-stat-fluctuates-causing-issues/15780/14

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
16 months ago Merge pull request #570 from mihalicyn/readme_sanitizers
Christian Brauner [Tue, 6 Dec 2022 23:08:37 +0000 (00:08 +0100)]
Merge pull request #570 from mihalicyn/readme_sanitizers

Enable ASAN and UBSAN in PR tests

16 months ago github: enable ASAN and UBSAN during PR tests
Alexander Mikhalitsyn [Tue, 6 Dec 2022 17:54:19 +0000 (18:54 +0100)]
github: enable ASAN and UBSAN during PR tests

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
16 months ago cpuset_parse: check input string in cpuset_nexttok
Alexander Mikhalitsyn [Tue, 6 Dec 2022 19:06:15 +0000 (20:06 +0100)]
cpuset_parse: check input string in cpuset_nexttok

We have to check that the input string is at least 1 character long,
so that it is safe to add 1 to the string pointer.
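
A sketch of the kind of guard this describes, assuming the tokenizer looks for the next comma starting one character past the current position. next_token() is a hypothetical rewrite for illustration and may differ from the actual cpuset_nexttok().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the token after the next ',' or NULL if there is none.
 * The emptiness check guarantees that `c + 1` still points inside the
 * string (at worst at its terminating NUL) before strchr() reads from it.
 */
static const char *next_token(const char *c)
{
	const char *comma;

	assert(c != NULL);

	if (c[0] == '\0')
		return NULL;

	comma = strchr(c + 1, ',');
	return comma ? comma + 1 : NULL;
}
```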

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
16 months ago README: how to build with sanitizers
Alexander Mikhalitsyn [Tue, 6 Dec 2022 17:46:07 +0000 (18:46 +0100)]
README: how to build with sanitizers

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
16 months ago Merge pull request #569 from mihalicyn/meson_coverity
Christian Brauner [Tue, 6 Dec 2022 17:38:06 +0000 (18:38 +0100)]
Merge pull request #569 from mihalicyn/meson_coverity

github: make coverity workflow work with meson

16 months ago github: make coverity workflow work with meson
Alexander Mikhalitsyn [Tue, 6 Dec 2022 17:16:23 +0000 (18:16 +0100)]
github: make coverity workflow work with meson

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
16 months ago Merge pull request #555 from tych0/init-respect-prefix
Stéphane Graber [Mon, 5 Dec 2022 15:41:46 +0000 (10:41 -0500)]
Merge pull request #555 from tych0/init-respect-prefix

init: respect --prefix when installing systemd unit files

16 months ago Merge pull request #568 from mihalicyn/cg_proc_stat_use_after_free
Christian Brauner [Mon, 5 Dec 2022 15:15:56 +0000 (16:15 +0100)]
Merge pull request #568 from mihalicyn/cg_proc_stat_use_after_free

[RFC] cpuview: fix possible use-after-free in find_proc_stat_node

16 months ago cpuview: fix possible use-after-free in find_proc_stat_node
Alexander Mikhalitsyn [Fri, 2 Dec 2022 11:57:33 +0000 (12:57 +0100)]
cpuview: fix possible use-after-free in find_proc_stat_node

Our current lock design uses two sync primitives.
The first (pthread_rwlock) protects the hash table buckets.
The second (pthread_mutex) protects each struct cg_proc_stat
from concurrent modification. The problem is that
find_proc_stat_node() can return a pointer to a node
(struct cg_proc_stat) which can be freed by a prune_proc_stat_history()
call *before* we take the pthread_mutex. Moreover, we release the
memory of (struct cg_proc_stat) in prune_proc_stat_list()
without any protection such as a refcounter or mutex on (struct cg_proc_stat).

An attempt to guess what happens in:
https://github.com/lxc/lxcfs/issues/565
https://discuss.linuxcontainers.org/t/number-of-cpus-reported-by-proc-stat-fluctuates-causing-issues/15780/14

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
17 months ago Merge pull request #567 from mihalicyn/cpuinfo_with_personality
Christian Brauner [Thu, 1 Dec 2022 09:16:45 +0000 (10:16 +0100)]
Merge pull request #567 from mihalicyn/cpuinfo_with_personality

cpuinfo with personality

17 months ago cpuview: passthrough personality when reading cpuinfo
Alexander Mikhalitsyn [Mon, 28 Nov 2022 13:54:56 +0000 (14:54 +0100)]
cpuview: passthrough personality when reading cpuinfo

Let's change the processing thread's personality if the caller's personality
is different. This allows /proc/cpuinfo to be read properly in
some cases (arm64 relies on current->personality inside the Linux kernel).

https://github.com/lxc/lxcfs/issues/553

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
17 months ago utils: add get_task_personality helper
Alexander Mikhalitsyn [Mon, 28 Nov 2022 13:51:24 +0000 (14:51 +0100)]
utils: add get_task_personality helper

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
17 months ago macro.h: add strnprintf macro
Alexander Mikhalitsyn [Wed, 30 Nov 2022 22:57:37 +0000 (23:57 +0100)]
macro.h: add strnprintf macro

Stolen from LXC

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
17 months agoutils: add safe_uint32() helper
Alexander Mikhalitsyn [Wed, 30 Nov 2022 16:25:24 +0000 (17:25 +0100)]
utils: add safe_uint32() helper

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
17 months agoMerge pull request #564 from blue-troy/master
Stéphane Graber [Thu, 24 Nov 2022 04:04:26 +0000 (23:04 -0500)]
Merge pull request #564 from blue-troy/master

doc: guide for mount /sys/devices/system/cpu in docker demo

17 months agodoc: guide for mount /sys/devices/system/cpu in docker demo
blue-troy [Wed, 23 Nov 2022 07:51:50 +0000 (15:51 +0800)]
doc: guide for mount /sys/devices/system/cpu in docker demo

Signed-off-by: blue-troy <12729455+blue-troy@users.noreply.github.com>
17 months agoMerge pull request #563 from gibmat/fix-ia64-build
Stéphane Graber [Thu, 17 Nov 2022 23:55:41 +0000 (18:55 -0500)]
Merge pull request #563 from gibmat/fix-ia64-build

Fix build on ia64

17 months agoFix build on ia64
Mathias Gibbens [Thu, 17 Nov 2022 21:57:58 +0000 (21:57 +0000)]
Fix build on ia64

The relevant code was added in commit 35acc24, but the function/macro
prctl_arg() didn't seem to be defined anywhere in the repo. lxc
currently has a corresponding macro defined in src/lxc/macro.h that
casts the value to an unsigned long. But 0 doesn't require any special
handling, so remove the call to prctl_arg().

Verified that the code compiles properly on Debian's ia64 porterbox
(yttrium).

Signed-off-by: Mathias Gibbens <gibmat@debian.org>
19 months agoinit: respect --prefix when installing systemd unit files
Tycho Andersen [Mon, 12 Sep 2022 20:21:41 +0000 (14:21 -0600)]
init: respect --prefix when installing systemd unit files

Signed-off-by: Tycho Andersen <tycho@tycho.pizza>
20 months agoMerge pull request #552 from bytedance/set-oom-score-adj
Stéphane Graber [Tue, 23 Aug 2022 21:06:24 +0000 (17:06 -0400)]
Merge pull request #552 from bytedance/set-oom-score-adj

set oom_score_adj of lxcfs process to -1000

20 months agoset oom_score_adj of lxcfs process to -1000
Teng Hu [Tue, 23 Aug 2022 09:31:58 +0000 (17:31 +0800)]
set oom_score_adj of lxcfs process to -1000

Disable OOM killing entirely to minimize the hassle that comes from
lxcfs exiting unexpectedly, e.g. the mountpoint getting lost.

Signed-off-by: Teng Hu <huteng.ht@bytedance.com>
21 months agoMerge pull request #550 from Blub/bindings-reinit-fuse3-fixup
Stéphane Graber [Fri, 29 Jul 2022 14:56:01 +0000 (10:56 -0400)]
Merge pull request #550 from Blub/bindings-reinit-fuse3-fixup

fix reinitialization with fuse3

21 months agofix reinitialization with fuse3
Wolfgang Bumiller [Fri, 29 Jul 2022 07:30:10 +0000 (09:30 +0200)]
fix reinitialization with fuse3

With fuse3 `fuse_get_context` returns NULL before fuse was
fully initialized, so we must not access it.

Further, we call 'do_reload' for normal initialization as
well, so let's prevent that from re-initializing the
bindings initially and only do this on actual reloads,
otherwise we do it twice on startup.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Fixes #549

21 months agoMerge pull request #545 from Blub/init-lib-at-reload-drop-fuse-init-return
Stéphane Graber [Tue, 5 Jul 2022 21:44:19 +0000 (17:44 -0400)]
Merge pull request #545 from Blub/init-lib-at-reload-drop-fuse-init-return

re-initialize library after reload

21 months agoMerge pull request #547 from stgraber/master
Christian Brauner [Tue, 5 Jul 2022 21:33:10 +0000 (23:33 +0200)]
Merge pull request #547 from stgraber/master

Complete Github Actions migration

21 months agogithub: Validate target branch
Stéphane Graber [Tue, 5 Jul 2022 21:26:42 +0000 (17:26 -0400)]
github: Validate target branch

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
21 months agogithub: Restrict permissions
Stéphane Graber [Tue, 5 Jul 2022 21:26:30 +0000 (17:26 -0400)]
github: Restrict permissions

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
21 months agoMerge pull request #546 from Blub/sys-cpu-cpu-readdir
Christian Brauner [Tue, 5 Jul 2022 12:47:02 +0000 (14:47 +0200)]
Merge pull request #546 from Blub/sys-cpu-cpu-readdir

replace opathdir with opendir_flags

21 months agoreplace opathdir with opendir_flags
Wolfgang Bumiller [Tue, 5 Jul 2022 11:55:20 +0000 (13:55 +0200)]
replace opathdir with opendir_flags

`opathdir` was used to replace `opendir` in order to ensure
`O_NOFOLLOW` and `O_CLOEXEC` were set, however it also added
`O_PATH`, which prevents `readdir`/`getdents` from being used
on it, causing the `/sys/devices/system/cpu/<subdir>`
directories to be empty.

Instead, let's have an `opendir_flags` utility which simply
passes additional flags to the `open(..., O_DIRECTORY)` call
preceding `fdopendir()`.
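
The utility described above might look roughly like the following. This is a sketch based only on the description in this message; the exact signature and error handling in the LXCFS source may differ.

```c
#define _GNU_SOURCE
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open `path` as a directory stream, adding caller-supplied open(2)
 * flags such as O_NOFOLLOW | O_CLOEXEC. O_PATH must not be passed,
 * since the resulting descriptor could not be read with readdir().
 * Returns NULL on error with errno set.
 */
static DIR *opendir_flags(const char *path, int flags)
{
	int fd, saved;
	DIR *dir;

	fd = open(path, O_DIRECTORY | flags);
	if (fd < 0)
		return NULL;

	dir = fdopendir(fd);
	if (!dir) {
		saved = errno;	/* keep fdopendir()'s errno across close() */
		close(fd);
		errno = saved;
	}

	/* On success the stream owns fd; closedir() will close it. */
	return dir;
}
```

A caller would use it like `opendir()`, for example `opendir_flags(path, O_NOFOLLOW | O_CLOEXEC)`, and release the stream with `closedir()`.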

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
21 months agore-initialize library after reload
Wolfgang Bumiller [Tue, 5 Jul 2022 08:26:26 +0000 (10:26 +0200)]
re-initialize library after reload

When introducing versioned options, we started using fuse's
"init" callback in order to tell the library to set
`can_use_sys_cpu` and `has_versioned_opts` accordingly.

However, we forgot to also do this on a reload. Fix this by
simply calling `lxcfs_fuse_init()` in `do_reload()` as well.

Additionally: ignore lxcfs_fuse_init()'s return value.
We just "passed through" the private_data from fuse which is
set via the `fuse_main()` call.

It's better to not leave this up to the library anyway in
order to make it easier to be fuse version agnostic in the
future.

Without this, issuing a reload to lxcfs would cause
files in `/sys/devices/system/cpu/` to be visible via
`readdir`, but accessing them would fail:

    ~ # ls /sys/devices/system/cpu/
    ls: /sys/devices/system/cpu/cpuidle: No such file or directory
    ls: /sys/devices/system/cpu/uevent: No such file or directory
    (...)

    ~ # echo /sys/devices/system/cpu/*
    /sys/devices/system/cpu/cpu0 /sys/devices/system/cpu/cpu1 (...)

    ~ # strace stat /sys/devices/system/cpu/cpu0
    lstat("/sys/devices/system/cpu/cpu0", 0x7ffdb2c57a00) = -1 ENOENT (No such file or directory)

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
21 months agoMerge pull request #544 from mchtech/cgoup-v2-cpu-count
Christian Brauner [Tue, 5 Jul 2022 05:25:32 +0000 (07:25 +0200)]
Merge pull request #544 from mchtech/cgoup-v2-cpu-count

cgroup v2: /sys/devices/system/cpu/online returns zero

21 months agocgroup v2: return cpuset cpu count when no quota is set
michuan [Mon, 4 Jul 2022 02:49:49 +0000 (10:49 +0800)]
cgroup v2: return cpuset cpu count when no quota is set
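
The behaviour named in the subject can be sketched as follows. This is an illustration under assumptions, not the code from this commit: it takes the contents of a cgroup v2 `cpu.max` file ("max <period>" when no quota is set, otherwise "<quota> <period>") and a cpuset list such as "0-3,8", and falls back to the cpuset CPU count when there is no quota. The function names and the choice to round the quota up are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Count CPUs in a list such as "0-3,8". Returns -1 on malformed input. */
static int cpuset_count(const char *list)
{
	const char *p = list;
	int count = 0;

	while (*p && *p != '\n') {
		char *end;
		long lo, hi;

		lo = strtol(p, &end, 10);
		if (end == p || lo < 0)
			return -1;
		hi = lo;

		if (*end == '-') {		/* range "lo-hi" */
			p = end + 1;
			hi = strtol(p, &end, 10);
			if (end == p || hi < lo)
				return -1;
		}
		count += (int)(hi - lo + 1);

		if (*end == ',')
			end++;
		else if (*end != '\0' && *end != '\n')
			return -1;
		p = end;
	}
	return count;
}

/*
 * Number of CPUs to expose: the cpuset count when cpu.max says "max"
 * (no quota), otherwise the quota rounded up to whole CPUs and capped
 * by the cpuset count. Returns -1 on malformed input.
 */
static int visible_cpus(const char *cpu_max, const char *cpuset)
{
	long long quota, period;
	int in_set = cpuset_count(cpuset);
	int by_quota;

	if (in_set < 0)
		return -1;

	if (strncmp(cpu_max, "max", 3) == 0)
		return in_set;

	if (sscanf(cpu_max, "%lld %lld", &quota, &period) != 2 ||
	    quota <= 0 || period <= 0)
		return -1;

	by_quota = (int)((quota + period - 1) / period);
	return by_quota < in_set ? by_quota : in_set;
}
```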

Signed-off-by: michuan <michu_an@126.com>
23 months agoMerge pull request #541 from bytedance/cleanup-getsize
Christian Brauner [Tue, 24 May 2022 12:31:06 +0000 (14:31 +0200)]
Merge pull request #541 from bytedance/cleanup-getsize

sysfs: cleanup sys_devices_system_cpu_online_getsize