.. ceph/doc/cephadm/troubleshooting.rst, imported with the 15.2.0 Octopus source

Troubleshooting
===============

Sometimes you need to investigate why a cephadm command failed or why
a specific service no longer runs properly.

Because cephadm deploys daemons as containers, troubleshooting them is slightly
different from troubleshooting daemons that run directly on the host. The tools
and commands below can help you investigate issues.

Gathering log files
-------------------

Use journalctl to gather the log files of all daemons.

.. note:: By default cephadm now stores logs in journald. This means
   that you will no longer find daemon logs in ``/var/log/ceph/``.

To read the log file of one specific daemon, run::

    cephadm logs --name <name-of-daemon>

Note: this only works when run on the host where the daemon is running. To
get the logs of a daemon running on a different host, add the ``--fsid`` option::

    cephadm logs --fsid <fsid> --name <name-of-daemon>

where ``<fsid>`` is the cluster ID printed by ``ceph status``.

To fetch the log files of all daemons on a given host, run::

    for name in $(cephadm ls | jq -r '.[].name') ; do
      cephadm logs --fsid <fsid> --name "$name" > "$name";
    done

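The loop above writes each log into the current directory under the bare
daemon name. A minimal sketch of a variant that collects the logs in one
directory with a ``.log`` suffix could look like the following. The helper
names ``log_path`` and ``collect_logs``, the default output directory, and the
``.log`` suffix are illustrative assumptions, not part of cephadm; only
``cephadm ls``, ``cephadm logs`` and ``jq`` come from the commands above.

.. code-block:: bash

    # Sketch only: log_path and collect_logs are hypothetical helper names.

    # Build the file name used for one daemon's log: <outdir>/<name>.log
    log_path() {
      local outdir="$1" name="$2"
      printf '%s/%s.log\n' "$outdir" "$name"
    }

    # Save the logs of every daemon that "cephadm ls" reports on this host.
    # $1 is the cluster fsid, $2 the output directory (default: ./ceph-logs).
    collect_logs() {
      local fsid="$1" outdir="${2:-./ceph-logs}" name
      mkdir -p "$outdir" || return 1
      cephadm ls | jq -r '.[].name' | while IFS= read -r name; do
        # </dev/null keeps the command from consuming the list of names.
        cephadm logs --fsid "$fsid" --name "$name" \
          > "$(log_path "$outdir" "$name")" </dev/null
      done
    }

    # Example call, run on the host whose daemons you want to inspect:
    #   collect_logs <fsid> /tmp/ceph-logs
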
Collecting systemd status
-------------------------

To print the state of a systemd unit, run::

    systemctl status "ceph-$(cephadm shell ceph fsid)@<service name>.service"


To fetch the state of all daemons on a given host, run::

    fsid="$(cephadm shell ceph fsid)"
    for name in $(cephadm ls | jq -r '.[].name') ; do
      systemctl status "ceph-$fsid@$name.service" > "$name";
    done


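Both commands above build the unit name from the pattern
``ceph-<fsid>@<name>.service``. As a sketch, that pattern can be wrapped in a
small helper so that it is written only once. ``unit_name`` and
``save_unit_status`` are hypothetical names, and the ``.status`` suffix is an
arbitrary choice made for this example.

.. code-block:: bash

    # Sketch only: unit_name and save_unit_status are hypothetical helper names.

    # Print the systemd unit name for one daemon of the cluster with this fsid.
    unit_name() {
      printf 'ceph-%s@%s.service\n' "$1" "$2"
    }

    # Write the status of every daemon listed by "cephadm ls" to <name>.status.
    save_unit_status() {
      local fsid="$1" name
      for name in $(cephadm ls | jq -r '.[].name'); do
        # systemctl status exits non-zero for units that are not active, so a
        # non-zero exit code is expected for stopped or failed daemons.
        systemctl status "$(unit_name "$fsid" "$name")" > "$name.status" || true
      done
    }

    # Example call:
    #   save_unit_status "$(cephadm shell ceph fsid)"
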
List all downloaded container images
------------------------------------

To list all container images that are downloaded on a host:

.. note:: ``Image`` might also be called ``ImageID``

::

    podman ps -a --format json | jq '.[].Image'
    "docker.io/library/centos:8"
    "registry.opensuse.org/opensuse/leap:15.2"


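Because the field may be named either ``Image`` or ``ImageID``, a ``jq``
filter can fall back from one to the other. The sketch below assumes the JSON
is an array of objects, as in the command above. ``image_names`` is a
hypothetical helper name, and ``sort -u`` is added only to drop duplicates.

.. code-block:: bash

    # Sketch only: image_names is a hypothetical helper name.

    # Read "podman ps -a --format json" output on stdin and print each image
    # once. The jq "//" operator yields .ImageID when .Image is missing or null.
    image_names() {
      jq -r '.[] | .Image // .ImageID' | sort -u
    }

    # Example call:
    #   podman ps -a --format json | image_names
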
Manually running containers
---------------------------

Cephadm writes small wrappers that each run a container. Refer to
``/var/lib/ceph/<cluster-fsid>/<service-name>/unit.run`` for the
container execution command.
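
As a sketch, the path pattern above can be filled in by a small helper, and
the wrapper can be printed for inspection before anything is run by hand.
``unit_run_path`` is a hypothetical name; the directory layout is the one
quoted above.

.. code-block:: bash

    # Sketch only: unit_run_path is a hypothetical helper name.

    # Print the path of the unit.run wrapper for one daemon.
    # $1 is the cluster fsid, $2 the name of the daemon's directory.
    unit_run_path() {
      printf '/var/lib/ceph/%s/%s/unit.run\n' "$1" "$2"
    }

    # Example call: show the container command without executing it.
    #   cat "$(unit_run_path <cluster-fsid> <service-name>)"
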