==================
Cephadm Operations
==================

Watching cephadm log messages
=============================

Cephadm logs to the ``cephadm`` cluster log channel, meaning you can
monitor progress in real time with::

  # ceph -W cephadm

By default it will show info-level events and above. To see
debug-level messages too::

  # ceph config set mgr mgr/cephadm/log_to_cluster_level debug
  # ceph -W cephadm --watch-debug

Be careful: the debug messages are very verbose!
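
When you are finished debugging, you can return the channel to the
default info level using the same configuration option::

  # ceph config set mgr mgr/cephadm/log_to_cluster_level info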

You can see recent events with::

  # ceph log last cephadm

These events are also logged to the ``ceph.cephadm.log`` file on
monitor hosts and to the monitor daemons' stderr.
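
To follow this file directly on a monitor host, something like the
following works (a sketch, assuming the log file lives under the
default data location described below)::

  tail -f /var/log/ceph/<cluster-fsid>/ceph.cephadm.log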


Ceph daemon logs
================

Logging to stdout
-----------------

Traditionally, Ceph daemons have logged to ``/var/log/ceph``. By
default, cephadm daemons log to stderr and the logs are captured by
the container runtime environment. On most systems these logs are
sent to journald by default and are accessible via ``journalctl``.

For example, to view the logs for the daemon ``mon.foo`` for a cluster
with ID ``5c5a50ae-272a-455d-99e9-32c6a013e694``, the command would be
something like::

  journalctl -u ceph-5c5a50ae-272a-455d-99e9-32c6a013e694@mon.foo

This works well for normal operations when logging levels are low.
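
To follow the same log output in real time, add ``journalctl``'s
``-f`` flag::

  journalctl -u ceph-5c5a50ae-272a-455d-99e9-32c6a013e694@mon.foo -f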

To disable logging to stderr::

  ceph config set global log_to_stderr false
  ceph config set global mon_cluster_log_to_stderr false

Logging to files
----------------

You can also configure Ceph daemons to log to files instead of stderr,
just as they have in the past. When logging to files, Ceph logs appear
in ``/var/log/ceph/<cluster-fsid>``.

To enable logging to files::

  ceph config set global log_to_file true
  ceph config set global mon_cluster_log_to_file true

We recommend disabling logging to stderr (see above) or else everything
will be logged twice::

  ceph config set global log_to_stderr false
  ceph config set global mon_cluster_log_to_stderr false

By default, cephadm sets up log rotation on each host to rotate these
files. You can configure the logging retention schedule by modifying
``/etc/logrotate.d/ceph.<cluster-fsid>``.
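
The generated file uses standard logrotate directives; an illustrative
stanza (the exact contents that cephadm writes may differ) might look
like this::

  /var/log/ceph/<cluster-fsid>/*.log {
      rotate 7
      daily
      compress
      missingok
  }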


Data location
=============

Cephadm places daemon data and logs in slightly different locations
than older versions of Ceph:

* ``/var/log/ceph/<cluster-fsid>`` contains all cluster logs. Note
  that by default cephadm logs via stderr and the container runtime,
  so these logs are normally not present.
* ``/var/lib/ceph/<cluster-fsid>`` contains all cluster daemon data
  (besides logs).
* ``/var/lib/ceph/<cluster-fsid>/<daemon-name>`` contains all data for
  an individual daemon.
* ``/var/lib/ceph/<cluster-fsid>/crash`` contains crash reports for
  the cluster.
* ``/var/lib/ceph/<cluster-fsid>/removed`` contains old daemon
  data directories for stateful daemons (e.g., monitor, prometheus)
  that have been removed by cephadm.

Disk usage
----------

Because a few Ceph daemons may store a significant amount of data in
``/var/lib/ceph`` (notably, the monitors and prometheus), we recommend
moving this directory to its own disk, partition, or logical volume so
that it does not fill up the root file system.
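
For example, to place ``/var/lib/ceph`` on its own logical volume
before bootstrapping a cluster, something like the following could be
used (a sketch; the volume group ``vg0`` and the size are placeholders
for your environment)::

  lvcreate -n ceph -L 100G vg0
  mkfs.xfs /dev/vg0/ceph
  mount /dev/vg0/ceph /var/lib/ceph

Add a corresponding ``/etc/fstab`` entry so that the mount persists
across reboots.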



SSH Configuration
=================

Cephadm uses SSH to connect to remote hosts. SSH uses a key to
authenticate with those hosts in a secure way.


Default behavior
----------------

Cephadm stores an SSH key in the monitor that is used to
connect to remote hosts. When the cluster is bootstrapped, this SSH
key is generated automatically and no additional configuration
is necessary.

A *new* SSH key can be generated with::

  ceph cephadm generate-key

The public portion of the SSH key can be retrieved with::

  ceph cephadm get-pub-key
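
For example, to authorize the cluster's key on a new host, you can
export the public key and install it for root there (a sketch; the
file name and target hostname are placeholders)::

  ceph cephadm get-pub-key > ~/ceph.pub
  ssh-copy-id -f -i ~/ceph.pub root@<hostname>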

The currently stored SSH key can be deleted with::

  ceph cephadm clear-key

You can make use of an existing key by directly importing it with::

  ceph config-key set mgr/cephadm/ssh_identity_key -i <key>
  ceph config-key set mgr/cephadm/ssh_identity_pub -i <pub>

You will then need to restart the mgr daemon to reload the
configuration with::

  ceph mgr fail
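
Putting this together, importing a freshly generated key pair might
look like this (a sketch; the key file names are arbitrary)::

  ssh-keygen -t rsa -f /tmp/cephadm_key -N ''
  ceph config-key set mgr/cephadm/ssh_identity_key -i /tmp/cephadm_key
  ceph config-key set mgr/cephadm/ssh_identity_pub -i /tmp/cephadm_key.pub
  ceph mgr fail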


Customizing the SSH configuration
---------------------------------

Cephadm generates an appropriate ``ssh_config`` file that is
used for connecting to remote hosts. This configuration looks
something like this::

  Host *
  User root
  StrictHostKeyChecking no
  UserKnownHostsFile /dev/null

There are two ways to customize this configuration for your environment:

#. Import a customized configuration file that will be stored
   by the monitor with (see the example after this list)::

     ceph cephadm set-ssh-config -i <ssh_config_file>

   To remove a customized SSH config and revert back to the default
   behavior::

     ceph cephadm clear-ssh-config

#. You can configure a file location for the SSH configuration file with::

     ceph config set mgr mgr/cephadm/ssh_config_file <path>

   We do *not recommend* this approach. The path name must be
   visible to *any* mgr daemon, and cephadm runs all daemons as
   containers. That means that the file must either be placed
   inside a customized container image for your deployment, or
   manually distributed to the mgr data directory
   (``/var/lib/ceph/<cluster-fsid>/mgr.<id>`` on the host, visible at
   ``/var/lib/ceph/mgr/ceph-<id>`` from inside the container).
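
As an example of the first approach, a customized configuration that
enables host key checking against a distributed known-hosts file might
look like this (illustrative only; adjust for your environment) before
importing it with ``ceph cephadm set-ssh-config``::

  Host *
  User root
  StrictHostKeyChecking yes
  UserKnownHostsFile /etc/ssh/ssh_known_hosts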


Health checks
=============
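
You can see which health checks (including the cephadm checks described
below) are currently raised, along with their details, with::

  ceph health detail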

CEPHADM_PAUSED
--------------

Cephadm background work has been paused with ``ceph orch pause``. Cephadm
continues to perform passive monitoring activities (like checking
host and daemon status), but it will not make any changes (like deploying
or removing daemons).

Resume cephadm work with::

  ceph orch resume

CEPHADM_STRAY_HOST
------------------

One or more hosts have running Ceph daemons but are not registered as
hosts managed by *cephadm*. This means that those services cannot
currently be managed by cephadm (e.g., restarted, upgraded, included
in ``ceph orch ps``).

You can manage the host(s) with::

  ceph orch host add *<hostname>*

Note that you may need to configure SSH access to the remote host
before this will work.

Alternatively, you can manually connect to the host and ensure that
services on that host are removed or migrated to a host that is
managed by *cephadm*.

You can also disable this warning entirely with::

  ceph config set mgr mgr/cephadm/warn_on_stray_hosts false

CEPHADM_STRAY_DAEMON
--------------------

One or more Ceph daemons are running but are not managed by
*cephadm*. This may be because they were deployed using a different
tool, or because they were started manually. Those
services cannot currently be managed by cephadm (e.g., restarted,
upgraded, or included in ``ceph orch ps``).

If the daemon is a stateful one (monitor or OSD), it should be adopted
by cephadm; see :ref:`cephadm-adoption`. For stateless daemons, it is
usually easiest to provision a new daemon with the ``ceph orch apply``
command and then stop the unmanaged daemon.
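
For example, to replace a manually started MDS daemon, one might first
deploy a managed MDS through the orchestrator and then stop the legacy
daemon by hand (a sketch; the file system name and systemd unit name
are placeholders for your deployment)::

  ceph orch apply mds myfs --placement=1
  systemctl stop ceph-mds@myhost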

This warning can be disabled entirely with::

  ceph config set mgr mgr/cephadm/warn_on_stray_daemons false

CEPHADM_HOST_CHECK_FAILED
-------------------------

One or more hosts have failed the basic cephadm host check, which verifies
that (1) the host is reachable and cephadm can be executed there, and (2)
the host satisfies basic prerequisites, like a working container
runtime (podman or docker) and working time synchronization.
If this check fails, cephadm will not be able to manage services on that host.

You can manually run this check with::

  ceph cephadm check-host *<hostname>*

You can remove a broken host from management with::

  ceph orch host rm *<hostname>*

You can disable this health warning with::

  ceph config set mgr mgr/cephadm/warn_on_failed_host_check false