==================
Cephadm Operations
==================

Watching cephadm log messages
=============================

Cephadm logs to the ``cephadm`` cluster log channel, meaning you can
monitor progress in real time with::

  # ceph -W cephadm

By default it will show info-level events and above. To also see
debug-level messages::

  # ceph config set mgr mgr/cephadm/log_to_cluster_level debug
  # ceph -W cephadm --watch-debug

Be careful: the debug messages are very verbose!
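
Once you are done debugging, you can return the cluster-log level to the
quieter ``info`` level using the same option (a small usage example)::

  # ceph config set mgr mgr/cephadm/log_to_cluster_level info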

You can see recent events with::

  # ceph log last cephadm

These events are also logged to the ``ceph.cephadm.log`` file on
monitor hosts and to the monitor daemons' stderr.


Ceph daemon logs
================

Logging to stdout
-----------------

Traditionally, Ceph daemons have logged to ``/var/log/ceph``. By
default, cephadm daemons log to stderr and the logs are
captured by the container runtime environment. For most systems, by
default, these logs are sent to journald and accessible via
``journalctl``.

For example, to view the logs for the daemon ``mon.foo`` for a cluster
with ID ``5c5a50ae-272a-455d-99e9-32c6a013e694``, the command would be
something like::

  journalctl -u ceph-5c5a50ae-272a-455d-99e9-32c6a013e694@mon.foo

This works well for normal operations when logging levels are low.
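
If you do not know your cluster's ID, you can look it up with ``ceph fsid``
and then follow a daemon's log live with journalctl's ``-f`` flag (a minimal
sketch; substitute your own fsid and daemon name)::

  # ceph fsid
  # journalctl -u ceph-<cluster-fsid>@mon.foo -f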

To disable logging to stderr::

  ceph config set global log_to_stderr false
  ceph config set global mon_cluster_log_to_stderr false

Logging to files
----------------

You can also configure Ceph daemons to log to files instead of stderr,
just as they have in the past. When logging to files, Ceph logs appear
in ``/var/log/ceph/<cluster-fsid>``.

To enable logging to files::

  ceph config set global log_to_file true
  ceph config set global mon_cluster_log_to_file true

We recommend disabling logging to stderr (see above), or else everything
will be logged twice::

  ceph config set global log_to_stderr false
  ceph config set global mon_cluster_log_to_stderr false

By default, cephadm sets up log rotation on each host to rotate these
files. You can configure the logging retention schedule by modifying
``/etc/logrotate.d/ceph.<cluster-fsid>``.
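
After editing the schedule, you can preview what a rotation run would do
without actually rotating anything by using logrotate's debug mode (a usage
sketch; substitute your cluster fsid)::

  logrotate --debug /etc/logrotate.d/ceph.<cluster-fsid>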


Data location
=============

Cephadm stores daemon data and logs in slightly different locations than
older versions of Ceph:

* ``/var/log/ceph/<cluster-fsid>`` contains all cluster logs. Note
  that by default cephadm logs via stderr and the container runtime,
  so these logs are normally not present.
* ``/var/lib/ceph/<cluster-fsid>`` contains all cluster daemon data
  (besides logs).
* ``/var/lib/ceph/<cluster-fsid>/<daemon-name>`` contains all data for
  an individual daemon.
* ``/var/lib/ceph/<cluster-fsid>/crash`` contains crash reports for
  the cluster.
* ``/var/lib/ceph/<cluster-fsid>/removed`` contains old daemon
  data directories for stateful daemons (e.g., monitor, prometheus)
  that have been removed by cephadm.
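
For example, to see which daemons have data directories on a given host (a
trivial illustration; the path depends on your cluster fsid)::

  ls /var/lib/ceph/<cluster-fsid>/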

Disk usage
----------

Because a few Ceph daemons may store a significant amount of data in
``/var/lib/ceph`` (notably, the monitors and prometheus), we recommend
moving this directory to its own disk, partition, or logical volume so
that it does not fill up the root file system.
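
To see how much space each daemon's data directory is consuming, standard
tools suffice (a quick sketch)::

  du -sh /var/lib/ceph/<cluster-fsid>/*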


SSH Configuration
=================

Cephadm uses SSH to connect to remote hosts. SSH uses a key to authenticate
with those hosts in a secure way.


Default behavior
----------------

Cephadm stores an SSH key in the monitor that is used to
connect to remote hosts. When the cluster is bootstrapped, this SSH
key is generated automatically and no additional configuration
is necessary.

A *new* SSH key can be generated with::

  ceph cephadm generate-key

The public portion of the SSH key can be retrieved with::

  ceph cephadm get-pub-key
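
For example, to install the cluster's public key on a new host so that
cephadm can log in as root there (a sketch using standard OpenSSH tooling)::

  ceph cephadm get-pub-key > ~/ceph.pub
  ssh-copy-id -f -i ~/ceph.pub root@<new-host>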

The currently stored SSH key can be deleted with::

  ceph cephadm clear-key

You can make use of an existing key by directly importing it with::

  ceph config-key set mgr/cephadm/ssh_identity_key -i <key>
  ceph config-key set mgr/cephadm/ssh_identity_pub -i <pub>

You will then need to restart the mgr daemon to reload the configuration with::

  ceph mgr fail
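
Putting those steps together, importing a key pair that you generate
yourself might look like this (a hedged sketch; the ``ed25519`` key type and
the ``ceph_ssh_key`` file name are arbitrary choices, not defaults)::

  ssh-keygen -t ed25519 -N "" -f ceph_ssh_key
  ceph config-key set mgr/cephadm/ssh_identity_key -i ceph_ssh_key
  ceph config-key set mgr/cephadm/ssh_identity_pub -i ceph_ssh_key.pub
  ceph mgr fail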


Customizing the SSH configuration
---------------------------------

Cephadm generates an appropriate ``ssh_config`` file that is
used for connecting to remote hosts. This configuration looks
something like this::

  Host *
  User root
  StrictHostKeyChecking no
  UserKnownHostsFile /dev/null

There are two ways to customize this configuration for your environment:

#. Import a customized configuration file that will be stored
   by the monitor with::

     ceph cephadm set-ssh-config -i <ssh_config_file>

   To remove a customized SSH config and revert back to the default behavior::

     ceph cephadm clear-ssh-config

   A sample customized file is shown after this list.

#. You can configure a file location for the SSH configuration file with::

     ceph config set mgr mgr/cephadm/ssh_config_file <path>

   We do *not recommend* this approach. The path name must be
   visible to *any* mgr daemon, and cephadm runs all daemons as
   containers. That means that the file either needs to be placed
   inside a customized container image for your deployment, or
   manually distributed to the mgr data directory
   (``/var/lib/ceph/<cluster-fsid>/mgr.<id>`` on the host, visible at
   ``/var/lib/ceph/mgr/ceph-<id>`` from inside the container).
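
As an example of the first approach, a customized file might enable strict
host key checking against a ``known_hosts`` file that you distribute
yourself (a hypothetical example; the ``known_hosts`` path is an assumption,
not a cephadm default, and must be visible from inside the mgr container)::

  Host *
  User root
  StrictHostKeyChecking yes
  UserKnownHostsFile /var/lib/ceph/known_hosts

Import it with ``ceph cephadm set-ssh-config -i <ssh_config_file>`` as
described above.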


Health checks
=============

CEPHADM_PAUSED
--------------

Cephadm background work has been paused with ``ceph orch pause``. Cephadm
continues to perform passive monitoring activities (like checking
host and daemon status), but it will not make any changes (like deploying
or removing daemons).

Resume cephadm work with::

  ceph orch resume

CEPHADM_STRAY_HOST
------------------

One or more hosts have running Ceph daemons but are not registered as
hosts managed by *cephadm*. This means that those services cannot
currently be managed by cephadm (e.g., restarted, upgraded, or included
in `ceph orch ps`).

You can manage the host(s) with::

  ceph orch host add *<hostname>*

Note that you may need to configure SSH access to the remote host
before this will work.
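
One way to do that is to install the cluster's public SSH key in the root
user's ``authorized_keys`` file on that host (a sketch; see the SSH
configuration section above)::

  ceph cephadm get-pub-key | ssh root@<hostname> 'cat >> ~/.ssh/authorized_keys'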

Alternatively, you can manually connect to the host and ensure that
services on that host are removed or migrated to a host that is
managed by *cephadm*.

You can also disable this warning entirely with::

  ceph config set mgr mgr/cephadm/warn_on_stray_hosts false

CEPHADM_STRAY_DAEMON
--------------------

One or more Ceph daemons are running but are not managed by
*cephadm*. This may be because they were deployed using a different
tool, or because they were started manually. Those
services cannot currently be managed by cephadm (e.g., restarted,
upgraded, or included in `ceph orch ps`).

If the daemon is a stateful one (monitor or OSD), it should be adopted
by cephadm; see :ref:`cephadm-adoption`. For stateless daemons, it is
usually easiest to provision a new daemon with the ``ceph orch apply``
command and then stop the unmanaged daemon.
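
For example, replacing an unmanaged MDS might look like this (a hedged
illustration; the file system name and the legacy systemd unit name are
placeholders that will differ on your system)::

  ceph orch apply mds <fs-name>
  # then, on the host running the unmanaged daemon:
  systemctl stop ceph-mds@<id>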

This warning can be disabled entirely with::

  ceph config set mgr mgr/cephadm/warn_on_stray_daemons false

CEPHADM_HOST_CHECK_FAILED
-------------------------

One or more hosts have failed the basic cephadm host check, which verifies
that (1) the host is reachable and cephadm can be executed there, and (2)
that the host satisfies basic prerequisites, like a working container
runtime (podman or docker) and working time synchronization.
If this test fails, cephadm will not be able to manage services on that host.

You can manually run this check with::

  ceph cephadm check-host *<hostname>*

You can remove a broken host from management with::

  ceph orch host rm *<hostname>*

You can disable this health warning with::

  ceph config set mgr mgr/cephadm/warn_on_failed_host_check false