==================
Cephadm Operations
==================

Watching cephadm log messages
=============================

Cephadm logs to the ``cephadm`` cluster log channel, meaning you can
monitor progress in real time with::

  # ceph -W cephadm

By default it will show info-level events and above. To see
debug-level messages too::

  # ceph config set mgr mgr/cephadm/log_to_cluster_level debug
  # ceph -W cephadm --watch-debug

Be careful: the debug messages are very verbose!
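
When you are finished debugging, you can return the channel to the
default info level using the same configuration option::

  # ceph config set mgr mgr/cephadm/log_to_cluster_level info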

You can see recent events with::

  # ceph log last cephadm

These events are also logged to the ``ceph.cephadm.log`` file on
monitor hosts and to the monitor daemons' stderr.
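
To follow this file directly on a monitor host, something like the
following works (a sketch, assuming the log file lives under the
default data location described below)::

  tail -f /var/log/ceph/<cluster-fsid>/ceph.cephadm.log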


Ceph daemon logs
================

Logging to stdout
-----------------

Traditionally, Ceph daemons have logged to ``/var/log/ceph``. By
default, cephadm daemons log to stderr and the logs are captured by
the container runtime environment. On most systems these logs are
sent to journald by default and are accessible via ``journalctl``.

For example, to view the logs for the daemon ``mon.foo`` for a cluster
with ID ``5c5a50ae-272a-455d-99e9-32c6a013e694``, the command would be
something like::

  journalctl -u ceph-5c5a50ae-272a-455d-99e9-32c6a013e694@mon.foo

This works well for normal operations when logging levels are low.
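
To follow the same log output in real time, add ``journalctl``'s
``-f`` flag::

  journalctl -u ceph-5c5a50ae-272a-455d-99e9-32c6a013e694@mon.foo -f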

To disable logging to stderr::

  ceph config set global log_to_stderr false
  ceph config set global mon_cluster_log_to_stderr false

Logging to files
----------------

You can also configure Ceph daemons to log to files instead of stderr,
just as they have in the past. When logging to files, Ceph logs appear
in ``/var/log/ceph/<cluster-fsid>``.

To enable logging to files::

  ceph config set global log_to_file true
  ceph config set global mon_cluster_log_to_file true

We recommend disabling logging to stderr (see above) or else everything
will be logged twice::

  ceph config set global log_to_stderr false
  ceph config set global mon_cluster_log_to_stderr false

By default, cephadm sets up log rotation on each host to rotate these
files. You can configure the logging retention schedule by modifying
``/etc/logrotate.d/ceph.<cluster-fsid>``.
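
The generated file uses standard logrotate directives; an illustrative
stanza (the exact contents that cephadm writes may differ) might look
like this::

  /var/log/ceph/<cluster-fsid>/*.log {
      rotate 7
      daily
      compress
      missingok
  }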


Data location
=============

Cephadm places daemon data and logs in slightly different locations
than older versions of Ceph:

* ``/var/log/ceph/<cluster-fsid>`` contains all cluster logs. Note
  that by default cephadm logs via stderr and the container runtime,
  so these logs are normally not present.
* ``/var/lib/ceph/<cluster-fsid>`` contains all cluster daemon data
  (besides logs).
* ``/var/lib/ceph/<cluster-fsid>/<daemon-name>`` contains all data for
  an individual daemon.
* ``/var/lib/ceph/<cluster-fsid>/crash`` contains crash reports for
  the cluster.
* ``/var/lib/ceph/<cluster-fsid>/removed`` contains old daemon
  data directories for stateful daemons (e.g., monitor, prometheus)
  that have been removed by cephadm.

Disk usage
----------

Because a few Ceph daemons may store a significant amount of data in
``/var/lib/ceph`` (notably, the monitors and prometheus), we recommend
moving this directory to its own disk, partition, or logical volume so
that it does not fill up the root file system.
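
For example, to place ``/var/lib/ceph`` on its own logical volume
before bootstrapping a cluster, something like the following could be
used (a sketch; the volume group ``vg0`` and the size are placeholders
for your environment)::

  lvcreate -n ceph -L 100G vg0
  mkfs.xfs /dev/vg0/ceph
  mount /dev/vg0/ceph /var/lib/ceph

Add a corresponding ``/etc/fstab`` entry so that the mount persists
across reboots.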



SSH Configuration
=================

Cephadm uses SSH to connect to remote hosts. SSH uses a key to
authenticate with those hosts in a secure way.


Default behavior
----------------

Cephadm stores an SSH key in the monitor that is used to
connect to remote hosts. When the cluster is bootstrapped, this SSH
key is generated automatically and no additional configuration
is necessary.

A *new* SSH key can be generated with::

  ceph cephadm generate-key

The public portion of the SSH key can be retrieved with::

  ceph cephadm get-pub-key
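
For example, to authorize the cluster's key on a new host, you can
export the public key and install it for root there (a sketch; the
file name and target hostname are placeholders)::

  ceph cephadm get-pub-key > ~/ceph.pub
  ssh-copy-id -f -i ~/ceph.pub root@<hostname>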

The currently stored SSH key can be deleted with::

  ceph cephadm clear-key

You can make use of an existing key by directly importing it with::

  ceph config-key set mgr/cephadm/ssh_identity_key -i <key>
  ceph config-key set mgr/cephadm/ssh_identity_pub -i <pub>

You will then need to restart the mgr daemon to reload the
configuration with::

  ceph mgr fail
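
Putting this together, importing a freshly generated key pair might
look like this (a sketch; the key file names are arbitrary)::

  ssh-keygen -t rsa -f /tmp/cephadm_key -N ''
  ceph config-key set mgr/cephadm/ssh_identity_key -i /tmp/cephadm_key
  ceph config-key set mgr/cephadm/ssh_identity_pub -i /tmp/cephadm_key.pub
  ceph mgr fail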


Customizing the SSH configuration
---------------------------------

Cephadm generates an appropriate ``ssh_config`` file that is
used for connecting to remote hosts. This configuration looks
something like this::

  Host *
  User root
  StrictHostKeyChecking no
  UserKnownHostsFile /dev/null

There are two ways to customize this configuration for your environment:

#. Import a customized configuration file that will be stored
   by the monitor with (see the example after this list)::

     ceph cephadm set-ssh-config -i <ssh_config_file>

   To remove a customized SSH config and revert back to the default
   behavior::

     ceph cephadm clear-ssh-config

#. You can configure a file location for the SSH configuration file with::

     ceph config set mgr mgr/cephadm/ssh_config_file <path>

   We do *not recommend* this approach. The path name must be
   visible to *any* mgr daemon, and cephadm runs all daemons as
   containers. That means that the file must either be placed
   inside a customized container image for your deployment, or
   manually distributed to the mgr data directory
   (``/var/lib/ceph/<cluster-fsid>/mgr.<id>`` on the host, visible at
   ``/var/lib/ceph/mgr/ceph-<id>`` from inside the container).
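
As an example of the first approach, a customized configuration that
enables host key checking against a distributed known-hosts file might
look like this (illustrative only; adjust for your environment) before
importing it with ``ceph cephadm set-ssh-config``::

  Host *
  User root
  StrictHostKeyChecking yes
  UserKnownHostsFile /etc/ssh/ssh_known_hosts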


Health checks
=============
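
You can see which health checks (including the cephadm checks described
below) are currently raised, along with their details, with::

  ceph health detail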

CEPHADM_PAUSED
--------------

Cephadm background work has been paused with ``ceph orch pause``. Cephadm
continues to perform passive monitoring activities (like checking
host and daemon status), but it will not make any changes (like deploying
or removing daemons).

Resume cephadm work with::

  ceph orch resume

CEPHADM_STRAY_HOST
------------------

One or more hosts have running Ceph daemons but are not registered as
hosts managed by *cephadm*. This means that those services cannot
currently be managed by cephadm (e.g., restarted, upgraded, included
in ``ceph orch ps``).

You can manage the host(s) with::

  ceph orch host add *<hostname>*

Note that you may need to configure SSH access to the remote host
before this will work.

Alternatively, you can manually connect to the host and ensure that
services on that host are removed or migrated to a host that is
managed by *cephadm*.

You can also disable this warning entirely with::

  ceph config set mgr mgr/cephadm/warn_on_stray_hosts false

CEPHADM_STRAY_DAEMON
--------------------

One or more Ceph daemons are running but are not managed by
*cephadm*. This may be because they were deployed using a different
tool, or because they were started manually. Those
services cannot currently be managed by cephadm (e.g., restarted,
upgraded, or included in ``ceph orch ps``).

If the daemon is a stateful one (monitor or OSD), it should be adopted
by cephadm; see :ref:`cephadm-adoption`. For stateless daemons, it is
usually easiest to provision a new daemon with the ``ceph orch apply``
command and then stop the unmanaged daemon.
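
For example, to replace a manually started MDS daemon, one might first
deploy a managed MDS through the orchestrator and then stop the legacy
daemon by hand (a sketch; the file system name and systemd unit name
are placeholders for your deployment)::

  ceph orch apply mds myfs --placement=1
  systemctl stop ceph-mds@myhost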

This warning can be disabled entirely with::

  ceph config set mgr mgr/cephadm/warn_on_stray_daemons false

CEPHADM_HOST_CHECK_FAILED
-------------------------

One or more hosts have failed the basic cephadm host check, which verifies
that (1) the host is reachable and cephadm can be executed there, and (2)
the host satisfies basic prerequisites, like a working container
runtime (podman or docker) and working time synchronization.
If this check fails, cephadm will not be able to manage services on that host.

You can manually run this check with::

  ceph cephadm check-host *<hostname>*

You can remove a broken host from management with::

  ceph orch host rm *<hostname>*

You can disable this health warning with::

  ceph config set mgr mgr/cephadm/warn_on_failed_host_check false