==================
Cephadm Operations
==================

Watching cephadm log messages
=============================

Cephadm logs to the ``cephadm`` cluster log channel, meaning you can
monitor progress in realtime with::

  # ceph -W cephadm

By default it will show info-level events and above. To see
debug-level messages too::

  # ceph config set mgr mgr/cephadm/log_to_cluster_level debug
  # ceph -W cephadm --watch-debug

Be careful: the debug messages are very verbose!

You can see recent events with::

  # ceph log last cephadm

These events are also logged to the ``ceph.cephadm.log`` file on
monitor hosts and to the monitor daemons' stderr.


.. _cephadm-logs:

Ceph daemon logs
================

Logging to stdout
-----------------

Traditionally, Ceph daemons have logged to ``/var/log/ceph``. By
default, cephadm daemons log to stderr and the logs are
captured by the container runtime environment. For most systems, by
default, these logs are sent to journald and accessible via
``journalctl``.

For example, to view the logs for the daemon ``mon.foo`` for a cluster
with ID ``5c5a50ae-272a-455d-99e9-32c6a013e694``, the command would be
something like::

  journalctl -u ceph-5c5a50ae-272a-455d-99e9-32c6a013e694@mon.foo

This works well for normal operations when logging levels are low.
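Since the unit name is simply ``ceph-<cluster-fsid>@<daemon-name>``, the ``journalctl`` invocation can be composed from the cluster fsid. A minimal sketch — the fsid and daemon name below are placeholders; on a live host they come from ``ceph fsid`` and ``ceph orch ps``:

```shell
# Compose the systemd/journald unit name for a cephadm-managed daemon.
# The fsid and daemon name are placeholders: on a real host, obtain them
# with `ceph fsid` and `ceph orch ps` respectively.
fsid=5c5a50ae-272a-455d-99e9-32c6a013e694
daemon=mon.foo
unit="ceph-${fsid}@${daemon}"
echo "journalctl -u ${unit}"
```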

To disable logging to stderr::

  ceph config set global log_to_stderr false
  ceph config set global mon_cluster_log_to_stderr false

Logging to files
----------------

You can also configure Ceph daemons to log to files instead of stderr,
just like they did in the past. When logging to files, Ceph logs appear
in ``/var/log/ceph/<cluster-fsid>``.

To enable logging to files::

  ceph config set global log_to_file true
  ceph config set global mon_cluster_log_to_file true

We recommend disabling logging to stderr (see above), or else everything
will be logged twice::

  ceph config set global log_to_stderr false
  ceph config set global mon_cluster_log_to_stderr false

By default, cephadm sets up log rotation on each host to rotate these
files. You can configure the logging retention schedule by modifying
``/etc/logrotate.d/ceph.<cluster-fsid>``.
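
The file cephadm installs there is a standard logrotate policy. The exact
contents vary by release, so the following is only a rough sketch of what
such a policy may look like, not necessarily what cephadm writes:

```
/var/log/ceph/<cluster-fsid>/*.log {
    rotate 7
    daily
    compress
    sharedscripts
    missingok
    notifempty
}
```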


Data location
=============

Cephadm stores daemon data and logs in slightly different locations than
older versions of Ceph:

* ``/var/log/ceph/<cluster-fsid>`` contains all cluster logs. Note
  that by default cephadm logs via stderr and the container runtime,
  so these logs are normally not present.
* ``/var/lib/ceph/<cluster-fsid>`` contains all cluster daemon data
  (besides logs).
* ``/var/lib/ceph/<cluster-fsid>/<daemon-name>`` contains all data for
  an individual daemon.
* ``/var/lib/ceph/<cluster-fsid>/crash`` contains crash reports for
  the cluster.
* ``/var/lib/ceph/<cluster-fsid>/removed`` contains old daemon
  data directories for stateful daemons (e.g., monitor, prometheus)
  that have been removed by cephadm.
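
The per-daemon paths in the list above follow directly from the cluster
fsid and the daemon name; a small sketch (placeholder values, as before —
use ``ceph fsid`` and ``ceph orch ps`` on a real host):

```shell
# Compose cephadm data paths for one daemon (placeholder fsid/daemon).
fsid=5c5a50ae-272a-455d-99e9-32c6a013e694
daemon=mon.foo
daemon_dir="/var/lib/ceph/${fsid}/${daemon}"
crash_dir="/var/lib/ceph/${fsid}/crash"
echo "${daemon_dir}"
echo "${crash_dir}"
```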

Disk usage
----------

Because a few Ceph daemons may store a significant amount of data in
``/var/lib/ceph`` (notably, the monitors and prometheus), we recommend
moving this directory to its own disk, partition, or logical volume so
that it does not fill up the root file system.


Health checks
=============

The cephadm module provides additional healthchecks to supplement the
default healthchecks provided by the cluster. These additional
healthchecks fall into two categories:

- **cephadm operations**: Healthchecks in this category are always
  executed when the cephadm module is active.
- **cluster configuration**: These healthchecks are *optional*, and focus
  on the configuration of the hosts in the cluster.

CEPHADM Operations
------------------

CEPHADM_PAUSED
^^^^^^^^^^^^^^

Cephadm background work has been paused with ``ceph orch pause``. Cephadm
continues to perform passive monitoring activities (like checking
host and daemon status), but it will not make any changes (like deploying
or removing daemons).

Resume cephadm work with::

  ceph orch resume

.. _cephadm-stray-host:

CEPHADM_STRAY_HOST
^^^^^^^^^^^^^^^^^^

One or more hosts have running Ceph daemons but are not registered as
hosts managed by *cephadm*. This means that those services cannot
currently be managed by cephadm (e.g., restarted, upgraded, included
in ``ceph orch ps``).

You can manage the host(s) with::

  ceph orch host add *<hostname>*

Note that you may need to configure SSH access to the remote host
before this will work.

Alternatively, you can manually connect to the host and ensure that
services on that host are removed or migrated to a host that is
managed by *cephadm*.

You can also disable this warning entirely with::

  ceph config set mgr mgr/cephadm/warn_on_stray_hosts false

See :ref:`cephadm-fqdn` for more information about host names and
domain names.

CEPHADM_STRAY_DAEMON
^^^^^^^^^^^^^^^^^^^^

One or more Ceph daemons are running but are not managed by
*cephadm*. This may be because they were deployed using a different
tool, or because they were started manually. Those
services cannot currently be managed by cephadm (e.g., restarted,
upgraded, or included in ``ceph orch ps``).

If the daemon is a stateful one (monitor or OSD), it should be adopted
by cephadm; see :ref:`cephadm-adoption`. For stateless daemons, it is
usually easiest to provision a new daemon with the ``ceph orch apply``
command and then stop the unmanaged daemon.

This warning can be disabled entirely with::

  ceph config set mgr mgr/cephadm/warn_on_stray_daemons false

CEPHADM_HOST_CHECK_FAILED
^^^^^^^^^^^^^^^^^^^^^^^^^

One or more hosts have failed the basic cephadm host check, which verifies
that (1) the host is reachable and cephadm can be executed there, and (2)
that the host satisfies basic prerequisites, like a working container
runtime (podman or docker) and working time synchronization.
If this test fails, cephadm will not be able to manage services on that host.

You can manually run this check with::

  ceph cephadm check-host *<hostname>*

You can remove a broken host from management with::

  ceph orch host rm *<hostname>*

You can disable this health warning with::

  ceph config set mgr mgr/cephadm/warn_on_failed_host_check false

Cluster Configuration Checks
----------------------------

Cephadm periodically scans each of the hosts in the cluster to understand
the state of the OS, disks, NICs, etc. These facts can then be analysed
for consistency across the hosts in the cluster to identify any
configuration anomalies.

The configuration checks are an **optional** feature, enabled by the
following command::

  ceph config set mgr mgr/cephadm/config_checks_enabled true

The configuration checks are triggered after each host scan (1m). The
cephadm log entries will show the current state and outcome of the
configuration checks as follows.

Disabled state (config_checks_enabled false)::

  ALL cephadm checks are disabled, use 'ceph config set mgr mgr/cephadm/config_checks_enabled true' to enable

Enabled state (config_checks_enabled true)::

  CEPHADM 8/8 checks enabled and executed (0 bypassed, 0 disabled). No issues detected

The configuration checks themselves are managed through several cephadm
sub-commands.

To determine whether the configuration checks are enabled, run::

  ceph cephadm config-check status

This command will return the status of the configuration checker as
either "Enabled" or "Disabled".

To list all the configuration checks and their current state::

  ceph cephadm config-check ls

For example::

  NAME             HEALTHCHECK                      STATUS   DESCRIPTION
  kernel_security  CEPHADM_CHECK_KERNEL_LSM         enabled  checks SELINUX/Apparmor profiles are consistent across cluster hosts
  os_subscription  CEPHADM_CHECK_SUBSCRIPTION       enabled  checks subscription states are consistent for all cluster hosts
  public_network   CEPHADM_CHECK_PUBLIC_MEMBERSHIP  enabled  check that all hosts have a NIC on the Ceph public_network
  osd_mtu_size     CEPHADM_CHECK_MTU                enabled  check that OSD hosts share a common MTU setting
  osd_linkspeed    CEPHADM_CHECK_LINKSPEED          enabled  check that OSD hosts share a common linkspeed
  network_missing  CEPHADM_CHECK_NETWORK_MISSING    enabled  checks that the cluster/public networks defined exist on the Ceph hosts
  ceph_release     CEPHADM_CHECK_CEPH_RELEASE       enabled  check for Ceph version consistency - ceph daemons should be on the same release (unless upgrade is active)
  kernel_version   CEPHADM_CHECK_KERNEL_VERSION     enabled  checks that the MAJ.MIN of the kernel on Ceph hosts is consistent

The name of each configuration check can then be used to enable or
disable a specific check::

  ceph cephadm config-check disable <name>

For example::

  ceph cephadm config-check disable kernel_security

CEPHADM_CHECK_KERNEL_LSM
^^^^^^^^^^^^^^^^^^^^^^^^
Each host within the cluster is expected to operate within the same Linux
Security Module (LSM) state. For example, if the majority of the hosts are
running with SELINUX in enforcing mode, any host not running in this mode
would be flagged as an anomaly and a healthcheck (WARNING) state raised.

CEPHADM_CHECK_SUBSCRIPTION
^^^^^^^^^^^^^^^^^^^^^^^^^^
This check relates to the status of vendor subscription. This check is
only performed for hosts using RHEL, but helps to confirm that all your
hosts are covered by an active subscription so patches and updates are
available.

CEPHADM_CHECK_PUBLIC_MEMBERSHIP
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
All members of the cluster should have NICs configured on at least one of
the public network subnets. Hosts that are not on the public network will
rely on routing, which may affect performance.

CEPHADM_CHECK_MTU
^^^^^^^^^^^^^^^^^
The MTU of the NICs on OSDs can be a key factor in consistent performance.
This check examines hosts that are running OSD services to ensure that the
MTU is configured consistently within the cluster. This is determined by
establishing the MTU setting that the majority of hosts are using, with
any anomalies resulting in a Ceph healthcheck.

CEPHADM_CHECK_LINKSPEED
^^^^^^^^^^^^^^^^^^^^^^^
Similar to the MTU check, linkspeed consistency is also a factor in
consistent cluster performance. This check determines the linkspeed shared
by the majority of OSD hosts, resulting in a healthcheck for any hosts
that are set at a lower linkspeed rate.

CEPHADM_CHECK_NETWORK_MISSING
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The public_network and cluster_network settings support subnet definitions
for IPv4 and IPv6. If these settings are not found on any host in the
cluster, a healthcheck is raised.

CEPHADM_CHECK_CEPH_RELEASE
^^^^^^^^^^^^^^^^^^^^^^^^^^
Under normal operations, the Ceph cluster should be running daemons under
the same Ceph release (i.e., all pacific). This check looks at the active
release for each daemon, and reports any anomalies as a healthcheck.
*This check is bypassed if an upgrade process is active within the
cluster.*

CEPHADM_CHECK_KERNEL_VERSION
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The OS kernel version (maj.min) is checked for consistency across the
hosts. Once again, the majority of the hosts is used as the basis of
identifying anomalies.

Client keyrings and configs
===========================

Cephadm can distribute copies of the ``ceph.conf`` and client keyring
files to hosts. For example, it is usually a good idea to store a
copy of the config and ``client.admin`` keyring on any hosts that will
be used to administer the cluster via the CLI. By default, cephadm will do
this for any nodes with the ``_admin`` label (which normally includes the
bootstrap host).

When a client keyring is placed under management, cephadm will:

- build a list of target hosts based on the specified placement spec (see
  :ref:`orchestrator-cli-placement-spec`)
- store a copy of the ``/etc/ceph/ceph.conf`` file on the specified host(s)
- store a copy of the keyring file on the specified host(s)
- update the ``ceph.conf`` file as needed (e.g., due to a change in the
  cluster monitors)
- update the keyring file if the entity's key is changed (e.g., via
  ``ceph auth ...`` commands)
- ensure the keyring file has the specified ownership and mode
- remove the keyring file when client keyring management is disabled
- remove the keyring file from old hosts if the keyring placement spec is
  updated (as needed)

To view which client keyrings are currently under management::

  ceph orch client-keyring ls

To place a keyring under management::

  ceph orch client-keyring set <entity> <placement> [--mode=<mode>] [--owner=<uid>.<gid>] [--path=<path>]

- By default, the *path* will be ``/etc/ceph/client.{entity}.keyring``,
  which is where Ceph looks by default. Be careful when specifying
  alternate locations, as existing files may be overwritten.
- A placement of ``*`` (all hosts) is common.
- The mode defaults to ``0600`` and ownership to ``0:0`` (user root, group root).

For example, to create a ``client.rbd`` key, deploy it to hosts with the
``rbd-client`` label, and make it group-readable by uid/gid 107 (qemu)::

  ceph auth get-or-create-key client.rbd mon 'profile rbd' mgr 'profile rbd' osd 'profile rbd pool=my_rbd_pool'
  ceph orch client-keyring set client.rbd label:rbd-client --owner 107:107 --mode 640

The resulting keyring file is::

  -rw-r-----. 1 qemu qemu 156 Apr 21 08:47 /etc/ceph/client.client.rbd.keyring

To disable management of a keyring file::

  ceph orch client-keyring rm <entity>

Note that this will delete any keyring files for this entity that were
previously written to cluster nodes.


/etc/ceph/ceph.conf
===================

It may also be useful to distribute ``ceph.conf`` files to hosts without
an associated client keyring file. By default, cephadm only deploys a
``ceph.conf`` file to hosts where a client keyring is also distributed
(see above). To write config files to hosts without client keyrings::

  ceph config set mgr mgr/cephadm/manage_etc_ceph_ceph_conf true

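The distributed ``ceph.conf`` is typically minimal, containing little more
than the cluster fsid and the monitor addresses. A rough sketch of such a
file, with all values hypothetical:

```
[global]
    fsid = 5c5a50ae-272a-455d-99e9-32c6a013e694
    mon_host = [v2:10.0.0.1:3300/0,v1:10.0.0.1:6789/0]
```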
By default, the configs are written to all hosts (i.e., those listed
by ``ceph orch host ls``). To specify which hosts get a ``ceph.conf``::

  ceph config set mgr mgr/cephadm/manage_etc_ceph_ceph_conf_hosts <placement spec>

For example, to distribute configs to hosts with the ``bare_config`` label::

  ceph config set mgr mgr/cephadm/manage_etc_ceph_ceph_conf_hosts label:bare_config

(See :ref:`orchestrator-cli-placement-spec` for more information about placement specs.)