.. _cephfs-health-messages:

======================
CephFS health messages
======================

Cluster health checks
=====================

The Ceph monitor daemons will generate health messages in response
to certain states of the file system map structure (and the enclosed MDS maps).

Message: mds rank(s) *ranks* have failed
Description: One or more MDS ranks are not currently assigned to
an MDS daemon; the cluster will not recover until a suitable replacement
daemon starts.

Message: mds rank(s) *ranks* are damaged
Description: One or more MDS ranks have encountered severe damage to
their stored metadata, and cannot start again until the metadata is repaired.

Message: mds cluster is degraded
Description: One or more MDS ranks are not currently up and running;
clients may pause metadata IO until this situation is resolved. This includes
ranks that are failed or damaged, and additionally includes ranks
which are running on an MDS but have not yet reached the *active*
state (e.g. ranks currently in the *replay* state).

Message: mds *names* are laggy
Description: The named MDS daemons have failed to send beacon messages
to the monitor for at least ``mds_beacon_grace`` (default 15s), while
they are supposed to send beacon messages every ``mds_beacon_interval``
(default 4s). The daemons may have crashed. The Ceph monitor will
automatically replace laggy daemons with standbys if any are available.

Message: insufficient standby daemons available
Description: One or more file systems are configured to have a certain number
of standby daemons available (including daemons in standby-replay) but the
cluster does not have enough standby daemons. The standby daemons not in replay
count towards any file system (i.e. they may overlap). This warning can be
configured by setting ``ceph fs set <fs> standby_count_wanted <count>``. Use
zero for ``count`` to disable.
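
For example, to ask for three standby daemons for a file system named
``cephfs`` (the name here is illustrative), or to disable the check
entirely::

    $ ceph fs set cephfs standby_count_wanted 3
    $ ceph fs set cephfs standby_count_wanted 0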


Daemon-reported health checks
=============================

MDS daemons can identify a variety of unwanted conditions, and
indicate these to the operator in the output of ``ceph status``.
These conditions have human readable messages, and additionally
a unique code starting with ``MDS_``.

.. highlight:: console

``ceph health detail`` shows the details of the conditions. Following
is a typical health report from a cluster experiencing MDS related
performance issues::

    $ ceph health detail
    HEALTH_WARN 1 MDSs report slow metadata IOs; 1 MDSs report slow requests
    MDS_SLOW_METADATA_IO 1 MDSs report slow metadata IOs
        mds.fs-01(mds.0): 3 slow metadata IOs are blocked > 30 secs, oldest blocked for 51123 secs
    MDS_SLOW_REQUEST 1 MDSs report slow requests
        mds.fs-01(mds.0): 5 slow requests are blocked > 30 secs

Here, for instance, ``MDS_SLOW_REQUEST`` is the unique code representing the
condition where requests are taking a long time to complete, and the following
description shows its severity and the MDS daemons which are serving these
slow requests.

This page lists the health checks raised by MDS daemons. For the checks from
other daemons, please see :ref:`health-checks`.

``MDS_TRIM``
------------

Message
    "Behind on trimming..."
Description
    CephFS maintains a metadata journal that is divided into
    *log segments*. The length of the journal (in number of segments) is
    controlled by the setting ``mds_log_max_segments``; when the number of
    segments exceeds that setting, the MDS starts writing back metadata so
    that it can remove (trim) the oldest segments. If this writeback is
    happening too slowly, or a software bug is preventing trimming, then this
    health message may appear. The threshold for this message to appear is
    controlled by the config option ``mds_log_warn_factor``; the default
    is 2.0.
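
For example, to tolerate a longer journal backlog before warning, the factor
can be raised through the central configuration database (the value ``4``
here is only illustrative)::

    $ ceph config set mds mds_log_warn_factor 4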

``MDS_HEALTH_CLIENT_LATE_RELEASE``, ``MDS_HEALTH_CLIENT_LATE_RELEASE_MANY``
---------------------------------------------------------------------------

Message
    "Client *name* failing to respond to capability release"
Description
    CephFS clients are issued *capabilities* by the MDS, which
    are like locks. Sometimes, for example when another client needs access,
    the MDS will request that clients release their capabilities. If the client
    is unresponsive or buggy, it might fail to do so promptly or fail to do
    so at all. This message appears if a client has taken longer than
    ``session_timeout`` (default 60s) to comply.

``MDS_CLIENT_RECALL``, ``MDS_HEALTH_CLIENT_RECALL_MANY``
--------------------------------------------------------

Message
    "Client *name* failing to respond to cache pressure"
Description
    Clients maintain a metadata cache. Items (such as inodes) in the
    client cache are also pinned in the MDS cache, so when the MDS needs to
    shrink its cache (to stay within ``mds_cache_memory_limit``), it sends
    messages to clients to shrink their caches too. If the client is
    unresponsive or buggy, this can prevent the MDS from properly staying
    within its cache limits and it may eventually run out of memory and
    crash. This message appears if a client has failed to release more than
    ``mds_recall_warning_threshold`` capabilities (decaying with a half-life
    of ``mds_recall_max_decay_rate``) within the last
    ``mds_recall_warning_decay_rate`` seconds.

``MDS_CLIENT_OLDEST_TID``, ``MDS_CLIENT_OLDEST_TID_MANY``
---------------------------------------------------------

Message
    "Client *name* failing to advance its oldest client/flush tid"
Description
    The CephFS client-MDS protocol uses a field called the
    *oldest tid* to inform the MDS of which client requests are fully
    complete and may therefore be forgotten about by the MDS. If a buggy
    client is failing to advance this field, then the MDS may be prevented
    from properly cleaning up resources used by client requests. This message
    appears if a client appears to have more than ``max_completed_requests``
    (default 100000) requests that are complete on the MDS side but haven't
    yet been accounted for in the client's *oldest tid* value. The last tid
    used by the MDS to trim completed client requests (or flush) is included
    as part of the ``session ls`` (or ``client ls``) command output as a
    debug aid.
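
As a sketch, the session list (including the trimmed-tid information) can be
dumped from a running MDS daemon, here one named ``fs-01`` (the name is
illustrative)::

    $ ceph tell mds.fs-01 session ls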

``MDS_DAMAGE``
--------------

Message
    "Metadata damage detected"
Description
    Corrupt or missing metadata was encountered when reading
    from the metadata pool. This message indicates that the damage was
    sufficiently isolated for the MDS to continue operating, although
    client accesses to the damaged subtree will return IO errors. Use
    the ``damage ls`` admin socket command to get more detail on the damage.
    This message appears as soon as any damage is encountered.
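
For example, the damage table of an MDS daemon named ``fs-01`` (an
illustrative name) can be inspected with::

    $ ceph tell mds.fs-01 damage ls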

``MDS_HEALTH_READ_ONLY``
------------------------

Message
    "MDS in read-only mode"
Description
    The MDS has entered read-only mode and will return the EROFS
    error code to client operations that attempt to modify any metadata. The
    MDS will enter read-only mode if it encounters a write error while
    writing to the metadata pool, or if forced to by an administrator using
    the ``force_readonly`` admin socket command.

``MDS_SLOW_REQUEST``
--------------------

Message
    "*N* slow requests are blocked"

Description
    One or more client requests have not been completed promptly,
    indicating that the MDS is running very slowly, that the RADOS cluster
    is not acknowledging journal writes promptly, or that there is a bug.
    Use the ``ops`` admin socket command to list outstanding metadata
    operations. This message appears if any client requests have taken
    longer than ``mds_op_complaint_time`` (default 30s).
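
Outstanding operations can be listed from a running MDS daemon, here one
named ``fs-01`` (the name is illustrative)::

    $ ceph tell mds.fs-01 ops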

``MDS_CACHE_OVERSIZED``
-----------------------

Message
    "Too many inodes in cache"
Description
    The MDS is not succeeding in trimming its cache to comply with the
    limit set by the administrator. If the MDS cache becomes too large, the
    daemon may exhaust available memory and crash. By default, this message
    appears if the actual cache size (in memory) is at least 50% greater than
    ``mds_cache_memory_limit`` (default 4GB). Modify
    ``mds_health_cache_threshold`` to set the warning ratio.
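
For example, to warn only once the actual cache size reaches twice the
configured limit (the value ``2`` is illustrative)::

    $ ceph config set mds mds_health_cache_threshold 2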

``FS_WITH_FAILED_MDS``
----------------------

Message
    "Some MDS ranks do not have standby replacements"

Description
    Normally, a failed MDS rank will be replaced by a standby MDS. This
    situation is transient and is not considered critical. However, if there
    are no standby MDSs available to replace an active MDS rank, this health
    warning is generated.

``MDS_INSUFFICIENT_STANDBY``
----------------------------

Message
    "Insufficient number of available standby(-replay) MDS daemons than configured"

Description
    The minimum number of standby(-replay) MDS daemons can be configured by
    setting the ``standby_count_wanted`` configuration variable. This health
    warning is generated when the number of available standby(-replay) MDS
    daemons is less than the configured value.

``FS_DEGRADED``
---------------

Message
    "Some MDS ranks have been marked failed or damaged"

Description
    One or more MDS ranks have ended up in the failed or damaged state due to
    an unrecoverable error. The file system may be partially or fully
    unavailable while one (or more) ranks are offline.

``MDS_UP_LESS_THAN_MAX``
------------------------

Message
    "Number of active ranks are less than configured number of maximum MDSs"

Description
    The maximum number of MDS ranks can be configured by setting the
    ``max_mds`` configuration variable. This health warning is generated when
    the number of MDS ranks falls below this configured value.
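
For example, to run two active MDS ranks on a file system named ``cephfs``
(an illustrative name)::

    $ ceph fs set cephfs max_mds 2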

``MDS_ALL_DOWN``
----------------

Message
    "None of the MDS ranks are available (file system offline)"

Description
    All MDS ranks are unavailable, leaving the file system completely
    offline.

``MDS_CLIENTS_LAGGY``
---------------------

Message
    "Client *ID* is laggy; not evicted because some OSD(s) is/are laggy"

Description
    If an OSD is laggy (due to conditions such as a network cut-off) then it
    might make clients laggy too (sessions might go idle, or clients might be
    unable to flush dirty data for cap revokes). If
    ``defer_client_eviction_on_laggy_osds`` is set to true (the default),
    client eviction will not take place and this health warning will be
    generated instead.

``MDS_CLIENTS_BROKEN_ROOTSQUASH``
---------------------------------

Message
    "X client(s) with broken root_squash implementation (MDS_CLIENTS_BROKEN_ROOTSQUASH)"

Description
    A bug was discovered in ``root_squash`` which would potentially lose
    changes made by a client restricted with ``root_squash`` caps. The fix
    required a change to the protocol, so a client upgrade is required.

    This is a HEALTH_ERR warning because of the danger of inconsistency and
    lost data. It is recommended to upgrade your clients, to discontinue
    using ``root_squash`` in the interim, or to silence the warning if
    desired.

    To evict and permanently block broken clients from connecting to the
    cluster, set the ``required_client_feature`` bit ``client_mds_auth_caps``.
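
Assuming a file system named ``cephfs`` (an illustrative name), this can be
done with::

    $ ceph fs required_client_features cephfs add client_mds_auth_caps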