======================
CephFS health messages
======================

Cluster health checks
=====================

The Ceph monitor daemons will generate health messages in response
to certain states of the filesystem map structure (and the enclosed MDS maps).

Message: mds rank(s) *ranks* have failed
Description: One or more MDS ranks are not currently assigned to
an MDS daemon; the cluster will not recover until a suitable replacement
daemon starts.

Message: mds rank(s) *ranks* are damaged
Description: One or more MDS ranks have encountered severe damage to
their stored metadata, and cannot start again until the metadata is repaired.

Message: mds cluster is degraded
Description: One or more MDS ranks are not currently up and running;
clients may pause metadata IO until this situation is resolved. This includes
ranks that are failed or damaged, and additionally includes ranks
which are running on an MDS but have not yet reached the *active*
state (e.g. ranks currently in *replay* state).

Message: mds *names* are laggy
Description: The named MDS daemons have failed to send beacon messages
to the monitor for at least ``mds_beacon_grace`` (default 15s), while
they are supposed to send beacon messages every ``mds_beacon_interval``
(default 4s). The daemons may have crashed. The Ceph monitor will
automatically replace laggy daemons with standbys if any are available.

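
If daemons are reported laggy only during transient events (for example,
brief network interruptions during maintenance), the grace period can be
raised at runtime. The 30s value below is purely illustrative, not a
recommendation::

    ceph tell mon.* injectargs '--mds_beacon_grace 30'
    ceph tell mds.* injectargs '--mds_beacon_grace 30'
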
Message: insufficient standby daemons available
Description: One or more file systems are configured to have a certain number
of standby daemons available (including daemons in standby-replay) but the
cluster does not have enough standby daemons. The standby daemons not in replay
count towards any file system (i.e. they may overlap). This warning can be
configured by setting ``ceph fs set <fs> standby_count_wanted <count>``. Use
zero for ``count`` to disable.

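
For example, to require two available standby daemons for a file system, or
to disable the check entirely (the file system name ``cephfs`` is
illustrative)::

    ceph fs set cephfs standby_count_wanted 2
    ceph fs set cephfs standby_count_wanted 0   # disable the warning
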

Daemon-reported health checks
=============================

MDS daemons can identify a variety of unwanted conditions, and
indicate these to the operator in the output of ``ceph status``.
These conditions have human readable messages, and additionally
a unique code starting with MDS_HEALTH which appears in JSON output.

Message: "Behind on trimming..."
Code: MDS_HEALTH_TRIM
Description: CephFS maintains a metadata journal that is divided into
*log segments*. The length of the journal (in number of segments) is controlled
by the setting ``mds_log_max_segments``, and when the number of segments
exceeds that setting the MDS starts writing back metadata so that it
can remove (trim) the oldest segments. If this writeback is happening
too slowly, or a software bug is preventing trimming, then this health
message may appear. The threshold for this message to appear is for the
number of segments to be double ``mds_log_max_segments``.

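
The journal counters can be inspected through the admin socket, and the
segment limit raised if writeback is healthy but simply cannot keep up. The
daemon name ``mds.a`` and the value 256 are illustrative::

    ceph daemon mds.a perf dump mds_log
    ceph tell mds.a injectargs '--mds_log_max_segments 256'
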
Message: "Client *name* failing to respond to capability release"
Code: MDS_HEALTH_CLIENT_LATE_RELEASE, MDS_HEALTH_CLIENT_LATE_RELEASE_MANY
Description: CephFS clients are issued *capabilities* by the MDS, which
are like locks. Sometimes, for example when another client needs access,
the MDS will request clients release their capabilities. If the client
is unresponsive or buggy, it might fail to do so promptly or fail to do
so at all. This message appears if a client has taken longer than
``mds_revoke_cap_timeout`` (default 60s) to comply.

Message: "Client *name* failing to respond to cache pressure"
Code: MDS_HEALTH_CLIENT_RECALL, MDS_HEALTH_CLIENT_RECALL_MANY
Description: Clients maintain a metadata cache. Items (such as inodes) in the
client cache are also pinned in the MDS cache, so when the MDS needs to shrink
its cache (to stay within ``mds_cache_size`` or ``mds_cache_memory_limit``), it
sends messages to clients to shrink their caches too. If the client is
unresponsive or buggy, this can prevent the MDS from properly staying within
its cache limits and it may eventually run out of memory and crash. This
message appears if a client has taken more than ``mds_recall_state_timeout``
(default 60s) to comply.

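
The sessions held by clients, including how many capabilities each one holds,
can be listed through the admin socket to help identify the unresponsive
client. The daemon name ``mds.a`` is illustrative::

    ceph daemon mds.a session ls
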
Message: "Client *name* failing to advance its oldest client/flush tid"
Code: MDS_HEALTH_CLIENT_OLDEST_TID, MDS_HEALTH_CLIENT_OLDEST_TID_MANY
Description: The CephFS client-MDS protocol uses a field called the
*oldest tid* to inform the MDS of which client requests are fully
complete and may therefore be forgotten about by the MDS. If a buggy
client is failing to advance this field, then the MDS may be prevented
from properly cleaning up resources used by client requests. This message
appears if a client appears to have more than ``max_completed_requests``
(default 100000) requests that are complete on the MDS side but haven't
yet been accounted for in the client's *oldest tid* value.

Message: "Metadata damage detected"
Code: MDS_HEALTH_DAMAGE
Description: Corrupt or missing metadata was encountered when reading
from the metadata pool. This message indicates that the damage was
sufficiently isolated for the MDS to continue operating, although
client accesses to the damaged subtree will return IO errors. Use
the ``damage ls`` admin socket command to get more detail on the damage.
This message appears as soon as any damage is encountered.

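
For example, to list the recorded damage entries on a given MDS (the daemon
name ``mds.a`` is illustrative)::

    ceph daemon mds.a damage ls

Each entry carries an id; once the underlying problem has been repaired, an
entry can be cleared with the ``damage rm`` admin socket command.
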
Message: "MDS in read-only mode"
Code: MDS_HEALTH_READ_ONLY
Description: The MDS has gone into readonly mode and will return EROFS
error codes to client operations that attempt to modify any metadata. The
MDS will go into readonly mode if it encounters a write error while
writing to the metadata pool, or if forced to by an administrator using
the *force_readonly* admin socket command.

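
For example, to deliberately place an MDS into read-only mode, such as before
investigating suspected metadata damage (the daemon name ``mds.a`` is
illustrative)::

    ceph daemon mds.a force_readonly

Returning to normal operation typically requires restarting the daemon or
failing it over to a standby.
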
Message: "*N* slow requests are blocked"
Code: MDS_HEALTH_SLOW_REQUEST
Description: One or more client requests have not been completed promptly,
indicating that the MDS is either running very slowly, or that the RADOS
cluster is not acknowledging journal writes promptly, or that there is a bug.
Use the ``ops`` admin socket command to list outstanding metadata operations.
This message appears if any client requests have taken longer than
``mds_op_complaint_time`` (default 30s).

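
For example, to dump the operations currently in flight on the affected
daemon (the name ``mds.a`` is illustrative)::

    ceph daemon mds.a ops
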
Message: "Too many inodes in cache"
Code: MDS_HEALTH_CACHE_OVERSIZED
Description: The MDS is not succeeding in trimming its cache to comply with the
limit set by the administrator. If the MDS cache becomes too large, the daemon
may exhaust available memory and crash. By default, this message appears if
the actual cache size (in inodes or memory) is at least 50% greater than
``mds_cache_size`` (default 100000) or ``mds_cache_memory_limit`` (default
1GB). Modify ``mds_health_cache_threshold`` to set the warning ratio.
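
For example, to raise the warning ratio so that this health message appears
only when the cache exceeds twice its configured limit (the value 2 is
illustrative)::

    ceph tell mds.* injectargs '--mds_health_cache_threshold 2'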