======================
CephFS health messages
======================

Cluster health checks
=====================

The Ceph monitor daemons will generate health messages in response
to certain states of the filesystem map structure (and the enclosed MDS maps).

Message: mds rank(s) *ranks* have failed
Description: One or more MDS ranks are not currently assigned to
an MDS daemon; the cluster will not recover until a suitable replacement
daemon starts.

Message: mds rank(s) *ranks* are damaged
Description: One or more MDS ranks have encountered severe damage to
their stored metadata, and cannot start again until the metadata is repaired.

Message: mds cluster is degraded
Description: One or more MDS ranks are not currently up and running;
clients may pause metadata IO until this situation is resolved. This includes
ranks that are failed or damaged, and additionally includes ranks
which are running on an MDS but have not yet reached the *active*
state (e.g. ranks currently in *replay* state).

Message: mds *names* are laggy
Description: The named MDS daemons have failed to send beacon messages
to the monitor for at least ``mds_beacon_grace`` (default 15s), while
they are supposed to send beacon messages every ``mds_beacon_interval``
(default 4s). The daemons may have crashed. The Ceph monitor will
automatically replace laggy daemons with standbys if any are available.

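
If daemons are reported laggy only during transient events (for example,
brief network interruptions during maintenance), the grace period can be
raised at runtime. The 30s value below is purely illustrative, not a
recommendation::

    ceph tell mon.* injectargs '--mds_beacon_grace 30'
    ceph tell mds.* injectargs '--mds_beacon_grace 30'
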
Message: insufficient standby daemons available
Description: One or more file systems are configured to have a certain number
of standby daemons available (including daemons in standby-replay) but the
cluster does not have enough standby daemons. The standby daemons not in replay
count towards any file system (i.e. they may overlap). This warning can be
configured by setting ``ceph fs set <fs> standby_count_wanted <count>``. Use
zero for ``count`` to disable.

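
For example, to require two available standby daemons for a file system, or
to disable the check entirely (the file system name ``cephfs`` is
illustrative)::

    ceph fs set cephfs standby_count_wanted 2
    ceph fs set cephfs standby_count_wanted 0   # disable the warning
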

Daemon-reported health checks
=============================

MDS daemons can identify a variety of unwanted conditions, and
indicate these to the operator in the output of ``ceph status``.
These conditions have human readable messages, and additionally
a unique code starting with MDS_HEALTH which appears in JSON output.

Message: "Behind on trimming..."
Code: MDS_HEALTH_TRIM
Description: CephFS maintains a metadata journal that is divided into
*log segments*. The length of the journal (in number of segments) is controlled
by the setting ``mds_log_max_segments``, and when the number of segments
exceeds that setting the MDS starts writing back metadata so that it
can remove (trim) the oldest segments. If this writeback is happening
too slowly, or a software bug is preventing trimming, then this health
message may appear. The threshold for this message to appear is for the
number of segments to be double ``mds_log_max_segments``.

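
The journal counters can be inspected through the admin socket, and the
segment limit raised if writeback is healthy but simply cannot keep up. The
daemon name ``mds.a`` and the value 256 are illustrative::

    ceph daemon mds.a perf dump mds_log
    ceph tell mds.a injectargs '--mds_log_max_segments 256'
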
Message: "Client *name* failing to respond to capability release"
Code: MDS_HEALTH_CLIENT_LATE_RELEASE, MDS_HEALTH_CLIENT_LATE_RELEASE_MANY
Description: CephFS clients are issued *capabilities* by the MDS, which
are like locks. Sometimes, for example when another client needs access,
the MDS will request clients release their capabilities. If the client
is unresponsive or buggy, it might fail to do so promptly or fail to do
so at all. This message appears if a client has taken longer than
``mds_revoke_cap_timeout`` (default 60s) to comply.

Message: "Client *name* failing to respond to cache pressure"
Code: MDS_HEALTH_CLIENT_RECALL, MDS_HEALTH_CLIENT_RECALL_MANY
Description: Clients maintain a metadata cache. Items (such as inodes) in the
client cache are also pinned in the MDS cache, so when the MDS needs to shrink
its cache (to stay within ``mds_cache_size`` or ``mds_cache_memory_limit``), it
sends messages to clients to shrink their caches too. If the client is
unresponsive or buggy, this can prevent the MDS from properly staying within
its cache limits and it may eventually run out of memory and crash. This
message appears if a client has taken more than ``mds_recall_state_timeout``
(default 60s) to comply.

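
The sessions held by clients, including how many capabilities each one holds,
can be listed through the admin socket to help identify the unresponsive
client. The daemon name ``mds.a`` is illustrative::

    ceph daemon mds.a session ls
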
Message: "Client *name* failing to advance its oldest client/flush tid"
Code: MDS_HEALTH_CLIENT_OLDEST_TID, MDS_HEALTH_CLIENT_OLDEST_TID_MANY
Description: The CephFS client-MDS protocol uses a field called the
*oldest tid* to inform the MDS of which client requests are fully
complete and may therefore be forgotten about by the MDS. If a buggy
client is failing to advance this field, then the MDS may be prevented
from properly cleaning up resources used by client requests. This message
appears if a client appears to have more than ``max_completed_requests``
(default 100000) requests that are complete on the MDS side but haven't
yet been accounted for in the client's *oldest tid* value.

Message: "Metadata damage detected"
Code: MDS_HEALTH_DAMAGE
Description: Corrupt or missing metadata was encountered when reading
from the metadata pool. This message indicates that the damage was
sufficiently isolated for the MDS to continue operating, although
client accesses to the damaged subtree will return IO errors. Use
the ``damage ls`` admin socket command to get more detail on the damage.
This message appears as soon as any damage is encountered.

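
For example, to list the recorded damage entries on a given MDS (the daemon
name ``mds.a`` is illustrative)::

    ceph daemon mds.a damage ls

Each entry carries an id; once the underlying problem has been repaired, an
entry can be cleared with the ``damage rm`` admin socket command.
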
Message: "MDS in read-only mode"
Code: MDS_HEALTH_READ_ONLY
Description: The MDS has gone into readonly mode and will return EROFS
error codes to client operations that attempt to modify any metadata. The
MDS will go into readonly mode if it encounters a write error while
writing to the metadata pool, or if forced to by an administrator using
the *force_readonly* admin socket command.

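
For example, to deliberately place an MDS into read-only mode, such as before
investigating suspected metadata damage (the daemon name ``mds.a`` is
illustrative)::

    ceph daemon mds.a force_readonly

Returning to normal operation typically requires restarting the daemon or
failing it over to a standby.
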
Message: "*N* slow requests are blocked"
Code: MDS_HEALTH_SLOW_REQUEST
Description: One or more client requests have not been completed promptly,
indicating that the MDS is either running very slowly, or that the RADOS
cluster is not acknowledging journal writes promptly, or that there is a bug.
Use the ``ops`` admin socket command to list outstanding metadata operations.
This message appears if any client requests have taken longer than
``mds_op_complaint_time`` (default 30s).

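
For example, to dump the operations currently in flight on the affected
daemon (the name ``mds.a`` is illustrative)::

    ceph daemon mds.a ops
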
Message: "Too many inodes in cache"
Code: MDS_HEALTH_CACHE_OVERSIZED
Description: The MDS is not succeeding in trimming its cache to comply with the
limit set by the administrator. If the MDS cache becomes too large, the daemon
may exhaust available memory and crash. By default, this message appears if
the actual cache size (in inodes or memory) is at least 50% greater than
``mds_cache_size`` (default 100000) or ``mds_cache_memory_limit`` (default
1GB). Modify ``mds_health_cache_threshold`` to set the warning ratio.
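
For example, to raise the warning ratio so that this health message appears
only when the cache exceeds twice its configured limit (the value 2 is
illustrative)::

    ceph tell mds.* injectargs '--mds_health_cache_threshold 2'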