.. _cephfs-health-messages:

======================
CephFS health messages
======================

Cluster health checks
=====================

The Ceph monitor daemons will generate health messages in response
to certain states of the file system map structure (and the enclosed MDS maps).

Message: mds rank(s) *ranks* have failed
Description: One or more MDS ranks are not currently assigned to
an MDS daemon; the cluster will not recover until a suitable replacement
daemon starts.

Message: mds rank(s) *ranks* are damaged
Description: One or more MDS ranks have encountered severe damage to
their stored metadata, and cannot start again until the metadata is repaired.

Message: mds cluster is degraded
Description: One or more MDS ranks are not currently up and running;
clients may pause metadata IO until this situation is resolved. This includes
ranks that are failed or damaged, and additionally includes ranks
which are running on an MDS but have not yet reached the *active*
state (e.g. ranks currently in the *replay* state).

Message: mds *names* are laggy
Description: The named MDS daemons have failed to send beacon messages
to the monitor for at least ``mds_beacon_grace`` (default 15s), while
they are supposed to send beacon messages every ``mds_beacon_interval``
(default 4s). The daemons may have crashed. The Ceph monitor will
automatically replace laggy daemons with standbys if any are available.

Message: insufficient standby daemons available
Description: One or more file systems are configured to have a certain number
of standby daemons available (including daemons in standby-replay) but the
cluster does not have enough standby daemons. The standby daemons not in replay
count towards any file system (i.e. they may overlap). This warning can be
configured by setting ``ceph fs set <fs> standby_count_wanted <count>``. Use
zero for ``count`` to disable.
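
For example, to ask for three standby daemons for a file system named
``cephfs`` (the name here is illustrative), or to disable the check
entirely::

    $ ceph fs set cephfs standby_count_wanted 3
    $ ceph fs set cephfs standby_count_wanted 0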


Daemon-reported health checks
=============================

MDS daemons can identify a variety of unwanted conditions, and
indicate these to the operator in the output of ``ceph status``.
These conditions have human readable messages, and additionally
a unique code starting with ``MDS_``.

.. highlight:: console

``ceph health detail`` shows the details of the conditions. Following
is a typical health report from a cluster experiencing MDS related
performance issues::

    $ ceph health detail
    HEALTH_WARN 1 MDSs report slow metadata IOs; 1 MDSs report slow requests
    MDS_SLOW_METADATA_IO 1 MDSs report slow metadata IOs
        mds.fs-01(mds.0): 3 slow metadata IOs are blocked > 30 secs, oldest blocked for 51123 secs
    MDS_SLOW_REQUEST 1 MDSs report slow requests
        mds.fs-01(mds.0): 5 slow requests are blocked > 30 secs

Here, for instance, ``MDS_SLOW_REQUEST`` is the unique code representing the
condition where requests are taking a long time to complete, and the following
description shows its severity and the MDS daemons which are serving these
slow requests.

This page lists the health checks raised by MDS daemons. For the checks from
other daemons, please see :ref:`health-checks`.

``MDS_TRIM``
------------

Message
    "Behind on trimming..."
Description
    CephFS maintains a metadata journal that is divided into
    *log segments*. The length of the journal (in number of segments) is
    controlled by the setting ``mds_log_max_segments``; when the number of
    segments exceeds that setting, the MDS starts writing back metadata so
    that it can remove (trim) the oldest segments. If this writeback is
    happening too slowly, or a software bug is preventing trimming, then this
    health message may appear. The threshold for this message to appear is
    controlled by the config option ``mds_log_warn_factor``; the default
    is 2.0.
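
For example, to tolerate a longer journal backlog before warning, the factor
can be raised through the central configuration database (the value ``4``
here is only illustrative)::

    $ ceph config set mds mds_log_warn_factor 4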

``MDS_HEALTH_CLIENT_LATE_RELEASE``, ``MDS_HEALTH_CLIENT_LATE_RELEASE_MANY``
---------------------------------------------------------------------------

Message
    "Client *name* failing to respond to capability release"
Description
    CephFS clients are issued *capabilities* by the MDS, which
    are like locks. Sometimes, for example when another client needs access,
    the MDS will request that clients release their capabilities. If the client
    is unresponsive or buggy, it might fail to do so promptly or fail to do
    so at all. This message appears if a client has taken longer than
    ``session_timeout`` (default 60s) to comply.

``MDS_CLIENT_RECALL``, ``MDS_HEALTH_CLIENT_RECALL_MANY``
--------------------------------------------------------

Message
    "Client *name* failing to respond to cache pressure"
Description
    Clients maintain a metadata cache. Items (such as inodes) in the
    client cache are also pinned in the MDS cache, so when the MDS needs to
    shrink its cache (to stay within ``mds_cache_memory_limit``), it sends
    messages to clients to shrink their caches too. If the client is
    unresponsive or buggy, this can prevent the MDS from properly staying
    within its cache limits and it may eventually run out of memory and
    crash. This message appears if a client has failed to release more than
    ``mds_recall_warning_threshold`` capabilities (decaying with a half-life
    of ``mds_recall_max_decay_rate``) within the last
    ``mds_recall_warning_decay_rate`` seconds.

``MDS_CLIENT_OLDEST_TID``, ``MDS_CLIENT_OLDEST_TID_MANY``
---------------------------------------------------------

Message
    "Client *name* failing to advance its oldest client/flush tid"
Description
    The CephFS client-MDS protocol uses a field called the
    *oldest tid* to inform the MDS of which client requests are fully
    complete and may therefore be forgotten about by the MDS. If a buggy
    client is failing to advance this field, then the MDS may be prevented
    from properly cleaning up resources used by client requests. This message
    appears if a client appears to have more than ``max_completed_requests``
    (default 100000) requests that are complete on the MDS side but haven't
    yet been accounted for in the client's *oldest tid* value. The last tid
    used by the MDS to trim completed client requests (or flush) is included
    as part of the ``session ls`` (or ``client ls``) command output as a
    debug aid.
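
As a sketch, the session list (including the trimmed-tid information) can be
dumped from a running MDS daemon, here one named ``fs-01`` (the name is
illustrative)::

    $ ceph tell mds.fs-01 session ls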

``MDS_DAMAGE``
--------------

Message
    "Metadata damage detected"
Description
    Corrupt or missing metadata was encountered when reading
    from the metadata pool. This message indicates that the damage was
    sufficiently isolated for the MDS to continue operating, although
    client accesses to the damaged subtree will return IO errors. Use
    the ``damage ls`` admin socket command to get more detail on the damage.
    This message appears as soon as any damage is encountered.
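
For example, the damage table of an MDS daemon named ``fs-01`` (an
illustrative name) can be inspected with::

    $ ceph tell mds.fs-01 damage ls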

``MDS_HEALTH_READ_ONLY``
------------------------

Message
    "MDS in read-only mode"
Description
    The MDS has entered read-only mode and will return the EROFS
    error code to client operations that attempt to modify any metadata. The
    MDS will enter read-only mode if it encounters a write error while
    writing to the metadata pool, or if forced to by an administrator using
    the ``force_readonly`` admin socket command.

``MDS_SLOW_REQUEST``
--------------------

Message
    "*N* slow requests are blocked"

Description
    One or more client requests have not been completed promptly,
    indicating that the MDS is running very slowly, that the RADOS cluster
    is not acknowledging journal writes promptly, or that there is a bug.
    Use the ``ops`` admin socket command to list outstanding metadata
    operations. This message appears if any client requests have taken
    longer than ``mds_op_complaint_time`` (default 30s).
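
Outstanding operations can be listed from a running MDS daemon, here one
named ``fs-01`` (the name is illustrative)::

    $ ceph tell mds.fs-01 ops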

``MDS_CACHE_OVERSIZED``
-----------------------

Message
    "Too many inodes in cache"
Description
    The MDS is not succeeding in trimming its cache to comply with the
    limit set by the administrator. If the MDS cache becomes too large, the
    daemon may exhaust available memory and crash. By default, this message
    appears if the actual cache size (in memory) is at least 50% greater than
    ``mds_cache_memory_limit`` (default 4GB). Modify
    ``mds_health_cache_threshold`` to set the warning ratio.
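
For example, to warn only once the actual cache size reaches twice the
configured limit (the value ``2`` is illustrative)::

    $ ceph config set mds mds_health_cache_threshold 2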

``FS_WITH_FAILED_MDS``
----------------------

Message
    "Some MDS ranks do not have standby replacements"

Description
    Normally, a failed MDS rank will be replaced by a standby MDS. This
    situation is transient and is not considered critical. However, if there
    are no standby MDSs available to replace an active MDS rank, this health
    warning is generated.

``MDS_INSUFFICIENT_STANDBY``
----------------------------

Message
    "Insufficient number of available standby(-replay) MDS daemons than configured"

Description
    The minimum number of standby(-replay) MDS daemons can be configured by
    setting the ``standby_count_wanted`` configuration variable. This health
    warning is generated when the number of available standby(-replay) MDS
    daemons is less than the configured value.

``FS_DEGRADED``
---------------

Message
    "Some MDS ranks have been marked failed or damaged"

Description
    One or more MDS ranks have ended up in the failed or damaged state due to
    an unrecoverable error. The file system may be partially or fully
    unavailable while one (or more) ranks are offline.

``MDS_UP_LESS_THAN_MAX``
------------------------

Message
    "Number of active ranks are less than configured number of maximum MDSs"

Description
    The maximum number of MDS ranks can be configured by setting the
    ``max_mds`` configuration variable. This health warning is generated when
    the number of MDS ranks falls below this configured value.
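
For example, to run two active MDS ranks on a file system named ``cephfs``
(an illustrative name)::

    $ ceph fs set cephfs max_mds 2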

``MDS_ALL_DOWN``
----------------

Message
    "None of the MDS ranks are available (file system offline)"

Description
    All MDS ranks are unavailable, leaving the file system completely
    offline.

``MDS_CLIENTS_LAGGY``
---------------------

Message
    "Client *ID* is laggy; not evicted because some OSD(s) is/are laggy"

Description
    If an OSD is laggy (due to conditions such as a network cut-off) then it
    might make clients laggy too (sessions might go idle, or clients might be
    unable to flush dirty data for cap revokes). If
    ``defer_client_eviction_on_laggy_osds`` is set to true (the default),
    client eviction will not take place and this health warning will be
    generated instead.

``MDS_CLIENTS_BROKEN_ROOTSQUASH``
---------------------------------

Message
    "X client(s) with broken root_squash implementation (MDS_CLIENTS_BROKEN_ROOTSQUASH)"

Description
    A bug was discovered in ``root_squash`` which would potentially lose
    changes made by a client restricted with ``root_squash`` caps. The fix
    required a change to the protocol, so a client upgrade is required.

    This is a HEALTH_ERR warning because of the danger of inconsistency and
    lost data. It is recommended to upgrade your clients, to discontinue
    using ``root_squash`` in the interim, or to silence the warning if
    desired.

    To evict and permanently block broken clients from connecting to the
    cluster, set the ``required_client_feature`` bit ``client_mds_auth_caps``.
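
Assuming a file system named ``cephfs`` (an illustrative name), this can be
done with::

    $ ceph fs required_client_features cephfs add client_mds_auth_caps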