.. _health-checks:

===============
Health checks
===============

Overview
========

There is a finite set of health messages that a Ceph cluster can raise. These
messages are known as *health checks*. Each health check has a unique
identifier.

The identifier is a terse human-readable string -- that is, the identifier is
readable in much the same way as a typical variable name. It is intended to
enable tools (for example, UIs) to make sense of health checks and present them
in a way that reflects their meaning.

This page lists the health checks that are raised by the monitor and manager
daemons. In addition to these, you might see health checks that originate
from MDS daemons (see :ref:`cephfs-health-messages`), and health checks
that are defined by ``ceph-mgr`` python modules.

Definitions
===========

Monitor
-------

DAEMON_OLD_VERSION
__________________

One or more daemons are running an old version of Ceph. This health check is
raised if multiple versions are detected. The condition must persist for a
period of time greater than ``mon_warn_older_version_delay`` (set to one week
by default) in order for the health check to be raised; this delay allows most
upgrades to proceed without a false warning. If the upgrade is paused for an
extended time period, ``health mute`` can be used by running
``ceph health mute DAEMON_OLD_VERSION --sticky``. Be sure, however, to run
``ceph health unmute DAEMON_OLD_VERSION`` after the upgrade has finished.
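
To see which Ceph versions are currently running on the cluster's daemons
(and therefore which daemons still need to be upgraded), run the following
command:

.. prompt:: bash $

   ceph versions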

MON_DOWN
________

One or more monitor daemons are currently down. The cluster requires a majority
(more than one-half) of the monitors to be available. When one or more monitors
are down, clients might have a harder time forming their initial connection to
the cluster, as they might need to try more addresses before they reach an
operating monitor.

The down monitor daemon should be restarted as soon as possible to reduce the
risk of a subsequent monitor failure leading to a service outage.
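
To identify which monitors are down, run the following commands. The restart
step itself depends on how the cluster was deployed; on a systemd-managed
host, for example, ``systemctl restart ceph-mon@<hostname>`` is typical:

.. prompt:: bash $

   ceph health detail
   ceph mon stat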

MON_CLOCK_SKEW
______________

The clocks on the hosts running the ceph-mon monitor daemons are not
well-synchronized. This health check is raised if the cluster detects a clock
skew greater than ``mon_clock_drift_allowed``.

This issue is best resolved by synchronizing the clocks by using a tool like
``ntpd`` or ``chrony``.

If it is impractical to keep the clocks closely synchronized, the
``mon_clock_drift_allowed`` threshold can also be increased. However, this
value must stay significantly below the ``mon_lease`` interval in order for the
monitor cluster to function properly.
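
As a sketch, the skew currently observed by the monitors can be inspected,
and the threshold raised if needed (the value shown is only an example, not a
recommendation):

.. prompt:: bash $

   ceph time-sync-status
   ceph config set mon mon_clock_drift_allowed 0.1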

MON_MSGR2_NOT_ENABLED
_____________________

The :confval:`ms_bind_msgr2` option is enabled but one or more monitors are
not configured to bind to a v2 port in the cluster's monmap. This
means that features specific to the msgr2 protocol (for example, encryption)
are unavailable on some or all connections.

In most cases this can be corrected by running the following command:

.. prompt:: bash $

   ceph mon enable-msgr2

After this command is run, any monitor configured to listen on the old default
port (6789) will continue to listen for v1 connections on 6789 and begin to
listen for v2 connections on the new default port 3300.

If a monitor is configured to listen for v1 connections on a non-standard port
(that is, a port other than 6789), then the monmap will need to be modified
manually.
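
To check which v1 and v2 addresses each monitor is currently bound to in the
monmap, run the following command:

.. prompt:: bash $

   ceph mon dump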


MON_DISK_LOW
____________

One or more monitors are low on disk space. This health check is raised if the
percentage of available space on the file system used by the monitor database
(normally ``/var/lib/ceph/mon``) drops below the percentage value
``mon_data_avail_warn`` (default: 30%).

This alert might indicate that some other process or user on the system is
filling up the file system used by the monitor. It might also
indicate that the monitor database is too large (see ``MON_DISK_BIG``
below).

If space cannot be freed, the monitor's data directory might need to be moved
to another storage device or file system (this relocation process must be
carried out while the monitor daemon is not running).
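
As a sketch (the path shown assumes the default data-directory layout), the
space consumed by the monitor database and the space remaining on its file
system can be checked with:

.. prompt:: bash $

   du -sh /var/lib/ceph/mon
   df -h /var/lib/ceph/mon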


MON_DISK_CRIT
_____________

One or more monitors are critically low on disk space. This health check is
raised if the percentage of available space on the file system used by the
monitor database (normally ``/var/lib/ceph/mon``) drops below the percentage
value ``mon_data_avail_crit`` (default: 5%). See ``MON_DISK_LOW``, above.

MON_DISK_BIG
____________

The database size for one or more monitors is very large. This health check is
raised if the size of the monitor database is larger than
``mon_data_size_warn`` (default: 15 GiB).

A large database is unusual, but does not necessarily indicate a problem.
Monitor databases might grow in size when there are placement groups that have
not reached an ``active+clean`` state in a long time.

This alert might also indicate that the monitor's database is not properly
compacting, an issue that has been observed with some older versions of leveldb
and rocksdb. Forcing a compaction with ``ceph daemon mon.<id> compact`` might
shrink the database's on-disk size.

This alert might also indicate that the monitor has a bug that prevents it from
pruning the cluster metadata that it stores. If the problem persists, please
report a bug.

To adjust the warning threshold, run the following command:

.. prompt:: bash $

   ceph config set global mon_data_size_warn <size>


AUTH_INSECURE_GLOBAL_ID_RECLAIM
_______________________________

One or more clients or daemons that are connected to the cluster are not
securely reclaiming their ``global_id`` (a unique number that identifies each
entity in the cluster) when reconnecting to a monitor. The client is being
permitted to connect anyway because the
``auth_allow_insecure_global_id_reclaim`` option is set to ``true`` (which may
be necessary until all Ceph clients have been upgraded) and because the
``auth_expose_insecure_global_id_reclaim`` option is set to ``true`` (which
allows monitors to detect clients with "insecure reclaim" sooner by forcing
those clients to reconnect immediately after their initial authentication).

To identify which client(s) are using unpatched Ceph client code, run the
following command:

.. prompt:: bash $

   ceph health detail

If you collect a dump of the clients that are connected to an individual
monitor and examine the ``global_id_status`` field in the output of the dump,
you can see the ``global_id`` reclaim behavior of those clients. Here
``reclaim_insecure`` means that a client is unpatched and is contributing to
this health check. To effect a client dump, run the following command:

.. prompt:: bash $

   ceph tell mon.\* sessions

We strongly recommend that all clients in the system be upgraded to a newer
version of Ceph that correctly reclaims ``global_id`` values. After all clients
have been updated, run the following command to stop allowing insecure
reconnections:

.. prompt:: bash $

   ceph config set mon auth_allow_insecure_global_id_reclaim false

If it is impractical to upgrade all clients immediately, you can temporarily
silence this alert by running the following command:

.. prompt:: bash $

   ceph health mute AUTH_INSECURE_GLOBAL_ID_RECLAIM 1w   # 1 week

Although we do NOT recommend doing so, you can also disable this alert
indefinitely by running the following command:

.. prompt:: bash $

   ceph config set mon mon_warn_on_insecure_global_id_reclaim false

AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED
_______________________________________

Ceph is currently configured to allow clients that reconnect to monitors using
an insecure process to reclaim their previous ``global_id``. Such reclaiming is
allowed because, by default, ``auth_allow_insecure_global_id_reclaim`` is set
to ``true``. It might be necessary to leave this setting enabled while existing
Ceph clients are upgraded to newer versions of Ceph that correctly and securely
reclaim their ``global_id``.

If the ``AUTH_INSECURE_GLOBAL_ID_RECLAIM`` health check has not also been
raised and if the ``auth_expose_insecure_global_id_reclaim`` setting has not
been disabled (it is enabled by default), then there are currently no clients
connected that need to be upgraded. In that case, it is safe to disable
``insecure global_id reclaim`` by running the following command:

.. prompt:: bash $

   ceph config set mon auth_allow_insecure_global_id_reclaim false

On the other hand, if there are still clients that need to be upgraded, then
this alert can be temporarily silenced by running the following command:

.. prompt:: bash $

   ceph health mute AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED 1w   # 1 week

Although we do NOT recommend doing so, you can also disable this alert
indefinitely by running the following command:

.. prompt:: bash $

   ceph config set mon mon_warn_on_insecure_global_id_reclaim_allowed false


Manager
-------

MGR_DOWN
________

All manager daemons are currently down. The cluster should normally have at
least one running manager (``ceph-mgr``) daemon. If no manager daemon is
running, the cluster's ability to monitor itself will be compromised, and parts
of the management API will become unavailable (for example, the dashboard will
not work, and most CLI commands that report metrics or runtime state will
block). However, the cluster will still be able to perform all I/O operations
and to recover from failures.

The "down" manager daemon should be restarted as soon as possible to ensure
that the cluster can be monitored (for example, so that the ``ceph -s``
information is up to date, or so that metrics can be scraped by Prometheus).
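
To confirm that no manager is active and to restart one, run commands such as
the following (the restart command is an example for a systemd-managed host;
containerized deployments manage the daemon differently):

.. prompt:: bash $

   ceph -s
   systemctl restart ceph-mgr@<hostname>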


MGR_MODULE_DEPENDENCY
_____________________

An enabled manager module is failing its dependency check. This health check
typically comes with an explanatory message from the module about the problem.

For example, a module might report that a required package is not installed: in
this case, you should install the required package and restart your manager
daemons.

This health check is applied only to enabled modules. If a module is not
enabled, you can see whether it is reporting dependency issues in the output of
``ceph mgr module ls``.
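
For example, to list the modules that are currently enabled or available, run
the following command:

.. prompt:: bash $

   ceph mgr module ls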


MGR_MODULE_ERROR
________________

A manager module has experienced an unexpected error. Typically, this means
that an unhandled exception was raised from the module's ``serve`` function.
The human-readable description of the error might be obscurely worded if the
exception did not provide a useful description of itself.

This health check might indicate a bug: please open a Ceph bug report if you
think you have encountered a bug.

However, if you believe the error is transient, you may restart your manager
daemon(s) or use ``ceph mgr fail`` on the active daemon in order to force
failover to another daemon.
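
For example, to force a failover and then confirm that a standby manager has
taken over, run commands such as the following (on older releases, ``ceph mgr
fail`` may require the name of the active manager as an argument):

.. prompt:: bash $

   ceph mgr fail
   ceph -s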

OSDs
----

OSD_DOWN
________

One or more OSDs are marked "down". The ceph-osd daemon might have been
stopped, or peer OSDs might be unable to reach the OSD over the network.
Common causes include a stopped or crashed daemon, a "down" host, or a network
outage.

Verify that the host is healthy, the daemon is started, and the network is
functioning. If the daemon has crashed, the daemon log file
(``/var/log/ceph/ceph-osd.*``) might contain debugging information.
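
To list the OSDs currently marked "down" and to check the state of a specific
daemon on its host (the ``systemctl`` form assumes a systemd-managed,
non-containerized deployment), run commands such as:

.. prompt:: bash $

   ceph osd tree down
   systemctl status ceph-osd@<id>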

OSD_<crush type>_DOWN
_____________________

(for example, OSD_HOST_DOWN, OSD_ROOT_DOWN)

All of the OSDs within a particular CRUSH subtree are marked "down" (for
example, all OSDs on a host).

OSD_ORPHAN
__________

An OSD is referenced in the CRUSH map hierarchy, but does not exist.

To remove the OSD from the CRUSH map hierarchy, run the following command:

.. prompt:: bash $

   ceph osd crush rm osd.<id>

OSD_OUT_OF_ORDER_FULL
_____________________

The utilization thresholds for `nearfull`, `backfillfull`, `full`, and/or
`failsafe_full` are not ascending. In particular, the following pattern is
expected: `nearfull < backfillfull`, `backfillfull < full`, and `full <
failsafe_full`.

To adjust these utilization thresholds, run the following commands:

.. prompt:: bash $

   ceph osd set-nearfull-ratio <ratio>
   ceph osd set-backfillfull-ratio <ratio>
   ceph osd set-full-ratio <ratio>


OSD_FULL
________

One or more OSDs have exceeded the `full` threshold and are preventing the
cluster from servicing writes.

To check utilization by pool, run the following command:

.. prompt:: bash $

   ceph df

To see the currently defined `full` ratio, run the following command:

.. prompt:: bash $

   ceph osd dump | grep full_ratio

A short-term workaround to restore write availability is to raise the full
threshold by a small amount. To do so, run the following command:

.. prompt:: bash $

   ceph osd set-full-ratio <ratio>

Additional OSDs should be deployed in order to add new storage to the cluster,
or existing data should be deleted in order to free up space in the cluster.

OSD_BACKFILLFULL
________________

One or more OSDs have exceeded the `backfillfull` threshold or *would* exceed
it if the currently-mapped backfills were to finish, which will prevent data
from rebalancing to this OSD. This alert is an early warning that
rebalancing might be unable to complete and that the cluster is approaching
full.

To check utilization by pool, run the following command:

.. prompt:: bash $

   ceph df

OSD_NEARFULL
____________

One or more OSDs have exceeded the `nearfull` threshold. This alert is an early
warning that the cluster is approaching full.

To check utilization by pool, run the following command:

.. prompt:: bash $

   ceph df

OSDMAP_FLAGS
____________

One or more cluster flags of interest have been set. These flags include:

* *full* - the cluster is flagged as full and cannot serve writes
* *pauserd*, *pausewr* - there are paused reads or writes
* *noup* - OSDs are not allowed to start
* *nodown* - OSD failure reports are being ignored, and that means that the
  monitors will not mark OSDs "down"
* *noin* - OSDs that were previously marked ``out`` are not being marked
  back ``in`` when they start
* *noout* - "down" OSDs are not automatically being marked ``out`` after the
  configured interval
* *nobackfill*, *norecover*, *norebalance* - recovery or data
  rebalancing is suspended
* *noscrub*, *nodeep_scrub* - scrubbing is disabled
* *notieragent* - cache-tiering activity is suspended

With the exception of *full*, these flags can be set or cleared by running the
following commands:

.. prompt:: bash $

   ceph osd set <flag>
   ceph osd unset <flag>

OSD_FLAGS
_________

One or more OSDs, CRUSH nodes, or CRUSH device classes have a flag of interest
set. These flags include:

* *noup*: these OSDs are not allowed to start
* *nodown*: failure reports for these OSDs will be ignored
* *noin*: if these OSDs were previously marked ``out`` automatically
  after a failure, they will not be marked ``in`` when they start
* *noout*: if these OSDs are "down" they will not automatically be marked
  ``out`` after the configured interval

To set and clear these flags in batch, run the following commands:

.. prompt:: bash $

   ceph osd set-group <flags> <who>
   ceph osd unset-group <flags> <who>

For example:

.. prompt:: bash $

   ceph osd set-group noup,noout osd.0 osd.1
   ceph osd unset-group noup,noout osd.0 osd.1
   ceph osd set-group noup,noout host-foo
   ceph osd unset-group noup,noout host-foo
   ceph osd set-group noup,noout class-hdd
   ceph osd unset-group noup,noout class-hdd

OLD_CRUSH_TUNABLES
__________________

The CRUSH map is using very old settings and should be updated. The oldest set
of tunables that can be used (that is, the oldest client version that can
connect to the cluster) without raising this health check is determined by the
``mon_crush_min_required_version`` config option. For more information, see
:ref:`crush-map-tunables`.

OLD_CRUSH_STRAW_CALC_VERSION
____________________________

The CRUSH map is using an older, non-optimal method of calculating intermediate
weight values for ``straw`` buckets.

The CRUSH map should be updated to use the newer method (that is:
``straw_calc_version=1``). For more information, see :ref:`crush-map-tunables`.

CACHE_POOL_NO_HIT_SET
_____________________

One or more cache pools are not configured with a *hit set* to track
utilization. This issue prevents the tiering agent from identifying cold
objects that are to be flushed and evicted from the cache.

To configure hit sets on the cache pool, run the following commands:

.. prompt:: bash $

   ceph osd pool set <poolname> hit_set_type <type>
   ceph osd pool set <poolname> hit_set_period <period-in-seconds>
   ceph osd pool set <poolname> hit_set_count <number-of-hitsets>
   ceph osd pool set <poolname> hit_set_fpp <target-false-positive-rate>

OSD_NO_SORTBITWISE
__________________

No pre-Luminous v12.y.z OSDs are running, but the ``sortbitwise`` flag has not
been set.

The ``sortbitwise`` flag must be set in order for OSDs running Luminous v12.y.z
or newer to start. To safely set the flag, run the following command:

.. prompt:: bash $

   ceph osd set sortbitwise

OSD_FILESTORE
__________________

One or more OSDs are running Filestore. The Filestore OSD back end has been
deprecated; the BlueStore back end has been the default object store since the
Ceph Luminous release.

The ``mclock_scheduler`` is not supported for Filestore OSDs. For this reason,
the default ``osd_op_queue`` is set to ``wpq`` for Filestore OSDs and is
enforced even if the user attempts to change it.

To identify which OSDs are running Filestore, run the following command:

.. prompt:: bash $

   ceph report | jq -c '."osd_metadata" | .[] | select(.osd_objectstore | contains("filestore")) | {id, osd_objectstore}'

**In order to upgrade to Reef or a later release, you must first migrate any
Filestore OSDs to BlueStore.**

If you are upgrading a pre-Reef release to Reef or later, but it is not
feasible to migrate Filestore OSDs to BlueStore immediately, you can
temporarily silence this alert by running the following command:

.. prompt:: bash $

   ceph health mute OSD_FILESTORE

Since this migration can take a considerable amount of time to complete, we
recommend that you begin the process well in advance of any update to Reef or
to later releases.

POOL_FULL
_________

One or more pools have reached their quota and are no longer allowing writes.

To see pool quotas and utilization, run the following command:

.. prompt:: bash $

   ceph df detail

If you opt to raise the pool quota, run the following commands:

.. prompt:: bash $

   ceph osd pool set-quota <poolname> max_objects <num-objects>
   ceph osd pool set-quota <poolname> max_bytes <num-bytes>

If not, delete some existing data to reduce utilization.

BLUEFS_SPILLOVER
________________

One or more OSDs that use the BlueStore back end have been allocated `db`
partitions (that is, storage space for metadata, normally on a faster device),
but because that space has been filled, metadata has "spilled over" onto the
slow device. This is not necessarily an error condition or even unexpected
behavior, but may result in degraded performance. If the administrator had
expected that all metadata would fit on the faster device, this alert indicates
that not enough space was provided.

To disable this alert on all OSDs, run the following command:

.. prompt:: bash $

   ceph config set osd bluestore_warn_on_bluefs_spillover false

Alternatively, to disable the alert on a specific OSD, run the following
command:

.. prompt:: bash $

   ceph config set osd.123 bluestore_warn_on_bluefs_spillover false

To secure more metadata space, you can destroy and reprovision the OSD in
question. This process involves data migration and recovery.

It might also be possible to expand the LVM logical volume that backs the `db`
storage. If the underlying LV has been expanded, you must stop the OSD daemon
and inform BlueFS of the device-size change by running the following command:

.. prompt:: bash $

   ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-$ID

BLUEFS_AVAILABLE_SPACE
______________________

To see how much space is free for BlueFS, run the following command:

.. prompt:: bash $

   ceph daemon osd.123 bluestore bluefs available

This will output up to three values: ``BDEV_DB free``, ``BDEV_SLOW free``, and
``available_from_bluestore``. ``BDEV_DB`` and ``BDEV_SLOW`` report the amount
of space that has been acquired by BlueFS and is now considered free. The value
``available_from_bluestore`` indicates the ability of BlueStore to relinquish
more space to BlueFS. It is normal for this value to differ from the amount of
BlueStore free space, because the BlueFS allocation unit is typically larger
than the BlueStore allocation unit. This means that only part of the BlueStore
free space will be available for BlueFS.

BLUEFS_LOW_SPACE
_________________

If BlueFS is running low on available free space and there is not much free
space available from BlueStore (in other words, `available_from_bluestore` has
a low value), consider reducing the BlueFS allocation unit size. To simulate
available space when the allocation unit is different, run the following
command:

.. prompt:: bash $

   ceph daemon osd.123 bluestore bluefs available <alloc-unit-size>

BLUESTORE_FRAGMENTATION
_______________________

As BlueStore operates, the free space on the underlying storage will become
fragmented. This is normal and unavoidable, but excessive fragmentation causes
slowdown. To inspect BlueStore fragmentation, run the following command:

.. prompt:: bash $

   ceph daemon osd.123 bluestore allocator score block

The fragmentation score is given in a [0-1] range:

* [0.0 .. 0.4] tiny fragmentation
* [0.4 .. 0.7] small, acceptable fragmentation
* [0.7 .. 0.9] considerable, but safe fragmentation
* [0.9 .. 1.0] severe fragmentation, might impact BlueFS's ability to get
  space from BlueStore

To see a detailed report of free fragments, run the following command:

.. prompt:: bash $

   ceph daemon osd.123 bluestore allocator dump block

For OSD processes that are not currently running, fragmentation can be
inspected with `ceph-bluestore-tool`. To see the fragmentation score, run the
following command:

.. prompt:: bash $

   ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-123 --allocator block free-score

To dump detailed free chunks, run the following command:

.. prompt:: bash $

   ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-123 --allocator block free-dump

BLUESTORE_LEGACY_STATFS
_______________________

One or more OSDs have BlueStore volumes that were created prior to the
Nautilus release. (In Nautilus, BlueStore tracks its internal usage
statistics on a granular, per-pool basis.)

If *all* OSDs are older than Nautilus, this means that the per-pool metrics
are simply unavailable. But if there is a mixture of pre-Nautilus and
post-Nautilus OSDs, the cluster usage statistics reported by ``ceph df`` will
be inaccurate.

The old OSDs can be updated to use the new usage-tracking scheme by stopping
each OSD, running a repair operation, and then restarting the OSD. For example,
to update ``osd.123``, run the following commands:

.. prompt:: bash $

   systemctl stop ceph-osd@123
   ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-123
   systemctl start ceph-osd@123

To disable this alert, run the following command:

.. prompt:: bash $

   ceph config set global bluestore_warn_on_legacy_statfs false

BLUESTORE_NO_PER_POOL_OMAP
__________________________

One or more OSDs have volumes that were created prior to the Octopus release.
(In Octopus and later releases, BlueStore tracks omap space utilization by
pool.)

If there are any BlueStore OSDs that do not have the new tracking enabled, the
cluster will report an approximate value for per-pool omap usage based on the
most recent deep scrub.

The OSDs can be updated to track by pool by stopping each OSD, running a repair
operation, and then restarting the OSD. For example, to update ``osd.123``, run
the following commands:

.. prompt:: bash $

   systemctl stop ceph-osd@123
   ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-123
   systemctl start ceph-osd@123

To disable this alert, run the following command:

.. prompt:: bash $

   ceph config set global bluestore_warn_on_no_per_pool_omap false

BLUESTORE_NO_PER_PG_OMAP
__________________________

One or more OSDs have volumes that were created prior to the Pacific release.
(In Pacific and later releases, BlueStore tracks omap space utilization by
Placement Group (PG).)

Per-PG omap allows faster PG removal when PGs migrate.

The older OSDs can be updated to track by PG by stopping each OSD, running a
repair operation, and then restarting the OSD. For example, to update
``osd.123``, run the following commands:

.. prompt:: bash $

   systemctl stop ceph-osd@123
   ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-123
   systemctl start ceph-osd@123

To disable this alert, run the following command:

.. prompt:: bash $

   ceph config set global bluestore_warn_on_no_per_pg_omap false


BLUESTORE_DISK_SIZE_MISMATCH
____________________________

One or more BlueStore OSDs have an internal inconsistency between the size of
the physical device and the metadata that tracks its size. This inconsistency
can lead to the OSD(s) crashing in the future.

The OSDs that have this inconsistency should be destroyed and reprovisioned. Be
very careful to execute this procedure on only one OSD at a time, so as to
minimize the risk of losing any data. To execute this procedure, where ``$N``
is the OSD that has the inconsistency, run the following commands:

.. prompt:: bash $

   ceph osd out osd.$N
   while ! ceph osd safe-to-destroy osd.$N ; do sleep 1m ; done
   ceph osd destroy osd.$N
   ceph-volume lvm zap /path/to/device
   ceph-volume lvm create --osd-id $N --data /path/to/device

.. note::

   Wait for this recovery procedure to complete on one OSD before running it
   on the next.

BLUESTORE_NO_COMPRESSION
________________________

One or more OSDs are unable to load a BlueStore compression plugin. This issue
might be caused by a broken installation, in which the ``ceph-osd`` binary does
not match the compression plugins. Or it might be caused by a recent upgrade in
which the ``ceph-osd`` daemon was not restarted.

To resolve this issue, verify that all of the packages on the host that is
running the affected OSD(s) are correctly installed and that the OSD daemon(s)
have been restarted. If the problem persists, check the OSD log for information
about the source of the problem.

BLUESTORE_SPURIOUS_READ_ERRORS
______________________________

One or more BlueStore OSDs detect spurious read errors on the main device.
BlueStore has recovered from these errors by retrying disk reads. This alert
might indicate issues with underlying hardware, issues with the I/O subsystem,
or something similar. In theory, such issues can cause permanent data
corruption. Some observations on the root cause of spurious read errors can be
found here: https://tracker.ceph.com/issues/22464

This alert does not require an immediate response, but the affected host might
need additional attention: for example, upgrading the host to the latest
OS/kernel versions and implementing hardware-resource-utilization monitoring.

To disable this alert on all OSDs, run the following command:

.. prompt:: bash $

   ceph config set osd bluestore_warn_on_spurious_read_errors false

Or, to disable this alert on a specific OSD, run the following command:

.. prompt:: bash $

   ceph config set osd.123 bluestore_warn_on_spurious_read_errors false

Device health
-------------

DEVICE_HEALTH
_____________

One or more OSD devices are expected to fail soon, where the warning threshold
is determined by the ``mgr/devicehealth/warn_threshold`` config option.

Because this alert applies only to OSDs that are currently marked ``in``, the
appropriate response to this expected failure is (1) to mark the OSD ``out`` so
that data is migrated off of the OSD, and then (2) to remove the hardware from
the system. Note that this marking ``out`` is normally done automatically if
``mgr/devicehealth/self_heal`` is enabled (as determined by
``mgr/devicehealth/mark_out_threshold``).
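
If self-heal is not enabled, the failing OSD can be marked ``out`` manually so
that data migrates off of it; for example:

.. prompt:: bash $

   ceph osd out osd.<id>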

To check device health, run the following command:

.. prompt:: bash $

   ceph device info <device-id>

Device life expectancy is set either by a prediction model that the mgr runs or
by an external tool that is activated by running the following command:

.. prompt:: bash $

   ceph device set-life-expectancy <device-id> <from> <to>

You can change the stored life expectancy manually, but such a change usually
doesn't accomplish anything. The reason for this is that whichever tool
originally set the stored life expectancy will probably undo your change by
setting it again, and a change to the stored value does not affect the actual
health of the hardware device.

DEVICE_HEALTH_IN_USE
____________________

One or more devices (that is, OSDs) are expected to fail soon and have been
marked ``out`` of the cluster (as controlled by
``mgr/devicehealth/mark_out_threshold``), but they are still participating in
one or more Placement Groups. This might be because the OSD(s) were marked
``out`` only recently and data is still migrating, or because data cannot be
migrated off of the OSD(s) for some reason (for example, the cluster is nearly
full, or the CRUSH hierarchy is structured so that there isn't another suitable
OSD to migrate the data to).

This message can be silenced by disabling self-heal behavior (that is, setting
``mgr/devicehealth/self_heal`` to ``false``), by adjusting
``mgr/devicehealth/mark_out_threshold``, or by addressing whichever condition
is preventing data from being migrated off of the ailing OSD(s).

.. _rados_health_checks_device_health_toomany:

DEVICE_HEALTH_TOOMANY
_____________________

Too many devices (that is, OSDs) are expected to fail soon, and because
``mgr/devicehealth/self_heal`` behavior is enabled, marking ``out`` all of the
ailing OSDs would exceed the cluster's ``mon_osd_min_in_ratio`` ratio. This
ratio prevents a cascade of too many OSDs from being automatically marked
``out``.

You should promptly add new OSDs to the cluster to prevent data loss, or
incrementally replace the failing OSDs.

Alternatively, you can silence this health check by adjusting options including
``mon_osd_min_in_ratio`` or ``mgr/devicehealth/mark_out_threshold``. Be
warned, however, that this will increase the likelihood of unrecoverable data
loss.
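
A sketch of adjusting these options (the values shown are examples only, not
recommendations):

.. prompt:: bash $

   ceph config set mon mon_osd_min_in_ratio 0.5
   ceph config set mgr mgr/devicehealth/mark_out_threshold 2w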


Data health (pools & placement groups)
--------------------------------------

PG_AVAILABILITY
_______________

Data availability is reduced. In other words, the cluster is unable to service
potential read or write requests for at least some data in the cluster. More
precisely, one or more Placement Groups (PGs) are in a state that does not
allow I/O requests to be serviced. Any of the following PG states are
problematic if they do not clear quickly: *peering*, *stale*, *incomplete*, and
the lack of *active*.

For detailed information about which PGs are affected, run the following
command:

.. prompt:: bash $

   ceph health detail

In most cases, the root cause of this issue is that one or more OSDs are
currently ``down``: see ``OSD_DOWN`` above.

To see the state of a specific problematic PG, run the following command:

.. prompt:: bash $

   ceph tell <pgid> query

PG_DEGRADED
___________

Data redundancy is reduced for some data: in other words, the cluster does not
have the desired number of replicas for all data (in the case of replicated
pools) or erasure code fragments (in the case of erasure-coded pools). More
precisely, one or more Placement Groups (PGs):

* have the *degraded* or *undersized* flag set, which means that there are not
  enough instances of that PG in the cluster; or
* have not had the *clean* state set for a long time.

For detailed information about which PGs are affected, run the following
command:

.. prompt:: bash $

   ceph health detail

In most cases, the root cause of this issue is that one or more OSDs are
currently "down": see ``OSD_DOWN`` above.

To see the state of a specific problematic PG, run the following command:

.. prompt:: bash $

   ceph tell <pgid> query


PG_RECOVERY_FULL
________________

Data redundancy might be reduced or even put at risk for some data due to a
lack of free space in the cluster. More precisely, one or more Placement Groups
have the *recovery_toofull* flag set, which means that the cluster is unable to
migrate or recover data because one or more OSDs are above the ``full``
threshold.

For steps to resolve this condition, see *OSD_FULL* above.

PG_BACKFILL_FULL
________________

Data redundancy might be reduced or even put at risk for some data due to a
lack of free space in the cluster. More precisely, one or more Placement Groups
have the *backfill_toofull* flag set, which means that the cluster is unable to
migrate or recover data because one or more OSDs are above the ``backfillfull``
threshold.

For steps to resolve this condition, see *OSD_BACKFILLFULL* above.

PG_DAMAGED
__________

Data scrubbing has discovered problems with data consistency in the cluster.
More precisely, one or more Placement Groups either (1) have the *inconsistent*
or ``snaptrim_error`` flag set, which indicates that an earlier data scrub
operation found a problem, or (2) have the *repair* flag set, which means that
a repair for such an inconsistency is currently in progress.

For more information, see :doc:`pg-repair`.
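
As a sketch only (see :doc:`pg-repair` for the full procedure and its
caveats), an inconsistent PG reported by ``ceph health detail`` can be
repaired by running the following command:

.. prompt:: bash $

   ceph pg repair <pgid>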

OSD_SCRUB_ERRORS
________________

Recent OSD scrubs have discovered inconsistencies. This alert is generally
paired with *PG_DAMAGED* (see above).

For more information, see :doc:`pg-repair`.

OSD_TOO_MANY_REPAIRS
____________________

The count of read repairs has exceeded the config value threshold
``mon_osd_warn_num_repaired`` (default: ``10``). Because scrub handles errors
only for data at rest, and because any read error that occurs when another
replica is available will be repaired immediately so that the client can get
the object data, there might exist failing disks that are not registering any
scrub errors. This repair count is maintained as a way of identifying any such
failing disks.
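
To identify which OSD has been performing the repairs (and therefore which
underlying device deserves closer inspection), run the following command:

.. prompt:: bash $

   ceph health detail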


LARGE_OMAP_OBJECTS
__________________

One or more pools contain large omap objects, as determined by
``osd_deep_scrub_large_omap_object_key_threshold`` (threshold for the number of
keys to determine what is considered a large omap object) or
``osd_deep_scrub_large_omap_object_value_sum_threshold`` (the threshold for the
summed size in bytes of all key values to determine what is considered a large
omap object) or both. To find more information on object name, key count, and
size in bytes, search the cluster log for 'Large omap object found'. This issue
can be caused by RGW-bucket index objects that do not have automatic resharding
enabled. For more information on resharding, see :ref:`RGW Dynamic Bucket Index
Resharding <rgw_dynamic_bucket_index_resharding>`.
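
As a sketch (the log path shown assumes a default, non-containerized
installation), the relevant cluster log entries can be found with:

.. prompt:: bash $

   grep 'Large omap object found' /var/log/ceph/ceph.log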
11fdf7f2 | 990 | |
1e59de90 | 991 | To adjust the thresholds mentioned above, run the following commands: |
11fdf7f2 | 992 | |
39ae355f TL |
993 | .. prompt:: bash $ |
994 | ||
995 | ceph config set osd osd_deep_scrub_large_omap_object_key_threshold <keys> | |
996 | ceph config set osd osd_deep_scrub_large_omap_object_value_sum_threshold <bytes> | |
11fdf7f2 | 997 | |
c07f9fc5 FG |
998 | CACHE_POOL_NEAR_FULL |
999 | ____________________ | |
1000 | ||
1e59de90 TL |
1001 | A cache-tier pool is nearly full, as determined by the ``target_max_bytes`` and |
1002 | ``target_max_objects`` properties of the cache pool. Once the pool reaches the | |
1003 | target threshold, write requests to the pool might block while data is flushed | |
1004 | and evicted from the cache. This state normally leads to very high latencies | |
1005 | and poor performance. | |
c07f9fc5 | 1006 | |
1e59de90 | 1007 | To adjust the cache pool's target size, run the following commands: |
39ae355f TL |
1008 | |
1009 | .. prompt:: bash $ | |
c07f9fc5 | 1010 | |
39ae355f TL |
1011 | ceph osd pool set <cache-pool-name> target_max_bytes <bytes> |
1012 | ceph osd pool set <cache-pool-name> target_max_objects <objects> | |
c07f9fc5 | 1013 | |
1e59de90 TL |
1014 | There might be other reasons that normal cache flush and evict activity are |
1015 | throttled: for example, reduced availability of the base tier, reduced | |
1016 | performance of the base tier, or overall cluster load. | |
c07f9fc5 FG |
1017 | |
1018 | TOO_FEW_PGS | |
1019 | ___________ | |
1020 | ||
1e59de90 TL |
1021 | The number of Placement Groups (PGs) that are in use in the cluster is below |
1022 | the configurable threshold of ``mon_pg_warn_min_per_osd`` PGs per OSD. This can | |
1023 | lead to suboptimal distribution and suboptimal balance of data across the OSDs | |
1024 | in the cluster, and a reduction of overall performance. | |
c07f9fc5 | 1025 | |
1e59de90 | 1026 | If data pools have not yet been created, this condition is expected. |
c07f9fc5 | 1027 | |
1e59de90 TL |
1028 | To address this issue, you can increase the PG count for existing pools or |
1029 | create new pools. For more information, see | |
1030 | :ref:`choosing-number-of-placement-groups`. | |
11fdf7f2 | 1031 | |
92f5a8d4 TL |
1032 | POOL_PG_NUM_NOT_POWER_OF_TWO |
1033 | ____________________________ | |
1034 | ||
1e59de90 TL |
1035 | One or more pools have a ``pg_num`` value that is not a power of two. Although |
1036 | this is not strictly incorrect, it does lead to a less balanced distribution of | |
1037 | data because some Placement Groups will have roughly twice as much data as | |
1038 | others have. | |
92f5a8d4 | 1039 | |
1e59de90 TL |
1040 | This is easily corrected by setting the ``pg_num`` value for the affected |
1041 | pool(s) to a nearby power of two. To do so, run the following command: | |
39ae355f TL |
1042 | |
1043 | .. prompt:: bash $ | |
92f5a8d4 | 1044 | |
39ae355f | 1045 | ceph osd pool set <pool-name> pg_num <value> |
92f5a8d4 | 1046 | |
1e59de90 | 1047 | To disable this health check, run the following command: |
92f5a8d4 | 1048 | |
39ae355f TL |
1049 | .. prompt:: bash $ |
1050 | ||
1051 | ceph config set global mon_warn_on_pool_pg_num_not_power_of_two false | |
92f5a8d4 | 1052 | |
11fdf7f2 TL |
1053 | POOL_TOO_FEW_PGS |
1054 | ________________ | |
1055 | ||
1e59de90 TL |
1056 | One or more pools should probably have more Placement Groups (PGs), given the |
1057 | amount of data that is currently stored in the pool. This issue can lead to | |
1058 | suboptimal distribution and suboptimal balance of data across the OSDs in the | |
1059 | cluster, and a reduction of overall performance. This alert is raised only if | |
1060 | the ``pg_autoscale_mode`` property on the pool is set to ``warn``. | |
11fdf7f2 | 1061 | |
1e59de90 TL |
1062 | To disable the alert, entirely disable auto-scaling of PGs for the pool by |
1063 | running the following command: | |
39ae355f TL |
1064 | |
1065 | .. prompt:: bash $ | |
1066 | ||
1067 | ceph osd pool set <pool-name> pg_autoscale_mode off | |
11fdf7f2 | 1068 | |
1e59de90 TL |
1069 | To allow the cluster to automatically adjust the number of PGs for the pool, |
1070 | run the following command: | |
11fdf7f2 | 1071 | |
39ae355f | 1072 | .. prompt:: bash $ |
11fdf7f2 | 1073 | |
39ae355f | 1074 | ceph osd pool set <pool-name> pg_autoscale_mode on |
11fdf7f2 | 1075 | |
1e59de90 TL |
1076 | Alternatively, to manually set the number of PGs for the pool to the |
1077 | recommended amount, run the following command: | |
11fdf7f2 | 1078 | |
39ae355f TL |
1079 | .. prompt:: bash $ |
1080 | ||
1081 | ceph osd pool set <pool-name> pg_num <new-pg-num> | |
11fdf7f2 | 1082 | |
1e59de90 TL |
1083 | For more information, see :ref:`choosing-number-of-placement-groups` and |
1084 | :ref:`pg-autoscaler`. | |
c07f9fc5 FG |
1085 | |
1086 | TOO_MANY_PGS | |
1087 | ____________ | |
1088 | ||
1e59de90 TL |
1089 | The number of Placement Groups (PGs) in use in the cluster is above the |
1090 | configurable threshold of ``mon_max_pg_per_osd`` PGs per OSD. If this threshold | |
1091 | is exceeded, the cluster will not allow new pools to be created, pool `pg_num` | |
1092 | to be increased, or pool replication to be increased (any of which, if allowed, | |
1093 | would lead to more PGs in the cluster). A large number of PGs can lead to | |
1094 | higher memory utilization for OSD daemons, slower peering after cluster state | |
1095 | changes (for example, OSD restarts, additions, or removals), and higher load on | |
1096 | the Manager and Monitor daemons. | |
c07f9fc5 | 1097 | |
1e59de90 TL |
1098 | The simplest way to mitigate the problem is to increase the number of OSDs in |
1099 | the cluster by adding more hardware. Note that, because the OSD count that is | |
1100 | used for the purposes of this health check is the number of ``in`` OSDs, | |
1101 | marking ``out`` OSDs ``in`` (if there are any ``out`` OSDs available) can also | |
1102 | help. To do so, run the following command: | |
39ae355f TL |
1103 | |
1104 | .. prompt:: bash $ | |
c07f9fc5 | 1105 | |
39ae355f | 1106 | ceph osd in <osd id(s)> |
c07f9fc5 | 1107 | |
1e59de90 | 1108 | For more information, see :ref:`choosing-number-of-placement-groups`. |
11fdf7f2 TL |
1109 | |
1110 | POOL_TOO_MANY_PGS | |
1111 | _________________ | |
1112 | ||
1e59de90 TL |
1113 | One or more pools should probably have fewer Placement Groups (PGs), given the |
1114 | amount of data that is currently stored in the pool. This issue can lead to | |
1115 | higher memory utilization for OSD daemons, slower peering after cluster state | |
1116 | changes (for example, OSD restarts, additions, or removals), and higher load on | |
1117 | the Manager and Monitor daemons. This alert is raised only if the | |
11fdf7f2 TL |
1118 | ``pg_autoscale_mode`` property on the pool is set to ``warn``. |
1119 | ||
1e59de90 TL |
1120 | To disable the alert, entirely disable auto-scaling of PGs for the pool by |
1121 | running the following command: | |
39ae355f TL |
1122 | |
1123 | .. prompt:: bash $ | |
11fdf7f2 | 1124 | |
39ae355f | 1125 | ceph osd pool set <pool-name> pg_autoscale_mode off |
11fdf7f2 | 1126 | |
1e59de90 TL |
1127 | To allow the cluster to automatically adjust the number of PGs for the pool, |
1128 | run the following command: | |
11fdf7f2 | 1129 | |
39ae355f TL |
1130 | .. prompt:: bash $ |
1131 | ||
1132 | ceph osd pool set <pool-name> pg_autoscale_mode on | |
11fdf7f2 | 1133 | |
1e59de90 TL |
1134 | Alternatively, to manually set the number of PGs for the pool to the |
1135 | recommended amount, run the following command: | |
39ae355f TL |
1136 | |
1137 | .. prompt:: bash $ | |
11fdf7f2 | 1138 | |
39ae355f | 1139 | ceph osd pool set <pool-name> pg_num <new-pg-num> |
11fdf7f2 | 1140 | |
1e59de90 TL |
1141 | For more information, see :ref:`choosing-number-of-placement-groups` and |
1142 | :ref:`pg-autoscaler`. | |
1143 | ||
11fdf7f2 | 1144 | |
9f95a23c | 1145 | POOL_TARGET_SIZE_BYTES_OVERCOMMITTED |
11fdf7f2 TL |
1146 | ____________________________________ |
1147 | ||
1e59de90 TL |
1148 | One or more pools have a ``target_size_bytes`` property that is set in order to |
1149 | estimate the expected size of the pool, but the value(s) of this property are | |
1150 | greater than the total available storage (either by themselves or in | |
1151 | combination with other pools). | |
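
To review the ``target_size_bytes`` value that is currently set on a pool and to
compare it against the cluster's total capacity, you can run commands like the
following (``<pool-name>`` is a placeholder for the pool in question):

.. prompt:: bash $

   ceph osd pool get <pool-name> target_size_bytes
   ceph df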
11fdf7f2 | 1152 | |
1e59de90 TL |
1153 | This alert is usually an indication that the ``target_size_bytes`` value for |
1154 | the pool is too large and should be reduced or set to zero. To reduce the | |
1155 | ``target_size_bytes`` value or set it to zero, run the following command: | |
39ae355f TL |
1156 | |
1157 | .. prompt:: bash $ | |
11fdf7f2 | 1158 | |
39ae355f | 1159 | ceph osd pool set <pool-name> target_size_bytes 0 |
11fdf7f2 | 1160 | |
1e59de90 TL |
1161 | The above command sets the value of ``target_size_bytes`` to zero. To set the |
1162 | value of ``target_size_bytes`` to a non-zero value, replace the ``0`` with that | |
1163 | non-zero value. | |
1164 | ||
11fdf7f2 TL |
1165 | For more information, see :ref:`specifying_pool_target_size`. |
1166 | ||
9f95a23c | 1167 | POOL_HAS_TARGET_SIZE_BYTES_AND_RATIO |
11fdf7f2 TL |
1168 | ____________________________________ |
1169 | ||
1e59de90 TL |
1170 | One or more pools have both ``target_size_bytes`` and ``target_size_ratio`` set |
1171 | in order to estimate the expected size of the pool. Only one of these | |
1172 | properties should be non-zero. If both are set to a non-zero value, then | |
1173 | ``target_size_ratio`` takes precedence and ``target_size_bytes`` is ignored. | |
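
To check which of the two properties is set on a given pool before deciding which
one to clear, you can run commands like the following:

.. prompt:: bash $

   ceph osd pool get <pool-name> target_size_bytes
   ceph osd pool get <pool-name> target_size_ratio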
11fdf7f2 | 1174 | |
1e59de90 | 1175 | To reset ``target_size_bytes`` to zero, run the following command: |
11fdf7f2 | 1176 | |
39ae355f TL |
1177 | .. prompt:: bash $ |
1178 | ||
1179 | ceph osd pool set <pool-name> target_size_bytes 0 | |
11fdf7f2 TL |
1180 | |
1181 | For more information, see :ref:`specifying_pool_target_size`. | |
c07f9fc5 | 1182 | |
eafe8130 TL |
1183 | TOO_FEW_OSDS |
1184 | ____________ | |
1185 | ||
1e59de90 TL |
1186 | The number of OSDs in the cluster is below the configurable threshold of |
1187 | ``osd_pool_default_size``. This means that some or all data may not be able to | |
1188 | satisfy the data protection policy specified in CRUSH rules and pool settings. | |
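
For example, to compare the current number of OSDs against the configured
threshold, run the following commands:

.. prompt:: bash $

   ceph osd stat
   ceph config get mon osd_pool_default_size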
eafe8130 | 1189 | |
c07f9fc5 FG |
1190 | SMALLER_PGP_NUM |
1191 | _______________ | |
1192 | ||
1e59de90 TL |
1193 | One or more pools have a ``pgp_num`` value less than ``pg_num``. This alert is |
1194 | normally an indication that the Placement Group (PG) count was increased | |
1195 | without any increase in the placement behavior. | |
c07f9fc5 | 1196 | |
1e59de90 TL |
1197 | This disparity is sometimes brought about deliberately, in order to separate |
1198 | out the `split` step when the PG count is adjusted from the data migration that | |
1199 | is needed when ``pgp_num`` is changed. | |
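
To check the current values of ``pg_num`` and ``pgp_num`` for a pool, run the
following commands:

.. prompt:: bash $

   ceph osd pool get <pool-name> pg_num
   ceph osd pool get <pool-name> pgp_num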
c07f9fc5 | 1200 | |
1e59de90 TL |
1201 | This issue is normally resolved by setting ``pgp_num`` to match ``pg_num``, so |
1202 | as to trigger the data migration, by running the following command: | |
39ae355f TL |
1203 | |
1204 | .. prompt:: bash $ | |
c07f9fc5 | 1205 | |
39ae355f | 1206 | ceph osd pool set <pool> pgp_num <pg-num-value> |
c07f9fc5 | 1207 | |
c07f9fc5 FG |
1208 | MANY_OBJECTS_PER_PG |
1209 | ___________________ | |
1210 | ||
1e59de90 TL |
1211 | One or more pools have an average number of objects per Placement Group (PG) |
1212 | that is significantly higher than the overall cluster average. The specific | |
1213 | threshold is determined by the ``mon_pg_warn_max_object_skew`` configuration | |
1214 | value. | |
c07f9fc5 | 1215 | |
1e59de90 TL |
1216 | This alert is usually an indication that the pool(s) that contain most of the |
1217 | data in the cluster have too few PGs, or that other pools that contain less | |
1218 | data have too many PGs. See *TOO_MANY_PGS* above. | |
c07f9fc5 | 1219 | |
1e59de90 TL |
1220 | To silence the health check, raise the threshold by adjusting the |
1221 | ``mon_pg_warn_max_object_skew`` config option on the managers. | |
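
For example, the following command raises the threshold to an illustrative value
of ``20`` (the default is ``10``):

.. prompt:: bash $

   ceph config set mgr mon_pg_warn_max_object_skew 20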
c07f9fc5 | 1222 | |
1e59de90 | 1223 | The health check will be silenced for a specific pool only if |
20effc67 | 1224 | ``pg_autoscale_mode`` is set to ``on``. |
11fdf7f2 | 1225 | |
c07f9fc5 FG |
1226 | POOL_APP_NOT_ENABLED |
1227 | ____________________ | |
1228 | ||
aee94f69 TL |
1229 | A pool exists but the pool has not been tagged for use by a particular |
1230 | application. | |
c07f9fc5 | 1231 | |
1e59de90 TL |
1232 | To resolve this issue, tag the pool for use by an application. For |
1233 | example, if the pool is used by RBD, run the following command: | |
39ae355f TL |
1234 | |
1235 | .. prompt:: bash $ | |
c07f9fc5 | 1236 | |
39ae355f | 1237 | rbd pool init <poolname> |
c07f9fc5 | 1238 | |
1e59de90 TL |
1239 | Alternatively, if the pool is being used by a custom application (here 'foo'), |
1240 | you can label the pool by running the following low-level command: | |
c07f9fc5 | 1241 | |
39ae355f TL |
1242 | .. prompt:: bash $ |
1243 | ||
   ceph osd pool application enable <pool-name> foo
c07f9fc5 | 1245 | |
11fdf7f2 | 1246 | For more information, see :ref:`associate-pool-to-application`. |
c07f9fc5 FG |
1247 | |
1248 | POOL_FULL | |
1249 | _________ | |
1250 | ||
1e59de90 TL |
1251 | One or more pools have reached (or are very close to reaching) their quota. The |
1252 | threshold to raise this health check is determined by the | |
1253 | ``mon_pool_quota_crit_threshold`` configuration option. | |
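
To see the quotas that are currently set on a pool, together with the pool's
current usage, you can run, for example:

.. prompt:: bash $

   ceph osd pool get-quota <pool-name>
   ceph df detail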
c07f9fc5 | 1254 | |
1e59de90 TL |
1255 | Pool quotas can be adjusted up or down (or removed) by running the following |
1256 | commands: | |
39ae355f TL |
1257 | |
1258 | .. prompt:: bash $ | |
c07f9fc5 | 1259 | |
39ae355f TL |
1260 | ceph osd pool set-quota <pool> max_bytes <bytes> |
1261 | ceph osd pool set-quota <pool> max_objects <objects> | |
c07f9fc5 | 1262 | |
1e59de90 | 1263 | To disable a quota, set the quota value to 0. |
c07f9fc5 FG |
1264 | |
1265 | POOL_NEAR_FULL | |
1266 | ______________ | |
1267 | ||
1e59de90 | 1268 | One or more pools are approaching a configured fullness threshold. |
f67539c2 | 1269 | |
1e59de90 TL |
1270 | One of the several thresholds that can raise this health check is determined by |
1271 | the ``mon_pool_quota_warn_threshold`` configuration option. | |
c07f9fc5 | 1272 | |
1e59de90 TL |
1273 | Pool quotas can be adjusted up or down (or removed) by running the following |
1274 | commands: | |
c07f9fc5 | 1275 | |
39ae355f TL |
1276 | .. prompt:: bash $ |
1277 | ||
1278 | ceph osd pool set-quota <pool> max_bytes <bytes> | |
1279 | ceph osd pool set-quota <pool> max_objects <objects> | |
c07f9fc5 | 1280 | |
1e59de90 | 1281 | To disable a quota, set the quota value to 0. |
c07f9fc5 | 1282 | |
1e59de90 TL |
1283 | Other thresholds that can raise the two health checks above are |
1284 | ``mon_osd_nearfull_ratio`` and ``mon_osd_full_ratio``. For details and | |
1285 | resolution, see :ref:`storage-capacity` and :ref:`no-free-drive-space`. | |
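
The ratios that are currently in effect can be reviewed by running, for example:

.. prompt:: bash $

   ceph osd dump | grep ratio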
f67539c2 | 1286 | |
c07f9fc5 FG |
1287 | OBJECT_MISPLACED |
1288 | ________________ | |
1289 | ||
1e59de90 TL |
1290 | One or more objects in the cluster are not stored on the node that CRUSH would |
1291 | prefer that they be stored on. This alert is an indication that data migration | |
1292 | due to a recent cluster change has not yet completed. | |
c07f9fc5 | 1293 | |
1e59de90 TL |
1294 | Misplaced data is not a dangerous condition in and of itself; data consistency |
1295 | is never at risk, and old copies of objects will not be removed until the | |
1296 | desired number of new copies (in the desired locations) has been created. | |
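
The progress of the data migration can be monitored by running, for example, the
following command, which reports the number and percentage of misplaced objects
while recovery is under way:

.. prompt:: bash $

   ceph status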
c07f9fc5 FG |
1297 | |
1298 | OBJECT_UNFOUND | |
1299 | ______________ | |
1300 | ||
1e59de90 TL |
1301 | One or more objects in the cluster cannot be found. More precisely, the OSDs |
1302 | know that a new or updated copy of an object should exist, but no such copy has | |
1303 | been found on OSDs that are currently online. | |
c07f9fc5 FG |
1304 | |
1305 | Read or write requests to unfound objects will block. | |
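
To see which PGs report unfound objects, and to list the unfound objects within a
specific PG, you can run, for example:

.. prompt:: bash $

   ceph health detail
   ceph pg <pgid> list_unfound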
1306 | ||
1e59de90 TL |
1307 | Ideally, a "down" OSD that has a more recent copy of the unfound object can be |
1308 | brought back online. To identify candidate OSDs, check the peering state of the | |
1309 | PG(s) responsible for the unfound object. To see the peering state, run the | |
1310 | following command: | |
39ae355f TL |
1311 | |
1312 | .. prompt:: bash $ | |
c07f9fc5 | 1313 | |
39ae355f | 1314 | ceph tell <pgid> query |
c07f9fc5 | 1315 | |
1e59de90 TL |
1316 | On the other hand, if the latest copy of the object is not available, the |
1317 | cluster can be told to roll back to a previous version of the object. For more | |
1318 | information, see :ref:`failures-osd-unfound`. | |
c07f9fc5 | 1319 | |
11fdf7f2 TL |
1320 | SLOW_OPS |
1321 | ________ | |
c07f9fc5 | 1322 | |
1e59de90 TL |
1323 | One or more OSD requests or monitor requests are taking a long time to process. |
1324 | This alert might be an indication of extreme load, a slow storage device, or a | |
1325 | software bug. | |
c07f9fc5 | 1326 | |
1e59de90 TL |
1327 | To query the request queue for the daemon that is causing the slowdown, run the |
1328 | following command from the daemon's host: | |
39ae355f TL |
1329 | |
1330 | .. prompt:: bash $ | |
c07f9fc5 | 1331 | |
39ae355f | 1332 | ceph daemon osd.<id> ops |
c07f9fc5 | 1333 | |
1e59de90 | 1334 | To see a summary of the slowest recent requests, run the following command: |
c07f9fc5 | 1335 | |
39ae355f | 1336 | .. prompt:: bash $ |
c07f9fc5 | 1337 | |
39ae355f | 1338 | ceph daemon osd.<id> dump_historic_ops |
c07f9fc5 | 1339 | |
1e59de90 | 1340 | To see the location of a specific OSD, run the following command: |
39ae355f TL |
1341 | |
1342 | .. prompt:: bash $ | |
1343 | ||
1344 | ceph osd find osd.<id> | |
c07f9fc5 | 1345 | |
c07f9fc5 FG |
1346 | PG_NOT_SCRUBBED |
1347 | _______________ | |
1348 | ||
1e59de90 TL |
1349 | One or more Placement Groups (PGs) have not been scrubbed recently. PGs are |
1350 | normally scrubbed within an interval determined by | |
:confval:`osd_scrub_max_interval` globally. This interval can be overridden on a
per-pool basis by changing the value of the variable
1353 | :confval:`scrub_max_interval`. This health check is raised if a certain | |
1354 | percentage (determined by ``mon_warn_pg_not_scrubbed_ratio``) of the interval | |
1355 | has elapsed after the time the scrub was scheduled and no scrub has been | |
1356 | performed. | |
1357 | ||
1358 | PGs will be scrubbed only if they are flagged as ``clean`` (which means that | |
1359 | they are to be cleaned, and not that they have been examined and found to be | |
1360 | clean). Misplaced or degraded PGs will not be flagged as ``clean`` (see | |
1361 | *PG_AVAILABILITY* and *PG_DEGRADED* above). | |
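
To see which PGs are affected, including the time of each PG's last scrub, run the
following command:

.. prompt:: bash $

   ceph health detail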
c07f9fc5 | 1362 | |
1e59de90 | 1363 | To manually initiate a scrub of a clean PG, run the following command: |
c07f9fc5 | 1364 | |
.. prompt:: bash $
c07f9fc5 | 1366 | |
1e59de90 | 1367 | ceph pg scrub <pgid> |
c07f9fc5 FG |
1368 | |
1369 | PG_NOT_DEEP_SCRUBBED | |
1370 | ____________________ | |
1371 | ||
1e59de90 TL |
One or more Placement Groups (PGs) have not been deep scrubbed recently. PGs are
normally deep scrubbed every :confval:`osd_deep_scrub_interval` seconds at most.
This health check is raised if a certain percentage (determined by
``mon_warn_pg_not_deep_scrubbed_ratio``) of the interval has elapsed after the
time the deep scrub was scheduled and no deep scrub has been performed.
c07f9fc5 | 1377 | |
1e59de90 TL |
PGs will receive a deep scrub only if they are flagged as ``clean`` (which means
1379 | that they are to be cleaned, and not that they have been examined and found to | |
1380 | be clean). Misplaced or degraded PGs might not be flagged as ``clean`` (see | |
1381 | *PG_AVAILABILITY* and *PG_DEGRADED* above). | |
c07f9fc5 | 1382 | |
1e59de90 | 1383 | To manually initiate a deep scrub of a clean PG, run the following command: |
c07f9fc5 | 1384 | |
39ae355f TL |
1385 | .. prompt:: bash $ |
1386 | ||
1387 | ceph pg deep-scrub <pgid> | |
eafe8130 TL |
1388 | |
1389 | ||
9f95a23c TL |
1390 | PG_SLOW_SNAP_TRIMMING |
1391 | _____________________ | |
1392 | ||
1e59de90 TL |
1393 | The snapshot trim queue for one or more PGs has exceeded the configured warning |
1394 | threshold. This alert indicates either that an extremely large number of | |
1395 | snapshots was recently deleted, or that OSDs are unable to trim snapshots | |
1396 | quickly enough to keep up with the rate of new snapshot deletions. | |
9f95a23c | 1397 | |
1e59de90 TL |
1398 | The warning threshold is determined by the ``mon_osd_snap_trim_queue_warn_on`` |
1399 | option (default: 32768). | |
9f95a23c | 1400 | |
1e59de90 TL |
1401 | This alert might be raised if OSDs are under excessive load and unable to keep |
1402 | up with their background work, or if the OSDs' internal metadata database is | |
1403 | heavily fragmented and unable to perform. The alert might also indicate some | |
1404 | other performance issue with the OSDs. | |
9f95a23c | 1405 | |
1e59de90 TL |
1406 | The exact size of the snapshot trim queue is reported by the ``snaptrimq_len`` |
1407 | field of ``ceph pg ls -f json-detail``. | |
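
For example, assuming that ``jq`` is installed, a command along the following lines
can be used to list the PGs that currently have a non-empty snapshot trim queue.
Note that the exact JSON layout of ``ceph pg ls`` output varies between releases;
on recent releases the per-PG entries appear under ``pg_stats``:

.. prompt:: bash $

   ceph pg ls -f json-detail | jq '.pg_stats[] | select(.snaptrimq_len > 0) | {pgid, snaptrimq_len}'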
9f95a23c | 1408 | |
aee94f69 TL |
1409 | Stretch Mode |
1410 | ------------ | |
1411 | ||
1412 | INCORRECT_NUM_BUCKETS_STRETCH_MODE | |
1413 | __________________________________ | |
1414 | ||
Stretch mode currently supports only two dividing buckets that contain OSDs. This
warning indicates that the number of dividing buckets is not equal to two after
stretch mode has been enabled. Unpredictable failures and MON assertions can be
expected until the condition is fixed.

We encourage you to fix this either by removing the extra dividing buckets or by
bringing the number of dividing buckets up to two.
1421 | ||
1422 | UNEVEN_WEIGHTS_STRETCH_MODE | |
1423 | ___________________________ | |
1424 | ||
The two dividing buckets must have equal weights when stretch mode is enabled.
This warning indicates that the two dividing buckets have uneven weights after
stretch mode has been enabled. This is not immediately fatal; however, Ceph can
be expected to behave unpredictably when processing transitions between the
dividing buckets.

We encourage you to fix this by making the weights even on both dividing buckets.
This can be done by making sure that the combined weight of the OSDs in each
dividing bucket is the same.
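
The CRUSH weights of the dividing buckets (for example, the two data centers) and
of the OSDs beneath them can be reviewed by running the following command:

.. prompt:: bash $

   ceph osd tree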
1433 | ||
eafe8130 TL |
1434 | Miscellaneous |
1435 | ------------- | |
1436 | ||
1437 | RECENT_CRASH | |
1438 | ____________ | |
1439 | ||
1e59de90 TL |
1440 | One or more Ceph daemons have crashed recently, and the crash(es) have not yet |
1441 | been acknowledged and archived by the administrator. This alert might indicate | |
1442 | a software bug, a hardware problem (for example, a failing disk), or some other | |
1443 | problem. | |
eafe8130 | 1444 | |
1e59de90 | 1445 | To list recent crashes, run the following command: |
39ae355f TL |
1446 | |
1447 | .. prompt:: bash $ | |
1448 | ||
1449 | ceph crash ls-new | |
eafe8130 | 1450 | |
1e59de90 | 1451 | To examine information about a specific crash, run the following command: |
eafe8130 | 1452 | |
39ae355f | 1453 | .. prompt:: bash $ |
eafe8130 | 1454 | |
39ae355f | 1455 | ceph crash info <crash-id> |
eafe8130 | 1456 | |
1e59de90 TL |
1457 | To silence this alert, you can archive the crash (perhaps after the crash |
1458 | has been examined by an administrator) by running the following command: | |
eafe8130 | 1459 | |
39ae355f | 1460 | .. prompt:: bash $ |
eafe8130 | 1461 | |
39ae355f | 1462 | ceph crash archive <crash-id> |
eafe8130 | 1463 | |
1e59de90 | 1464 | Similarly, to archive all recent crashes, run the following command: |
39ae355f TL |
1465 | |
1466 | .. prompt:: bash $ | |
1467 | ||
1468 | ceph crash archive-all | |
eafe8130 | 1469 | |
1e59de90 TL |
1470 | Archived crashes will still be visible by running the command ``ceph crash |
1471 | ls``, but not by running the command ``ceph crash ls-new``. | |
20effc67 | 1472 | |
1e59de90 | 1473 | The time period that is considered recent is determined by the option |
20effc67 TL |
1474 | ``mgr/crash/warn_recent_interval`` (default: two weeks). |
1475 | ||
1e59de90 | 1476 | To entirely disable this alert, run the following command: |
39ae355f TL |
1477 | |
1478 | .. prompt:: bash $ | |
20effc67 | 1479 | |
   ceph config set mgr mgr/crash/warn_recent_interval 0
20effc67 TL |
1481 | |
1482 | RECENT_MGR_MODULE_CRASH | |
1483 | _______________________ | |
1484 | ||
1e59de90 TL |
1485 | One or more ``ceph-mgr`` modules have crashed recently, and the crash(es) have |
1486 | not yet been acknowledged and archived by the administrator. This alert | |
1487 | usually indicates a software bug in one of the software modules that are | |
1488 | running inside the ``ceph-mgr`` daemon. The module that experienced the problem | |
1489 | might be disabled as a result, but other modules are unaffected and continue to | |
1490 | function as expected. | |
20effc67 | 1491 | |
1e59de90 TL |
1492 | As with the *RECENT_CRASH* health check, a specific crash can be inspected by |
1493 | running the following command: | |
20effc67 | 1494 | |
39ae355f TL |
1495 | .. prompt:: bash $ |
1496 | ||
1497 | ceph crash info <crash-id> | |
20effc67 | 1498 | |
1e59de90 TL |
1499 | To silence this alert, you can archive the crash (perhaps after the crash has |
1500 | been examined by an administrator) by running the following command: | |
39ae355f TL |
1501 | |
1502 | .. prompt:: bash $ | |
1503 | ||
1504 | ceph crash archive <crash-id> | |
20effc67 | 1505 | |
1e59de90 | 1506 | Similarly, to archive all recent crashes, run the following command: |
20effc67 | 1507 | |
39ae355f | 1508 | .. prompt:: bash $ |
20effc67 | 1509 | |
39ae355f | 1510 | ceph crash archive-all |
20effc67 | 1511 | |
1e59de90 TL |
1512 | Archived crashes will still be visible by running the command ``ceph crash ls`` |
1513 | but not by running the command ``ceph crash ls-new``. | |
eafe8130 | 1514 | |
1e59de90 | 1515 | The time period that is considered recent is determined by the option |
eafe8130 TL |
1516 | ``mgr/crash/warn_recent_interval`` (default: two weeks). |
1517 | ||
1e59de90 | 1518 | To entirely disable this alert, run the following command: |
eafe8130 | 1519 | |
39ae355f TL |
1520 | .. prompt:: bash $ |
1521 | ||
   ceph config set mgr mgr/crash/warn_recent_interval 0
eafe8130 TL |
1523 | |
1524 | TELEMETRY_CHANGED | |
1525 | _________________ | |
1526 | ||
1e59de90 TL |
1527 | Telemetry has been enabled, but because the contents of the telemetry report |
1528 | have changed in the meantime, telemetry reports will not be sent. | |
eafe8130 | 1529 | |
1e59de90 TL |
1530 | Ceph developers occasionally revise the telemetry feature to include new and |
1531 | useful information, or to remove information found to be useless or sensitive. | |
1532 | If any new information is included in the report, Ceph requires the | |
1533 | administrator to re-enable telemetry. This requirement ensures that the | |
1534 | administrator has an opportunity to (re)review the information that will be | |
eafe8130 TL |
1535 | shared. |
1536 | ||
1e59de90 | 1537 | To review the contents of the telemetry report, run the following command: |
39ae355f TL |
1538 | |
1539 | .. prompt:: bash $ | |
eafe8130 | 1540 | |
39ae355f | 1541 | ceph telemetry show |
eafe8130 | 1542 | |
1e59de90 TL |
1543 | Note that the telemetry report consists of several channels that may be |
1544 | independently enabled or disabled. For more information, see :ref:`telemetry`. | |
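
On recent releases, the report that would be sent after re-enabling telemetry,
including any newly added collections, can be previewed by running the following
command (the availability of this subcommand varies by release):

.. prompt:: bash $

   ceph telemetry preview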
eafe8130 | 1545 | |
1e59de90 | 1546 | To re-enable telemetry (and silence the alert), run the following command: |
39ae355f TL |
1547 | |
1548 | .. prompt:: bash $ | |
eafe8130 | 1549 | |
39ae355f | 1550 | ceph telemetry on |
eafe8130 | 1551 | |
1e59de90 | 1552 | To disable telemetry (and silence the alert), run the following command: |
eafe8130 | 1553 | |
39ae355f TL |
1554 | .. prompt:: bash $ |
1555 | ||
1556 | ceph telemetry off | |
9f95a23c TL |
1557 | |
1558 | AUTH_BAD_CAPS | |
1559 | _____________ | |
1560 | ||
1e59de90 TL |
1561 | One or more auth users have capabilities that cannot be parsed by the monitors. |
As a general rule, this alert indicates that there are one or more daemon types
with which the user is not authorized to perform any action.
9f95a23c | 1564 | |
1e59de90 TL |
1565 | This alert is most likely to be raised after an upgrade if (1) the capabilities |
1566 | were set with an older version of Ceph that did not properly validate the | |
1567 | syntax of those capabilities, or if (2) the syntax of the capabilities has | |
1568 | changed. | |
9f95a23c | 1569 | |
1e59de90 | 1570 | To remove the user(s) in question, run the following command: |
39ae355f TL |
1571 | |
1572 | .. prompt:: bash $ | |
9f95a23c | 1573 | |
39ae355f | 1574 | ceph auth rm <entity-name> |
9f95a23c | 1575 | |
1e59de90 TL |
1576 | (This resolves the health check, but it prevents clients from being able to |
1577 | authenticate as the removed user.) | |
9f95a23c | 1578 | |
1e59de90 TL |
1579 | Alternatively, to update the capabilities for the user(s), run the following |
1580 | command: | |
39ae355f TL |
1581 | |
1582 | .. prompt:: bash $ | |
9f95a23c | 1583 | |
   ceph auth caps <entity-name> <daemon-type> <caps> [<daemon-type> <caps> ...]
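
For example, the following command (which uses an illustrative entity name and
illustrative capabilities) grants ``client.foo`` read-only access to the monitors
and read-write access to OSDs, restricted to the pool ``bar``:

.. prompt:: bash $

   ceph auth caps client.foo mon 'allow r' osd 'allow rw pool=bar'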
9f95a23c TL |
1585 | |
1586 | For more information about auth capabilities, see :ref:`user-management`. | |
1587 | ||
9f95a23c TL |
1588 | OSD_NO_DOWN_OUT_INTERVAL |
1589 | ________________________ | |
1590 | ||
1e59de90 TL |
1591 | The ``mon_osd_down_out_interval`` option is set to zero, which means that the |
1592 | system does not automatically perform any repair or healing operations when an | |
OSD fails. Instead, an administrator or an external orchestrator must manually
1594 | mark "down" OSDs as ``out`` (by running ``ceph osd out <osd-id>``) in order to | |
1595 | trigger recovery. | |
9f95a23c | 1596 | |
1e59de90 TL |
1597 | This option is normally set to five or ten minutes, which should be enough time |
1598 | for a host to power-cycle or reboot. | |
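
To restore automatic healing, set the interval back to a non-zero value. For
example, the following command sets it to an illustrative value of ten minutes
(600 seconds):

.. prompt:: bash $

   ceph config set global mon_osd_down_out_interval 600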
9f95a23c | 1599 | |
1e59de90 TL |
1600 | To silence this alert, set ``mon_warn_on_osd_down_out_interval_zero`` to |
1601 | ``false`` by running the following command: | |
9f95a23c | 1602 | |
39ae355f TL |
1603 | .. prompt:: bash $ |
1604 | ||
   ceph config set global mon_warn_on_osd_down_out_interval_zero false
adb31ebb TL |
1606 | |
1607 | DASHBOARD_DEBUG | |
1608 | _______________ | |
1609 | ||
1e59de90 TL |
1610 | The Dashboard debug mode is enabled. This means that if there is an error while |
1611 | processing a REST API request, the HTTP error response will contain a Python | |
1612 | traceback. This mode should be disabled in production environments because such | |
1613 | a traceback might contain and expose sensitive information. | |
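
On releases where the dashboard module provides the subcommand, the current state
of the debug mode can be checked by running the following command:

.. prompt:: bash $

   ceph dashboard debug status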
adb31ebb | 1614 | |
1e59de90 | 1615 | To disable the debug mode, run the following command: |
39ae355f TL |
1616 | |
1617 | .. prompt:: bash $ | |
adb31ebb | 1618 | |
39ae355f | 1619 | ceph dashboard debug disable |