1 .. _health-checks:
2
3 ===============
4 Health checks
5 ===============
6
7 Overview
8 ========
9
10 There is a finite set of health messages that a Ceph cluster can raise. These
11 messages are known as *health checks*. Each health check has a unique
12 identifier.
13
14 The identifier is a terse human-readable string -- that is, the identifier is
15 readable in much the same way as a typical variable name. It is intended to
16 enable tools (for example, UIs) to make sense of health checks and present them
17 in a way that reflects their meaning.
18
19 This page lists the health checks that are raised by the monitor and manager
20 daemons. In addition to these, you might see health checks that originate
21 from MDS daemons (see :ref:`cephfs-health-messages`), and health checks
22 that are defined by ``ceph-mgr`` python modules.
23
24 Definitions
25 ===========
26
27 Monitor
28 -------
29
30 DAEMON_OLD_VERSION
31 __________________
32
33 Warn if one or more old versions of Ceph are running on any daemons. A health
34 check is raised if multiple versions are detected. This condition must exist
35 for a period of time greater than ``mon_warn_older_version_delay`` (set to one
36 week by default) in order for the health check to be raised. This allows most
37 upgrades to proceed without the occurrence of a false warning. If the upgrade
38 is paused for an extended time period, ``health mute`` can be used by running
39 ``ceph health mute DAEMON_OLD_VERSION --sticky``. Be sure, however, to run
40 ``ceph health unmute DAEMON_OLD_VERSION`` after the upgrade has finished.
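
To see which Ceph versions the cluster's daemons are currently running (and
therefore which daemons still need to be upgraded), run the following command:

.. prompt:: bash $

   ceph versions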
41
42 MON_DOWN
43 ________
44
45 One or more monitor daemons are currently down. The cluster requires a majority
46 (more than one-half) of the monitors to be available. When one or more monitors
47 are down, clients might have a harder time forming their initial connection to
48 the cluster, as they might need to try more addresses before they reach an
49 operating monitor.
50
51 The down monitor daemon should be restarted as soon as possible to reduce the
52 risk of a subsequent monitor failure leading to a service outage.
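
To identify which monitor(s) are down, run ``ceph health detail``. If the
cluster was deployed with systemd unit files, the monitor daemon on a given
host (``mon-host-1`` here is only an illustration) can usually be restarted by
running a command of the following form on that host:

.. prompt:: bash $

   systemctl restart ceph-mon@mon-host-1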
53
54 MON_CLOCK_SKEW
55 ______________
56
57 The clocks on the hosts running the ceph-mon monitor daemons are not
58 well-synchronized. This health check is raised if the cluster detects a clock
59 skew greater than ``mon_clock_drift_allowed``.
60
61 This issue is best resolved by synchronizing the clocks by using a tool like
62 ``ntpd`` or ``chrony``.
63
64 If it is impractical to keep the clocks closely synchronized, the
65 ``mon_clock_drift_allowed`` threshold can also be increased. However, this
66 value must stay significantly below the ``mon_lease`` interval in order for the
67 monitor cluster to function properly.
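
To see the clock skew that the monitors currently report for one another, you
can (for example) run the following command and inspect its output:

.. prompt:: bash $

   ceph time-sync-status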
68
69 MON_MSGR2_NOT_ENABLED
70 _____________________
71
72 The :confval:`ms_bind_msgr2` option is enabled but one or more monitors are
73 not configured to bind to a v2 port in the cluster's monmap. This
74 means that features specific to the msgr2 protocol (for example, encryption)
75 are unavailable on some or all connections.
76
77 In most cases this can be corrected by running the following command:
78
79 .. prompt:: bash $
80
81 ceph mon enable-msgr2
82
83 After this command is run, any monitor configured to listen on the old default
84 port (6789) will continue to listen for v1 connections on 6789 and begin to
85 listen for v2 connections on the new default port 3300.
86
87 If a monitor is configured to listen for v1 connections on a non-standard port
88 (that is, a port other than 6789), then the monmap will need to be modified
89 manually.
90
91
92 MON_DISK_LOW
93 ____________
94
95 One or more monitors are low on disk space. This health check is raised if the
96 percentage of available space on the file system used by the monitor database
97 (normally ``/var/lib/ceph/mon``) drops below the percentage value
98 ``mon_data_avail_warn`` (default: 30%).
99
100 This alert might indicate that some other process or user on the system is
101 filling up the file system used by the monitor. It might also
102 indicate that the monitor database is too large (see ``MON_DISK_BIG``
103 below).
104
105 If space cannot be freed, the monitor's data directory might need to be
106 moved to another storage device or file system (this relocation process must be carried out while the monitor
107 daemon is not running).
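
To check how much space remains on the file system that holds the monitor
database, and how much of it the database itself consumes, commands like the
following can be run on the monitor host (the path shown is the default
location and might differ in your deployment):

.. prompt:: bash $

   df -h /var/lib/ceph/mon
   du -sch /var/lib/ceph/mon/*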
108
109
110 MON_DISK_CRIT
111 _____________
112
113 One or more monitors are critically low on disk space. This health check is raised if the
114 percentage of available space on the file system used by the monitor database
115 (normally ``/var/lib/ceph/mon``) drops below the percentage value
116 ``mon_data_avail_crit`` (default: 5%). See ``MON_DISK_LOW``, above.
117
118 MON_DISK_BIG
119 ____________
120
121 The database size for one or more monitors is very large. This health check is
122 raised if the size of the monitor database is larger than
123 ``mon_data_size_warn`` (default: 15 GiB).
124
125 A large database is unusual, but does not necessarily indicate a problem.
126 Monitor databases might grow in size when there are placement groups that have
127 not reached an ``active+clean`` state in a long time.
128
129 This alert might also indicate that the monitor's database is not properly
130 compacting, an issue that has been observed with some older versions of leveldb
131 and rocksdb. Forcing a compaction with ``ceph daemon mon.<id> compact`` might
132 shrink the database's on-disk size.
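
For example, to compact the database of monitor ``mon.a`` through its admin
socket (the monitor ID is only an illustration), run the following command on
the host where that monitor runs:

.. prompt:: bash $

   ceph daemon mon.a compact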
133
134 This alert might also indicate that the monitor has a bug that prevents it from
135 pruning the cluster metadata that it stores. If the problem persists, please
136 report a bug.
137
138 To adjust the warning threshold, run the following command:
139
140 .. prompt:: bash $
141
142 ceph config set global mon_data_size_warn <size>
143
144
145 AUTH_INSECURE_GLOBAL_ID_RECLAIM
146 _______________________________
147
148 One or more clients or daemons that are connected to the cluster are not
149 securely reclaiming their ``global_id`` (a unique number that identifies each
150 entity in the cluster) when reconnecting to a monitor. The client is being
151 permitted to connect anyway because the
152 ``auth_allow_insecure_global_id_reclaim`` option is set to ``true`` (which may
153 be necessary until all Ceph clients have been upgraded) and because the
154 ``auth_expose_insecure_global_id_reclaim`` option is set to ``true`` (which
155 allows monitors to detect clients with "insecure reclaim" sooner by forcing
156 those clients to reconnect immediately after their initial authentication).
157
158 To identify which client(s) are using unpatched Ceph client code, run the
159 following command:
160
161 .. prompt:: bash $
162
163 ceph health detail
164
165 If you collect a dump of the clients that are connected to an individual
166 monitor and examine the ``global_id_status`` field in the output of the dump,
167 you can see the ``global_id`` reclaim behavior of those clients. Here
168 ``reclaim_insecure`` means that a client is unpatched and is contributing to
169 this health check. To effect a client dump, run the following command:
170
171 .. prompt:: bash $
172
173 ceph tell mon.\* sessions
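
For example, to show only the sessions that report insecure ``global_id``
reclaim (the exact output format can vary between releases, so treat this as a
sketch), run the following command:

.. prompt:: bash $

   ceph tell mon.\* sessions | grep reclaim_insecure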
174
175 We strongly recommend that all clients in the system be upgraded to a newer
176 version of Ceph that correctly reclaims ``global_id`` values. After all clients
177 have been updated, run the following command to stop allowing insecure
178 reconnections:
179
180 .. prompt:: bash $
181
182 ceph config set mon auth_allow_insecure_global_id_reclaim false
183
184 If it is impractical to upgrade all clients immediately, you can temporarily
185 silence this alert by running the following command:
186
187 .. prompt:: bash $
188
189 ceph health mute AUTH_INSECURE_GLOBAL_ID_RECLAIM 1w # 1 week
190
191 Although we do NOT recommend doing so, you can also disable this alert
192 indefinitely by running the following command:
193
194 .. prompt:: bash $
195
196 ceph config set mon mon_warn_on_insecure_global_id_reclaim false
197
198 AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED
199 _______________________________________
200
201 Ceph is currently configured to allow clients that reconnect to monitors using
202 an insecure process to reclaim their previous ``global_id``. Such reclaiming is
203 allowed because, by default, ``auth_allow_insecure_global_id_reclaim`` is set
204 to ``true``. It might be necessary to leave this setting enabled while existing
205 Ceph clients are upgraded to newer versions of Ceph that correctly and securely
206 reclaim their ``global_id``.
207
208 If the ``AUTH_INSECURE_GLOBAL_ID_RECLAIM`` health check has not also been
209 raised and if the ``auth_expose_insecure_global_id_reclaim`` setting has not
210 been disabled (it is enabled by default), then there are currently no clients
211 connected that need to be upgraded. In that case, it is safe to disable
212 ``insecure global_id reclaim`` by running the following command:
213
214 .. prompt:: bash $
215
216 ceph config set mon auth_allow_insecure_global_id_reclaim false
217
218 On the other hand, if there are still clients that need to be upgraded, then
219 this alert can be temporarily silenced by running the following command:
220
221 .. prompt:: bash $
222
223 ceph health mute AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED 1w # 1 week
224
225 Although we do NOT recommend doing so, you can also disable this alert indefinitely
226 by running the following command:
227
228 .. prompt:: bash $
229
230 ceph config set mon mon_warn_on_insecure_global_id_reclaim_allowed false
231
232
233 Manager
234 -------
235
236 MGR_DOWN
237 ________
238
239 All manager daemons are currently down. The cluster should normally have at
240 least one running manager (``ceph-mgr``) daemon. If no manager daemon is
241 running, the cluster's ability to monitor itself will be compromised, and parts
242 of the management API will become unavailable (for example, the dashboard will
243 not work, and most CLI commands that report metrics or runtime state will
244 block). However, the cluster will still be able to perform all I/O operations
245 and to recover from failures.
246
247 The "down" manager daemon should be restarted as soon as possible to ensure
248 that the cluster can be monitored (for example, so that the ``ceph -s``
249 information is up to date, or so that metrics can be scraped by Prometheus).
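
If the cluster was deployed with systemd unit files, the manager daemon on a
given host (``mgr-host-1`` here is only an illustration) can usually be
restarted by running the following command on that host:

.. prompt:: bash $

   systemctl restart ceph-mgr@mgr-host-1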
250
251
252 MGR_MODULE_DEPENDENCY
253 _____________________
254
255 An enabled manager module is failing its dependency check. This health check
256 typically comes with an explanatory message from the module about the problem.
257
258 For example, a module might report that a required package is not installed: in
259 this case, you should install the required package and restart your manager
260 daemons.
261
262 This health check is applied only to enabled modules. If a module is not
263 enabled, you can see whether it is reporting dependency issues in the output of
``ceph mgr module ls``.
265
266
267 MGR_MODULE_ERROR
268 ________________
269
270 A manager module has experienced an unexpected error. Typically, this means
271 that an unhandled exception was raised from the module's `serve` function. The
272 human-readable description of the error might be obscurely worded if the
273 exception did not provide a useful description of itself.
274
275 This health check might indicate a bug: please open a Ceph bug report if you
276 think you have encountered a bug.
277
278 However, if you believe the error is transient, you may restart your manager
279 daemon(s) or use ``ceph mgr fail`` on the active daemon in order to force
280 failover to another daemon.
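
For example, to fail the currently active manager daemon so that a standby
manager (if one is available) takes over, run the following command:

.. prompt:: bash $

   ceph mgr fail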
281
282 OSDs
283 ----
284
285 OSD_DOWN
286 ________
287
288 One or more OSDs are marked "down". The ceph-osd daemon might have been
289 stopped, or peer OSDs might be unable to reach the OSD over the network.
290 Common causes include a stopped or crashed daemon, a "down" host, or a network
291 outage.
292
293 Verify that the host is healthy, the daemon is started, and the network is
294 functioning. If the daemon has crashed, the daemon log file
295 (``/var/log/ceph/ceph-osd.*``) might contain debugging information.
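
To list the OSDs that are currently marked "down" together with their location
in the CRUSH hierarchy, and then (if the daemon has merely stopped) to restart
it, commands like the following can be used. The OSD ID is only an
illustration:

.. prompt:: bash $

   ceph osd tree down
   systemctl restart ceph-osd@123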
296
297 OSD_<crush type>_DOWN
298 _____________________
299
300 (for example, OSD_HOST_DOWN, OSD_ROOT_DOWN)
301
302 All of the OSDs within a particular CRUSH subtree are marked "down" (for
303 example, all OSDs on a host).
304
305 OSD_ORPHAN
306 __________
307
308 An OSD is referenced in the CRUSH map hierarchy, but does not exist.
309
310 To remove the OSD from the CRUSH map hierarchy, run the following command:
311
312 .. prompt:: bash $
313
314 ceph osd crush rm osd.<id>
315
316 OSD_OUT_OF_ORDER_FULL
317 _____________________
318
319 The utilization thresholds for `nearfull`, `backfillfull`, `full`, and/or
320 `failsafe_full` are not ascending. In particular, the following pattern is
321 expected: `nearfull < backfillfull`, `backfillfull < full`, and `full <
322 failsafe_full`.
323
324 To adjust these utilization thresholds, run the following commands:
325
326 .. prompt:: bash $
327
328 ceph osd set-nearfull-ratio <ratio>
329 ceph osd set-backfillfull-ratio <ratio>
330 ceph osd set-full-ratio <ratio>
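
To verify that the ratios are in ascending order after adjusting them, you can
(for example) inspect the current values by running the following command:

.. prompt:: bash $

   ceph osd dump | grep ratio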
331
332
333 OSD_FULL
334 ________
335
336 One or more OSDs have exceeded the `full` threshold and are preventing the
337 cluster from servicing writes.
338
339 To check utilization by pool, run the following command:
340
341 .. prompt:: bash $
342
343 ceph df
344
345 To see the currently defined `full` ratio, run the following command:
346
347 .. prompt:: bash $
348
349 ceph osd dump | grep full_ratio
350
351 A short-term workaround to restore write availability is to raise the full
352 threshold by a small amount. To do so, run the following command:
353
354 .. prompt:: bash $
355
356 ceph osd set-full-ratio <ratio>
357
358 Additional OSDs should be deployed in order to add new storage to the cluster,
359 or existing data should be deleted in order to free up space in the cluster.
360
361 OSD_BACKFILLFULL
362 ________________
363
364 One or more OSDs have exceeded the `backfillfull` threshold or *would* exceed
365 it if the currently-mapped backfills were to finish, which will prevent data
366 from rebalancing to this OSD. This alert is an early warning that
367 rebalancing might be unable to complete and that the cluster is approaching
368 full.
369
370 To check utilization by pool, run the following command:
371
372 .. prompt:: bash $
373
374 ceph df
375
376 OSD_NEARFULL
377 ____________
378
379 One or more OSDs have exceeded the `nearfull` threshold. This alert is an early
380 warning that the cluster is approaching full.
381
382 To check utilization by pool, run the following command:
383
384 .. prompt:: bash $
385
386 ceph df
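
Utilization for individual OSDs, including how close each OSD is to the
``nearfull`` and ``backfillfull`` thresholds, can also be checked by running
the following command:

.. prompt:: bash $

   ceph osd df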
387
388 OSDMAP_FLAGS
389 ____________
390
391 One or more cluster flags of interest have been set. These flags include:
392
393 * *full* - the cluster is flagged as full and cannot serve writes
394 * *pauserd*, *pausewr* - there are paused reads or writes
395 * *noup* - OSDs are not allowed to start
* *nodown* - OSD failure reports are being ignored, so the monitors will not
  mark OSDs ``down``
398 * *noin* - OSDs that were previously marked ``out`` are not being marked
399 back ``in`` when they start
400 * *noout* - "down" OSDs are not automatically being marked ``out`` after the
401 configured interval
402 * *nobackfill*, *norecover*, *norebalance* - recovery or data
403 rebalancing is suspended
404 * *noscrub*, *nodeep_scrub* - scrubbing is disabled
405 * *notieragent* - cache-tiering activity is suspended
406
407 With the exception of *full*, these flags can be set or cleared by running the
408 following commands:
409
410 .. prompt:: bash $
411
412 ceph osd set <flag>
413 ceph osd unset <flag>
414
415 OSD_FLAGS
416 _________
417
One or more OSDs, CRUSH nodes, or CRUSH device classes have a flag of interest
set. These flags include:
420
421 * *noup*: these OSDs are not allowed to start
422 * *nodown*: failure reports for these OSDs will be ignored
423 * *noin*: if these OSDs were previously marked ``out`` automatically
424 after a failure, they will not be marked ``in`` when they start
425 * *noout*: if these OSDs are "down" they will not automatically be marked
426 ``out`` after the configured interval
427
428 To set and clear these flags in batch, run the following commands:
429
430 .. prompt:: bash $
431
432 ceph osd set-group <flags> <who>
433 ceph osd unset-group <flags> <who>
434
435 For example:
436
437 .. prompt:: bash $
438
439 ceph osd set-group noup,noout osd.0 osd.1
440 ceph osd unset-group noup,noout osd.0 osd.1
441 ceph osd set-group noup,noout host-foo
442 ceph osd unset-group noup,noout host-foo
443 ceph osd set-group noup,noout class-hdd
444 ceph osd unset-group noup,noout class-hdd
445
446 OLD_CRUSH_TUNABLES
447 __________________
448
449 The CRUSH map is using very old settings and should be updated. The oldest set
450 of tunables that can be used (that is, the oldest client version that can
451 connect to the cluster) without raising this health check is determined by the
452 ``mon_crush_min_required_version`` config option. For more information, see
453 :ref:`crush-map-tunables`.
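
To inspect the CRUSH tunables that are currently in effect, you can (for
example) run the following command. Note that changing tunables can trigger
significant data movement, so review :ref:`crush-map-tunables` before making
any adjustment:

.. prompt:: bash $

   ceph osd crush show-tunables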
454
455 OLD_CRUSH_STRAW_CALC_VERSION
456 ____________________________
457
458 The CRUSH map is using an older, non-optimal method of calculating intermediate
459 weight values for ``straw`` buckets.
460
461 The CRUSH map should be updated to use the newer method (that is:
462 ``straw_calc_version=1``). For more information, see :ref:`crush-map-tunables`.
463
464 CACHE_POOL_NO_HIT_SET
465 _____________________
466
467 One or more cache pools are not configured with a *hit set* to track
468 utilization. This issue prevents the tiering agent from identifying cold
469 objects that are to be flushed and evicted from the cache.
470
471 To configure hit sets on the cache pool, run the following commands:
472
473 .. prompt:: bash $
474
475 ceph osd pool set <poolname> hit_set_type <type>
476 ceph osd pool set <poolname> hit_set_period <period-in-seconds>
477 ceph osd pool set <poolname> hit_set_count <number-of-hitsets>
478 ceph osd pool set <poolname> hit_set_fpp <target-false-positive-rate>
479
480 OSD_NO_SORTBITWISE
481 __________________
482
483 No pre-Luminous v12.y.z OSDs are running, but the ``sortbitwise`` flag has not
484 been set.
485
486 The ``sortbitwise`` flag must be set in order for OSDs running Luminous v12.y.z
487 or newer to start. To safely set the flag, run the following command:
488
489 .. prompt:: bash $
490
491 ceph osd set sortbitwise
492
493 OSD_FILESTORE
494 __________________
495
496 Warn if OSDs are running Filestore. The Filestore OSD back end has been
497 deprecated; the BlueStore back end has been the default object store since the
498 Ceph Luminous release.
499
500 The 'mclock_scheduler' is not supported for Filestore OSDs. For this reason,
501 the default 'osd_op_queue' is set to 'wpq' for Filestore OSDs and is enforced
even if the user attempts to change it.

To list the OSDs that are currently running Filestore, run the following
command:
505
506 .. prompt:: bash $
507
508 ceph report | jq -c '."osd_metadata" | .[] | select(.osd_objectstore | contains("filestore")) | {id, osd_objectstore}'
509
510 **In order to upgrade to Reef or a later release, you must first migrate any
511 Filestore OSDs to BlueStore.**
512
513 If you are upgrading a pre-Reef release to Reef or later, but it is not
514 feasible to migrate Filestore OSDs to BlueStore immediately, you can
515 temporarily silence this alert by running the following command:
516
517 .. prompt:: bash $
518
519 ceph health mute OSD_FILESTORE
520
521 Since this migration can take a considerable amount of time to complete, we
522 recommend that you begin the process well in advance of any update to Reef or
523 to later releases.
524
525 POOL_FULL
526 _________
527
528 One or more pools have reached their quota and are no longer allowing writes.
529
530 To see pool quotas and utilization, run the following command:
531
532 .. prompt:: bash $
533
534 ceph df detail
535
536 If you opt to raise the pool quota, run the following commands:
537
538 .. prompt:: bash $
539
540 ceph osd pool set-quota <poolname> max_objects <num-objects>
541 ceph osd pool set-quota <poolname> max_bytes <num-bytes>
542
543 If not, delete some existing data to reduce utilization.
544
545 BLUEFS_SPILLOVER
546 ________________
547
548 One or more OSDs that use the BlueStore back end have been allocated `db`
549 partitions (that is, storage space for metadata, normally on a faster device),
550 but because that space has been filled, metadata has "spilled over" onto the
551 slow device. This is not necessarily an error condition or even unexpected
552 behavior, but may result in degraded performance. If the administrator had
553 expected that all metadata would fit on the faster device, this alert indicates
554 that not enough space was provided.
555
556 To disable this alert on all OSDs, run the following command:
557
558 .. prompt:: bash $
559
560 ceph config set osd bluestore_warn_on_bluefs_spillover false
561
562 Alternatively, to disable the alert on a specific OSD, run the following
563 command:
564
565 .. prompt:: bash $
566
567 ceph config set osd.123 bluestore_warn_on_bluefs_spillover false
568
569 To secure more metadata space, you can destroy and reprovision the OSD in
570 question. This process involves data migration and recovery.
571
572 It might also be possible to expand the LVM logical volume that backs the `db`
573 storage. If the underlying LV has been expanded, you must stop the OSD daemon
574 and inform BlueFS of the device-size change by running the following command:
575
576 .. prompt:: bash $
577
578 ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-$ID
579
580 BLUEFS_AVAILABLE_SPACE
581 ______________________
582
583 To see how much space is free for BlueFS, run the following command:
584
585 .. prompt:: bash $
586
587 ceph daemon osd.123 bluestore bluefs available
588
589 This will output up to three values: ``BDEV_DB free``, ``BDEV_SLOW free``, and
590 ``available_from_bluestore``. ``BDEV_DB`` and ``BDEV_SLOW`` report the amount
591 of space that has been acquired by BlueFS and is now considered free. The value
592 ``available_from_bluestore`` indicates the ability of BlueStore to relinquish
593 more space to BlueFS. It is normal for this value to differ from the amount of
594 BlueStore free space, because the BlueFS allocation unit is typically larger
595 than the BlueStore allocation unit. This means that only part of the BlueStore
596 free space will be available for BlueFS.
597
598 BLUEFS_LOW_SPACE
599 _________________
600
601 If BlueFS is running low on available free space and there is not much free
602 space available from BlueStore (in other words, `available_from_bluestore` has
603 a low value), consider reducing the BlueFS allocation unit size. To simulate
604 available space when the allocation unit is different, run the following
605 command:
606
607 .. prompt:: bash $
608
609 ceph daemon osd.123 bluestore bluefs available <alloc-unit-size>
610
611 BLUESTORE_FRAGMENTATION
612 _______________________
613
614 As BlueStore operates, the free space on the underlying storage will become
615 fragmented. This is normal and unavoidable, but excessive fragmentation causes
616 slowdown. To inspect BlueStore fragmentation, run the following command:
617
618 .. prompt:: bash $
619
620 ceph daemon osd.123 bluestore allocator score block
621
The fragmentation score is given in a [0-1] range:

* [0.0 .. 0.4] tiny fragmentation
* [0.4 .. 0.7] small, acceptable fragmentation
* [0.7 .. 0.9] considerable, but safe fragmentation
* [0.9 .. 1.0] severe fragmentation, might impact BlueFS's ability to get space from BlueStore
627
628 To see a detailed report of free fragments, run the following command:
629
630 .. prompt:: bash $
631
632 ceph daemon osd.123 bluestore allocator dump block
633
634 For OSD processes that are not currently running, fragmentation can be
635 inspected with `ceph-bluestore-tool`. To see the fragmentation score, run the
636 following command:
637
638 .. prompt:: bash $
639
640 ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-123 --allocator block free-score
641
642 To dump detailed free chunks, run the following command:
643
644 .. prompt:: bash $
645
646 ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-123 --allocator block free-dump
647
648 BLUESTORE_LEGACY_STATFS
649 _______________________
650
651 One or more OSDs have BlueStore volumes that were created prior to the
652 Nautilus release. (In Nautilus, BlueStore tracks its internal usage
653 statistics on a granular, per-pool basis.)
654
655 If *all* OSDs
656 are older than Nautilus, this means that the per-pool metrics are
657 simply unavailable. But if there is a mixture of pre-Nautilus and
658 post-Nautilus OSDs, the cluster usage statistics reported by ``ceph
659 df`` will be inaccurate.
660
661 The old OSDs can be updated to use the new usage-tracking scheme by stopping
662 each OSD, running a repair operation, and then restarting the OSD. For example,
663 to update ``osd.123``, run the following commands:
664
665 .. prompt:: bash $
666
667 systemctl stop ceph-osd@123
668 ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-123
669 systemctl start ceph-osd@123
670
671 To disable this alert, run the following command:
672
673 .. prompt:: bash $
674
675 ceph config set global bluestore_warn_on_legacy_statfs false
676
677 BLUESTORE_NO_PER_POOL_OMAP
678 __________________________
679
680 One or more OSDs have volumes that were created prior to the Octopus release.
681 (In Octopus and later releases, BlueStore tracks omap space utilization by
682 pool.)
683
684 If there are any BlueStore OSDs that do not have the new tracking enabled, the
685 cluster will report an approximate value for per-pool omap usage based on the
686 most recent deep scrub.
687
688 The OSDs can be updated to track by pool by stopping each OSD, running a repair
689 operation, and then restarting the OSD. For example, to update ``osd.123``, run
690 the following commands:
691
692 .. prompt:: bash $
693
694 systemctl stop ceph-osd@123
695 ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-123
696 systemctl start ceph-osd@123
697
698 To disable this alert, run the following command:
699
700 .. prompt:: bash $
701
702 ceph config set global bluestore_warn_on_no_per_pool_omap false
703
704 BLUESTORE_NO_PER_PG_OMAP
705 __________________________
706
707 One or more OSDs have volumes that were created prior to Pacific. (In Pacific
and later releases, BlueStore tracks omap space utilization by Placement Group
709 (PG).)
710
711 Per-PG omap allows faster PG removal when PGs migrate.
712
713 The older OSDs can be updated to track by PG by stopping each OSD, running a
714 repair operation, and then restarting the OSD. For example, to update
715 ``osd.123``, run the following commands:
716
717 .. prompt:: bash $
718
719 systemctl stop ceph-osd@123
720 ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-123
721 systemctl start ceph-osd@123
722
723 To disable this alert, run the following command:
724
725 .. prompt:: bash $
726
727 ceph config set global bluestore_warn_on_no_per_pg_omap false
728
729
730 BLUESTORE_DISK_SIZE_MISMATCH
731 ____________________________
732
733 One or more BlueStore OSDs have an internal inconsistency between the size of
734 the physical device and the metadata that tracks its size. This inconsistency
735 can lead to the OSD(s) crashing in the future.
736
737 The OSDs that have this inconsistency should be destroyed and reprovisioned. Be
738 very careful to execute this procedure on only one OSD at a time, so as to
739 minimize the risk of losing any data. To execute this procedure, where ``$N``
740 is the OSD that has the inconsistency, run the following commands:
741
742 .. prompt:: bash $
743
744 ceph osd out osd.$N
745 while ! ceph osd safe-to-destroy osd.$N ; do sleep 1m ; done
746 ceph osd destroy osd.$N
747 ceph-volume lvm zap /path/to/device
748 ceph-volume lvm create --osd-id $N --data /path/to/device
749
750 .. note::
751
Wait for this recovery procedure to complete on one OSD before running it
753 on the next.
754
755 BLUESTORE_NO_COMPRESSION
756 ________________________
757
One or more OSDs are unable to load a BlueStore compression plugin. This issue
759 might be caused by a broken installation, in which the ``ceph-osd`` binary does
760 not match the compression plugins. Or it might be caused by a recent upgrade in
761 which the ``ceph-osd`` daemon was not restarted.
762
763 To resolve this issue, verify that all of the packages on the host that is
764 running the affected OSD(s) are correctly installed and that the OSD daemon(s)
765 have been restarted. If the problem persists, check the OSD log for information
766 about the source of the problem.
767
768 BLUESTORE_SPURIOUS_READ_ERRORS
769 ______________________________
770
One or more BlueStore OSDs have detected spurious read errors on the main device.
772 BlueStore has recovered from these errors by retrying disk reads. This alert
773 might indicate issues with underlying hardware, issues with the I/O subsystem,
774 or something similar. In theory, such issues can cause permanent data
775 corruption. Some observations on the root cause of spurious read errors can be
776 found here: https://tracker.ceph.com/issues/22464
777
778 This alert does not require an immediate response, but the affected host might
779 need additional attention: for example, upgrading the host to the latest
780 OS/kernel versions and implementing hardware-resource-utilization monitoring.
781
782 To disable this alert on all OSDs, run the following command:
783
784 .. prompt:: bash $
785
786 ceph config set osd bluestore_warn_on_spurious_read_errors false
787
788 Or, to disable this alert on a specific OSD, run the following command:
789
790 .. prompt:: bash $
791
792 ceph config set osd.123 bluestore_warn_on_spurious_read_errors false
793
794 Device health
795 -------------
796
797 DEVICE_HEALTH
798 _____________
799
800 One or more OSD devices are expected to fail soon, where the warning threshold
801 is determined by the ``mgr/devicehealth/warn_threshold`` config option.
802
803 Because this alert applies only to OSDs that are currently marked ``in``, the
804 appropriate response to this expected failure is (1) to mark the OSD ``out`` so
805 that data is migrated off of the OSD, and then (2) to remove the hardware from
806 the system. Note that this marking ``out`` is normally done automatically if
807 ``mgr/devicehealth/self_heal`` is enabled (as determined by
808 ``mgr/devicehealth/mark_out_threshold``).
809
810 To check device health, run the following command:
811
812 .. prompt:: bash $
813
814 ceph device info <device-id>
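
To list all devices known to the cluster, together with the daemons that use
them and any recorded life expectancy, run the following command:

.. prompt:: bash $

   ceph device ls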
815
816 Device life expectancy is set either by a prediction model that the mgr runs or
817 by an external tool that is activated by running the following command:
818
819 .. prompt:: bash $
820
821 ceph device set-life-expectancy <device-id> <from> <to>
822
823 You can change the stored life expectancy manually, but such a change usually
824 doesn't accomplish anything. The reason for this is that whichever tool
825 originally set the stored life expectancy will probably undo your change by
826 setting it again, and a change to the stored value does not affect the actual
827 health of the hardware device.
828
829 DEVICE_HEALTH_IN_USE
830 ____________________
831
832 One or more devices (that is, OSDs) are expected to fail soon and have been
833 marked ``out`` of the cluster (as controlled by
834 ``mgr/devicehealth/mark_out_threshold``), but they are still participating in
835 one or more Placement Groups. This might be because the OSD(s) were marked
836 ``out`` only recently and data is still migrating, or because data cannot be
837 migrated off of the OSD(s) for some reason (for example, the cluster is nearly
838 full, or the CRUSH hierarchy is structured so that there isn't another suitable
839 OSD to migrate the data to).
840
841 This message can be silenced by disabling self-heal behavior (that is, setting
842 ``mgr/devicehealth/self_heal`` to ``false``), by adjusting
843 ``mgr/devicehealth/mark_out_threshold``, or by addressing whichever condition
844 is preventing data from being migrated off of the ailing OSD(s).
845
846 .. _rados_health_checks_device_health_toomany:
847
848 DEVICE_HEALTH_TOOMANY
849 _____________________
850
851 Too many devices (that is, OSDs) are expected to fail soon, and because
852 ``mgr/devicehealth/self_heal`` behavior is enabled, marking ``out`` all of the
853 ailing OSDs would exceed the cluster's ``mon_osd_min_in_ratio`` ratio. This
854 ratio prevents a cascade of too many OSDs from being automatically marked
855 ``out``.
856
857 You should promptly add new OSDs to the cluster to prevent data loss, or
858 incrementally replace the failing OSDs.
859
860 Alternatively, you can silence this health check by adjusting options including
861 ``mon_osd_min_in_ratio`` or ``mgr/devicehealth/mark_out_threshold``. Be
862 warned, however, that this will increase the likelihood of unrecoverable data
863 loss.
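
For example, to change ``mgr/devicehealth/mark_out_threshold`` (the value,
expressed in seconds, is only an illustration), run the following command:

.. prompt:: bash $

   ceph config set mgr mgr/devicehealth/mark_out_threshold 2419200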
864
865
866 Data health (pools & placement groups)
867 --------------------------------------
868
869 PG_AVAILABILITY
870 _______________
871
872 Data availability is reduced. In other words, the cluster is unable to service
873 potential read or write requests for at least some data in the cluster. More
874 precisely, one or more Placement Groups (PGs) are in a state that does not
875 allow I/O requests to be serviced. Any of the following PG states are
876 problematic if they do not clear quickly: *peering*, *stale*, *incomplete*, and
877 the lack of *active*.
878
879 For detailed information about which PGs are affected, run the following
880 command:
881
882 .. prompt:: bash $
883
884 ceph health detail
885
886 In most cases, the root cause of this issue is that one or more OSDs are
887 currently ``down``: see ``OSD_DOWN`` above.
888
889 To see the state of a specific problematic PG, run the following command:
890
891 .. prompt:: bash $
892
893 ceph tell <pgid> query
894
895 PG_DEGRADED
896 ___________
897
898 Data redundancy is reduced for some data: in other words, the cluster does not
899 have the desired number of replicas for all data (in the case of replicated
900 pools) or erasure code fragments (in the case of erasure-coded pools). More
901 precisely, one or more Placement Groups (PGs):
902
903 * have the *degraded* or *undersized* flag set, which means that there are not
904 enough instances of that PG in the cluster; or
905 * have not had the *clean* state set for a long time.
906
907 For detailed information about which PGs are affected, run the following
908 command:
909
910 .. prompt:: bash $
911
912 ceph health detail
913
914 In most cases, the root cause of this issue is that one or more OSDs are
915 currently "down": see ``OSD_DOWN`` above.
916
917 To see the state of a specific problematic PG, run the following command:
918
919 .. prompt:: bash $
920
921 ceph tell <pgid> query
922
923
924 PG_RECOVERY_FULL
925 ________________
926
927 Data redundancy might be reduced or even put at risk for some data due to a
928 lack of free space in the cluster. More precisely, one or more Placement Groups
929 have the *recovery_toofull* flag set, which means that the cluster is unable to
930 migrate or recover data because one or more OSDs are above the ``full``
931 threshold.
932
933 For steps to resolve this condition, see *OSD_FULL* above.
934
935 PG_BACKFILL_FULL
936 ________________
937
938 Data redundancy might be reduced or even put at risk for some data due to a
939 lack of free space in the cluster. More precisely, one or more Placement Groups
940 have the *backfill_toofull* flag set, which means that the cluster is unable to
941 migrate or recover data because one or more OSDs are above the ``backfillfull``
942 threshold.
943
944 For steps to resolve this condition, see *OSD_BACKFILLFULL* above.
945
946 PG_DAMAGED
947 __________
948
949 Data scrubbing has discovered problems with data consistency in the cluster.
950 More precisely, one or more Placement Groups either (1) have the *inconsistent*
951 or ``snaptrim_error`` flag set, which indicates that an earlier data scrub
952 operation found a problem, or (2) have the *repair* flag set, which means that
953 a repair for such an inconsistency is currently in progress.
954
955 For more information, see :doc:`pg-repair`.
956
957 OSD_SCRUB_ERRORS
958 ________________
959
960 Recent OSD scrubs have discovered inconsistencies. This alert is generally
961 paired with *PG_DAMAGED* (see above).
962
963 For more information, see :doc:`pg-repair`.
964
965 OSD_TOO_MANY_REPAIRS
966 ____________________
967
968 The count of read repairs has exceeded the config value threshold
969 ``mon_osd_warn_num_repaired`` (default: ``10``). Because scrub handles errors
970 only for data at rest, and because any read error that occurs when another
971 replica is available will be repaired immediately so that the client can get
972 the object data, there might exist failing disks that are not registering any
973 scrub errors. This repair count is maintained as a way of identifying any such
974 failing disks.
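
To see which OSD(s) are reporting an elevated repair count, run ``ceph health
detail``. In addition, recent Ceph releases provide a ``clear_shards_repaired``
command that resets the counter after (for example) a failing drive has been
replaced; treat its availability as an assumption to verify for your release.
The OSD ID below is only an illustration:

.. prompt:: bash $

   ceph health detail
   ceph tell osd.123 clear_shards_repaired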
975
976
977 LARGE_OMAP_OBJECTS
978 __________________
979
980 One or more pools contain large omap objects, as determined by
981 ``osd_deep_scrub_large_omap_object_key_threshold`` (threshold for the number of
982 keys to determine what is considered a large omap object) or
983 ``osd_deep_scrub_large_omap_object_value_sum_threshold`` (the threshold for the
984 summed size in bytes of all key values to determine what is considered a large
985 omap object) or both. To find more information on object name, key count, and
986 size in bytes, search the cluster log for 'Large omap object found'. This issue
987 can be caused by RGW-bucket index objects that do not have automatic resharding
988 enabled. For more information on resharding, see :ref:`RGW Dynamic Bucket Index
989 Resharding <rgw_dynamic_bucket_index_resharding>`.
990
991 To adjust the thresholds mentioned above, run the following commands:
992
993 .. prompt:: bash $
994
995 ceph config set osd osd_deep_scrub_large_omap_object_key_threshold <keys>
996 ceph config set osd osd_deep_scrub_large_omap_object_value_sum_threshold <bytes>
997
998 CACHE_POOL_NEAR_FULL
999 ____________________
1000
1001 A cache-tier pool is nearly full, as determined by the ``target_max_bytes`` and
1002 ``target_max_objects`` properties of the cache pool. Once the pool reaches the
1003 target threshold, write requests to the pool might block while data is flushed
1004 and evicted from the cache. This state normally leads to very high latencies
1005 and poor performance.
1006
1007 To adjust the cache pool's target size, run the following commands:
1008
1009 .. prompt:: bash $
1010
1011 ceph osd pool set <cache-pool-name> target_max_bytes <bytes>
1012 ceph osd pool set <cache-pool-name> target_max_objects <objects>
1013
1014 There might be other reasons that normal cache flush and evict activity are
1015 throttled: for example, reduced availability of the base tier, reduced
1016 performance of the base tier, or overall cluster load.
1017
1018 TOO_FEW_PGS
1019 ___________
1020
1021 The number of Placement Groups (PGs) that are in use in the cluster is below
1022 the configurable threshold of ``mon_pg_warn_min_per_osd`` PGs per OSD. This can
1023 lead to suboptimal distribution and suboptimal balance of data across the OSDs
1024 in the cluster, and a reduction of overall performance.
1025
1026 If data pools have not yet been created, this condition is expected.
1027
1028 To address this issue, you can increase the PG count for existing pools or
1029 create new pools. For more information, see
1030 :ref:`choosing-number-of-placement-groups`.
1031
1032 POOL_PG_NUM_NOT_POWER_OF_TWO
1033 ____________________________
1034
1035 One or more pools have a ``pg_num`` value that is not a power of two. Although
1036 this is not strictly incorrect, it does lead to a less balanced distribution of
1037 data because some Placement Groups will have roughly twice as much data as
1038 others have.
1039
1040 This is easily corrected by setting the ``pg_num`` value for the affected
1041 pool(s) to a nearby power of two. To do so, run the following command:
1042
1043 .. prompt:: bash $
1044
1045 ceph osd pool set <pool-name> pg_num <value>
1046
1047 To disable this health check, run the following command:
1048
1049 .. prompt:: bash $
1050
1051 ceph config set global mon_warn_on_pool_pg_num_not_power_of_two false
1052
1053 POOL_TOO_FEW_PGS
1054 ________________
1055
1056 One or more pools should probably have more Placement Groups (PGs), given the
1057 amount of data that is currently stored in the pool. This issue can lead to
1058 suboptimal distribution and suboptimal balance of data across the OSDs in the
1059 cluster, and a reduction of overall performance. This alert is raised only if
1060 the ``pg_autoscale_mode`` property on the pool is set to ``warn``.
1061
1062 To disable the alert, entirely disable auto-scaling of PGs for the pool by
1063 running the following command:
1064
1065 .. prompt:: bash $
1066
1067 ceph osd pool set <pool-name> pg_autoscale_mode off
1068
1069 To allow the cluster to automatically adjust the number of PGs for the pool,
1070 run the following command:
1071
1072 .. prompt:: bash $
1073
1074 ceph osd pool set <pool-name> pg_autoscale_mode on
1075
1076 Alternatively, to manually set the number of PGs for the pool to the
1077 recommended amount, run the following command:
1078
1079 .. prompt:: bash $
1080
1081 ceph osd pool set <pool-name> pg_num <new-pg-num>
1082
1083 For more information, see :ref:`choosing-number-of-placement-groups` and
1084 :ref:`pg-autoscaler`.
1085
1086 TOO_MANY_PGS
1087 ____________
1088
1089 The number of Placement Groups (PGs) in use in the cluster is above the
1090 configurable threshold of ``mon_max_pg_per_osd`` PGs per OSD. If this threshold
1091 is exceeded, the cluster will not allow new pools to be created, pool `pg_num`
1092 to be increased, or pool replication to be increased (any of which, if allowed,
1093 would lead to more PGs in the cluster). A large number of PGs can lead to
1094 higher memory utilization for OSD daemons, slower peering after cluster state
1095 changes (for example, OSD restarts, additions, or removals), and higher load on
1096 the Manager and Monitor daemons.
1097
1098 The simplest way to mitigate the problem is to increase the number of OSDs in
1099 the cluster by adding more hardware. Note that, because the OSD count that is
1100 used for the purposes of this health check is the number of ``in`` OSDs,
1101 marking ``out`` OSDs ``in`` (if there are any ``out`` OSDs available) can also
1102 help. To do so, run the following command:
1103
1104 .. prompt:: bash $
1105
1106 ceph osd in <osd id(s)>
1107
1108 For more information, see :ref:`choosing-number-of-placement-groups`.
1109
1110 POOL_TOO_MANY_PGS
1111 _________________
1112
1113 One or more pools should probably have fewer Placement Groups (PGs), given the
1114 amount of data that is currently stored in the pool. This issue can lead to
1115 higher memory utilization for OSD daemons, slower peering after cluster state
1116 changes (for example, OSD restarts, additions, or removals), and higher load on
1117 the Manager and Monitor daemons. This alert is raised only if the
1118 ``pg_autoscale_mode`` property on the pool is set to ``warn``.
1119
1120 To disable the alert, entirely disable auto-scaling of PGs for the pool by
1121 running the following command:
1122
1123 .. prompt:: bash $
1124
1125 ceph osd pool set <pool-name> pg_autoscale_mode off
1126
1127 To allow the cluster to automatically adjust the number of PGs for the pool,
1128 run the following command:
1129
1130 .. prompt:: bash $
1131
1132 ceph osd pool set <pool-name> pg_autoscale_mode on
1133
1134 Alternatively, to manually set the number of PGs for the pool to the
1135 recommended amount, run the following command:
1136
1137 .. prompt:: bash $
1138
1139 ceph osd pool set <pool-name> pg_num <new-pg-num>
1140
1141 For more information, see :ref:`choosing-number-of-placement-groups` and
1142 :ref:`pg-autoscaler`.
1143
1144
1145 POOL_TARGET_SIZE_BYTES_OVERCOMMITTED
1146 ____________________________________
1147
1148 One or more pools have a ``target_size_bytes`` property that is set in order to
1149 estimate the expected size of the pool, but the value(s) of this property are
1150 greater than the total available storage (either by themselves or in
1151 combination with other pools).
1152
1153 This alert is usually an indication that the ``target_size_bytes`` value for
1154 the pool is too large and should be reduced or set to zero. To reduce the
1155 ``target_size_bytes`` value or set it to zero, run the following command:
1156
1157 .. prompt:: bash $
1158
1159 ceph osd pool set <pool-name> target_size_bytes 0
1160
1161 The above command sets the value of ``target_size_bytes`` to zero. To set the
1162 value of ``target_size_bytes`` to a non-zero value, replace the ``0`` with that
1163 non-zero value.
1164
1165 For more information, see :ref:`specifying_pool_target_size`.
1166
1167 POOL_HAS_TARGET_SIZE_BYTES_AND_RATIO
1168 ____________________________________
1169
1170 One or more pools have both ``target_size_bytes`` and ``target_size_ratio`` set
1171 in order to estimate the expected size of the pool. Only one of these
1172 properties should be non-zero. If both are set to a non-zero value, then
1173 ``target_size_ratio`` takes precedence and ``target_size_bytes`` is ignored.
1174
1175 To reset ``target_size_bytes`` to zero, run the following command:
1176
1177 .. prompt:: bash $
1178
1179 ceph osd pool set <pool-name> target_size_bytes 0
1180
1181 For more information, see :ref:`specifying_pool_target_size`.
1182
1183 TOO_FEW_OSDS
1184 ____________
1185
1186 The number of OSDs in the cluster is below the configurable threshold of
``osd_pool_default_size``. This means that the cluster might be unable to
satisfy the data protection policy, as specified in CRUSH rules and pool
settings, for some or all data.
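
To see how many OSDs are currently ``up`` and ``in``, and to see the
configured default pool size, commands like the following can be used:

.. prompt:: bash $

   ceph osd stat
   ceph config get mon osd_pool_default_size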
1189
1190 SMALLER_PGP_NUM
1191 _______________
1192
1193 One or more pools have a ``pgp_num`` value less than ``pg_num``. This alert is
1194 normally an indication that the Placement Group (PG) count was increased
1195 without any increase in the placement behavior.
1196
1197 This disparity is sometimes brought about deliberately, in order to separate
1198 out the `split` step when the PG count is adjusted from the data migration that
1199 is needed when ``pgp_num`` is changed.
1200
1201 This issue is normally resolved by setting ``pgp_num`` to match ``pg_num``, so
1202 as to trigger the data migration, by running the following command:
1203
1204 .. prompt:: bash $
1205
1206 ceph osd pool set <pool> pgp_num <pg-num-value>
1207
1208 MANY_OBJECTS_PER_PG
1209 ___________________
1210
1211 One or more pools have an average number of objects per Placement Group (PG)
1212 that is significantly higher than the overall cluster average. The specific
1213 threshold is determined by the ``mon_pg_warn_max_object_skew`` configuration
1214 value.
1215
1216 This alert is usually an indication that the pool(s) that contain most of the
1217 data in the cluster have too few PGs, or that other pools that contain less
1218 data have too many PGs. See *TOO_MANY_PGS* above.
1219
1220 To silence the health check, raise the threshold by adjusting the
1221 ``mon_pg_warn_max_object_skew`` config option on the managers.
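
For example, to raise the threshold to ``20`` (the value is only an
illustration), run the following command:

.. prompt:: bash $

   ceph config set mgr mon_pg_warn_max_object_skew 20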
1222
1223 The health check will be silenced for a specific pool only if
1224 ``pg_autoscale_mode`` is set to ``on``.
1225
1226 POOL_APP_NOT_ENABLED
1227 ____________________
1228
1229 A pool exists but the pool has not been tagged for use by a particular
1230 application.
1231
1232 To resolve this issue, tag the pool for use by an application. For
1233 example, if the pool is used by RBD, run the following command:
1234
1235 .. prompt:: bash $
1236
1237 rbd pool init <poolname>
1238
1239 Alternatively, if the pool is being used by a custom application (here 'foo'),
1240 you can label the pool by running the following low-level command:
1241
1242 .. prompt:: bash $
1243
   ceph osd pool application enable <poolname> foo
1245
1246 For more information, see :ref:`associate-pool-to-application`.
1247
1248 POOL_FULL
1249 _________
1250
1251 One or more pools have reached (or are very close to reaching) their quota. The
1252 threshold to raise this health check is determined by the
1253 ``mon_pool_quota_crit_threshold`` configuration option.
1254
1255 Pool quotas can be adjusted up or down (or removed) by running the following
1256 commands:
1257
1258 .. prompt:: bash $
1259
1260 ceph osd pool set-quota <pool> max_bytes <bytes>
1261 ceph osd pool set-quota <pool> max_objects <objects>
1262
1263 To disable a quota, set the quota value to 0.
1264
1265 POOL_NEAR_FULL
1266 ______________
1267
1268 One or more pools are approaching a configured fullness threshold.
1269
1270 One of the several thresholds that can raise this health check is determined by
1271 the ``mon_pool_quota_warn_threshold`` configuration option.
1272
1273 Pool quotas can be adjusted up or down (or removed) by running the following
1274 commands:
1275
1276 .. prompt:: bash $
1277
1278 ceph osd pool set-quota <pool> max_bytes <bytes>
1279 ceph osd pool set-quota <pool> max_objects <objects>
1280
1281 To disable a quota, set the quota value to 0.
1282
1283 Other thresholds that can raise the two health checks above are
1284 ``mon_osd_nearfull_ratio`` and ``mon_osd_full_ratio``. For details and
1285 resolution, see :ref:`storage-capacity` and :ref:`no-free-drive-space`.
1286
1287 OBJECT_MISPLACED
1288 ________________
1289
1290 One or more objects in the cluster are not stored on the node that CRUSH would
1291 prefer that they be stored on. This alert is an indication that data migration
1292 due to a recent cluster change has not yet completed.
1293
1294 Misplaced data is not a dangerous condition in and of itself; data consistency
1295 is never at risk, and old copies of objects will not be removed until the
1296 desired number of new copies (in the desired locations) has been created.
1297
1298 OBJECT_UNFOUND
1299 ______________
1300
1301 One or more objects in the cluster cannot be found. More precisely, the OSDs
1302 know that a new or updated copy of an object should exist, but no such copy has
1303 been found on OSDs that are currently online.
1304
1305 Read or write requests to unfound objects will block.
1306
1307 Ideally, a "down" OSD that has a more recent copy of the unfound object can be
1308 brought back online. To identify candidate OSDs, check the peering state of the
1309 PG(s) responsible for the unfound object. To see the peering state, run the
1310 following command:
1311
1312 .. prompt:: bash $
1313
1314 ceph tell <pgid> query
1315
1316 On the other hand, if the latest copy of the object is not available, the
1317 cluster can be told to roll back to a previous version of the object. For more
1318 information, see :ref:`failures-osd-unfound`.
1319
1320 SLOW_OPS
1321 ________
1322
1323 One or more OSD requests or monitor requests are taking a long time to process.
1324 This alert might be an indication of extreme load, a slow storage device, or a
1325 software bug.
1326
1327 To query the request queue for the daemon that is causing the slowdown, run the
1328 following command from the daemon's host:
1329
1330 .. prompt:: bash $
1331
1332 ceph daemon osd.<id> ops
1333
1334 To see a summary of the slowest recent requests, run the following command:
1335
1336 .. prompt:: bash $
1337
1338 ceph daemon osd.<id> dump_historic_ops
1339
1340 To see the location of a specific OSD, run the following command:
1341
1342 .. prompt:: bash $
1343
1344 ceph osd find osd.<id>
1345
1346 PG_NOT_SCRUBBED
1347 _______________
1348
1349 One or more Placement Groups (PGs) have not been scrubbed recently. PGs are
1350 normally scrubbed within an interval determined by
1351 :confval:`osd_scrub_max_interval` globally. This interval can be overridden on
a per-pool basis by changing the value of the variable
1353 :confval:`scrub_max_interval`. This health check is raised if a certain
1354 percentage (determined by ``mon_warn_pg_not_scrubbed_ratio``) of the interval
1355 has elapsed after the time the scrub was scheduled and no scrub has been
1356 performed.
1357
1358 PGs will be scrubbed only if they are flagged as ``clean`` (which means that
1359 they are to be cleaned, and not that they have been examined and found to be
1360 clean). Misplaced or degraded PGs will not be flagged as ``clean`` (see
1361 *PG_AVAILABILITY* and *PG_DEGRADED* above).
1362
1363 To manually initiate a scrub of a clean PG, run the following command:
1364
.. prompt:: bash $
1366
1367 ceph pg scrub <pgid>
1368
1369 PG_NOT_DEEP_SCRUBBED
1370 ____________________
1371
1372 One or more Placement Groups (PGs) have not been deep scrubbed recently. PGs
1373 are normally scrubbed every :confval:`osd_deep_scrub_interval` seconds at most.
1374 This health check is raised if a certain percentage (determined by
1375 ``mon_warn_pg_not_deep_scrubbed_ratio``) of the interval has elapsed after the
1376 time the scrub was scheduled and no scrub has been performed.
1377
1378 PGs will receive a deep scrub only if they are flagged as *clean* (which means
1379 that they are to be cleaned, and not that they have been examined and found to
1380 be clean). Misplaced or degraded PGs might not be flagged as ``clean`` (see
1381 *PG_AVAILABILITY* and *PG_DEGRADED* above).
1382
1383 To manually initiate a deep scrub of a clean PG, run the following command:
1384
1385 .. prompt:: bash $
1386
1387 ceph pg deep-scrub <pgid>
1388
1389
1390 PG_SLOW_SNAP_TRIMMING
1391 _____________________
1392
1393 The snapshot trim queue for one or more PGs has exceeded the configured warning
1394 threshold. This alert indicates either that an extremely large number of
1395 snapshots was recently deleted, or that OSDs are unable to trim snapshots
1396 quickly enough to keep up with the rate of new snapshot deletions.
1397
1398 The warning threshold is determined by the ``mon_osd_snap_trim_queue_warn_on``
1399 option (default: 32768).
1400
1401 This alert might be raised if OSDs are under excessive load and unable to keep
1402 up with their background work, or if the OSDs' internal metadata database is
1403 heavily fragmented and unable to perform. The alert might also indicate some
1404 other performance issue with the OSDs.
1405
1406 The exact size of the snapshot trim queue is reported by the ``snaptrimq_len``
1407 field of ``ceph pg ls -f json-detail``.
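
For example, to list each PG together with the length of its snapshot trim
queue, a filter like the following can be used (this is a sketch that assumes
the JSON output contains a ``pg_stats`` array; the exact structure can vary
between releases):

.. prompt:: bash $

   ceph pg ls -f json-detail | jq -r '.pg_stats[] | "\(.pgid) \(.snaptrimq_len)"'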
1408
1409 Stretch Mode
1410 ------------
1411
1412 INCORRECT_NUM_BUCKETS_STRETCH_MODE
1413 __________________________________
1414
Stretch mode currently supports only two dividing buckets that contain OSDs.
This warning indicates that the number of dividing buckets is not equal to two
after stretch mode has been enabled. Until the condition is fixed, you can
expect unpredictable failures and MON assertions.

We encourage you to fix this by removing the extra dividing buckets or by
bringing the number of dividing buckets up to two.
1421
1422 UNEVEN_WEIGHTS_STRETCH_MODE
1423 ___________________________
1424
The two dividing buckets must have equal weights when stretch mode is enabled.
This warning indicates that the two dividing buckets have uneven weights after
stretch mode has been enabled. This is not immediately fatal, but you can
expect Ceph to be confused when processing transitions between the dividing
buckets.

We encourage you to fix this by making the weights of the two dividing buckets
even. This can be done by making sure that the combined weight of the OSDs in
each dividing bucket is the same.
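
To compare the weights of the dividing buckets (for example, two data center
buckets), inspect the CRUSH hierarchy by running the following command:

.. prompt:: bash $

   ceph osd tree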
1433
1434 Miscellaneous
1435 -------------
1436
1437 RECENT_CRASH
1438 ____________
1439
1440 One or more Ceph daemons have crashed recently, and the crash(es) have not yet
1441 been acknowledged and archived by the administrator. This alert might indicate
1442 a software bug, a hardware problem (for example, a failing disk), or some other
1443 problem.
1444
1445 To list recent crashes, run the following command:
1446
1447 .. prompt:: bash $
1448
1449 ceph crash ls-new
1450
1451 To examine information about a specific crash, run the following command:
1452
1453 .. prompt:: bash $
1454
1455 ceph crash info <crash-id>
1456
1457 To silence this alert, you can archive the crash (perhaps after the crash
1458 has been examined by an administrator) by running the following command:
1459
1460 .. prompt:: bash $
1461
1462 ceph crash archive <crash-id>
1463
1464 Similarly, to archive all recent crashes, run the following command:
1465
1466 .. prompt:: bash $
1467
1468 ceph crash archive-all
1469
1470 Archived crashes will still be visible by running the command ``ceph crash
1471 ls``, but not by running the command ``ceph crash ls-new``.
1472
1473 The time period that is considered recent is determined by the option
1474 ``mgr/crash/warn_recent_interval`` (default: two weeks).
1475
1476 To entirely disable this alert, run the following command:
1477
1478 .. prompt:: bash $
1479
   ceph config set mgr mgr/crash/warn_recent_interval 0
1481
1482 RECENT_MGR_MODULE_CRASH
1483 _______________________
1484
1485 One or more ``ceph-mgr`` modules have crashed recently, and the crash(es) have
1486 not yet been acknowledged and archived by the administrator. This alert
1487 usually indicates a software bug in one of the software modules that are
1488 running inside the ``ceph-mgr`` daemon. The module that experienced the problem
1489 might be disabled as a result, but other modules are unaffected and continue to
1490 function as expected.
1491
1492 As with the *RECENT_CRASH* health check, a specific crash can be inspected by
1493 running the following command:
1494
1495 .. prompt:: bash $
1496
1497 ceph crash info <crash-id>
1498
1499 To silence this alert, you can archive the crash (perhaps after the crash has
1500 been examined by an administrator) by running the following command:
1501
1502 .. prompt:: bash $
1503
1504 ceph crash archive <crash-id>
1505
1506 Similarly, to archive all recent crashes, run the following command:
1507
1508 .. prompt:: bash $
1509
1510 ceph crash archive-all
1511
1512 Archived crashes will still be visible by running the command ``ceph crash ls``
1513 but not by running the command ``ceph crash ls-new``.
1514
1515 The time period that is considered recent is determined by the option
1516 ``mgr/crash/warn_recent_interval`` (default: two weeks).
1517
1518 To entirely disable this alert, run the following command:
1519
1520 .. prompt:: bash $
1521
   ceph config set mgr mgr/crash/warn_recent_interval 0
1523
1524 TELEMETRY_CHANGED
1525 _________________
1526
1527 Telemetry has been enabled, but because the contents of the telemetry report
1528 have changed in the meantime, telemetry reports will not be sent.
1529
1530 Ceph developers occasionally revise the telemetry feature to include new and
1531 useful information, or to remove information found to be useless or sensitive.
1532 If any new information is included in the report, Ceph requires the
1533 administrator to re-enable telemetry. This requirement ensures that the
1534 administrator has an opportunity to (re)review the information that will be
1535 shared.
1536
1537 To review the contents of the telemetry report, run the following command:
1538
1539 .. prompt:: bash $
1540
1541 ceph telemetry show
1542
1543 Note that the telemetry report consists of several channels that may be
1544 independently enabled or disabled. For more information, see :ref:`telemetry`.
1545
1546 To re-enable telemetry (and silence the alert), run the following command:
1547
1548 .. prompt:: bash $
1549
1550 ceph telemetry on
1551
1552 To disable telemetry (and silence the alert), run the following command:
1553
1554 .. prompt:: bash $
1555
1556 ceph telemetry off
1557
1558 AUTH_BAD_CAPS
1559 _____________
1560
1561 One or more auth users have capabilities that cannot be parsed by the monitors.
1562 As a general rule, this alert indicates that there are one or more daemon types
1563 that the user is not authorized to use to perform any action.
1564
1565 This alert is most likely to be raised after an upgrade if (1) the capabilities
1566 were set with an older version of Ceph that did not properly validate the
1567 syntax of those capabilities, or if (2) the syntax of the capabilities has
1568 changed.
1569
1570 To remove the user(s) in question, run the following command:
1571
1572 .. prompt:: bash $
1573
1574 ceph auth rm <entity-name>
1575
1576 (This resolves the health check, but it prevents clients from being able to
1577 authenticate as the removed user.)
1578
1579 Alternatively, to update the capabilities for the user(s), run the following
1580 command:
1581
1582 .. prompt:: bash $
1583
   ceph auth caps <entity-name> <daemon-type> <caps> [<daemon-type> <caps> ...]
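
For example, to give a hypothetical user ``client.foo`` read-only access to
the monitors and read/write access to a pool named ``bar`` (both names are
only illustrations), run the following command:

.. prompt:: bash $

   ceph auth caps client.foo mon 'allow r' osd 'allow rw pool=bar'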
1585
1586 For more information about auth capabilities, see :ref:`user-management`.
1587
1588 OSD_NO_DOWN_OUT_INTERVAL
1589 ________________________
1590
1591 The ``mon_osd_down_out_interval`` option is set to zero, which means that the
1592 system does not automatically perform any repair or healing operations when an
OSD fails. Instead, an administrator or an external orchestrator must manually
1594 mark "down" OSDs as ``out`` (by running ``ceph osd out <osd-id>``) in order to
1595 trigger recovery.
1596
1597 This option is normally set to five or ten minutes, which should be enough time
1598 for a host to power-cycle or reboot.
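
For example, to restore the conventional behavior of automatically marking
"down" OSDs ``out`` after ten minutes, run the following command:

.. prompt:: bash $

   ceph config set mon mon_osd_down_out_interval 600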
1599
1600 To silence this alert, set ``mon_warn_on_osd_down_out_interval_zero`` to
1601 ``false`` by running the following command:
1602
1603 .. prompt:: bash $
1604
   ceph config set mon mon_warn_on_osd_down_out_interval_zero false
1606
1607 DASHBOARD_DEBUG
1608 _______________
1609
1610 The Dashboard debug mode is enabled. This means that if there is an error while
1611 processing a REST API request, the HTTP error response will contain a Python
1612 traceback. This mode should be disabled in production environments because such
1613 a traceback might contain and expose sensitive information.
1614
1615 To disable the debug mode, run the following command:
1616
1617 .. prompt:: bash $
1618
1619 ceph dashboard debug disable