=============
Health checks
=============

Overview
========

There is a finite set of possible health messages that a Ceph cluster can
raise -- these are defined as *health checks* which have unique identifiers.

The identifier is a terse pseudo-human-readable (i.e. like a variable name)
string. It is intended to enable tools (such as UIs) to make sense of
health checks, and present them in a way that reflects their meaning.

This page lists the health checks that are raised by the monitor and manager
daemons. In addition to these, you may also see health checks that originate
from MDS daemons (see :ref:`cephfs-health-messages`), and health checks
that are defined by ceph-mgr python modules.

Definitions
===========

Monitor
-------

MON_DOWN
________

One or more monitor daemons is currently down. The cluster requires a
majority (more than 1/2) of the monitors in order to function. When
one or more monitors are down, clients may have a harder time forming
their initial connection to the cluster as they may need to try more
addresses before they reach an operating monitor.

The down monitor daemon should generally be restarted as soon as
possible to reduce the risk of a subsequent monitor failure leading to
a service outage.

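The monitors that are currently down, and the state of the quorum, can be
reviewed with::

    ceph health detail
    ceph quorum_status

How a down monitor is restarted depends on how it was deployed; on a
systemd-managed host this is typically something like
``systemctl restart ceph-mon@<hostname>`` (the unit name shown here is
illustrative).
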
MON_CLOCK_SKEW
______________

The clocks on the hosts running the ceph-mon monitor daemons are not
sufficiently well synchronized. This health alert is raised if the
cluster detects a clock skew greater than ``mon_clock_drift_allowed``.

This is best resolved by synchronizing the clocks using a tool like
``ntpd`` or ``chrony``.

If it is impractical to keep the clocks closely synchronized, the
``mon_clock_drift_allowed`` threshold can also be increased, but this
value must stay significantly below the ``mon_lease`` interval in
order for the monitor cluster to function properly.

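The skew currently observed for each monitor can be inspected with::

    ceph time-sync-status

If raising the threshold is unavoidable, it can be adjusted with, for
example (the value shown is illustrative and is given in seconds)::

    ceph config set mon mon_clock_drift_allowed 0.1
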
MON_MSGR2_NOT_ENABLED
_____________________

The ``ms_bind_msgr2`` option is enabled but one or more monitors is
not configured to bind to a v2 port in the cluster's monmap. This
means that features specific to the msgr2 protocol (e.g., encryption)
are not available on some or all connections.

In most cases this can be corrected by issuing the command::

    ceph mon enable-msgr2

That command will change any monitor configured for the old default
port 6789 to continue to listen for v1 connections on 6789 and also
listen for v2 connections on the new default 3300 port.

If a monitor is configured to listen for v1 connections on a
non-standard port (not 6789), then the monmap will need to be modified
manually.

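Whether each monitor advertises both protocols can be verified by looking at
the addresses in the monmap; a monitor bound to both protocols will show a
``v2:`` and a ``v1:`` address::

    ceph mon dump
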

Manager
-------

MGR_MODULE_DEPENDENCY
_____________________

An enabled manager module is failing its dependency check. This health check
should come with an explanatory message from the module about the problem.

For example, a module might report that a required package is not installed:
install the required package and restart your manager daemons.

This health check is only applied to enabled modules. If a module is
not enabled, you can see whether it is reporting dependency issues in
the output of `ceph mgr module ls`.

MGR_MODULE_ERROR
________________

A manager module has experienced an unexpected error. Typically,
this means an unhandled exception was raised from the module's `serve`
function. The human-readable description of the error may be obscurely
worded if the exception did not provide a useful description of itself.

This health check may indicate a bug: please open a Ceph bug report if you
think you have encountered a bug.

If you believe the error is transient, you may restart your manager
daemon(s), or use `ceph mgr fail` on the active daemon to prompt
a failover to another daemon.

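For example, the currently active manager can be identified and failed over
with (the daemon name ``x`` is illustrative)::

    ceph status | grep mgr
    ceph mgr fail x
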

OSDs
----

OSD_DOWN
________

One or more OSDs are marked down. The ceph-osd daemon may have been
stopped, or peer OSDs may be unable to reach the OSD over the network.
Common causes include a stopped or crashed daemon, a down host, or a
network outage.

Verify that the host is healthy, the daemon is started, and the network is
functioning. If the daemon has crashed, the daemon log file
(``/var/log/ceph/ceph-osd.*``) may contain debugging information.

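The down OSDs can be listed with::

    ceph osd tree down

If the daemon simply needs to be started, how to do so depends on the
deployment; on a systemd-managed host this is typically something like
``systemctl restart ceph-osd@<id>`` (the unit name is illustrative).
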
OSD_<crush type>_DOWN
_____________________

(e.g. OSD_HOST_DOWN, OSD_ROOT_DOWN)

All the OSDs within a particular CRUSH subtree are marked down, for example
all OSDs on a host.

OSD_ORPHAN
__________

An OSD is referenced in the CRUSH map hierarchy but does not exist.

The OSD can be removed from the CRUSH hierarchy with::

    ceph osd crush rm osd.<id>

OSD_OUT_OF_ORDER_FULL
_____________________

The utilization thresholds for `nearfull`, `backfillfull`, `full`,
and/or `failsafe_full` are not ascending. In particular, we expect
`nearfull < backfillfull`, `backfillfull < full`, and `full <
failsafe_full`.

The thresholds can be adjusted with::

    ceph osd set-nearfull-ratio <ratio>
    ceph osd set-backfillfull-ratio <ratio>
    ceph osd set-full-ratio <ratio>

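The currently configured ratios can be reviewed with::

    ceph osd dump | grep full_ratio
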

OSD_FULL
________

One or more OSDs has exceeded the `full` threshold and is preventing
the cluster from servicing writes.

Utilization by pool can be checked with::

    ceph df

The currently defined `full` ratio can be seen with::

    ceph osd dump | grep full_ratio

A short-term workaround to restore write availability is to raise the full
threshold by a small amount::

    ceph osd set-full-ratio <ratio>

New storage should be added to the cluster by deploying more OSDs, or
existing data should be deleted in order to free up space.

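The utilization of individual OSDs can be checked with::

    ceph osd df

This can help identify whether the cluster is genuinely out of space or
whether data is poorly balanced across OSDs.
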
OSD_BACKFILLFULL
________________

One or more OSDs has exceeded the `backfillfull` threshold, which will
prevent data from being rebalanced to this device. This is
an early warning that rebalancing may not be able to complete and that
the cluster is approaching full.

Utilization by pool can be checked with::

    ceph df

OSD_NEARFULL
____________

One or more OSDs has exceeded the `nearfull` threshold. This is an early
warning that the cluster is approaching full.

Utilization by pool can be checked with::

    ceph df

OSDMAP_FLAGS
____________

One or more cluster flags of interest has been set. These flags include:

* *full* - the cluster is flagged as full and cannot serve writes
* *pauserd*, *pausewr* - paused reads or writes
* *noup* - OSDs are not allowed to start
* *nodown* - OSD failure reports are being ignored, such that the
  monitors will not mark OSDs `down`
* *noin* - OSDs that were previously marked `out` will not be marked
  back `in` when they start
* *noout* - down OSDs will not automatically be marked out after the
  configured interval
* *nobackfill*, *norecover*, *norebalance* - recovery or data
  rebalancing is suspended
* *noscrub*, *nodeep_scrub* - scrubbing is disabled
* *notieragent* - cache tiering activity is suspended

With the exception of *full*, these flags can be set or cleared with::

    ceph osd set <flag>
    ceph osd unset <flag>

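For example, to prevent down OSDs from being marked out during planned
maintenance, and to clear the flag afterwards::

    ceph osd set noout
    ceph osd unset noout
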
OSD_FLAGS
_________

One or more OSDs, CRUSH nodes, or device classes has a flag of interest set.
These flags include:

* *noup*: these OSDs are not allowed to start
* *nodown*: failure reports for these OSDs will be ignored
* *noin*: if these OSDs were previously marked `out` automatically
  after a failure, they will not be marked in when they start
* *noout*: if these OSDs are down they will not automatically be marked
  `out` after the configured interval

These flags can be set and cleared in batch with::

    ceph osd set-group <flags> <who>
    ceph osd unset-group <flags> <who>

For example, ::

    ceph osd set-group noup,noout osd.0 osd.1
    ceph osd unset-group noup,noout osd.0 osd.1
    ceph osd set-group noup,noout host-foo
    ceph osd unset-group noup,noout host-foo
    ceph osd set-group noup,noout class-hdd
    ceph osd unset-group noup,noout class-hdd

OLD_CRUSH_TUNABLES
__________________

The CRUSH map is using very old settings and should be updated. The
oldest tunables that can be used (i.e., the oldest client version that
can connect to the cluster) without triggering this health warning are
determined by the ``mon_crush_min_required_version`` config option.
See :ref:`crush-map-tunables` for more information.

OLD_CRUSH_STRAW_CALC_VERSION
____________________________

The CRUSH map is using an older, non-optimal method for calculating
intermediate weight values for ``straw`` buckets.

The CRUSH map should be updated to use the newer method
(``straw_calc_version=1``). See
:ref:`crush-map-tunables` for more information.

CACHE_POOL_NO_HIT_SET
_____________________

One or more cache pools is not configured with a *hit set* to track
utilization, which will prevent the tiering agent from identifying
cold objects to flush and evict from the cache.

Hit sets can be configured on the cache pool with::

    ceph osd pool set <poolname> hit_set_type <type>
    ceph osd pool set <poolname> hit_set_period <period-in-seconds>
    ceph osd pool set <poolname> hit_set_count <number-of-hitsets>
    ceph osd pool set <poolname> hit_set_fpp <target-false-positive-rate>

OSD_NO_SORTBITWISE
__________________

No pre-Luminous (v12.y.z) OSDs are running, but the ``sortbitwise`` flag has
not been set.

The ``sortbitwise`` flag must be set before OSDs running Luminous v12.y.z or
newer can start. You can safely set the flag with::

    ceph osd set sortbitwise

POOL_FULL
_________

One or more pools has reached its quota and is no longer allowing writes.

Pool quotas and utilization can be seen with::

    ceph df detail

You can either raise the pool quota with::

    ceph osd pool set-quota <poolname> max_objects <num-objects>
    ceph osd pool set-quota <poolname> max_bytes <num-bytes>

or delete some existing data to reduce utilization.

BLUEFS_SPILLOVER
________________

One or more OSDs that use the BlueStore backend have been allocated
`db` partitions (storage space for metadata, normally on a faster
device) but that space has filled, such that metadata has "spilled
over" onto the normal slow device. This isn't necessarily an error
condition or even unexpected, but if the administrator's expectation
was that all metadata would fit on the faster device, it indicates
that not enough space was provided.

This warning can be disabled on all OSDs with::

    ceph config set osd bluestore_warn_on_bluefs_spillover false

Alternatively, it can be disabled on a specific OSD with::

    ceph config set osd.123 bluestore_warn_on_bluefs_spillover false

To provide more metadata space, the OSD in question could be destroyed and
reprovisioned. This will involve data migration and recovery.

It may also be possible to expand the LVM logical volume backing the
`db` storage. If the underlying LV has been expanded, the OSD daemon
needs to be stopped and BlueFS informed of the device size change with::

    ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-$ID

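As a sketch, expanding the underlying LV might look like the following before
running the command above (the volume group/LV names and systemd unit names
are illustrative and must match your deployment)::

    systemctl stop ceph-osd@$ID
    lvextend -L +50G /dev/ceph-db-vg/db-lv-$ID
    ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-$ID
    systemctl start ceph-osd@$ID
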
BLUEFS_AVAILABLE_SPACE
______________________

To check how much space is free for BlueFS do::

    ceph daemon osd.123 bluestore bluefs available

This will output up to 3 values: `BDEV_DB free`, `BDEV_SLOW free` and
`available_from_bluestore`. `BDEV_DB free` and `BDEV_SLOW free` report the
amount of space that has been acquired by BlueFS and is considered free.
The value `available_from_bluestore` denotes the ability of BlueStore to
relinquish more space to BlueFS. It is normal for this value to differ from
the amount of BlueStore free space, as the BlueFS allocation unit is
typically larger than the BlueStore allocation unit. This means that only
part of the BlueStore free space will be usable by BlueFS.

BLUEFS_LOW_SPACE
________________

If BlueFS is running low on available free space and there is little
`available_from_bluestore`, one can consider reducing the BlueFS allocation
unit size. To simulate the available space with a different allocation unit
size, do::

    ceph daemon osd.123 bluestore bluefs available <alloc-unit-size>

BLUESTORE_FRAGMENTATION
_______________________

As BlueStore operates, the free space on the underlying storage will become
fragmented. This is normal and unavoidable, but excessive fragmentation will
cause slowdown. To inspect BlueStore fragmentation one can do::

    ceph daemon osd.123 bluestore allocator score block

The score is given in the [0-1] range:

* [0.0 .. 0.4] tiny fragmentation
* [0.4 .. 0.7] small, acceptable fragmentation
* [0.7 .. 0.9] considerable, but safe fragmentation
* [0.9 .. 1.0] severe fragmentation, may impact the ability of BlueFS to get space from BlueStore

If a detailed report of free fragments is required do::

    ceph daemon osd.123 bluestore allocator dump block

When handling an OSD process that is not running, fragmentation can be
inspected with `ceph-bluestore-tool`.
Get the fragmentation score::

    ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-123 --allocator block free-score

And dump detailed free chunks::

    ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-123 --allocator block free-dump

BLUESTORE_LEGACY_STATFS
_______________________

In the Nautilus release, BlueStore tracks its internal usage
statistics on a per-pool basis, and one or more OSDs have
BlueStore volumes that were created prior to Nautilus. If *all* OSDs
are older than Nautilus, this just means that the per-pool metrics are
not available. However, if there is a mix of pre-Nautilus and
post-Nautilus OSDs, the cluster usage statistics reported by ``ceph
df`` will not be accurate.

The old OSDs can be updated to use the new usage tracking scheme by stopping
each OSD, running a repair operation, and then restarting it. For example,
if ``osd.123`` needed to be updated,::

    systemctl stop ceph-osd@123
    ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-123
    systemctl start ceph-osd@123

This warning can be disabled with::

    ceph config set global bluestore_warn_on_legacy_statfs false


BLUESTORE_DISK_SIZE_MISMATCH
____________________________

One or more OSDs using BlueStore has an internal inconsistency between the size
of the physical device and the metadata tracking its size. This can lead to
the OSD crashing in the future.

The OSDs in question should be destroyed and reprovisioned. Care should be
taken to do this one OSD at a time, and in a way that doesn't put any data at
risk. For example, if osd ``$N`` has the error,::

    ceph osd out osd.$N
    while ! ceph osd safe-to-destroy osd.$N ; do sleep 1m ; done
    ceph osd destroy osd.$N
    ceph-volume lvm zap /path/to/device
    ceph-volume lvm create --osd-id $N --data /path/to/device

Device health
-------------

DEVICE_HEALTH
_____________

One or more devices is expected to fail soon, where the warning
threshold is controlled by the ``mgr/devicehealth/warn_threshold``
config option.

This warning only applies to OSDs that are currently marked "in", so
the expected response to this failure is to mark the device "out" so
that data is migrated off of the device, and then to remove the
hardware from the system. Note that the marking out is normally done
automatically if ``mgr/devicehealth/self_heal`` is enabled based on
the ``mgr/devicehealth/mark_out_threshold``.

Device health can be checked with::

    ceph device info <device-id>

Device life expectancy is set by a prediction model run by
the mgr or by an external tool via the command::

    ceph device set-life-expectancy <device-id> <from> <to>

You can change the stored life expectancy manually, but that usually
doesn't accomplish anything as whatever tool originally set it will
probably set it again, and changing the stored value does not affect
the actual health of the hardware device.

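The known devices and their associated daemons can be listed with::

    ceph device ls

If a failing device backs an OSD and self-healing is not enabled, the usual
response is to drain it manually, for example with ``ceph osd out osd.<id>``,
and replace the hardware once data has migrated away.
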
DEVICE_HEALTH_IN_USE
____________________

One or more devices is expected to fail soon and has been marked "out"
of the cluster based on ``mgr/devicehealth/mark_out_threshold``, but it
is still participating in one or more PGs. This may be because it was
only recently marked "out" and data is still migrating, or because data
cannot be migrated off for some reason (e.g., the cluster is nearly
full, or the CRUSH hierarchy is such that there isn't another suitable
OSD to migrate the data to).

This message can be silenced by disabling the self heal behavior
(setting ``mgr/devicehealth/self_heal`` to false), by adjusting the
``mgr/devicehealth/mark_out_threshold``, or by addressing what is
preventing data from being migrated off of the ailing device.

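For example, the module options mentioned above can typically be adjusted
with ``ceph config set`` (the values shown are placeholders, not
recommendations)::

    ceph config set mgr mgr/devicehealth/self_heal false
    ceph config set mgr mgr/devicehealth/mark_out_threshold <seconds>
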
DEVICE_HEALTH_TOOMANY
_____________________

Too many devices are expected to fail soon and the
``mgr/devicehealth/self_heal`` behavior is enabled, such that marking
out all of the ailing devices would exceed the cluster's
``mon_osd_min_in_ratio`` ratio that prevents too many OSDs from being
automatically marked "out".

This generally indicates that too many devices in your cluster are
expected to fail soon and you should take action to add newer
(healthier) devices before too many devices fail and data is lost.

The health message can also be silenced by adjusting parameters like
``mon_osd_min_in_ratio`` or ``mgr/devicehealth/mark_out_threshold``,
but be warned that this will increase the likelihood of unrecoverable
data loss in the cluster.

Data health (pools & placement groups)
--------------------------------------

PG_AVAILABILITY
_______________

Data availability is reduced, meaning that the cluster is unable to
service potential read or write requests for some data in the cluster.
Specifically, one or more PGs is in a state that does not allow IO
requests to be serviced. Problematic PG states include *peering*,
*stale*, *incomplete*, and the lack of *active* (if those conditions do not
clear quickly).

Detailed information about which PGs are affected is available from::

    ceph health detail

In most cases the root cause is that one or more OSDs is currently
down; see the discussion for ``OSD_DOWN`` above.

The state of specific problematic PGs can be queried with::

    ceph tell <pgid> query

PG_DEGRADED
___________

Data redundancy is reduced for some data, meaning the cluster does not
have the desired number of replicas for all data (for replicated
pools) or erasure code fragments (for erasure coded pools).
Specifically, one or more PGs:

* has the *degraded* or *undersized* flag set, meaning there are not
  enough instances of that placement group in the cluster;
* has not had the *clean* flag set for some time.

Detailed information about which PGs are affected is available from::

    ceph health detail

In most cases the root cause is that one or more OSDs is currently
down; see the discussion for ``OSD_DOWN`` above.

The state of specific problematic PGs can be queried with::

    ceph tell <pgid> query

PG_RECOVERY_FULL
________________

Data redundancy may be reduced or at risk for some data due to a lack
of free space in the cluster. Specifically, one or more PGs has the
*recovery_toofull* flag set, meaning that the
cluster is unable to migrate or recover data because one or more OSDs
is above the *full* threshold.

See the discussion for *OSD_FULL* above for steps to resolve this condition.

PG_BACKFILL_FULL
________________

Data redundancy may be reduced or at risk for some data due to a lack
of free space in the cluster. Specifically, one or more PGs has the
*backfill_toofull* flag set, meaning that the
cluster is unable to migrate or recover data because one or more OSDs
is above the *backfillfull* threshold.

See the discussion for *OSD_BACKFILLFULL* above for
steps to resolve this condition.

PG_DAMAGED
__________

Data scrubbing has discovered some problems with data consistency in
the cluster. Specifically, one or more PGs has the *inconsistent* or
*snaptrim_error* flag set, indicating that an earlier scrub operation
found a problem, or the *repair* flag set, meaning a repair
for such an inconsistency is currently in progress.

See :doc:`pg-repair` for more information.

OSD_SCRUB_ERRORS
________________

Recent OSD scrubs have uncovered inconsistencies. This error is generally
paired with *PG_DAMAGED* (see above).

See :doc:`pg-repair` for more information.

LARGE_OMAP_OBJECTS
__________________

One or more pools contain large omap objects as determined by
``osd_deep_scrub_large_omap_object_key_threshold`` (threshold for number of keys
to determine a large omap object) or
``osd_deep_scrub_large_omap_object_value_sum_threshold`` (the threshold for
summed size (bytes) of all key values to determine a large omap object) or both.
More information on the object name, key count, and size in bytes can be found
by searching the cluster log for 'Large omap object found'. Large omap objects
can be caused by RGW bucket index objects that do not have automatic resharding
enabled. Please see :ref:`RGW Dynamic Bucket Index Resharding
<rgw_dynamic_bucket_index_resharding>` for more information on resharding.

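For example, assuming the default cluster log location on a monitor host
(the path is illustrative)::

    grep 'Large omap object found' /var/log/ceph/ceph.log
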
The thresholds can be adjusted with::

    ceph config set osd osd_deep_scrub_large_omap_object_key_threshold <keys>
    ceph config set osd osd_deep_scrub_large_omap_object_value_sum_threshold <bytes>

CACHE_POOL_NEAR_FULL
____________________

A cache tier pool is nearly full. Full in this context is determined
by the ``target_max_bytes`` and ``target_max_objects`` properties on
the cache pool. Once the pool reaches the target threshold, write
requests to the pool may block while data is flushed and evicted
from the cache, a state that normally leads to very high latencies and
poor performance.

The cache pool target size can be adjusted with::

    ceph osd pool set <cache-pool-name> target_max_bytes <bytes>
    ceph osd pool set <cache-pool-name> target_max_objects <objects>

Normal cache flush and evict activity may also be throttled due to reduced
availability or performance of the base tier, or overall cluster load.

TOO_FEW_PGS
___________

The number of PGs in use in the cluster is below the configurable
threshold of ``mon_pg_warn_min_per_osd`` PGs per OSD. This can lead
to suboptimal distribution and balance of data across the OSDs in
the cluster, and similarly reduce overall performance.

This may be an expected condition if data pools have not yet been
created.

The PG count for existing pools can be increased or new pools can be created.
Please refer to :ref:`choosing-number-of-placement-groups` for more
information.

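For example, the PG count of an existing pool can be raised with (choose the
new value based on the guidance referenced above)::

    ceph osd pool set <pool-name> pg_num <new-pg-num>
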
POOL_TOO_FEW_PGS
________________

One or more pools should probably have more PGs, based on the amount
of data that is currently stored in the pool. This can lead to
suboptimal distribution and balance of data across the OSDs in the
cluster, and similarly reduce overall performance. This warning is
generated if the ``pg_autoscale_mode`` property on the pool is set to
``warn``.

To disable the warning, you can disable auto-scaling of PGs for the
pool entirely with::

    ceph osd pool set <pool-name> pg_autoscale_mode off

To allow the cluster to automatically adjust the number of PGs,::

    ceph osd pool set <pool-name> pg_autoscale_mode on

You can also manually set the number of PGs for the pool to the
recommended amount with::

    ceph osd pool set <pool-name> pg_num <new-pg-num>

Please refer to :ref:`choosing-number-of-placement-groups` and
:ref:`pg-autoscaler` for more information.

TOO_MANY_PGS
____________

The number of PGs in use in the cluster is above the configurable
threshold of ``mon_max_pg_per_osd`` PGs per OSD. If this threshold is
exceeded the cluster will not allow new pools to be created, pool `pg_num` to
be increased, or pool replication to be increased (any of which would lead to
more PGs in the cluster). A large number of PGs can lead
to higher memory utilization for OSD daemons, slower peering after
cluster state changes (like OSD restarts, additions, or removals), and
higher load on the Manager and Monitor daemons.

The simplest way to mitigate the problem is to increase the number of
OSDs in the cluster by adding more hardware. Note that the OSD count
used for the purposes of this health check is the number of "in" OSDs,
so marking "out" OSDs "in" (if there are any) can also help::

    ceph osd in <osd id(s)>

Please refer to :ref:`choosing-number-of-placement-groups` for more
information.

POOL_TOO_MANY_PGS
_________________

One or more pools should probably have fewer PGs, based on the amount
of data that is currently stored in the pool. This can lead to higher
memory utilization for OSD daemons, slower peering after cluster state
changes (like OSD restarts, additions, or removals), and higher load
on the Manager and Monitor daemons. This warning is generated if the
``pg_autoscale_mode`` property on the pool is set to ``warn``.

To disable the warning, you can disable auto-scaling of PGs for the
pool entirely with::

    ceph osd pool set <pool-name> pg_autoscale_mode off

To allow the cluster to automatically adjust the number of PGs,::

    ceph osd pool set <pool-name> pg_autoscale_mode on

You can also manually set the number of PGs for the pool to the
recommended amount with::

    ceph osd pool set <pool-name> pg_num <new-pg-num>

Please refer to :ref:`choosing-number-of-placement-groups` and
:ref:`pg-autoscaler` for more information.

POOL_TARGET_SIZE_RATIO_OVERCOMMITTED
____________________________________

One or more pools have a ``target_size_ratio`` property set to
estimate the expected size of the pool as a fraction of total storage,
but the value(s) exceed the total available storage (either by
themselves or in combination with other pools' actual usage).

This is usually an indication that the ``target_size_ratio`` value for
the pool is too large and should be reduced or set to zero with::

    ceph osd pool set <pool-name> target_size_ratio 0

For more information, see :ref:`specifying_pool_target_size`.

POOL_TARGET_SIZE_BYTES_OVERCOMMITTED
____________________________________

One or more pools have a ``target_size_bytes`` property set to
estimate the expected size of the pool,
but the value(s) exceed the total available storage (either by
themselves or in combination with other pools' actual usage).

This is usually an indication that the ``target_size_bytes`` value for
the pool is too large and should be reduced or set to zero with::

    ceph osd pool set <pool-name> target_size_bytes 0

For more information, see :ref:`specifying_pool_target_size`.

TOO_FEW_OSDS
____________

The number of OSDs in the cluster is below the configurable
threshold of ``osd_pool_default_size``.

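The current number of OSDs in the cluster can be checked with::

    ceph osd stat

Adding more OSDs, or (at the cost of reduced redundancy) lowering
``osd_pool_default_size``, clears this warning.
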
c07f9fc5
FG
751SMALLER_PGP_NUM
752_______________
753
754One or more pools has a ``pgp_num`` value less than ``pg_num``. This
755is normally an indication that the PG count was increased without
756also increasing the placement behavior.
757
758This is sometimes done deliberately to separate out the `split` step
759when the PG count is adjusted from the data migration that is needed
760when ``pgp_num`` is changed.
761
762This is normally resolved by setting ``pgp_num`` to match ``pg_num``,
763triggering the data migration, with::
764
765 ceph osd pool set <pool> pgp_num <pg-num-value>
766
c07f9fc5
FG
MANY_OBJECTS_PER_PG
___________________

One or more pools has an average number of objects per PG that is
significantly higher than the overall cluster average. The specific
threshold is controlled by the ``mon_pg_warn_max_object_skew``
configuration value.

This is usually an indication that the pool(s) containing most of the
data in the cluster have too few PGs, and/or that other pools that do
not contain as much data have too many PGs. See the discussion of
*TOO_MANY_PGS* above.

The threshold can be raised to silence the health warning by adjusting
the ``mon_pg_warn_max_object_skew`` config option on the monitors.

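For example (the value is a placeholder; the default skew factor is 10)::

    ceph config set mon mon_pg_warn_max_object_skew <ratio>
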
POOL_APP_NOT_ENABLED
____________________

A pool exists that contains one or more objects but has not been
tagged for use by a particular application.

Resolve this warning by labeling the pool for use by an application. For
example, if the pool is used by RBD,::

    rbd pool init <poolname>

If the pool is being used by a custom application 'foo', you can also label
via the low-level command::

    ceph osd pool application enable <poolname> foo

For more information, see :ref:`associate-pool-to-application`.

POOL_FULL
_________

One or more pools has reached (or is very close to reaching) its
quota. The threshold to trigger this error condition is controlled by
the ``mon_pool_quota_crit_threshold`` configuration option.

Pool quotas can be adjusted up or down (or removed) with::

    ceph osd pool set-quota <pool> max_bytes <bytes>
    ceph osd pool set-quota <pool> max_objects <objects>

Setting the quota value to 0 will disable the quota.

POOL_NEAR_FULL
______________

One or more pools is approaching its quota. The threshold to trigger
this warning condition is controlled by the
``mon_pool_quota_warn_threshold`` configuration option.

Pool quotas can be adjusted up or down (or removed) with::

    ceph osd pool set-quota <pool> max_bytes <bytes>
    ceph osd pool set-quota <pool> max_objects <objects>

Setting the quota value to 0 will disable the quota.

OBJECT_MISPLACED
________________

One or more objects in the cluster is not stored on the node the
cluster would like it to be stored on. This is an indication that
data migration due to some recent cluster change has not yet completed.

Misplaced data is not a dangerous condition in and of itself; data
consistency is never at risk, and old copies of objects are never
removed until the desired number of new copies (in the desired
locations) are present.

OBJECT_UNFOUND
______________

One or more objects in the cluster cannot be found. Specifically, the
OSDs know that a new or updated copy of an object should exist, but a
copy of that version of the object has not been found on OSDs that are
currently online.

Read or write requests to unfound objects will block.

Ideally, a down OSD that has the more recent copy of the unfound object
can be brought back online. Candidate OSDs can be identified from the
peering state for the PG(s) responsible for the unfound object::

    ceph tell <pgid> query

If the latest copy of the object is not available, the cluster can be
told to roll back to a previous version of the object. See
:ref:`failures-osd-unfound` for more information.

SLOW_OPS
________

One or more OSD requests is taking a long time to process. This can
be an indication of extreme load, a slow storage device, or a software
bug.

The request queue on the OSD(s) in question can be queried with the
following command, executed from the OSD host::

    ceph daemon osd.<id> ops

A summary of the slowest recent requests can be seen with::

    ceph daemon osd.<id> dump_historic_ops

The location of an OSD can be found with::

    ceph osd find osd.<id>

PG_NOT_SCRUBBED
_______________

One or more PGs has not been scrubbed recently. PGs are normally
scrubbed every ``osd_scrub_max_interval`` seconds, and this warning
triggers when ``mon_warn_pg_not_scrubbed_ratio`` of the interval has elapsed
after the scrub was due without a scrub having occurred.

PGs will not scrub if they are not flagged as *clean*, which may
happen if they are misplaced or degraded (see *PG_AVAILABILITY* and
*PG_DEGRADED* above).

You can manually initiate a scrub of a clean PG with::

    ceph pg scrub <pgid>

PG_NOT_DEEP_SCRUBBED
____________________

One or more PGs has not been deep scrubbed recently. PGs are normally
deep scrubbed every ``osd_deep_scrub_interval`` seconds, and this warning
triggers when ``mon_warn_pg_not_deep_scrubbed_ratio`` of the interval has
elapsed after the deep scrub was due without a deep scrub having occurred.

PGs will not (deep) scrub if they are not flagged as *clean*, which may
happen if they are misplaced or degraded (see *PG_AVAILABILITY* and
*PG_DEGRADED* above).

You can manually initiate a deep scrub of a clean PG with::

    ceph pg deep-scrub <pgid>

Miscellaneous
-------------

RECENT_CRASH
____________

One or more Ceph daemons has crashed recently, and the crash has not
yet been archived (acknowledged) by the administrator. This may
indicate a software bug, a hardware problem (e.g., a failing disk), or
some other problem.

New crashes can be listed with::

    ceph crash ls-new

Information about a specific crash can be examined with::

    ceph crash info <crash-id>

This warning can be silenced by "archiving" the crash (perhaps after
being examined by an administrator) so that it does not generate this
warning::

    ceph crash archive <crash-id>

Similarly, all new crashes can be archived with::

    ceph crash archive-all

Archived crashes will still be visible via ``ceph crash ls`` but not
``ceph crash ls-new``.

The time period that counts as "recent" is controlled by the option
``mgr/crash/warn_recent_interval`` (default: two weeks).

These warnings can be disabled entirely with::

    ceph config set mgr mgr/crash/warn_recent_interval 0

TELEMETRY_CHANGED
_________________

Telemetry has been enabled, but the contents of the telemetry report
have changed since that time, so telemetry reports will not be sent.

The Ceph developers periodically revise the telemetry feature to
include new and useful information, or to remove information found to
be useless or sensitive. If any new information is included in the
report, Ceph will require the administrator to re-enable telemetry to
ensure they have an opportunity to (re)review what information will be
shared.

To review the contents of the telemetry report,::

    ceph telemetry show

Note that the telemetry report consists of several optional channels
that may be independently enabled or disabled. For more information, see
:ref:`telemetry`.

To re-enable telemetry (and make this warning go away),::

    ceph telemetry on

To disable telemetry (and make this warning go away),::

    ceph telemetry off