means that features specific to the msgr2 protocol (e.g., encryption)
are not available on some or all connections.
-In most cases this can be corrected by issuing the command::
+In most cases this can be corrected by issuing the command:
- ceph mon enable-msgr2
+.. prompt:: bash $
+
+ ceph mon enable-msgr2
That command will change any monitor configured for the old default
port 6789 to continue to listen for v1 connections on 6789 and also
preventing it from pruning the cluster metadata it stores. If the
problem persists, please report a bug.
-The warning threshold may be adjusted with::
+The warning threshold may be adjusted with:
+
+.. prompt:: bash $
- ceph config set global mon_data_size_warn <size>
+ ceph config set global mon_data_size_warn <size>
AUTH_INSECURE_GLOBAL_ID_RECLAIM
_______________________________
allows monitors to detect clients with insecure reclaim early by forcing them to
reconnect right after they first authenticate).
-You can identify which client(s) are using unpatched ceph client code with::
+You can identify which client(s) are using unpatched ceph client code with:
+
+.. prompt:: bash $
- ceph health detail
+ ceph health detail
Clients' global_id reclaim behavior can also be seen in the
``global_id_status`` field in the dump of clients connected to an
individual monitor (``reclaim_insecure`` means the client is
-unpatched and is contributing to this health alert)::
+unpatched and is contributing to this health alert):
- ceph tell mon.\* sessions
+.. prompt:: bash $
+
+ ceph tell mon.\* sessions
We strongly recommend that all clients in the system are upgraded to a
newer version of Ceph that correctly reclaims global_id values. Once
all clients have been updated, you can stop allowing insecure reconnections
-with::
+with:
+
+.. prompt:: bash $
- ceph config set mon auth_allow_insecure_global_id_reclaim false
+ ceph config set mon auth_allow_insecure_global_id_reclaim false
If it is impractical to upgrade all clients immediately, you can silence
-this warning temporarily with::
+this warning temporarily with:
- ceph health mute AUTH_INSECURE_GLOBAL_ID_RECLAIM 1w # 1 week
+.. prompt:: bash $
-Although we do NOT recommend doing so, you can also disable this warning indefinitely
-with::
+ ceph health mute AUTH_INSECURE_GLOBAL_ID_RECLAIM 1w # 1 week
- ceph config set mon mon_warn_on_insecure_global_id_reclaim false
+Although we do NOT recommend doing so, you can also disable this warning
+indefinitely with:
+
+.. prompt:: bash $
+
+ ceph config set mon mon_warn_on_insecure_global_id_reclaim false
AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED
_______________________________________
If the ``AUTH_INSECURE_GLOBAL_ID_RECLAIM`` health alert has not also been raised and
the ``auth_expose_insecure_global_id_reclaim`` setting has not been disabled (it is
on by default), then there are currently no clients connected that need to be
-upgraded, and it is safe to disallow insecure global_id reclaim with::
+upgraded, and it is safe to disallow insecure global_id reclaim with:
+
+.. prompt:: bash $
- ceph config set mon auth_allow_insecure_global_id_reclaim false
+ ceph config set mon auth_allow_insecure_global_id_reclaim false
If there are still clients that need to be upgraded, then this alert can be
-silenced temporarily with::
+silenced temporarily with:
- ceph health mute AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED 1w # 1 week
+.. prompt:: bash $
+
+ ceph health mute AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED 1w # 1 week
Although we do NOT recommend doing so, you can also disable this warning indefinitely
-with::
+with:
+
+.. prompt:: bash $
- ceph config set mon mon_warn_on_insecure_global_id_reclaim_allowed false
+ ceph config set mon mon_warn_on_insecure_global_id_reclaim_allowed false
Manager
An OSD is referenced in the CRUSH map hierarchy but does not exist.
-The OSD can be removed from the CRUSH hierarchy with::
+The OSD can be removed from the CRUSH hierarchy with:
+
+.. prompt:: bash $
- ceph osd crush rm osd.<id>
+ ceph osd crush rm osd.<id>
OSD_OUT_OF_ORDER_FULL
_____________________
`nearfull < backfillfull`, `backfillfull < full`, and `full <
failsafe_full`.
-The thresholds can be adjusted with::
+The thresholds can be adjusted with:
- ceph osd set-nearfull-ratio <ratio>
- ceph osd set-backfillfull-ratio <ratio>
- ceph osd set-full-ratio <ratio>
+.. prompt:: bash $
+
+ ceph osd set-nearfull-ratio <ratio>
+ ceph osd set-backfillfull-ratio <ratio>
+ ceph osd set-full-ratio <ratio>
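Before applying new thresholds, it can help to confirm that the candidate
values respect the ordering described above. The following is a minimal
sketch, not part of Ceph: ``ratios_in_order`` is a hypothetical helper name,
and the four sample values are chosen only for illustration.

.. code-block:: bash

   #!/usr/bin/env bash
   # Return 0 when nearfull < backfillfull < full < failsafe_full, else 1.
   # Arguments are decimal ratios such as 0.85; awk does the float comparison.
   ratios_in_order() {
       awk -v n="$1" -v b="$2" -v f="$3" -v s="$4" \
           'BEGIN { exit !((n + 0 < b + 0) && (b + 0 < f + 0) && (f + 0 < s + 0)) }'
   }

   # Hypothetical values, not recommendations.
   if ratios_in_order 0.85 0.90 0.95 0.97; then
       echo "ordered"
   else
       echo "out of order"
   fi

The comparison is strict, so two equal ratios are reported as out of order.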
OSD_FULL
One or more OSDs has exceeded the `full` threshold and is preventing
the cluster from servicing writes.
-Utilization by pool can be checked with::
+Utilization by pool can be checked with:
+
+.. prompt:: bash $
+
+ ceph df
- ceph df
+The currently defined `full` ratio can be seen with:
-The currently defined `full` ratio can be seen with::
+.. prompt:: bash $
- ceph osd dump | grep full_ratio
+ ceph osd dump | grep full_ratio
A short-term workaround to restore write availability is to raise the full
-threshold by a small amount::
+threshold by a small amount:
- ceph osd set-full-ratio <ratio>
+.. prompt:: bash $
+
+ ceph osd set-full-ratio <ratio>
New storage should be added to the cluster by deploying more OSDs or
existing data should be deleted in order to free up space.
OSD_BACKFILLFULL
________________
-One or more OSDs has exceeded the `backfillfull` threshold, which will
-prevent data from being allowed to rebalance to this device. This is
-an early warning that rebalancing may not be able to complete and that
-the cluster is approaching full.
+One or more OSDs has exceeded the `backfillfull` threshold or *would* exceed it
+when the currently mapped backfills finish, which will prevent data from being
+allowed to rebalance to this device. This is an early warning that rebalancing
+may not be able to complete and that the cluster is approaching full.
+
+Utilization by pool can be checked with:
-Utilization by pool can be checked with::
+.. prompt:: bash $
- ceph df
+ ceph df
OSD_NEARFULL
____________
One or more OSDs has exceeded the `nearfull` threshold. This is an early
warning that the cluster is approaching full.
-Utilization by pool can be checked with::
+Utilization by pool can be checked with:
+
+.. prompt:: bash $
- ceph df
+ ceph df
OSDMAP_FLAGS
____________
* *noscrub*, *nodeep_scrub* - scrubbing is disabled
* *notieragent* - cache tiering activity is suspended
-With the exception of *full*, these flags can be set or cleared with::
+With the exception of *full*, these flags can be set or cleared with:
- ceph osd set <flag>
- ceph osd unset <flag>
+.. prompt:: bash $
+
+ ceph osd set <flag>
+ ceph osd unset <flag>
OSD_FLAGS
_________
* *noout*: if these OSDs are down they will not automatically be marked
`out` after the configured interval
-These flags can be set and cleared in batch with::
+These flags can be set and cleared in batch with:
+
+.. prompt:: bash $
+
+ ceph osd set-group <flags> <who>
+ ceph osd unset-group <flags> <who>
- ceph osd set-group <flags> <who>
- ceph osd unset-group <flags> <who>
+For example:
-For example, ::
+.. prompt:: bash $
- ceph osd set-group noup,noout osd.0 osd.1
- ceph osd unset-group noup,noout osd.0 osd.1
- ceph osd set-group noup,noout host-foo
- ceph osd unset-group noup,noout host-foo
- ceph osd set-group noup,noout class-hdd
- ceph osd unset-group noup,noout class-hdd
+ ceph osd set-group noup,noout osd.0 osd.1
+ ceph osd unset-group noup,noout osd.0 osd.1
+ ceph osd set-group noup,noout host-foo
+ ceph osd unset-group noup,noout host-foo
+ ceph osd set-group noup,noout class-hdd
+ ceph osd unset-group noup,noout class-hdd
OLD_CRUSH_TUNABLES
__________________
utilization, which will prevent the tiering agent from identifying
cold objects to flush and evict from the cache.
-Hit sets can be configured on the cache pool with::
+Hit sets can be configured on the cache pool with:
- ceph osd pool set <poolname> hit_set_type <type>
- ceph osd pool set <poolname> hit_set_period <period-in-seconds>
- ceph osd pool set <poolname> hit_set_count <number-of-hitsets>
- ceph osd pool set <poolname> hit_set_fpp <target-false-positive-rate>
+.. prompt:: bash $
+
+ ceph osd pool set <poolname> hit_set_type <type>
+ ceph osd pool set <poolname> hit_set_period <period-in-seconds>
+ ceph osd pool set <poolname> hit_set_count <number-of-hitsets>
+ ceph osd pool set <poolname> hit_set_fpp <target-false-positive-rate>
OSD_NO_SORTBITWISE
__________________
-No pre-luminous v12.y.z OSDs are running but the ``sortbitwise`` flag has not
+No pre-Luminous v12.y.z OSDs are running but the ``sortbitwise`` flag has not
been set.
-The ``sortbitwise`` flag must be set before luminous v12.y.z or newer
-OSDs can start. You can safely set the flag with::
+The ``sortbitwise`` flag must be set before OSDs running Luminous v12.y.z or newer
+can start. You can safely set the flag with:
+
+.. prompt:: bash $
- ceph osd set sortbitwise
+ ceph osd set sortbitwise
OSD_FILESTORE
__________________
-Filestore has been deprecated, considering that Bluestore has been the default
-objectstore for quite some time. Warn if OSDs are running Filestore.
+The Filestore OSD back end has been deprecated; the BlueStore back end has been
+the default objectstore for quite some time. Warn if OSDs are running Filestore.
-The 'mclock_scheduler' is not supported for filestore OSDs. Therefore, the
-default 'osd_op_queue' is set to 'wpq' for filestore OSDs and is enforced
+The 'mclock_scheduler' is not supported for Filestore OSDs. Therefore, the
+default 'osd_op_queue' is set to 'wpq' for Filestore OSDs and is enforced
even if the user attempts to change it.
-Filestore OSDs can be listed with::
+Filestore OSDs can be listed with:
- ceph report | jq -c '."osd_metadata" | .[] | select(.osd_objectstore | contains("filestore")) | {id, osd_objectstore}'
+.. prompt:: bash $
-If it is not feasible to migrate Filestore OSDs to Bluestore immediately, you can silence
-this warning temporarily with::
+ ceph report | jq -c '."osd_metadata" | .[] | select(.osd_objectstore | contains("filestore")) | {id, osd_objectstore}'
- ceph health mute OSD_FILESTORE
+In order to upgrade to Reef or later releases, any Filestore OSDs must first be
+migrated to BlueStore.
+When upgrading from a release prior to Reef to Reef or later: if it is not
+feasible to migrate Filestore OSDs to BlueStore immediately, you can silence
+this warning temporarily with:
+
+.. prompt:: bash $
+
+ ceph health mute OSD_FILESTORE
+
+Since this migration can take considerable time to complete, we recommend that you
+begin the process well in advance of an update to Reef or later releases.
POOL_FULL
_________
One or more pools has reached its quota and is no longer allowing writes.
-Pool quotas and utilization can be seen with::
+Pool quotas and utilization can be seen with:
- ceph df detail
+.. prompt:: bash $
-You can either raise the pool quota with::
+ ceph df detail
- ceph osd pool set-quota <poolname> max_objects <num-objects>
- ceph osd pool set-quota <poolname> max_bytes <num-bytes>
+You can either raise the pool quota with:
+
+.. prompt:: bash $
+
+ ceph osd pool set-quota <poolname> max_objects <num-objects>
+ ceph osd pool set-quota <poolname> max_bytes <num-bytes>
or delete some existing data to reduce utilization.
was that all metadata would fit on the faster device, it indicates
that not enough space was provided.
-This warning can be disabled on all OSDs with::
+This warning can be disabled on all OSDs with:
+
+.. prompt:: bash $
- ceph config set osd bluestore_warn_on_bluefs_spillover false
+ ceph config set osd bluestore_warn_on_bluefs_spillover false
-Alternatively, it can be disabled on a specific OSD with::
+Alternatively, it can be disabled on a specific OSD with:
- ceph config set osd.123 bluestore_warn_on_bluefs_spillover false
+.. prompt:: bash $
+
+ ceph config set osd.123 bluestore_warn_on_bluefs_spillover false
To provide more metadata space, the OSD in question could be destroyed and
reprovisioned. This will involve data migration and recovery.
It may also be possible to expand the LVM logical volume backing the
`db` storage. If the underlying LV has been expanded, the OSD daemon
-needs to be stopped and BlueFS informed of the device size change with::
+needs to be stopped and BlueFS informed of the device size change with:
+
+.. prompt:: bash $
- ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-$ID
+ ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-$ID
BLUEFS_AVAILABLE_SPACE
______________________
-To check how much space is free for BlueFS do::
+To check how much space is free for BlueFS do:
+
+.. prompt:: bash $
- ceph daemon osd.123 bluestore bluefs available
+ ceph daemon osd.123 bluestore bluefs available
This will output up to 3 values: `BDEV_DB free`, `BDEV_SLOW free` and
`available_from_bluestore`. `BDEV_DB` and `BDEV_SLOW` report amount of space that
If BlueFS is running low on available free space and there is little
`available_from_bluestore` one can consider reducing BlueFS allocation unit size.
-To simulate available space when allocation unit is different do::
+To simulate available space when allocation unit is different do:
- ceph daemon osd.123 bluestore bluefs available <alloc-unit-size>
+.. prompt:: bash $
+
+ ceph daemon osd.123 bluestore bluefs available <alloc-unit-size>
BLUESTORE_FRAGMENTATION
_______________________
As BlueStore works free space on underlying storage will get fragmented.
This is normal and unavoidable but excessive fragmentation will cause slowdown.
-To inspect BlueStore fragmentation one can do::
+To inspect BlueStore fragmentation one can do:
+
+.. prompt:: bash $
- ceph daemon osd.123 bluestore allocator score block
+ ceph daemon osd.123 bluestore allocator score block
Score is given in [0-1] range.
[0.0 .. 0.4] tiny fragmentation
[0.7 .. 0.9] considerable, but safe fragmentation
[0.9 .. 1.0] severe fragmentation, may impact BlueFS ability to get space from BlueStore
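The severe band in the list above can be checked mechanically once a score is
in hand. The following is a minimal sketch under stated assumptions:
``fragmentation_is_severe`` is a hypothetical helper, and the score is assumed
to have been extracted already as a bare number, because the output format of
the score command is not shown here.

.. code-block:: bash

   #!/usr/bin/env bash
   # Return 0 when the score falls in the severe band (0.9 to 1.0), else 1.
   # Treating exactly 0.9 as severe is an assumption; the listed ranges
   # overlap at that boundary.
   fragmentation_is_severe() {
       awk -v s="$1" 'BEGIN { exit !(s + 0 >= 0.9 && s + 0 <= 1.0) }'
   }

   # Hypothetical score used only for illustration.
   if fragmentation_is_severe 0.93; then
       echo "severe fragmentation"
   fi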
-If detailed report of free fragments is required do::
+If detailed report of free fragments is required do:
+
+.. prompt:: bash $
- ceph daemon osd.123 bluestore allocator dump block
+ ceph daemon osd.123 bluestore allocator dump block
In case when handling OSD process that is not running fragmentation can be
inspected with `ceph-bluestore-tool`.
-Get fragmentation score::
+Get fragmentation score:
- ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-123 --allocator block free-score
+.. prompt:: bash $
-And dump detailed free chunks::
+ ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-123 --allocator block free-score
- ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-123 --allocator block free-dump
+And dump detailed free chunks:
+
+.. prompt:: bash $
+
+ ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-123 --allocator block free-dump
BLUESTORE_LEGACY_STATFS
_______________________
post-Nautilus OSDs, the cluster usage statistics reported by ``ceph
df`` will not be accurate.
-The old OSDs can be updated to use the new usage tracking scheme by stopping each OSD, running a repair operation, and the restarting it. For example, if ``osd.123`` needed to be updated,::
+The old OSDs can be updated to use the new usage tracking scheme by stopping each OSD, running a repair operation, and then restarting it. For example, if ``osd.123`` needed to be updated:
+
+.. prompt:: bash $
- systemctl stop ceph-osd@123
- ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-123
- systemctl start ceph-osd@123
+ systemctl stop ceph-osd@123
+ ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-123
+ systemctl start ceph-osd@123
-This warning can be disabled with::
+This warning can be disabled with:
- ceph config set global bluestore_warn_on_legacy_statfs false
+.. prompt:: bash $
+
+ ceph config set global bluestore_warn_on_legacy_statfs false
BLUESTORE_NO_PER_POOL_OMAP
__________________________
The old OSDs can be updated to track by pool by stopping each OSD,
running a repair operation, and the restarting it. For example, if
-``osd.123`` needed to be updated,::
+``osd.123`` needed to be updated:
+
+.. prompt:: bash $
+
+ systemctl stop ceph-osd@123
+ ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-123
+ systemctl start ceph-osd@123
- systemctl stop ceph-osd@123
- ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-123
- systemctl start ceph-osd@123
+This warning can be disabled with:
-This warning can be disabled with::
+.. prompt:: bash $
- ceph config set global bluestore_warn_on_no_per_pool_omap false
+ ceph config set global bluestore_warn_on_no_per_pool_omap false
BLUESTORE_NO_PER_PG_OMAP
__________________________
The older OSDs can be updated to track by PG by stopping each OSD,
running a repair operation, and the restarting it. For example, if
-``osd.123`` needed to be updated,::
+``osd.123`` needed to be updated:
- systemctl stop ceph-osd@123
- ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-123
- systemctl start ceph-osd@123
+.. prompt:: bash $
-This warning can be disabled with::
+ systemctl stop ceph-osd@123
+ ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-123
+ systemctl start ceph-osd@123
- ceph config set global bluestore_warn_on_no_per_pg_omap false
+This warning can be disabled with:
+
+.. prompt:: bash $
+
+ ceph config set global bluestore_warn_on_no_per_pg_omap false
BLUESTORE_DISK_SIZE_MISMATCH
The OSDs in question should be destroyed and reprovisioned. Care should be
taken to do this one OSD at a time, and in a way that doesn't put any data at
-risk. For example, if osd ``$N`` has the error,::
+risk. For example, if osd ``$N`` has the error:
+
+.. prompt:: bash $
- ceph osd out osd.$N
- while ! ceph osd safe-to-destroy osd.$N ; do sleep 1m ; done
- ceph osd destroy osd.$N
- ceph-volume lvm zap /path/to/device
- ceph-volume lvm create --osd-id $N --data /path/to/device
+ ceph osd out osd.$N
+ while ! ceph osd safe-to-destroy osd.$N ; do sleep 1m ; done
+ ceph osd destroy osd.$N
+ ceph-volume lvm zap /path/to/device
+ ceph-volume lvm create --osd-id $N --data /path/to/device
BLUESTORE_NO_COMPRESSION
________________________
additional attention, e.g. upgrading to the latest OS/kernel versions and
H/W resource utilization monitoring.
-This warning can be disabled on all OSDs with::
+This warning can be disabled on all OSDs with:
+
+.. prompt:: bash $
+
+ ceph config set osd bluestore_warn_on_spurious_read_errors false
- ceph config set osd bluestore_warn_on_spurious_read_errors false
+Alternatively, it can be disabled on a specific OSD with:
-Alternatively, it can be disabled on a specific OSD with::
+.. prompt:: bash $
- ceph config set osd.123 bluestore_warn_on_spurious_read_errors false
+ ceph config set osd.123 bluestore_warn_on_spurious_read_errors false
Device health
automatically if ``mgr/devicehealth/self_heal`` is enabled based on
the ``mgr/devicehealth/mark_out_threshold``.
-Device health can be checked with::
+Device health can be checked with:
- ceph device info <device-id>
+.. prompt:: bash $
+
+ ceph device info <device-id>
Device life expectancy is set by a prediction model run by
-the mgr or an by external tool via the command::
+the mgr or by an external tool via the command:
+
+.. prompt:: bash $
- ceph device set-life-expectancy <device-id> <from> <to>
+ ceph device set-life-expectancy <device-id> <from> <to>
You can change the stored life expectancy manually, but that usually
doesn't accomplish anything as whatever tool originally set it will
*stale*, *incomplete*, and the lack of *active* (if those conditions do not clear
quickly).
-Detailed information about which PGs are affected is available from::
+Detailed information about which PGs are affected is available from:
+
+.. prompt:: bash $
- ceph health detail
+ ceph health detail
In most cases the root cause is that one or more OSDs is currently
down; see the discussion for ``OSD_DOWN`` above.
-The state of specific problematic PGs can be queried with::
+The state of specific problematic PGs can be queried with:
- ceph tell <pgid> query
+.. prompt:: bash $
+
+ ceph tell <pgid> query
PG_DEGRADED
___________
enough instances of that placement group in the cluster;
* has not had the *clean* flag set for some time.
-Detailed information about which PGs are affected is available from::
+Detailed information about which PGs are affected is available from:
+
+.. prompt:: bash $
- ceph health detail
+ ceph health detail
In most cases the root cause is that one or more OSDs is currently
down; see the discussion for ``OSD_DOWN`` above.
-The state of specific problematic PGs can be queried with::
+The state of specific problematic PGs can be queried with:
+
+.. prompt:: bash $
- ceph tell <pgid> query
+ ceph tell <pgid> query
PG_RECOVERY_FULL
enabled. Please see :ref:`RGW Dynamic Bucket Index Resharding
<rgw_dynamic_bucket_index_resharding>` for more information on resharding.
-The thresholds can be adjusted with::
+The thresholds can be adjusted with:
- ceph config set osd osd_deep_scrub_large_omap_object_key_threshold <keys>
- ceph config set osd osd_deep_scrub_large_omap_object_value_sum_threshold <bytes>
+.. prompt:: bash $
+
+ ceph config set osd osd_deep_scrub_large_omap_object_key_threshold <keys>
+ ceph config set osd osd_deep_scrub_large_omap_object_value_sum_threshold <bytes>
CACHE_POOL_NEAR_FULL
____________________
from the cache, a state that normally leads to very high latencies and
poor performance.
-The cache pool target size can be adjusted with::
+The cache pool target size can be adjusted with:
+
+.. prompt:: bash $
- ceph osd pool set <cache-pool-name> target_max_bytes <bytes>
- ceph osd pool set <cache-pool-name> target_max_objects <objects>
+ ceph osd pool set <cache-pool-name> target_max_bytes <bytes>
+ ceph osd pool set <cache-pool-name> target_max_objects <objects>
Normal cache flush and evict activity may also be throttled due to reduced
availability or performance of the base tier, or overall cluster load.
much data as others.
This is easily corrected by setting the ``pg_num`` value for the
-affected pool(s) to a nearby power of two::
+affected pool(s) to a nearby power of two:
+
+.. prompt:: bash $
- ceph osd pool set <pool-name> pg_num <value>
+ ceph osd pool set <pool-name> pg_num <value>
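To pick a value for ``pg_num``, the nearest power of two can be computed from
the current value. The following is a minimal sketch, not part of Ceph:
``nearest_power_of_two`` is a hypothetical helper, and sending ties to the
larger power is an arbitrary choice made for this example.

.. code-block:: bash

   #!/usr/bin/env bash
   # Print the power of two closest to the given positive integer.
   # Ties (for example 3 or 6) go to the larger power; that rule is an assumption.
   nearest_power_of_two() {
       local n=$1 lower=1
       # Find the largest power of two that does not exceed n.
       while (( lower * 2 <= n )); do
           lower=$(( lower * 2 ))
       done
       local upper=$(( lower * 2 ))
       if (( n - lower < upper - n )); then
           echo "$lower"
       else
           echo "$upper"
       fi
   }

   # Hypothetical current pg_num of 100.
   nearest_power_of_two 100

Whether to round up or down in practice depends on the pool's expected data
volume; the helper only reports the numerically closest candidate.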
-This health warning can be disabled with::
+This health warning can be disabled with:
- ceph config set global mon_warn_on_pool_pg_num_not_power_of_two false
+.. prompt:: bash $
+
+ ceph config set global mon_warn_on_pool_pg_num_not_power_of_two false
POOL_TOO_FEW_PGS
________________
``warn``.
To disable the warning, you can disable auto-scaling of PGs for the
-pool entirely with::
+pool entirely with:
+
+.. prompt:: bash $
+
+ ceph osd pool set <pool-name> pg_autoscale_mode off
- ceph osd pool set <pool-name> pg_autoscale_mode off
+To allow the cluster to automatically adjust the number of PGs:
-To allow the cluster to automatically adjust the number of PGs,::
+.. prompt:: bash $
- ceph osd pool set <pool-name> pg_autoscale_mode on
+ ceph osd pool set <pool-name> pg_autoscale_mode on
You can also manually set the number of PGs for the pool to the
-recommended amount with::
+recommended amount with:
- ceph osd pool set <pool-name> pg_num <new-pg-num>
+.. prompt:: bash $
+
+ ceph osd pool set <pool-name> pg_num <new-pg-num>
Please refer to :ref:`choosing-number-of-placement-groups` and
:ref:`pg-autoscaler` for more information.
The simplest way to mitigate the problem is to increase the number of
OSDs in the cluster by adding more hardware. Note that the OSD count
used for the purposes of this health check is the number of "in" OSDs,
-so marking "out" OSDs "in" (if there are any) can also help::
+so marking "out" OSDs "in" (if there are any) can also help:
+
+.. prompt:: bash $
- ceph osd in <osd id(s)>
+ ceph osd in <osd id(s)>
Please refer to :ref:`choosing-number-of-placement-groups` for more
information.
``pg_autoscale_mode`` property on the pool is set to ``warn``.
To disable the warning, you can disable auto-scaling of PGs for the
-pool entirely with::
+pool entirely with:
+
+.. prompt:: bash $
- ceph osd pool set <pool-name> pg_autoscale_mode off
+ ceph osd pool set <pool-name> pg_autoscale_mode off
-To allow the cluster to automatically adjust the number of PGs,::
+To allow the cluster to automatically adjust the number of PGs:
- ceph osd pool set <pool-name> pg_autoscale_mode on
+.. prompt:: bash $
+
+ ceph osd pool set <pool-name> pg_autoscale_mode on
You can also manually set the number of PGs for the pool to the
-recommended amount with::
+recommended amount with:
+
+.. prompt:: bash $
- ceph osd pool set <pool-name> pg_num <new-pg-num>
+ ceph osd pool set <pool-name> pg_num <new-pg-num>
Please refer to :ref:`choosing-number-of-placement-groups` and
:ref:`pg-autoscaler` for more information.
themselves or in combination with other pools' actual usage).
This is usually an indication that the ``target_size_bytes`` value for
-the pool is too large and should be reduced or set to zero with::
+the pool is too large and should be reduced or set to zero with:
+
+.. prompt:: bash $
- ceph osd pool set <pool-name> target_size_bytes 0
+ ceph osd pool set <pool-name> target_size_bytes 0
For more information, see :ref:`specifying_pool_target_size`.
``target_size_ratio`` takes precedence and ``target_size_bytes`` is
ignored.
-To reset ``target_size_bytes`` to zero::
+To reset ``target_size_bytes`` to zero:
- ceph osd pool set <pool-name> target_size_bytes 0
+.. prompt:: bash $
+
+ ceph osd pool set <pool-name> target_size_bytes 0
For more information, see :ref:`specifying_pool_target_size`.
when ``pgp_num`` is changed.
This is normally resolved by setting ``pgp_num`` to match ``pg_num``,
-triggering the data migration, with::
+triggering the data migration, with:
+
+.. prompt:: bash $
- ceph osd pool set <pool> pgp_num <pg-num-value>
+ ceph osd pool set <pool> pgp_num <pg-num-value>
MANY_OBJECTS_PER_PG
___________________
tagged for use by a particular application.
Resolve this warning by labeling the pool for use by an application. For
-example, if the pool is used by RBD,::
+example, if the pool is used by RBD:
+
+.. prompt:: bash $
- rbd pool init <poolname>
+ rbd pool init <poolname>
If the pool is being used by a custom application 'foo', you can also label
-via the low-level command::
+via the low-level command:
- ceph osd pool application enable foo
+.. prompt:: bash $
+
+ ceph osd pool application enable <poolname> foo
For more information, see :ref:`associate-pool-to-application`.
quota. The threshold to trigger this error condition is controlled by
the ``mon_pool_quota_crit_threshold`` configuration option.
-Pool quotas can be adjusted up or down (or removed) with::
+Pool quotas can be adjusted up or down (or removed) with:
+
+.. prompt:: bash $
- ceph osd pool set-quota <pool> max_bytes <bytes>
- ceph osd pool set-quota <pool> max_objects <objects>
+ ceph osd pool set-quota <pool> max_bytes <bytes>
+ ceph osd pool set-quota <pool> max_objects <objects>
Setting the quota value to 0 will disable the quota.
One threshold that can trigger this warning condition is the
``mon_pool_quota_warn_threshold`` configuration option.
-Pool quotas can be adjusted up or down (or removed) with::
+Pool quotas can be adjusted up or down (or removed) with:
- ceph osd pool set-quota <pool> max_bytes <bytes>
- ceph osd pool set-quota <pool> max_objects <objects>
+.. prompt:: bash $
+
+ ceph osd pool set-quota <pool> max_bytes <bytes>
+ ceph osd pool set-quota <pool> max_objects <objects>
Setting the quota value to 0 will disable the quota.
Ideally, a down OSD can be brought back online that has the more
recent copy of the unfound object. Candidate OSDs can be identified from the
-peering state for the PG(s) responsible for the unfound object::
+peering state for the PG(s) responsible for the unfound object:
+
+.. prompt:: bash $
- ceph tell <pgid> query
+ ceph tell <pgid> query
If the latest copy of the object is not available, the cluster can be
told to roll back to a previous version of the object. See
bug.
The request queue for the daemon in question can be queried with the
-following command, executed from the daemon's host::
+following command, executed from the daemon's host:
+
+.. prompt:: bash $
- ceph daemon osd.<id> ops
+ ceph daemon osd.<id> ops
-A summary of the slowest recent requests can be seen with::
+A summary of the slowest recent requests can be seen with:
- ceph daemon osd.<id> dump_historic_ops
+.. prompt:: bash $
-The location of an OSD can be found with::
+ ceph daemon osd.<id> dump_historic_ops
- ceph osd find osd.<id>
+The location of an OSD can be found with:
+
+.. prompt:: bash $
+
+ ceph osd find osd.<id>
PG_NOT_SCRUBBED
_______________
happen if they are misplaced or degraded (see *PG_AVAILABILITY* and
*PG_DEGRADED* above).
-You can manually initiate a scrub of a clean PG with::
+You can manually initiate a scrub of a clean PG with:
- ceph pg deep-scrub <pgid>
+.. prompt:: bash $
+
+ ceph pg deep-scrub <pgid>
PG_SLOW_SNAP_TRIMMING
indicate a software bug, a hardware problem (e.g., a failing disk), or
some other problem.
-New crashes can be listed with::
+New crashes can be listed with:
+
+.. prompt:: bash $
+
+ ceph crash ls-new
- ceph crash ls-new
+Information about a specific crash can be examined with:
-Information about a specific crash can be examined with::
+.. prompt:: bash $
- ceph crash info <crash-id>
+ ceph crash info <crash-id>
This warning can be silenced by "archiving" the crash (perhaps after
being examined by an administrator) so that it does not generate this
-warning::
+warning:
- ceph crash archive <crash-id>
+.. prompt:: bash $
-Similarly, all new crashes can be archived with::
+ ceph crash archive <crash-id>
- ceph crash archive-all
+Similarly, all new crashes can be archived with:
+
+.. prompt:: bash $
+
+ ceph crash archive-all
Archived crashes will still be visible via ``ceph crash ls`` but not
``ceph crash ls-new``.
The time period for what "recent" means is controlled by the option
``mgr/crash/warn_recent_interval`` (default: two weeks).
-These warnings can be disabled entirely with::
+These warnings can be disabled entirely with:
+
+.. prompt:: bash $
- ceph config set mgr/crash/warn_recent_interval 0
+ ceph config set mgr mgr/crash/warn_recent_interval 0
RECENT_MGR_MODULE_CRASH
_______________________
problem may be disabled as a result, the function of other modules
is normally unaffected.
-As with the *RECENT_CRASH* health alert, the crash can be inspected with::
+As with the *RECENT_CRASH* health alert, the crash can be inspected with:
- ceph crash info <crash-id>
+.. prompt:: bash $
+
+ ceph crash info <crash-id>
This warning can be silenced by "archiving" the crash (perhaps after
being examined by an administrator) so that it does not generate this
-warning::
+warning:
+
+.. prompt:: bash $
+
+ ceph crash archive <crash-id>
- ceph crash archive <crash-id>
+Similarly, all new crashes can be archived with:
-Similarly, all new crashes can be archived with::
+.. prompt:: bash $
- ceph crash archive-all
+ ceph crash archive-all
Archived crashes will still be visible via ``ceph crash ls`` but not
``ceph crash ls-new``.
The time period for what "recent" means is controlled by the option
``mgr/crash/warn_recent_interval`` (default: two weeks).
-These warnings can be disabled entirely with::
+These warnings can be disabled entirely with:
- ceph config set mgr/crash/warn_recent_interval 0
+.. prompt:: bash $
+
+ ceph config set mgr mgr/crash/warn_recent_interval 0
TELEMETRY_CHANGED
_________________
ensure they have an opportunity to (re)review what information will be
shared.
-To review the contents of the telemetry report,::
+To review the contents of the telemetry report:
+
+.. prompt:: bash $
- ceph telemetry show
+ ceph telemetry show
Note that the telemetry report consists of several optional channels
that may be independently enabled or disabled. For more information, see
:ref:`telemetry`.
-To re-enable telemetry (and make this warning go away),::
+To re-enable telemetry (and make this warning go away):
+
+.. prompt:: bash $
- ceph telemetry on
+ ceph telemetry on
-To disable telemetry (and make this warning go away),::
+To disable telemetry (and make this warning go away):
- ceph telemetry off
+.. prompt:: bash $
+
+ ceph telemetry off
AUTH_BAD_CAPS
_____________
properly validate their syntax, or if the syntax of the capabilities
has changed.
-The user in question can be removed with::
+The user in question can be removed with:
+
+.. prompt:: bash $
- ceph auth rm <entity-name>
+ ceph auth rm <entity-name>
(This will resolve the health alert, but obviously clients will not be
able to authenticate as that user.)
-Alternatively, the capabilities for the user can be updated with::
+Alternatively, the capabilities for the user can be updated with:
+
+.. prompt:: bash $
- ceph auth <entity-name> <daemon-type> <caps> [<daemon-type> <caps> ...]
+ ceph auth caps <entity-name> <daemon-type> <caps> [<daemon-type> <caps> ...]
For more information about auth capabilities, see :ref:`user-management`.
host to power-cycle or reboot.
This warning can be silenced by setting the
-``mon_warn_on_osd_down_out_interval_zero`` to false::
+``mon_warn_on_osd_down_out_interval_zero`` to false:
- ceph config global mon mon_warn_on_osd_down_out_interval_zero false
+.. prompt:: bash $
+
+ ceph config set mon mon_warn_on_osd_down_out_interval_zero false
DASHBOARD_DEBUG
_______________
environments because such a traceback might contain and expose sensitive
information.
-The debug mode can be disabled with::
+The debug mode can be disabled with:
+
+.. prompt:: bash $
- ceph dashboard debug disable
+ ceph dashboard debug disable