When you first deploy a cluster without creating a pool, Ceph uses the default
pools for storing data. A pool provides you with:

- **Resilience**: You can set how many OSDs are allowed to fail without losing data.
  For replicated pools, it is the desired number of copies/replicas of an object.
  A typical configuration stores an object and one additional copy
  (i.e., ``size = 2``), but you can determine the number of copies/replicas.
  For `erasure coded pools <../erasure-code>`_, it is the number of coding chunks
  (i.e. ``m=2`` in the **erasure code profile**).

- **Placement Groups**: You can set the number of placement groups for the pool.
  A typical configuration uses approximately 100 placement groups per OSD to
  provide optimal balancing without using up too many computing resources. When
  setting up multiple pools, be careful to ensure you set a reasonable number of
  placement groups for both the pool and the cluster as a whole.

- **CRUSH Rules**: When you store data in a pool, placement of the object
  and its replicas (or chunks for erasure coded pools) in your cluster is governed
  by CRUSH rules. You can create a custom CRUSH rule for your pool if the default
  rule is not appropriate for your use case.

- **Snapshots**: When you create snapshots with ``ceph osd pool mksnap``,
  you effectively take a snapshot of a particular pool.

To organize data into pools, you can list, create, and remove pools.
You can also view the utilization statistics for each pool.

List Pools
==========

To list your cluster's pools, execute::
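    ceph osd lspools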
Create a Pool
=============

Before creating pools, refer to the `Pool, PG and CRUSH Config Reference`_.
Ideally, you should override the default value for the number of placement
groups in your Ceph configuration file, as the default is NOT ideal.
For details on placement group numbers, refer to `setting the number of placement groups`_.

.. note:: Starting with Luminous, all pools need to be associated to the
   application using the pool. See `Associate Pool to Application`_ below for
   more information.

For example::

    osd pool default pg num = 100
    osd pool default pgp num = 100
To create a pool, execute::

    ceph osd pool create {pool-name} [{pg-num} [{pgp-num}]] [replicated] \
         [crush-rule-name] [expected-num-objects]
    ceph osd pool create {pool-name} [{pg-num} [{pgp-num}]] erasure \
         [erasure-code-profile] [crush-rule-name] [expected-num-objects] [--autoscale-mode=<on,off,warn>]

Where:

``{pool-name}``

:Description: The name of the pool. It must be unique.

``{pg-num}``

:Description: The total number of placement groups for the pool. See `Placement
              Groups`_ for details on calculating a suitable number. The
              default value ``8`` is NOT suitable for most systems.

``{pgp-num}``

:Description: The total number of placement groups for placement purposes. This
              **should be equal to the total number of placement groups**, except
              for placement group splitting scenarios.

:Required: Yes. Picks up the default or Ceph configuration value if not specified.

``{replicated|erasure}``

:Description: The pool type, which may be either **replicated** to
              recover from lost OSDs by keeping multiple copies of the
              objects, or **erasure** to get a kind of
              `generalized RAID5 <../erasure-code>`_ capability.
              The **replicated** pools require more
              raw storage but implement all Ceph operations. The
              **erasure** pools require less raw storage but only
              implement a subset of the available operations.

``[crush-rule-name]``

:Description: The name of a CRUSH rule to use for this pool. The specified
              rule must exist.

:Default: For **replicated** pools it is the rule specified by the ``osd
          pool default crush rule`` config variable. This rule must exist.
          For **erasure** pools it is ``erasure-code`` if the ``default``
          `erasure code profile`_ is used or ``{pool-name}`` otherwise. This
          rule will be created implicitly if it doesn't exist already.

``[erasure-code-profile=profile]``

.. _erasure code profile: ../erasure-code-profile

:Description: For **erasure** pools only. Use the `erasure code profile`_. It
              must be an existing profile as defined by
              **osd erasure-code-profile set**.

``--autoscale-mode=<on,off,warn>``

:Description: The placement group autoscale mode: ``on`` lets the system adjust
              ``pg_num`` automatically, ``warn`` only reports suggested changes,
              and ``off`` disables autoscaling.

:Default: The default behavior is controlled by the ``osd pool default pg autoscale mode`` option.

If you set the autoscale mode to ``on`` or ``warn``, you can let the system
autotune or recommend changes to the number of placement groups in your pool
based on actual usage. If you leave it ``off``, then you should refer to
`Placement Groups`_ for more information.
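The autoscale mode can also be changed on an existing pool; as a brief
illustration (``mypool`` is a placeholder name)::

    ceph osd pool set mypool pg_autoscale_mode on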
.. _Placement Groups: ../placement-groups

``[expected-num-objects]``

:Description: The expected number of objects for this pool. By setting this value
              (together with a negative **filestore merge threshold**), the PG
              folder splitting happens at pool creation time, to avoid the
              latency impact of a runtime folder splitting.

:Default: ``0``, no splitting at the pool creation time.
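For example, to create a replicated pool and an erasure-coded pool using the
syntax above (the pool names and PG counts are illustrative, not recommendations)::

    ceph osd pool create mypool 128 128 replicated
    ceph osd pool create ecpool 32 32 erasure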
.. _associate-pool-to-application:

Associate Pool to Application
=============================

Pools need to be associated with an application before use. Pools that will be
used with CephFS or pools that are automatically created by RGW are
automatically associated. Pools that are intended for use with RBD should be
initialized using the ``rbd`` tool (see `Block Device Commands`_ for more
information).

For other cases, you can manually associate a free-form application name to
a pool::

    ceph osd pool application enable {pool-name} {application-name}

.. note:: CephFS uses the application name ``cephfs``, RBD uses the
   application name ``rbd``, and RGW uses the application name ``rgw``.
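For example, to tag a pool with a custom application name (both names below
are placeholders)::

    ceph osd pool application enable mypool myapp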
Set Pool Quotas
===============

You can set pool quotas for the maximum number of bytes and/or the maximum
number of objects per pool::

    ceph osd pool set-quota {pool-name} [max_objects {obj-count}] [max_bytes {bytes}]

For example::

    ceph osd pool set-quota data max_objects 10000

To remove a quota, set its value to ``0``.
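For example, to clear the object quota set above::

    ceph osd pool set-quota data max_objects 0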
Delete a Pool
=============

To delete a pool, execute::

    ceph osd pool delete {pool-name} [{pool-name} --yes-i-really-really-mean-it]

To remove a pool, the ``mon_allow_pool_delete`` flag must be set to true in the
monitors' configuration. Otherwise, the monitors will refuse to remove the pool.

See `Monitor Configuration`_ for more information.

.. _Monitor Configuration: ../../configuration/mon-config-ref
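As a sketch, on releases that support the ``ceph config`` interface you could
enable the flag and then delete a pool (``mypool`` is a placeholder name)::

    ceph config set mon mon_allow_pool_delete true
    ceph osd pool delete mypool mypool --yes-i-really-really-mean-it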
If you created your own rules for a pool, you should consider
removing them when you no longer need the pool::

    ceph osd pool get {pool-name} crush_rule

If the rule was "123", for example, you can check the other pools like so::

    ceph osd dump | grep "^pool" | grep "crush_rule 123"

If no other pools use that custom rule, then it's safe to delete that
rule from the cluster.
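As an illustration, you can list the rule names and remove the unused one (the
rule name below is a placeholder)::

    ceph osd crush rule ls
    ceph osd crush rule rm my-custom-rule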
If you created users with permissions strictly for a pool that no longer
exists, you should consider deleting those users too::

    ceph auth ls | grep -C 5 {pool-name}
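A matching user can then be deleted; the user name below is a placeholder::

    ceph auth del client.mypool-user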
Rename a Pool
=============

To rename a pool, execute::

    ceph osd pool rename {current-pool-name} {new-pool-name}
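For example (illustrative names)::

    ceph osd pool rename testpool mynewpool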
If you rename a pool and you have per-pool capabilities for an authenticated
user, you must update the user's capabilities (i.e., caps) with the new pool
name.

Show Pool Statistics
====================
To show a pool's utilization statistics, execute::
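    rados df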
Additionally, to obtain I/O information for a specific pool or for all pools, execute::

    ceph osd pool stats [{pool-name}]
Make a Snapshot of a Pool
=========================

To make a snapshot of a pool, execute::

    ceph osd pool mksnap {pool-name} {snap-name}
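For example (illustrative names)::

    ceph osd pool mksnap data data-snap-1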
Remove a Snapshot of a Pool
===========================

To remove a snapshot of a pool, execute::

    ceph osd pool rmsnap {pool-name} {snap-name}
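For example, to remove the snapshot created above::

    ceph osd pool rmsnap data data-snap-1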
Set Pool Values
===============

To set a value for a pool, execute the following::

    ceph osd pool set {pool-name} {key} {value}

You may set values for the following keys:

.. _compression_algorithm:

``compression_algorithm``

:Description: Sets the inline compression algorithm to use for underlying BlueStore.
              This setting overrides the `global setting <http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/#inline-compression>`_
              of ``bluestore compression algorithm``.

:Valid Settings: ``lz4``, ``snappy``, ``zlib``, ``zstd``

``compression_mode``

:Description: Sets the policy for the inline compression algorithm for underlying BlueStore.
              This setting overrides the `global setting <http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/#inline-compression>`_
              of ``bluestore compression mode``.

:Valid Settings: ``none``, ``passive``, ``aggressive``, ``force``
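For example, a plausible way to enable aggressive zstd compression on a pool
(the pool name is a placeholder)::

    ceph osd pool set mypool compression_algorithm zstd
    ceph osd pool set mypool compression_mode aggressive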
``compression_min_blob_size``

:Description: Chunks smaller than this are never compressed. This setting
              overrides the `global setting <http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/#inline-compression>`_
              of ``bluestore compression min blob *``.

:Type: Unsigned Integer

``compression_max_blob_size``

:Description: Chunks larger than this are broken into smaller blobs of at most
              ``compression_max_blob_size`` before being compressed.

:Type: Unsigned Integer

.. _size:

``size``

:Description: Sets the number of replicas for objects in the pool.
              See `Set the Number of Object Replicas`_ for further details.
              Replicated pools only.

.. _min_size:

``min_size``

:Description: Sets the minimum number of replicas required for I/O.
              See `Set the Number of Object Replicas`_ for further details.
              In the case of erasure coded pools this should be set to a value
              greater than ``k``, since if we allow I/O at the value ``k`` there
              is no redundancy and data will be lost in the event of a permanent
              OSD failure. For more information see `Erasure Code <../erasure-code>`_.

:Version: ``0.54`` and above
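For example, for a hypothetical erasure-coded pool created with a ``k=4, m=2``
profile, a setting with one chunk of headroom beyond ``k`` would be::

    ceph osd pool set ecpool min_size 5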
.. _pg_num:

``pg_num``

:Description: The effective number of placement groups to use when calculating
              data placement.

:Valid Range: Greater than the current ``pg_num`` value.

.. _pgp_num:

``pgp_num``

:Description: The effective number of placement groups for placement to use
              when calculating data placement.

:Valid Range: Equal to or less than ``pg_num``.

.. _crush_rule:

``crush_rule``

:Description: The rule to use for mapping object placement in the cluster.

.. _allow_ec_overwrites:

``allow_ec_overwrites``

:Description: Whether writes to an erasure coded pool can update part
              of an object, so that CephFS and RBD can use it. See
              `Erasure Coding with Overwrites`_ for more details.

:Version: ``12.2.0`` and above
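For example, continuing with the illustrative ``ecpool``::

    ceph osd pool set ecpool allow_ec_overwrites true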
``hashpspool``

:Description: Set/Unset the HASHPSPOOL flag on a given pool.

:Valid Range: 1 sets flag, 0 unsets flag

``nodelete``

:Description: Set/Unset the NODELETE flag on a given pool.

:Valid Range: 1 sets flag, 0 unsets flag
:Version: Version ``FIXME``

``nopgchange``

:Description: Set/Unset the NOPGCHANGE flag on a given pool.

:Valid Range: 1 sets flag, 0 unsets flag
:Version: Version ``FIXME``

``nosizechange``

:Description: Set/Unset the NOSIZECHANGE flag on a given pool.

:Valid Range: 1 sets flag, 0 unsets flag
:Version: Version ``FIXME``

.. _write_fadvise_dontneed:

``write_fadvise_dontneed``

:Description: Set/Unset the WRITE_FADVISE_DONTNEED flag on a given pool.

:Valid Range: 1 sets flag, 0 unsets flag

``noscrub``

:Description: Set/Unset the NOSCRUB flag on a given pool.

:Valid Range: 1 sets flag, 0 unsets flag

``nodeep-scrub``

:Description: Set/Unset the NODEEP_SCRUB flag on a given pool.

:Valid Range: 1 sets flag, 0 unsets flag

.. _hit_set_type:

``hit_set_type``

:Description: Enables hit set tracking for cache pools.
              See `Bloom Filter`_ for additional information.

:Valid Settings: ``bloom``, ``explicit_hash``, ``explicit_object``
:Default: ``bloom``. Other values are for testing.
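For example, to enable bloom-filter hit set tracking on a cache pool (the pool
name is a placeholder)::

    ceph osd pool set hot-storage hit_set_type bloom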
.. _hit_set_count:

``hit_set_count``

:Description: The number of hit sets to store for cache pools. The higher
              the number, the more RAM consumed by the ``ceph-osd`` daemon.

:Valid Range: ``1``. Agent doesn't handle > 1 yet.

.. _hit_set_period:

``hit_set_period``

:Description: The duration of a hit set period in seconds for cache pools.
              The higher the number, the more RAM consumed by the
              ``ceph-osd`` daemon.

:Example: ``3600`` (1 hour)

.. _hit_set_fpp:

``hit_set_fpp``

:Description: The false positive probability for the ``bloom`` hit set type.
              See `Bloom Filter`_ for additional information.

:Valid Range: 0.0 - 1.0

.. _cache_target_dirty_ratio:

``cache_target_dirty_ratio``

:Description: The percentage of the cache pool containing modified (dirty)
              objects before the cache tiering agent will flush them to the
              backing storage pool.

.. _cache_target_dirty_high_ratio:

``cache_target_dirty_high_ratio``

:Description: The percentage of the cache pool containing modified (dirty)
              objects before the cache tiering agent will flush them to the
              backing storage pool at a faster rate.

.. _cache_target_full_ratio:

``cache_target_full_ratio``

:Description: The percentage of the cache pool containing unmodified (clean)
              objects before the cache tiering agent will evict them from the
              cache pool.

.. _target_max_bytes:

``target_max_bytes``

:Description: Ceph will begin flushing or evicting objects when the
              ``max_bytes`` threshold is triggered.

:Example: ``1000000000000`` (1 TB)

.. _target_max_objects:

``target_max_objects``

:Description: Ceph will begin flushing or evicting objects when the
              ``max_objects`` threshold is triggered.

:Example: ``1000000`` (1M objects)

``hit_set_grade_decay_rate``

:Description: Temperature decay rate between two successive hit sets.

:Valid Range: 0 - 100

``hit_set_search_last_n``

:Description: Count at most N appearances in hit sets for the temperature calculation.

:Valid Range: 0 - hit_set_count

.. _cache_min_flush_age:

``cache_min_flush_age``

:Description: The time (in seconds) before the cache tiering agent will flush
              an object from the cache pool to the storage pool.

:Example: ``600`` (10 minutes)

.. _cache_min_evict_age:

``cache_min_evict_age``

:Description: The time (in seconds) before the cache tiering agent will evict
              an object from the cache pool.

:Example: ``1800`` (30 minutes)

.. _fast_read:

``fast_read``

:Description: On erasure coded pools, if this flag is turned on, read requests
              issue sub-reads to all shards, and wait until enough shards are
              received to decode and serve the client. For the jerasure and isa
              erasure plugins, once the first K replies return, the client's
              request is served immediately using the data decoded from those
              replies. This trades some extra resources for better performance.
              Currently this flag is only supported for erasure coded pools.
.. _scrub_min_interval:

``scrub_min_interval``

:Description: The minimum interval in seconds for pool scrubbing when
              load is low. If it is 0, the value ``osd_scrub_min_interval``
              from config is used.

.. _scrub_max_interval:

``scrub_max_interval``

:Description: The maximum interval in seconds for pool scrubbing
              irrespective of cluster load. If it is 0, the value
              ``osd_scrub_max_interval`` from config is used.

.. _deep_scrub_interval:

``deep_scrub_interval``

:Description: The interval in seconds for pool "deep" scrubbing. If it
              is 0, the value ``osd_deep_scrub_interval`` from config is used.

.. _recovery_priority:

``recovery_priority``

:Description: When a value is set, it will increase or decrease the computed
              reservation priority. This value must be in the range -10 to
              10. Use a negative priority for less important pools so they
              have lower priority than any new pools.

.. _recovery_op_priority:

``recovery_op_priority``

:Description: Specify the recovery operation priority for this pool instead of ``osd_recovery_op_priority``.
Get Pool Values
===============

To get a value from a pool, execute the following::

    ceph osd pool get {pool-name} {key}
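For example (illustrative pool name)::

    ceph osd pool get data pg_num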
You may get values for the following keys:

``size``

:Description: see size_

``min_size``

:Description: see min_size_

:Version: ``0.54`` and above

``pg_num``

:Description: see pg_num_

``pgp_num``

:Description: see pgp_num_

:Valid Range: Equal to or less than ``pg_num``.

``crush_rule``

:Description: see crush_rule_

``hit_set_type``

:Description: see hit_set_type_

:Valid Settings: ``bloom``, ``explicit_hash``, ``explicit_object``

``hit_set_count``

:Description: see hit_set_count_

``hit_set_period``

:Description: see hit_set_period_

``hit_set_fpp``

:Description: see hit_set_fpp_

``cache_target_dirty_ratio``

:Description: see cache_target_dirty_ratio_

``cache_target_dirty_high_ratio``

:Description: see cache_target_dirty_high_ratio_

``cache_target_full_ratio``

:Description: see cache_target_full_ratio_

``target_max_bytes``

:Description: see target_max_bytes_

``target_max_objects``

:Description: see target_max_objects_

``cache_min_flush_age``

:Description: see cache_min_flush_age_

``cache_min_evict_age``

:Description: see cache_min_evict_age_

``fast_read``

:Description: see fast_read_

``scrub_min_interval``

:Description: see scrub_min_interval_

``scrub_max_interval``

:Description: see scrub_max_interval_

``deep_scrub_interval``

:Description: see deep_scrub_interval_

``allow_ec_overwrites``

:Description: see allow_ec_overwrites_

``recovery_priority``

:Description: see recovery_priority_

``recovery_op_priority``

:Description: see recovery_op_priority_
Set the Number of Object Replicas
=================================

To set the number of object replicas on a replicated pool, execute the following::

    ceph osd pool set {poolname} size {num-replicas}

.. important:: The ``{num-replicas}`` includes the object itself.
   If you want the object and two copies of the object for a total of
   three instances of the object, specify ``3``.

For example::

    ceph osd pool set data size 3

You may execute this command for each pool. **Note:** An object might accept
I/Os in degraded mode with fewer than ``pool size`` replicas. To set a minimum
number of required replicas for I/O, you should use the ``min_size`` setting.
For example::

    ceph osd pool set data min_size 2

This ensures that no object in the data pool will receive I/O with fewer than
``min_size`` replicas.
Get the Number of Object Replicas
=================================

To get the number of object replicas, execute the following::

    ceph osd dump | grep 'replicated size'

Ceph will list the pools, with the ``replicated size`` attribute highlighted.
By default, ceph creates two replicas of an object (a total of three copies, or
a size of ``3``).
.. _Pool, PG and CRUSH Config Reference: ../../configuration/pool-pg-config-ref
.. _Bloom Filter: https://en.wikipedia.org/wiki/Bloom_filter
.. _setting the number of placement groups: ../placement-groups#set-the-number-of-placement-groups
.. _Erasure Coding with Overwrites: ../erasure-code#erasure-coding-with-overwrites
.. _Block Device Commands: ../../../rbd/rados-rbd-cmds/#create-a-block-device-pool