5 When you first deploy a cluster without creating a pool, Ceph uses the default
6 pools for storing data. A pool provides you with:
- **Resilience**: You can set how many OSDs are allowed to fail without losing data.
  For replicated pools, this is the desired number of copies/replicas of an object.
  A typical configuration stores an object and one additional copy
  (i.e., ``size = 2``), but you can determine the number of copies/replicas.
  For `erasure coded pools <../erasure-code>`_, it is the number of coding chunks
  (i.e., ``m=2`` in the **erasure code profile**).
- **Placement Groups**: You can set the number of placement groups for the pool.
  A typical configuration uses approximately 100 placement groups per OSD to
  provide optimal balancing without using up too many computing resources. When
  setting up multiple pools, be careful to set a reasonable number of
  placement groups for both the pool and the cluster as a whole.
- **CRUSH Rules**: When you store data in a pool, placement of the object
  and its replicas (or chunks for erasure coded pools) in your cluster is governed
  by CRUSH rules. You can create a custom CRUSH rule for your pool if the default
  rule is not appropriate for your use case.
- **Snapshots**: When you create snapshots with ``ceph osd pool mksnap``,
  you effectively take a snapshot of a particular pool.
29 To organize data into pools, you can list, create, and remove pools.
30 You can also view the utilization statistics for each pool.
List Pools
==========

To list your cluster's pools, execute::

    ceph osd lspools

On a freshly installed cluster, only the ``rbd`` pool exists.
Create a Pool
=============

Before creating pools, refer to the `Pool, PG and CRUSH Config Reference`_.
Ideally, you should override the default value for the number of placement
groups in your Ceph configuration file, as the default is NOT ideal.
For details on placement group numbers, refer to `setting the number of placement groups`_.
.. note:: Starting with Luminous, all pools need to be associated to the
   application using the pool. See `Associate Pool to Application`_ below for
   more information.

For example::

    osd pool default pg num = 100
    osd pool default pgp num = 100
To create a pool, execute::

    ceph osd pool create {pool-name} {pg-num} [{pgp-num}] [replicated] \
         [crush-rule-name] [expected-num-objects]
    ceph osd pool create {pool-name} {pg-num} {pgp-num} erasure \
         [erasure-code-profile] [crush-rule-name] [expected_num_objects]
Where:

``{pool-name}``

:Description: The name of the pool. It must be unique.
``{pg-num}``

:Description: The total number of placement groups for the pool. See `Placement
              Groups`_ for details on calculating a suitable number. The
              default value ``8`` is NOT suitable for most systems.
``{pgp-num}``

:Description: The total number of placement groups for placement purposes. This
              **should be equal to the total number of placement groups**, except
              for placement group splitting scenarios.

:Required: Yes. Picks up the default or Ceph configuration value if not specified.
``{replicated|erasure}``

:Description: The pool type, which may be either **replicated** to
              recover from lost OSDs by keeping multiple copies of the
              objects, or **erasure** to get a kind of
              `generalized RAID5 <../erasure-code>`_ capability.
              The **replicated** pools require more
              raw storage but implement all Ceph operations. The
              **erasure** pools require less raw storage but only
              implement a subset of the available operations.
``[crush-rule-name]``

:Description: The name of a CRUSH rule to use for this pool. The specified
              rule must exist.

:Default: For **replicated** pools it is the rule specified by the ``osd
          pool default crush rule`` config variable. This rule must exist.
          For **erasure** pools it is ``erasure-code`` if the ``default``
          `erasure code profile`_ is used or ``{pool-name}`` otherwise. This
          rule will be created implicitly if it doesn't exist already.
``[erasure-code-profile=profile]``

.. _erasure code profile: ../erasure-code-profile

:Description: For **erasure** pools only. Use the `erasure code profile`_. It
              must be an existing profile as defined by
              **osd erasure-code-profile set**.
When you create a pool, set the number of placement groups to a reasonable value
(e.g., ``100``). Consider the total number of placement groups per OSD too.
Placement groups are computationally expensive, so performance will degrade when
you have many pools with many placement groups (e.g., 50 pools with 100
placement groups each). The point of diminishing returns depends upon the power
of the OSD host.

See `Placement Groups`_ for details on calculating an appropriate number of
placement groups for your pool.
146 .. _Placement Groups: ../placement-groups
``[expected-num-objects]``

:Description: The expected number of objects for this pool. By setting this value
              (together with a negative **filestore merge threshold**), the PG folder
              splitting happens at pool creation time, avoiding the latency impact
              of runtime folder splitting.

:Default: 0, no splitting at pool creation time.
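For example, a minimal sketch of creating one replicated and one erasure coded
pool (the pool names ``rep-pool`` and ``ec-pool`` and the placement group counts
are placeholders; pick values suited to your cluster)::

    ceph osd pool create rep-pool 128 128 replicated
    ceph osd pool create ec-pool 128 128 erasure

Omitting the erasure code profile, as above, uses the ``default`` profile.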
159 Associate Pool to Application
160 =============================
Pools need to be associated with an application before use. Pools that will be
used with CephFS or pools that are automatically created by RGW are
automatically associated. Pools that are intended for use with RBD should be
initialized using the ``rbd`` tool (see `Block Device Commands`_ for more
information).

For other cases, you can manually associate a free-form application name to
a pool. ::

    ceph osd pool application enable {pool-name} {application-name}
.. note:: CephFS uses the application name ``cephfs``, RBD uses the
   application name ``rbd``, and RGW uses the application name ``rgw``.
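For example, to tag a hypothetical pool named ``mypool`` for use by RGW and then
verify the association::

    ceph osd pool application enable mypool rgw
    ceph osd pool application get mypool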
Set Pool Quotas
===============

You can set pool quotas for the maximum number of bytes and/or the maximum
number of objects per pool. ::

    ceph osd pool set-quota {pool-name} [max_objects {obj-count}] [max_bytes {bytes}]

For example::

    ceph osd pool set-quota data max_objects 10000
188 To remove a quota, set its value to ``0``.
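For example, to clear the object quota set above on the ``data`` pool::

    ceph osd pool set-quota data max_objects 0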
Delete a Pool
=============

To delete a pool, execute::

    ceph osd pool delete {pool-name} [{pool-name} --yes-i-really-really-mean-it]

To remove a pool, the ``mon_allow_pool_delete`` flag must be set to ``true`` in the
monitors' configuration; otherwise the monitors will refuse to remove the pool.
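As a sketch, one way to enable the flag at runtime on all monitors (settings
injected this way do not persist across monitor restarts)::

    ceph tell mon.\* injectargs '--mon-allow-pool-delete=true'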
202 See `Monitor Configuration`_ for more information.
204 .. _Monitor Configuration: ../../configuration/mon-config-ref
If you created your own rules for a pool, you should consider
removing them when you no longer need the pool::

    ceph osd pool get {pool-name} crush_rule
If the rule was "123", for example, you can check the other pools like so::

    ceph osd dump | grep "^pool" | grep "crush_rule 123"
215 If no other pools use that custom rule, then it's safe to delete that
216 rule from the cluster.
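A sketch of listing the rule names and removing an unused custom rule (the rule
name ``my-custom-rule`` is a placeholder)::

    ceph osd crush rule ls
    ceph osd crush rule rm my-custom-rule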
If you created users with permissions strictly for a pool that no longer
exists, you should consider deleting those users too::

    ceph auth ls | grep -C 5 {pool-name}
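For example, to delete a hypothetical user ``client.mypool-user`` that is no
longer needed::

    ceph auth del client.mypool-user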
Rename a Pool
=============

To rename a pool, execute::

    ceph osd pool rename {current-pool-name} {new-pool-name}
If you rename a pool and you have per-pool capabilities for an authenticated
user, you must update the user's capabilities (i.e., caps) with the new pool
name.
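As a sketch, for a hypothetical user ``client.myapp`` whose caps referenced the
old pool name::

    ceph auth caps client.myapp mon 'allow r' osd 'allow rw pool={new-pool-name}'

Note that ``ceph auth caps`` replaces the user's existing capabilities, so
include every capability the user still needs.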
236 .. note:: Version ``0.48`` Argonaut and above.
Show Pool Statistics
====================

To show a pool's utilization statistics, execute::

    rados df
246 Make a Snapshot of a Pool
247 =========================
To make a snapshot of a pool, execute::

    ceph osd pool mksnap {pool-name} {snap-name}
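For example, to snapshot a hypothetical pool named ``data`` and then list its
snapshots::

    ceph osd pool mksnap data data-snap-1
    rados -p data lssnap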
253 .. note:: Version ``0.48`` Argonaut and above.
256 Remove a Snapshot of a Pool
257 ===========================
To remove a snapshot of a pool, execute::

    ceph osd pool rmsnap {pool-name} {snap-name}
263 .. note:: Version ``0.48`` Argonaut and above.
Set Pool Values
===============

To set a value for a pool, execute the following::

    ceph osd pool set {pool-name} {key} {value}
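For example, to temporarily disable scrubbing on a hypothetical pool named
``data`` (using the ``noscrub`` key described below)::

    ceph osd pool set data noscrub 1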
275 You may set values for the following keys:
277 .. _compression_algorithm:
279 ``compression_algorithm``
:Description: Sets inline compression algorithm to use for underlying BlueStore.
              This setting overrides the `global setting <rados/configuration/bluestore-config-ref/#inline-compression>`_ of ``bluestore compression algorithm``.
284 :Valid Settings: ``lz4``, ``snappy``, ``zlib``, ``zstd``
``compression_mode``

:Description: Sets the policy for the inline compression algorithm for underlying BlueStore.
              This setting overrides the `global setting <rados/configuration/bluestore-config-ref/#inline-compression>`_ of ``bluestore compression mode``.

:Valid Settings: ``none``, ``passive``, ``aggressive``, ``force``
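As a sketch, to enable aggressive ``zstd`` compression on a hypothetical pool
named ``data``::

    ceph osd pool set data compression_algorithm zstd
    ceph osd pool set data compression_mode aggressive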
294 ``compression_min_blob_size``
:Description: Chunks smaller than this are never compressed.
              This setting overrides the `global setting <rados/configuration/bluestore-config-ref/#inline-compression>`_ of ``bluestore compression min blob *``.
299 :Type: Unsigned Integer
301 ``compression_max_blob_size``
:Description: Chunks larger than this are broken into smaller blobs of size
              ``compression_max_blob_size`` before being compressed.
306 :Type: Unsigned Integer
.. _size:

``size``

:Description: Sets the number of replicas for objects in the pool.
              See `Set the Number of Object Replicas`_ for further details.
              Replicated pools only.
.. _min_size:

``min_size``

:Description: Sets the minimum number of replicas required for I/O.
              See `Set the Number of Object Replicas`_ for further details.
              Replicated pools only.
327 :Version: ``0.54`` and above
.. _pg_num:

``pg_num``

:Description: The effective number of placement groups to use when calculating
              data placement.

:Valid Range: Greater than the current ``pg_num`` value.
.. _pgp_num:

``pgp_num``

:Description: The effective number of placement groups for placement to use
              when calculating data placement.

:Valid Range: Equal to or less than ``pg_num``.
.. _crush_rule:

``crush_rule``

:Description: The rule to use for mapping object placement in the cluster.
355 .. _allow_ec_overwrites:
``allow_ec_overwrites``

:Description: Whether writes to an erasure coded pool can update part
              of an object, so cephfs and rbd can use it. See
              `Erasure Coding with Overwrites`_ for more details.
363 :Version: ``12.2.0`` and above
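For example, to allow partial overwrites on a hypothetical erasure coded pool
named ``ec-pool`` (the pool must reside on BlueStore OSDs)::

    ceph osd pool set ec-pool allow_ec_overwrites true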
``hashpspool``

:Description: Set/Unset HASHPSPOOL flag on a given pool.

:Valid Range: 1 sets flag, 0 unsets flag
:Version: Version ``0.48`` Argonaut and above.
``nodelete``

:Description: Set/Unset NODELETE flag on a given pool.

:Valid Range: 1 sets flag, 0 unsets flag
:Version: Version ``FIXME``
``nopgchange``

:Description: Set/Unset NOPGCHANGE flag on a given pool.

:Valid Range: 1 sets flag, 0 unsets flag
:Version: Version ``FIXME``
``nosizechange``

:Description: Set/Unset NOSIZECHANGE flag on a given pool.

:Valid Range: 1 sets flag, 0 unsets flag
:Version: Version ``FIXME``
401 .. _write_fadvise_dontneed:
403 ``write_fadvise_dontneed``
405 :Description: Set/Unset WRITE_FADVISE_DONTNEED flag on a given pool.
407 :Valid Range: 1 sets flag, 0 unsets flag
``noscrub``

:Description: Set/Unset NOSCRUB flag on a given pool.

:Valid Range: 1 sets flag, 0 unsets flag
``nodeep-scrub``

:Description: Set/Unset NODEEP_SCRUB flag on a given pool.

:Valid Range: 1 sets flag, 0 unsets flag
.. _hit_set_type:

``hit_set_type``

:Description: Enables hit set tracking for cache pools.
              See `Bloom Filter`_ for additional information.

:Valid Settings: ``bloom``, ``explicit_hash``, ``explicit_object``
:Default: ``bloom``. Other values are for testing.
.. _hit_set_count:

``hit_set_count``

:Description: The number of hit sets to store for cache pools. The higher
              the number, the more RAM consumed by the ``ceph-osd`` daemon.

:Valid Range: ``1``. Agent doesn't handle > 1 yet.
.. _hit_set_period:

``hit_set_period``

:Description: The duration of a hit set period in seconds for cache pools.
              The higher the number, the more RAM consumed by the
              ``ceph-osd`` daemon.

:Example: ``3600`` 1hr
.. _hit_set_fpp:

``hit_set_fpp``

:Description: The false positive probability for the ``bloom`` hit set type.
              See `Bloom Filter`_ for additional information.

:Valid Range: 0.0 - 1.0
.. _cache_target_dirty_ratio:

``cache_target_dirty_ratio``

:Description: The percentage of the cache pool containing modified (dirty)
              objects before the cache tiering agent will flush them to the
              backing storage pool.

.. _cache_target_dirty_high_ratio:

``cache_target_dirty_high_ratio``

:Description: The percentage of the cache pool containing modified (dirty)
              objects before the cache tiering agent will flush them to the
              backing storage pool with a higher speed.

.. _cache_target_full_ratio:

``cache_target_full_ratio``

:Description: The percentage of the cache pool containing unmodified (clean)
              objects before the cache tiering agent will evict them from the
              cache pool.
.. _target_max_bytes:

``target_max_bytes``

:Description: Ceph will begin flushing or evicting objects when the
              ``max_bytes`` threshold is triggered.

:Example: ``1000000000000`` #1-TB
.. _target_max_objects:

``target_max_objects``

:Description: Ceph will begin flushing or evicting objects when the
              ``max_objects`` threshold is triggered.

:Example: ``1000000`` #1M objects
522 ``hit_set_grade_decay_rate``
524 :Description: Temperature decay rate between two successive hit_sets
526 :Valid Range: 0 - 100
530 ``hit_set_search_last_n``
:Description: Count at most N appearances in hit_sets for temperature calculation.
534 :Valid Range: 0 - hit_set_count
538 .. _cache_min_flush_age:
``cache_min_flush_age``

:Description: The time (in seconds) before the cache tiering agent will flush
              an object from the cache pool to the storage pool.

:Example: ``600`` 10min
548 .. _cache_min_evict_age:
``cache_min_evict_age``

:Description: The time (in seconds) before the cache tiering agent will evict
              an object from the cache pool.

:Example: ``1800`` 30min
.. _fast_read:

``fast_read``

:Description: On an erasure coded pool, if this flag is turned on, the read
              request issues sub-reads to all shards and waits until it receives
              enough shards to decode and serve the client. With the jerasure and
              isa erasure plugins, once the first K replies return, the client's
              request is served immediately using the data decoded from these
              replies. This trades some additional resources for better
              performance. Currently this flag is only supported for erasure
              coded pools.
573 .. _scrub_min_interval:
575 ``scrub_min_interval``
:Description: The minimum interval in seconds for pool scrubbing when
              load is low. If it is 0, the value ``osd_scrub_min_interval``
              from config is used.
584 .. _scrub_max_interval:
586 ``scrub_max_interval``
:Description: The maximum interval in seconds for pool scrubbing
              irrespective of cluster load. If it is 0, the value
              ``osd_scrub_max_interval`` from config is used.
595 .. _deep_scrub_interval:
597 ``deep_scrub_interval``
:Description: The interval in seconds for pool "deep" scrubbing. If it
              is 0, the value ``osd_deep_scrub_interval`` from config is used.
Get Pool Values
===============

To get a value from a pool, execute the following::

    ceph osd pool get {pool-name} {key}
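For example, to read the replica count of a hypothetical pool named ``data``::

    ceph osd pool get data size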
613 You may get values for the following keys:
``size``

:Description: see size_

``min_size``

:Description: see min_size_

:Version: ``0.54`` and above
``pg_num``

:Description: see pg_num_

``pgp_num``

:Description: see pgp_num_

:Valid Range: Equal to or less than ``pg_num``.
``crush_rule``

:Description: see crush_rule_
``hit_set_type``

:Description: see hit_set_type_

:Valid Settings: ``bloom``, ``explicit_hash``, ``explicit_object``
``hit_set_count``

:Description: see hit_set_count_
``hit_set_period``

:Description: see hit_set_period_
``hit_set_fpp``

:Description: see hit_set_fpp_
676 ``cache_target_dirty_ratio``
678 :Description: see cache_target_dirty_ratio_
683 ``cache_target_dirty_high_ratio``
685 :Description: see cache_target_dirty_high_ratio_
690 ``cache_target_full_ratio``
692 :Description: see cache_target_full_ratio_
``target_max_bytes``

:Description: see target_max_bytes_
704 ``target_max_objects``
706 :Description: see target_max_objects_
711 ``cache_min_flush_age``
713 :Description: see cache_min_flush_age_
718 ``cache_min_evict_age``
720 :Description: see cache_min_evict_age_
``fast_read``

:Description: see fast_read_
732 ``scrub_min_interval``
734 :Description: see scrub_min_interval_
739 ``scrub_max_interval``
741 :Description: see scrub_max_interval_
746 ``deep_scrub_interval``
748 :Description: see deep_scrub_interval_
753 ``allow_ec_overwrites``
755 :Description: see allow_ec_overwrites_
760 Set the Number of Object Replicas
761 =================================
To set the number of object replicas on a replicated pool, execute the following::

    ceph osd pool set {poolname} size {num-replicas}
.. important:: The ``{num-replicas}`` includes the object itself.
   If you want the object and two copies of the object for a total of
   three instances of the object, specify ``3``.

For example::

    ceph osd pool set data size 3
You may execute this command for each pool. **Note:** An object might accept
I/Os in degraded mode with fewer than ``pool size`` replicas. To set a minimum
number of required replicas for I/O, you should use the ``min_size`` setting.
For example::

    ceph osd pool set data min_size 2
782 This ensures that no object in the data pool will receive I/O with fewer than
783 ``min_size`` replicas.
786 Get the Number of Object Replicas
787 =================================
To get the number of object replicas, execute the following::

    ceph osd dump | grep 'replicated size'
Ceph will list the pools, with the ``replicated size`` attribute highlighted.
By default, Ceph creates two replicas of an object (a total of three copies, or
a size of ``3``).
800 .. _Bloom Filter: http://en.wikipedia.org/wiki/Bloom_filter
801 .. _setting the number of placement groups: ../placement-groups#set-the-number-of-placement-groups
802 .. _Erasure Coding with Overwrites: ../erasure-code#erasure-coding-with-overwrites
803 .. _Block Device Commands: ../../../rbd/rados-rbd-cmds/#create-a-block-device-pool