When you first deploy a cluster without creating a pool, Ceph uses the default
pools for storing data. A pool provides you with:

- **Resilience**: You can set how many OSDs are allowed to fail without losing data.
  For replicated pools, it is the desired number of copies/replicas of an object.
  A typical configuration stores an object and one additional copy
  (i.e., ``size = 2``), but you can determine the number of copies/replicas.
  For `erasure coded pools <../erasure-code>`_, it is the number of coding chunks
  (i.e., ``m=2`` in the **erasure code profile**).

- **Placement Groups**: You can set the number of placement groups for the pool.
  A typical configuration uses approximately 100 placement groups per OSD to
  provide optimal balancing without using up too many computing resources. When
  setting up multiple pools, be careful to set a reasonable number of
  placement groups for both the pool and the cluster as a whole.

- **CRUSH Rules**: When you store data in a pool, a CRUSH ruleset mapped to the
  pool enables CRUSH to identify a rule for the placement of the object
  and its replicas (or chunks for erasure coded pools) in your cluster.
  You can create a custom CRUSH rule for your pool.

- **Snapshots**: When you create snapshots with ``ceph osd pool mksnap``,
  you effectively take a snapshot of a particular pool.

To organize data into pools, you can list, create, and remove pools.
You can also view the utilization statistics for each pool.

List Pools
==========

To list your cluster's pools, execute::

    ceph osd lspools

On a freshly installed cluster, only the ``rbd`` pool exists.

Create a Pool
=============

Before creating pools, refer to the `Pool, PG and CRUSH Config Reference`_.
Ideally, you should override the default value for the number of placement
groups in your Ceph configuration file, as the default is NOT ideal.
For details on placement group numbers, refer to
`setting the number of placement groups`_. For example::

    osd pool default pg num = 100
    osd pool default pgp num = 100

To create a pool, execute::

    ceph osd pool create {pool-name} {pg-num} [{pgp-num}] [replicated] \
        [crush-rule-name] [expected-num-objects]
    ceph osd pool create {pool-name} {pg-num} {pgp-num} erasure \
        [erasure-code-profile] [crush-rule-name] [expected_num_objects]

Where:

``{pool-name}``

:Description: The name of the pool. It must be unique.

``{pg-num}``

:Description: The total number of placement groups for the pool. See `Placement
              Groups`_ for details on calculating a suitable number. The
              default value ``8`` is NOT suitable for most systems.

``{pgp-num}``

:Description: The total number of placement groups for placement purposes. This
              **should be equal to the total number of placement groups**, except
              for placement group splitting scenarios.
:Required: Yes. Picks up the default or Ceph configuration value if not specified.

``{replicated|erasure}``

:Description: The pool type, which may be either **replicated** to
              recover from lost OSDs by keeping multiple copies of the
              objects, or **erasure** to get a kind of
              `generalized RAID5 <../erasure-code>`_ capability.
              The **replicated** pools require more
              raw storage but implement all Ceph operations. The
              **erasure** pools require less raw storage but only
              implement a subset of the available operations.

``[crush-rule-name]``

:Description: The name of a CRUSH rule to use for this pool. The specified
              rule must exist.
:Default: For **replicated** pools it is the ruleset specified by the ``osd
          pool default crush replicated ruleset`` config variable. This
          ruleset must exist.
          For **erasure** pools it is ``erasure-code`` if the ``default``
          `erasure code profile`_ is used, or ``{pool-name}`` otherwise. This
          ruleset will be created implicitly if it doesn't exist already.

``[erasure-code-profile=profile]``

.. _erasure code profile: ../erasure-code-profile

:Description: For **erasure** pools only. Use the `erasure code profile`_. It
              must be an existing profile as defined by
              **osd erasure-code-profile set**.

When you create a pool, set the number of placement groups to a reasonable value
(e.g., ``100``). Consider the total number of placement groups per OSD too.
Placement groups are computationally expensive, so performance will degrade when
you have many pools with many placement groups (e.g., 50 pools with 100
placement groups each). The point of diminishing returns depends upon the power
of the OSD host.

See `Placement Groups`_ for details on calculating an appropriate number of
placement groups for your pool.

.. _Placement Groups: ../placement-groups

``[expected-num-objects]``

:Description: The expected number of objects for this pool. By setting this value
              (together with a negative **filestore merge threshold**), PG folder
              splitting happens at pool creation time, avoiding the latency impact
              of doing folder splitting at runtime.
:Default: ``0``, no splitting at pool creation time.

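As a quick illustration (the pool names and placement group counts below are
arbitrary examples, not recommendations), a replicated pool and an
erasure-coded pool using the default erasure code profile might be created
like this::

    ceph osd pool create mypool 128 128 replicated
    ceph osd pool create myecpool 128 128 erasure
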
Set Pool Quotas
===============

You can set pool quotas for the maximum number of bytes and/or the maximum
number of objects per pool::

    ceph osd pool set-quota {pool-name} [max_objects {obj-count}] [max_bytes {bytes}]

For example::

    ceph osd pool set-quota data max_objects 10000

To remove a quota, set its value to ``0``.

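For instance, to clear the object quota that was just set on the (illustrative)
``data`` pool::

    ceph osd pool set-quota data max_objects 0
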
Delete a Pool
=============

To delete a pool, execute::

    ceph osd pool delete {pool-name} [{pool-name} --yes-i-really-really-mean-it]

To remove a pool, the ``mon_allow_pool_delete`` flag must be set to ``true`` in
the monitors' configuration; otherwise the monitors will refuse to remove the
pool.

See `Monitor Configuration`_ for more information.

.. _Monitor Configuration: ../../configuration/mon-config-ref

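As a sketch (``testpool`` is a hypothetical pool, and injecting the option at
runtime is only one way to set it), enabling the flag and removing the pool
might look like this::

    ceph tell mon.\* injectargs --mon-allow-pool-delete=true
    ceph osd pool delete testpool testpool --yes-i-really-really-mean-it
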
If you created your own rulesets and rules for a pool you created, you should
consider removing them when you no longer need your pool::

    ceph osd pool get {pool-name} crush_ruleset

If the ruleset was "123", for example, you can check the other pools like so::

    ceph osd dump | grep "^pool" | grep "crush_ruleset 123"

If no other pools use that custom ruleset, then it's safe to delete that
ruleset from the cluster.

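If you decide to remove it, the rule's name (shown here as a hypothetical
``my-custom-rule``) is what the ``ceph osd crush rule`` commands expect::

    ceph osd crush rule ls
    ceph osd crush rule rm my-custom-rule
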
If you created users with permissions strictly for a pool that no longer
exists, you should consider deleting those users too::

    ceph auth list | grep -C 5 {pool-name}

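If such a user turns up and is no longer needed, it can be removed with
``ceph auth del`` (``client.myapp`` is a hypothetical user name)::

    ceph auth del client.myapp
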
Rename a Pool
=============

To rename a pool, execute::

    ceph osd pool rename {current-pool-name} {new-pool-name}

If you rename a pool and you have per-pool capabilities for an authenticated
user, you must update the user's capabilities (i.e., caps) with the new pool
name.

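For instance, a hypothetical ``client.app`` user whose capabilities referenced
the old pool name could be updated along these lines (adjust the caps to
whatever the user actually requires)::

    ceph auth caps client.app mon 'allow r' osd 'allow rwx pool=new-pool-name'
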
.. note:: Version ``0.48`` Argonaut and above.

Show Pool Statistics
====================

To show a pool's utilization statistics, execute::

    rados df

Make a Snapshot of a Pool
=========================

To make a snapshot of a pool, execute::

    ceph osd pool mksnap {pool-name} {snap-name}

.. note:: Version ``0.48`` Argonaut and above.

Remove a Snapshot of a Pool
===========================

To remove a snapshot of a pool, execute::

    ceph osd pool rmsnap {pool-name} {snap-name}

.. note:: Version ``0.48`` Argonaut and above.

Set Pool Values
===============

To set a value for a pool, execute the following::

    ceph osd pool set {pool-name} {key} {value}

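For example, to flag an (illustrative) ``data`` pool so that it cannot be
deleted accidentally, you might set the ``nodelete`` flag described below::

    ceph osd pool set data nodelete 1
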
You may set values for the following keys:

.. _size:

``size``

:Description: Sets the number of replicas for objects in the pool.
              See `Set the Number of Object Replicas`_ for further details.
              Replicated pools only.

.. _min_size:

``min_size``

:Description: Sets the minimum number of replicas required for I/O.
              See `Set the Number of Object Replicas`_ for further details.
              Replicated pools only.
:Version: ``0.54`` and above

.. _pg_num:

``pg_num``

:Description: The effective number of placement groups to use when calculating
              data placement.
:Valid Range: Greater than the current ``pg_num`` value.

.. _pgp_num:

``pgp_num``

:Description: The effective number of placement groups for placement to use
              when calculating data placement.
:Valid Range: Equal to or less than ``pg_num``.

.. _crush_ruleset:

``crush_ruleset``

:Description: The ruleset to use for mapping object placement in the cluster.

.. _allow_ec_overwrites:

``allow_ec_overwrites``

:Description: Whether writes to an erasure coded pool can update part
              of an object, so cephfs and rbd can use it. See
              `Erasure Coding with Overwrites`_ for more details.
:Version: ``12.2.0`` and above

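For example, assuming an existing erasure-coded pool named ``ec_pool`` (an
illustrative name), overwrites can be enabled with::

    ceph osd pool set ec_pool allow_ec_overwrites true
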
``hashpspool``

:Description: Set/Unset HASHPSPOOL flag on a given pool.
:Valid Range: 1 sets flag, 0 unsets flag
:Version: ``0.48`` Argonaut and above.

``nodelete``

:Description: Set/Unset NODELETE flag on a given pool.
:Valid Range: 1 sets flag, 0 unsets flag
:Version: ``FIXME``

``nopgchange``

:Description: Set/Unset NOPGCHANGE flag on a given pool.
:Valid Range: 1 sets flag, 0 unsets flag
:Version: ``FIXME``

``nosizechange``

:Description: Set/Unset NOSIZECHANGE flag on a given pool.
:Valid Range: 1 sets flag, 0 unsets flag
:Version: ``FIXME``

.. _write_fadvise_dontneed:

``write_fadvise_dontneed``

:Description: Set/Unset WRITE_FADVISE_DONTNEED flag on a given pool.
:Valid Range: 1 sets flag, 0 unsets flag

``noscrub``

:Description: Set/Unset NOSCRUB flag on a given pool.
:Valid Range: 1 sets flag, 0 unsets flag

``nodeep-scrub``

:Description: Set/Unset NODEEP_SCRUB flag on a given pool.
:Valid Range: 1 sets flag, 0 unsets flag

.. _hit_set_type:

``hit_set_type``

:Description: Enables hit set tracking for cache pools.
              See `Bloom Filter`_ for additional information.
:Valid Settings: ``bloom``, ``explicit_hash``, ``explicit_object``
:Default: ``bloom``. Other values are for testing.

.. _hit_set_count:

``hit_set_count``

:Description: The number of hit sets to store for cache pools. The higher
              the number, the more RAM consumed by the ``ceph-osd`` daemon.
:Valid Range: ``1``. Agent doesn't handle > 1 yet.

.. _hit_set_period:

``hit_set_period``

:Description: The duration of a hit set period in seconds for cache pools.
              The higher the number, the more RAM consumed by the
              ``ceph-osd`` daemon.
:Example: ``3600`` 1hr

.. _hit_set_fpp:

``hit_set_fpp``

:Description: The false positive probability for the ``bloom`` hit set type.
              See `Bloom Filter`_ for additional information.
:Valid Range: 0.0 - 1.0

.. _cache_target_dirty_ratio:

``cache_target_dirty_ratio``

:Description: The percentage of the cache pool containing modified (dirty)
              objects before the cache tiering agent will flush them to the
              backing storage pool.

.. _cache_target_dirty_high_ratio:

``cache_target_dirty_high_ratio``

:Description: The percentage of the cache pool containing modified (dirty)
              objects before the cache tiering agent will flush them to the
              backing storage pool at a higher speed.

.. _cache_target_full_ratio:

``cache_target_full_ratio``

:Description: The percentage of the cache pool containing unmodified (clean)
              objects before the cache tiering agent will evict them from the
              cache pool.

.. _target_max_bytes:

``target_max_bytes``

:Description: Ceph will begin flushing or evicting objects when the
              ``max_bytes`` threshold is triggered.
:Example: ``1000000000000`` #1-TB

.. _target_max_objects:

``target_max_objects``

:Description: Ceph will begin flushing or evicting objects when the
              ``max_objects`` threshold is triggered.
:Example: ``1000000`` #1M objects

``hit_set_grade_decay_rate``

:Description: Temperature decay rate between two successive hit_sets
:Valid Range: 0 - 100

``hit_set_search_last_n``

:Description: Count at most N appearances in hit_sets for temperature calculation
:Valid Range: 0 - hit_set_count

.. _cache_min_flush_age:

``cache_min_flush_age``

:Description: The time (in seconds) before the cache tiering agent will flush
              an object from the cache pool to the storage pool.
:Example: ``600`` 10min

.. _cache_min_evict_age:

``cache_min_evict_age``

:Description: The time (in seconds) before the cache tiering agent will evict
              an object from the cache pool.
:Example: ``1800`` 30min

.. _fast_read:

``fast_read``

:Description: On an erasure-coded pool, if this flag is turned on, the read
              request issues sub-reads to all shards and waits until it has
              received enough shards to decode before serving the client. With
              the jerasure and isa erasure plugins, once the first K replies
              return, the client's request is served immediately using the data
              decoded from these replies. This trades some extra resources for
              better performance. Currently this flag is only supported for
              erasure-coded pools.

.. _scrub_min_interval:

``scrub_min_interval``

:Description: The minimum interval in seconds for pool scrubbing when
              load is low. If it is 0, the value ``osd_scrub_min_interval``
              from config is used.

.. _scrub_max_interval:

``scrub_max_interval``

:Description: The maximum interval in seconds for pool scrubbing
              irrespective of cluster load. If it is 0, the value
              ``osd_scrub_max_interval`` from config is used.

.. _deep_scrub_interval:

``deep_scrub_interval``

:Description: The interval in seconds for pool “deep” scrubbing. If it
              is 0, the value ``osd_deep_scrub_interval`` from config is used.

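As a short illustration of some of the keys above (``hot-storage`` is a
hypothetical cache pool and the values are examples only)::

    ceph osd pool set hot-storage hit_set_type bloom
    ceph osd pool set hot-storage cache_target_dirty_ratio 0.4
    ceph osd pool set hot-storage cache_min_flush_age 600
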
Get Pool Values
===============

To get a value from a pool, execute the following::

    ceph osd pool get {pool-name} {key}

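For example (``data`` is an illustrative pool name)::

    ceph osd pool get data size
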
You may get values for the following keys:

``size``

:Description: see size_

``min_size``

:Description: see min_size_
:Version: ``0.54`` and above

``pg_num``

:Description: see pg_num_

``pgp_num``

:Description: see pgp_num_
:Valid Range: Equal to or less than ``pg_num``.

``crush_ruleset``

:Description: see crush_ruleset_

``hit_set_type``

:Description: see hit_set_type_
:Valid Settings: ``bloom``, ``explicit_hash``, ``explicit_object``

``hit_set_count``

:Description: see hit_set_count_

``hit_set_period``

:Description: see hit_set_period_

``hit_set_fpp``

:Description: see hit_set_fpp_

``cache_target_dirty_ratio``

:Description: see cache_target_dirty_ratio_

``cache_target_dirty_high_ratio``

:Description: see cache_target_dirty_high_ratio_

``cache_target_full_ratio``

:Description: see cache_target_full_ratio_

``target_max_bytes``

:Description: see target_max_bytes_

``target_max_objects``

:Description: see target_max_objects_

``cache_min_flush_age``

:Description: see cache_min_flush_age_

``cache_min_evict_age``

:Description: see cache_min_evict_age_

``fast_read``

:Description: see fast_read_

``scrub_min_interval``

:Description: see scrub_min_interval_

``scrub_max_interval``

:Description: see scrub_max_interval_

``deep_scrub_interval``

:Description: see deep_scrub_interval_

Set the Number of Object Replicas
=================================

To set the number of object replicas on a replicated pool, execute the following::

    ceph osd pool set {poolname} size {num-replicas}

.. important:: The ``{num-replicas}`` includes the object itself.
   If you want the object and two copies of the object for a total of
   three instances of the object, specify ``3``.

For example::

    ceph osd pool set data size 3

You may execute this command for each pool. **Note:** An object might accept
I/Os in degraded mode with fewer than ``pool size`` replicas. To set a minimum
number of required replicas for I/O, you should use the ``min_size`` setting.
For example::

    ceph osd pool set data min_size 2

This ensures that no object in the data pool will receive I/O with fewer than
``min_size`` replicas.

Get the Number of Object Replicas
=================================

To get the number of object replicas, execute the following::

    ceph osd dump | grep 'replicated size'

Ceph will list the pools, with the ``replicated size`` attribute highlighted.
By default, Ceph creates two replicas of an object (a total of three copies, or
a size of ``3``).

.. _Pool, PG and CRUSH Config Reference: ../../configuration/pool-pg-config-ref
.. _Bloom Filter: http://en.wikipedia.org/wiki/Bloom_filter
.. _setting the number of placement groups: ../placement-groups#set-the-number-of-placement-groups
.. _Erasure Coding with Overwrites: ../erasure-code#erasure-coding-with-overwrites