You can also configure the default ``pg_autoscale_mode`` that is
applied to any pools that are created in the future with::

  ceph config set global osd_pool_default_pg_autoscale_mode <mode>
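
For example, to make newly created pools warn about a suboptimal PG
count rather than adjust it automatically (any of the modes shown
later, ``on``, ``off``, or ``warn``, can be substituted)::

  ceph config set global osd_pool_default_pg_autoscale_mode warn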

Viewing PG scaling recommendations
----------------------------------

You can view each pool, its relative utilization, and any suggested
changes to the PG count with this command::

  ceph osd pool autoscale-status

Output will be something like::

  POOL  SIZE   TARGET SIZE  RATE  RAW CAPACITY  RATIO   TARGET RATIO  EFFECTIVE RATIO  PG_NUM  NEW PG_NUM  AUTOSCALE
  a     12900M              3.0   82431M        0.4695                                 8       128         warn
  c     0                   3.0   82431M        0.0000  0.2000        0.9884           1       64          warn
  b     0      953.6M       3.0   82431M        0.0347                                 8                   warn

**SIZE** is the amount of data stored in the pool. **TARGET SIZE**, if
present, is the amount of data the administrator has specified that
they expect this pool to consume. **RATIO** is the fraction of the
raw capacity consumed by the pool (i.e., ratio = size * rate / raw
capacity).
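
For example, the RATIO reported for pool ``a`` in the sample output
above works out as::

  ratio = size * rate / raw capacity
        = 12900M * 3.0 / 82431M
        = 0.4695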

**TARGET RATIO**, if present, is the ratio of storage that the
administrator has specified that they expect this pool to consume
relative to other pools with target ratios set. If both target size
bytes and ratio are specified, the ratio takes precedence.

**EFFECTIVE RATIO** is the target ratio after adjusting in two ways:

1. subtracting any capacity expected to be used by pools with target
   size set
2. normalizing the target ratios among pools with target ratio set so
   they collectively target the rest of the space. For example, 4
   pools with target_ratio 1.0 would have an effective ratio of 0.25.

The system uses the larger of the actual ratio and the effective ratio
for its calculation.
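
As a hypothetical worked example (the numbers are assumed for
illustration): if pools with target sizes set are expected to consume
20% of the raw capacity, and two other pools each have
``target_size_ratio`` 1.0, then each of those two pools ends up
with::

  effective ratio = 1.0 / (1.0 + 1.0) * (1 - 0.20) = 0.40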

**PG_NUM** is the current number of PGs for the pool (or the current
number of PGs that the pool is working towards, if a ``pg_num``
change is in progress). **NEW PG_NUM**, if present, is the value the
system believes the pool's ``pg_num`` should be changed to. It is
always a power of 2, and is only present when the "ideal" value
differs from the current value by more than a factor of 3; this
threshold avoids needless ``pg_num`` changes and the overhead
associated with moving data around when those adjustments are made.

**AUTOSCALE** is the pool's ``pg_autoscale_mode``, and will be either
``on``, ``off``, or ``warn``.

The *target size* of a pool can be specified in two ways: either in
terms of the absolute size of the pool (i.e., bytes), or as a weight
relative to other pools with a ``target_size_ratio`` set.

For example,::

  ceph osd pool set mypool target_size_bytes 100T

will tell the system that `mypool` is expected to consume 100 TiB of
space. Alternatively,::

  ceph osd pool set mypool target_size_ratio 1.0

will tell the system that `mypool` is expected to consume 1.0 relative
to the other pools with ``target_size_ratio`` set. If `mypool` is the
only pool in the cluster, this means an expected use of 100% of the
total capacity. If there is a second pool with ``target_size_ratio``
1.0, both pools would expect to use 50% of the cluster capacity.
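
For illustration, that 50/50 split could be set up as follows
(``otherpool`` is an assumed name for the second pool)::

  ceph osd pool set mypool target_size_ratio 1.0
  ceph osd pool set otherpool target_size_ratio 1.0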

You can also set the target size of a pool at creation time with the
optional ``--target-size-bytes <bytes>`` or ``--target-size-ratio
<ratio>`` arguments to the ``ceph osd pool create`` command.
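
For example (the pool name and size are assumed for illustration)::

  ceph osd pool create mypool --target-size-bytes 100T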

Note that if impossible target size values are specified (for example,
a capacity larger than the total cluster) then a health warning
(``POOL_TARGET_SIZE_BYTES_OVERCOMMITTED``) will be raised.

If both ``target_size_ratio`` and ``target_size_bytes`` are specified
for a pool, only the ratio will be considered, and a health warning
(``POOL_HAS_TARGET_SIZE_BYTES_AND_RATIO``) will be issued.
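
One way to clear that warning, assuming the ratio is the intended
setting, is to remove the byte-based target by setting it to zero::

  ceph osd pool set mypool target_size_bytes 0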

Specifying bounds on a pool's PGs
---------------------------------

When creating a new pool with::

  ceph osd pool create {pool-name} [pg_num]

it is optional to choose the value of ``pg_num``. If you do not
specify ``pg_num``, the cluster can (by default) automatically tune it
for you based on how much data is stored in the pool (see above,
:ref:`pg-autoscaler`).

Alternatively, ``pg_num`` can be explicitly provided. However,
whether you specify a ``pg_num`` value or not does not affect whether
the value is automatically tuned by the cluster after the fact. To
enable or disable auto-tuning::

  ceph osd pool set {pool-name} pg_autoscale_mode (on|off|warn)
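
For example, to turn off automatic tuning for an existing pool
(``mypool`` is an assumed pool name)::

  ceph osd pool set mypool pg_autoscale_mode off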

The "rule of thumb" for PGs per OSD has traditionally been 100. With
the addition of the balancer (which is also enabled by default), a
value of more like 50 PGs per OSD is probably reasonable. The
challenge (which the autoscaler normally handles for you) is to:

- have the PGs per pool proportional to the data in the pool, and
- end up with 50-100 PGs per OSD, after the replication or
  erasure-coding fan-out of each PG across OSDs is taken into
  consideration (see the worked example below).
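
As a rough worked example (the numbers are assumed for illustration):
with 10 OSDs, a target of 100 PGs per OSD, and a single 3-replica
pool holding all of the data, the pool's ``pg_num`` would come out
near::

  pg_num = (10 OSDs * 100 PGs per OSD) / 3 replicas = ~333

which would then be rounded to a nearby power of 2 (256 or 512).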

How are Placement Groups used?
===============================