[thumbnail="screenshot/gui-ceph-osd-status.png"]
-via GUI or via CLI as follows:
+You can create an OSD either via the {pve} web-interface or via the CLI using
+`pveceph`. For example:
[source,bash]
----
pveceph osd create /dev/sd[X]
----
-TIP: We recommend a Ceph cluster size, starting with 12 OSDs, distributed
-evenly among your, at least three nodes (4 OSDs on each node).
+TIP: We recommend a Ceph cluster with at least three nodes and at least 12
+OSDs, evenly distributed among the nodes.
-If the disk was used before (eg. ZFS/RAID/OSD), to remove partition table, boot
-sector and any OSD leftover the following command should be sufficient.
+If the disk was in use before (for example, for ZFS or as an OSD), you first
+need to zap all traces of that usage. To remove the partition table, boot
+sector and any other OSD leftovers, you can use the following command:
[source,bash]
----
ceph-volume lvm zap /dev/sd[X] --destroy
----
-WARNING: The above command will destroy data on the disk!
+WARNING: The above command will destroy all data on the disk!
.Ceph Bluestore
**G**roups (`PG`, `pg_num`), a collection of objects.
-Create Pools
-~~~~~~~~~~~~
+Create and Edit Pools
+~~~~~~~~~~~~~~~~~~~~~
+
+You can create pools through the command line or the web-interface on each
+{pve} host under **Ceph -> Pools**.
[thumbnail="screenshot/gui-ceph-pools.png"]
When no options are given, we set a default of **128 PGs**, a **size of 3
-replicas** and a **min_size of 2 replicas** for serving objects in a degraded
-state.
-
-NOTE: The default number of PGs works for 2-5 disks. Ceph throws a
-'HEALTH_WARNING' if you have too few or too many PGs in your cluster.
+replicas** and a **min_size of 2 replicas**, to ensure no data loss occurs if
+any OSD fails.
WARNING: **Do not set a min_size of 1**. A replicated pool with min_size of 1
allows I/O on an object when it has only 1 replica which could lead to data
loss, incomplete PGs or unfound objects.
-It is advised to calculate the PG number depending on your setup, you can find
-the formula and the PG calculator footnote:[PG calculator
-https://ceph.com/pgcalc/] online. From Ceph Nautilus onwards it is possible to
-increase and decrease the number of PGs later on footnote:[Placement Groups
-{cephdocs-url}/rados/operations/placement-groups/].
-
+It is advised that you calculate the PG number based on your setup. You can
+find the formula and the PG calculator footnote:[PG calculator
+https://ceph.com/pgcalc/] online. From Ceph Nautilus onward, you can change the
+number of PGs footnoteref:[placement_groups,Placement Groups
+{cephdocs-url}/rados/operations/placement-groups/] after the setup.
-You can create pools through command line or on the GUI on each PVE host under
-**Ceph -> Pools**.
+In addition to manual adjustment, the PG autoscaler
+footnoteref:[autoscaler,Automated Scaling
+{cephdocs-url}/rados/operations/placement-groups/#automated-scaling] can
+automatically scale the PG count for a pool in the background.
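
As a rough illustration of the manual PG calculation mentioned above, the
following sketch applies the commonly cited rule of thumb: total PGs = (number
of OSDs * target PGs per OSD) / replica size, rounded up to the next power of
two. The function name `suggest_pg_num` and the default of 100 PGs per OSD are
assumptions made for this example. The result is a cluster-wide estimate only;
the PG calculator additionally weights each pool by its expected share of data.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of pveceph or ceph.
# Usage: suggest_pg_num <osd_count> <replica_size> [target_pgs_per_osd]
suggest_pg_num() {
    local osds=$1 size=$2 per_osd=${3:-100}
    # Integer ceiling of (osds * per_osd) / size
    local raw=$(( (osds * per_osd + size - 1) / size ))
    # Round up to the next power of two
    local pg=1
    while (( pg < raw )); do
        pg=$(( pg * 2 ))
    done
    echo "$pg"
}

# 12 OSDs with 3 replicas: 1200 / 3 = 400, rounded up to 512
suggest_pg_num 12 3
```

Treat the result as a starting point and compare it with the autoscaler's
suggestion before changing `pg_num` on an existing pool.
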
+.Example for creating a pool over the CLI
[source,bash]
----
-pveceph pool create <name>
-----
-
-If you would like to automatically also get a storage definition for your pool,
-mark the checkbox "Add storages" in the GUI or use the command line option
-'--add_storages' at pool creation.
+pveceph pool create <name> --add_storages
+----
+
+TIP: If you would also like a storage definition to be created automatically
+for your pool, keep the `Add storages' checkbox ticked in the web-interface, or
+use the command line option '--add_storages' at pool creation.
+
+.Base Options
+Name:: The name of the pool. This must be unique and can't be changed afterwards.
+Size:: The number of replicas per object. Ceph always tries to have this many
+copies of an object. Default: `3`.
+PG Autoscale Mode:: The automatic PG scaling mode footnoteref:[autoscaler] of
+the pool. If set to `warn`, it produces a warning message when a pool
+has a non-optimal PG count. Default: `warn`.
+Add as Storage:: Configure a VM or container storage using the new pool.
+Default: `true` (only visible on creation).
+
+.Advanced Options
+Min. Size:: The minimum number of replicas per object. Ceph will reject I/O on
+the pool if a PG has fewer than this many replicas. Default: `2`.
+Crush Rule:: The rule to use for mapping object placement in the cluster. These
+rules define how data is placed within the cluster. See
+xref:pve_ceph_device_classes[Ceph CRUSH & device classes] for information on
+device-based rules.
+# of PGs:: The number of placement groups footnoteref:[placement_groups] that
+the pool should have at the beginning. Default: `128`.
+Target Size Ratio:: The ratio of data that is expected in the pool. The PG
+autoscaler uses this ratio relative to the ratios set on other pools. It takes
+precedence over the `target size` if both are set.
+Target Size:: The estimated amount of data expected in the pool. The PG
+autoscaler uses this size to estimate the optimal PG count.
+Min. # of PGs:: The minimum number of placement groups. This setting is used to
+fine-tune the lower bound of the PG count for that pool. The PG autoscaler
+will not merge PGs below this threshold.
Further information on Ceph pool handling can be found in the Ceph pool
operation footnote:[Ceph pool operation
NOTE: Deleting the data of a pool is a background task and can take some time.
You will notice that the data usage in the cluster is decreasing.
+
+PG Autoscaler
+~~~~~~~~~~~~~
+
+The PG autoscaler allows the cluster to consider the amount of (expected) data
+stored in each pool and to choose the appropriate pg_num values automatically.
+
+You may need to activate the PG autoscaler module before adjustments can take
+effect.
+[source,bash]
+----
+ceph mgr module enable pg_autoscaler
+----
+
+The autoscaler is configured on a per-pool basis and has the following modes:
+
+[horizontal]
+warn:: A health warning is issued if the suggested `pg_num` value differs too
+much from the current value.
+on:: The `pg_num` is adjusted automatically with no need for any manual
+interaction.
+off:: No automatic `pg_num` adjustments are made, and no warning will be issued
+if the PG count is far from optimal.
+
+To account for future data growth, the scaling can be tuned with the
+`target_size`, `target_size_ratio` and `pg_num_min` options.
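
The mode and the tuning options can also be set per pool with the `ceph` CLI.
The following is a minimal sketch: the pool name `mypool`, the example values,
and the `run` dry-run wrapper are assumptions for illustration. By default the
script only prints the commands, so that they can be reviewed before any
change in data placement is triggered.

```shell
#!/usr/bin/env bash
# Hypothetical pool name; override with POOL=<name>
pool="${POOL:-mypool}"

# Print commands by default; set DRY_RUN=0 to actually execute them
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Only warn about a non-optimal PG count, do not adjust automatically
run ceph osd pool set "$pool" pg_autoscale_mode warn
# Expect this pool to hold about 20% of the data (example value)
run ceph osd pool set "$pool" target_size_ratio 0.2
# Never merge below 32 PGs (example value)
run ceph osd pool set "$pool" pg_num_min 32
# Show the autoscaler's current view of all pools
run ceph osd pool autoscale-status
```
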
+
+WARNING: By default, the autoscaler considers tuning the PG count of a pool if
+it is off by a factor of 3. Such an adjustment will lead to a considerable
+shift in data placement and might introduce a high load on the cluster.
+
+You can find a more in-depth introduction to the PG autoscaler on Ceph's Blog -
+https://ceph.io/rados/new-in-nautilus-pg-merging-and-autotuning/[New in
+Nautilus: PG merging and autotuning].
+
+
[[pve_ceph_device_classes]]
Ceph CRUSH & device classes
---------------------------
`'cephfs_metadata'' with one quarter of the data pools placement groups (`32`).
Check the xref:pve_ceph_pools[{pve} managed Ceph pool chapter] or visit the
Ceph documentation for more information regarding a fitting placement group
-number (`pg_num`) for your setup footnote:[Ceph Placement Groups
-{cephdocs-url}/rados/operations/placement-groups/].
+number (`pg_num`) for your setup footnoteref:[placement_groups].
Additionally, the `'--add-storage'' parameter will add the CephFS to the {pve}
-storage configuration after it was created successfully.
+storage configuration after it has been created successfully.
Destroy CephFS
~~~~~~~~~~~~~~