=========================
 Data Placement Overview
=========================

Ceph stores, replicates and rebalances data objects across a RADOS cluster
dynamically. With many different users storing objects in different pools for
different purposes on countless OSDs, Ceph operations require some data
placement planning. The main data placement planning concepts in Ceph include:

- **Pools:** Ceph stores data within pools, which are logical groups for storing
  objects. Pools manage the number of placement groups, the number of replicas,
  and the CRUSH rule for the pool. To store data in a pool, you must have
  an authenticated user with permissions for the pool. Ceph can snapshot pools.
  See `Pools`_ for additional details; example commands follow this list.

- **Placement Groups:** Ceph maps objects to placement groups. Placement
  groups (PGs) are shards or fragments of a logical object pool that place
  objects as a group into OSDs. Placement groups reduce the amount of
  per-object metadata when Ceph stores the data in OSDs. A larger number of
  placement groups (e.g., 100 per OSD) leads to better balancing. See
  `Placement Groups`_ for additional details; an example of mapping an object
  to its PG follows this list.

- **CRUSH Maps:** CRUSH is a big part of what allows Ceph to scale without
  performance bottlenecks, scalability limits, or a single point of failure.
  CRUSH maps provide the physical topology of the cluster to the CRUSH
  algorithm, which determines where the data for an object and its replicas
  should be stored and how to place it across failure domains for added data
  safety, among other things. See `CRUSH Maps`_ for additional details; an
  example of inspecting the CRUSH map follows this list.

- **Balancer:** The balancer automatically optimizes the distribution of PGs
  across devices to achieve a balanced data distribution, maximizing the
  amount of data that can be stored in the cluster and evenly distributing
  the workload across OSDs. See `Balancer`_ for additional details; example
  commands follow this list.
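
For example, a pool's replica count, access permissions, and snapshots can all
be managed from the command line. The following is a minimal sketch using
standard ``ceph`` commands; the pool name ``mypool``, the user name
``client.alice``, and the values shown are placeholders, and the appropriate
PG count and replica size depend on your cluster::

    ceph osd pool create mypool 128        # create a pool with 128 placement groups
    ceph osd pool set mypool size 3        # keep three replicas of each object
    ceph auth get-or-create client.alice \
        mon 'allow r' osd 'allow rw pool=mypool'   # user with read/write access to the pool
    ceph osd pool mksnap mypool mypool-snap        # snapshot the pool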
35
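
To see placement groups in action, you can ask the cluster which PG, and which
OSDs, a given object name maps to. This is a minimal sketch; ``mypool`` and
``myobject`` are placeholder names::

    ceph osd map mypool myobject    # show the PG and the OSDs this object name maps to
    ceph pg stat                    # one-line summary of PG states across the cluster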
7c673cae
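
The CRUSH hierarchy and rules can be inspected directly. The following is a
minimal sketch using standard tooling; the output file names are placeholders::

    ceph osd crush tree                         # show the CRUSH hierarchy (hosts, racks, OSDs)
    ceph osd getcrushmap -o crushmap.bin        # export the compiled CRUSH map
    crushtool -d crushmap.bin -o crushmap.txt   # decompile it into a readable form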
FG
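
The balancer is managed through the ``ceph balancer`` commands. A minimal
sketch, assuming a cluster whose clients are recent enough for the ``upmap``
mode to be usable::

    ceph balancer status        # check whether the balancer is active and which mode it uses
    ceph balancer mode upmap    # use pg-upmap entries to fine-tune PG placement
    ceph balancer on            # enable automatic background optimization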

When you initially set up a test cluster, you can use the default values. Once
you begin planning for a large Ceph cluster, refer to pools, placement groups
and CRUSH for data placement operations.

.. _Pools: ../pools
.. _Placement Groups: ../placement-groups
.. _CRUSH Maps: ../crush-map
.. _Balancer: ../balancer