Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
http://docs.ceph.com/docs/luminous/start/hardware-recommendations/[Ceph's website].
.CPU
-As higher the core frequency the better, this will reduce latency. Among other
-things, this benefits the services of Ceph, as they can process data faster.
-To simplify planning, you should assign a CPU core (or thread) to each Ceph
-service to provide enough resources for stable and durable Ceph performance.
+Higher CPU core frequency reduces latency and should be preferred. As a simple
+rule of thumb, you should assign a CPU core (or thread) to each Ceph service to
+provide enough resources for stable and durable Ceph performance.
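The core-per-service rule above can be sketched as a quick tally; the service
counts below are made-up figures for a small hyper-converged node, not a
recommended layout:

```python
# Reserve one CPU core (or thread) per Ceph service running on a node.
# Service counts here are illustrative assumptions only.
services = {"osd": 4, "mon": 1, "mgr": 1, "mds": 1}

# Cores to set aside for Ceph; the rest remain for VM / container workloads.
ceph_cores = sum(services.values())
print(ceph_cores)  # 7
```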
.Memory
Especially in a hyper-converged setup, the memory consumption needs to be
-carefully monitored. In addition to the intended workload (VM / Container),
-Ceph needs enough memory to provide good and stable performance. As a rule of
-thumb, for roughly 1TiB of data, 1 GiB of memory will be used by an OSD. With
-additionally needed memory for OSD caching.
+carefully monitored. In addition to the intended workload from virtual machines
+and containers, Ceph needs enough memory available to provide good and stable
+performance. As a rule of thumb, for roughly 1 TiB of data, 1 GiB of memory
+will be used by an OSD. OSD caching will use additional memory.
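The rule of thumb above lends itself to a back-of-the-envelope calculation.
The node size below is a made-up example, and the per-OSD cache figure is an
assumption based on BlueStore's default cache target of roughly 4 GiB per OSD:

```python
# Estimate the RAM a node should reserve for Ceph:
# ~1 GiB per TiB of stored data, plus a cache target per OSD (assumed 4 GiB).
def ceph_ram_gib(data_tib, num_osds, cache_gib_per_osd=4):
    """Rough RAM budget in GiB for the OSDs on a single node."""
    return data_tib + num_osds * cache_gib_per_osd

# Hypothetical node with 4 OSDs of 4 TiB each:
print(ceph_ram_gib(data_tib=16, num_osds=4))  # 32
```

Whatever the estimate, the memory for VMs and containers must come on top of
this budget, not out of it.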
.Network
We recommend a network bandwidth of at least 10 GbE or more, which is used
footnote:[Full Mesh Network for Ceph {webwiki-url}Full_Mesh_Network_for_Ceph_Server]
is also an option if there are no 10 GbE switches available.
-To be explicit about the network, since Ceph is a distributed network storage,
-its traffic must be put on its own physical network. The volume of traffic
-especially during recovery will interfere with other services on the same
-network.
+The volume of traffic, especially during recovery, will interfere with other
+services on the same network and may even break the {pve} cluster stack.
Further, estimate your bandwidth needs. While one HDD might not saturate a 1 Gb
-link, a SSD or a NVMe SSD certainly can. Modern NVMe SSDs will even saturate 10
-Gb of bandwidth. You also should consider higher bandwidths, as these tend to
-come with lower latency.
+link, multiple HDD OSDs per node can, and modern NVMe SSDs will even saturate
+10 Gbps of bandwidth quickly. Deploying a network capable of even more
+bandwidth will ensure that it isn't your bottleneck and won't be anytime soon;
+25, 40 or even 100 Gbps are possible.
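A rough saturation check makes the estimate concrete. The per-device throughput
figures below are illustrative assumptions, not measured values:

```python
# Back-of-the-envelope check: can one node's OSDs saturate its NIC?
def link_saturated(num_osds, mb_per_sec_per_osd, link_gbps):
    """True if the combined OSD throughput exceeds the link's capacity."""
    osd_bits_per_sec = num_osds * mb_per_sec_per_osd * 8e6  # MB/s -> bit/s
    return osd_bits_per_sec > link_gbps * 1e9

print(link_saturated(4, 200, 10))   # four SATA SSDs vs 10 Gbps -> False
print(link_saturated(2, 3000, 10))  # two NVMe SSDs vs 10 Gbps  -> True
```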
.Disks
When planning the size of your Ceph cluster, it is important to take the
setups to reduce recovery time, minimizing the likelihood of a subsequent
failure event during recovery.
-In general SSDs will provide more IOPs then spinning disks. This fact and the
+In general, SSDs will provide more IOPS than spinning disks. This fact and the
higher cost may make a xref:pve_ceph_device_classes[class based] separation of
-pools appealing. Another possibility to speedup OSDs is to use a faster disk
+pools appealing. Another possibility to speed up OSDs is to use a faster disk
as journal or DB/WAL device, see xref:pve_ceph_osds[creating Ceph OSDs]. If a
faster disk is used for multiple OSDs, a proper balance between OSD and WAL /
DB (or journal) disk must be selected, otherwise the faster disk becomes the
bottleneck for all linked OSDs.
Aside from the disk type, Ceph best performs with an even sized and distributed
-amount of disks per node. For example, 4x disks à 500 GB in each node.
+amount of disks per node. For example, 4 x 500 GB disks in each node is
+better than a mixed setup with a single 1 TB and three 250 GB disks.
+
+One also needs to balance OSD count and single OSD capacity. More capacity
+allows you to increase storage density, but it also means that a single OSD
+failure forces Ceph to recover more data at once.
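The trade-off can be put into numbers. The capacities and fill ratio below are
made-up examples, assuming Ceph must re-replicate roughly the failed OSD's used
data:

```python
# Illustrative trade-off: bigger OSDs mean more data to rebuild per failure.
def recovery_tib(osd_capacity_tib, fill_ratio=0.7):
    """Approximate data volume Ceph rebuilds after losing one OSD."""
    return osd_capacity_tib * fill_ratio

print(recovery_tib(8))  # one large 8 TiB OSD -> 5.6 TiB to recover
print(recovery_tib(2))  # one small 2 TiB OSD -> 1.4 TiB to recover
```

With the same raw capacity per node, more and smaller OSDs keep individual
recovery events smaller, at the cost of managing more devices.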
.Avoid RAID
As Ceph handles data object redundancy and multiple parallel writes to disks
WARNING: Avoid RAID controllers; use a host bus adapter (HBA) instead.
NOTE: Above recommendations should be seen as a rough guidance for choosing
-hardware. Therefore, it is still essential to test your setup and monitor
-health & performance.
+hardware. Therefore, it is still essential to adapt them to your specific
+needs, to test your setup, and to monitor health and performance continuously.