From 2f19a6b0b6b16c9502b86239b7cfb25f6b25c509 Mon Sep 17 00:00:00 2001
From: Thomas Lamprecht
Date: Thu, 4 Apr 2019 14:31:31 +0200
Subject: [PATCH] ceph: follow up rewords & fixes

Signed-off-by: Thomas Lamprecht
---
 pveceph.adoc | 43 +++++++++++++++++++++++--------------------
 1 file changed, 23 insertions(+), 20 deletions(-)

diff --git a/pveceph.adoc b/pveceph.adoc
index bfe6a62..c1c9657 100644
--- a/pveceph.adoc
+++ b/pveceph.adoc
@@ -79,17 +79,16 @@ Check also the recommendations from
 http://docs.ceph.com/docs/luminous/start/hardware-recommendations/[Ceph's website].
 
 .CPU
-As higher the core frequency the better, this will reduce latency. Among other
-things, this benefits the services of Ceph, as they can process data faster.
-To simplify planning, you should assign a CPU core (or thread) to each Ceph
-service to provide enough resources for stable and durable Ceph performance.
+Higher CPU core frequency reduces latency and should be preferred. As a simple
+rule of thumb, you should assign a CPU core (or thread) to each Ceph service to
+provide enough resources for stable and durable Ceph performance.
 
 .Memory
 Especially in a hyper-converged setup, the memory consumption needs to be
-carefully monitored. In addition to the intended workload (VM / Container),
-Ceph needs enough memory to provide good and stable performance. As a rule of
-thumb, for roughly 1TiB of data, 1 GiB of memory will be used by an OSD. With
-additionally needed memory for OSD caching.
+carefully monitored. In addition to the intended workload from virtual machines
+and containers, Ceph needs enough memory available to provide good and stable
+performance. As a rule of thumb, for roughly 1 TiB of data, 1 GiB of memory
+will be used by an OSD. OSD caching will use additional memory.
 
 .Network
 We recommend a network bandwidth of at least 10 GbE or more, which is used
@@ -97,15 +96,14 @@ exclusively for Ceph. A meshed network setup
 footnote:[Full Mesh Network for Ceph {webwiki-url}Full_Mesh_Network_for_Ceph_Server]
 is also an option if there are no 10 GbE switches available.
 
-To be explicit about the network, since Ceph is a distributed network storage,
-its traffic must be put on its own physical network. The volume of traffic
-especially during recovery will interfere with other services on the same
-network.
+The volume of traffic, especially during recovery, will interfere with other
+services on the same network and may even break the {pve} cluster stack.
 
 Further, estimate your bandwidth needs. While one HDD might not saturate a 1 Gb
-link, a SSD or a NVMe SSD certainly can. Modern NVMe SSDs will even saturate 10
-Gb of bandwidth. You also should consider higher bandwidths, as these tend to
-come with lower latency.
+link, multiple HDD OSDs per node can, and modern NVMe SSDs will even saturate
+10 Gbps of bandwidth quickly. Deploying a network capable of even more
+bandwidth will ensure that it isn't your bottleneck and won't be anytime soon;
+25, 40 or even 100 Gbps are possible.
 
 .Disks
 When planning the size of your Ceph cluster, it is important to take the
@@ -114,16 +112,21 @@ might take long. It is recommended that you use SSDs instead of HDDs in small
 setups to reduce recovery time, minimizing the likelihood of a subsequent
 failure event during recovery.
 
-In general SSDs will provide more IOPs then spinning disks. This fact and the
+In general SSDs will provide more IOPs than spinning disks. This fact and the
 higher cost may make a xref:pve_ceph_device_classes[class based] separation of
-pools appealing. 
+pools appealing. Another possibility to speedup OSDs is to use a faster disk
 as journal or DB/WAL device, see xref:pve_ceph_osds[creating Ceph OSDs]. If a
 faster disk is used for multiple OSDs, a proper balance between OSD and WAL /
 DB (or journal) disk must be selected, otherwise the faster disk becomes the
 bottleneck for all linked OSDs.
 
 Aside from the disk type, Ceph best performs with an even sized and distributed
-amount of disks per node. For example, 4x disks à 500 GB in each node.
+amount of disks per node. For example, 4 x 500 GB disks in each node is
+better than a mixed setup with a single 1 TB and three 250 GB disks.
+
+One also needs to balance OSD count and single OSD capacity. More capacity
+allows you to increase storage density, but it also means that a single OSD
+failure forces Ceph to recover more data at once.
 
 .Avoid RAID
 As Ceph handles data object redundancy and multiple parallel writes to disks
@@ -137,8 +140,8 @@ the ones from Ceph.
 WARNING: Avoid RAID controller, use host bus adapter (HBA) instead.
 
 NOTE: Above recommendations should be seen as a rough guidance for choosing
-hardware. Therefore, it is still essential to test your setup and monitor
-health & performance.
+hardware. Therefore, it is still essential to adapt them to your specific
+needs, test your setup and monitor health and performance continuously.
 
 
 [[pve_ceph_install]]
-- 
2.39.2
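Illustration (not part of the patch): the memory and bandwidth rules of thumb reworded above can be turned into a quick per-node estimate. The sketch below is a minimal example of that arithmetic; the OSD count and size, the per-OSD cache overhead, the guest memory reservation and the per-HDD throughput are all assumed values chosen for the example, not recommendations from the patch.

[source,python]
----
# Rough hyper-converged node sizing sketch, following the rules of thumb above:
# ~1 GiB RAM per 1 TiB of OSD data plus extra memory for OSD caching, and a
# handful of OSDs can already saturate a 10 Gbps link during recovery.
# All concrete numbers are assumptions for illustration only.

OSD_COUNT = 4            # e.g. 4 evenly sized OSDs per node
OSD_SIZE_TIB = 4         # assumed capacity per OSD
GUEST_RAM_GIB = 64       # assumed memory reserved for VMs / containers
CACHE_GIB_PER_OSD = 4    # assumed OSD caching / overhead per OSD
HDD_MB_PER_S = 200       # assumed sequential throughput of one HDD OSD

# Memory: rule-of-thumb RAM for OSD data plus caching plus guest workload.
osd_ram_gib = OSD_COUNT * OSD_SIZE_TIB * 1.0
total_ram_gib = GUEST_RAM_GIB + osd_ram_gib + OSD_COUNT * CACHE_GIB_PER_OSD
print(f"Plan for at least {total_ram_gib:.0f} GiB RAM on this node")

# Network: aggregate throughput the node's OSDs can push during recovery.
node_gbps = OSD_COUNT * HDD_MB_PER_S * 8 / 1000
print(f"{OSD_COUNT} HDD OSDs can push roughly {node_gbps:.1f} Gbps during recovery")
----

With these example values the node would need on the order of 96 GiB of RAM, and its four HDD OSDs alone could drive about 6.4 Gbps, which is why a dedicated 10 Gbps (or faster) Ceph network is recommended.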