ceph: troubleshooting: followup reword and small expansion

[pve-docs.git] / pveceph.adoc
diff --git a/pveceph.adoc b/pveceph.adoc

index c90a92e3c49b3820837e758a40953910c54ccb5e..d330dea552848cee6d835430c93eb464de0fb433 100644
--- a/pveceph.adoc
+++ b/pveceph.adoc
@@ -58,13 +58,15 @@ and VMs on the same node is possible.
  To simplify management, we provide 'pveceph' - a tool to install and
  manage {ceph} services on {pve} nodes.
  
-.Ceph consists of a couple of Daemons footnote:[Ceph intro http://docs.ceph.com/docs/master/start/intro/], for use as a RBD storage:
+.Ceph consists of a couple of Daemons footnote:[Ceph intro http://docs.ceph.com/docs/luminous/start/intro/], for use as an RBD storage:
  - Ceph Monitor (ceph-mon)
  - Ceph Manager (ceph-mgr)
  - Ceph OSD (ceph-osd; Object Storage Daemon)
  
-TIP: We recommend to get familiar with the Ceph vocabulary.
-footnote:[Ceph glossary http://docs.ceph.com/docs/luminous/glossary]
+TIP: We highly recommend getting familiar with Ceph's architecture
+footnote:[Ceph architecture http://docs.ceph.com/docs/luminous/architecture/]
+and vocabulary
+footnote:[Ceph glossary http://docs.ceph.com/docs/luminous/glossary].
  
  
  Precondition
@@ -211,7 +213,7 @@ This is the default when creating OSDs in Ceph luminous.
  pveceph createosd /dev/sd[X]
  ----
  
-NOTE: In order to select a disk in the GUI, to be more failsafe, the disk needs
+NOTE: In order to select a disk in the GUI, to be more fail-safe, the disk needs
  to have a GPT footnoteref:[GPT, GPT partition table
  https://en.wikipedia.org/wiki/GUID_Partition_Table] partition table. You can
  create this with `gdisk /dev/sd(x)`. If there is no GPT, you cannot select the
@@ -227,7 +229,7 @@ pveceph createosd /dev/sd[X] -journal_dev /dev/sd[Y]
  ----
  
  NOTE: The DB stores BlueStore’s internal metadata and the WAL is BlueStore’s
-internal journal or write-ahead log. It is recommended to use a fast SSDs or
+internal journal or write-ahead log. It is recommended to use a fast SSD or
  NVRAM for better performance.
  
  
@@ -235,7 +237,7 @@ Ceph Filestore
  ~~~~~~~~~~~~~
  Till Ceph luminous, Filestore was used as storage type for Ceph OSDs. It can
  still be used and might give better performance in small setups, when backed by
-a NVMe SSD or similar.
+an NVMe SSD or similar.
  
  [source,bash]
  ----
@@ -470,7 +472,7 @@ Since Luminous (12.2.x) you can also have multiple active metadata servers
  running, but this is normally only useful for a high count on parallel clients,
  as else the `MDS` seldom is the bottleneck. If you want to set this up please
  refer to the ceph documentation. footnote:[Configuring multiple active MDS
-daemons http://docs.ceph.com/docs/mimic/cephfs/multimds/]
+daemons http://docs.ceph.com/docs/luminous/cephfs/multimds/]
  
  [[pveceph_fs_create]]
  Create a CephFS
@@ -502,7 +504,7 @@ This creates a CephFS named `'cephfs'' using a pool for its data named
  Check the xref:pve_ceph_pools[{pve} managed Ceph pool chapter] or visit the
  Ceph documentation for more information regarding a fitting placement group
  number (`pg_num`) for your setup footnote:[Ceph Placement Groups
-http://docs.ceph.com/docs/mimic/rados/operations/placement-groups/].
+http://docs.ceph.com/docs/luminous/rados/operations/placement-groups/].
  Additionally, the `'--add-storage'' parameter will add the CephFS to the {pve}
  storage configuration after it was created successfully.
  
@@ -534,6 +536,34 @@ with:
  pveceph pool destroy NAME
  ----
  
+
+Ceph monitoring and troubleshooting
+-----------------------------------
+A good start is to continuously monitor the Ceph health from the start of the
+initial deployment, either through the Ceph tools themselves or by accessing
+the status through the {pve} link:api-viewer/index.html[API].
+
+The following Ceph commands can be used to see if the cluster is healthy
+('HEALTH_OK'), if there are warnings ('HEALTH_WARN'), or even errors
+('HEALTH_ERR'). If the cluster is in an unhealthy state, the status commands
+below will also give you an overview of the current events and actions taken.
+
+----
+# single time output
+pve# ceph -s
+# continuously output status changes (press CTRL+C to stop)
+pve# ceph -w
+----
+
+To get a more detailed view, every Ceph service has a log file under
+`/var/log/ceph/`. If the logs do not contain enough detail, the log level can be
+adjusted footnote:[Ceph log and debugging http://docs.ceph.com/docs/luminous/rados/troubleshooting/log-and-debug/].
+
+You can find more information about troubleshooting a Ceph cluster
+footnote:[Ceph troubleshooting http://docs.ceph.com/docs/luminous/rados/troubleshooting/]
+on the Ceph website.
+
+
  ifdef::manvolnum[]
  include::pve-copyright.adoc[]
  endif::manvolnum[]
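
Reviewer note, not part of the patch: the new troubleshooting section suggests monitoring the health state continuously. A minimal sketch of how that check could be scripted is shown below. It assumes the JSON printed by `ceph -s --format json` carries the overall state under `health.status` and the active health checks under `health.checks`; that layout is an assumption and may differ between Ceph releases. The function name `summarize_health` and the sample document are hypothetical.

```python
import json


def summarize_health(status_json: str) -> str:
    """Return a one-line health summary from `ceph -s --format json` text.

    Assumption: the overall state lives in health["status"] and the
    active checks are the keys of health["checks"].
    """
    status = json.loads(status_json)
    health = status.get("health", {})
    state = health.get("status", "UNKNOWN")
    # Sort the check names so the summary is stable between runs.
    checks = sorted(health.get("checks", {}))
    if checks:
        return f"{state}: {', '.join(checks)}"
    return state


# Hypothetical sample input, shaped like the assumed JSON layout.
sample = (
    '{"health": {"status": "HEALTH_WARN",'
    ' "checks": {"OSD_DOWN": {"severity": "HEALTH_WARN"}}}}'
)
print(summarize_health(sample))  # prints "HEALTH_WARN: OSD_DOWN"
```

In practice the input would be the captured output of `ceph -s --format json` on a cluster node, and anything other than `HEALTH_OK` could trigger an alert.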