Enabling prometheus output
==========================
-The *prometheus* module is enabled with::
+The *prometheus* module is enabled with:
- ceph mgr module enable prometheus
+.. prompt:: bash $
+
+ ceph mgr module enable prometheus
Configuration
-------------
is registered with Prometheus's `registry
<https://github.com/prometheus/prometheus/wiki/Default-port-allocations>`_.
-::
-
- ceph config set mgr mgr/prometheus/server_addr 0.0.0.0
- ceph config set mgr mgr/prometheus/server_port 9283
+.. prompt:: bash $
+
+ ceph config set mgr mgr/prometheus/server_addr 0.0.0.0
+ ceph config set mgr mgr/prometheus/server_port 9283
.. warning::
might be useful to increase the scrape interval.
To set a different scrape interval in the Prometheus module, set
-``scrape_interval`` to the desired value::
+``scrape_interval`` to the desired value:
- ceph config set mgr mgr/prometheus/scrape_interval 20
+.. prompt:: bash $
+
+ ceph config set mgr mgr/prometheus/scrape_interval 20
On large clusters (>1000 OSDs), the time to fetch the metrics may become
significant. Without the cache, the Prometheus manager module could, especially
to unresponsive or crashing Ceph manager instances. Hence, the cache is enabled
by default. This means that there is a possibility that the cache becomes
stale. The cache is considered stale when the time to fetch the metrics from
-Ceph exceeds the configured :confval:``mgr/prometheus/scrape_interval``.
+Ceph exceeds the configured :confval:`mgr/prometheus/scrape_interval`.
If that is the case, **a warning will be logged** and the module will either
code (service unavailable). You can set other options using the ``ceph config
set`` commands.
-To tell the module to respond with possibly stale data, set it to ``return``::
+To tell the module to respond with possibly stale data, set it to ``return``:
+
+.. prompt:: bash $
ceph config set mgr mgr/prometheus/stale_cache_strategy return
-To tell the module to respond with "service unavailable", set it to ``fail``::
+To tell the module to respond with "service unavailable", set it to ``fail``:
- ceph config set mgr mgr/prometheus/stale_cache_strategy fail
+.. prompt:: bash $
-If you are confident that you don't require the cache, you can disable it::
+ ceph config set mgr mgr/prometheus/stale_cache_strategy fail
- ceph config set mgr mgr/prometheus/cache false
+If you are confident that you don't require the cache, you can disable it:
+
+.. prompt:: bash $
+
+ ceph config set mgr mgr/prometheus/cache false
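
The stale-cache handling described above can be summarized in a short Python sketch. This is not the module's actual code: the function name ``respond_from_cache`` and its parameters are hypothetical, and the HTTP status used for a fresh or deliberately returned cache (200) is an assumption. Only the staleness rule and the two strategy names, ``return`` and ``fail``, come from the text.

```python
import logging

log = logging.getLogger("prometheus_cache_sketch")


def respond_from_cache(fetch_seconds, scrape_interval, strategy, cached_body):
    """Return (http_status, body) for a scrape answered from the cache.

    Hypothetical helper: the cache counts as stale when fetching the
    metrics took longer than the configured scrape_interval.
    """
    stale = fetch_seconds > scrape_interval
    if not stale:
        return 200, cached_body

    # A stale cache is always logged, whatever the strategy.
    log.warning(
        "cache is stale: fetch took %.1fs, scrape_interval is %.1fs",
        fetch_seconds,
        scrape_interval,
    )
    if strategy == "return":
        # Serve the possibly stale data anyway.
        return 200, cached_body
    if strategy == "fail":
        # Answer with "service unavailable".
        return 503, ""
    raise ValueError(f"unknown stale_cache_strategy: {strategy!r}")
```

With a ``scrape_interval`` of 15 seconds, a fetch that took 20 seconds yields a 503 under ``fail`` and the cached body under ``return``.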
If you are using the prometheus module behind some kind of reverse proxy or
loadbalancer, you can simplify discovering the active instance by switching
-to ``error``-mode::
+to ``error``-mode:
+
+.. prompt:: bash $
- ceph config set mgr mgr/prometheus/standby_behaviour error
+ ceph config set mgr mgr/prometheus/standby_behaviour error
-If set, the prometheus module will repond with a HTTP error when requesting ``/``
+If set, the prometheus module will respond with an HTTP error when requesting ``/``
from the standby instance. The default error code is 500, but you can configure
-the HTTP response code with::
+the HTTP response code with:
- ceph config set mgr mgr/prometheus/standby_error_status_code 503
+.. prompt:: bash $
+
+ ceph config set mgr mgr/prometheus/standby_error_status_code 503
Valid error codes range from 400 to 599.
-To switch back to the default behaviour, simply set the config key to ``default``::
+To switch back to the default behaviour, simply set the config key to ``default``:
+
+.. prompt:: bash $
- ceph config set mgr mgr/prometheus/standby_behaviour default
+ ceph config set mgr mgr/prometheus/standby_behaviour default
.. _prometheus-rbd-io-statistics:
of ``pool[/namespace]`` entries. If the namespace is not specified the
statistics are collected for all namespaces in the pool.
-Example to activate the RBD-enabled pools ``pool1``, ``pool2`` and ``poolN``::
+Example to activate the RBD-enabled pools ``pool1``, ``pool2`` and ``poolN``:
- ceph config set mgr mgr/prometheus/rbd_stats_pools "pool1,pool2,poolN"
+.. prompt:: bash $
+
+ ceph config set mgr mgr/prometheus/rbd_stats_pools "pool1,pool2,poolN"
+
+The wildcard can be used to indicate all pools or namespaces:
+
+.. prompt:: bash $
+
+ ceph config set mgr mgr/prometheus/rbd_stats_pools "*"
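
To illustrate the ``pool[/namespace]`` entry format, the following Python sketch splits a ``rbd_stats_pools`` value into pool and namespace pairs. The function name ``parse_rbd_stats_pools`` is hypothetical and is not part of the module. The sketch assumes comma-separated entries, as in the examples above, and does not expand the ``*`` wildcard.

```python
def parse_rbd_stats_pools(value):
    """Split a comma-separated list of pool[/namespace] entries.

    Returns a list of (pool, namespace) tuples. A namespace of None
    means the entry named no namespace, i.e. all namespaces in the pool.
    """
    entries = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # ignore empty entries such as a trailing comma
        pool, sep, namespace = item.partition("/")
        entries.append((pool, namespace if sep else None))
    return entries


print(parse_rbd_stats_pools("pool1,pool2/ns1,poolN"))
```

An entry without a ``/`` has its namespace set to ``None``, matching the rule that statistics are collected for all namespaces when none is specified.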
The module builds the list of all available images by scanning the specified
pools and namespaces and refreshes it periodically. The period is
force refresh earlier if it detects statistics from a previously unknown
RBD image.
-Example to turn up the sync interval to 10 minutes::
+Example to turn up the sync interval to 10 minutes:
+
+.. prompt:: bash $
+
+ ceph config set mgr mgr/prometheus/rbd_stats_pools_refresh_interval 600
+
+Ceph daemon performance counters metrics
+-----------------------------------------
+
+With the introduction of the ``ceph-exporter`` daemon, the prometheus module no longer exports Ceph daemon
+perf counters as prometheus metrics by default. To re-enable exporting these metrics, set
+the module option ``exclude_perf_counters`` to ``false``:
+
+.. prompt:: bash $
- ceph config set mgr mgr/prometheus/rbd_stats_pools_refresh_interval 600
+ ceph config set mgr mgr/prometheus/exclude_perf_counters false
Statistic names and labels
==========================
::
- rate(node_disk_bytes_written[30s]) and
+ rate(node_disk_written_bytes_total[30s]) and
on (device,instance) ceph_disk_occupation_human{ceph_daemon="osd.0"}
Out of the box the above query will not return any metrics since the ``instance`` labels of
::
label_replace(
- rate(node_disk_bytes_written[30s]),
+ rate(node_disk_written_bytes_total[30s]),
"exported_instance",
"$1",
"instance",