=================
Prometheus plugin
=================

Provides a Prometheus exporter to pass on Ceph performance counters
from the collection point in ceph-mgr. Ceph-mgr receives MMgrReport
messages from all MgrClient processes (mons and OSDs, for instance)
with performance counter schema data and actual counter data, and keeps
a circular buffer of the last N samples. This plugin creates an HTTP
endpoint (like all Prometheus exporters) and retrieves the latest sample
of every counter when polled (or "scraped" in Prometheus terminology).
The HTTP path and query parameters are ignored; all extant counters
for all reporting entities are returned in text exposition format.
(See the Prometheus `documentation <https://prometheus.io/docs/instrumenting/exposition_formats/#text-format-details>`_.)
15 | ||
3efd9988 FG |
16 | Enabling prometheus output |
17 | ========================== | |
c07f9fc5 FG |
18 | |
19 | The *prometheus* module is enabled with:: | |
20 | ||
21 | ceph mgr module enable prometheus | |
22 | ||
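
To confirm that the module is enabled, list the active modules::

    ceph mgr module ls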

Configuration
-------------

By default the module will accept HTTP requests on port ``9283`` on all
IPv4 and IPv6 addresses on the host. The port and listen address are both
configurable with ``ceph config-key set``, with keys
``mgr/prometheus/server_addr`` and ``mgr/prometheus/server_port``.
This port is registered with Prometheus's `registry <https://github.com/prometheus/prometheus/wiki/Default-port-allocations>`_.
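
For example, to set the listen address and port explicitly (the values
shown here are only an illustration)::

    ceph config-key set mgr/prometheus/server_addr 192.168.0.1
    ceph config-key set mgr/prometheus/server_port 9283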

Statistic names and labels
==========================

The names of the stats are exactly as Ceph names them, with
illegal characters ``.``, ``-`` and ``::`` translated to ``_``,
and ``ceph_`` prefixed to all names.
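
For example, a daemon counter that Ceph reports as
``rocksdb.submit_latency`` (the counter name is illustrative) would be
exported as::

    ceph_rocksdb_submit_latency{ceph_daemon="osd.0"} 0.0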


All *daemon* statistics have a ``ceph_daemon`` label such as "osd.123"
that identifies the type and ID of the daemon they come from. Some
statistics can come from different types of daemon, so when querying
e.g. an OSD's RocksDB stats, you would probably want to filter
on ``ceph_daemon`` starting with "osd" to avoid mixing in the monitor
RocksDB stats.
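
For example (reusing the illustrative counter name from above), a query
of this form matches only the OSD series::

    ceph_rocksdb_submit_latency{ceph_daemon=~"osd.*"}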


The *cluster* statistics (i.e. those global to the Ceph cluster)
have labels appropriate to what they report on. For example,
metrics relating to pools have a ``pool_id`` label.

Pool and OSD metadata series
----------------------------

Special series are output to enable displaying and querying on
certain metadata fields.

Pools have a ``ceph_pool_metadata`` field like this:

::

    ceph_pool_metadata{pool_id="2",name="cephfs_metadata_a"} 0.0

OSDs have a ``ceph_osd_metadata`` field like this:

::

    ceph_osd_metadata{cluster_addr="172.21.9.34:6802/19096",device_class="ssd",id="0",public_addr="172.21.9.34:6801/19096",weight="1.0"} 0.0
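
These metadata series can be used to filter other metrics in queries.
For example, assuming a pool statistic named ``ceph_pool_objects`` (the
name is illustrative), this selects a pool's statistics by name rather
than by ID::

    ceph_pool_objects and on (pool_id) ceph_pool_metadata{name="cephfs_metadata_a"}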


Correlating drive statistics with node_exporter
-----------------------------------------------

The prometheus output from Ceph is designed to be used in conjunction
with the generic host monitoring from the Prometheus node_exporter.

To enable correlation of Ceph OSD statistics with node_exporter's
drive statistics, special series are output like this:

::

    ceph_disk_occupation{ceph_daemon="osd.0",device="sdd",instance="myhost",job="ceph"}

To use this to get disk statistics by OSD ID, use the ``and on`` syntax
in your prometheus query like this:

::

    rate(node_disk_bytes_written[30s]) and on (device,instance) ceph_disk_occupation{ceph_daemon="osd.0"}

See the prometheus documentation for more information about constructing
queries.

Note that for this mechanism to work, Ceph and node_exporter must agree
about the values of the ``instance`` label. See the following section
for guidance about how to set up Prometheus in a way that sets
``instance`` properly.

Configuring Prometheus server
=============================

See the prometheus documentation for full details of how to add
scrape endpoints: the notes in this section are tips on how to configure
Prometheus to capture the Ceph statistics in the most usefully-labelled
form.

This configuration is necessary because Ceph is reporting metrics
from many hosts and services via a single endpoint, and because some
metrics relate to no physical host at all (such as pool statistics).

honor_labels
------------

To enable Ceph to output properly-labelled data relating to any host,
use the ``honor_labels`` setting when adding the ceph-mgr endpoints
to your prometheus configuration.

Without this setting, any ``instance`` labels that Ceph outputs, such
as those in ``ceph_disk_occupation`` series, will be overridden
by Prometheus.
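
For example (a minimal sketch; the full configuration appears later in
this document)::

    scrape_configs:
      - job_name: 'ceph'
        honor_labels: true
        static_configs:
          - targets: ['senta04.mydomain.com:9283']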

Ceph instance label
-------------------

By default, Prometheus applies an ``instance`` label that includes
the hostname and port of the endpoint that the series came from. Because
Ceph clusters have multiple manager daemons, this results in an ``instance``
label that changes spuriously when the active manager daemon changes.

Set a custom ``instance`` label in your Prometheus target configuration:
you might wish to set it to the hostname of your first monitor, or something
completely arbitrary like "ceph_cluster".

node_exporter instance labels
-----------------------------

Set your ``instance`` labels to match what appears in Ceph's OSD metadata
in the ``hostname`` field. This is generally the short hostname of the node.
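
To check the hostname Ceph has recorded for an OSD, inspect its metadata
(OSD id 0 is used here as an example)::

    ceph osd metadata 0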

This is only necessary if you want to correlate Ceph stats with host
stats, but you may find it useful to set the labels this way in all
cases, in case you want to do the correlation in the future.

Example configuration
---------------------

This example shows a single node configuration running ceph-mgr and
node_exporter on a server called ``senta04``.

This is just an example: there are other ways to configure prometheus
scrape targets and label rewrite rules.

prometheus.yml
~~~~~~~~~~~~~~

::

    global:
      scrape_interval: 15s
      evaluation_interval: 15s

    scrape_configs:
      - job_name: 'node'
        file_sd_configs:
          - files:
            - node_targets.yml
      - job_name: 'ceph'
        honor_labels: true
        file_sd_configs:
          - files:
            - ceph_targets.yml

ceph_targets.yml
~~~~~~~~~~~~~~~~

::

    [
        {
            "targets": [ "senta04.mydomain.com:9283" ],
            "labels": {
                "instance": "ceph_cluster"
            }
        }
    ]

node_targets.yml
~~~~~~~~~~~~~~~~

::

    [
        {
            "targets": [ "senta04.mydomain.com:9100" ],
            "labels": {
                "instance": "senta04"
            }
        }
    ]

Notes
=====

Counters and gauges are exported; currently histograms and long-running
averages are not. It's possible that Ceph's 2-D histograms could be
reduced to two separate 1-D histograms, and that long-running averages
could be exported as Prometheus' Summary type.

Timestamps, as with many Prometheus exporters, are established by
the server's scrape time (Prometheus expects that it is polling the
actual counter process synchronously). It is possible to supply a
timestamp along with the stat report, but the Prometheus team strongly
advises against this. This means that timestamps will be delayed by
an unpredictable amount; it's not clear if this will be problematic,
but it's worth knowing about.