=====================================
 Configuring Monitor/OSD Interaction
=====================================

.. index:: heartbeat

After you have completed your initial Ceph configuration, you may deploy and run
Ceph. When you execute a command such as ``ceph health`` or ``ceph -s``, the
:term:`Ceph Monitor` reports on the current state of the :term:`Ceph Storage
Cluster`. The Ceph Monitor knows about the Ceph Storage Cluster by requiring
reports from each :term:`Ceph OSD Daemon`, and by receiving reports from Ceph
OSD Daemons about the status of their neighboring Ceph OSD Daemons. If the Ceph
Monitor doesn't receive reports, or if it receives reports of changes in the
Ceph Storage Cluster, the Ceph Monitor updates the status of the :term:`Ceph
Cluster Map`.

Ceph provides reasonable default settings for Ceph Monitor/Ceph OSD Daemon
interaction. However, you may override the defaults. The following sections
describe how Ceph Monitors and Ceph OSD Daemons interact for the purposes of
monitoring the Ceph Storage Cluster.

.. index:: heartbeat interval

OSDs Check Heartbeats
=====================

Each Ceph OSD Daemon checks the heartbeat of other Ceph OSD Daemons every 6
seconds. You can change the heartbeat interval by adding an
``osd heartbeat interval`` setting under the ``[osd]`` section of your Ceph
configuration file, or by setting the value at runtime. If a neighboring Ceph
OSD Daemon doesn't show a heartbeat within a 20 second grace period, the Ceph
OSD Daemon may consider the neighboring Ceph OSD Daemon ``down`` and report it
back to a Ceph Monitor, which will update the Ceph Cluster Map. You may change
this grace period by adding an ``osd heartbeat grace`` setting under both the
``[mon]`` and ``[osd]`` sections, or under the ``[global]`` section, of your
Ceph configuration file, or by setting the value at runtime.
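
As a minimal sketch, the heartbeat settings described above could be written in
a Ceph configuration file as shown below. The values are the defaults quoted in
this section (6 and 20 seconds), spelled out explicitly for illustration rather
than as recommendations.

.. code-block:: ini

    [osd]
    # How often (in seconds) an OSD checks the heartbeat of its peers.
    osd heartbeat interval = 6
    # Seconds without a heartbeat before a peer may be considered down.
    osd heartbeat grace = 20

    [mon]
    # Same grace value as in [osd], so that monitors and OSDs
    # read an identical threshold.
    osd heartbeat grace = 20

With these values, a silent peer misses roughly three consecutive heartbeat
checks before the grace period expires.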

.. ditaa:: +---------+          +---------+
           |  OSD 1  |          |  OSD 2  |
           +---------+          +---------+
                |                    |
                |----+ Heartbeat     |
                |    | Interval      |
                |<---+ Exceeded      |
                |                    |
                |       Check        |
                |     Heartbeat      |
                |------------------->|
                |                    |
                |<-------------------|
                |   Heart Beating    |
                |                    |
                |----+ Heartbeat     |
                |    | Interval      |
                |<---+ Exceeded      |
                |                    |
                |       Check        |
                |     Heartbeat      |
                |------------------->|
                |                    |
                |----+ Grace         |
                |    | Period        |
                |<---+ Exceeded      |
                |                    |
                |----+ Mark          |
                |    | OSD 2         |
                |<---+ Down          |

.. index:: OSD down report

OSDs Report Down OSDs
=====================

By default, two Ceph OSD Daemons from different hosts must report to the Ceph
Monitors that another Ceph OSD Daemon is ``down`` before the Ceph Monitors
acknowledge that the reported Ceph OSD Daemon is ``down``. However, there is a
chance that all the OSDs reporting the failure are hosted in a rack with a bad
switch that has trouble connecting to another OSD. To avoid this sort of false
alarm, Ceph treats the peers reporting a failure as a proxy for a potential
"subcluster" of the overall cluster that is similarly laggy. This is not true
in all cases, but it sometimes helps localize the grace correction to the
subset of the system that is unhappy. The ``mon osd reporter subtree level``
setting groups the peers into "subclusters" by their common ancestor type in
the CRUSH map. By default, only two reports from different subtrees are
required to report another Ceph OSD Daemon ``down``. You can change the number
of reporters from unique subtrees, and the common ancestor type, required to
report a Ceph OSD Daemon ``down`` to a Ceph Monitor by adding the
``mon osd min down reporters`` and ``mon osd reporter subtree level`` settings
under the ``[mon]`` section of your Ceph configuration file, or by setting the
values at runtime.
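
For illustration, a ``[mon]`` section that requires failure reports from
several racks before an OSD is marked ``down`` might look like the sketch
below. The values ``3`` and ``rack`` are hypothetical choices, not
recommendations (the defaults are ``2`` and ``host``), and ``rack`` assumes
that your CRUSH map defines a bucket type with that name.

.. code-block:: ini

    [mon]
    # Require reporters from three distinct subtrees (default: 2).
    mon osd min down reporters = 3
    # Count reporters per rack instead of per host (default: host).
    mon osd reporter subtree level = rack

Counting reporters per rack means that several OSDs behind the same faulty
switch count as a single reporter.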

.. ditaa:: +---------+     +---------+      +---------+
           |  OSD 1  |     |  OSD 2  |      | Monitor |
           +---------+     +---------+      +---------+
                |               |                |
                | OSD 3 Is Down |                |
                |---------------+--------------->|
                |               |                |
                |               |                |
                |               | OSD 3 Is Down  |
                |               |--------------->|
                |               |                |
                |               |                |
                |               |                |---------+ Mark
                |               |                |         | OSD 3
                |               |                |<--------+ Down

.. index:: peering failure

OSDs Report Peering Failure
===========================

If a Ceph OSD Daemon cannot peer with any of the Ceph OSD Daemons defined in its
Ceph configuration file (or the cluster map), it will ping a Ceph Monitor for
the most recent copy of the cluster map every 30 seconds. You can change the
Ceph Monitor heartbeat interval by adding an ``osd mon heartbeat interval``
setting under the ``[osd]`` section of your Ceph configuration file, or by
setting the value at runtime.
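
A minimal configuration sketch for this setting follows. The value shown is the
default of 30 seconds, written out only to illustrate where the setting goes.

.. code-block:: ini

    [osd]
    # Seconds between pings to a monitor when the OSD cannot peer
    # with any other OSD (the value shown is the default).
    osd mon heartbeat interval = 30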

.. ditaa:: +---------+     +---------+     +-------+     +---------+
           |  OSD 1  |     |  OSD 2  |     | OSD 3 |     | Monitor |
           +---------+     +---------+     +-------+     +---------+
                |               |              |              |
                |  Request To   |              |              |
                |     Peer      |              |              |
                |-------------->|              |              |
                |<--------------|              |              |
                |    Peering                   |              |
                |                              |              |
                |  Request To                  |              |
                |     Peer                     |              |
                |----------------------------->|              |
                |                                             |
                |----+ OSD Monitor                            |
                |    | Heartbeat                              |
                |<---+ Interval Exceeded                      |
                |                                             |
                |         Failed to Peer with OSD 3           |
                |-------------------------------------------->|
                |<--------------------------------------------|
                |          Receive New Cluster Map            |

.. index:: OSD status

OSDs Report Their Status
========================

If a Ceph OSD Daemon doesn't report to a Ceph Monitor, the Ceph Monitor will
consider the Ceph OSD Daemon ``down`` after the ``mon osd report timeout``
elapses. When a reportable event occurs, such as a failure, a change in
placement group stats, a change in ``up_thru``, or a boot, a Ceph OSD Daemon
sends a report to a Ceph Monitor within 5 seconds. You can change the Ceph OSD
Daemon minimum report interval by adding an ``osd mon report interval min``
setting under the ``[osd]`` section of your Ceph configuration file, or by
setting the value at runtime. A Ceph OSD Daemon also sends a report to a Ceph
Monitor every 120 seconds irrespective of whether any notable changes occur.
You can change the Ceph Monitor report interval by adding an
``osd mon report interval max`` setting under the ``[osd]`` section of your
Ceph configuration file, or by setting the value at runtime.
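
The three timers described above can be sketched in a configuration file as
follows. All values are the defaults listed in the settings reference on this
page and are shown only to illustrate which section each setting belongs to.

.. code-block:: ini

    [osd]
    # Minimum report interval (seconds) after startup or another
    # reportable event.
    osd mon report interval min = 5
    # Report at least this often (seconds), even if nothing changed.
    osd mon report interval max = 120

    [mon]
    # Seconds without any report before the monitor considers the
    # OSD down.
    mon osd report timeout = 900

Note that the minimum interval should be less than the maximum interval, and
the monitor timeout here is several times longer than the maximum interval.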

.. ditaa:: +---------+          +---------+
           |  OSD 1  |          | Monitor |
           +---------+          +---------+
                |                    |
                |----+ Report Min    |
                |    | Interval      |
                |<---+ Exceeded      |
                |                    |
                |----+ Reportable    |
                |    | Event         |
                |<---+ Occurs        |
                |                    |
                |     Report To      |
                |      Monitor       |
                |------------------->|
                |                    |
                |----+ Report Max    |
                |    | Interval      |
                |<---+ Exceeded      |
                |                    |
                |     Report To      |
                |      Monitor       |
                |------------------->|
                |                    |
                |----+ Monitor       |
                |    | Fails         |
                |<---+               |
                                     +----+ Monitor OSD
                                     |    | Report Timeout
                                     |<---+ Exceeded
                                     |
                                     +----+ Mark
                                     |    | OSD 1
                                     |<---+ Down


Configuration Settings
======================

When modifying heartbeat settings, you should include them in the ``[global]``
section of your configuration file.
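
For example, a ``[global]`` section carrying the heartbeat settings might look
like the sketch below. The values are the documented defaults and serve only as
placeholders.

.. code-block:: ini

    [global]
    # Settings placed here are read by both monitors and OSDs.
    osd heartbeat interval = 6
    osd heartbeat grace = 20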

.. index:: monitor heartbeat

Monitor Settings
----------------

``mon osd min up ratio``

:Description: The minimum ratio of ``up`` Ceph OSD Daemons before Ceph will
              mark Ceph OSD Daemons ``down``.

:Type: Double
:Default: ``.3``


``mon osd min in ratio``

:Description: The minimum ratio of ``in`` Ceph OSD Daemons before Ceph will
              mark Ceph OSD Daemons ``out``.

:Type: Double
:Default: ``.75``


``mon osd laggy halflife``

:Description: The half-life, in seconds, over which laggy estimates decay.
:Type: Integer
:Default: ``60*60``


``mon osd laggy weight``

:Description: The weight for new samples in laggy estimation decay.
:Type: Double
:Default: ``0.3``


``mon osd laggy max interval``

:Description: The maximum value of ``laggy_interval`` in laggy estimations (in
              seconds). The Monitor uses an adaptive approach to evaluate the
              ``laggy_interval`` of a given OSD. This value is used to
              calculate the grace time for that OSD.
:Type: Integer
:Default: ``300``


``mon osd adjust heartbeat grace``

:Description: If set to ``true``, Ceph will scale the heartbeat grace period
              based on laggy estimations.
:Type: Boolean
:Default: ``true``


``mon osd adjust down out interval``

:Description: If set to ``true``, Ceph will scale the down/out interval based
              on laggy estimations.
:Type: Boolean
:Default: ``true``


``mon osd auto mark in``

:Description: Ceph will mark any booting Ceph OSD Daemons as ``in``
              the Ceph Storage Cluster.

:Type: Boolean
:Default: ``false``


``mon osd auto mark auto out in``

:Description: Ceph will mark booting Ceph OSD Daemons that were automatically
              marked ``out`` of the Ceph Storage Cluster as ``in`` the cluster.

:Type: Boolean
:Default: ``true``


``mon osd auto mark new in``

:Description: Ceph will mark booting new Ceph OSD Daemons as ``in`` the
              Ceph Storage Cluster.

:Type: Boolean
:Default: ``true``


``mon osd down out interval``

:Description: The number of seconds Ceph waits before marking a Ceph OSD Daemon
              ``down`` and ``out`` if it doesn't respond.

:Type: 32-bit Integer
:Default: ``600``


``mon osd down out subtree limit``

:Description: The smallest :term:`CRUSH` unit type that Ceph will **not**
              automatically mark out. For instance, if set to ``host`` and if
              all OSDs of a host are down, Ceph will not automatically mark out
              these OSDs.

:Type: String
:Default: ``rack``
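
The ``host`` case mentioned in the description above could be configured as in
the following sketch. The interval is the documented default; ``host`` is the
illustrative value from the description, not the default (``rack``).

.. code-block:: ini

    [mon]
    # Seconds to wait before marking an unresponsive OSD down and out
    # (the value shown is the default).
    mon osd down out interval = 600
    # Do not automatically mark out OSDs when a whole host is down.
    mon osd down out subtree limit = host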


``mon osd report timeout``

:Description: The grace period in seconds before declaring
              unresponsive Ceph OSD Daemons ``down``.

:Type: 32-bit Integer
:Default: ``900``

``mon osd min down reporters``

:Description: The minimum number of Ceph OSD Daemons required to report a
              ``down`` Ceph OSD Daemon.

:Type: 32-bit Integer
:Default: ``2``


``mon osd reporter subtree level``

:Description: The level of parent bucket at which reporters are counted. OSDs
              send failure reports to the Monitor if they find that a peer is
              not responsive. The Monitor marks the reported OSD out and then
              down after a grace period.
:Type: String
:Default: ``host``


.. index:: OSD heartbeat

OSD Settings
------------

``osd heartbeat address``

:Description: A Ceph OSD Daemon's network address for heartbeats.
:Type: Address
:Default: The host address.


``osd heartbeat interval``

:Description: How often a Ceph OSD Daemon pings its peers (in seconds).
:Type: 32-bit Integer
:Default: ``6``


``osd heartbeat grace``

:Description: The elapsed time, in seconds, without a heartbeat after which the
              Ceph Storage Cluster considers a Ceph OSD Daemon ``down``.
              This setting must be set in both the ``[mon]`` and ``[osd]``
              sections, or in the ``[global]`` section, so that it is read by
              both the MON and OSD daemons.
:Type: 32-bit Integer
:Default: ``20``


``osd mon heartbeat interval``

:Description: How often the Ceph OSD Daemon pings a Ceph Monitor if it has no
              Ceph OSD Daemon peers.

:Type: 32-bit Integer
:Default: ``30``


``osd mon report interval max``

:Description: The maximum time in seconds that a Ceph OSD Daemon can wait before
              it must report to a Ceph Monitor.

:Type: 32-bit Integer
:Default: ``120``


``osd mon report interval min``

:Description: The minimum number of seconds a Ceph OSD Daemon may wait
              from startup or another reportable event before reporting
              to a Ceph Monitor.

:Type: 32-bit Integer
:Default: ``5``
:Valid Range: Should be less than ``osd mon report interval max``


``osd mon ack timeout``

:Description: The number of seconds to wait for a Ceph Monitor to acknowledge a
              request for statistics.

:Type: 32-bit Integer
:Default: ``30``