=====================================
 Configuring Monitor/OSD Interaction
=====================================

.. index:: heartbeat

After you have completed your initial Ceph configuration, you may deploy and run
Ceph. When you execute a command such as ``ceph health`` or ``ceph -s``, the
:term:`Ceph Monitor` reports on the current state of the :term:`Ceph Storage
Cluster`. The Ceph Monitor learns the state of the Ceph Storage Cluster by
requiring reports from each :term:`Ceph OSD Daemon`, and by receiving reports
from Ceph OSD Daemons about the status of their neighboring Ceph OSD Daemons. If
the Ceph Monitor doesn't receive reports, or if it receives reports of changes
in the Ceph Storage Cluster, the Ceph Monitor updates the status of the
:term:`Ceph Cluster Map`.

Ceph provides reasonable default settings for Ceph Monitor/Ceph OSD Daemon
interaction. However, you may override the defaults. The following sections
describe how Ceph Monitors and Ceph OSD Daemons interact for the purposes of
monitoring the Ceph Storage Cluster.

.. index:: heartbeat interval

OSDs Check Heartbeats
=====================

Each Ceph OSD Daemon checks the heartbeat of other Ceph OSD Daemons at random
intervals of less than 6 seconds. If a neighboring Ceph OSD Daemon doesn't
show a heartbeat within a 20-second grace period, the Ceph OSD Daemon may
consider the neighboring Ceph OSD Daemon ``down`` and report it back to a Ceph
Monitor, which will update the Ceph Cluster Map. You may change this grace
period by adding an ``osd heartbeat grace`` setting under both the ``[mon]``
and ``[osd]`` sections, or under the ``[global]`` section, of your Ceph
configuration file, or by setting the value at runtime.

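As a minimal sketch of the configuration-file approach described above, the
following fragment raises the grace period from the 20-second default to 30
seconds. The value ``30`` is an illustrative assumption, not a recommendation.
The fragment uses the ``[global]`` section so that monitors and OSDs read the
same value.

.. code-block:: ini

    [global]
    # Illustrative value only; the default grace period is 20 seconds.
    osd heartbeat grace = 30
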
.. ditaa::

         +---------+          +---------+
         |  OSD 1  |          |  OSD 2  |
         +---------+          +---------+
              |                    |
              |----+ Heartbeat     |
              |    | Interval      |
              |<---+ Exceeded      |
              |                    |
              |       Check        |
              |     Heartbeat      |
              |------------------->|
              |                    |
              |<-------------------|
              |   Heart Beating    |
              |                    |
              |----+ Heartbeat     |
              |    | Interval      |
              |<---+ Exceeded      |
              |                    |
              |       Check        |
              |     Heartbeat      |
              |------------------->|
              |                    |
              |----+ Grace         |
              |    | Period        |
              |<---+ Exceeded      |
              |                    |
              |----+ Mark          |
              |    | OSD 2         |
              |<---+ Down          |


.. index:: OSD down report

OSDs Report Down OSDs
=====================

By default, two Ceph OSD Daemons from different hosts must report to the Ceph
Monitors that another Ceph OSD Daemon is ``down`` before the Ceph Monitors
acknowledge that the reported Ceph OSD Daemon is ``down``. However, there is a
chance that all of the OSDs reporting the failure are hosted in a rack with a
bad switch that has trouble connecting to another OSD. To avoid this sort of
false alarm, Ceph treats the peers reporting a failure as a proxy for a
potential "subcluster" of the overall cluster that is similarly laggy. This is
not true in all cases, but it sometimes helps localize the grace correction to
the subset of the system that is unhappy. The ``mon osd reporter subtree level``
setting groups the reporting peers into a "subcluster" by their common ancestor
type in the CRUSH map. By default, only two reports from different subtrees are
required to report another Ceph OSD Daemon ``down``. You can change the number
of reporters from unique subtrees and the common ancestor type required to
report a Ceph OSD Daemon ``down`` to a Ceph Monitor by adding the ``mon osd min
down reporters`` and ``mon osd reporter subtree level`` settings under the
``[mon]`` section of your Ceph configuration file, or by setting the values at
runtime.

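For example, a cluster whose CRUSH map defines ``rack`` buckets might require
reports from three different racks before an OSD is marked ``down``. This is a
sketch only: the value ``3`` and the ``rack`` level are assumptions chosen for
illustration, and ``rack`` must exist as a bucket type in your CRUSH map.

.. code-block:: ini

    [mon]
    # Illustrative values only; by default two reporters from different
    # hosts are sufficient.
    mon osd min down reporters = 3
    mon osd reporter subtree level = rack
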
.. ditaa::

         +---------+     +---------+      +---------+
         |  OSD 1  |     |  OSD 2  |      | Monitor |
         +---------+     +---------+      +---------+
              |               |                |
              | OSD 3 Is Down |                |
              |---------------+--------------->|
              |               |                |
              |               |                |
              |               | OSD 3 Is Down  |
              |               |--------------->|
              |               |                |
              |               |                |
              |               |                |---------+ Mark
              |               |                |         | OSD 3
              |               |                |<--------+ Down


.. index:: peering failure

OSDs Report Peering Failure
===========================

If a Ceph OSD Daemon cannot peer with any of the Ceph OSD Daemons defined in its
Ceph configuration file (or the cluster map), it pings a Ceph Monitor for
the most recent copy of the cluster map every 30 seconds. You can change the
Ceph Monitor heartbeat interval by adding an ``osd mon heartbeat interval``
setting under the ``[osd]`` section of your Ceph configuration file, or by
setting the value at runtime.

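The following fragment sketches how the interval might be lengthened from the
30-second default to 60 seconds. The value ``60`` is an assumption chosen for
illustration only.

.. code-block:: ini

    [osd]
    # Illustrative value only; the default interval is 30 seconds.
    osd mon heartbeat interval = 60
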
.. ditaa::

         +---------+     +---------+     +-------+     +---------+
         |  OSD 1  |     |  OSD 2  |     | OSD 3 |     | Monitor |
         +---------+     +---------+     +-------+     +---------+
              |               |              |              |
              |  Request To   |              |              |
              |     Peer      |              |              |
              |-------------->|              |              |
              |<--------------|              |              |
              |    Peering    |              |              |
              |               |              |              |
              |  Request To   |              |              |
              |     Peer      |              |              |
              |----------------------------->|              |
              |                                             |
              |----+ OSD Monitor                            |
              |    | Heartbeat                              |
              |<---+ Interval Exceeded                      |
              |                                             |
              |         Failed to Peer with OSD 3           |
              |-------------------------------------------->|
              |<--------------------------------------------|
              |          Receive New Cluster Map            |


.. index:: OSD status

OSDs Report Their Status
========================

If a Ceph OSD Daemon doesn't report to a Ceph Monitor, the Ceph Monitor will
consider the Ceph OSD Daemon ``down`` after the ``mon osd report timeout``
elapses. A Ceph OSD Daemon sends a report to a Ceph Monitor within 5 seconds of
a reportable event, such as a failure, a change in placement group stats, a
change in ``up_thru``, or the daemon booting. You can change the Ceph OSD
Daemon minimum report interval by adding an ``osd mon report interval``
setting under the ``[osd]`` section of your Ceph configuration file, or by
setting the value at runtime. A Ceph OSD Daemon also sends a report to a Ceph
Monitor every 120 seconds, regardless of whether any notable changes occur.
You can change the Ceph Monitor report interval by adding an ``osd mon report
interval max`` setting under the ``[osd]`` section of your Ceph configuration
file, or by setting the value at runtime.

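As a sketch of the minimum report interval described above, the following
fragment lengthens it from 5 seconds to 10 seconds. The value ``10`` is an
assumption chosen for illustration only.

.. code-block:: ini

    [osd]
    # Illustrative value only; the text above gives 5 seconds as the default.
    osd mon report interval = 10
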
.. ditaa::

         +---------+          +---------+
         |  OSD 1  |          | Monitor |
         +---------+          +---------+
              |                    |
              |----+ Report Min    |
              |    | Interval      |
              |<---+ Exceeded      |
              |                    |
              |----+ Reportable    |
              |    | Event         |
              |<---+ Occurs        |
              |                    |
              |     Report To      |
              |      Monitor       |
              |------------------->|
              |                    |
              |----+ Report Max    |
              |    | Interval      |
              |<---+ Exceeded      |
              |                    |
              |     Report To      |
              |      Monitor       |
              |------------------->|
              |                    |
              |----+ Monitor       |
              |    | Fails         |
              |<---+               |
                                   +----+ Monitor OSD
                                   |    | Report Timeout
                                   |<---+ Exceeded
                                   |
                                   +----+ Mark
                                   |    | OSD 1
                                   |<---+ Down


Configuration Settings
======================

When modifying heartbeat settings, you should include them in the ``[global]``
section of your configuration file.

.. index:: monitor heartbeat

Monitor Settings
----------------

.. confval:: mon_osd_min_up_ratio
.. confval:: mon_osd_min_in_ratio
.. confval:: mon_osd_laggy_halflife
.. confval:: mon_osd_laggy_weight
.. confval:: mon_osd_laggy_max_interval
.. confval:: mon_osd_adjust_heartbeat_grace
.. confval:: mon_osd_adjust_down_out_interval
.. confval:: mon_osd_auto_mark_in
.. confval:: mon_osd_auto_mark_auto_out_in
.. confval:: mon_osd_auto_mark_new_in
.. confval:: mon_osd_down_out_interval
.. confval:: mon_osd_down_out_subtree_limit
.. confval:: mon_osd_report_timeout
.. confval:: mon_osd_min_down_reporters
.. confval:: mon_osd_reporter_subtree_level

.. index:: OSD heartbeat

OSD Settings
------------

.. confval:: osd_heartbeat_interval
.. confval:: osd_heartbeat_grace
.. confval:: osd_mon_heartbeat_interval
.. confval:: osd_mon_heartbeat_stat_stale
.. confval:: osd_mon_report_interval