==========================
 Monitor Config Reference
==========================

Understanding how to configure a :term:`Ceph Monitor` is an important part of
building a reliable :term:`Ceph Storage Cluster`. **All Ceph Storage Clusters
have at least one monitor**. A monitor configuration usually remains fairly
consistent, but you can add, remove, or replace a monitor in a cluster. See
`Adding/Removing a Monitor`_ and `Add/Remove a Monitor (ceph-deploy)`_ for
details.


.. index:: Ceph Monitor; Paxos

Background
==========

Ceph Monitors maintain a "master copy" of the :term:`cluster map`, which means
a :term:`Ceph Client` can determine the location of all Ceph Monitors, Ceph
OSD Daemons, and Ceph Metadata Servers just by connecting to one Ceph Monitor
and retrieving a current cluster map. Before Ceph Clients can read from or
write to Ceph OSD Daemons or Ceph Metadata Servers, they must connect to a
Ceph Monitor first. With a current copy of the cluster map and the CRUSH
algorithm, a Ceph Client can compute the location for any object. The ability
to compute object locations allows a Ceph Client to talk directly to Ceph OSD
Daemons, which is a very important aspect of Ceph's high scalability and
performance. See `Scalability and High Availability`_ for additional details.

The primary role of the Ceph Monitor is to maintain a master copy of the
cluster map. Ceph Monitors also provide authentication and logging services.
Ceph Monitors write all changes in the monitor services to a single Paxos
instance, and Paxos writes the changes to a key/value store for strong
consistency. Ceph Monitors can query the most recent version of the cluster
map during sync operations. Ceph Monitors leverage the key/value store's
snapshots and iterators (using leveldb) to perform store-wide
synchronization.

.. ditaa::

    /-------------\               /-------------\
    |   Monitor   | Write Changes |    Paxos    |
    | cCCC        +-------------->+ cCCC        |
    |             |               |             |
    +-------------+               \------+------/
    |    Auth     |                      |
    +-------------+                      | Write Changes
    |    Log      |                      |
    +-------------+                      v
    | Monitor Map |               /------+------\
    +-------------+               | Key / Value |
    |   OSD Map   |               |    Store    |
    +-------------+               | cCCC        |
    |   PG Map    |               \------+------/
    +-------------+                      ^
    |   MDS Map   |                      | Read Changes
    +-------------+                      |
    |    cCCC     |*---------------------+
    \-------------/


.. deprecated:: 0.58

   In Ceph versions 0.58 and earlier, Ceph Monitors use a Paxos instance for
   each service and store the map as a file.

.. index:: Ceph Monitor; cluster map

Cluster Maps
------------

The cluster map is a composite of maps, including the monitor map, the OSD
map, the placement group map and the metadata server map. The cluster map
tracks a number of important things: which processes are ``in`` the Ceph
Storage Cluster; which of the processes that are ``in`` the Ceph Storage
Cluster are ``up`` and running or ``down``; whether the placement groups are
``active`` or ``inactive``, and ``clean`` or in some other state; and other
details that reflect the current state of the cluster, such as the total
amount of storage space and the amount of storage used.

When there is a significant change in the state of the cluster--e.g., a Ceph
OSD Daemon goes down, a placement group falls into a degraded state, etc.--the
cluster map gets updated to reflect the current state of the cluster.
Additionally, the Ceph Monitor also maintains a history of the prior states of
the cluster. The monitor map, OSD map, placement group map and metadata server
map each maintain a history of their map versions. We call each version an
"epoch."

When operating your Ceph Storage Cluster, keeping track of these states is an
important part of your system administration duties. See
`Monitoring a Cluster`_ and `Monitoring OSDs and PGs`_ for additional details.

.. index:: high availability; quorum

Monitor Quorum
--------------

Our Configuring Ceph section provides a trivial `Ceph configuration file`_
that provides for one monitor in the test cluster. A cluster will run fine
with a single monitor; however, **a single monitor is a single point of
failure**. To ensure high availability in a production Ceph Storage Cluster,
you should run Ceph with multiple monitors so that the failure of a single
monitor **WILL NOT** bring down your entire cluster.

When a Ceph Storage Cluster runs multiple Ceph Monitors for high availability,
Ceph Monitors use `Paxos`_ to establish consensus about the master cluster
map. A consensus requires a majority of monitors running to establish a
quorum for consensus about the cluster map (e.g., 1 out of 1; 2 out of 3;
3 out of 5; 4 out of 6; etc.).
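
The majority rule above is simple integer arithmetic. The following sketch is
illustrative only (the helper name is ours, not a Ceph API):

```python
# Smallest number of monitors that constitutes a majority (quorum).
# (Helper name is illustrative -- not part of Ceph.)
def quorum_majority(num_monitors: int) -> int:
    return num_monitors // 2 + 1

# Matches the examples in the text: 1 of 1, 2 of 3, 3 of 5, 4 of 6.
sizes = {n: quorum_majority(n) for n in (1, 3, 5, 6)}
```

Note that six monitors tolerate no more failures (two) than five do, which is
one reason odd monitor counts are usually recommended.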

``mon force quorum join``

:Description: Force the monitor to join the quorum, even if it has previously
              been removed from the map.

:Type: Boolean
:Default: ``False``

.. index:: Ceph Monitor; consistency

Consistency
-----------

When you add monitor settings to your Ceph configuration file, you need to be
aware of some of the architectural aspects of Ceph Monitors. **Ceph imposes
strict consistency requirements** for a Ceph Monitor when discovering another
Ceph Monitor within the cluster. Whereas Ceph Clients and other Ceph daemons
use the Ceph configuration file to discover monitors, monitors discover each
other using the monitor map (monmap), not the Ceph configuration file.

A Ceph Monitor always refers to the local copy of the monmap when discovering
other Ceph Monitors in the Ceph Storage Cluster. Using the monmap instead of
the Ceph configuration file avoids errors that could break the cluster (e.g.,
typos in ``ceph.conf`` when specifying a monitor address or port). Since
monitors use monmaps for discovery and they share monmaps with clients and
other Ceph daemons, **the monmap provides monitors with a strict guarantee
that their consensus is valid.**

Strict consistency also applies to updates to the monmap. As with any other
updates on the Ceph Monitor, changes to the monmap always run through a
distributed consensus algorithm called `Paxos`_. The Ceph Monitors must agree
on each update to the monmap, such as adding or removing a Ceph Monitor, to
ensure that each monitor in the quorum has the same version of the monmap.
Updates to the monmap are incremental so that Ceph Monitors have the latest
agreed upon version, and a set of previous versions. Maintaining a history
enables a Ceph Monitor that has an older version of the monmap to catch up
with the current state of the Ceph Storage Cluster.
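
The catch-up mechanism can be pictured with a toy model (illustrative only,
not Ceph code): each stored incremental update maps an epoch to the change
that produces the next epoch, and a stale monitor simply replays them.

```python
# Toy model of incremental monmap updates (not Ceph's implementation):
# each epoch maps to the change that produces the next epoch.
history = {
    1: ("add", "mon.a"),
    2: ("add", "mon.b"),
    3: ("add", "mon.c"),
    4: ("remove", "mon.b"),
}

def catch_up(members, epoch, target):
    """Replay stored incremental updates until the target epoch is reached."""
    members = set(members)
    while epoch < target:
        op, mon = history[epoch]
        if op == "add":
            members.add(mon)
        else:
            members.discard(mon)
        epoch += 1
    return members, epoch

# A monitor stuck at epoch 1 replays the history to reach epoch 5.
members, epoch = catch_up(set(), 1, 5)
```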

If Ceph Monitors discovered each other through the Ceph configuration file
instead of through the monmap, it would introduce additional risks because
the Ceph configuration files are not updated and distributed automatically.
Ceph Monitors might inadvertently use an older Ceph configuration file, fail
to recognize a Ceph Monitor, fall out of a quorum, or develop a situation
where `Paxos`_ is not able to determine the current state of the system
accurately.


.. index:: Ceph Monitor; bootstrapping monitors

Bootstrapping Monitors
----------------------

In most configuration and deployment cases, tools that deploy Ceph may help
bootstrap the Ceph Monitors by generating a monitor map for you (e.g.,
``ceph-deploy``, etc.). A Ceph Monitor requires a few explicit settings:

- **Filesystem ID**: The ``fsid`` is the unique identifier for your
  object store. Since you can run multiple clusters on the same
  hardware, you must specify the unique ID of the object store when
  bootstrapping a monitor. Deployment tools usually do this for you
  (e.g., ``ceph-deploy`` can call a tool like ``uuidgen``), but you
  may specify the ``fsid`` manually too.

- **Monitor ID**: A monitor ID is a unique ID assigned to each monitor within
  the cluster. It is an alphanumeric value, and by convention the identifier
  usually follows an alphabetical increment (e.g., ``a``, ``b``, etc.). This
  can be set in a Ceph configuration file (e.g., ``[mon.a]``, ``[mon.b]``,
  etc.), by a deployment tool, or using the ``ceph`` commandline.

- **Keys**: The monitor must have secret keys. A deployment tool such as
  ``ceph-deploy`` usually does this for you, but you may
  perform this step manually too. See `Monitor Keyrings`_ for details.

For additional details on bootstrapping, see `Bootstrapping a Monitor`_.

.. index:: Ceph Monitor; configuring monitors

Configuring Monitors
====================

To apply configuration settings to the entire cluster, enter the
configuration settings under ``[global]``. To apply configuration settings to
all monitors in your cluster, enter the configuration settings under
``[mon]``. To apply configuration settings to specific monitors, specify the
monitor instance (e.g., ``[mon.a]``). By convention, monitor instance names
use alpha notation.

.. code-block:: ini

    [global]

    [mon]

    [mon.a]

    [mon.b]

    [mon.c]

Minimum Configuration
---------------------

The bare minimum monitor settings for a Ceph monitor via the Ceph
configuration file include a hostname and a monitor address for each monitor.
You can configure these under ``[mon]`` or under the entry for a specific
monitor.

.. code-block:: ini

    [global]
        mon host = 10.0.0.2,10.0.0.3,10.0.0.4

.. code-block:: ini

    [mon.a]
        host = hostname1
        mon addr = 10.0.0.10:6789

See the `Network Configuration Reference`_ for details.

.. note:: This minimum configuration for monitors assumes that a deployment
   tool generates the ``fsid`` and the ``mon.`` key for you.

Once you deploy a Ceph cluster, you **SHOULD NOT** change the IP addresses of
the monitors. However, if you decide to change a monitor's IP address, you
must follow a specific procedure. See `Changing a Monitor's IP Address`_ for
details.

Monitors can also be found by clients by using DNS SRV records. See `Monitor lookup through DNS`_ for details.

Cluster ID
----------

Each Ceph Storage Cluster has a unique identifier (``fsid``). If specified,
it usually appears under the ``[global]`` section of the configuration file.
Deployment tools usually generate the ``fsid`` and store it in the monitor
map, so the value may not appear in a configuration file. The ``fsid`` makes
it possible to run daemons for multiple clusters on the same hardware.

``fsid``

:Description: The cluster ID. One per cluster.
:Type: UUID
:Required: Yes.
:Default: N/A. May be generated by a deployment tool if not specified.

.. note:: Do not set this value if you use a deployment tool that does
   it for you.


.. index:: Ceph Monitor; initial members

Initial Members
---------------

We recommend running a production Ceph Storage Cluster with at least three
Ceph Monitors to ensure high availability. When you run multiple monitors,
you may specify the initial monitors that must be members of the cluster in
order to establish a quorum. This may reduce the time it takes for your
cluster to come online.

.. code-block:: ini

    [mon]
        mon initial members = a,b,c


``mon initial members``

:Description: The IDs of initial monitors in a cluster during startup. If
              specified, Ceph requires an odd number of monitors to form an
              initial quorum (e.g., 3).

:Type: String
:Default: None

.. note:: A *majority* of monitors in your cluster must be able to reach
   each other in order to establish a quorum. You can decrease the initial
   number of monitors to establish a quorum with this setting.

.. index:: Ceph Monitor; data path

Data
----

Ceph provides a default path where Ceph Monitors store data. For optimal
performance in a production Ceph Storage Cluster, we recommend running Ceph
Monitors on separate hosts and drives from Ceph OSD Daemons. Because leveldb
uses ``mmap()`` for writing the data, Ceph Monitors flush their data from
memory to disk very often, which can interfere with Ceph OSD Daemon workloads
if the data store is co-located with the OSD Daemons.

In Ceph versions 0.58 and earlier, Ceph Monitors store their data in files.
This approach allows users to inspect monitor data with common tools like
``ls`` and ``cat``. However, it doesn't provide strong consistency.

In Ceph versions 0.59 and later, Ceph Monitors store their data as key/value
pairs. Ceph Monitors require `ACID`_ transactions. Using a data store
prevents recovering Ceph Monitors from running corrupted versions through
Paxos, and it enables multiple modification operations in one single atomic
batch, among other advantages.
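
The value of a single atomic batch can be pictured with a toy key/value store
(illustrative only; Ceph uses an embedded store such as leveldb rather than
anything like this sketch): either every operation in a batch is applied, or
none are.

```python
# Toy sketch of an atomic write batch (illustrative only, not Ceph code):
# either every operation in the batch is applied, or none are.
class KVStore:
    def __init__(self):
        self.data = {}

    def apply_batch(self, ops):
        """ops: list of ("put", key, value) or ("delete", key) tuples."""
        staged = dict(self.data)          # stage changes on a copy
        for op in ops:
            if op[0] == "put":
                _, key, value = op
                staged[key] = value
            elif op[0] == "delete":
                staged.pop(op[1], None)
            else:
                raise ValueError(op[0])   # a bad op aborts the whole batch
        self.data = staged                # commit all changes at once

store = KVStore()
store.apply_batch([("put", "osdmap:42", "..."), ("put", "osdmap:last", "42")])
```

If any operation in the batch fails, the commit step is never reached, so the
store is left exactly as it was before the batch started.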

Generally, we do not recommend changing the default data location. If you
modify the default location, we recommend that you make it uniform across
Ceph Monitors by setting it in the ``[mon]`` section of the configuration
file.


``mon data``

:Description: The monitor's data location.
:Type: String
:Default: ``/var/lib/ceph/mon/$cluster-$id``

``mon data size warn``

:Description: Issue a ``HEALTH_WARN`` in the cluster log when the monitor's
              data store goes over 15GB.

:Type: Integer
:Default: ``15*1024*1024*1024``


``mon data avail warn``

:Description: Issue a ``HEALTH_WARN`` in the cluster log when the available
              disk space of the monitor's data store is lower than or equal
              to this percentage.

:Type: Integer
:Default: ``30``


``mon data avail crit``

:Description: Issue a ``HEALTH_ERR`` in the cluster log when the available
              disk space of the monitor's data store is lower than or equal
              to this percentage.

:Type: Integer
:Default: ``5``


``mon warn on cache pools without hit sets``

:Description: Issue a ``HEALTH_WARN`` in the cluster log if a cache pool
              does not have the ``hit_set_type`` value configured. See
              :ref:`hit_set_type <hit_set_type>` for more details.

:Type: Boolean
:Default: ``True``


``mon warn on crush straw calc version zero``

:Description: Issue a ``HEALTH_WARN`` in the cluster log if CRUSH's
              ``straw_calc_version`` is zero. See
              :ref:`CRUSH map tunables <crush-map-tunables>` for details.

:Type: Boolean
:Default: ``True``


``mon warn on legacy crush tunables``

:Description: Issue a ``HEALTH_WARN`` in the cluster log if CRUSH tunables
              are too old (older than ``mon_min_crush_required_version``).

:Type: Boolean
:Default: ``True``


``mon crush min required version``

:Description: The minimum tunable profile version required by the cluster.
              See :ref:`CRUSH map tunables <crush-map-tunables>` for details.

:Type: String
:Default: ``hammer``


``mon warn on osd down out interval zero``

:Description: Issue a ``HEALTH_WARN`` in the cluster log if
              ``mon osd down out interval`` is zero. Having this option set
              to zero on the leader acts much like the ``noout`` flag. It is
              hard to figure out what is going wrong with a cluster that
              behaves as if the ``noout`` flag is set when it is not, so we
              report a warning in this case.

:Type: Boolean
:Default: ``True``

``mon warn on slow ping ratio``

:Description: Issue a ``HEALTH_WARN`` in the cluster log if any heartbeat
              between OSDs exceeds ``mon warn on slow ping ratio``
              of ``osd heartbeat grace``. The default is 5%.
:Type: Float
:Default: ``0.05``


``mon warn on slow ping time``

:Description: Override ``mon warn on slow ping ratio`` with a specific value.
              Issue a ``HEALTH_WARN`` in the cluster log if any heartbeat
              between OSDs exceeds ``mon warn on slow ping time``
              milliseconds. The default is 0 (disabled).
:Type: Integer
:Default: ``0``


``mon warn on pool no redundancy``

:Description: Issue a ``HEALTH_WARN`` in the cluster log if any pool is
              configured with no replicas.
:Type: Boolean
:Default: ``True``


``mon cache target full warn ratio``

:Description: Position between a pool's ``cache_target_full`` and
              ``target_max_object`` at which we start warning.

:Type: Float
:Default: ``0.66``


``mon health to clog``

:Description: Enable sending a health summary to the cluster log
              periodically.
:Type: Boolean
:Default: ``True``


``mon health to clog tick interval``

:Description: How often (in seconds) the monitor sends a health summary to
              the cluster log (a non-positive number disables it). If the
              current health summary is empty or identical to the previous
              one, the monitor does not send it to the cluster log.

:Type: Float
:Default: ``60.0``


``mon health to clog interval``

:Description: How often (in seconds) the monitor sends a health summary to
              the cluster log (a non-positive number disables it). The
              monitor always sends the summary to the cluster log, whether
              or not it has changed.

:Type: Integer
:Default: ``3600``


.. index:: Ceph Storage Cluster; capacity planning, Ceph Monitor; capacity planning

Storage Capacity
----------------

When a Ceph Storage Cluster gets close to its maximum capacity (i.e.,
``mon osd full ratio``), Ceph prevents you from writing to or reading from
Ceph OSD Daemons as a safety measure to prevent data loss. Therefore, letting
a production Ceph Storage Cluster approach its full ratio is not a good
practice, because it sacrifices high availability. The default full ratio is
``.95``, or 95% of capacity. This is a very aggressive setting for a test
cluster with a small number of OSDs.

.. tip:: When monitoring your cluster, be alert to warnings related to the
   ``nearfull`` ratio. This means that a failure of one or more OSDs could
   result in a temporary service disruption. Consider adding more OSDs to
   increase storage capacity.

A common scenario for test clusters involves a system administrator removing
a Ceph OSD Daemon from the Ceph Storage Cluster to watch the cluster
rebalance; then removing another Ceph OSD Daemon, and so on until the Ceph
Storage Cluster eventually reaches the full ratio and locks up. We recommend
a bit of capacity planning even with a test cluster. Planning enables you to
gauge how much spare capacity you will need in order to maintain high
availability. Ideally, you want to plan for a series of Ceph OSD Daemon
failures where the cluster can recover to an ``active + clean`` state without
replacing those Ceph OSD Daemons immediately. You can run a cluster in an
``active + degraded`` state, but this is not ideal for normal operating
conditions.

The following diagram depicts a simplistic Ceph Storage Cluster containing 33
Ceph Nodes with one Ceph OSD Daemon per host, each Ceph OSD Daemon reading
from and writing to a 3TB drive. So this exemplary Ceph Storage Cluster has a
maximum actual capacity of 99TB. With a ``mon osd full ratio`` of ``0.95``,
if the Ceph Storage Cluster falls to 5TB of remaining capacity, the cluster
will not allow Ceph Clients to read and write data. So the Ceph Storage
Cluster's operating capacity is 95TB, not 99TB.

.. ditaa::

    +--------+  +--------+  +--------+  +--------+  +--------+  +--------+
    | Rack 1 |  | Rack 2 |  | Rack 3 |  | Rack 4 |  | Rack 5 |  | Rack 6 |
    | cCCC   |  | cF00   |  | cCCC   |  | cCCC   |  | cCCC   |  | cCCC   |
    +--------+  +--------+  +--------+  +--------+  +--------+  +--------+
    | OSD 1  |  | OSD 7  |  | OSD 13 |  | OSD 19 |  | OSD 25 |  | OSD 31 |
    +--------+  +--------+  +--------+  +--------+  +--------+  +--------+
    | OSD 2  |  | OSD 8  |  | OSD 14 |  | OSD 20 |  | OSD 26 |  | OSD 32 |
    +--------+  +--------+  +--------+  +--------+  +--------+  +--------+
    | OSD 3  |  | OSD 9  |  | OSD 15 |  | OSD 21 |  | OSD 27 |  | OSD 33 |
    +--------+  +--------+  +--------+  +--------+  +--------+  +--------+
    | OSD 4  |  | OSD 10 |  | OSD 16 |  | OSD 22 |  | OSD 28 |  | Spare  |
    +--------+  +--------+  +--------+  +--------+  +--------+  +--------+
    | OSD 5  |  | OSD 11 |  | OSD 17 |  | OSD 23 |  | OSD 29 |  | Spare  |
    +--------+  +--------+  +--------+  +--------+  +--------+  +--------+
    | OSD 6  |  | OSD 12 |  | OSD 18 |  | OSD 24 |  | OSD 30 |  | Spare  |
    +--------+  +--------+  +--------+  +--------+  +--------+  +--------+

It is normal in such a cluster for one or two OSDs to fail. A less frequent
but reasonable scenario involves a rack's router or power supply failing,
which brings down multiple OSDs simultaneously (e.g., OSDs 7-12). In such a
scenario, you should still strive for a cluster that can remain operational
and achieve an ``active + clean`` state--even if that means adding a few
hosts with additional OSDs in short order. If your capacity utilization is
too high, you may not lose data, but you could still sacrifice data
availability while resolving an outage within a failure domain if capacity
utilization of the cluster exceeds the full ratio. For this reason, we
recommend at least some rough capacity planning.

Identify two numbers for your cluster:

#. The number of OSDs.
#. The total capacity of the cluster.

If you divide the total capacity of your cluster by the number of OSDs in
your cluster, you will find the mean capacity of an OSD within your cluster.
Consider multiplying that number by the number of OSDs you expect to fail
simultaneously during normal operations (a relatively small number). Finally,
multiply the capacity of the cluster by the full ratio to arrive at a maximum
operating capacity; then, subtract the amount of data on the OSDs you expect
to fail to arrive at a reasonable full ratio. Repeat the foregoing process
with a higher number of OSD failures (e.g., a rack of OSDs) to arrive at a
reasonable number for a near full ratio.

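The arithmetic above can be sketched as follows; the numbers mirror the
33-OSD example, and the helper name is ours, not Ceph's:

```python
# Rough capacity-planning sketch for the 33-OSD example above.
# (Helper name is illustrative, not part of Ceph.)
def plan(total_tb, num_osds, full_ratio, expected_failures):
    mean_osd_tb = total_tb / num_osds            # average capacity per OSD
    max_operating_tb = total_tb * full_ratio     # capacity before lockup
    reserve_tb = mean_osd_tb * expected_failures # data held by failing OSDs
    return max_operating_tb - reserve_tb

# 33 x 3TB OSDs, full ratio 0.95, planning for two simultaneous failures:
usable = plan(total_tb=99, num_osds=33, full_ratio=0.95, expected_failures=2)
print(round(usable, 2))  # -> 88.05
```

In other words, with two 3TB OSDs expected to fail, the cluster should be
treated as full well before the nominal 94-95TB limit.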
The following settings only apply on cluster creation and are then stored in
the OSDMap.

.. code-block:: ini

    [global]

        mon osd full ratio = .80
        mon osd backfillfull ratio = .75
        mon osd nearfull ratio = .70


``mon osd full ratio``

:Description: The percentage of disk space used before an OSD is
              considered ``full``.

:Type: Float
:Default: ``0.95``


``mon osd backfillfull ratio``

:Description: The percentage of disk space used before an OSD is
              considered too ``full`` to backfill.

:Type: Float
:Default: ``0.90``


``mon osd nearfull ratio``

:Description: The percentage of disk space used before an OSD is
              considered ``nearfull``.

:Type: Float
:Default: ``0.85``


.. tip:: If some OSDs are nearfull, but others have plenty of capacity, you
   may have a problem with the CRUSH weight for the nearfull OSDs.

.. tip:: These settings only apply during cluster creation. Afterwards they
   need to be changed in the OSDMap using ``ceph osd set-nearfull-ratio``
   and ``ceph osd set-full-ratio``.

.. index:: heartbeat

Heartbeat
---------

Ceph Monitors know about the cluster by requiring reports from each OSD, and
by receiving reports from OSDs about the status of their neighboring OSDs.
Ceph provides reasonable default settings for monitor/OSD interaction;
however, you may modify them as needed. See `Monitor/OSD Interaction`_ for
details.

.. index:: Ceph Monitor; leader, Ceph Monitor; provider, Ceph Monitor; requester, Ceph Monitor; synchronization

Monitor Store Synchronization
-----------------------------

When you run a production cluster with multiple monitors (recommended), each
monitor checks to see if a neighboring monitor has a more recent version of
the cluster map (e.g., a map in a neighboring monitor with one or more epoch
numbers higher than the most current epoch in the map of the instant
monitor). Periodically, one monitor in the cluster may fall behind the other
monitors to the point where it must leave the quorum, synchronize to retrieve
the most current information about the cluster, and then rejoin the quorum.
For the purposes of synchronization, monitors may assume one of three roles:

#. **Leader**: The `Leader` is the first monitor to achieve the most recent
   Paxos version of the cluster map.

#. **Provider**: The `Provider` is a monitor that has the most recent
   version of the cluster map, but wasn't the first to achieve the most
   recent version.

#. **Requester**: A `Requester` is a monitor that has fallen behind the
   leader and must synchronize in order to retrieve the most recent
   information about the cluster before it can rejoin the quorum.
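
A toy classification of the three roles by Paxos version might look like the
following sketch. It is purely illustrative: the real election and sync
protocol is richer, and the tie-break used here (alphabetical order) is our
own stand-in for "first to achieve the most recent version".

```python
# Toy sketch: classify monitors into sync roles by their Paxos version.
# (Illustrative only; Ceph's actual election/sync protocol is richer.)
def sync_roles(versions):
    """versions: dict of monitor name -> latest Paxos version it holds."""
    newest = max(versions.values())
    up_to_date = [m for m, v in sorted(versions.items()) if v == newest]
    leader = up_to_date[0]        # stand-in for "first to reach newest"
    providers = up_to_date[1:]    # newest version, but not the leader
    requesters = [m for m, v in versions.items() if v < newest]
    return leader, providers, requesters

leader, providers, requesters = sync_roles(
    {"mon.a": 90, "mon.b": 90, "mon.c": 75}
)
```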

These roles enable a leader to delegate synchronization duties to a provider,
which prevents synchronization requests from overloading the leader and
improves performance. In the following diagram, the requester has learned
that it has fallen behind the other monitors. The requester asks the leader
to synchronize, and the leader tells the requester to synchronize with a
provider.


.. ditaa::

           +-----------+          +---------+          +----------+
           | Requester |          | Leader  |          | Provider |
           +-----------+          +---------+          +----------+
                 |                    |                    |
                 |                    |                    |
                 | Ask to Synchronize |                    |
                 |------------------->|                    |
                 |                    |                    |
                 |<-------------------|                    |
                 | Tell Requester to  |                    |
                 | Sync with Provider |                    |
                 |                    |                    |
                 |              Synchronize                |
                 |--------------------+------------------->|
                 |                    |                    |
                 |<-------------------+--------------------|
                 |         Send Chunk to Requester         |
                 |         (repeat as necessary)           |
                 |    Requester Acks Chunk to Provider     |
                 |--------------------+------------------->|
                 |                                         |
                 |   Sync Complete                         |
                 |    Notification                         |
                 |------------------->|                    |
                 |                    |                    |
                 |<-------------------|                    |
                 |        Ack         |                    |
                 |                    |                    |


Synchronization always occurs when a new monitor joins the cluster. During
runtime operations, monitors may receive updates to the cluster map at
different times. This means the leader and provider roles may migrate from
one monitor to another. If this happens while synchronizing (e.g., a provider
falls behind the leader), the provider can terminate synchronization with a
requester.

Once synchronization is complete, Ceph requires trimming across the cluster.
Trimming requires that the placement groups are ``active + clean``.


``mon sync timeout``

:Description: Number of seconds the monitor will wait for the next update
              message from its sync provider before it gives up and
              bootstraps again.

:Type: Double
:Default: ``60.0``


``mon sync max payload size``

:Description: The maximum size for a sync payload (in bytes).
:Type: 32-bit Integer
:Default: ``1048576``

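For example, to give a slow sync provider more time and to send smaller
payloads, the two settings above could be tuned in ``ceph.conf``. The values
below are illustrative, not recommendations:

.. code-block:: ini

   [mon]
   # Wait up to two minutes for the next sync update before bootstrapping again.
   mon sync timeout = 120.0
   # Cap each sync payload at 512 KiB (the default is 1 MiB).
   mon sync max payload size = 524288
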
``paxos max join drift``

:Description: The maximum Paxos iterations before we must first sync the
              monitor data stores. When a monitor finds that its peer is too
              far ahead of it, it will first sync its data store before
              moving on.

:Type: Integer
:Default: ``10``


``paxos stash full interval``

:Description: How often (in commits) to stash a full copy of the PaxosService
              state. Currently this setting only affects ``mds``, ``mon``,
              ``auth`` and ``mgr`` PaxosServices.

:Type: Integer
:Default: ``25``


``paxos propose interval``

:Description: Gather updates for this time interval before proposing
              a map update.

:Type: Double
:Default: ``1.0``


``paxos min``

:Description: The minimum number of Paxos states to keep around.
:Type: Integer
:Default: ``500``


``paxos min wait``

:Description: The minimum amount of time to gather updates after a period of
              inactivity.

:Type: Double
:Default: ``0.05``


``paxos trim min``

:Description: The number of extra proposals tolerated before trimming.
:Type: Integer
:Default: ``250``


``paxos trim max``

:Description: The maximum number of extra proposals to trim at a time.
:Type: Integer
:Default: ``500``


``paxos service trim min``

:Description: The minimum number of versions to trigger a trim (0 disables it).
:Type: Integer
:Default: ``250``


``paxos service trim max``

:Description: The maximum number of versions to trim during a single proposal
              (0 disables it).

:Type: Integer
:Default: ``500``


``mon mds force trim to``

:Description: Force the monitor to trim mdsmaps to this point (0 disables it;
              dangerous, use with care).

:Type: Integer
:Default: ``0``


``mon osd force trim to``

:Description: Force the monitor to trim osdmaps to this point, even if there
              are PGs that are not clean at the specified epoch (0 disables
              it; dangerous, use with care).

:Type: Integer
:Default: ``0``


``mon osd cache size``

:Description: The size of the osdmap cache, so that the monitor does not have
              to rely on the underlying store's cache.

:Type: Integer
:Default: ``500``


``mon election timeout``

:Description: The maximum amount of time (in seconds) the election proposer
              will wait for all ACKs.

:Type: Float
:Default: ``5.00``

``mon lease``

:Description: The length (in seconds) of the lease on the monitor's versions.
:Type: Float
:Default: ``5.00``


``mon lease renew interval factor``

:Description: ``mon lease`` \* ``mon lease renew interval factor`` will be the
              interval for the Leader to renew the other monitors' leases. The
              factor should be less than ``1.0``.

:Type: Float
:Default: ``0.60``


``mon lease ack timeout factor``

:Description: The Leader will wait ``mon lease`` \* ``mon lease ack timeout factor``
              for the Providers to acknowledge the lease extension.

:Type: Float
:Default: ``2.00``


``mon accept timeout factor``

:Description: The Leader will wait ``mon lease`` \* ``mon accept timeout factor``
              for the Requester(s) to accept a Paxos update. It is also used
              during the Paxos recovery phase for similar purposes.

:Type: Float
:Default: ``2.00``
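
With the default values, the derived lease timings work out as follows. This
``ceph.conf`` sketch simply restates the defaults, with the resulting
intervals noted in comments:

.. code-block:: ini

   [mon]
   mon lease = 5.0                        # lease length in seconds
   mon lease renew interval factor = 0.6  # leader renews leases every 5.0 * 0.6 = 3.0 s
   mon lease ack timeout factor = 2.0     # leader waits 5.0 * 2.0 = 10.0 s for lease ACKs
   mon accept timeout factor = 2.0        # leader waits 5.0 * 2.0 = 10.0 s for Paxos accepts
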


``mon min osdmap epochs``

:Description: Minimum number of OSD map epochs to keep at all times.
:Type: 32-bit Integer
:Default: ``500``


``mon max log epochs``

:Description: Maximum number of Log epochs the monitor should keep.
:Type: 32-bit Integer
:Default: ``500``


.. index:: Ceph Monitor; clock

Clock
-----

Ceph daemons pass critical messages to each other, which must be processed
before daemons reach a timeout threshold. If the clocks in Ceph monitors
are not synchronized, it can lead to a number of anomalies. For example:

- Daemons ignoring received messages (e.g., timestamps outdated)
- Timeouts triggered too soon or too late when a message isn't received in
  time.

See `Monitor Store Synchronization`_ for details.


.. tip:: You SHOULD install NTP on your Ceph monitor hosts to
         ensure that the monitor cluster operates with synchronized clocks.

Clock drift may still be noticeable with NTP even though the discrepancy is
not yet harmful. Ceph's clock drift / clock skew warnings may get triggered
even though NTP maintains a reasonable level of synchronization. Increasing
the allowed clock drift may be tolerable under such circumstances; however, a
number of factors such as workload, network latency, configuring overrides to
default timeouts and the `Monitor Store Synchronization`_ settings may
influence the level of acceptable clock drift without compromising Paxos
guarantees.

Ceph provides the following tunable options to allow you to find
acceptable values.

``mon tick interval``

:Description: A monitor's tick interval in seconds.
:Type: 32-bit Integer
:Default: ``5``


``mon clock drift allowed``

:Description: The clock drift in seconds allowed between monitors.
:Type: Float
:Default: ``0.05``


``mon clock drift warn backoff``

:Description: Exponential backoff for clock drift warnings.
:Type: Float
:Default: ``5.00``


``mon timecheck interval``

:Description: The time check interval (clock drift check) in seconds
              for the Leader.

:Type: Float
:Default: ``300.00``


``mon timecheck skew interval``

:Description: The time check interval (clock drift check) in seconds for the
              Leader when a clock skew has been detected.

:Type: Float
:Default: ``30.00``

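For example, if NTP keeps the monitors within roughly 100 ms of each other but
the 0.05 s default still triggers skew warnings, the threshold could be raised
slightly. The values below are illustrative, not recommendations:

.. code-block:: ini

   [mon]
   # Tolerate up to 100 ms of clock drift between monitors (default 0.05 s).
   mon clock drift allowed = 0.100
   # Back off repeated clock-skew warnings more aggressively.
   mon clock drift warn backoff = 10.0
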
Client
------

``mon client hunt interval``

:Description: The client will try a new monitor every ``N`` seconds until it
              establishes a connection.

:Type: Double
:Default: ``3.00``


``mon client ping interval``

:Description: The client will ping the monitor every ``N`` seconds.
:Type: Double
:Default: ``10.00``


``mon client max log entries per message``

:Description: The maximum number of log entries a monitor will generate
              per client message.

:Type: Integer
:Default: ``1000``


``mon client bytes``

:Description: The amount of client message data allowed in memory (in bytes).
:Type: 64-bit Integer Unsigned
:Default: ``100ul << 20``

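A client that should fail over to another monitor more quickly could shorten
the hunt interval; an illustrative ``ceph.conf`` sketch (the value is an
example, not a recommendation):

.. code-block:: ini

   [client]
   # Try the next monitor after 1 second instead of the 3-second default.
   mon client hunt interval = 1.0
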

Pool settings
=============

Since version v0.94 there is support for pool flags, which allow or disallow
changes to be made to pools.

Monitors can also disallow removal of pools if configured that way.

``mon allow pool delete``

:Description: Whether the monitors should allow pools to be removed,
              regardless of what the pool flags say.

:Type: Boolean
:Default: ``false``


``osd pool default ec fast read``

:Description: Whether to turn on fast read on the pool or not. It will be
              used as the default setting of newly created erasure coded
              pools if ``fast_read`` is not specified at create time.

:Type: Boolean
:Default: ``false``


``osd pool default flag hashpspool``

:Description: Set the hashpspool flag on new pools.
:Type: Boolean
:Default: ``true``


``osd pool default flag nodelete``

:Description: Set the nodelete flag on new pools, which prevents the pools
              from being removed.

:Type: Boolean
:Default: ``false``


``osd pool default flag nopgchange``

:Description: Set the nopgchange flag on new pools. Does not allow the number
              of PGs to be changed for a pool.

:Type: Boolean
:Default: ``false``


``osd pool default flag nosizechange``

:Description: Set the nosizechange flag on new pools. Does not allow the size
              of a pool to be changed.

:Type: Boolean
:Default: ``false``

For more information about the pool flags see `Pool values`_.

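To guard against accidental pool deletion, the settings above can be combined.
A ``ceph.conf`` sketch (the section placement under ``[global]`` is
illustrative):

.. code-block:: ini

   [global]
   # Refuse pool deletion at the monitors, regardless of per-pool flags.
   mon allow pool delete = false
   # Additionally mark every newly created pool with the nodelete flag.
   osd pool default flag nodelete = true
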
Miscellaneous
=============

``mon max osd``

:Description: The maximum number of OSDs allowed in the cluster.
:Type: 32-bit Integer
:Default: ``10000``


``mon globalid prealloc``

:Description: The number of global IDs to pre-allocate for clients and
              daemons in the cluster.

:Type: 32-bit Integer
:Default: ``10000``


``mon subscribe interval``

:Description: The refresh interval (in seconds) for subscriptions. The
              subscription mechanism enables obtaining the cluster maps
              and log information.

:Type: Double
:Default: ``86400.00``


``mon stat smooth intervals``

:Description: Ceph will smooth statistics over the last ``N`` PG maps.
:Type: Integer
:Default: ``6``


``mon probe timeout``

:Description: Number of seconds the monitor will wait to find peers before
              bootstrapping.

:Type: Double
:Default: ``2.00``


``mon daemon bytes``

:Description: The message memory cap for metadata server and OSD messages
              (in bytes).

:Type: 64-bit Integer Unsigned
:Default: ``400ul << 20``


``mon max log entries per event``

:Description: The maximum number of log entries per event.
:Type: Integer
:Default: ``4096``


``mon osd prime pg temp``

:Description: Enables or disables priming the PGMap with the previous OSDs
              when an ``out`` OSD comes back into the cluster. With the
              ``true`` setting, clients will continue to use the previous
              OSDs until the newly ``in`` OSDs for a PG have peered.

:Type: Boolean
:Default: ``true``


``mon osd prime pg temp max time``

:Description: How much time in seconds the monitor should spend trying to
              prime the PGMap when an out OSD comes back into the cluster.

:Type: Float
:Default: ``0.50``


``mon osd prime pg temp max time estimate``

:Description: Maximum estimate of time spent on each PG before we prime all
              PGs in parallel.

:Type: Float
:Default: ``0.25``

``mon mds skip sanity``

:Description: Skip safety assertions on FSMap (in case of bugs where we want
              to continue anyway). The monitor terminates if the FSMap sanity
              check fails, but we can disable it by enabling this option.

:Type: Boolean
:Default: ``False``


``mon max mdsmap epochs``

:Description: The maximum number of mdsmap epochs to trim during a single
              proposal.

:Type: Integer
:Default: ``500``


``mon config key max entry size``

:Description: The maximum size of a config-key entry (in bytes).
:Type: Integer
:Default: ``65536``


``mon scrub interval``

:Description: How often (in seconds) the monitor scrubs its store by
              comparing the stored checksums with the computed ones for all
              stored keys.

:Type: Integer
:Default: ``3600*24``


``mon scrub max keys``

:Description: The maximum number of keys to scrub each time.
:Type: Integer
:Default: ``100``


``mon compact on start``

:Description: Compact the database used as the Ceph Monitor store on
              ``ceph-mon`` start. A manual compaction helps to shrink the
              monitor database and improve its performance if the regular
              compaction fails to work.

:Type: Boolean
:Default: ``False``


``mon compact on bootstrap``

:Description: Compact the database used as the Ceph Monitor store on
              bootstrap. Monitors start probing each other to create a quorum
              after bootstrap. If a monitor times out before joining the
              quorum, it will start over and bootstrap itself again.

:Type: Boolean
:Default: ``False``


``mon compact on trim``

:Description: Compact a certain prefix (including paxos) when we trim its old
              states.

:Type: Boolean
:Default: ``True``


``mon cpu threads``

:Description: Number of threads for performing CPU intensive work on the
              monitor.

:Type: Integer
:Default: ``4``


``mon osd mapping pgs per chunk``

:Description: We calculate the mapping from placement group to OSDs in
              chunks. This option specifies the number of placement groups
              per chunk.

:Type: Integer
:Default: ``4096``


``mon session timeout``

:Description: The monitor will terminate inactive sessions that stay idle
              beyond this time limit.

:Type: Integer
:Default: ``300``


``mon osd cache size min``

:Description: The minimum amount of bytes to be kept mapped in memory for
              osd monitor caches.

:Type: 64-bit Integer
:Default: ``134217728``


``mon memory target``

:Description: The amount of bytes pertaining to osd monitor caches and kv
              cache to be kept mapped in memory with cache auto-tuning
              enabled.

:Type: 64-bit Integer
:Default: ``2147483648``


``mon memory autotune``

:Description: Autotune the cache memory being used for osd monitors and kv
              database.

:Type: Boolean
:Default: ``True``


.. _Paxos: https://en.wikipedia.org/wiki/Paxos_(computer_science)
.. _Monitor Keyrings: ../../../dev/mon-bootstrap#secret-keys
.. _Ceph configuration file: ../ceph-conf/#monitors
.. _Network Configuration Reference: ../network-config-ref
.. _Monitor lookup through DNS: ../mon-lookup-dns
.. _ACID: https://en.wikipedia.org/wiki/ACID
.. _Adding/Removing a Monitor: ../../operations/add-or-rm-mons
.. _Add/Remove a Monitor (ceph-deploy): ../../deployment/ceph-deploy-mon
.. _Monitoring a Cluster: ../../operations/monitoring
.. _Monitoring OSDs and PGs: ../../operations/monitoring-osd-pg
.. _Bootstrapping a Monitor: ../../../dev/mon-bootstrap
.. _Changing a Monitor's IP Address: ../../operations/add-or-rm-mons#changing-a-monitor-s-ip-address
.. _Monitor/OSD Interaction: ../mon-osd-interaction
.. _Scalability and High Availability: ../../../architecture#scalability-and-high-availability
.. _Pool values: ../../operations/pools/#set-pool-values