==========================
 Monitor Config Reference
==========================

Understanding how to configure a :term:`Ceph Monitor` is an important part of
building a reliable :term:`Ceph Storage Cluster`. **All Ceph Storage Clusters
have at least one monitor**. A monitor configuration usually remains fairly
consistent, but you can add, remove or replace a monitor in a cluster. See
`Adding/Removing a Monitor`_ and `Add/Remove a Monitor (ceph-deploy)`_ for
details.


.. index:: Ceph Monitor; Paxos

Background
==========

Ceph Monitors maintain a "master copy" of the :term:`cluster map`, which means a
:term:`Ceph Client` can determine the location of all Ceph Monitors, Ceph OSD
Daemons, and Ceph Metadata Servers just by connecting to one Ceph Monitor and
retrieving a current cluster map. Before Ceph Clients can read from or write to
Ceph OSD Daemons or Ceph Metadata Servers, they must connect to a Ceph Monitor
first. With a current copy of the cluster map and the CRUSH algorithm, a Ceph
Client can compute the location of any object. The ability to compute object
locations allows a Ceph Client to talk directly to Ceph OSD Daemons, which is a
very important aspect of Ceph's high scalability and performance. See
`Scalability and High Availability`_ for additional details.

The primary role of the Ceph Monitor is to maintain a master copy of the cluster
map. Ceph Monitors also provide authentication and logging services. Ceph
Monitors write all changes in the monitor services to a single Paxos instance,
and Paxos writes the changes to a key/value store for strong consistency. Ceph
Monitors can query the most recent version of the cluster map during sync
operations. Ceph Monitors leverage the key/value store's snapshots and iterators
(using leveldb) to perform store-wide synchronization.

.. ditaa::

 /-------------\               /-------------\
 |   Monitor   | Write Changes |    Paxos    |
 | cCCC        +-------------->+ cCCC        |
 |             |               |             |
 +-------------+               \------+------/
 |    Auth     |                      |
 +-------------+                      | Write Changes
 |    Log      |                      |
 +-------------+                      v
 | Monitor Map |               /------+------\
 +-------------+               | Key / Value |
 |   OSD Map   |               |    Store    |
 +-------------+               |  cCCC       |
 |   PG Map    |               \------+------/
 +-------------+                      ^
 |   MDS Map   |                      | Read Changes
 +-------------+                      |
 |    cCCC     |*---------------------+
 \-------------/


.. deprecated:: 0.58

   In Ceph versions 0.58 and earlier, Ceph Monitors use a Paxos instance for
   each service and store the map as a file.

.. index:: Ceph Monitor; cluster map

Cluster Maps
------------

The cluster map is a composite of maps, including the monitor map, the OSD map,
the placement group map and the metadata server map. The cluster map tracks a
number of important things: which processes are ``in`` the Ceph Storage Cluster;
which processes that are ``in`` the Ceph Storage Cluster are ``up`` and running
or ``down``; whether the placement groups are ``active`` or ``inactive``, and
``clean`` or in some other state; and other details that reflect the current
state of the cluster, such as the total amount of storage space and the amount
of storage used.

When there is a significant change in the state of the cluster--e.g., a Ceph OSD
Daemon goes down, a placement group falls into a degraded state, etc.--the
cluster map gets updated to reflect the current state of the cluster.
Additionally, the Ceph Monitor also maintains a history of the prior states of
the cluster. The monitor map, OSD map, placement group map and metadata server
map each maintain a history of their map versions. We call each version an
"epoch."

When operating your Ceph Storage Cluster, keeping track of these states is an
important part of your system administration duties. See `Monitoring a Cluster`_
and `Monitoring OSDs and PGs`_ for additional details.

.. index:: high availability; quorum

Monitor Quorum
--------------

The Configuring Ceph section provides a trivial `Ceph configuration file`_ that
specifies one monitor in the test cluster. A cluster will run fine with a
single monitor; however, **a single monitor is a single-point-of-failure**. To
ensure high availability in a production Ceph Storage Cluster, you should run
Ceph with multiple monitors so that the failure of a single monitor **WILL NOT**
bring down your entire cluster.

When a Ceph Storage Cluster runs multiple Ceph Monitors for high availability,
Ceph Monitors use `Paxos`_ to establish consensus about the master cluster map.
A consensus requires a majority of monitors running to establish a quorum for
consensus about the cluster map (e.g., 1 out of 1; 2 out of 3; 3 out of 5;
4 out of 6; etc.).

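As a quick sketch of the majority rule described above (``quorum_size`` is a
hypothetical helper for illustration, not part of Ceph):

```python
def quorum_size(num_monitors: int) -> int:
    """Smallest majority of monitors that can form a quorum."""
    return num_monitors // 2 + 1

# Matches the examples in the text: 1 of 1, 2 of 3, 3 of 5, 4 of 6.
for n in (1, 3, 5, 6):
    print(f"{quorum_size(n)} out of {n}")
```

Note that six monitors tolerate no more failures than five (both keep a quorum
through two failures), which is one reason odd monitor counts are usually
recommended.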
``mon force quorum join``

:Description: Force the monitor to join the quorum, even if it has previously
              been removed from the map.
:Type: Boolean
:Default: ``False``


.. index:: Ceph Monitor; consistency

Consistency
-----------

When you add monitor settings to your Ceph configuration file, you need to be
aware of some of the architectural aspects of Ceph Monitors. **Ceph imposes
strict consistency requirements** for a Ceph monitor when discovering another
Ceph Monitor within the cluster. Whereas Ceph Clients and other Ceph daemons
use the Ceph configuration file to discover monitors, monitors discover each
other using the monitor map (monmap), not the Ceph configuration file.

A Ceph Monitor always refers to the local copy of the monmap when discovering
other Ceph Monitors in the Ceph Storage Cluster. Using the monmap instead of the
Ceph configuration file avoids errors that could break the cluster (e.g., typos
in ``ceph.conf`` when specifying a monitor address or port). Since monitors use
monmaps for discovery and they share monmaps with clients and other Ceph
daemons, **the monmap provides monitors with a strict guarantee that their
consensus is valid.**

Strict consistency also applies to updates to the monmap. As with any other
updates on the Ceph Monitor, changes to the monmap always run through a
distributed consensus algorithm called `Paxos`_. The Ceph Monitors must agree on
each update to the monmap, such as adding or removing a Ceph Monitor, to ensure
that each monitor in the quorum has the same version of the monmap. Updates to
the monmap are incremental, so that Ceph Monitors have the latest agreed upon
version and a set of previous versions. Maintaining a history enables a Ceph
Monitor that has an older version of the monmap to catch up with the current
state of the Ceph Storage Cluster.

If Ceph Monitors discovered each other through the Ceph configuration file
instead of through the monmap, it would introduce additional risks because
Ceph configuration files are not updated and distributed automatically. Ceph
Monitors might inadvertently use an older Ceph configuration file, fail to
recognize a Ceph Monitor, fall out of a quorum, or develop a situation where
`Paxos`_ is not able to determine the current state of the system accurately.
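The incremental catch-up behavior described above can be sketched as follows.
The ``catch_up`` helper and the dict-based update records are hypothetical
stand-ins for Ceph's real monmap structures, used only to illustrate how a
history of incremental updates lets an out-of-date monitor reach the latest
agreed-upon version:

```python
def catch_up(local_epoch, local_map, incrementals):
    """Apply incremental monmap updates, in epoch order, to an older copy.

    `incrementals` maps epoch -> {"added": {...}, "removed": [...]}; keeping
    this history lets a monitor with an old monmap reach the latest version.
    """
    monmap = dict(local_map)
    epoch = local_epoch
    for e in sorted(incrementals):
        if e <= epoch:
            continue  # this version is already applied locally
        inc = incrementals[e]
        monmap.update(inc.get("added", {}))
        for name in inc.get("removed", []):
            monmap.pop(name, None)
        epoch = e
    return epoch, monmap

# A monitor stuck at epoch 1 catches up to epoch 3.
epoch, monmap = catch_up(
    1,
    {"a": "10.0.0.2:6789", "b": "10.0.0.3:6789"},
    {2: {"added": {"c": "10.0.0.4:6789"}}, 3: {"removed": ["b"]}},
)
print(epoch, sorted(monmap))
```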


.. index:: Ceph Monitor; bootstrapping monitors

Bootstrapping Monitors
----------------------

In most configuration and deployment cases, tools that deploy Ceph may help
bootstrap the Ceph Monitors by generating a monitor map for you (e.g.,
``ceph-deploy``, etc.). A Ceph Monitor requires a few explicit
settings:

- **Filesystem ID**: The ``fsid`` is the unique identifier for your
  object store. Since you can run multiple clusters on the same
  hardware, you must specify the unique ID of the object store when
  bootstrapping a monitor. Deployment tools usually do this for you
  (e.g., ``ceph-deploy`` can call a tool like ``uuidgen``), but you
  may specify the ``fsid`` manually too.

- **Monitor ID**: A monitor ID is a unique ID assigned to each monitor within
  the cluster. It is an alphanumeric value, and by convention the identifier
  usually follows an alphabetical increment (e.g., ``a``, ``b``, etc.). This
  can be set in a Ceph configuration file (e.g., ``[mon.a]``, ``[mon.b]``, etc.),
  by a deployment tool, or using the ``ceph`` commandline.

- **Keys**: The monitor must have secret keys. A deployment tool such as
  ``ceph-deploy`` usually does this for you, but you may
  perform this step manually too. See `Monitor Keyrings`_ for details.

For additional details on bootstrapping, see `Bootstrapping a Monitor`_.

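For illustration, an ``fsid`` is simply a UUID, so generating one by hand
(equivalent to what ``uuidgen`` produces) takes only the standard library.
This is a sketch, not a replacement for your deployment tool:

```python
import uuid

# The fsid is a random UUID; deployment tools normally generate it for you.
fsid = str(uuid.uuid4())
print(fsid)

# It has the canonical 36-character, four-hyphen form used in ceph.conf.
assert len(fsid) == 36 and fsid.count("-") == 4
```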
.. index:: Ceph Monitor; configuring monitors

Configuring Monitors
====================

To apply configuration settings to the entire cluster, enter the configuration
settings under ``[global]``. To apply configuration settings to all monitors in
your cluster, enter the configuration settings under ``[mon]``. To apply
configuration settings to specific monitors, specify the monitor instance
(e.g., ``[mon.a]``). By convention, monitor instance names use alpha notation.

.. code-block:: ini

    [global]

    [mon]

    [mon.a]

    [mon.b]

    [mon.c]


Minimum Configuration
---------------------

The bare minimum monitor settings for a Ceph monitor via the Ceph configuration
file include a hostname and a monitor address for each monitor. You can configure
these under ``[mon]`` or under the entry for a specific monitor.

.. code-block:: ini

    [global]
    mon host = 10.0.0.2,10.0.0.3,10.0.0.4

.. code-block:: ini

    [mon.a]
    host = hostname1
    mon addr = 10.0.0.10:6789

See the `Network Configuration Reference`_ for details.

.. note:: This minimum configuration for monitors assumes that a deployment
   tool generates the ``fsid`` and the ``mon.`` key for you.

Once you deploy a Ceph cluster, you **SHOULD NOT** change the IP address of
the monitors. However, if you decide to change the monitor's IP address, you
must follow a specific procedure. See `Changing a Monitor's IP Address`_ for
details.

Monitors can also be found by clients using DNS SRV records. See `Monitor
lookup through DNS`_ for details.

Cluster ID
----------

Each Ceph Storage Cluster has a unique identifier (``fsid``). If specified, it
usually appears under the ``[global]`` section of the configuration file.
Deployment tools usually generate the ``fsid`` and store it in the monitor map,
so the value may not appear in a configuration file. The ``fsid`` makes it
possible to run daemons for multiple clusters on the same hardware.

``fsid``

:Description: The cluster ID. One per cluster.
:Type: UUID
:Required: Yes.
:Default: N/A. May be generated by a deployment tool if not specified.

.. note:: Do not set this value if you use a deployment tool that does
   it for you.

.. index:: Ceph Monitor; initial members

Initial Members
---------------

We recommend running a production Ceph Storage Cluster with at least three Ceph
Monitors to ensure high availability. When you run multiple monitors, you may
specify the initial monitors that must be members of the cluster in order to
establish a quorum. This may reduce the time it takes for your cluster to come
online.

.. code-block:: ini

    [mon]
    mon initial members = a,b,c


``mon initial members``

:Description: The IDs of initial monitors in a cluster during startup. If
              specified, Ceph requires an odd number of monitors to form an
              initial quorum (e.g., 3).

:Type: String
:Default: None

.. note:: A *majority* of monitors in your cluster must be able to reach
   each other in order to establish a quorum. You can decrease the initial
   number of monitors required to establish a quorum with this setting.

.. index:: Ceph Monitor; data path

Data
----

Ceph provides a default path where Ceph Monitors store data. For optimal
performance in a production Ceph Storage Cluster, we recommend running Ceph
Monitors on separate hosts and drives from Ceph OSD Daemons. Because leveldb
uses ``mmap()`` for writing the data, Ceph Monitors flush their data from
memory to disk very often, which can interfere with Ceph OSD Daemon workloads
if the data store is co-located with the OSD Daemons.

In Ceph versions 0.58 and earlier, Ceph Monitors store their data in files. This
approach allows users to inspect monitor data with common tools like ``ls``
and ``cat``. However, it doesn't provide strong consistency.

In Ceph versions 0.59 and later, Ceph Monitors store their data as key/value
pairs. Ceph Monitors require `ACID`_ transactions. Using a data store prevents
recovering Ceph Monitors from running corrupted versions through Paxos, and it
enables multiple modification operations in one single atomic batch, among other
advantages.

Generally, we do not recommend changing the default data location. If you modify
the default location, we recommend that you make it uniform across Ceph Monitors
by setting it in the ``[mon]`` section of the configuration file.


``mon data``

:Description: The monitor's data location.
:Type: String
:Default: ``/var/lib/ceph/mon/$cluster-$id``


``mon data size warn``

:Description: Issue a ``HEALTH_WARN`` in the cluster log when the monitor's
              data store goes over 15GB.
:Type: Integer
:Default: ``15*1024*1024*1024``

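The default above is written as an arithmetic expression in bytes; evaluating
it shows the 15GB (strictly, 15 GiB) threshold:

```python
# mon data size warn default, expressed in bytes.
mon_data_size_warn = 15 * 1024 * 1024 * 1024
print(mon_data_size_warn)  # 16106127360
```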

``mon data avail warn``

:Description: Issue a ``HEALTH_WARN`` in the cluster log when the available
              disk space of the monitor's data store is lower than or equal to
              this percentage.
:Type: Integer
:Default: 30


``mon data avail crit``

:Description: Issue a ``HEALTH_ERR`` in the cluster log when the available
              disk space of the monitor's data store is lower than or equal to
              this percentage.
:Type: Integer
:Default: 5


``mon warn on cache pools without hit sets``

:Description: Issue a ``HEALTH_WARN`` in the cluster log if a cache pool does
              not have the ``hit_set_type`` value configured.
              See :ref:`hit_set_type <hit_set_type>` for more
              details.
:Type: Boolean
:Default: True


``mon warn on crush straw calc version zero``

:Description: Issue a ``HEALTH_WARN`` in the cluster log if the CRUSH
              ``straw_calc_version`` is zero. See
              :ref:`CRUSH map tunables <crush-map-tunables>` for
              details.
:Type: Boolean
:Default: True


``mon warn on legacy crush tunables``

:Description: Issue a ``HEALTH_WARN`` in the cluster log if the CRUSH tunables
              are too old (older than ``mon_min_crush_required_version``).
:Type: Boolean
:Default: True


``mon crush min required version``

:Description: The minimum tunable profile version required by the cluster.
              See
              :ref:`CRUSH map tunables <crush-map-tunables>` for
              details.
:Type: String
:Default: ``firefly``


``mon warn on osd down out interval zero``

:Description: Issue a ``HEALTH_WARN`` in the cluster log if
              ``mon osd down out interval`` is zero. Having this option set to
              zero on the leader acts much like the ``noout`` flag. It is hard
              to diagnose a cluster that behaves as if the ``noout`` flag were
              set without actually having it set, so we report a warning in
              this case.
:Type: Boolean
:Default: True


``mon warn on slow ping ratio``

:Description: Issue a ``HEALTH_WARN`` in the cluster log if any heartbeat
              between OSDs exceeds ``mon warn on slow ping ratio``
              of ``osd heartbeat grace``. The default is 5%.
:Type: Float
:Default: ``0.05``


``mon warn on slow ping time``

:Description: Override ``mon warn on slow ping ratio`` with a specific value.
              Issue a ``HEALTH_WARN`` in the cluster log if any heartbeat
              between OSDs exceeds ``mon warn on slow ping time``
              milliseconds. The default is 0 (disabled).
:Type: Integer
:Default: ``0``

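How the two settings interact can be sketched as follows. The helper is
hypothetical, and the 20-second ``osd heartbeat grace`` value is an assumption
used only for illustration:

```python
def slow_ping_warn_threshold_ms(heartbeat_grace_s=20.0,
                                warn_ratio=0.05,
                                warn_time_ms=0):
    """Heartbeat time (ms) above which a HEALTH_WARN would be issued.

    A nonzero warn_time_ms (mon warn on slow ping time) overrides the
    ratio-based threshold derived from osd heartbeat grace.
    """
    if warn_time_ms > 0:
        return float(warn_time_ms)
    return heartbeat_grace_s * warn_ratio * 1000.0  # seconds -> milliseconds

print(slow_ping_warn_threshold_ms())                   # 1000.0
print(slow_ping_warn_threshold_ms(warn_time_ms=1500))  # 1500.0
```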

``mon cache target full warn ratio``

:Description: The position between a pool's ``cache_target_full`` and
              ``target_max_object`` at which we start warning.
:Type: Float
:Default: ``0.66``


``mon health data update interval``

:Description: How often (in seconds) a monitor in the quorum shares its health
              status with its peers. (A negative number disables it.)
:Type: Float
:Default: ``60``


``mon health to clog``

:Description: Enable sending a health summary to the cluster log periodically.
:Type: Boolean
:Default: True


``mon health to clog tick interval``

:Description: How often (in seconds) the monitor sends a health summary to the
              cluster log (a non-positive number disables it). If the current
              health summary is empty or identical to the previous one, the
              monitor will not send it to the cluster log.
:Type: Float
:Default: ``60.0``


``mon health to clog interval``

:Description: How often (in seconds) the monitor sends a health summary to the
              cluster log (a non-positive number disables it). The monitor
              always sends the summary to the cluster log, whether or not the
              summary has changed.
:Type: Integer
:Default: 3600


.. index:: Ceph Storage Cluster; capacity planning, Ceph Monitor; capacity planning

Storage Capacity
----------------

When a Ceph Storage Cluster gets close to its maximum capacity (i.e., ``mon osd
full ratio``), Ceph prevents you from writing to or reading from Ceph OSD
Daemons as a safety measure to prevent data loss. Therefore, letting a
production Ceph Storage Cluster approach its full ratio is not a good practice,
because it sacrifices high availability. The default full ratio is ``.95``, or
95% of capacity. This is a very aggressive setting for a test cluster with a
small number of OSDs.

.. tip:: When monitoring your cluster, be alert to warnings related to the
   ``nearfull`` ratio. Such warnings mean that the failure of one or more OSDs
   could result in a temporary service disruption. Consider adding more OSDs
   to increase storage capacity.

A common scenario for test clusters involves a system administrator removing a
Ceph OSD Daemon from the Ceph Storage Cluster to watch the cluster rebalance;
then removing another Ceph OSD Daemon, and so on until the Ceph Storage Cluster
eventually reaches the full ratio and locks up. We recommend a bit of capacity
planning even with a test cluster. Planning enables you to gauge how much spare
capacity you will need in order to maintain high availability. Ideally, you want
to plan for a series of Ceph OSD Daemon failures where the cluster can recover
to an ``active + clean`` state without replacing those Ceph OSD Daemons
immediately. You can run a cluster in an ``active + degraded`` state, but this
is not ideal for normal operating conditions.

The following diagram depicts a simplistic Ceph Storage Cluster containing 33
Ceph Nodes with one Ceph OSD Daemon per host, each Ceph OSD Daemon reading from
and writing to a 3TB drive. So this exemplary Ceph Storage Cluster has a maximum
actual capacity of 99TB. With a ``mon osd full ratio`` of ``0.95``, if the Ceph
Storage Cluster falls to 5TB of remaining capacity, the cluster will not allow
Ceph Clients to read or write data. So the Ceph Storage Cluster's operating
capacity is 95TB, not 99TB.

.. ditaa::

 +--------+  +--------+  +--------+  +--------+  +--------+  +--------+
 | Rack 1 |  | Rack 2 |  | Rack 3 |  | Rack 4 |  | Rack 5 |  | Rack 6 |
 | cCCC   |  | cF00   |  | cCCC   |  | cCCC   |  | cCCC   |  | cCCC   |
 +--------+  +--------+  +--------+  +--------+  +--------+  +--------+
 | OSD 1  |  | OSD 7  |  | OSD 13 |  | OSD 19 |  | OSD 25 |  | OSD 31 |
 +--------+  +--------+  +--------+  +--------+  +--------+  +--------+
 | OSD 2  |  | OSD 8  |  | OSD 14 |  | OSD 20 |  | OSD 26 |  | OSD 32 |
 +--------+  +--------+  +--------+  +--------+  +--------+  +--------+
 | OSD 3  |  | OSD 9  |  | OSD 15 |  | OSD 21 |  | OSD 27 |  | OSD 33 |
 +--------+  +--------+  +--------+  +--------+  +--------+  +--------+
 | OSD 4  |  | OSD 10 |  | OSD 16 |  | OSD 22 |  | OSD 28 |  | Spare  |
 +--------+  +--------+  +--------+  +--------+  +--------+  +--------+
 | OSD 5  |  | OSD 11 |  | OSD 17 |  | OSD 23 |  | OSD 29 |  | Spare  |
 +--------+  +--------+  +--------+  +--------+  +--------+  +--------+
 | OSD 6  |  | OSD 12 |  | OSD 18 |  | OSD 24 |  | OSD 30 |  | Spare  |
 +--------+  +--------+  +--------+  +--------+  +--------+  +--------+

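The arithmetic behind the example above works out as follows (the text rounds
94.05TB and 4.95TB to 95TB and 5TB):

```python
num_osds = 33
drive_tb = 3
full_ratio = 0.95

raw_tb = num_osds * drive_tb         # 99 TB of raw capacity
operating_tb = raw_tb * full_ratio   # ~94.05 TB usable before writes stop
reserved_tb = raw_tb - operating_tb  # ~4.95 TB kept free as a safety margin

print(raw_tb, round(operating_tb, 2), round(reserved_tb, 2))
```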
It is normal in such a cluster for one or two OSDs to fail. A less frequent but
reasonable scenario involves a rack's router or power supply failing, which
brings down multiple OSDs simultaneously (e.g., OSDs 7-12). In such a scenario,
you should still strive for a cluster that can remain operational and achieve an
``active + clean`` state--even if that means adding a few hosts with additional
OSDs in short order. If your capacity utilization is too high, you may not lose
data, but you could still sacrifice data availability while resolving an outage
within a failure domain if capacity utilization of the cluster exceeds the full
ratio. For this reason, we recommend at least some rough capacity planning.

Identify two numbers for your cluster:

#. The number of OSDs.
#. The total capacity of the cluster.

If you divide the total capacity of your cluster by the number of OSDs in your
cluster, you will find the mean average capacity of an OSD within your cluster.
Consider multiplying that number by the number of OSDs you expect will fail
simultaneously during normal operations (a relatively small number). Finally,
multiply the capacity of the cluster by the full ratio to arrive at a maximum
operating capacity; then, subtract the amount of data on the OSDs you expect
to fail to arrive at a reasonable full ratio. Repeat the foregoing process with
a higher number of OSD failures (e.g., a rack of OSDs) to arrive at a
reasonable number for a near full ratio.
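The planning procedure above can be sketched as a small calculation. The
``reasonable_full_ratio`` function and the example numbers are illustrative,
not a Ceph tool:

```python
def reasonable_full_ratio(total_capacity_tb, num_osds, expected_failures,
                          full_ratio=0.95):
    """Derive a full ratio that leaves room for the OSDs you expect to fail."""
    mean_osd_tb = total_capacity_tb / num_osds         # average OSD capacity
    max_operating_tb = total_capacity_tb * full_ratio  # capacity at full ratio
    # Subtract the data held by the OSDs expected to fail simultaneously.
    usable_tb = max_operating_tb - expected_failures * mean_osd_tb
    return usable_tb / total_capacity_tb

# 99 TB across 33 OSDs, planning for 2 simultaneous OSD failures ...
print(round(reasonable_full_ratio(99, 33, 2), 3))   # 0.889
# ... and for a whole rack of 6 OSDs, to inform a near full ratio.
print(round(reasonable_full_ratio(99, 33, 6), 3))   # 0.768
```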

The following settings only apply at cluster creation time and are then stored
in the OSDMap.

.. code-block:: ini

    [global]

    mon osd full ratio = .80
    mon osd backfillfull ratio = .75
    mon osd nearfull ratio = .70


``mon osd full ratio``

:Description: The percentage of disk space used before an OSD is
              considered ``full``.

:Type: Float
:Default: ``.95``


``mon osd backfillfull ratio``

:Description: The percentage of disk space used before an OSD is
              considered too ``full`` to backfill.

:Type: Float
:Default: ``.90``


``mon osd nearfull ratio``

:Description: The percentage of disk space used before an OSD is
              considered ``nearfull``.

:Type: Float
:Default: ``.85``


.. tip:: If some OSDs are nearfull, but others have plenty of capacity, you
   may have a problem with the CRUSH weight for the nearfull OSDs.

.. tip:: These settings only apply during cluster creation. Afterwards they
   need to be changed in the OSDMap using ``ceph osd set-nearfull-ratio`` and
   ``ceph osd set-full-ratio``.


.. index:: heartbeat

Heartbeat
---------

Ceph monitors know about the cluster by requiring reports from each OSD, and by
receiving reports from OSDs about the status of their neighboring OSDs. Ceph
provides reasonable default settings for monitor/OSD interaction; however, you
may modify them as needed. See `Monitor/OSD Interaction`_ for details.


.. index:: Ceph Monitor; leader, Ceph Monitor; provider, Ceph Monitor; requester, Ceph Monitor; synchronization

Monitor Store Synchronization
-----------------------------

When you run a production cluster with multiple monitors (recommended), each
monitor checks to see if a neighboring monitor has a more recent version of the
cluster map (e.g., a map in a neighboring monitor with one or more epoch numbers
higher than the most current epoch in the map of the local monitor).
Periodically, one monitor in the cluster may fall behind the other monitors to
the point where it must leave the quorum, synchronize to retrieve the most
current information about the cluster, and then rejoin the quorum. For the
purposes of synchronization, monitors may assume one of three roles:

#. **Leader**: The `Leader` is the first monitor to achieve the most recent
   Paxos version of the cluster map.

#. **Provider**: The `Provider` is a monitor that has the most recent version
   of the cluster map, but wasn't the first to achieve the most recent version.

#. **Requester**: A `Requester` is a monitor that has fallen behind the leader
   and must synchronize in order to retrieve the most recent information about
   the cluster before it can rejoin the quorum.

These roles enable a leader to delegate synchronization duties to a provider,
which prevents synchronization requests from overloading the leader and
improves performance. In the following diagram, the requester has learned that
it has fallen behind the other monitors. The requester asks the leader to
synchronize, and the leader tells the requester to synchronize with a provider.

.. ditaa::

    +-----------+         +---------+          +----------+
    | Requester |         | Leader  |          | Provider |
    +-----------+         +---------+          +----------+
          |                    |                     |
          |                    |                     |
          | Ask to Synchronize |                     |
          |------------------->|                     |
          |                    |                     |
          |<-------------------|                     |
          | Tell Requester to  |                     |
          | Sync with Provider |                     |
          |                    |                     |
          |               Synchronize                |
          |--------------------+------------------->|
          |                    |                     |
          |<-------------------+--------------------|
          |         Send Chunk to Requester          |
          |          (repeat as necessary)           |
          |    Requester Acks Chunk to Provider      |
          |--------------------+------------------->|
          |                    |
          |   Sync Complete    |
          |    Notification    |
          |------------------->|
          |                    |
          |<-------------------|
          |        Ack         |
          |                    |


Synchronization always occurs when a new monitor joins the cluster. During
runtime operations, monitors may receive updates to the cluster map at different
times. This means the leader and provider roles may migrate from one monitor to
another. If this happens while synchronizing (e.g., a provider falls behind the
leader), the provider can terminate synchronization with a requester.

Once synchronization is complete, Ceph requires trimming across the cluster.
Trimming requires that the placement groups are ``active + clean``.


``mon sync trim timeout``

:Description:
:Type: Double
:Default: ``30.0``


``mon sync heartbeat timeout``

:Description:
:Type: Double
:Default: ``30.0``


``mon sync heartbeat interval``

:Description:
:Type: Double
:Default: ``5.0``


``mon sync backoff timeout``

:Description:
:Type: Double
:Default: ``30.0``

``mon sync timeout``

:Description: Number of seconds the monitor will wait for the next update
              message from its sync provider before it gives up and
              bootstraps again.
:Type: Double
:Default: ``60.0``

``mon sync max retries``

:Description:
:Type: Integer
:Default: ``5``


``mon sync max payload size``

:Description: The maximum size for a sync payload (in bytes).
:Type: 32-bit Integer
:Default: ``1045676``

``paxos max join drift``

:Description: The maximum Paxos iterations before we must first sync the
              monitor data stores. When a monitor finds that its peer is too
              far ahead of it, it will first sync with the data stores before
              moving on.
:Type: Integer
:Default: ``10``

``paxos stash full interval``

:Description: How often (in commits) to stash a full copy of the PaxosService
              state. Currently this setting only affects the ``mds``, ``mon``,
              ``auth`` and ``mgr`` PaxosServices.
:Type: Integer
:Default: ``25``

``paxos propose interval``

:Description: Gather updates for this time interval before proposing
              a map update.
:Type: Double
:Default: ``1.0``


``paxos min``

:Description: The minimum number of Paxos states to keep around.
:Type: Integer
:Default: ``500``

``paxos min wait``

:Description: The minimum amount of time to gather updates after a period of
              inactivity.
:Type: Double
:Default: ``0.05``

``paxos trim min``

:Description: The number of extra proposals tolerated before trimming.
:Type: Integer
:Default: ``250``


``paxos trim max``

:Description: The maximum number of extra proposals to trim at a time.
:Type: Integer
:Default: ``500``


``paxos service trim min``

:Description: The minimum number of versions to trigger a trim (``0`` disables it).
:Type: Integer
:Default: ``250``


``paxos service trim max``

:Description: The maximum number of versions to trim during a single proposal
              (``0`` disables it).
:Type: Integer
:Default: ``500``


``mon max log epochs``

:Description: The maximum number of log epochs to trim during a single proposal.
:Type: Integer
:Default: ``500``


``mon max pgmap epochs``

:Description: The maximum number of pgmap epochs to trim during a single proposal.
:Type: Integer
:Default: ``500``


``mon mds force trim to``

:Description: Force the monitor to trim mdsmaps to this point (``0`` disables
              it; dangerous, use with care).
:Type: Integer
:Default: ``0``


``mon osd force trim to``

:Description: Force the monitor to trim osdmaps to this point, even if there
              are PGs that are not clean at the specified epoch (``0`` disables
              it; dangerous, use with care).
:Type: Integer
:Default: ``0``

``mon osd cache size``

:Description: The size of the osdmap cache, so as not to rely on the
              underlying store's cache.
:Type: Integer
:Default: ``10``


``mon election timeout``

:Description: On an election proposer, the maximum waiting time in seconds for
              all ACKs.
:Type: Float
:Default: ``5``


``mon lease``

:Description: The length (in seconds) of the lease on the monitor's versions.
:Type: Float
:Default: ``5``


``mon lease renew interval factor``

:Description: ``mon lease`` \* ``mon lease renew interval factor`` will be the
              interval at which the Leader renews the other monitors' leases.
              The factor should be less than ``1.0``.
:Type: Float
:Default: ``0.6``


``mon lease ack timeout factor``

:Description: The Leader will wait ``mon lease`` \* ``mon lease ack timeout factor``
              for the Providers to acknowledge the lease extension.
:Type: Float
:Default: ``2.0``


``mon accept timeout factor``

:Description: The Leader will wait ``mon lease`` \* ``mon accept timeout factor``
              for the Requester(s) to accept a Paxos update. It is also used
              during the Paxos recovery phase for similar purposes.
:Type: Float
:Default: ``2.0``

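The three factors above are all multiples of ``mon lease``. With the
defaults listed here, the derived timings work out as follows (simple
arithmetic using the defaults; no Ceph code involved):

```python
# Derived lease timings from the default values documented above.
mon_lease = 5.0        # mon lease (seconds)
renew_factor = 0.6     # mon lease renew interval factor
ack_factor = 2.0       # mon lease ack timeout factor
accept_factor = 2.0    # mon accept timeout factor

renew_interval = mon_lease * renew_factor    # Leader renews leases every 3.0 s
ack_timeout = mon_lease * ack_factor         # waits up to 10.0 s for lease acks
accept_timeout = mon_lease * accept_factor   # waits up to 10.0 s for accepts

print(renew_interval, ack_timeout, accept_timeout)  # 3.0 10.0 10.0
```

Because the renew interval (3.0 s) is well under the 5-second lease,
leases are normally renewed before they can expire.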

``mon min osdmap epochs``

:Description: Minimum number of OSD map epochs to keep at all times.
:Type: 32-bit Integer
:Default: ``500``


``mon max pgmap epochs``

:Description: Maximum number of PG map epochs the monitor should keep.
:Type: 32-bit Integer
:Default: ``500``


``mon max log epochs``

:Description: Maximum number of log epochs the monitor should keep.
:Type: 32-bit Integer
:Default: ``500``

.. index:: Ceph Monitor; clock

Clock
-----

Ceph daemons pass critical messages to each other, which must be processed
before daemons reach a timeout threshold. If the clocks in Ceph Monitors
are not synchronized, it can lead to a number of anomalies. For example:

- Daemons ignoring received messages (e.g., timestamps outdated)
- Timeouts triggered too soon or too late when a message wasn't received
  in time.

See `Monitor Store Synchronization`_ for details.


.. tip:: You SHOULD install NTP on your Ceph monitor hosts to
         ensure that the monitor cluster operates with synchronized clocks.

Clock drift may still be noticeable with NTP even though the discrepancy is
not yet harmful. Ceph's clock drift / clock skew warnings may get triggered
even though NTP maintains a reasonable level of synchronization. Increasing
the allowed clock drift may be tolerable under such circumstances; however, a
number of factors such as workload, network latency, overrides to default
timeouts, and the `Monitor Store Synchronization`_ settings may influence
the level of acceptable clock drift without compromising Paxos guarantees.

Ceph provides the following tunable options to allow you to find
acceptable values.


``clock offset``

:Description: How much to offset the system clock. See ``Clock.cc`` for details.
:Type: Double
:Default: ``0``

.. deprecated:: 0.58


``mon tick interval``

:Description: A monitor's tick interval in seconds.
:Type: 32-bit Integer
:Default: ``5``


``mon clock drift allowed``

:Description: The clock drift in seconds allowed between monitors.
:Type: Float
:Default: ``.050``


``mon clock drift warn backoff``

:Description: Exponential backoff for clock drift warnings.
:Type: Float
:Default: ``5``

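The warn backoff keeps repeated clock-skew warnings from flooding the
cluster log. As a rough illustration of the exponential-backoff idea
(this is not Ceph's exact warning logic), each successive warning
interval grows by the backoff factor:

```python
def warn_intervals(base_interval, backoff, count):
    """Illustrative exponential backoff: each successive warning waits
    `backoff` times longer than the previous one before firing again."""
    intervals = []
    interval = base_interval
    for _ in range(count):
        intervals.append(interval)
        interval *= backoff
    return intervals

# With a 300 s timecheck interval and the default backoff factor of 5,
# repeated warnings for a persistent skew would be spaced roughly like:
print(warn_intervals(300, 5, 4))  # [300, 1500, 7500, 37500]
```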

``mon timecheck interval``

:Description: The time check interval (clock drift check) in seconds
              for the Leader.
:Type: Float
:Default: ``300.0``


``mon timecheck skew interval``

:Description: The time check interval (clock drift check) in seconds when a
              clock skew is present, for the Leader.
:Type: Float
:Default: ``30.0``


Client
------

``mon client hunt interval``

:Description: The client will try a new monitor every ``N`` seconds until it
              establishes a connection.
:Type: Double
:Default: ``3.0``


``mon client ping interval``

:Description: The client will ping the monitor every ``N`` seconds.
:Type: Double
:Default: ``10.0``


``mon client max log entries per message``

:Description: The maximum number of log entries a monitor will generate
              per client message.
:Type: Integer
:Default: ``1000``


``mon client bytes``

:Description: The amount of client message data allowed in memory (in bytes).
:Type: 64-bit Integer Unsigned
:Default: ``100ul << 20``

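Defaults such as ``100ul << 20`` (here) and ``400ul << 20`` (for
``mon daemon bytes`` under Miscellaneous) are written as C-style bit
shifts: shifting left by 20 multiplies by 2\ :sup:`20`, i.e. by one MiB.
A quick check of the arithmetic:

```python
# C-style shift defaults expanded into plain byte counts.
MiB = 1 << 20                   # 2**20 = 1048576 bytes

mon_client_bytes = 100 << 20    # 100 MiB of client message data
mon_daemon_bytes = 400 << 20    # 400 MiB for MDS/OSD messages

print(mon_client_bytes)         # 104857600
print(mon_daemon_bytes)         # 419430400
```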
Pool settings
=============

Since version v0.94 there has been support for pool flags, which allow or
disallow changes to be made to pools.

Monitors can also disallow removal of pools if configured that way.

``mon allow pool delete``

:Description: Whether the monitors should allow pools to be removed,
              regardless of what the pool flags say.
:Type: Boolean
:Default: ``false``

``osd pool default ec fast read``

:Description: Whether to turn on fast read on the pool or not. It will be used
              as the default setting of newly created erasure coded pools if
              ``fast_read`` is not specified at create time.
:Type: Boolean
:Default: ``false``

``osd pool default flag hashpspool``

:Description: Set the hashpspool flag on new pools.
:Type: Boolean
:Default: ``true``

``osd pool default flag nodelete``

:Description: Set the nodelete flag on new pools, which prevents those pools
              from being removed.
:Type: Boolean
:Default: ``false``

``osd pool default flag nopgchange``

:Description: Set the nopgchange flag on new pools. Does not allow the number
              of PGs to be changed for a pool.
:Type: Boolean
:Default: ``false``

``osd pool default flag nosizechange``

:Description: Set the nosizechange flag on new pools. Does not allow the size
              of a pool to be changed.
:Type: Boolean
:Default: ``false``

For more information about the pool flags see `Pool values`_.
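As an illustrative ``ceph.conf`` fragment (adjust the section and values to
your cluster; this is a sketch, not a recommended configuration), the options
above could be set like this:

```ini
[mon]
; refuse pool deletion regardless of per-pool flags
mon allow pool delete = false

; flag defaults applied to newly created pools
osd pool default flag nodelete = true
osd pool default flag nopgchange = false
osd pool default flag nosizechange = false
```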

Miscellaneous
=============


``mon max osd``

:Description: The maximum number of OSDs allowed in the cluster.
:Type: 32-bit Integer
:Default: ``10000``

``mon globalid prealloc``

:Description: The number of global IDs to pre-allocate for clients and daemons
              in the cluster.
:Type: 32-bit Integer
:Default: ``100``

``mon subscribe interval``

:Description: The refresh interval (in seconds) for subscriptions. The
              subscription mechanism enables obtaining the cluster maps
              and log information.
:Type: Double
:Default: ``86400``


``mon stat smooth intervals``

:Description: Ceph will smooth statistics over the last ``N`` PG maps.
:Type: Integer
:Default: ``2``


``mon probe timeout``

:Description: Number of seconds the monitor will wait to find peers before
              bootstrapping.
:Type: Double
:Default: ``2.0``


``mon daemon bytes``

:Description: The message memory cap for metadata server and OSD messages
              (in bytes).
:Type: 64-bit Integer Unsigned
:Default: ``400ul << 20``


``mon max log entries per event``

:Description: The maximum number of log entries per event.
:Type: Integer
:Default: ``4096``


``mon osd prime pg temp``

:Description: Enables or disables priming the PGMap with the previous OSDs
              when an ``out`` OSD comes back into the cluster. With the
              ``true`` setting, clients will continue to use the previous
              OSDs until the newly ``in`` OSDs have peered for a given PG.
:Type: Boolean
:Default: ``true``


``mon osd prime pg temp max time``

:Description: How much time in seconds the monitor should spend trying to
              prime the PGMap when an out OSD comes back into the cluster.
:Type: Float
:Default: ``0.5``


``mon osd prime pg temp max time estimate``

:Description: Maximum estimate of time spent on each PG before we prime all
              PGs in parallel.
:Type: Float
:Default: ``0.25``


``mon osd allow primary affinity``

:Description: Allow ``primary_affinity`` to be set in the osdmap.
:Type: Boolean
:Default: ``False``


``mon mds skip sanity``

:Description: Skip safety assertions on the FSMap (in case of bugs where we
              want to continue anyway). The monitor terminates if the FSMap
              sanity check fails, but that behavior can be disabled by
              enabling this option.
:Type: Boolean
:Default: ``False``


``mon max mdsmap epochs``

:Description: The maximum number of mdsmap epochs to trim during a single
              proposal.
:Type: Integer
:Default: ``500``


``mon config key max entry size``

:Description: The maximum size of a config-key entry (in bytes).
:Type: Integer
:Default: ``4096``


``mon scrub interval``

:Description: How often (in seconds) the monitor scrubs its store, comparing
              the stored checksums with the computed checksums of all stored
              keys.
:Type: Integer
:Default: ``3600*24``


``mon scrub max keys``

:Description: The maximum number of keys to scrub each time.
:Type: Integer
:Default: ``100``


``mon compact on start``

:Description: Compact the database used as the Ceph Monitor store on
              ``ceph-mon`` start. A manual compaction helps to shrink the
              monitor database and improve its performance if the regular
              compaction fails to work.
:Type: Boolean
:Default: ``False``


``mon compact on bootstrap``

:Description: Compact the database used as the Ceph Monitor store on
              bootstrap. Monitors start probing each other to create a quorum
              after bootstrap. If a monitor times out before joining the
              quorum, it will start over and bootstrap itself again.
:Type: Boolean
:Default: ``False``


``mon compact on trim``

:Description: Compact a certain prefix (including paxos) when we trim its old
              states.
:Type: Boolean
:Default: ``True``


``mon cpu threads``

:Description: Number of threads for performing CPU-intensive work on the
              monitor.
:Type: Integer
:Default: ``4``


``mon osd mapping pgs per chunk``

:Description: We calculate the mapping from placement group to OSDs in chunks.
              This option specifies the number of placement groups per chunk.
:Type: Integer
:Default: ``4096``

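For example, the number of chunks the monitor would process for a given PG
count is simply a ceiling division (illustrative arithmetic using the default
chunk size; the helper name is hypothetical):

```python
import math

PGS_PER_CHUNK = 4096          # mon osd mapping pgs per chunk (default)

def chunks_needed(pg_num):
    """Hypothetical helper: how many mapping chunks cover pg_num PGs."""
    return math.ceil(pg_num / PGS_PER_CHUNK)

print(chunks_needed(4096))    # 1
print(chunks_needed(8192))    # 2
print(chunks_needed(10000))   # 3  (10000 / 4096 rounds up)
```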
``mon session timeout``

:Description: The monitor will terminate inactive sessions that stay idle
              beyond this time limit.
:Type: Integer
:Default: ``300``

``mon osd cache size min``

:Description: The minimum number of bytes to be kept mapped in memory for OSD
              monitor caches.
:Type: 64-bit Integer
:Default: ``134217728``

``mon memory target``

:Description: The number of bytes pertaining to OSD monitor caches and KV
              cache to be kept mapped in memory with cache auto-tuning enabled.
:Type: 64-bit Integer
:Default: ``2147483648``

``mon memory autotune``

:Description: Autotune the cache memory used for the OSD monitors and KV
              database.
:Type: Boolean
:Default: ``True``


.. _Paxos: https://en.wikipedia.org/wiki/Paxos_(computer_science)
.. _Monitor Keyrings: ../../../dev/mon-bootstrap#secret-keys
.. _Ceph configuration file: ../ceph-conf/#monitors
.. _Network Configuration Reference: ../network-config-ref
.. _Monitor lookup through DNS: ../mon-lookup-dns
.. _ACID: https://en.wikipedia.org/wiki/ACID
.. _Adding/Removing a Monitor: ../../operations/add-or-rm-mons
.. _Add/Remove a Monitor (ceph-deploy): ../../deployment/ceph-deploy-mon
.. _Monitoring a Cluster: ../../operations/monitoring
.. _Monitoring OSDs and PGs: ../../operations/monitoring-osd-pg
.. _Bootstrapping a Monitor: ../../../dev/mon-bootstrap
.. _Changing a Monitor's IP Address: ../../operations/add-or-rm-mons#changing-a-monitor-s-ip-address
.. _Monitor/OSD Interaction: ../mon-osd-interaction
.. _Scalability and High Availability: ../../../architecture#scalability-and-high-availability
.. _Pool values: ../../operations/pools/#set-pool-values