======================
 OSD Config Reference
======================

.. index:: OSD; configuration

You can configure Ceph OSD Daemons in the Ceph configuration file, but Ceph OSD
Daemons can use the default values and a very minimal configuration. A minimal
Ceph OSD Daemon configuration sets ``osd journal size`` and ``host``, and
uses default values for nearly everything else.

Ceph OSD Daemons are numerically identified in incremental fashion, beginning
with ``0``, using the following convention::

    osd.0
    osd.1
    osd.2

In a configuration file, you may specify settings for all Ceph OSD Daemons in
the cluster by adding configuration settings to the ``[osd]`` section of your
configuration file. To add settings directly to a specific Ceph OSD Daemon
(e.g., ``host``), enter them in an OSD-specific section of your configuration
file. For example:

.. code-block:: ini

    [osd]
        osd journal size = 1024

    [osd.0]
        host = osd-host-a

    [osd.1]
        host = osd-host-b

.. index:: OSD; config settings

General Settings
================

The following settings provide a Ceph OSD Daemon's ID, and determine paths to
data and journals. Ceph deployment scripts typically generate the UUID
automatically. We **DO NOT** recommend changing the default paths for data or
journals, as it makes it more problematic to troubleshoot Ceph later.

The journal size should be at least twice the product of the expected drive
speed and ``filestore max sync interval``. However, the most common
practice is to partition the journal drive (often an SSD), and mount it such
that Ceph uses the entire partition for the journal.


``osd uuid``

:Description: The universally unique identifier (UUID) for the Ceph OSD Daemon.
:Type: UUID
:Default: The UUID.
:Note: The ``osd uuid`` applies to a single Ceph OSD Daemon. The ``fsid``
       applies to the entire cluster.


``osd data``

:Description: The path to the OSD's data. You must create the directory when
              deploying Ceph. You should mount a drive for OSD data at this
              mount point. We do not recommend changing the default.

:Type: String
:Default: ``/var/lib/ceph/osd/$cluster-$id``


``osd max write size``

:Description: The maximum size of a write in megabytes.
:Type: 32-bit Integer
:Default: ``90``


``osd client message size cap``

:Description: The largest client data message allowed in memory.
:Type: 64-bit Unsigned Integer
:Default: 500MB default. ``500*1024L*1024L``


``osd class dir``

:Description: The class path for RADOS class plug-ins.
:Type: String
:Default: ``$libdir/rados-classes``


.. index:: OSD; file system

File System Settings
====================

Ceph builds and mounts file systems which are used for Ceph OSDs.

``osd mkfs options {fs-type}``

:Description: Options used when creating a new Ceph OSD of type {fs-type}.

:Type: String
:Default for xfs: ``-f -i 2048``
:Default for other file systems: {empty string}

For example::

    osd mkfs options xfs = -f -d agcount=24


``osd mount options {fs-type}``

:Description: Options used when mounting a Ceph OSD of type {fs-type}.

:Type: String
:Default for xfs: ``rw,noatime,inode64``
:Default for other file systems: ``rw, noatime``

For example::

    osd mount options xfs = rw, noatime, inode64, logbufs=8


.. index:: OSD; journal settings

Journal Settings
================

By default, Ceph expects that you will store a Ceph OSD Daemon's journal at
the following path::

    /var/lib/ceph/osd/$cluster-$id/journal

Without performance optimization, Ceph stores the journal on the same disk as
the Ceph OSD Daemon's data. A Ceph OSD Daemon optimized for performance may use
a separate disk to store journal data (e.g., a solid state drive delivers high
performance journaling).

Ceph's default ``osd journal size`` is 0, so you will need to set this in your
``ceph.conf`` file. To size the journal, take the product of ``filestore
max sync interval`` and the expected throughput, and multiply that product by
two (2)::

    osd journal size = {2 * (expected throughput * filestore max sync interval)}

The expected throughput number should include the expected disk throughput
(i.e., sustained data transfer rate) and network throughput. For example,
a 7200 RPM disk will likely have approximately 100 MB/s. Taking the ``min()``
of the disk and network throughput should provide a reasonable expected
throughput. Some users just start off with a 10GB journal size. For
example::

    osd journal size = 10000

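To make the arithmetic concrete: with an illustrative sustained throughput of
100 MB/s and a ``filestore max sync interval`` of 5 seconds (substitute your
own measured values), the formula above works out to
``2 * (100 MB/s * 5 s) = 1000 MB``, so::

    osd journal size = 1000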

``osd journal``

:Description: The path to the OSD's journal. This may be a path to a file or a
              block device (such as a partition of an SSD). If it is a file,
              you must create the directory to contain it. We recommend using a
              drive separate from the ``osd data`` drive.

:Type: String
:Default: ``/var/lib/ceph/osd/$cluster-$id/journal``


``osd journal size``

:Description: The size of the journal in megabytes. If this is 0, and the
              journal is a block device, the entire block device is used.
              Since v0.54, this is ignored if the journal is a block device,
              and the entire block device is used.

:Type: 32-bit Integer
:Default: ``5120``
:Recommended: Begin with 1GB. Should be at least twice the product of the
              expected speed and ``filestore max sync interval``.


See `Journal Config Reference`_ for additional details.


Monitor OSD Interaction
=======================

Ceph OSD Daemons check each other's heartbeats and report to monitors
periodically. Ceph can use default values in many cases. However, if your
network has latency issues, you may need to adopt longer intervals. See
`Configuring Monitor/OSD Interaction`_ for a detailed discussion of heartbeats.


Data Placement
==============

See `Pool & PG Config Reference`_ for details.


.. index:: OSD; scrubbing

Scrubbing
=========

In addition to making multiple copies of objects, Ceph ensures data integrity by
scrubbing placement groups. Ceph scrubbing is analogous to ``fsck`` on the
object storage layer. For each placement group, Ceph generates a catalog of all
objects and compares each primary object and its replicas to ensure that no
objects are missing or mismatched. Light scrubbing (daily) checks the object
size and attributes. Deep scrubbing (weekly) reads the data and uses checksums
to ensure data integrity.

Scrubbing is important for maintaining data integrity, but it can reduce
performance. You can adjust the following settings to increase or decrease
scrubbing operations.


``osd max scrubs``

:Description: The maximum number of simultaneous scrub operations for
              a Ceph OSD Daemon.

:Type: 32-bit Integer
:Default: ``1``

``osd scrub begin hour``

:Description: The time of day for the lower bound when a scheduled scrub can be
              performed.
:Type: Integer in the range of 0 to 24
:Default: ``0``


``osd scrub end hour``

:Description: The time of day for the upper bound when a scheduled scrub can be
              performed. Together with ``osd scrub begin hour``, this defines a
              time window in which scrubs can happen. However, a scrub will be
              performed regardless of the time window as long as the placement
              group's scrub interval exceeds ``osd scrub max interval``.
:Type: Integer in the range of 0 to 24
:Default: ``24``

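For example, to confine scheduled scrubs to off-peak hours, you might set a
window such as the following (the 01:00 to 07:00 window is illustrative):

.. code-block:: ini

    [osd]
        osd scrub begin hour = 1
        osd scrub end hour = 7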

``osd scrub during recovery``

:Description: Allow scrubs during recovery. Setting this to ``false`` will
              disable the scheduling of new scrubs (and deep scrubs) while
              there is active recovery. Already running scrubs will be
              continued. This might be useful to reduce load on busy clusters.
:Type: Boolean
:Default: ``true``


``osd scrub thread timeout``

:Description: The maximum time in seconds before timing out a scrub thread.
:Type: 32-bit Integer
:Default: ``60``


``osd scrub finalize thread timeout``

:Description: The maximum time in seconds before timing out a scrub finalize
              thread.

:Type: 32-bit Integer
:Default: ``60*10``


``osd scrub load threshold``

:Description: The maximum load. Ceph will not scrub when the system load
              (as defined by ``getloadavg()``) is higher than this number.

:Type: Float
:Default: ``0.5``


``osd scrub min interval``

:Description: The minimal interval in seconds for scrubbing the Ceph OSD Daemon
              when the Ceph Storage Cluster load is low.

:Type: Float
:Default: Once per day. ``60*60*24``


``osd scrub max interval``

:Description: The maximum interval in seconds for scrubbing the Ceph OSD Daemon
              irrespective of cluster load.

:Type: Float
:Default: Once per week. ``7*60*60*24``


``osd scrub chunk min``

:Description: The minimal number of object store chunks to scrub during a
              single operation. Ceph blocks writes to a single chunk during a
              scrub.

:Type: 32-bit Integer
:Default: 5


``osd scrub chunk max``

:Description: The maximum number of object store chunks to scrub during a
              single operation.

:Type: 32-bit Integer
:Default: 25


``osd scrub sleep``

:Description: Time to sleep before scrubbing the next group of chunks.
              Increasing this value will slow down the whole scrub operation,
              while client operations will be less impacted.

:Type: Float
:Default: 0


``osd deep scrub interval``

:Description: The interval for "deep" scrubbing (fully reading all data). The
              ``osd scrub load threshold`` does not affect this setting.

:Type: Float
:Default: Once per week. ``60*60*24*7``

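As an illustrative sketch (not a recommendation), a cluster that prefers
weekly light scrubs and biweekly deep scrubs could express that cadence in
seconds using the intervals above:

.. code-block:: ini

    [osd]
        osd scrub min interval = 604800
        osd scrub max interval = 1209600
        osd deep scrub interval = 1209600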

``osd scrub interval randomize ratio``

:Description: Add a random delay to ``osd scrub min interval`` when scheduling
              the next scrub job for a placement group. The delay is a random
              value less than ``osd scrub min interval`` \*
              ``osd scrub interval randomize ratio``. So the default setting
              practically randomly spreads the scrubs out in the allowed time
              window of ``[1, 1.5]`` \* ``osd scrub min interval``.
:Type: Float
:Default: ``0.5``


``osd deep scrub stride``

:Description: Read size when doing a deep scrub.
:Type: 32-bit Integer
:Default: 512 KB. ``524288``


.. index:: OSD; operations settings

Operations
==========

Operations settings allow you to configure the number of threads for servicing
requests. If you set ``osd op threads`` to ``0``, it disables multi-threading.
By default, Ceph uses two threads with a 30 second timeout and a 30 second
complaint time if an operation doesn't complete within those time parameters.
You can set operations priority weights between client operations and
recovery operations to ensure optimal performance during recovery.


``osd op threads``

:Description: The number of threads to service Ceph OSD Daemon operations.
              Set to ``0`` to disable it. Increasing the number may increase
              the request processing rate.

:Type: 32-bit Integer
:Default: ``2``


``osd op queue``

:Description: This sets the type of queue to be used for prioritizing ops
              in the OSDs. Both queues feature a strict sub-queue which is
              dequeued before the normal queue. The normal queue is different
              between implementations. The original PrioritizedQueue (``prio``)
              uses a token bucket system: when there are sufficient tokens, it
              dequeues high priority queues first; when tokens run out, queues
              are dequeued from low priority to high priority. The newer
              WeightedPriorityQueue (``wpq``) dequeues all priorities in
              relation to their priorities to prevent starvation of any queue.
              WPQ should help in cases where a few OSDs are more overloaded
              than others. Requires a restart.

:Type: String
:Valid Choices: prio, wpq
:Default: ``prio``


``osd op queue cut off``

:Description: This selects which priority ops will be sent to the strict
              queue versus the normal queue. The ``low`` setting sends all
              replication ops and higher to the strict queue, while the ``high``
              option sends only replication acknowledgement ops and higher to
              the strict queue. Setting this to ``high`` should help when a few
              OSDs in the cluster are very busy, especially when combined with
              ``wpq`` in the ``osd op queue`` setting. OSDs that are very busy
              handling replication traffic could starve primary client traffic
              on these OSDs without these settings. Requires a restart.

:Type: String
:Valid Choices: low, high
:Default: ``low``

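These two settings are often tuned together. As a sketch, a cluster where a
few busy OSDs starve client traffic might try the following combination (both
changes require restarting the OSDs):

.. code-block:: ini

    [osd]
        osd op queue = wpq
        osd op queue cut off = high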

``osd client op priority``

:Description: The priority set for client operations. It is relative to
              ``osd recovery op priority``.

:Type: 32-bit Integer
:Default: ``63``
:Valid Range: 1-63


``osd recovery op priority``

:Description: The priority set for recovery operations. It is relative to
              ``osd client op priority``.

:Type: 32-bit Integer
:Default: ``3``
:Valid Range: 1-63


``osd scrub priority``

:Description: The priority set for scrub operations. It is relative to
              ``osd client op priority``.

:Type: 32-bit Integer
:Default: ``5``
:Valid Range: 1-63


``osd snap trim priority``

:Description: The priority set for snap trim operations. It is relative to
              ``osd client op priority``.

:Type: 32-bit Integer
:Default: ``5``
:Valid Range: 1-63


``osd op thread timeout``

:Description: The Ceph OSD Daemon operation thread timeout in seconds.
:Type: 32-bit Integer
:Default: ``15``


``osd op complaint time``

:Description: An operation becomes complaint worthy after the specified number
              of seconds has elapsed.

:Type: Float
:Default: ``30``


``osd disk threads``

:Description: The number of disk threads, which are used to perform background
              disk intensive OSD operations such as scrubbing and snap
              trimming.

:Type: 32-bit Integer
:Default: ``1``

``osd disk thread ioprio class``

:Description: Warning: it will only be used if both ``osd disk thread
              ioprio class`` and ``osd disk thread ioprio priority`` are
              set to a non-default value. Sets the ioprio_set(2) I/O
              scheduling ``class`` for the disk thread. Acceptable
              values are ``idle``, ``be`` or ``rt``. The ``idle``
              class means the disk thread will have lower priority
              than any other thread in the OSD. This is useful to slow
              down scrubbing on an OSD that is busy handling client
              operations. ``be`` is the default and is the same
              priority as all other threads in the OSD. ``rt`` means
              the disk thread will have precedence over all other
              threads in the OSD. Note: Only works with the Linux Kernel
              CFQ scheduler. Since Jewel, scrubbing is no longer carried
              out by the disk thread; see the osd priority options instead.
:Type: String
:Default: the empty string


``osd disk thread ioprio priority``

:Description: Warning: it will only be used if both ``osd disk thread
              ioprio class`` and ``osd disk thread ioprio priority`` are
              set to a non-default value. It sets the ioprio_set(2)
              I/O scheduling ``priority`` of the disk thread, ranging
              from 0 (highest) to 7 (lowest). If all OSDs on a given
              host were in class ``idle`` and compete for I/O
              (i.e. due to controller congestion), it can be used to
              lower the disk thread priority of one OSD to 7 so that
              another OSD with priority 0 can have priority.
              Note: Only works with the Linux Kernel CFQ scheduler.
:Type: Integer in the range of 0 to 7 or -1 if not to be used.
:Default: ``-1``

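For example, to give the disk thread the lowest possible I/O priority under
the CFQ scheduler (an illustrative pre-Jewel sketch, given the note above),
both options must be set to non-default values:

.. code-block:: ini

    [osd]
        osd disk thread ioprio class = idle
        osd disk thread ioprio priority = 7
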
``osd op history size``

:Description: The maximum number of completed operations to track.
:Type: 32-bit Unsigned Integer
:Default: ``20``


``osd op history duration``

:Description: The oldest completed operation to track.
:Type: 32-bit Unsigned Integer
:Default: ``600``


``osd op log threshold``

:Description: How many operations logs to display at once.
:Type: 32-bit Integer
:Default: ``5``

.. index:: OSD; backfilling

Backfilling
===========

When you add Ceph OSD Daemons to a cluster or remove them from it, the CRUSH
algorithm will want to rebalance the cluster by moving placement groups to or
from Ceph OSD Daemons to restore the balance. The process of migrating
placement groups and the objects they contain can reduce the cluster's
operational performance considerably. To maintain operational performance,
Ceph performs this migration with 'backfilling', which allows Ceph to set
backfill operations to a lower priority than requests to read or write data.


``osd max backfills``

:Description: The maximum number of backfills allowed to or from a single OSD.
:Type: 64-bit Unsigned Integer
:Default: ``1``


``osd backfill scan min``

:Description: The minimum number of objects per backfill scan.

:Type: 32-bit Integer
:Default: ``64``


``osd backfill scan max``

:Description: The maximum number of objects per backfill scan.

:Type: 32-bit Integer
:Default: ``512``


``osd backfill retry interval``

:Description: The number of seconds to wait before retrying backfill requests.
:Type: Double
:Default: ``10.0``

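As an illustrative sketch, an administrator who wants rebalancing to complete
faster, and can accept the extra load on client operations, might raise the
per-OSD backfill limit described above:

.. code-block:: ini

    [osd]
        osd max backfills = 2
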
.. index:: OSD; osdmap

OSD Map
=======

OSD maps reflect the OSD daemons operating in the cluster. Over time, the
number of map epochs increases. Ceph provides some settings to ensure that
Ceph performs well as the OSD map grows larger.


``osd map dedup``

:Description: Enable removing duplicates in the OSD map.
:Type: Boolean
:Default: ``true``


``osd map cache size``

:Description: The number of OSD maps to keep cached.
:Type: 32-bit Integer
:Default: ``500``


``osd map cache bl size``

:Description: The size of the in-memory OSD map cache in OSD daemons.
:Type: 32-bit Integer
:Default: ``50``


``osd map cache bl inc size``

:Description: The size of the in-memory OSD map cache incrementals in
              OSD daemons.

:Type: 32-bit Integer
:Default: ``100``


``osd map message max``

:Description: The maximum map entries allowed per MOSDMap message.
:Type: 32-bit Integer
:Default: ``100``


.. index:: OSD; recovery

Recovery
========

When the cluster starts or when a Ceph OSD Daemon crashes and restarts, the OSD
begins peering with other Ceph OSD Daemons before writes can occur. See
`Monitoring OSDs and PGs`_ for details.

If a Ceph OSD Daemon crashes and comes back online, usually it will be out of
sync with other Ceph OSD Daemons containing more recent versions of objects in
the placement groups. When this happens, the Ceph OSD Daemon goes into recovery
mode and seeks to get the latest copy of the data and bring its map back up to
date. Depending upon how long the Ceph OSD Daemon was down, the OSD's objects
and placement groups may be significantly out of date. Also, if a failure domain
went down (e.g., a rack), more than one Ceph OSD Daemon may come back online at
the same time. This can make the recovery process time consuming and resource
intensive.

To maintain operational performance, Ceph performs recovery with limitations on
the number of recovery requests, threads, and object chunk sizes, which allows
Ceph to perform well in a degraded state.

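For example, to favor client I/O over recovery traffic, you might throttle
recovery with the settings described below (the values here are illustrative,
not recommendations):

.. code-block:: ini

    [osd]
        osd recovery max active = 1
        osd recovery sleep = 0.1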

``osd recovery delay start``

:Description: After peering completes, Ceph will delay for the specified number
              of seconds before starting to recover objects.

:Type: Float
:Default: ``0``


``osd recovery max active``

:Description: The number of active recovery requests per OSD at one time. More
              requests will accelerate recovery, but the requests place an
              increased load on the cluster.

:Type: 32-bit Integer
:Default: ``3``


``osd recovery max chunk``

:Description: The maximum size of a recovered chunk of data to push.
:Type: 64-bit Unsigned Integer
:Default: ``8 << 20``


``osd recovery max single start``

:Description: The maximum number of recovery operations per OSD that will be
              newly started when an OSD is recovering.
:Type: 64-bit Unsigned Integer
:Default: ``1``


``osd recovery thread timeout``

:Description: The maximum time in seconds before timing out a recovery thread.
:Type: 32-bit Integer
:Default: ``30``


``osd recover clone overlap``

:Description: Preserves clone overlap during recovery. Should always be set
              to ``true``.

:Type: Boolean
:Default: ``true``


``osd recovery sleep``

:Description: Time to sleep before the next recovery operation. Increasing this
              value will slow down recovery, while client operations will be
              less impacted.

:Type: Float
:Default: ``0.01``


Tiering
=======

``osd agent max ops``

:Description: The maximum number of simultaneous flushing ops per tiering agent
              in the high speed mode.
:Type: 32-bit Integer
:Default: ``4``


``osd agent max low ops``

:Description: The maximum number of simultaneous flushing ops per tiering agent
              in the low speed mode.
:Type: 32-bit Integer
:Default: ``2``

See `cache target dirty high ratio`_ for when the tiering agent flushes dirty
objects within the high speed mode.

Miscellaneous
=============


``osd snap trim thread timeout``

:Description: The maximum time in seconds before timing out a snap trim thread.
:Type: 32-bit Integer
:Default: ``60*60*1``


``osd backlog thread timeout``

:Description: The maximum time in seconds before timing out a backlog thread.
:Type: 32-bit Integer
:Default: ``60*60*1``


``osd default notify timeout``

:Description: The OSD default notification timeout (in seconds).
:Type: 32-bit Unsigned Integer
:Default: ``30``


``osd check for log corruption``

:Description: Check log files for corruption. Can be computationally expensive.
:Type: Boolean
:Default: ``false``


``osd remove thread timeout``

:Description: The maximum time in seconds before timing out an OSD remove thread.
:Type: 32-bit Integer
:Default: ``60*60``


``osd command thread timeout``

:Description: The maximum time in seconds before timing out a command thread.
:Type: 32-bit Integer
:Default: ``10*60``


``osd command max records``

:Description: Limits the number of lost objects to return.
:Type: 32-bit Integer
:Default: ``256``


``osd auto upgrade tmap``

:Description: Uses ``tmap`` for ``omap`` on old objects.
:Type: Boolean
:Default: ``true``


``osd tmapput sets users tmap``

:Description: Uses ``tmap`` for debugging only.
:Type: Boolean
:Default: ``false``


``osd preserve trimmed log``

:Description: Preserves trimmed log files, but uses more disk space.
:Type: Boolean
:Default: ``false``


``osd fast fail on connection refused``

:Description: If this option is enabled, crashed OSDs are marked down
              immediately by connected peers and MONs (assuming that the
              crashed OSD host survives). Disable it to restore old
              behavior, at the expense of possible long I/O stalls when
              OSDs crash in the middle of I/O operations.
:Type: Boolean
:Default: ``true``



.. _pool: ../../operations/pools
.. _Configuring Monitor/OSD Interaction: ../mon-osd-interaction
.. _Monitoring OSDs and PGs: ../../operations/monitoring-osd-pg#peering
.. _Pool & PG Config Reference: ../pool-pg-config-ref
.. _Journal Config Reference: ../journal-ref
.. _cache target dirty high ratio: ../../operations/pools#cache-target-dirty-high-ratio