.. index:: control, commands

==================
 Control Commands
==================


Monitor Commands
================

To issue monitor commands, use the ``ceph`` utility:

.. prompt:: bash $

   ceph [-m monhost] {command}

In most cases, monitor commands have the following form:

.. prompt:: bash $

   ceph {subsystem} {command}


System Commands
===============

To display the current cluster status, run the following commands:

.. prompt:: bash $

   ceph -s
   ceph status

To display a running summary of cluster status and major events, run the
following command:

.. prompt:: bash $

   ceph -w

To display the monitor quorum, including which monitors are participating and
which one is the leader, run the following commands:

.. prompt:: bash $

   ceph mon stat
   ceph quorum_status

To query the status of a single monitor, including whether it is in the quorum,
run the following command:

.. prompt:: bash $

   ceph tell mon.[id] mon_status

Here the value of ``[id]`` can be found by consulting the output of ``ceph
-s``.
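
For example, on a hypothetical cluster whose monitors are named ``a``, ``b``,
and ``c``, you might query monitor ``a`` as follows:

.. prompt:: bash $

   # "mon.a" is a hypothetical monitor id; substitute one reported by "ceph -s"
   ceph tell mon.a mon_status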


Authentication Subsystem
========================

To add an OSD keyring for a specific OSD, run the following command:

.. prompt:: bash $

   ceph auth add {osd} {--in-file|-i} {path-to-osd-keyring}
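
For example, a command of the following form adds the keyring for a
hypothetical ``osd.0`` (the keyring path shown here is only the conventional
default location and depends on how the OSD was deployed):

.. prompt:: bash $

   # the path below is an assumption; use the actual location of the OSD's keyring
   ceph auth add osd.0 -i /var/lib/ceph/osd/ceph-0/keyring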

To list the cluster's keys and their capabilities, run the following command:

.. prompt:: bash $

   ceph auth ls


Placement Group Subsystem
=========================

To display the statistics for all placement groups (PGs), run the following
command:

.. prompt:: bash $

   ceph pg dump [--format {format}]

Here the valid formats are ``plain`` (default), ``json``, ``json-pretty``,
``xml``, and ``xml-pretty``. When implementing monitoring tools and other
tools, it is best to use the ``json`` format. JSON parsing is more
deterministic than the ``plain`` format (which is more human readable), and the
layout is much more consistent from release to release. The ``jq`` utility is
very useful for extracting data from JSON output.
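
For example, assuming a release in which the per-PG statistics appear in the
``pg_stats`` array under ``pg_map`` (the exact JSON layout can vary between
releases), a sketch like the following prints each PG's ID and state:

.. prompt:: bash $

   # the jq filter below assumes the pg_map/pg_stats layout described above
   ceph pg dump --format json | jq -r '.pg_map.pg_stats[] | "\(.pgid) \(.state)"'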

To display the statistics for all PGs stuck in a specified state, run the
following command:

.. prompt:: bash $

   ceph pg dump_stuck inactive|unclean|stale|undersized|degraded [--format {format}] [-t|--threshold {seconds}]

Here ``--format`` may be ``plain`` (default), ``json``, ``json-pretty``,
``xml``, or ``xml-pretty``.

The ``--threshold`` argument determines the time interval (in seconds) for a PG
to be considered ``stuck`` (default: 300).
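
For example, the following command lists the PGs that have been stuck in the
``stale`` state for at least ten minutes (600 seconds):

.. prompt:: bash $

   ceph pg dump_stuck stale --threshold 600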

PGs might be stuck in any of the following states:

**Inactive**
    PGs are unable to process reads or writes because they are waiting for an
    OSD that has the most up-to-date data to return to an ``up`` state.

**Unclean**
    PGs contain objects that have not been replicated the desired number of
    times. These PGs have not yet completed the process of recovering.

**Stale**
    PGs are in an unknown state, because the OSDs that host them have not
    reported to the monitor cluster for a certain period of time (specified by
    the ``mon_osd_report_timeout`` configuration setting).


To delete a ``lost`` RADOS object or revert an object to its prior state
(either by reverting it to its previous version or by deleting it because it
was just created and has no previous version), run the following command:

.. prompt:: bash $

   ceph pg {pgid} mark_unfound_lost revert|delete
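
For example, to revert the unfound objects of a hypothetical PG ``2.5``, you
would run:

.. prompt:: bash $

   # "2.5" is a hypothetical PG id; substitute the id of the affected PG
   ceph pg 2.5 mark_unfound_lost revert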


.. _osd-subsystem:

OSD Subsystem
=============

To query OSD subsystem status, run the following command:

.. prompt:: bash $

   ceph osd stat

To write a copy of the most recent OSD map to a file (see :ref:`osdmaptool
<osdmaptool>`), run the following command:

.. prompt:: bash $

   ceph osd getmap -o file

To write a copy of the CRUSH map from the most recent OSD map to a file, run
the following command:

.. prompt:: bash $

   ceph osd getcrushmap -o file

Note that this command is functionally equivalent to the following two
commands:

.. prompt:: bash $

   ceph osd getmap -o /tmp/osdmap
   osdmaptool /tmp/osdmap --export-crush file

To dump the OSD map, run the following command:

.. prompt:: bash $

   ceph osd dump [--format {format}]

The ``--format`` option accepts the following arguments: ``plain`` (default),
``json``, ``json-pretty``, ``xml``, and ``xml-pretty``. As noted above, JSON
format is the recommended format for consumption by tools, scripting, and other
forms of automation.


To dump the OSD map as a tree that lists one OSD per line and displays
information about the weights and states of the OSDs, run the following
command:

.. prompt:: bash $

   ceph osd tree [--format {format}]

To find out where a specific RADOS object is stored in the system, run a
command of the following form:

.. prompt:: bash $

   ceph osd map <pool-name> <object-name>
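
For example, the following command (which assumes a pool named ``rbd`` and an
object named ``myobject``) reports the PG to which the object maps, along with
that PG's up and acting OSD sets:

.. prompt:: bash $

   # the pool and object names here are hypothetical
   ceph osd map rbd myobject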

To add or move an OSD (specified by its ID or name) with a given weight to a
specific CRUSH location, run the following command:

.. prompt:: bash $

   ceph osd crush set {id} {weight} [{loc1} [{loc2} ...]]
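
For example, the following command (in which the weight and the CRUSH location
buckets are hypothetical) places ``osd.0`` under host ``node1`` in the
``default`` root with a CRUSH weight of ``1.0``:

.. prompt:: bash $

   # "node1" and the weight of 1.0 are hypothetical values
   ceph osd crush set osd.0 1.0 root=default host=node1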

To remove an existing OSD from the CRUSH map, run the following command:

.. prompt:: bash $

   ceph osd crush remove {name}

To remove an existing bucket from the CRUSH map, run the following command:

.. prompt:: bash $

   ceph osd crush remove {bucket-name}

To move an existing bucket from one position in the CRUSH hierarchy to another,
run the following command:

.. prompt:: bash $

   ceph osd crush move {id} {loc1} [{loc2} ...]

To set the CRUSH weight of a specific OSD (specified by ``{name}``) to
``{weight}``, run the following command:

.. prompt:: bash $

   ceph osd crush reweight {name} {weight}

To mark an OSD as ``lost``, run the following command:

.. prompt:: bash $

   ceph osd lost {id} [--yes-i-really-mean-it]

.. warning::
   This could result in permanent data loss. Use with caution!

To create a new OSD (that is, to register a new OSD ID in the OSD map), run the
following command:

.. prompt:: bash $

   ceph osd create [{uuid}]

If no UUID is given as part of this command, the UUID will be set automatically
when the OSD starts up.

To remove one or more specific OSDs, run the following command:

.. prompt:: bash $

   ceph osd rm [{id}...]

To display the current ``max_osd`` parameter in the OSD map, run the following
command:

.. prompt:: bash $

   ceph osd getmaxosd

To import a specific CRUSH map, run the following command:

.. prompt:: bash $

   ceph osd setcrushmap -i file

To set the ``max_osd`` parameter in the OSD map, run the following command:

.. prompt:: bash $

   ceph osd setmaxosd

The parameter has a default value of 10000. Most operators will never need to
adjust it.

To mark a specific OSD ``down``, run the following command:

.. prompt:: bash $

   ceph osd down {osd-num}

To mark a specific OSD ``out`` (so that no data will be allocated to it), run
the following command:

.. prompt:: bash $

   ceph osd out {osd-num}

To mark a specific OSD ``in`` (so that data will be allocated to it), run the
following command:

.. prompt:: bash $

   ceph osd in {osd-num}

By using the ``pause`` and ``unpause`` flags in the OSD map, you can pause or
unpause I/O requests. If the flags are set, then no I/O requests will be sent
to any OSD. If the flags are cleared, then pending I/O requests will be resent.
To set or clear these flags, run one of the following commands:

.. prompt:: bash $

   ceph osd pause
   ceph osd unpause

You can assign an override or ``reweight`` weight value to a specific OSD
if the normal CRUSH distribution seems to be suboptimal. The weight of an
OSD helps determine the extent of its I/O requests and data storage: two
OSDs with the same weight will receive approximately the same number of
I/O requests and store approximately the same amount of data. The ``ceph
osd reweight`` command assigns an override weight to an OSD. The weight
value is in the range 0 to 1, and the command forces CRUSH to relocate a
certain amount (1 - ``weight``) of the data that would otherwise be on
this OSD. The command does not change the weights of the buckets above
the OSD in the CRUSH map. Using the command is merely a corrective
measure: for example, if one of your OSDs is at 90% and the others are at
50%, you could reduce the outlier weight to correct this imbalance. To
assign an override weight to a specific OSD, run the following command:

.. prompt:: bash $

   ceph osd reweight {osd-num} {weight}
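
For example, the following command sets the override weight of a hypothetical
``osd.45`` to ``0.85``, which causes CRUSH to move approximately 15%
(1 - 0.85) of the data that would otherwise map to that OSD onto other OSDs:

.. prompt:: bash $

   # "45" is a hypothetical OSD id
   ceph osd reweight 45 0.85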

A cluster's OSDs can be reweighted in order to maintain balance if some OSDs
are being disproportionately utilized. Note that override or ``reweight``
weights have relative values that default to 1.00000. Their values are not
absolute, and these weights must be distinguished from CRUSH weights (which
reflect the absolute capacity of a bucket, as measured in TiB). To reweight
OSDs by utilization, run the following command:

.. prompt:: bash $

   ceph osd reweight-by-utilization [threshold [max_change [max_osds]]] [--no-increasing]

By default, this command adjusts the override weight of OSDs whose utilization
differs from the average by more than 20%, but you can specify a different
percentage in the ``threshold`` argument.

To limit the increment by which any OSD's reweight is to be changed, use the
``max_change`` argument (default: 0.05). To limit the number of OSDs that are
to be adjusted, use the ``max_osds`` argument (default: 4). Increasing these
variables can accelerate the reweighting process, but perhaps at the cost of
slower client operations (as a result of the increase in data movement).

You can test the ``osd reweight-by-utilization`` command before running it. To
find out which and how many PGs and OSDs will be affected by a specific use of
the ``osd reweight-by-utilization`` command, run the following command:

.. prompt:: bash $

   ceph osd test-reweight-by-utilization [threshold [max_change max_osds]] [--no-increasing]
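
For example, the following command performs a dry run that uses a threshold of
110 percent, caps each weight change at 0.05, adjusts at most 8 OSDs, and never
increases a weight:

.. prompt:: bash $

   ceph osd test-reweight-by-utilization 110 0.05 8 --no-increasing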

The ``--no-increasing`` option can be added to the ``reweight-by-utilization``
and ``test-reweight-by-utilization`` commands in order to prevent any override
weights that are currently less than 1.00000 from being increased. This option
can be useful in certain circumstances: for example, when you are hastily
balancing in order to remedy ``full`` or ``nearfull`` OSDs, or when there are
OSDs being evacuated or slowly brought into service.

Operators of deployments that run Nautilus or newer (or later revisions of
Luminous and Mimic) and that have no pre-Luminous clients will likely want to
enable the ``balancer`` module for ``ceph-mgr`` instead.

.. note:: The ``balancer`` module does the work for you and achieves a more
   uniform result, shuffling less data along the way. When enabling the
   ``balancer`` module, you will want to converge any changed override weights
   back to 1.00000 so that the balancer can do an optimal job. If your cluster
   is very full, reverting these override weights before enabling the balancer
   may cause some OSDs to become full. This means that a phased approach may
   be needed.
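
A minimal sketch of enabling the balancer (the ``upmap`` mode shown here is one
of several available modes; consult the balancer documentation for your
release):

.. prompt:: bash $

   # assumes the "balancer" mgr module is available (it is on by default in recent releases)
   ceph balancer mode upmap
   ceph balancer on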

The following commands add an IP address or a CIDR range to the blocklist or
remove it from the blocklist. When adding an address to the blocklist, you can
specify how long (in seconds) it should remain blocklisted; otherwise the
duration defaults to one hour. A blocklisted address is prevented from
connecting to any OSD. If you blocklist an IP address or a range that contains
an OSD, be aware that the OSD will also be prevented from performing operations
on its peers when it acts as a client (this includes tiering and copy-from
functionality).

To blocklist a range (in CIDR format), include the ``range`` keyword.

These commands are mostly useful only for failure testing: blocklists are
normally maintained automatically and should not need manual intervention.

.. prompt:: bash $

   ceph osd blocklist ["range"] add ADDRESS[:source_port][/netmask_bits] [TIME]
   ceph osd blocklist ["range"] rm ADDRESS[:source_port][/netmask_bits]
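
For example, the following hypothetical commands blocklist a single client
address for ten minutes (600 seconds) and then blocklist an entire /24 range
with the default one-hour duration:

.. prompt:: bash $

   # the addresses below are hypothetical examples
   ceph osd blocklist add 192.168.0.123 600
   ceph osd blocklist range add 192.168.0.0/24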

To create or delete a snapshot of a pool, run one of the following commands:

.. prompt:: bash $

   ceph osd pool mksnap {pool-name} {snap-name}
   ceph osd pool rmsnap {pool-name} {snap-name}

To create, delete, or rename a storage pool, run one of the following commands:

.. prompt:: bash $

   ceph osd pool create {pool-name} [pg_num [pgp_num]]
   ceph osd pool delete {pool-name} [{pool-name} --yes-i-really-really-mean-it]
   ceph osd pool rename {old-name} {new-name}
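
For example, the following command creates a hypothetical pool named
``mypool`` with 128 PGs (on recent releases, ``pg_num`` and ``pgp_num`` can be
left to the PG autoscaler, in which case they may be omitted):

.. prompt:: bash $

   # "mypool" and the PG count of 128 are hypothetical values
   ceph osd pool create mypool 128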

To change a pool setting, run a command of the following form:

.. prompt:: bash $

   ceph osd pool set {pool-name} {field} {value}

Valid fields are:

 * ``size``: Sets the number of copies of data in the pool.
 * ``pg_num``: The number of placement groups.
 * ``pgp_num``: The effective number of placement groups used when calculating placement.
 * ``crush_rule``: The rule number for mapping placement.
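
For example, the following command sets the replica count of a hypothetical
pool named ``mypool`` to 3:

.. prompt:: bash $

   # "mypool" is a hypothetical pool name
   ceph osd pool set mypool size 3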

To get the value of a pool setting, run a command of the following form:

.. prompt:: bash $

   ceph osd pool get {pool-name} {field}

Valid fields are:

 * ``pg_num``: The number of placement groups.
 * ``pgp_num``: The effective number of placement groups used when calculating placement.

To send a scrub command to OSD ``{osd-num}`` (or to all OSDs, by using ``*``),
run the following command:

.. prompt:: bash $

   ceph osd scrub {osd-num}

To send a repair command to ``osd.N`` (or to all OSDs, by using ``*``), run the
following command:

.. prompt:: bash $

   ceph osd repair N

The following command runs a simple throughput benchmark against ``osd.N``,
writing ``TOTAL_DATA_BYTES`` in write requests of ``BYTES_PER_WRITE`` each. By
default, the test writes 1 GB in total, in 4-MB increments. The benchmark is
non-destructive and will not overwrite existing live OSD data, but it might
temporarily affect the performance of clients that are concurrently accessing
the OSD:

.. prompt:: bash $

   ceph tell osd.N bench [TOTAL_DATA_BYTES] [BYTES_PER_WRITE]
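
For example, the following command makes the defaults explicit for a
hypothetical ``osd.0``, writing 1 GB (1073741824 bytes) in 4-MB
(4194304-byte) writes:

.. prompt:: bash $

   # "osd.0" is a hypothetical OSD id; 1073741824 = 1 GB, 4194304 = 4 MB
   ceph tell osd.0 bench 1073741824 4194304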

To clear an OSD's caches between benchmark runs, use the ``cache drop`` command:

.. prompt:: bash $

   ceph tell osd.N cache drop

To get the cache statistics of an OSD, use the ``cache status`` command:

.. prompt:: bash $

   ceph tell osd.N cache status

MDS Subsystem
=============

To change a configuration parameter on a running metadata server (MDS), run a
command of the following form:

.. prompt:: bash $

   ceph tell mds.{mds-id} config set {setting} {value}

For example, the following command enables debug messages:

.. prompt:: bash $

   ceph tell mds.0 config set debug_ms 1

To display the status of all metadata servers, run the following command:

.. prompt:: bash $

   ceph mds stat

To mark the active MDS as failed (triggering failover to a standby if one is
present), run the following command:

.. prompt:: bash $

   ceph mds fail 0

.. todo:: ``ceph mds`` subcommands missing docs: set, dump, getmap, stop, setmap


Mon Subsystem
=============

To show monitor statistics, run the following command:

.. prompt:: bash $

   ceph mon stat

This command returns output similar to the following:

::

    e2: 3 mons at {a=127.0.0.1:40000/0,b=127.0.0.1:40001/0,c=127.0.0.1:40002/0}, election epoch 6, quorum 0,1,2 a,b,c

The ``quorum`` list at the end lists the monitor nodes that are part of the
current quorum.

The same information is also available more directly:

.. prompt:: bash $

   ceph quorum_status -f json-pretty

.. code-block:: javascript

    {
      "election_epoch": 6,
      "quorum": [
        0,
        1,
        2
      ],
      "quorum_names": [
        "a",
        "b",
        "c"
      ],
      "quorum_leader_name": "a",
      "monmap": {
        "epoch": 2,
        "fsid": "ba807e74-b64f-4b72-b43f-597dfe60ddbc",
        "modified": "2016-12-26 14:42:09.288066",
        "created": "2016-12-26 14:42:03.573585",
        "features": {
          "persistent": [
            "kraken"
          ],
          "optional": []
        },
        "mons": [
          {
            "rank": 0,
            "name": "a",
            "addr": "127.0.0.1:40000\/0",
            "public_addr": "127.0.0.1:40000\/0"
          },
          {
            "rank": 1,
            "name": "b",
            "addr": "127.0.0.1:40001\/0",
            "public_addr": "127.0.0.1:40001\/0"
          },
          {
            "rank": 2,
            "name": "c",
            "addr": "127.0.0.1:40002\/0",
            "public_addr": "127.0.0.1:40002\/0"
          }
        ]
      }
    }

The above command blocks until a quorum is reached.

To see the status of a single monitor:

.. prompt:: bash $

   ceph tell mon.[name] mon_status

Here the value of ``[name]`` can be taken from the output of ``ceph
quorum_status``. Sample output::

    {
      "name": "b",
      "rank": 1,
      "state": "peon",
      "election_epoch": 6,
      "quorum": [
        0,
        1,
        2
      ],
      "features": {
        "required_con": "9025616074522624",
        "required_mon": [
          "kraken"
        ],
        "quorum_con": "1152921504336314367",
        "quorum_mon": [
          "kraken"
        ]
      },
      "outside_quorum": [],
      "extra_probe_peers": [],
      "sync_provider": [],
      "monmap": {
        "epoch": 2,
        "fsid": "ba807e74-b64f-4b72-b43f-597dfe60ddbc",
        "modified": "2016-12-26 14:42:09.288066",
        "created": "2016-12-26 14:42:03.573585",
        "features": {
          "persistent": [
            "kraken"
          ],
          "optional": []
        },
        "mons": [
          {
            "rank": 0,
            "name": "a",
            "addr": "127.0.0.1:40000\/0",
            "public_addr": "127.0.0.1:40000\/0"
          },
          {
            "rank": 1,
            "name": "b",
            "addr": "127.0.0.1:40001\/0",
            "public_addr": "127.0.0.1:40001\/0"
          },
          {
            "rank": 2,
            "name": "c",
            "addr": "127.0.0.1:40002\/0",
            "public_addr": "127.0.0.1:40002\/0"
          }
        ]
      }
    }

To dump the monitor state, run the following command:

.. prompt:: bash $

   ceph mon dump

This command returns output similar to the following:

::

    dumped monmap epoch 2
    epoch 2
    fsid ba807e74-b64f-4b72-b43f-597dfe60ddbc
    last_changed 2016-12-26 14:42:09.288066
    created 2016-12-26 14:42:03.573585
    0: 127.0.0.1:40000/0 mon.a
    1: 127.0.0.1:40001/0 mon.b
    2: 127.0.0.1:40002/0 mon.c