.. index:: control, commands

==================
 Control Commands
==================


Monitor Commands
================

To issue monitor commands, use the ``ceph`` utility:

.. prompt:: bash $

   ceph [-m monhost] {command}

In most cases, monitor commands have the following form:

.. prompt:: bash $

   ceph {subsystem} {command}


System Commands
===============

To display the current cluster status, run either of the following commands:

.. prompt:: bash $

   ceph -s
   ceph status

To display a running summary of cluster status and major events, run the
following command:

.. prompt:: bash $

   ceph -w

To display the monitor quorum, including which monitors are participating and
which one is the leader, run the following commands:

.. prompt:: bash $

   ceph mon stat
   ceph quorum_status

To query the status of a single monitor, including whether it is in the quorum,
run the following command:

.. prompt:: bash $

   ceph tell mon.[id] mon_status

Here the value of ``[id]`` can be found by consulting the output of ``ceph
-s``.


Authentication Subsystem
========================

To add a keyring for a specific OSD, run the following command:

.. prompt:: bash $

   ceph auth add {osd} {--in-file|-i} {path-to-osd-keyring}

To list the cluster's keys and their capabilities, run the following command:

.. prompt:: bash $

   ceph auth ls


Placement Group Subsystem
=========================

To display the statistics for all placement groups (PGs), run the following
command:

.. prompt:: bash $

   ceph pg dump [--format {format}]

Here the valid formats are ``plain`` (default), ``json``, ``json-pretty``,
``xml``, and ``xml-pretty``. When implementing monitoring tools and other
tools, it is best to use the ``json`` format. JSON output can be parsed more
reliably than ``plain`` output (which is more human readable), and its layout
is much more consistent from release to release. The ``jq`` utility is very
useful for extracting data from JSON output.
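
As a minimal sketch of the kind of monitoring tool mentioned above, the
following Python function counts PGs by state. It assumes that its input is the
text printed by ``ceph pg dump --format json`` and that the per-PG records are
held in a ``pg_stats`` list, either inside a ``pg_map`` object or at the top
level. That layout is an assumption and should be checked against the output of
your own release. The function name ``count_pg_states`` and the sample data are
hypothetical.

.. code-block:: python

   import json
   from collections import Counter


   def count_pg_states(pg_dump_json):
       """Count PGs by state in JSON text from ``ceph pg dump --format json``."""
       data = json.loads(pg_dump_json)
       # Assumption: records live under "pg_map" -> "pg_stats", or under a
       # top-level "pg_stats" key, and each record has a "state" string.
       pg_stats = data.get("pg_map", data).get("pg_stats", [])
       return Counter(pg["state"] for pg in pg_stats)


   # Hand-written sample input, not real cluster output.
   sample = json.dumps({
       "pg_map": {
           "pg_stats": [
               {"pgid": "1.0", "state": "active+clean"},
               {"pgid": "1.1", "state": "active+clean"},
               {"pgid": "1.2", "state": "stale+active+clean"},
           ]
       }
   })
   print(count_pg_states(sample))

In a real tool, the JSON text could be captured with ``subprocess.run(["ceph",
"pg", "dump", "--format", "json"], capture_output=True, text=True,
check=True).stdout`` and passed to the function.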

To display the statistics for all PGs stuck in a specified state, run the
following command:

.. prompt:: bash $

   ceph pg dump_stuck inactive|unclean|stale|undersized|degraded [--format {format}] [-t|--threshold {seconds}]

Here ``--format`` may be ``plain`` (default), ``json``, ``json-pretty``,
``xml``, or ``xml-pretty``.

The ``--threshold`` argument specifies how long (in seconds) a PG must remain
in the specified state before it is considered ``stuck`` (default: 300).

PGs might be stuck in any of the following states:

**Inactive**

    PGs are unable to process reads or writes because they are waiting for an
    OSD that has the most up-to-date data to return to an ``up`` state.


**Unclean**

    PGs contain objects that have not been replicated the desired number of
    times. These PGs have not yet completed the process of recovering.


**Stale**

    PGs are in an unknown state, because the OSDs that host them have not
    reported to the monitor cluster for a certain period of time (specified by
    the ``mon_osd_report_timeout`` configuration setting).


To delete a ``lost`` object or to revert it to its prior state (either by
reverting it to its previous version or, if it was just created and has no
previous version, by deleting it), run the following command:

.. prompt:: bash $

   ceph pg {pgid} mark_unfound_lost revert|delete


.. _osd-subsystem:

OSD Subsystem
=============

To query OSD subsystem status, run the following command:

.. prompt:: bash $

   ceph osd stat

To write a copy of the most recent OSD map to a file (see :ref:`osdmaptool
<osdmaptool>`), run the following command:

.. prompt:: bash $

   ceph osd getmap -o file

To write a copy of the CRUSH map from the most recent OSD map to a file, run
the following command:

.. prompt:: bash $

   ceph osd getcrushmap -o file

Note that this command is functionally equivalent to the following two
commands:

.. prompt:: bash $

   ceph osd getmap -o /tmp/osdmap
   osdmaptool /tmp/osdmap --export-crush file

To dump the OSD map, run the following command:

.. prompt:: bash $

   ceph osd dump [--format {format}]

The ``--format`` option accepts the following arguments: ``plain`` (default),
``json``, ``json-pretty``, ``xml``, and ``xml-pretty``. As noted above, JSON is
the recommended format for tools, scripting, and other forms of automation.

To dump the OSD map as a tree that lists one OSD per line and displays
information about the weights and states of the OSDs, run the following
command:

.. prompt:: bash $

   ceph osd tree [--format {format}]

To find out where a specific RADOS object is stored in the system, run a
command of the following form:

.. prompt:: bash $

   ceph osd map {pool-name} {object-name}

To add or move an OSD (specified by its ID or name) to a specific CRUSH
location and set its CRUSH weight, run the following command:

.. prompt:: bash $

   ceph osd crush set {id} {weight} [{loc1} [{loc2} ...]]

To remove an existing OSD from the CRUSH map, run the following command:

.. prompt:: bash $

   ceph osd crush remove {name}

To remove an existing bucket from the CRUSH map, run the following command:

.. prompt:: bash $

   ceph osd crush remove {bucket-name}

To move an existing bucket from one position in the CRUSH hierarchy to another,
run the following command:

.. prompt:: bash $

   ceph osd crush move {id} {loc1} [{loc2} ...]

To set the CRUSH weight of a specific OSD (specified by ``{name}``) to
``{weight}``, run the following command:

.. prompt:: bash $

   ceph osd crush reweight {name} {weight}

To mark an OSD as ``lost``, run the following command:

.. prompt:: bash $

   ceph osd lost {id} [--yes-i-really-mean-it]

.. warning::
   This could result in permanent data loss. Use with caution!

To create a new OSD, run the following command:

.. prompt:: bash $

   ceph osd create [{uuid}]

If no UUID is given as part of this command, the UUID will be set automatically
when the OSD starts up.

To remove one or more specific OSDs, run the following command:

.. prompt:: bash $

   ceph osd rm [{id}...]

To display the current ``max_osd`` parameter in the OSD map, run the following
command:

.. prompt:: bash $

   ceph osd getmaxosd

To import a specific CRUSH map, run the following command:

.. prompt:: bash $

   ceph osd setcrushmap -i file

To set the ``max_osd`` parameter in the OSD map, run the following command:

.. prompt:: bash $

   ceph osd setmaxosd {newmax}

The parameter has a default value of 10000. Most operators will never need to
adjust it.

To mark a specific OSD ``down``, run the following command:

.. prompt:: bash $

   ceph osd down {osd-num}

To mark a specific OSD ``out`` (so that no data will be allocated to it), run
the following command:

.. prompt:: bash $

   ceph osd out {osd-num}

To mark a specific OSD ``in`` (so that data will be allocated to it), run the
following command:

.. prompt:: bash $

   ceph osd in {osd-num}

By using the "pause flags" in the OSD map, you can pause or unpause I/O
requests. If the flags are set, no I/O requests will be sent to any OSD. When
the flags are cleared, pending I/O requests will be resent. To set or clear
pause flags, run one of the following commands:

.. prompt:: bash $

   ceph osd pause
   ceph osd unpause

You can assign an override or ``reweight`` weight value to a specific OSD if
the normal CRUSH distribution seems to be suboptimal. The weight of an OSD
helps determine the extent of its I/O requests and data storage: two OSDs with
the same weight will receive approximately the same number of I/O requests and
store approximately the same amount of data. The ``ceph osd reweight`` command
assigns an override weight to an OSD. The weight value is in the range 0 to 1,
and the command forces CRUSH to relocate a certain amount (1 - ``weight``) of
the data that would otherwise be on this OSD. The command does not change the
weights of the buckets above the OSD in the CRUSH map. Using the command is
merely a corrective measure: for example, if one of your OSDs is at 90%
utilization and the others are at 50%, you could reduce the override weight of
the outlier to correct this imbalance.

To assign an override weight to a specific OSD, run the following command:

.. prompt:: bash $

   ceph osd reweight {osd-num} {weight}

.. note:: Any assigned override reweight value will conflict with the balancer.
   This means that if the balancer is in use, all override reweight values
   should be ``1.00000`` in order to avoid suboptimal cluster behavior.
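
The relationship between an override weight and the share of data that CRUSH
relocates (1 - ``weight``) can be illustrated with a few lines of Python. This
is only a worked-arithmetic sketch of the statement above: the function name
``relocated_fraction`` is hypothetical, and the result is the nominal share,
not a prediction of actual data movement in a cluster.

.. code-block:: python

   def relocated_fraction(override_weight):
       """Return the nominal share (1 - weight) of an OSD's data to relocate."""
       if not 0.0 <= override_weight <= 1.0:
           raise ValueError("override weight must be in the range 0 to 1")
       return 1.0 - override_weight


   for weight in (1.0, 0.75, 0.5):
       share = relocated_fraction(weight)
       print(f"reweight {weight:.2f} -> relocate about {share:.0%}")

An override weight of ``1.0`` therefore leaves placement unchanged, and a
weight of ``0`` asks CRUSH to place none of the data on the OSD.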

A cluster's OSDs can be reweighted in order to maintain balance if some OSDs
are being disproportionately utilized. Note that override or ``reweight``
weights have values relative to one another that default to 1.00000; their
values are not absolute, and these weights must be distinguished from CRUSH
weights (which reflect the absolute capacity of a bucket, as measured in TiB).
To reweight OSDs by utilization, run the following command:

.. prompt:: bash $

   ceph osd reweight-by-utilization [threshold [max_change [max_osds]]] [--no-increasing]

By default, this command adjusts the override weight of OSDs whose utilization
differs from the average utilization by more than 20%, but you can specify a
different percentage in the ``threshold`` argument.

To limit the increment by which any OSD's reweight is to be changed, use the
``max_change`` argument (default: 0.05). To limit the number of OSDs that are
to be adjusted, use the ``max_osds`` argument (default: 4). Increasing these
values can accelerate the reweighting process, but perhaps at the cost of
slower client operations (as a result of the increase in data movement).
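
The idea of selecting OSDs by their distance from the average utilization can
be sketched as follows. This is a rough illustration of the ±20% description
above, not Ceph's implementation: the function name ``utilization_outliers``,
the ``threshold_pct`` parameter (expressed here as a deviation in percent), and
the sample utilization figures are all hypothetical, and the sketch does not
model ``max_change`` or ``max_osds``.

.. code-block:: python

   def utilization_outliers(utilization, threshold_pct=20.0):
       """Label OSDs whose utilization is more than threshold_pct from the mean."""
       average = sum(utilization.values()) / len(utilization)
       high = average * (1 + threshold_pct / 100)
       low = average * (1 - threshold_pct / 100)
       labels = {}
       for osd_id, used in utilization.items():
           if used > high:
               labels[osd_id] = "overloaded"
           elif used < low:
               labels[osd_id] = "underloaded"
       return labels


   # Hypothetical utilization ratios keyed by OSD ID (average is 0.60).
   sample = {0: 0.90, 1: 0.50, 2: 0.55, 3: 0.45}
   print(utilization_outliers(sample))

To see what a real cluster would do, use the ``test-reweight-by-utilization``
command rather than a sketch like this one.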

You can test the ``osd reweight-by-utilization`` command before running it. To
find out which and how many PGs and OSDs will be affected by a specific use of
the ``osd reweight-by-utilization`` command, run the following command:

.. prompt:: bash $

   ceph osd test-reweight-by-utilization [threshold [max_change [max_osds]]] [--no-increasing]

The ``--no-increasing`` option can be added to the ``reweight-by-utilization``
and ``test-reweight-by-utilization`` commands in order to prevent any override
weights that are currently less than 1.00000 from being increased. This option
can be useful in certain circumstances: for example, when you are hastily
balancing in order to remedy ``full`` or ``nearfull`` OSDs, or when there are
OSDs being evacuated or slowly brought into service.

Operators of deployments that run Nautilus or newer (or later revisions of
Luminous and Mimic) and that have no pre-Luminous clients will probably want to
enable the ``balancer`` module for ``ceph-mgr`` instead.

The blocklist can be modified by adding or removing an IP address or a CIDR
range. If an address is blocklisted, it will be unable to connect to any OSD.
If an OSD is contained within an IP address or CIDR range that has been
blocklisted, the OSD will be unable to perform operations on its peers when it
acts as a client: such blocked operations include tiering and copy-from
functionality. To add an IP address or CIDR range to the blocklist, or to
remove one from it, run one of the following commands:

.. prompt:: bash $

   ceph osd blocklist ["range"] add ADDRESS[:source_port][/netmask_bits] [TIME]
   ceph osd blocklist ["range"] rm ADDRESS[:source_port][/netmask_bits]

If you add something to the blocklist with the above ``add`` command, you can
use the ``TIME`` argument to specify the length of time (in seconds) that it
will remain on the blocklist (default: one hour). To add or remove a CIDR
range, use the ``range`` keyword in the above commands.

Note that these commands are useful primarily in failure testing. Under normal
conditions, blocklists are maintained automatically and do not need any manual
intervention.
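
To check in advance whether a given address would fall inside a CIDR range that
you plan to blocklist, Python's standard ``ipaddress`` module can be used. The
following is a minimal sketch: the function name ``is_covered`` and the sample
entries are hypothetical, and the sketch compares plain IP addresses only (it
ignores the optional ``:source_port`` part of a blocklist entry).

.. code-block:: python

   import ipaddress


   def is_covered(address, entries):
       """Return True if address equals a listed IP or lies in a listed range."""
       ip = ipaddress.ip_address(address)
       # strict=False accepts entries such as "192.168.1.10/24" by treating
       # them as the enclosing network. A bare IP becomes a single-host network.
       return any(
           ip in ipaddress.ip_network(entry, strict=False) for entry in entries
       )


   # Hypothetical entries: one CIDR range and one single address.
   entries = ["192.168.1.0/24", "10.0.0.5"]
   print(is_covered("192.168.1.77", entries))
   print(is_covered("10.0.0.6", entries))

A check of this kind can help avoid accidentally blocklisting a range that
contains the address of an OSD.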

To create or delete a snapshot of a specific storage pool, run one of the
following commands:

.. prompt:: bash $

   ceph osd pool mksnap {pool-name} {snap-name}
   ceph osd pool rmsnap {pool-name} {snap-name}

To create, delete, or rename a specific storage pool, run one of the following
commands:

.. prompt:: bash $

   ceph osd pool create {pool-name} [pg_num [pgp_num]]
   ceph osd pool delete {pool-name} [{pool-name} --yes-i-really-really-mean-it]
   ceph osd pool rename {old-name} {new-name}

To change a pool setting, run the following command:

.. prompt:: bash $

   ceph osd pool set {pool-name} {field} {value}

The following are valid fields:

    * ``size``: The number of copies of data in the pool.
    * ``pg_num``: The number of PGs in the pool.
    * ``pgp_num``: The effective number of PGs when calculating placement.
    * ``crush_rule``: The rule number for mapping placement.

To retrieve the value of a pool setting, run the following command:

.. prompt:: bash $

   ceph osd pool get {pool-name} {field}

Valid fields are:

    * ``pg_num``: The number of PGs in the pool.
    * ``pgp_num``: The effective number of PGs when calculating placement.

To send a scrub command to a specific OSD, or to all OSDs (by using ``*``), run
the following command:

.. prompt:: bash $

   ceph osd scrub {osd-num}

To send a repair command to a specific OSD, or to all OSDs (by using ``*``),
run the following command:

.. prompt:: bash $

   ceph osd repair {osd-num}

You can run a simple throughput benchmark test against a specific OSD. This
test writes a total size of ``TOTAL_DATA_BYTES`` (default: 1 GB) incrementally,
in multiple write requests that each have a size of ``BYTES_PER_WRITE``
(default: 4 MB). The test is not destructive and it will not overwrite existing
live OSD data, but it might temporarily affect the performance of clients that
are concurrently accessing the OSD. To launch this benchmark test, run the
following command:

.. prompt:: bash $

   ceph tell osd.N bench [TOTAL_DATA_BYTES] [BYTES_PER_WRITE]
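
The number of write requests that a benchmark run issues follows from the two
arguments. The following worked example assumes that the defaults of 1 GB and
4 MB are binary sizes (2\ :sup:`30` and 4 × 2\ :sup:`20` bytes); the function
name ``write_request_count`` is hypothetical.

.. code-block:: python

   def write_request_count(total_data_bytes=1 << 30, bytes_per_write=4 << 20):
       """Return how many writes are needed to cover total_data_bytes."""
       # Integer ceiling division: a final partial write still counts as one.
       return -(-total_data_bytes // bytes_per_write)


   print(write_request_count())                   # defaults described above
   print(write_request_count(100 << 20, 1 << 20)) # 100 MiB in 1 MiB writes

With the assumed defaults, the test issues 256 write requests.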

To clear the caches of a specific OSD during the interval between one benchmark
run and another, run the following command:

.. prompt:: bash $

   ceph tell osd.N cache drop

To retrieve the cache statistics of a specific OSD, run the following command:

.. prompt:: bash $

   ceph tell osd.N cache status


MDS Subsystem
=============

To change the configuration parameters of a running metadata server, run the
following command:

.. prompt:: bash $

   ceph tell mds.{mds-id} config set {setting} {value}

For example, to enable debug messages, run the following command:

.. prompt:: bash $

   ceph tell mds.0 config set debug_ms 1

To display the status of all metadata servers, run the following command:

.. prompt:: bash $

   ceph mds stat

To mark the active metadata server as failed (and to trigger failover to a
standby if a standby is present), run the following command:

.. prompt:: bash $

   ceph mds fail 0

.. todo:: ``ceph mds`` subcommands missing docs: set, dump, getmap, stop, setmap


Mon Subsystem
=============

To display monitor statistics, run the following command:

.. prompt:: bash $

   ceph mon stat

This command returns output similar to the following:

::

    e2: 3 mons at {a=127.0.0.1:40000/0,b=127.0.0.1:40001/0,c=127.0.0.1:40002/0}, election epoch 6, quorum 0,1,2 a,b,c

There is a ``quorum`` list at the end of the output. It lists those monitor
nodes that are part of the current quorum.

To retrieve this information in a more direct way, run the following command:

.. prompt:: bash $

   ceph quorum_status -f json-pretty

This command returns output similar to the following:

.. code-block:: javascript

    {
        "election_epoch": 6,
        "quorum": [
            0,
            1,
            2
        ],
        "quorum_names": [
            "a",
            "b",
            "c"
        ],
        "quorum_leader_name": "a",
        "monmap": {
            "epoch": 2,
            "fsid": "ba807e74-b64f-4b72-b43f-597dfe60ddbc",
            "modified": "2016-12-26 14:42:09.288066",
            "created": "2016-12-26 14:42:03.573585",
            "features": {
                "persistent": [
                    "kraken"
                ],
                "optional": []
            },
            "mons": [
                {
                    "rank": 0,
                    "name": "a",
                    "addr": "127.0.0.1:40000\/0",
                    "public_addr": "127.0.0.1:40000\/0"
                },
                {
                    "rank": 1,
                    "name": "b",
                    "addr": "127.0.0.1:40001\/0",
                    "public_addr": "127.0.0.1:40001\/0"
                },
                {
                    "rank": 2,
                    "name": "c",
                    "addr": "127.0.0.1:40002\/0",
                    "public_addr": "127.0.0.1:40002\/0"
                }
            ]
        }
    }


The ``ceph quorum_status`` command blocks until a quorum is reached.
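
A monitoring script can read the quorum membership directly from this JSON
output. The following is a minimal sketch that relies only on the keys shown in
the sample output above (``quorum_names``, ``quorum_leader_name``, and
``monmap.mons``). The function name ``quorum_summary`` is hypothetical, and the
sample text is an abbreviated, hand-edited variant of that output in which
monitor ``c`` has been removed from the quorum for illustration.

.. code-block:: python

   import json


   def quorum_summary(quorum_status_json):
       """Summarize JSON text printed by ``ceph quorum_status -f json``."""
       status = json.loads(quorum_status_json)
       in_quorum = status["quorum_names"]
       all_mons = [mon["name"] for mon in status["monmap"]["mons"]]
       return {
           "leader": status["quorum_leader_name"],
           "in_quorum": in_quorum,
           # Monitors listed in the monmap but absent from the quorum.
           "out_of_quorum": [m for m in all_mons if m not in in_quorum],
       }


   # Hand-edited sample, not real cluster output.
   sample = """
   {
       "quorum": [0, 1],
       "quorum_names": ["a", "b"],
       "quorum_leader_name": "a",
       "monmap": {"mons": [{"rank": 0, "name": "a"},
                           {"rank": 1, "name": "b"},
                           {"rank": 2, "name": "c"}]}
   }
   """
   print(quorum_summary(sample))

A non-empty ``out_of_quorum`` list indicates that at least one monitor in the
monitor map is not participating in the quorum.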

To see the status of a specific monitor, run the following command:

.. prompt:: bash $

   ceph tell mon.[name] mon_status

Here the value of ``[name]`` can be found by consulting the output of the
``ceph quorum_status`` command. This command returns output similar to the
following:

::

    {
        "name": "b",
        "rank": 1,
        "state": "peon",
        "election_epoch": 6,
        "quorum": [
            0,
            1,
            2
        ],
        "features": {
            "required_con": "9025616074522624",
            "required_mon": [
                "kraken"
            ],
            "quorum_con": "1152921504336314367",
            "quorum_mon": [
                "kraken"
            ]
        },
        "outside_quorum": [],
        "extra_probe_peers": [],
        "sync_provider": [],
        "monmap": {
            "epoch": 2,
            "fsid": "ba807e74-b64f-4b72-b43f-597dfe60ddbc",
            "modified": "2016-12-26 14:42:09.288066",
            "created": "2016-12-26 14:42:03.573585",
            "features": {
                "persistent": [
                    "kraken"
                ],
                "optional": []
            },
            "mons": [
                {
                    "rank": 0,
                    "name": "a",
                    "addr": "127.0.0.1:40000\/0",
                    "public_addr": "127.0.0.1:40000\/0"
                },
                {
                    "rank": 1,
                    "name": "b",
                    "addr": "127.0.0.1:40001\/0",
                    "public_addr": "127.0.0.1:40001\/0"
                },
                {
                    "rank": 2,
                    "name": "c",
                    "addr": "127.0.0.1:40002\/0",
                    "public_addr": "127.0.0.1:40002\/0"
                }
            ]
        }
    }
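
Whether a monitor is in the quorum can be read from this output by checking
that the monitor's own ``rank`` appears in its ``quorum`` list. The following
is a minimal sketch that uses only keys shown in the sample output above. The
function name ``monitor_in_quorum`` is hypothetical, and the sample text is an
abbreviated form of that output.

.. code-block:: python

   import json


   def monitor_in_quorum(mon_status_json):
       """Return True if the monitor's rank is listed in its "quorum" array."""
       status = json.loads(mon_status_json)
       return status["rank"] in status["quorum"]


   # Abbreviated form of the sample output (monitor "b", rank 1).
   sample = '{"name": "b", "rank": 1, "state": "peon", "quorum": [0, 1, 2]}'
   print(monitor_in_quorum(sample))

For the sample above the result is ``True``, which is consistent with the
``peon`` state that the monitor reports.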

To see a dump of the monitor state, run the following command:

.. prompt:: bash $

   ceph mon dump

This command returns output similar to the following:

::

    dumped monmap epoch 2
    epoch 2
    fsid ba807e74-b64f-4b72-b43f-597dfe60ddbc
    last_changed 2016-12-26 14:42:09.288066
    created 2016-12-26 14:42:03.573585
    0: 127.0.0.1:40000/0 mon.a
    1: 127.0.0.1:40001/0 mon.b
    2: 127.0.0.1:40002/0 mon.c