.. index:: control, commands

==================
 Control Commands
==================


Monitor Commands
================

Monitor commands are issued using the ``ceph`` utility::

    ceph [-m monhost] {command}

The command is usually (though not always) of the form::

    ceph {subsystem} {command}

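For example, a minimal invocation that targets a specific monitor might look like the
following (the monitor address is a placeholder for one of your own)::

    # 192.168.0.1 is a hypothetical monitor address
    ceph -m 192.168.0.1 health
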

System Commands
===============

Execute the following to display the current cluster status. ::

    ceph -s
    ceph status

Execute the following to display a running summary of cluster status
and major events. ::

    ceph -w

Execute the following to show the monitor quorum, including which monitors are
participating and which one is the leader. ::

    ceph mon stat
    ceph quorum_status

Execute the following to query the status of a single monitor, including whether
or not it is in the quorum. ::

    ceph tell mon.[id] mon_status

where the value of ``[id]`` can be determined, e.g., from ``ceph -s``.


Authentication Subsystem
========================

To add a keyring for an OSD, execute the following::

    ceph auth add {osd} {--in-file|-i} {path-to-osd-keyring}

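For example, a hypothetical invocation for ``osd.0``, assuming the keyring lives at the
path shown (adjust it to your deployment)::

    # the keyring path below is an assumption, not a requirement
    ceph auth add osd.0 -i /var/lib/ceph/osd/ceph-0/keyring
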

To list the cluster's keys and their capabilities, execute the following::

    ceph auth ls


Placement Group Subsystem
=========================

To display the statistics for all placement groups (PGs), execute the following::

    ceph pg dump [--format {format}]

The valid formats are ``plain`` (default), ``json``, ``json-pretty``, ``xml``, and ``xml-pretty``.
When implementing monitoring and other tools, it is best to use ``json`` format.
JSON parsing is more deterministic than the human-oriented ``plain``, and the layout is much
less variable from release to release. The ``jq`` utility can be invaluable when extracting
data from JSON output.

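For example, a sketch that lists each PG's ID and state with ``jq`` (the JSON field
names are assumptions based on recent releases; check them against your own output)::

    # assumes PG entries are nested under pg_map.pg_stats
    ceph pg dump --format json | jq -r '.pg_map.pg_stats[] | "\(.pgid) \(.state)"'
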

To display the statistics for all placement groups stuck in a specified state,
execute the following::

    ceph pg dump_stuck inactive|unclean|stale|undersized|degraded [--format {format}] [-t|--threshold {seconds}]


``--format`` may be ``plain`` (default), ``json``, ``json-pretty``, ``xml``, or ``xml-pretty``.

``--threshold`` defines how many seconds "stuck" is (default: 300).

**Inactive** Placement groups cannot process reads or writes because they are waiting for an OSD
with the most up-to-date data to come back.

**Unclean** Placement groups contain objects that are not replicated the desired number
of times. They should be recovering.

**Stale** Placement groups are in an unknown state - the OSDs that host them have not
reported to the monitor cluster in a while (configured by
``mon_osd_report_timeout``).

Delete "lost" objects, or revert them to their prior state: either a previous version,
or deletion if they were just created. ::

    ceph pg {pgid} mark_unfound_lost revert|delete

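For example, assuming a hypothetical PG ``2.5`` whose unfound objects you have decided
to roll back to their last known version::

    # 2.5 is a placeholder; use the PG ID reported as having unfound objects
    ceph pg 2.5 mark_unfound_lost revert
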

.. _osd-subsystem:

OSD Subsystem
=============

Query OSD subsystem status. ::

    ceph osd stat

Write a copy of the most recent OSD map to a file. See
:ref:`osdmaptool <osdmaptool>`. ::

    ceph osd getmap -o file

Write a copy of the crush map from the most recent OSD map to
file. ::

    ceph osd getcrushmap -o file

The foregoing is functionally equivalent to ::

    ceph osd getmap -o /tmp/osdmap
    osdmaptool /tmp/osdmap --export-crush file

Dump the OSD map. Valid formats for ``-f`` are ``plain``, ``json``, ``json-pretty``,
``xml``, and ``xml-pretty``. If no ``--format`` option is given, the OSD map is
dumped as plain text. As above, JSON format is best for tools, scripting, and other automation. ::

    ceph osd dump [--format {format}]

Dump the OSD map as a tree with one line per OSD containing weight
and state. ::

    ceph osd tree [--format {format}]

Find out where a specific object is or would be stored in the system::

    ceph osd map <pool-name> <object-name>

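For example, a sketch with a hypothetical pool and object name; the output shows the
PG that the object maps to and the set of OSDs acting for it::

    # "rbd" and "my-object" are placeholders for a real pool and object name
    ceph osd map rbd my-object
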

Add or move a new item (OSD) with the given id/name/weight at the specified
location. ::

    ceph osd crush set {id} {weight} [{loc1} [{loc2} ...]]

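For example, a hypothetical placement of ``osd.7`` with a CRUSH weight of 1.8 under a
particular host (the bucket names are assumptions about your hierarchy)::

    # root=default and host=node4 are placeholders for real CRUSH buckets
    ceph osd crush set osd.7 1.8 root=default host=node4
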

Remove an existing item (OSD) from the CRUSH map. ::

    ceph osd crush remove {name}

Remove an existing bucket from the CRUSH map. ::

    ceph osd crush remove {bucket-name}

Move an existing bucket from one position in the hierarchy to another. ::

    ceph osd crush move {id} {loc1} [{loc2} ...]

Set the weight of the item given by ``{name}`` to ``{weight}``. ::

    ceph osd crush reweight {name} {weight}

Mark an OSD as ``lost``. This may result in permanent data loss. Use with caution. ::

    ceph osd lost {id} [--yes-i-really-mean-it]

Create a new OSD. If no UUID is given, it will be set automatically when the OSD
starts up. ::

    ceph osd create [{uuid}]

Remove the given OSD(s). ::

    ceph osd rm [{id}...]

Query the current ``max_osd`` parameter in the OSD map. ::

    ceph osd getmaxosd

Import the given crush map. ::

    ceph osd setcrushmap -i file

Set the ``max_osd`` parameter in the OSD map. This value now defaults to 10000, so
most admins will never need to adjust it. ::

    ceph osd setmaxosd {n}

Mark OSD ``{osd-num}`` down. ::

    ceph osd down {osd-num}

Mark OSD ``{osd-num}`` out of the distribution (i.e. allocated no data). ::

    ceph osd out {osd-num}

Mark ``{osd-num}`` in the distribution (i.e. allocated data). ::

    ceph osd in {osd-num}

Set or clear the pause flags in the OSD map. If set, no IO requests
will be sent to any OSD. Clearing the flags via unpause results in
resending pending requests. ::

    ceph osd pause
    ceph osd unpause

Set the override weight (reweight) of ``{osd-num}`` to ``{weight}``. Two OSDs with the
same weight will receive roughly the same number of I/O requests and
store approximately the same amount of data. ``ceph osd reweight``
sets an override weight on the OSD. This value is in the range 0 to 1,
and forces CRUSH to re-place (1-weight) of the data that would
otherwise live on this drive. It does not change weights assigned
to the buckets above the OSD in the crush map, and is a corrective
measure in case the normal CRUSH distribution is not working out quite
right. For instance, if one of your OSDs is at 90% and the others are
at 50%, you could reduce this weight to compensate. ::

    ceph osd reweight {osd-num} {weight}

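For example, a sketch of the 90%-vs-50% situation above, using a hypothetical OSD ID::

    # 45 is a placeholder OSD ID; 0.7 asks CRUSH to re-place roughly 30% of its data
    ceph osd reweight 45 0.7
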

Balance OSD fullness by reducing the override weight of OSDs which are
overly utilized. Note that these override (aka ``reweight``) values
default to 1.00000 and are relative only to each other; they are not absolute.
It is crucial to distinguish them from CRUSH weights, which reflect the
absolute capacity of a bucket in TiB. By default this command adjusts
override weight on OSDs whose utilization deviates from the average by 20% or more,
but if you include a ``threshold`` that percentage will be used instead. ::

    ceph osd reweight-by-utilization [threshold [max_change [max_osds]]] [--no-increasing]

To limit the step by which any OSD's reweight will be changed, specify
``max_change``, which defaults to 0.05. To limit the number of OSDs that will
be adjusted, specify ``max_osds`` as well; the default is 4. Increasing these
parameters can speed leveling of OSD utilization, at the potential cost of
greater impact on client operations due to more data moving at once.

To determine which and how many PGs and OSDs will be affected by a given invocation
you can test before executing. ::

    ceph osd test-reweight-by-utilization [threshold [max_change [max_osds]]] [--no-increasing]

Adding ``--no-increasing`` to either command prevents increasing any
override weights that are currently < 1.00000. This can be useful when
you are balancing in a hurry to remedy ``full`` or ``nearfull`` OSDs or
when some OSDs are being evacuated or slowly brought into service.

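For example, a hypothetical dry run followed by the real adjustment, using a 110%
threshold and allowing at most 8 OSDs to change by no more than 0.05 each::

    # preview which PGs and OSDs would be affected
    ceph osd test-reweight-by-utilization 110 0.05 8 --no-increasing
    # apply the change with the same parameters
    ceph osd reweight-by-utilization 110 0.05 8 --no-increasing
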
Deployments utilizing Nautilus (or later revisions of Luminous and Mimic)
that have no pre-Luminous clients may instead wish to enable the
``balancer`` module for ``ceph-mgr``.

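A minimal sketch of turning it on (on recent releases the module may already be
enabled by default, and ``upmap`` mode assumes all clients understand pg-upmap)::

    ceph mgr module enable balancer
    ceph balancer mode upmap
    ceph balancer on
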
Add/remove an IP address or CIDR range to/from the blocklist.
When adding to the blocklist, you can specify how long it should be
blocklisted in seconds; otherwise, it will default to 1 hour. A
blocklisted address is prevented from connecting to any OSD. If you
blocklist an IP or range containing an OSD, be aware that OSD will also
be prevented from performing operations on its peers where it acts as a
client. (This includes tiering and copy-from functionality.)

If you want to blocklist a range (in CIDR format), you may do so by
including the ``range`` keyword.

These commands are mostly only useful for failure testing, as
blocklists are normally maintained automatically and shouldn't need
manual intervention. ::

    ceph osd blocklist ["range"] add ADDRESS[:source_port][/netmask_bits] [TIME]
    ceph osd blocklist ["range"] rm ADDRESS[:source_port][/netmask_bits]

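For example, a hypothetical client address blocked for one hour and then removed::

    # 192.168.0.100 is a placeholder address; 3600 is the duration in seconds
    ceph osd blocklist add 192.168.0.100 3600
    ceph osd blocklist rm 192.168.0.100
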

Creates/deletes a snapshot of a pool. ::

    ceph osd pool mksnap {pool-name} {snap-name}
    ceph osd pool rmsnap {pool-name} {snap-name}

Creates/deletes/renames a storage pool. ::

    ceph osd pool create {pool-name} [pg_num [pgp_num]]
    ceph osd pool delete {pool-name} [{pool-name} --yes-i-really-really-mean-it]
    ceph osd pool rename {old-name} {new-name}

Changes a pool setting. ::

    ceph osd pool set {pool-name} {field} {value}

Valid fields are:

    * ``size``: Sets the number of copies of data in the pool.
    * ``pg_num``: The placement group number.
    * ``pgp_num``: Effective number when calculating pg placement.
    * ``crush_rule``: The rule number for mapping placement.

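For example, assuming a hypothetical pool named ``mypool`` that should keep three
copies of each object::

    # "mypool" is a placeholder pool name
    ceph osd pool set mypool size 3
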

Get the value of a pool setting. ::

    ceph osd pool get {pool-name} {field}

Valid fields are:

    * ``pg_num``: The placement group number.
    * ``pgp_num``: Effective number of placement groups when calculating placement.


Sends a scrub command to OSD ``{osd-num}``. To send the command to all OSDs, use ``*``. ::

    ceph osd scrub {osd-num}

Sends a repair command to OSD.N. To send the command to all OSDs, use ``*``. ::

    ceph osd repair N

Runs a simple throughput benchmark against OSD.N, writing ``TOTAL_DATA_BYTES``
in write requests of ``BYTES_PER_WRITE`` each. By default, the test
writes 1 GB in total in 4-MB increments.
The benchmark is non-destructive and will not overwrite existing live
OSD data, but might temporarily affect the performance of clients
concurrently accessing the OSD. ::

    ceph tell osd.N bench [TOTAL_DATA_BYTES] [BYTES_PER_WRITE]

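For example, writing roughly 100 MB in 4 MB chunks to a hypothetical ``osd.0``::

    # 104857600 bytes total, in writes of 4194304 bytes each
    ceph tell osd.0 bench 104857600 4194304
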
To clear an OSD's caches between benchmark runs, use the ``cache drop`` command ::

    ceph tell osd.N cache drop

To get the cache statistics of an OSD, use the ``cache status`` command ::

    ceph tell osd.N cache status

MDS Subsystem
=============

Change configuration parameters on a running MDS. ::

    ceph tell mds.{mds-id} config set {setting} {value}

Example::

    ceph tell mds.0 config set debug_ms 1

This enables debug messages.

Display the status of all metadata servers. ::

    ceph mds stat

Mark the active MDS as failed, triggering failover to a standby if present. ::

    ceph mds fail 0

.. todo:: ``ceph mds`` subcommands missing docs: set, dump, getmap, stop, setmap


Mon Subsystem
=============

Show monitor stats::

    ceph mon stat

    e2: 3 mons at {a=127.0.0.1:40000/0,b=127.0.0.1:40001/0,c=127.0.0.1:40002/0}, election epoch 6, quorum 0,1,2 a,b,c


The ``quorum`` list at the end lists monitor nodes that are part of the current quorum.

This is also available more directly::

    ceph quorum_status -f json-pretty

.. code-block:: javascript

    {
      "election_epoch": 6,
      "quorum": [
        0,
        1,
        2
      ],
      "quorum_names": [
        "a",
        "b",
        "c"
      ],
      "quorum_leader_name": "a",
      "monmap": {
        "epoch": 2,
        "fsid": "ba807e74-b64f-4b72-b43f-597dfe60ddbc",
        "modified": "2016-12-26 14:42:09.288066",
        "created": "2016-12-26 14:42:03.573585",
        "features": {
          "persistent": [
            "kraken"
          ],
          "optional": []
        },
        "mons": [
          {
            "rank": 0,
            "name": "a",
            "addr": "127.0.0.1:40000\/0",
            "public_addr": "127.0.0.1:40000\/0"
          },
          {
            "rank": 1,
            "name": "b",
            "addr": "127.0.0.1:40001\/0",
            "public_addr": "127.0.0.1:40001\/0"
          },
          {
            "rank": 2,
            "name": "c",
            "addr": "127.0.0.1:40002\/0",
            "public_addr": "127.0.0.1:40002\/0"
          }
        ]
      }
    }


The above will block until a quorum is reached.

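If you only need a single field from this output, ``jq`` works well here too; for
example, to print just the current leader (field name as shown in the sample above)::

    ceph quorum_status -f json | jq -r .quorum_leader_name
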

For a status of just a single monitor::

    ceph tell mon.[name] mon_status

where the value of ``[name]`` can be taken from ``ceph quorum_status``. Sample
output::

    {
      "name": "b",
      "rank": 1,
      "state": "peon",
      "election_epoch": 6,
      "quorum": [
        0,
        1,
        2
      ],
      "features": {
        "required_con": "9025616074522624",
        "required_mon": [
          "kraken"
        ],
        "quorum_con": "1152921504336314367",
        "quorum_mon": [
          "kraken"
        ]
      },
      "outside_quorum": [],
      "extra_probe_peers": [],
      "sync_provider": [],
      "monmap": {
        "epoch": 2,
        "fsid": "ba807e74-b64f-4b72-b43f-597dfe60ddbc",
        "modified": "2016-12-26 14:42:09.288066",
        "created": "2016-12-26 14:42:03.573585",
        "features": {
          "persistent": [
            "kraken"
          ],
          "optional": []
        },
        "mons": [
          {
            "rank": 0,
            "name": "a",
            "addr": "127.0.0.1:40000\/0",
            "public_addr": "127.0.0.1:40000\/0"
          },
          {
            "rank": 1,
            "name": "b",
            "addr": "127.0.0.1:40001\/0",
            "public_addr": "127.0.0.1:40001\/0"
          },
          {
            "rank": 2,
            "name": "c",
            "addr": "127.0.0.1:40002\/0",
            "public_addr": "127.0.0.1:40002\/0"
          }
        ]
      }
    }

A dump of the monitor state::

    ceph mon dump

    dumped monmap epoch 2
    epoch 2
    fsid ba807e74-b64f-4b72-b43f-597dfe60ddbc
    last_changed 2016-12-26 14:42:09.288066
    created 2016-12-26 14:42:03.573585
    0: 127.0.0.1:40000/0 mon.a
    1: 127.0.0.1:40001/0 mon.b
    2: 127.0.0.1:40002/0 mon.c