.. index:: control, commands

==================
 Control Commands
==================


Monitor Commands
================

Monitor commands are issued using the ``ceph`` utility::

    ceph [-m monhost] {command}

The command is usually (though not always) of the form::

    ceph {subsystem} {command}

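For example, the ``osd`` and ``mon`` subsystems each provide a ``stat`` command. In
this sketch the monitor address is a placeholder, and ``-m`` may be omitted when a
usable ``ceph.conf`` is available::

    ceph osd stat
    ceph -m 192.168.0.1 mon stat    # 192.168.0.1 is a placeholder monitor address
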

System Commands
===============

Execute the following to display the current status of the cluster. ::

    ceph -s
    ceph status

Execute the following to display a running summary of the status of the cluster,
and major events. ::

    ceph -w

Execute the following to show the monitor quorum, including which monitors are
participating and which one is the leader. ::

    ceph mon stat
    ceph quorum_status

Execute the following to query the status of a single monitor, including whether
or not it is in the quorum. ::

    ceph tell mon.[id] mon_status

where the value of ``[id]`` can be determined, e.g., from ``ceph -s``.


Authentication Subsystem
========================

To add a keyring for an OSD, execute the following::

    ceph auth add {osd} {--in-file|-i} {path-to-osd-keyring}

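For example, to add the key for ``osd.0``; the keyring path shown is only the
conventional default location and may differ in your deployment::

    # the path below is an assumption; use the actual keyring location for osd.0
    ceph auth add osd.0 -i /var/lib/ceph/osd/ceph-0/keyring
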
To list the cluster's keys and their capabilities, execute the following::

    ceph auth ls


Placement Group Subsystem
=========================

To display the statistics for all placement groups, execute the following::

    ceph pg dump [--format {format}]

The valid formats are ``plain`` (default), ``json``, ``json-pretty``, ``xml``, and ``xml-pretty``.

To display the statistics for all placement groups stuck in a specified state,
execute the following::

    ceph pg dump_stuck inactive|unclean|stale|undersized|degraded [--format {format}] [-t|--threshold {seconds}]


``--format`` may be ``plain`` (default), ``json``, ``json-pretty``, ``xml``, or ``xml-pretty``.

``--threshold`` defines how many seconds "stuck" is (default: 300).

**Inactive** Placement groups cannot process reads or writes because they are waiting for an OSD
with the most up-to-date data to come back.

**Unclean** Placement groups contain objects that are not replicated the desired number
of times. They should be recovering.

**Stale** Placement groups are in an unknown state - the OSDs that host them have not
reported to the monitor cluster in a while (configured by
``mon_osd_report_timeout``).

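For example, to list placement groups that have been stale for at least two
minutes (the threshold shown is illustrative)::

    ceph pg dump_stuck stale --threshold 120
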
Revert or delete "lost" (unfound) objects: ``revert`` rolls an object back to a
previous version, while ``delete`` forgets it entirely, which is appropriate if
the object was only just created. ::

    ceph pg {pgid} mark_unfound_lost revert|delete

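For example, assuming placement group ``2.5`` is the one reporting unfound
objects (a hypothetical PG id), its objects could be reverted with::

    ceph pg 2.5 mark_unfound_lost revert    # 2.5 is a hypothetical pgid
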

OSD Subsystem
=============

Query OSD subsystem status. ::

    ceph osd stat

Write a copy of the most recent OSD map to a file. See
:ref:`osdmaptool <osdmaptool>`. ::

    ceph osd getmap -o file

Write a copy of the crush map from the most recent OSD map to
file. ::

    ceph osd getcrushmap -o file

The foregoing is functionally equivalent to ::

    ceph osd getmap -o /tmp/osdmap
    osdmaptool /tmp/osdmap --export-crush file

Dump the OSD map. Valid formats for ``-f`` are ``plain``, ``json``, ``json-pretty``,
``xml``, and ``xml-pretty``. If no ``--format`` option is given, the OSD map is
dumped as plain text. ::

    ceph osd dump [--format {format}]

Dump the OSD map as a tree with one line per OSD containing weight
and state. ::

    ceph osd tree [--format {format}]

Find out where a specific object is or would be stored in the system::

    ceph osd map <pool-name> <object-name>

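For example (``rbd`` and ``my-object`` are placeholder names; the output shown is
only illustrative, and the epoch, PG id, and OSD set will differ on every cluster)::

    ceph osd map rbd my-object

    osdmap e537 pool 'rbd' (1) object 'my-object' -> pg 1.c5034eb8 (1.38) -> up ([1,0,2], p1) acting ([1,0,2], p1)
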
Add or move a new item (OSD) with the given id/name/weight at the specified
location. ::

    ceph osd crush set {id} {weight} [{loc1} [{loc2} ...]]

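For example, to place ``osd.0`` with a CRUSH weight of 1.0 under a specific host;
location arguments are ``key=value`` pairs, and ``node1`` is a placeholder host name::

    ceph osd crush set osd.0 1.0 root=default host=node1    # node1 is a placeholder
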
Remove an existing item (OSD) from the CRUSH map. ::

    ceph osd crush remove {name}

Remove an existing bucket from the CRUSH map. ::

    ceph osd crush remove {bucket-name}

Move an existing bucket from one position in the hierarchy to another. ::

    ceph osd crush move {id} {loc1} [{loc2} ...]

Set the weight of the item given by ``{name}`` to ``{weight}``. ::

    ceph osd crush reweight {name} {weight}

Mark an OSD as lost. This may result in permanent data loss. Use with caution. ::

    ceph osd lost {id} [--yes-i-really-mean-it]

Create a new OSD. If no UUID is given, it will be set automatically when the OSD
starts up. ::

    ceph osd create [{uuid}]

Remove the given OSD(s). ::

    ceph osd rm [{id}...]

Query the current ``max_osd`` parameter in the OSD map. ::

    ceph osd getmaxosd

Import the given crush map. ::

    ceph osd setcrushmap -i file

Set the ``max_osd`` parameter in the OSD map. This is necessary when
expanding the storage cluster. ::

    ceph osd setmaxosd

Mark OSD ``{osd-num}`` down. ::

    ceph osd down {osd-num}

Mark OSD ``{osd-num}`` out of the distribution (i.e. allocated no data). ::

    ceph osd out {osd-num}

Mark ``{osd-num}`` in the distribution (i.e. allocated data). ::

    ceph osd in {osd-num}

Set or clear the pause flags in the OSD map. If set, no IO requests
will be sent to any OSD. Clearing the flags via unpause results in
resending pending requests. ::

    ceph osd pause
    ceph osd unpause

Set the override weight (reweight) of ``{osd-num}`` to ``{weight}``. Two OSDs with the
same weight will receive roughly the same number of I/O requests and
store approximately the same amount of data. ``ceph osd reweight``
sets an override weight on the OSD. This value is in the range 0 to 1,
and forces CRUSH to re-place (1-weight) of the data that would
otherwise live on this drive. It does not change weights assigned
to the buckets above the OSD in the CRUSH map, and is a corrective
measure in case the normal CRUSH distribution is not working out quite
right. For instance, if one of your OSDs is at 90% and the others are
at 50%, you could reduce this weight to compensate. ::

    ceph osd reweight {osd-num} {weight}

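For example, to move roughly 20% of the data that would otherwise map to
``osd.45`` elsewhere (the OSD id and weight are illustrative)::

    ceph osd reweight 45 0.8    # re-places (1 - 0.8) = 20% of this OSD's data
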
Balance OSD fullness by reducing the override weight of OSDs which are
overly utilized. Note that these override values, also known as ``reweight``
values, default to 1.00000 and are relative only to each other; they are not
absolute. It is crucial to distinguish them from CRUSH weights, which reflect
the absolute capacity of a bucket in TiB. By default this command adjusts
override weight on OSDs which have + or - 20% of the average utilization,
but if you include a ``threshold``, that percentage will be used instead. ::

    ceph osd reweight-by-utilization [threshold [max_change [max_osds]]] [--no-increasing]

To limit the step by which any OSD's reweight will be changed, specify
``max_change``, which defaults to 0.05. To limit the number of OSDs that will
be adjusted, specify ``max_osds`` as well; the default is 4. Increasing these
parameters can speed leveling of OSD utilization, at the potential cost of
greater impact on client operations due to more data moving at once.

To determine which and how many PGs and OSDs will be affected by a given
invocation, you can test it before executing. ::

    ceph osd test-reweight-by-utilization [threshold [max_change [max_osds]]] [--no-increasing]

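For example, a dry run using a threshold of 110 (i.e. OSDs more than 10% above
the average utilization), a maximum change of 0.05, and at most 8 OSDs; all
values are illustrative::

    ceph osd test-reweight-by-utilization 110 0.05 8
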
Adding ``--no-increasing`` to either command prevents increasing any
override weights that are currently < 1.00000. This can be useful when
you are balancing in a hurry to remedy ``full`` or ``nearfull`` OSDs or
when some OSDs are being evacuated or slowly brought into service.

Deployments utilizing Nautilus (or later revisions of Luminous and Mimic)
that have no pre-Luminous clients may instead wish to enable the
``balancer`` module for ``ceph-mgr``.

Add/remove an IP address to/from the blacklist. When adding an address,
you can specify how long it should be blacklisted in seconds; otherwise,
it will default to 1 hour. A blacklisted address is prevented from
connecting to any OSD. Blacklisting is most often used to prevent a
lagging metadata server from making bad changes to data on the OSDs.

These commands are mostly only useful for failure testing, as
blacklists are normally maintained automatically and shouldn't need
manual intervention. ::

    ceph osd blacklist add ADDRESS[:source_port] [TIME]
    ceph osd blacklist rm ADDRESS[:source_port]

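For example, to blacklist an address for ten minutes and later remove the entry
(the address is a placeholder)::

    ceph osd blacklist add 192.168.1.123 600    # placeholder address, 600 seconds
    ceph osd blacklist rm 192.168.1.123
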
Creates/deletes a snapshot of a pool. ::

    ceph osd pool mksnap {pool-name} {snap-name}
    ceph osd pool rmsnap {pool-name} {snap-name}

Creates/deletes/renames a storage pool. ::

    ceph osd pool create {pool-name} [pg_num [pgp_num]]
    ceph osd pool delete {pool-name} [{pool-name} --yes-i-really-really-mean-it]
    ceph osd pool rename {old-name} {new-name}

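For example, to create a pool with 128 placement groups and later rename it;
the pool names and PG count are illustrative::

    ceph osd pool create mypool 128
    ceph osd pool rename mypool mynewpool
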
Changes a pool setting. ::

    ceph osd pool set {pool-name} {field} {value}

Valid fields are:

  * ``size``: Sets the number of copies of data in the pool.
  * ``pg_num``: The placement group number.
  * ``pgp_num``: Effective number of placement groups to use when calculating placement.
  * ``crush_rule``: The rule to use for mapping object placement in the cluster.

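For example, to keep three copies of each object in a pool named ``mypool``
(a placeholder name)::

    ceph osd pool set mypool size 3
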
Get the value of a pool setting. ::

    ceph osd pool get {pool-name} {field}

Valid fields are:

  * ``pg_num``: The placement group number.
  * ``pgp_num``: Effective number of placement groups when calculating placement.


Sends a scrub command to OSD ``{osd-num}``. To send the command to all OSDs, use ``*``. ::

    ceph osd scrub {osd-num}

Sends a repair command to OSD.N. To send the command to all OSDs, use ``*``. ::

    ceph osd repair N

Runs a simple throughput benchmark against OSD.N, writing ``TOTAL_DATA_BYTES``
in write requests of ``BYTES_PER_WRITE`` each. By default, the test
writes 1 GB in total in 4-MB increments.
The benchmark is non-destructive and will not overwrite existing live
OSD data, but might temporarily affect the performance of clients
concurrently accessing the OSD. ::

    ceph tell osd.N bench [TOTAL_DATA_BYTES] [BYTES_PER_WRITE]

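For example, to write roughly 100 MB to ``osd.0`` in 4 MB requests; the sizes are
given in bytes and are only illustrative::

    ceph tell osd.0 bench 104857600 4194304    # 100 MB total, 4 MB per write
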
To clear an OSD's caches between benchmark runs, use the ``cache drop`` command::

    ceph tell osd.N cache drop

To get the cache statistics of an OSD, use the ``cache status`` command::

    ceph tell osd.N cache status


MDS Subsystem
=============

Change configuration parameters on a running MDS. ::

    ceph tell mds.{mds-id} config set {setting} {value}

Example (this enables debug messages)::

    ceph tell mds.0 config set debug_ms 1

Display the status of all metadata servers::

    ceph mds stat

Mark the active MDS as failed, triggering failover to a standby if present::

    ceph mds fail 0

.. todo:: ``ceph mds`` subcommands missing docs: set, dump, getmap, stop, setmap


Mon Subsystem
=============

Show monitor stats::

    ceph mon stat

    e2: 3 mons at {a=127.0.0.1:40000/0,b=127.0.0.1:40001/0,c=127.0.0.1:40002/0}, election epoch 6, quorum 0,1,2 a,b,c


The ``quorum`` list at the end lists monitor nodes that are part of the current quorum.

This is also available more directly::

    ceph quorum_status -f json-pretty

.. code-block:: javascript

  {
    "election_epoch": 6,
    "quorum": [
      0,
      1,
      2
    ],
    "quorum_names": [
      "a",
      "b",
      "c"
    ],
    "quorum_leader_name": "a",
    "monmap": {
      "epoch": 2,
      "fsid": "ba807e74-b64f-4b72-b43f-597dfe60ddbc",
      "modified": "2016-12-26 14:42:09.288066",
      "created": "2016-12-26 14:42:03.573585",
      "features": {
        "persistent": [
          "kraken"
        ],
        "optional": []
      },
      "mons": [
        {
          "rank": 0,
          "name": "a",
          "addr": "127.0.0.1:40000\/0",
          "public_addr": "127.0.0.1:40000\/0"
        },
        {
          "rank": 1,
          "name": "b",
          "addr": "127.0.0.1:40001\/0",
          "public_addr": "127.0.0.1:40001\/0"
        },
        {
          "rank": 2,
          "name": "c",
          "addr": "127.0.0.1:40002\/0",
          "public_addr": "127.0.0.1:40002\/0"
        }
      ]
    }
  }


The above will block until a quorum is reached.

For the status of a single monitor::

    ceph tell mon.[name] mon_status

where the value of ``[name]`` can be taken from ``ceph quorum_status``. Sample
output::

    {
      "name": "b",
      "rank": 1,
      "state": "peon",
      "election_epoch": 6,
      "quorum": [
        0,
        1,
        2
      ],
      "features": {
        "required_con": "9025616074522624",
        "required_mon": [
          "kraken"
        ],
        "quorum_con": "1152921504336314367",
        "quorum_mon": [
          "kraken"
        ]
      },
      "outside_quorum": [],
      "extra_probe_peers": [],
      "sync_provider": [],
      "monmap": {
        "epoch": 2,
        "fsid": "ba807e74-b64f-4b72-b43f-597dfe60ddbc",
        "modified": "2016-12-26 14:42:09.288066",
        "created": "2016-12-26 14:42:03.573585",
        "features": {
          "persistent": [
            "kraken"
          ],
          "optional": []
        },
        "mons": [
          {
            "rank": 0,
            "name": "a",
            "addr": "127.0.0.1:40000\/0",
            "public_addr": "127.0.0.1:40000\/0"
          },
          {
            "rank": 1,
            "name": "b",
            "addr": "127.0.0.1:40001\/0",
            "public_addr": "127.0.0.1:40001\/0"
          },
          {
            "rank": 2,
            "name": "c",
            "addr": "127.0.0.1:40002\/0",
            "public_addr": "127.0.0.1:40002\/0"
          }
        ]
      }
    }

A dump of the monitor state::

    ceph mon dump

    dumped monmap epoch 2
    epoch 2
    fsid ba807e74-b64f-4b72-b43f-597dfe60ddbc
    last_changed 2016-12-26 14:42:09.288066
    created 2016-12-26 14:42:03.573585
    0: 127.0.0.1:40000/0 mon.a
    1: 127.0.0.1:40001/0 mon.b
    2: 127.0.0.1:40002/0 mon.c