.. _orchestrator-cli-module:

================
Orchestrator CLI
================

This module provides a command line interface (CLI) to orchestrator
modules (ceph-mgr modules which interface with external orchestration services).

As the orchestrator CLI unifies different external orchestrators, a common nomenclature
for the orchestrator module is needed.

+--------------------------------------+---------------------------------------+
| *host*                               | hostname (not DNS name) of the        |
|                                      | physical host. Not the podname,       |
|                                      | container name, or hostname inside    |
|                                      | the container.                        |
+--------------------------------------+---------------------------------------+
| *service type*                       | The type of the service. e.g., nfs,   |
|                                      | mds, osd, mon, rgw, mgr, iscsi        |
+--------------------------------------+---------------------------------------+
| *service*                            | A logical service, typically          |
|                                      | comprised of multiple service         |
|                                      | instances on multiple hosts for HA    |
|                                      |                                       |
|                                      | * ``fs_name`` for mds type            |
|                                      | * ``rgw_zone`` for rgw type           |
|                                      | * ``ganesha_cluster_id`` for nfs type |
+--------------------------------------+---------------------------------------+
| *daemon*                             | A single instance of a service.       |
|                                      | Usually a daemon, but maybe not       |
|                                      | (e.g., might be a kernel service      |
|                                      | like LIO or knfsd or whatever)        |
|                                      |                                       |
|                                      | This identifier should                |
|                                      | uniquely identify the instance        |
+--------------------------------------+---------------------------------------+

The relation between the names is as follows:

* A *service* has a specific *service type*
* A *daemon* is a physical instance of a *service type*

.. note::

    Orchestrator modules may only implement a subset of the commands listed below.
    Also, the implementation of the commands is orchestrator-module dependent and
    will differ between implementations.

Status
======

::

    ceph orch status

Show the current orchestrator mode and high-level status (whether the module is
able to talk to the orchestrator backend).

Also show any in-progress actions.

Host Management
===============

List hosts associated with the cluster::

    ceph orch host ls

Add and remove hosts::

    ceph orch host add <host>
    ceph orch host rm <host>

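
For example, a new host can be registered with the cluster and removed again like
this (``node1`` is a placeholder hostname; the host must be reachable by the
configured orchestrator backend)::

    ceph orch host add node1
    ceph orch host rm node1
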

OSD Management
==============

List Devices
------------

Print a list of discovered devices, grouped by host and optionally
filtered to a particular host:

::

    ceph orch device ls [--host=...] [--refresh]

Example::

    # ceph orch device ls
    Host 192.168.121.206:
    Device Path      Type   Size    Rotates  Available  Model
    /dev/sdb         hdd    50.0G   True     True       ATA/QEMU HARDDISK
    /dev/sda         hdd    50.0G   True     False      ATA/QEMU HARDDISK

    Host 192.168.121.181:
    Device Path      Type   Size    Rotates  Available  Model
    /dev/sdb         hdd    50.0G   True     True       ATA/QEMU HARDDISK
    /dev/sda         hdd    50.0G   True     False      ATA/QEMU HARDDISK

.. note::
    Output from the Ansible orchestrator

Create OSDs
-----------

Create OSDs on a group of devices on a single host::

    ceph orch osd create <host>:<drive>
    ceph orch osd create -i <path-to-drive-group.json>

Where ``drive-group.json`` is a JSON file containing the fields defined in
:class:`ceph.deployment_utils.drive_group.DriveGroupSpec`.

The output of ``osd create`` is not specified and may vary between orchestrator backends.

Example::

    # ceph orch osd create 192.168.121.206:/dev/sdc
    {"status": "OK", "msg": "", "data": {"event": "playbook_on_stats", "uuid": "7082f3ba-f5b7-4b7c-9477-e74ca918afcb", "stdout": "\r\nPLAY RECAP *********************************************************************\r\n192.168.121.206 : ok=96 changed=3 unreachable=0 failed=0 \r\n", "counter": 932, "pid": 10294, "created": "2019-05-28T22:22:58.527821", "end_line": 1170, "runner_ident": "083cad3c-8197-11e9-b07a-2016b900e38f", "start_line": 1166, "event_data": {"ignored": 0, "skipped": {"192.168.121.206": 186}, "ok": {"192.168.121.206": 96}, "artifact_data": {}, "rescued": 0, "changed": {"192.168.121.206": 3}, "pid": 10294, "dark": {}, "playbook_uuid": "409364a6-9d49-4e44-8b7b-c28e5b3adf89", "playbook": "add-osd.yml", "failures": {}, "processed": {"192.168.121.206": 1}}, "parent_uuid": "409364a6-9d49-4e44-8b7b-c28e5b3adf89"}}

.. note::
    Output from the Ansible orchestrator

Decommission an OSD
-------------------
::

    ceph orch osd rm <osd-id> [osd-id...]

Removes one or more OSDs from the cluster and the host, if the OSDs are marked as
``destroyed``.
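
Marking an OSD as ``destroyed`` is done with the regular Ceph CLI rather than the
orchestrator. A minimal sketch, assuming the OSD with the placeholder id ``4`` is
already stopped::

    ceph osd destroy 4 --yes-i-really-mean-it
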

Example::

    # ceph orch osd rm 4
    {"status": "OK", "msg": "", "data": {"event": "playbook_on_stats", "uuid": "1a16e631-906d-48e0-9e24-fa7eb593cc0a", "stdout": "\r\nPLAY RECAP *********************************************************************\r\n192.168.121.158 : ok=2 changed=0 unreachable=0 failed=0 \r\n192.168.121.181 : ok=2 changed=0 unreachable=0 failed=0 \r\n192.168.121.206 : ok=2 changed=0 unreachable=0 failed=0 \r\nlocalhost : ok=31 changed=8 unreachable=0 failed=0 \r\n", "counter": 240, "pid": 10948, "created": "2019-05-28T22:26:09.264012", "end_line": 308, "runner_ident": "8c093db0-8197-11e9-b07a-2016b900e38f", "start_line": 301, "event_data": {"ignored": 0, "skipped": {"localhost": 37}, "ok": {"192.168.121.181": 2, "192.168.121.158": 2, "192.168.121.206": 2, "localhost": 31}, "artifact_data": {}, "rescued": 0, "changed": {"localhost": 8}, "pid": 10948, "dark": {}, "playbook_uuid": "a12ec40e-bce9-4bc9-b09e-2d8f76a5be02", "playbook": "shrink-osd.yml", "failures": {}, "processed": {"192.168.121.181": 1, "192.168.121.158": 1, "192.168.121.206": 1, "localhost": 1}}, "parent_uuid": "a12ec40e-bce9-4bc9-b09e-2d8f76a5be02"}}

.. note::
    Output from the Ansible orchestrator

..
    Blink Device Lights
    ^^^^^^^^^^^^^^^^^^^
    ::

        ceph orch device ident-on <dev_id>
        ceph orch device ident-on <dev_name> <host>
        ceph orch device fault-on <dev_id>
        ceph orch device fault-on <dev_name> <host>

        ceph orch device ident-off <dev_id> [--force=true]
        ceph orch device ident-off <dev_id> <host> [--force=true]
        ceph orch device fault-off <dev_id> [--force=true]
        ceph orch device fault-off <dev_id> <host> [--force=true]

    where ``dev_id`` is the device id as listed in ``osd metadata``,
    ``dev_name`` is the name of the device on the system and ``host`` is the host as
    returned by ``orchestrator host ls``.

        ceph orch osd ident-on {primary,journal,db,wal,all} <osd-id>
        ceph orch osd ident-off {primary,journal,db,wal,all} <osd-id>
        ceph orch osd fault-on {primary,journal,db,wal,all} <osd-id>
        ceph orch osd fault-off {primary,journal,db,wal,all} <osd-id>

    Where ``journal`` is the filestore journal, ``wal`` is the write-ahead log of
    BlueStore and ``all`` stands for all devices associated with the OSD.


Monitor and manager management
==============================

Creates or removes MONs or MGRs from the cluster. The orchestrator may return an
error if it doesn't know how to do this transition.

Update the number of monitor hosts::

    ceph orch apply mon <num> [host, host:network...]

Each host can optionally specify a network for the monitor to listen on.
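
For example, following the syntax above, three monitors could be placed on three
hosts that share a dedicated monitor network (the host names and the subnet are
placeholders)::

    ceph orch apply mon 3 host1:10.1.2.0/24 host2:10.1.2.0/24 host3:10.1.2.0/24
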

Update the number of manager hosts::

    ceph orch apply mgr <num> [host...]

..
    .. note::

        The host lists are the new full list of mon/mgr hosts

.. note::

    Specifying hosts is optional for some orchestrator modules
    and mandatory for others (e.g. Ansible).


Service Status
==============

Print a list of services known to the orchestrator. The list can be limited to
services of a particular type with the optional ``--service_type`` parameter
(mon, osd, mgr, mds, rgw) and/or to a particular service with ``--service_name``:

::

    ceph orch ls [--service_type type] [--service_name name] [--export] [--format f] [--refresh]
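
For example, to list only the MDS services (the exact output depends on the
orchestrator backend)::

    ceph orch ls --service_type mds
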

Discover the status of a particular service::

    ceph orch ls --service_type type --service_name <name> [--refresh]

Export the service specs known to the orchestrator as YAML in a format
that is compatible with ``ceph orch apply -i``::

    ceph orch ls --export


Daemon Status
=============

Print a list of all daemons known to the orchestrator::

    ceph orch ps [--hostname host] [--daemon_type type] [--service_name name] [--daemon_id id] [--format f] [--refresh]

Query the status of a particular service instance (mon, osd, mds, rgw). For OSDs
the id is the numeric OSD ID, for MDS services it is the file system name::

    ceph orch ps --daemon_type osd --daemon_id 0


.. _orchestrator-cli-cephfs:

Deploying CephFS
================

In order to set up a :term:`CephFS`, execute::

    ceph fs volume create <fs_name> <placement spec>

Where ``fs_name`` is the name of the CephFS and ``placement`` is a
:ref:`orchestrator-cli-placement-spec`.

This command will create the required Ceph pools, create the new
CephFS, and deploy MDS servers.
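
For example, using a count-plus-hosts placement string as described in
:ref:`orchestrator-cli-placement-spec` (the file system name and host names are
placeholders)::

    ceph fs volume create myfs "3 host1 host2 host3"
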

Stateless services (MDS/RGW/NFS/rbd-mirror/iSCSI)
==================================================

The orchestrator is not responsible for configuring the services. Please look into the corresponding
documentation for details.

The ``name`` parameter is an identifier of the group of instances:

* a CephFS file system for a group of MDS daemons,
* a zone name for a group of RGWs

Sizing: the ``size`` parameter gives the number of daemons in the cluster
(e.g. the number of MDS daemons for a particular CephFS file system).

Creating/growing/shrinking/removing services::

    ceph orch {mds,rgw} update <name> <size> [host…]
    ceph orch {mds,rgw} add <name>
    ceph orch nfs update <name> <size> [host…]
    ceph orch nfs add <name> <pool> [--namespace=<namespace>]
    ceph orch {mds,rgw,nfs} rm <name>

e.g., ``ceph orch mds update myfs 3 host1 host2 host3``
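
Similarly, an NFS service could be added like this (a sketch; the service name,
pool, and namespace are placeholders, and the pool must already exist)::

    ceph orch nfs add mynfs nfs-ganesha --namespace=nfs-ns
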

Start/stop/reload::

    ceph orch service {stop,start,reload} <type> <name>

    ceph orch daemon {start,stop,reload} <type> <daemon-id>

.. _orchestrator-cli-service-spec:

Service Specification
=====================

A *Service Specification* is a data structure, often represented as YAML,
that specifies the deployment of services. For example:

.. code-block:: yaml

    service_type: rgw
    service_id: realm.zone
    placement:
      hosts:
        - host1
        - host2
        - host3
    spec: ...

Where the properties of a service specification are the following:

* ``service_type`` is the type of the service. Needs to be either a Ceph
  service (``mon``, ``crash``, ``mds``, ``mgr``, ``osd`` or
  ``rbd-mirror``), a gateway (``nfs`` or ``rgw``), or part of the
  monitoring stack (``alertmanager``, ``grafana``, ``node-exporter`` or
  ``prometheus``).
* ``service_id`` is the name of the service. Omit the ``service_id`` for
  service types that do not require one (see below).
* ``placement`` is a :ref:`orchestrator-cli-placement-spec`
* ``spec``: additional specifications for a specific service.

Each service type can have different requirements for the spec.

Service specifications of type ``mon``, ``mgr``, and the monitoring
types do not require a ``service_id``.
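
A minimal specification for such a type therefore only needs ``service_type`` and
``placement``, for example (a sketch for three monitors placed by count):

.. code-block:: yaml

    service_type: mon
    placement:
      count: 3
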

A service of type ``nfs`` requires a pool name and may contain
an optional namespace:

.. code-block:: yaml

    service_type: nfs
    service_id: mynfs
    placement:
      hosts:
        - host1
        - host2
    spec:
      pool: mypool
      namespace: mynamespace

Where ``pool`` is a RADOS pool where NFS client recovery data is stored
and ``namespace`` is a RADOS namespace within that pool where the NFS client
recovery data is stored.
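
Assuming the above specification was saved to a hypothetical file ``nfs.yaml``, it
can be submitted to the orchestrator with::

    ceph orch apply -i nfs.yaml
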

A service of type ``osd`` is described in detail in :ref:`drivegroups`.

Many service specifications can then be applied at once using
``ceph orch apply -i`` by submitting a multi-document YAML file::

    cat <<EOF | ceph orch apply -i -
    service_type: mon
    placement:
      host_pattern: "mon*"
    ---
    service_type: mgr
    placement:
      host_pattern: "mgr*"
    ---
    service_type: osd
    placement:
      host_pattern: "osd*"
    data_devices:
      all: true
    EOF

.. _orchestrator-cli-placement-spec:

Placement Specification
=======================

In order to allow the orchestrator to deploy a *service*, it needs to
know how many and where it should deploy *daemons*. The orchestrator
defines a placement specification that can either be passed as a command line
argument or be given as part of a service specification.

Explicit placements
-------------------

Daemons can be explicitly placed on hosts by simply specifying them::

    orch apply prometheus "host1 host2 host3"

Or in yaml:

.. code-block:: yaml

    service_type: prometheus
    placement:
      hosts:
        - host1
        - host2
        - host3

MONs and other services may require some enhanced network specifications::

    orch daemon add mon myhost:[v2:1.2.3.4:3000,v1:1.2.3.4:6789]=name

Where ``[v2:1.2.3.4:3000,v1:1.2.3.4:6789]`` is the network address of the monitor
and ``=name`` specifies the name of the new monitor.

Placement by labels
-------------------

Daemons can be explicitly placed on hosts that match a specific label::

    orch apply prometheus label:mylabel

Or in yaml:

.. code-block:: yaml

    service_type: prometheus
    placement:
      label: "mylabel"
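
The label itself is attached to a host through host management. A sketch, assuming
the cephadm backend and placeholder host and label names::

    ceph orch host label add host1 mylabel
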


Placement by pattern matching
-----------------------------

Daemons can also be placed on hosts that match a host pattern::

    orch apply prometheus 'myhost[1-3]'

Or in yaml:

.. code-block:: yaml

    service_type: prometheus
    placement:
      host_pattern: "myhost[1-3]"

To place a service on *all* hosts, use ``"*"``::

    orch apply crash '*'

Or in yaml:

.. code-block:: yaml

    service_type: node-exporter
    placement:
      host_pattern: "*"


Setting a limit
---------------

By specifying ``count``, only that number of daemons will be created::

    orch apply prometheus 3

To deploy *daemons* on a subset of hosts, also specify the count::

    orch apply prometheus "2 host1 host2 host3"

If the count is bigger than the number of listed hosts, cephadm deploys only as many
daemons as there are hosts; in the following example only two daemons are created::

    orch apply prometheus "3 host1 host2"

Or in yaml:

.. code-block:: yaml

    service_type: prometheus
    placement:
      count: 3

Or with hosts:

.. code-block:: yaml

    service_type: prometheus
    placement:
      count: 2
      hosts:
        - host1
        - host2
        - host3


Configuring the Orchestrator CLI
================================

To enable the orchestrator, select the orchestrator module to use
with the ``set backend`` command::

    ceph orch set backend <module>

For example, to enable the Rook orchestrator module and use it with the CLI::

    ceph mgr module enable rook
    ceph orch set backend rook
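
The cephadm backend is enabled in the same way::

    ceph mgr module enable cephadm
    ceph orch set backend cephadm
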

Check that the backend is properly configured::

    ceph orch status

Disable the Orchestrator
------------------------

To disable the orchestrator, use the empty string ``""``::

    ceph orch set backend ""
    ceph mgr module disable rook

Current Implementation Status
=============================

This is an overview of the current implementation status of the orchestrators.

=================================== ====== =========
Command                             Rook   Cephadm
=================================== ====== =========
apply iscsi                         ⚪      ⚪
apply mds                           ✔      ✔
apply mgr                           ⚪      ✔
apply mon                           ✔      ✔
apply nfs                           ✔      ✔
apply osd                           ✔      ✔
apply rbd-mirror                    ✔      ✔
apply rgw                           ⚪      ✔
host add                            ⚪      ✔
host ls                             ✔      ✔
host rm                             ⚪      ✔
daemon status                       ⚪      ✔
daemon {stop,start,...}             ⚪      ✔
device {ident,fault}-{on,off}       ⚪      ✔
device ls                           ✔      ✔
iscsi add                           ⚪      ⚪
mds add                             ✔      ✔
nfs add                             ✔      ✔
ps                                  ✔      ✔
rbd-mirror add                      ⚪      ✔
rgw add                             ✔      ✔
=================================== ====== =========

where

* ⚪ = not yet implemented
* ❌ = not applicable
* ✔ = implemented