.. _orchestrator-cli-module:

================
Orchestrator CLI
================

This module provides a command line interface (CLI) to orchestrator
modules (ceph-mgr modules which interface with external orchestration services).

As the orchestrator CLI unifies different external orchestrators, a common nomenclature
for the orchestrator module is needed.

+--------------------------------------+---------------------------------------+
| *host*                               | hostname (not DNS name) of the        |
|                                      | physical host. Not the podname,       |
|                                      | container name, or hostname inside    |
|                                      | the container.                        |
+--------------------------------------+---------------------------------------+
| *service type*                       | The type of the service. e.g., nfs,   |
|                                      | mds, osd, mon, rgw, mgr, iscsi        |
+--------------------------------------+---------------------------------------+
| *service*                            | A logical service, typically          |
|                                      | comprised of multiple service         |
|                                      | instances on multiple hosts for HA    |
|                                      |                                       |
|                                      | * ``fs_name`` for mds type            |
|                                      | * ``rgw_zone`` for rgw type           |
|                                      | * ``ganesha_cluster_id`` for nfs type |
+--------------------------------------+---------------------------------------+
| *daemon*                             | A single instance of a service.       |
|                                      | Usually a daemon, but maybe not       |
|                                      | (e.g., might be a kernel service      |
|                                      | like LIO or knfsd or whatever)        |
|                                      |                                       |
|                                      | This identifier should                |
|                                      | uniquely identify the instance        |
+--------------------------------------+---------------------------------------+

The relation between the names is the following:

* A *service* has a specific *service type*
* A *daemon* is a physical instance of a *service type*

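
For example, using the nomenclature above (an illustrative sketch; the file system
name is an assumption)::

    service type:  mds
    service:       the group of MDS daemons serving the CephFS "myfs" (fs_name = myfs)
    daemon:        a single MDS instance of that service running on a particular host
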

.. note::

    Orchestrator modules may only implement a subset of the commands listed below.
    Also, the implementation of the commands is orchestrator-module dependent and will
    differ between implementations.

Status
======

::

    ceph orch status

Show the current orchestrator mode and high-level status (whether the module is able
to talk to it).

Also show any in-progress actions.

Host Management
===============

List hosts associated with the cluster::

    ceph orch host ls

Add and remove hosts::

    ceph orch host add <host>
    ceph orch host rm <host>

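
For example, assuming two additional hosts named ``node2`` and ``node3`` that are
already reachable by the orchestrator backend (the host names are assumptions)::

    ceph orch host add node2
    ceph orch host add node3
    ceph orch host ls
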
OSD Management
==============

List Devices
------------

Print a list of discovered devices, grouped by host and optionally
filtered to a particular host:

::

    ceph orch device ls [--host=...] [--refresh]

Example::

    # ceph orch device ls
    Host 192.168.121.206:
    Device Path    Type   Size    Rotates  Available  Model
    /dev/sdb       hdd    50.0G   True     True       ATA/QEMU HARDDISK
    /dev/sda       hdd    50.0G   True     False      ATA/QEMU HARDDISK

    Host 192.168.121.181:
    Device Path    Type   Size    Rotates  Available  Model
    /dev/sdb       hdd    50.0G   True     True       ATA/QEMU HARDDISK
    /dev/sda       hdd    50.0G   True     False      ATA/QEMU HARDDISK

.. note::
    Output from the Ansible orchestrator

Create OSDs
-----------

Create OSDs on a group of devices on a single host::

    ceph orch osd create <host>:<drive>
    ceph orch osd create -i <path-to-drive-group.json>

The output of ``osd create`` is not specified and may vary between orchestrator backends.

Where ``drive-group.json`` is a JSON file containing the fields defined in
:class:`ceph.deployment_utils.drive_group.DriveGroupSpec`.
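
A minimal drive group file might look like the following sketch, which selects all
available data devices on every matching host. The field names mirror the ``osd``
entry in the multi-document YAML example in the Service Specification section below;
the exact accepted fields depend on your DriveGroupSpec version:

.. code-block:: json

    {
        "service_type": "osd",
        "placement": {
            "host_pattern": "*"
        },
        "data_devices": {
            "all": true
        }
    }
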

Example::

    # ceph orch osd create 192.168.121.206:/dev/sdc
    {"status": "OK", "msg": "", "data": {"event": "playbook_on_stats", "uuid": "7082f3ba-f5b7-4b7c-9477-e74ca918afcb", "stdout": "\r\nPLAY RECAP *********************************************************************\r\n192.168.121.206 : ok=96 changed=3 unreachable=0 failed=0 \r\n", "counter": 932, "pid": 10294, "created": "2019-05-28T22:22:58.527821", "end_line": 1170, "runner_ident": "083cad3c-8197-11e9-b07a-2016b900e38f", "start_line": 1166, "event_data": {"ignored": 0, "skipped": {"192.168.121.206": 186}, "ok": {"192.168.121.206": 96}, "artifact_data": {}, "rescued": 0, "changed": {"192.168.121.206": 3}, "pid": 10294, "dark": {}, "playbook_uuid": "409364a6-9d49-4e44-8b7b-c28e5b3adf89", "playbook": "add-osd.yml", "failures": {}, "processed": {"192.168.121.206": 1}}, "parent_uuid": "409364a6-9d49-4e44-8b7b-c28e5b3adf89"}}

.. note::
    Output from the Ansible orchestrator

Decommission an OSD
-------------------

::

    ceph orch osd rm <osd-id> [osd-id...]

Removes one or more OSDs from the cluster and the host, if the OSDs are marked as
``destroyed``.

Example::

    # ceph orch osd rm 4
    {"status": "OK", "msg": "", "data": {"event": "playbook_on_stats", "uuid": "1a16e631-906d-48e0-9e24-fa7eb593cc0a", "stdout": "\r\nPLAY RECAP *********************************************************************\r\n192.168.121.158 : ok=2 changed=0 unreachable=0 failed=0 \r\n192.168.121.181 : ok=2 changed=0 unreachable=0 failed=0 \r\n192.168.121.206 : ok=2 changed=0 unreachable=0 failed=0 \r\nlocalhost : ok=31 changed=8 unreachable=0 failed=0 \r\n", "counter": 240, "pid": 10948, "created": "2019-05-28T22:26:09.264012", "end_line": 308, "runner_ident": "8c093db0-8197-11e9-b07a-2016b900e38f", "start_line": 301, "event_data": {"ignored": 0, "skipped": {"localhost": 37}, "ok": {"192.168.121.181": 2, "192.168.121.158": 2, "192.168.121.206": 2, "localhost": 31}, "artifact_data": {}, "rescued": 0, "changed": {"localhost": 8}, "pid": 10948, "dark": {}, "playbook_uuid": "a12ec40e-bce9-4bc9-b09e-2d8f76a5be02", "playbook": "shrink-osd.yml", "failures": {}, "processed": {"192.168.121.181": 1, "192.168.121.158": 1, "192.168.121.206": 1, "localhost": 1}}, "parent_uuid": "a12ec40e-bce9-4bc9-b09e-2d8f76a5be02"}}

.. note::
    Output from the Ansible orchestrator

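
To mark an OSD as ``destroyed`` beforehand, the standard Ceph CLI can be used
(a minimal sketch, assuming OSD ``4`` has already been drained of its data)::

    ceph osd out 4
    ceph osd destroy 4 --yes-i-really-mean-it
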
..
    Blink Device Lights
    ^^^^^^^^^^^^^^^^^^^
    ::

        ceph orch device ident-on <dev_id>
        ceph orch device ident-on <dev_name> <host>
        ceph orch device fault-on <dev_id>
        ceph orch device fault-on <dev_name> <host>

        ceph orch device ident-off <dev_id> [--force=true]
        ceph orch device ident-off <dev_id> <host> [--force=true]
        ceph orch device fault-off <dev_id> [--force=true]
        ceph orch device fault-off <dev_id> <host> [--force=true]

    where ``dev_id`` is the device id as listed in ``osd metadata``,
    ``dev_name`` is the name of the device on the system and ``host`` is the host as
    returned by ``orchestrator host ls``

        ceph orch osd ident-on {primary,journal,db,wal,all} <osd-id>
        ceph orch osd ident-off {primary,journal,db,wal,all} <osd-id>
        ceph orch osd fault-on {primary,journal,db,wal,all} <osd-id>
        ceph orch osd fault-off {primary,journal,db,wal,all} <osd-id>

    Where ``journal`` is the filestore journal, ``wal`` is the write ahead log of
    bluestore and ``all`` stands for all devices associated with the osd


Monitor and manager management
==============================

Creates or removes MONs or MGRs from the cluster. The orchestrator may return an
error if it doesn't know how to perform this transition.

Update the number of monitor hosts::

    ceph orch apply mon <num> [host, host:network...]

Each host can optionally specify a network for the monitor to listen on.

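
For example, to place three monitors on specific hosts, binding one of them to a
particular network (a sketch following the syntax above; the host names and the
network are assumptions)::

    ceph orch apply mon 3 host1:10.1.0.0/24 host2 host3
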
Update the number of manager hosts::

    ceph orch apply mgr <num> [host...]

..
    .. note::

        The host lists are the new full list of mon/mgr hosts

.. note::

    Specifying hosts is optional for some orchestrator modules
    and mandatory for others (e.g. Ansible).

Service Status
==============

Print a list of services known to the orchestrator. The list can be limited to
services on a particular host with the optional ``--host`` parameter, and/or to
services of a particular type via the optional ``--svc_type`` parameter
(mon, osd, mgr, mds, rgw):

::

    ceph orch ps
    ceph orch service ls [--host host] [--svc_type type] [--refresh]

Discover the status of a particular service or of its daemons::

    ceph orch service ls --svc_type type --svc_id <name> [--refresh]

Query the status of a particular service instance (mon, osd, mds, rgw). For OSDs
the id is the numeric OSD ID; for MDS services it is the file system name::

    ceph orch daemon status <type> <instance-name> [--refresh]

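
For example, following the conventions above (the daemon names are assumptions)::

    ceph orch daemon status osd 0
    ceph orch daemon status mds myfs
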
.. _orchestrator-cli-cephfs:

Deploying CephFS
================

In order to set up a :term:`CephFS`, execute::

    ceph fs volume create <fs_name> <placement spec>

Where ``fs_name`` is the name of the CephFS and ``placement`` is a
:ref:`orchestrator-cli-placement-spec`.

This command will create the required Ceph pools, create the new
CephFS, and deploy mds servers.

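
For example, to create a CephFS called ``myfs`` whose MDS daemons may run on any of
three hosts (a sketch using the placement syntax described below; the names are
assumptions)::

    ceph fs volume create myfs "3 host1 host2 host3"
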
Stateless services (MDS/RGW/NFS/rbd-mirror/iSCSI)
=================================================

The orchestrator is not responsible for configuring the services. Please refer to the
corresponding documentation for details.

The ``name`` parameter is an identifier of the group of instances:

* a CephFS file system for a group of MDS daemons,
* a zone name for a group of RGWs

Sizing: the ``size`` parameter gives the number of daemons in the cluster
(e.g. the number of MDS daemons for a particular CephFS file system).

Creating/growing/shrinking/removing services::

    ceph orch {mds,rgw} update <name> <size> [host…]
    ceph orch {mds,rgw} add <name>
    ceph orch nfs update <name> <size> [host…]
    ceph orch nfs add <name> <pool> [--namespace=<namespace>]
    ceph orch {mds,rgw,nfs} rm <name>

e.g., ``ceph orch mds update myfs 3 host1 host2 host3``

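
Similarly, an NFS service could be created and later removed like this (a sketch
following the commands above; the name, pool, and namespace are assumptions)::

    ceph orch nfs add mynfs nfs-ganesha --namespace=nfs-ns
    ceph orch nfs rm mynfs
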
Start/stop/reload::

    ceph orch service {stop,start,reload} <type> <name>

    ceph orch daemon {start,stop,reload} <type> <daemon-id>

.. _orchestrator-cli-service-spec:

Service Specification
=====================

A *Service Specification* is a data structure, often represented as YAML,
that specifies the deployment of services. For example:

.. code-block:: yaml

    service_type: rgw
    service_id: realm.zone
    placement:
      hosts:
        - host1
        - host2
        - host3
    spec: ...

Where the properties of a service specification are the following:

* ``service_type`` is the type of the service. Needs to be either a Ceph
  service (``mon``, ``crash``, ``mds``, ``mgr``, ``osd`` or
  ``rbd-mirror``), a gateway (``nfs`` or ``rgw``), or part of the
  monitoring stack (``alertmanager``, ``grafana``, ``node-exporter`` or
  ``prometheus``).
* ``service_id`` is the name of the service. Omit the ``service_id`` for
  service types that do not require one (see below).
* ``placement`` is a :ref:`orchestrator-cli-placement-spec`.
* ``spec``: additional specifications for a specific service.

Each service type can have different requirements for the spec.

Service specifications of type ``mon``, ``mgr``, and the monitoring
types do not require a ``service_id``.

A service of type ``nfs`` requires a pool name and may contain
an optional namespace:

.. code-block:: yaml

    service_type: nfs
    service_id: mynfs
    placement:
      hosts:
        - host1
        - host2
    spec:
      pool: mypool
      namespace: mynamespace

Where ``pool`` is a RADOS pool where NFS client recovery data is stored
and ``namespace`` is a RADOS namespace where NFS client recovery
data is stored in the pool.

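
If the referenced pool does not exist yet, it can be created with the standard RADOS
commands (a sketch, assuming default pool settings are acceptable)::

    ceph osd pool create mypool
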
A service of type ``osd`` is described in detail in :ref:`drivegroups`.

Many service specifications can then be applied at once using
``ceph orch apply -i`` by submitting a multi-document YAML file::

    cat <<EOF | ceph orch apply -i -
    service_type: mon
    placement:
      host_pattern: "mon*"
    ---
    service_type: mgr
    placement:
      host_pattern: "mgr*"
    ---
    service_type: osd
    placement:
      host_pattern: "osd*"
    data_devices:
      all: true
    EOF

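
The same specifications can also be kept in a file and passed to ``ceph orch apply -i``
directly (the file name is an assumption)::

    ceph orch apply -i cluster-specs.yaml
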
.. _orchestrator-cli-placement-spec:

Placement Specification
=======================

In order to allow the orchestrator to deploy a *service*, it needs to
know how many *daemons* to deploy and where to deploy them. The orchestrator
defines a placement specification that can be passed either as a command line
argument or as part of a service specification in YAML.

Explicit placements
-------------------

Daemons can be explicitly placed on hosts by simply specifying them::

    orch apply prometheus "host1 host2 host3"

Or in yaml:

.. code-block:: yaml

    service_type: prometheus
    placement:
      hosts:
        - host1
        - host2
        - host3

MONs and other services may require some enhanced network specifications::

    orch daemon add mon myhost:[v2:1.2.3.4:3000,v1:1.2.3.4:6789]=name

Where ``[v2:1.2.3.4:3000,v1:1.2.3.4:6789]`` is the network address of the monitor
and ``=name`` specifies the name of the new monitor.

Placement by labels
-------------------

Daemons can be explicitly placed on hosts that match a specific label::

    orch apply prometheus label:mylabel

Or in yaml:

.. code-block:: yaml

    service_type: prometheus
    placement:
      label: "mylabel"

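
With the cephadm backend, a label can typically be attached to a host beforehand
(a sketch; the host and label names are assumptions)::

    ceph orch host label add host1 mylabel
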

Placement by pattern matching
-----------------------------

Daemons can also be placed on hosts matching a host pattern::

    orch apply prometheus 'myhost[1-3]'

Or in yaml:

.. code-block:: yaml

    service_type: prometheus
    placement:
      host_pattern: "myhost[1-3]"

To place a service on *all* hosts, use ``"*"``::

    orch apply crash '*'

Or in yaml:

.. code-block:: yaml

    service_type: node-exporter
    placement:
      host_pattern: "*"

Setting a limit
---------------

By specifying ``count``, only that number of daemons will be created::

    orch apply prometheus 3

To deploy *daemons* on a subset of hosts, also specify the count::

    orch apply prometheus "2 host1 host2 host3"

If the count is bigger than the number of hosts, cephadm deploys only as many
daemons as there are hosts; for example, the following deploys only two daemons::

    orch apply prometheus "3 host1 host2"

Or in yaml:

.. code-block:: yaml

    service_type: prometheus
    placement:
      count: 3

Or with hosts:

.. code-block:: yaml

    service_type: prometheus
    placement:
      count: 2
      hosts:
        - host1
        - host2
        - host3

Configuring the Orchestrator CLI
================================

To enable the orchestrator, select the orchestrator module to use
with the ``set backend`` command::

    ceph orch set backend <module>

For example, to enable the Rook orchestrator module and use it with the CLI::

    ceph mgr module enable rook
    ceph orch set backend rook

Check that the backend is properly configured::

    ceph orch status

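
Similarly, to use the cephadm orchestrator listed in the implementation table below::

    ceph mgr module enable cephadm
    ceph orch set backend cephadm
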
Disable the Orchestrator
------------------------

To disable the orchestrator, use the empty string ``""``::

    ceph orch set backend ""
    ceph mgr module disable rook

Current Implementation Status
=============================

This is an overview of the current implementation status of the orchestrators.

=================================== ====== =========
 Command                             Rook   Cephadm
=================================== ====== =========
 apply iscsi                         ⚪      ⚪
 apply mds                           ✔      ✔
 apply mgr                           ⚪      ✔
 apply mon                           ✔      ✔
 apply nfs                           ✔      ✔
 apply osd                           ✔      ✔
 apply rbd-mirror                    ✔      ✔
 apply rgw                           ⚪      ✔
 host add                            ⚪      ✔
 host ls                             ✔      ✔
 host rm                             ⚪      ✔
 daemon status                       ⚪      ✔
 daemon {stop,start,...}             ⚪      ✔
 device {ident,fault}-{on,off}       ⚪      ✔
 device ls                           ✔      ✔
 iscsi add                           ⚪      ⚪
 mds add                             ✔      ✔
 nfs add                             ✔      ✔
 ps                                  ✔      ✔
 rbd-mirror add                      ⚪      ✔
 rgw add                             ✔      ✔
=================================== ====== =========

where

* ⚪ = not yet implemented
* ❌ = not applicable
* ✔ = implemented