.. _orchestrator-cli-module:

================
Orchestrator CLI
================

This module provides a command line interface (CLI) to orchestrator
modules (ceph-mgr modules which interface with external orchestration services).

As the orchestrator CLI unifies different external orchestrators, a common nomenclature
for the orchestrator module is needed.

14 +--------------------------------------+---------------------------------------+
15 | *host* | hostname (not DNS name) of the |
16 | | physical host. Not the podname, |
17 | | container name, or hostname inside |
18 | | the container. |
19 +--------------------------------------+---------------------------------------+
20 | *service type* | The type of the service. e.g., nfs, |
21 | | mds, osd, mon, rgw, mgr, iscsi |
22 +--------------------------------------+---------------------------------------+
23 | *service* | A logical service, Typically |
24 | | comprised of multiple service |
25 | | instances on multiple hosts for HA |
26 | | |
27 | | * ``fs_name`` for mds type |
28 | | * ``rgw_zone`` for rgw type |
29 | | * ``ganesha_cluster_id`` for nfs type |
30 +--------------------------------------+---------------------------------------+
31 | *daemon* | A single instance of a service. |
32 | | Usually a daemon, but maybe not |
33 | | (e.g., might be a kernel service |
34 | | like LIO or knfsd or whatever) |
35 | | |
36 | | This identifier should |
37 | | uniquely identify the instance |
38 +--------------------------------------+---------------------------------------+
39
The relation between the names is the following:

* A *service* has a specific *service type*
* A *daemon* is a physical instance of a *service type*


.. note::

    Orchestrator modules may only implement a subset of the commands listed below.
    Also, the implementation of the commands is orchestrator module dependent and will
    differ between implementations.

Status
======

::

    ceph orch status

Show the current orchestrator mode and high-level status (whether the module is
able to talk to it).

Also show any in-progress actions.

Host Management
===============

List hosts associated with the cluster::

    ceph orch host ls

Add and remove hosts::

    ceph orch host add <host>
    ceph orch host rm <host>

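For example, a minimal hypothetical session that registers a host named ``node1``
and later removes it again (the host name should be the hostname of the physical
host, as described in the nomenclature table above)::

    ceph orch host add node1
    ceph orch host ls
    ceph orch host rm node1
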
OSD Management
==============

List Devices
------------

Print a list of discovered devices, grouped by host and optionally
filtered to a particular host:

::

    ceph orch device ls [--host=...] [--refresh]

Example::

    # ceph orch device ls
    Host 192.168.121.206:
      Device Path    Type  Size   Rotates  Available  Model
      /dev/sdb       hdd   50.0G  True     True       ATA/QEMU HARDDISK
      /dev/sda       hdd   50.0G  True     False      ATA/QEMU HARDDISK

    Host 192.168.121.181:
      Device Path    Type  Size   Rotates  Available  Model
      /dev/sdb       hdd   50.0G  True     True       ATA/QEMU HARDDISK
      /dev/sda       hdd   50.0G  True     False      ATA/QEMU HARDDISK

.. note::
    Output from the Ansible orchestrator

Create OSDs
-----------

Create OSDs on a group of devices on a single host::

    ceph orch osd create <host>:<drive>
    ceph orch osd create -i <path-to-drive-group.json>

The output of ``osd create`` is not specified and may vary between orchestrator backends.

Where ``drive-group.json`` is a JSON file containing the fields defined in
:class:`ceph.deployment_utils.drive_group.DriveGroupSpec`.

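A minimal sketch of such a drive group file is shown below. It selects all
rotational disks on the matching hosts as data devices; the field names follow
``DriveGroupSpec``, and the host pattern and filter values are placeholders that
need to be adapted to your environment::

    {
        "host_pattern": "*",
        "data_devices": {
            "rotational": true
        }
    }
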
Example::

    # ceph orch osd create 192.168.121.206:/dev/sdc
    {"status": "OK", "msg": "", "data": {"event": "playbook_on_stats", "uuid": "7082f3ba-f5b7-4b7c-9477-e74ca918afcb", "stdout": "\r\nPLAY RECAP *********************************************************************\r\n192.168.121.206 : ok=96 changed=3 unreachable=0 failed=0 \r\n", "counter": 932, "pid": 10294, "created": "2019-05-28T22:22:58.527821", "end_line": 1170, "runner_ident": "083cad3c-8197-11e9-b07a-2016b900e38f", "start_line": 1166, "event_data": {"ignored": 0, "skipped": {"192.168.121.206": 186}, "ok": {"192.168.121.206": 96}, "artifact_data": {}, "rescued": 0, "changed": {"192.168.121.206": 3}, "pid": 10294, "dark": {}, "playbook_uuid": "409364a6-9d49-4e44-8b7b-c28e5b3adf89", "playbook": "add-osd.yml", "failures": {}, "processed": {"192.168.121.206": 1}}, "parent_uuid": "409364a6-9d49-4e44-8b7b-c28e5b3adf89"}}

.. note::
    Output from the Ansible orchestrator

Decommission an OSD
-------------------
::

    ceph orch osd rm <osd-id> [osd-id...]

Removes one or more OSDs from the cluster and from the host, if the OSDs are
marked as ``destroyed``.

Example::

    # ceph orch osd rm 4
    {"status": "OK", "msg": "", "data": {"event": "playbook_on_stats", "uuid": "1a16e631-906d-48e0-9e24-fa7eb593cc0a", "stdout": "\r\nPLAY RECAP *********************************************************************\r\n192.168.121.158 : ok=2 changed=0 unreachable=0 failed=0 \r\n192.168.121.181 : ok=2 changed=0 unreachable=0 failed=0 \r\n192.168.121.206 : ok=2 changed=0 unreachable=0 failed=0 \r\nlocalhost : ok=31 changed=8 unreachable=0 failed=0 \r\n", "counter": 240, "pid": 10948, "created": "2019-05-28T22:26:09.264012", "end_line": 308, "runner_ident": "8c093db0-8197-11e9-b07a-2016b900e38f", "start_line": 301, "event_data": {"ignored": 0, "skipped": {"localhost": 37}, "ok": {"192.168.121.181": 2, "192.168.121.158": 2, "192.168.121.206": 2, "localhost": 31}, "artifact_data": {}, "rescued": 0, "changed": {"localhost": 8}, "pid": 10948, "dark": {}, "playbook_uuid": "a12ec40e-bce9-4bc9-b09e-2d8f76a5be02", "playbook": "shrink-osd.yml", "failures": {}, "processed": {"192.168.121.181": 1, "192.168.121.158": 1, "192.168.121.206": 1, "localhost": 1}}, "parent_uuid": "a12ec40e-bce9-4bc9-b09e-2d8f76a5be02"}}

.. note::
    Output from the Ansible orchestrator

..
   Blink Device Lights
   ^^^^^^^^^^^^^^^^^^^
   ::

       ceph orch device ident-on <dev_id>
       ceph orch device ident-on <dev_name> <host>
       ceph orch device fault-on <dev_id>
       ceph orch device fault-on <dev_name> <host>

       ceph orch device ident-off <dev_id> [--force=true]
       ceph orch device ident-off <dev_id> <host> [--force=true]
       ceph orch device fault-off <dev_id> [--force=true]
       ceph orch device fault-off <dev_id> <host> [--force=true]

   where ``dev_id`` is the device id as listed in ``osd metadata``,
   ``dev_name`` is the name of the device on the system and ``host`` is the host as
   returned by ``orchestrator host ls``

       ceph orch osd ident-on {primary,journal,db,wal,all} <osd-id>
       ceph orch osd ident-off {primary,journal,db,wal,all} <osd-id>
       ceph orch osd fault-on {primary,journal,db,wal,all} <osd-id>
       ceph orch osd fault-off {primary,journal,db,wal,all} <osd-id>

   Where ``journal`` is the filestore journal, ``wal`` is the write ahead log of
   bluestore and ``all`` stands for all devices associated with the osd

Monitor and manager management
==============================

Creates or removes MONs or MGRs from the cluster. The orchestrator may return an
error if it doesn't know how to do this transition.

Update the number of monitor hosts::

    ceph orch apply mon <num> [host, host:network...]

Each host can optionally specify a network for the monitor to listen on.

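As a hypothetical example, the following would ask for three monitors on three
hosts, each listening on a specific (made-up) network::

    ceph orch apply mon 3 host1:10.1.2.0/24 host2:10.1.2.0/24 host3:10.1.2.0/24
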
Update the number of manager hosts::

    ceph orch apply mgr <num> [host...]

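For example, with two hypothetical hosts::

    ceph orch apply mgr 2 host1 host2
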
..
   .. note::

      The host lists are the new full list of mon/mgr hosts

.. note::

    specifying hosts is optional for some orchestrator modules
    and mandatory for others (e.g. Ansible).

Service Status
==============

Print a list of services known to the orchestrator. The list can be limited to
services on a particular host with the optional ``--host`` parameter, and/or to
services of a particular type via the optional ``--svc_type`` parameter
(mon, osd, mgr, mds, rgw):

::

    ceph orch ps
    ceph orch service ls [--host host] [--svc_type type] [--refresh]

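For example, to list only the monitor daemons on a single (hypothetical) host::

    ceph orch service ls --host host1 --svc_type mon
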
Discover the status of a particular service or its daemons::

    ceph orch service ls --svc_type type --svc_id <name> [--refresh]

Query the status of a particular service instance (mon, osd, mds, rgw). For OSDs
the id is the numeric OSD ID, for MDS services it is the file system name::

    ceph orch daemon status <type> <instance-name> [--refresh]

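For example, to query one (hypothetical) OSD and one MDS daemon::

    ceph orch daemon status osd 0
    ceph orch daemon status mds myfs
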

.. _orchestrator-cli-cephfs:

Deploying CephFS
================

In order to set up a :term:`CephFS`, execute::

    ceph fs volume create <fs_name> <placement spec>

Where ``fs_name`` is the name of the CephFS and ``placement`` is a
:ref:`orchestrator-cli-placement-spec`.

This command will create the required Ceph pools, create the new
CephFS, and deploy MDS servers.

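For example, to create a file system called ``myfs`` whose MDS daemons are
placed on three hypothetical hosts (see the placement specification below)::

    ceph fs volume create myfs "3 host1 host2 host3"
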
Stateless services (MDS/RGW/NFS/rbd-mirror/iSCSI)
=================================================

The orchestrator is not responsible for configuring the services. Please look into the corresponding
documentation for details.

The ``name`` parameter is an identifier of the group of instances:

* a CephFS file system for a group of MDS daemons,
* a zone name for a group of RGWs

Sizing: the ``size`` parameter gives the number of daemons in the cluster
(e.g. the number of MDS daemons for a particular CephFS file system).

Creating/growing/shrinking/removing services::

    ceph orch {mds,rgw} update <name> <size> [host…]
    ceph orch {mds,rgw} add <name>
    ceph orch nfs update <name> <size> [host…]
    ceph orch nfs add <name> <pool> [--namespace=<namespace>]
    ceph orch {mds,rgw,nfs} rm <name>

e.g., ``ceph orch mds update myfs 3 host1 host2 host3``

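Similarly, a hypothetical NFS example (the pool and namespace names are
placeholders)::

    ceph orch nfs add mynfs nfs-ganesha --namespace=nfs-ns
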
Start/stop/reload::

    ceph orch service {stop,start,reload} <type> <name>

    ceph orch daemon {start,stop,reload} <type> <daemon-id>


.. _orchestrator-cli-placement-spec:

Placement Specification
=======================

In order to allow the orchestrator to deploy a *service*, it needs to
know how many *daemons* to deploy and where to deploy them. The orchestrator
defines a placement specification that can be passed as a command line argument
or as part of a service specification in YAML.

Explicit placements
-------------------

Daemons can be explicitly placed on hosts by simply specifying them::

    orch apply prometheus "host1 host2 host3"

Or in yaml::

    service_type: prometheus
    placement:
      hosts:
        - host1
        - host2
        - host3

MONs and other services may require some enhanced network specifications::

    orch daemon add mon myhost:[v2:1.2.3.4:3000,v1:1.2.3.4:6789]=name

Where ``[v2:1.2.3.4:3000,v1:1.2.3.4:6789]`` is the network address of the monitor
and ``=name`` specifies the name of the new monitor.

Placement by labels
-------------------

Daemons can be explicitly placed on hosts that match a specific label::

    orch apply prometheus label:mylabel

Or in yaml::

    service_type: prometheus
    placement:
      label: "mylabel"

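Labels have to be assigned to hosts beforehand. With the cephadm backend this can
be done with a command along the lines of the following (host and label names are
placeholders)::

    ceph orch host label add host1 mylabel
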

Placement by pattern matching
-----------------------------

Daemons can also be placed on all hosts via a pattern::

    orch apply prometheus '*'

Or in yaml::

    service_type: prometheus
    placement:
      all_hosts: true

Setting a limit
---------------

By specifying ``count``, only that number of daemons will be created::

    orch apply prometheus 3

To deploy *daemons* on a subset of hosts, also specify the count::

    orch apply prometheus "2 host1 host2 host3"

If the count is bigger than the number of hosts, cephadm deploys only one daemon
per host; in the following example only two daemons are created::

    orch apply prometheus "3 host1 host2"

Or in yaml::

    service_type: prometheus
    placement:
      count: 3

Or with hosts::

    service_type: prometheus
    placement:
      count: 2
      hosts:
        - host1
        - host2
        - host3

Configuring the Orchestrator CLI
================================

To enable the orchestrator, select the orchestrator module to use
with the ``set backend`` command::

    ceph orch set backend <module>

For example, to enable the Rook orchestrator module and use it with the CLI::

    ceph mgr module enable rook
    ceph orch set backend rook

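Similarly, for the cephadm backend::

    ceph mgr module enable cephadm
    ceph orch set backend cephadm
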
Check that the backend is properly configured::

    ceph orch status

Disable the Orchestrator
------------------------

To disable the orchestrator, use the empty string ``""``::

    ceph orch set backend ""
    ceph mgr module disable rook

Current Implementation Status
=============================

This is an overview of the current implementation status of the orchestrators.

=================================== ====== =========
Command                             Rook   Cephadm
=================================== ====== =========
apply iscsi                         ⚪      ⚪
apply mds                           ✔      ✔
apply mgr                           ⚪      ✔
apply mon                           ✔      ✔
apply nfs                           ✔      ⚪
apply osd                           ✔      ✔
apply rbd-mirror                    ✔      ✔
apply rgw                           ⚪      ✔
host add                            ⚪      ✔
host ls                             ✔      ✔
host rm                             ⚪      ✔
daemon status                       ⚪      ✔
daemon {stop,start,...}             ⚪      ✔
device {ident,fault}-{on,off}       ⚪      ✔
device ls                           ✔      ✔
iscsi add                           ⚪      ⚪
mds add                             ✔      ✔
nfs add                             ✔      ⚪
ps                                  ⚪      ✔
rbd-mirror add                      ⚪      ✔
rgw add                             ✔      ✔
=================================== ====== =========

where

* ⚪ = not yet implemented
* ❌ = not applicable
* ✔ = implemented