.. _orchestrator-cli-module:

================
Orchestrator CLI
================

This module provides a command line interface (CLI) to orchestrator
modules (ceph-mgr modules which interface with external orchestration services).

As the orchestrator CLI unifies different external orchestrators, a common nomenclature
for the orchestrator module is needed.

+--------------------------------------+---------------------------------------+
| *host*                               | hostname (not DNS name) of the        |
|                                      | physical host. Not the podname,       |
|                                      | container name, or hostname inside    |
|                                      | the container.                        |
+--------------------------------------+---------------------------------------+
| *service type*                       | The type of the service. e.g., nfs,   |
|                                      | mds, osd, mon, rgw, mgr, iscsi        |
+--------------------------------------+---------------------------------------+
| *service*                            | A logical service, typically          |
|                                      | comprised of multiple service         |
|                                      | instances on multiple hosts for HA    |
|                                      |                                       |
|                                      | * ``fs_name`` for mds type            |
|                                      | * ``rgw_zone`` for rgw type           |
|                                      | * ``ganesha_cluster_id`` for nfs type |
+--------------------------------------+---------------------------------------+
| *daemon*                             | A single instance of a service.       |
|                                      | Usually a daemon, but maybe not       |
|                                      | (e.g., might be a kernel service      |
|                                      | like LIO or knfsd or whatever)        |
|                                      |                                       |
|                                      | This identifier should                |
|                                      | uniquely identify the instance        |
+--------------------------------------+---------------------------------------+

The relation between the names is as follows:

* A *service* has a specific *service type*
* A *daemon* is a physical instance of a *service type*

.. note::

    Orchestrator modules may only implement a subset of the commands listed below.
    Also, the implementation of the commands is orchestrator-module dependent and
    will differ between implementations.

Status
======

::

    ceph orch status

Show the current orchestrator mode and high-level status (whether the module is
able to talk to the orchestrator backend).

Also show any in-progress actions.

Host Management
===============

List hosts associated with the cluster::

    ceph orch host ls

Add and remove hosts::

    ceph orch host add <host>
    ceph orch host rm <host>

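
For example, a new host can be registered with the cluster and removed again like
this (``node1`` is a placeholder hostname; the host must be reachable by the
configured orchestrator backend)::

    ceph orch host add node1
    ceph orch host rm node1
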

OSD Management
==============

List Devices
------------

Print a list of discovered devices, grouped by host and optionally
filtered to a particular host:

::

    ceph orch device ls [--host=...] [--refresh]

Example::

    # ceph orch device ls
    Host 192.168.121.206:
    Device Path      Type   Size    Rotates  Available  Model
    /dev/sdb         hdd    50.0G   True     True       ATA/QEMU HARDDISK
    /dev/sda         hdd    50.0G   True     False      ATA/QEMU HARDDISK

    Host 192.168.121.181:
    Device Path      Type   Size    Rotates  Available  Model
    /dev/sdb         hdd    50.0G   True     True       ATA/QEMU HARDDISK
    /dev/sda         hdd    50.0G   True     False      ATA/QEMU HARDDISK

.. note::
    Output from the Ansible orchestrator

Create OSDs
-----------

Create OSDs on a group of devices on a single host::

    ceph orch osd create <host>:<drive>
    ceph orch osd create -i <path-to-drive-group.json>

Where ``drive-group.json`` is a JSON file containing the fields defined in
:class:`ceph.deployment_utils.drive_group.DriveGroupSpec`.

The output of ``osd create`` is not specified and may vary between orchestrator backends.

Example::

    # ceph orch osd create 192.168.121.206:/dev/sdc
    {"status": "OK", "msg": "", "data": {"event": "playbook_on_stats", "uuid": "7082f3ba-f5b7-4b7c-9477-e74ca918afcb", "stdout": "\r\nPLAY RECAP *********************************************************************\r\n192.168.121.206 : ok=96 changed=3 unreachable=0 failed=0 \r\n", "counter": 932, "pid": 10294, "created": "2019-05-28T22:22:58.527821", "end_line": 1170, "runner_ident": "083cad3c-8197-11e9-b07a-2016b900e38f", "start_line": 1166, "event_data": {"ignored": 0, "skipped": {"192.168.121.206": 186}, "ok": {"192.168.121.206": 96}, "artifact_data": {}, "rescued": 0, "changed": {"192.168.121.206": 3}, "pid": 10294, "dark": {}, "playbook_uuid": "409364a6-9d49-4e44-8b7b-c28e5b3adf89", "playbook": "add-osd.yml", "failures": {}, "processed": {"192.168.121.206": 1}}, "parent_uuid": "409364a6-9d49-4e44-8b7b-c28e5b3adf89"}}

.. note::
    Output from the Ansible orchestrator

Decommission an OSD
-------------------
::

    ceph orch osd rm <osd-id> [osd-id...]

Removes one or more OSDs from the cluster and the host, if the OSDs are marked as
``destroyed``.
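
Marking an OSD as ``destroyed`` is done with the regular Ceph CLI rather than the
orchestrator. A minimal sketch, assuming the OSD with the placeholder id ``4`` is
already stopped::

    ceph osd destroy 4 --yes-i-really-mean-it
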

Example::

    # ceph orch osd rm 4
    {"status": "OK", "msg": "", "data": {"event": "playbook_on_stats", "uuid": "1a16e631-906d-48e0-9e24-fa7eb593cc0a", "stdout": "\r\nPLAY RECAP *********************************************************************\r\n192.168.121.158 : ok=2 changed=0 unreachable=0 failed=0 \r\n192.168.121.181 : ok=2 changed=0 unreachable=0 failed=0 \r\n192.168.121.206 : ok=2 changed=0 unreachable=0 failed=0 \r\nlocalhost : ok=31 changed=8 unreachable=0 failed=0 \r\n", "counter": 240, "pid": 10948, "created": "2019-05-28T22:26:09.264012", "end_line": 308, "runner_ident": "8c093db0-8197-11e9-b07a-2016b900e38f", "start_line": 301, "event_data": {"ignored": 0, "skipped": {"localhost": 37}, "ok": {"192.168.121.181": 2, "192.168.121.158": 2, "192.168.121.206": 2, "localhost": 31}, "artifact_data": {}, "rescued": 0, "changed": {"localhost": 8}, "pid": 10948, "dark": {}, "playbook_uuid": "a12ec40e-bce9-4bc9-b09e-2d8f76a5be02", "playbook": "shrink-osd.yml", "failures": {}, "processed": {"192.168.121.181": 1, "192.168.121.158": 1, "192.168.121.206": 1, "localhost": 1}}, "parent_uuid": "a12ec40e-bce9-4bc9-b09e-2d8f76a5be02"}}

.. note::
    Output from the Ansible orchestrator

..
    Blink Device Lights
    ^^^^^^^^^^^^^^^^^^^
    ::

        ceph orch device ident-on <dev_id>
        ceph orch device ident-on <dev_name> <host>
        ceph orch device fault-on <dev_id>
        ceph orch device fault-on <dev_name> <host>

        ceph orch device ident-off <dev_id> [--force=true]
        ceph orch device ident-off <dev_id> <host> [--force=true]
        ceph orch device fault-off <dev_id> [--force=true]
        ceph orch device fault-off <dev_id> <host> [--force=true]

    where ``dev_id`` is the device id as listed in ``osd metadata``,
    ``dev_name`` is the name of the device on the system and ``host`` is the host as
    returned by ``orchestrator host ls``.

        ceph orch osd ident-on {primary,journal,db,wal,all} <osd-id>
        ceph orch osd ident-off {primary,journal,db,wal,all} <osd-id>
        ceph orch osd fault-on {primary,journal,db,wal,all} <osd-id>
        ceph orch osd fault-off {primary,journal,db,wal,all} <osd-id>

    Where ``journal`` is the filestore journal, ``wal`` is the write-ahead log of
    BlueStore and ``all`` stands for all devices associated with the OSD.


Monitor and manager management
==============================

Creates or removes MONs or MGRs from the cluster. The orchestrator may return an
error if it doesn't know how to do this transition.

Update the number of monitor hosts::

    ceph orch apply mon <num> [host, host:network...]

Each host can optionally specify a network for the monitor to listen on.
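
For example, following the syntax above, three monitors could be placed on three
hosts that share a dedicated monitor network (the host names and the subnet are
placeholders)::

    ceph orch apply mon 3 host1:10.1.2.0/24 host2:10.1.2.0/24 host3:10.1.2.0/24
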

Update the number of manager hosts::

    ceph orch apply mgr <num> [host...]

..
    .. note::

        The host lists are the new full list of mon/mgr hosts

.. note::

    Specifying hosts is optional for some orchestrator modules
    and mandatory for others (e.g. Ansible).


Service Status
==============

Print a list of services known to the orchestrator. The list can be limited to
services of a particular type with the optional ``--service_type`` parameter
(mon, osd, mgr, mds, rgw) and/or to a particular service with ``--service_name``:

::

    ceph orch ls [--service_type type] [--service_name name] [--export] [--format f] [--refresh]
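
For example, to list only the MDS services (the exact output depends on the
orchestrator backend)::

    ceph orch ls --service_type mds
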

Discover the status of a particular service::

    ceph orch ls --service_type type --service_name <name> [--refresh]

Export the service specs known to the orchestrator as YAML in a format
that is compatible with ``ceph orch apply -i``::

    ceph orch ls --export


Daemon Status
=============

Print a list of all daemons known to the orchestrator::

    ceph orch ps [--hostname host] [--daemon_type type] [--service_name name] [--daemon_id id] [--format f] [--refresh]

Query the status of a particular service instance (mon, osd, mds, rgw). For OSDs
the id is the numeric OSD ID, for MDS services it is the file system name::

    ceph orch ps --daemon_type osd --daemon_id 0


.. _orchestrator-cli-cephfs:

Deploying CephFS
================

In order to set up a :term:`CephFS`, execute::

    ceph fs volume create <fs_name> <placement spec>

Where ``fs_name`` is the name of the CephFS and ``placement`` is a
:ref:`orchestrator-cli-placement-spec`.

This command will create the required Ceph pools, create the new
CephFS, and deploy MDS servers.
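
For example, using a count-plus-hosts placement string as described in
:ref:`orchestrator-cli-placement-spec` (the file system name and host names are
placeholders)::

    ceph fs volume create myfs "3 host1 host2 host3"
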

Stateless services (MDS/RGW/NFS/rbd-mirror/iSCSI)
==================================================

The orchestrator is not responsible for configuring the services. Please look into the corresponding
documentation for details.

The ``name`` parameter is an identifier of the group of instances:

* a CephFS file system for a group of MDS daemons,
* a zone name for a group of RGWs

Sizing: the ``size`` parameter gives the number of daemons in the cluster
(e.g. the number of MDS daemons for a particular CephFS file system).

Creating/growing/shrinking/removing services::

    ceph orch {mds,rgw} update <name> <size> [host…]
    ceph orch {mds,rgw} add <name>
    ceph orch nfs update <name> <size> [host…]
    ceph orch nfs add <name> <pool> [--namespace=<namespace>]
    ceph orch {mds,rgw,nfs} rm <name>

e.g., ``ceph orch mds update myfs 3 host1 host2 host3``
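
Similarly, an NFS service could be added like this (a sketch; the service name,
pool, and namespace are placeholders, and the pool must already exist)::

    ceph orch nfs add mynfs nfs-ganesha --namespace=nfs-ns
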

Start/stop/reload::

    ceph orch service {stop,start,reload} <type> <name>

    ceph orch daemon {start,stop,reload} <type> <daemon-id>

.. _orchestrator-cli-service-spec:

Service Specification
=====================

A *Service Specification* is a data structure, often represented as YAML,
that specifies the deployment of services. For example:

.. code-block:: yaml

    service_type: rgw
    service_id: realm.zone
    placement:
      hosts:
        - host1
        - host2
        - host3
    spec: ...

Where the properties of a service specification are the following:

* ``service_type`` is the type of the service. Needs to be either a Ceph
  service (``mon``, ``crash``, ``mds``, ``mgr``, ``osd`` or
  ``rbd-mirror``), a gateway (``nfs`` or ``rgw``), or part of the
  monitoring stack (``alertmanager``, ``grafana``, ``node-exporter`` or
  ``prometheus``).
* ``service_id`` is the name of the service. Omit the ``service_id`` for
  service types that do not require one (see below).
* ``placement`` is a :ref:`orchestrator-cli-placement-spec`
* ``spec``: additional specifications for a specific service.

Each service type can have different requirements for the spec.

Service specifications of type ``mon``, ``mgr``, and the monitoring
types do not require a ``service_id``.
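
A minimal specification for such a type therefore only needs ``service_type`` and
``placement``, for example (a sketch for three monitors placed by count):

.. code-block:: yaml

    service_type: mon
    placement:
      count: 3
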

A service of type ``nfs`` requires a pool name and may contain
an optional namespace:

.. code-block:: yaml

    service_type: nfs
    service_id: mynfs
    placement:
      hosts:
        - host1
        - host2
    spec:
      pool: mypool
      namespace: mynamespace

Where ``pool`` is a RADOS pool where NFS client recovery data is stored
and ``namespace`` is a RADOS namespace within that pool where the NFS client
recovery data is stored.
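
Assuming the above specification was saved to a hypothetical file ``nfs.yaml``, it
can be submitted to the orchestrator with::

    ceph orch apply -i nfs.yaml
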

A service of type ``osd`` is described in detail in :ref:`drivegroups`.

Many service specifications can then be applied at once using
``ceph orch apply -i`` by submitting a multi-document YAML file::

    cat <<EOF | ceph orch apply -i -
    service_type: mon
    placement:
      host_pattern: "mon*"
    ---
    service_type: mgr
    placement:
      host_pattern: "mgr*"
    ---
    service_type: osd
    placement:
      host_pattern: "osd*"
    data_devices:
      all: true
    EOF

.. _orchestrator-cli-placement-spec:

Placement Specification
=======================

In order to allow the orchestrator to deploy a *service*, it needs to
know how many and where it should deploy *daemons*. The orchestrator
defines a placement specification that can either be passed as a command line
argument or be given as part of a service specification.

Explicit placements
-------------------

Daemons can be explicitly placed on hosts by simply specifying them::

    orch apply prometheus "host1 host2 host3"

Or in yaml:

.. code-block:: yaml

    service_type: prometheus
    placement:
      hosts:
        - host1
        - host2
        - host3

MONs and other services may require some enhanced network specifications::

    orch daemon add mon myhost:[v2:1.2.3.4:3000,v1:1.2.3.4:6789]=name

Where ``[v2:1.2.3.4:3000,v1:1.2.3.4:6789]`` is the network address of the monitor
and ``=name`` specifies the name of the new monitor.

Placement by labels
-------------------

Daemons can be explicitly placed on hosts that match a specific label::

    orch apply prometheus label:mylabel

Or in yaml:

.. code-block:: yaml

    service_type: prometheus
    placement:
      label: "mylabel"
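
The label itself is attached to a host through host management. A sketch, assuming
the cephadm backend and placeholder host and label names::

    ceph orch host label add host1 mylabel
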


Placement by pattern matching
-----------------------------

Daemons can also be placed on hosts that match a host pattern::

    orch apply prometheus 'myhost[1-3]'

Or in yaml:

.. code-block:: yaml

    service_type: prometheus
    placement:
      host_pattern: "myhost[1-3]"

To place a service on *all* hosts, use ``"*"``::

    orch apply crash '*'

Or in yaml:

.. code-block:: yaml

    service_type: node-exporter
    placement:
      host_pattern: "*"


Setting a limit
---------------

By specifying ``count``, only that number of daemons will be created::

    orch apply prometheus 3

To deploy *daemons* on a subset of hosts, also specify the count::

    orch apply prometheus "2 host1 host2 host3"

If the count is bigger than the number of listed hosts, cephadm deploys only as many
daemons as there are hosts; in the following example only two daemons are created::

    orch apply prometheus "3 host1 host2"

Or in yaml:

.. code-block:: yaml

    service_type: prometheus
    placement:
      count: 3

Or with hosts:

.. code-block:: yaml

    service_type: prometheus
    placement:
      count: 2
      hosts:
        - host1
        - host2
        - host3


Configuring the Orchestrator CLI
================================

To enable the orchestrator, select the orchestrator module to use
with the ``set backend`` command::

    ceph orch set backend <module>

For example, to enable the Rook orchestrator module and use it with the CLI::

    ceph mgr module enable rook
    ceph orch set backend rook
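
The cephadm backend is enabled in the same way::

    ceph mgr module enable cephadm
    ceph orch set backend cephadm
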

Check that the backend is properly configured::

    ceph orch status

Disable the Orchestrator
------------------------

To disable the orchestrator, use the empty string ``""``::

    ceph orch set backend ""
    ceph mgr module disable rook

Current Implementation Status
=============================

This is an overview of the current implementation status of the orchestrators.

=================================== ====== =========
Command                             Rook   Cephadm
=================================== ====== =========
apply iscsi                         ⚪      ⚪
apply mds                           ✔      ✔
apply mgr                           ⚪      ✔
apply mon                           ✔      ✔
apply nfs                           ✔      ✔
apply osd                           ✔      ✔
apply rbd-mirror                    ✔      ✔
apply rgw                           ⚪      ✔
host add                            ⚪      ✔
host ls                             ✔      ✔
host rm                             ⚪      ✔
daemon status                       ⚪      ✔
daemon {stop,start,...}             ⚪      ✔
device {ident,fault}-{on,off}       ⚪      ✔
device ls                           ✔      ✔
iscsi add                           ⚪      ⚪
mds add                             ✔      ✔
nfs add                             ✔      ✔
ps                                  ✔      ✔
rbd-mirror add                      ⚪      ✔
rgw add                             ✔      ✔
=================================== ====== =========

where

* ⚪ = not yet implemented
* ❌ = not applicable
* ✔ = implemented