.. _orchestrator-cli-host-management:

===============
Host Management
===============

Listing Hosts
=============

Run a command of this form to list hosts associated with the cluster:

.. prompt:: bash #

   ceph orch host ls [--format yaml] [--host-pattern <name>] [--label <label>] [--host-status <status>] [--detail]

In commands of this form, the arguments "host-pattern", "label", and
"host-status" are optional and are used for filtering.

- "host-pattern" is a regex that matches against hostnames and returns only
  matching hosts.
- "label" returns only hosts with the specified label.
- "host-status" returns only hosts with the specified status (currently
  "offline" or "maintenance").
- Any combination of these filtering flags is valid. It is possible to filter
  against name, label and status simultaneously, or to filter against any
  proper subset of name, label and status, as shown in the example below.
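
For example, to list only the hosts that carry a given label and are currently
in maintenance mode, combine the filtering flags (the ``mon`` label here is
illustrative):

.. prompt:: bash #

   ceph orch host ls --label mon --host-status maintenance  # 'mon' is an example label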

The ``--detail`` parameter provides more host-related information for
cephadm-based clusters. For example:

.. prompt:: bash #

   ceph orch host ls --detail

::

  HOSTNAME     ADDRESS         LABELS  STATUS  VENDOR/MODEL                            CPU    HDD      SSD  NIC
  ceph-master  192.168.122.73  _admin          QEMU (Standard PC (Q35 + ICH9, 2009))   4C/4T  4/1.6TB  -    1
  1 hosts in cluster

.. _cephadm-adding-hosts:

Adding Hosts
============

Hosts must have these :ref:`cephadm-host-requirements` installed.
Hosts without all the necessary requirements will fail to be added to the cluster.

To add each new host to the cluster, perform two steps:

#. Install the cluster's public SSH key in the new host's root user's ``authorized_keys`` file:

   .. prompt:: bash #

      ssh-copy-id -f -i /etc/ceph/ceph.pub root@*<new-host>*

   For example:

   .. prompt:: bash #

      ssh-copy-id -f -i /etc/ceph/ceph.pub root@host2
      ssh-copy-id -f -i /etc/ceph/ceph.pub root@host3

#. Tell Ceph that the new node is part of the cluster:

   .. prompt:: bash #

      ceph orch host add *<newhost>* [*<ip>*] [*<label1> ...*]

   For example:

   .. prompt:: bash #

      ceph orch host add host2 10.10.0.102
      ceph orch host add host3 10.10.0.103

   It is best to explicitly provide the host IP address. If an address is
   not provided, then the host name will be immediately resolved via
   DNS and the result will be used.

   One or more labels can also be included to immediately label the
   new host. For example, by default the ``_admin`` label will make
   cephadm maintain a copy of the ``ceph.conf`` file and a
   ``client.admin`` keyring file in ``/etc/ceph``:

   .. prompt:: bash #

      ceph orch host add host4 10.10.0.104 --labels _admin

.. _cephadm-removing-hosts:

Removing Hosts
==============

A host can safely be removed from the cluster after all daemons are removed
from it.

To drain all daemons from a host, run a command of the following form:

.. prompt:: bash #

   ceph orch host drain *<host>*

The ``_no_schedule`` and ``_no_conf_keyring`` labels will be applied to the
host. See :ref:`cephadm-special-host-labels`.

If you want to drain daemons but leave managed ``ceph.conf`` and keyring
files on the host, you may pass the ``--keep-conf-keyring`` flag to the
drain command.

.. prompt:: bash #

   ceph orch host drain *<host>* --keep-conf-keyring

This will apply the ``_no_schedule`` label to the host but not the
``_no_conf_keyring`` label.

All OSDs on the host will be scheduled to be removed. You can check
progress of the OSD removal operation with the following command:

.. prompt:: bash #

   ceph orch osd rm status

See :ref:`cephadm-osd-removal` for more details about OSD removal.

The ``orch host drain`` command also supports a ``--zap-osd-devices``
flag. Setting this flag while draining a host will cause cephadm to zap
the devices of the OSDs it is removing as part of the drain process:

.. prompt:: bash #

   ceph orch host drain *<host>* --zap-osd-devices

Use the following command to determine whether any daemons are still on the
host:

.. prompt:: bash #

   ceph orch ps <host>

After all daemons have been removed from the host, remove the host from the
cluster by running the following command:

.. prompt:: bash #

   ceph orch host rm <host>

Offline host removal
--------------------

If a host is offline and cannot be recovered, it can be removed from the
cluster by running a command of the following form:

.. prompt:: bash #

   ceph orch host rm <host> --offline --force

.. warning:: This can potentially cause data loss. This command forcefully
   purges OSDs from the cluster by calling ``osd purge-actual`` for each OSD.
   Any service specs that still contain this host should be manually updated.

.. _orchestrator-host-labels:

Host labels
===========

The orchestrator supports assigning labels to hosts. Labels
are free form and have no particular meaning by themselves; each host
can have multiple labels. They can be used to specify placement
of daemons. See :ref:`orch-placement-by-labels`.

Labels can be added when adding a host with the ``--labels`` flag:

.. prompt:: bash #

   ceph orch host add my_hostname --labels=my_label1
   ceph orch host add my_hostname --labels=my_label1,my_label2

To add a label to an existing host, run:

.. prompt:: bash #

   ceph orch host label add my_hostname my_label

To remove a label, run:

.. prompt:: bash #

   ceph orch host label rm my_hostname my_label


.. _cephadm-special-host-labels:

Special host labels
-------------------

The following host labels have a special meaning to cephadm. All start with ``_``.

* ``_no_schedule``: *Do not schedule or deploy daemons on this host*.

  This label prevents cephadm from deploying daemons on this host. If it is added to
  an existing host that already contains Ceph daemons, it will cause cephadm to move
  those daemons elsewhere (except OSDs, which are not removed automatically).

* ``_no_conf_keyring``: *Do not deploy config files or keyrings on this host*.

  This label is effectively the same as ``_no_schedule``, but instead of applying to
  daemons it applies to the client keyrings and ``ceph.conf`` files that are managed
  by cephadm.

* ``_no_autotune_memory``: *Do not autotune memory on this host*.

  This label will prevent daemon memory from being tuned even when the
  ``osd_memory_target_autotune`` or similar option is enabled for one or more daemons
  on that host.

* ``_admin``: *Distribute client.admin and ceph.conf to this host*.

  By default, an ``_admin`` label is applied to the first host in the cluster (where
  bootstrap was originally run), and the ``client.admin`` key is set to be distributed
  to that host via the ``ceph orch client-keyring ...`` function. Adding this label
  to additional hosts will normally cause cephadm to deploy config and keyring files
  in ``/etc/ceph``, as shown in the example below. Starting from versions 16.2.10
  (Pacific) and 17.2.1 (Quincy), in addition to the default location ``/etc/ceph/``,
  cephadm also stores config and keyring files in the
  ``/var/lib/ceph/<fsid>/config`` directory.
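
For example, to make an additional host an admin host after it has been added
to the cluster (the hostname ``host2`` is illustrative):

.. prompt:: bash #

   ceph orch host label add host2 _admin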

Maintenance Mode
================

A host can be placed into and taken out of maintenance mode (which stops all
Ceph daemons on the host):

.. prompt:: bash #

   ceph orch host maintenance enter <hostname> [--force] [--yes-i-really-mean-it]
   ceph orch host maintenance exit <hostname>

The ``--force`` flag allows the user to bypass warnings (but not alerts). The
``--yes-i-really-mean-it`` flag bypasses all safety checks and will attempt to
force the host into maintenance mode no matter what.

.. warning:: Using the ``--yes-i-really-mean-it`` flag to force the host to enter
   maintenance mode can potentially cause loss of data availability, the mon quorum
   to break down due to too few running monitors, mgr module commands (such as
   ``ceph orch ...`` commands) to become unresponsive, and a number of other
   possible issues. Please only use this flag if you're absolutely certain you
   know what you're doing.

See also :ref:`cephadm-fqdn`.

Rescanning Host Devices
=======================

Some servers and external enclosures may not register device removal or insertion with the
kernel. In these scenarios, you'll need to perform a device rescan on the appropriate host.
A rescan is typically non-disruptive, and can be performed with the following CLI command:

.. prompt:: bash #

   ceph orch host rescan <hostname> [--with-summary]

The ``--with-summary`` flag provides a breakdown of the number of HBAs found and scanned, together
with any that failed:

.. prompt:: bash [ceph:root@rh9-ceph1/]#

   ceph orch host rescan rh9-ceph1 --with-summary

::

  Ok. 2 adapters detected: 2 rescanned, 0 skipped, 0 failed (0.32s)

Creating many hosts at once
===========================

Many hosts can be added at once using
``ceph orch apply -i`` by submitting a multi-document YAML file:

.. code-block:: yaml

    service_type: host
    hostname: node-00
    addr: 192.168.0.10
    labels:
    - example1
    - example2
    ---
    service_type: host
    hostname: node-01
    addr: 192.168.0.11
    labels:
    - grafana
    ---
    service_type: host
    hostname: node-02
    addr: 192.168.0.12

This can be combined with :ref:`service specifications<orchestrator-cli-service-spec>`
to create a cluster spec file to deploy a whole cluster in one command. See
``cephadm bootstrap --apply-spec`` to do this during bootstrap. The cluster's
SSH key must be copied to hosts prior to adding them.
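
As a sketch of such a combined file (the hostnames, addresses, and the Grafana
placement below are illustrative), host documents and service specifications can
simply be concatenated in the same multi-document YAML file:

.. code-block:: yaml

    # illustrative hostnames, addresses, and placement
    service_type: host
    hostname: node-00
    addr: 192.168.0.10
    labels:
    - grafana
    ---
    service_type: host
    hostname: node-01
    addr: 192.168.0.11
    ---
    service_type: grafana
    placement:
      label: grafana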

Setting the initial CRUSH location of host
==========================================

Hosts can contain a ``location`` identifier which will instruct cephadm to
create a new CRUSH host bucket located in the specified hierarchy.
You can specify more than one element of the tree when doing so (for
instance if you want to ensure that the rack that a host is being
added to is also added to the default bucket), for example:

.. code-block:: yaml

    service_type: host
    hostname: node-00
    addr: 192.168.0.10
    location:
      root: default
      rack: rack1

.. note::

   The ``location`` attribute affects only the initial CRUSH location. Subsequent
   changes to the ``location`` property will be ignored. Also, removing a host will
   not remove its associated CRUSH bucket unless the ``--rm-crush-entry`` flag is
   provided to the ``orch host rm`` command.

See also :ref:`crush_map_default_types`.

Removing a host from the CRUSH map
==================================

The ``ceph orch host rm`` command has support for removing the associated host bucket
from the CRUSH map. This is done by providing the ``--rm-crush-entry`` flag.

.. prompt:: bash [ceph:root@host1/]#

   ceph orch host rm host1 --rm-crush-entry

When this flag is specified, cephadm will attempt to remove the host bucket
from the CRUSH map as part of the host removal process. Note that if
it fails to do so, cephadm will report the failure and the host will remain under
cephadm control.

.. note::

   Removal from the CRUSH map will fail if there are OSDs deployed on the
   host. If you would like to remove all the host's OSDs as well, please start
   by using the ``ceph orch host drain`` command to do so. Once the OSDs
   have been removed, you may direct cephadm to remove the CRUSH bucket
   along with the host using the ``--rm-crush-entry`` flag, as in the example
   that follows.
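
A minimal sketch of that sequence (the hostname ``host1`` is illustrative; wait
for ``ceph orch osd rm status`` to show that all OSD removals have completed
before removing the host):

.. prompt:: bash #

   ceph orch host drain host1
   ceph orch osd rm status
   ceph orch host rm host1 --rm-crush-entry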

OS Tuning Profiles
==================

Cephadm can be used to manage operating system tuning profiles that apply
``sysctl`` settings to sets of hosts.

To do so, create a YAML spec file in the following format:

.. code-block:: yaml

    profile_name: 23-mon-host-profile
    placement:
      hosts:
        - mon-host-01
        - mon-host-02
    settings:
      fs.file-max: 1000000
      vm.swappiness: '13'

Apply the tuning profile with the following command:

.. prompt:: bash #

   ceph orch tuned-profile apply -i <tuned-profile-file-name>

This profile is written to a file under ``/etc/sysctl.d/`` on each host
specified in the ``placement`` block, then ``sysctl --system`` is
run on the host.

.. note::

   The exact filename that the profile is written to within ``/etc/sysctl.d/``
   is ``<profile-name>-cephadm-tuned-profile.conf``, where ``<profile-name>`` is
   the ``profile_name`` setting that you specify in the YAML spec. We suggest
   naming these profiles following the usual ``sysctl.d`` ``NN-xxxxx`` convention. Because
   sysctl settings are applied in lexicographical order (sorted by the filename
   in which the setting is specified), you may want to carefully choose
   the ``profile_name`` in your spec so that it is applied before or after other
   conf files. Careful selection ensures that values supplied here override or
   do not override those in other ``sysctl.d`` files as desired.
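
For instance, with the spec shown above (``profile_name: 23-mon-host-profile``),
the settings would be written to
``/etc/sysctl.d/23-mon-host-profile-cephadm-tuned-profile.conf`` on each host in
the placement.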

.. note::

   These settings are applied only at the host level, and are not specific
   to any particular daemon or container.

.. note::

   Applying tuning profiles is idempotent when the ``--no-overwrite`` option is
   passed: if that option is passed, an existing profile with the same name is
   not overwritten.


Viewing Profiles
----------------

Run the following command to view all the profiles that cephadm currently manages:

.. prompt:: bash #

   ceph orch tuned-profile ls

.. note::

   To make modifications and re-apply a profile, pass ``--format yaml`` to the
   ``tuned-profile ls`` command. The ``tuned-profile ls --format yaml`` command
   presents the profiles in a format that is easy to copy and re-apply.


Removing Profiles
-----------------

To remove a previously applied profile, run this command:

.. prompt:: bash #

   ceph orch tuned-profile rm <profile-name>

When a profile is removed, cephadm cleans up the file previously written to ``/etc/sysctl.d``.


Modifying Profiles
------------------

Profiles can be modified by re-applying a YAML spec with the same name as the
profile that you want to modify, but settings within existing profiles can be
adjusted with the following commands.

To add or modify a setting in an existing profile:

.. prompt:: bash #

   ceph orch tuned-profile add-setting <profile-name> <setting-name> <value>

To remove a setting from an existing profile:

.. prompt:: bash #

   ceph orch tuned-profile rm-setting <profile-name> <setting-name>
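
For example, to adjust a single value in the profile shown earlier (the profile
name and value here are illustrative):

.. prompt:: bash #

   ceph orch tuned-profile add-setting 23-mon-host-profile vm.swappiness 10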

.. note::

   Modifying the placement requires re-applying a profile with the same name.
   Remember that profiles are tracked by their names, so when a profile with the
   same name as an existing profile is applied, it overwrites the old profile
   unless the ``--no-overwrite`` flag is passed.

SSH Configuration
=================

Cephadm uses SSH to connect to remote hosts. SSH uses a key to authenticate
with those hosts in a secure way.


Default behavior
----------------

Cephadm stores an SSH key in the monitor that is used to
connect to remote hosts. When the cluster is bootstrapped, this SSH
key is generated automatically and no additional configuration
is necessary.

A *new* SSH key can be generated with:

.. prompt:: bash #

   ceph cephadm generate-key

The public portion of the SSH key can be retrieved with:

.. prompt:: bash #

   ceph cephadm get-pub-key

The currently stored SSH key can be deleted with:

.. prompt:: bash #

   ceph cephadm clear-key

You can make use of an existing key by directly importing it with:

.. prompt:: bash #

   ceph config-key set mgr/cephadm/ssh_identity_key -i <key>
   ceph config-key set mgr/cephadm/ssh_identity_pub -i <pub>

You will then need to restart the mgr daemon to reload the configuration with:

.. prompt:: bash #

   ceph mgr fail

.. _cephadm-ssh-user:

Configuring a different SSH user
----------------------------------

Cephadm must be able to log into all the Ceph cluster nodes as a user
that has enough privileges to download container images, start containers
and execute commands without prompting for a password. If you do not want
to use the "root" user (the default option in cephadm), you must provide
cephadm with the name of the user that is going to be used to perform all
cephadm operations. Use the command:

.. prompt:: bash #

   ceph cephadm set-user <user>

Prior to running this, the cluster SSH key needs to be added to this user's
``authorized_keys`` file and non-root users must have passwordless sudo access.
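
A minimal sketch of preparing such a user, assuming the user already exists on
the new host (the name ``cephadm-user`` is illustrative, and the exact sudoers
policy is up to you): grant passwordless sudo on that host, then install the
cluster public key for that user from a node that holds ``/etc/ceph/ceph.pub``:

.. prompt:: bash #

   echo "cephadm-user ALL=(ALL) NOPASSWD: ALL" > /etc/sudoers.d/cephadm-user   # on the new host; example username
   ssh-copy-id -f -i /etc/ceph/ceph.pub cephadm-user@host2                     # from the admin node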


Customizing the SSH configuration
---------------------------------

Cephadm generates an appropriate ``ssh_config`` file that is
used for connecting to remote hosts. This configuration looks
something like this::

  Host *
  User root
  StrictHostKeyChecking no
  UserKnownHostsFile /dev/null

There are two ways to customize this configuration for your environment:

#. Import a customized configuration file that will be stored
   by the monitor with:

   .. prompt:: bash #

      ceph cephadm set-ssh-config -i <ssh_config_file>

   To remove a customized SSH config and revert back to the default behavior:

   .. prompt:: bash #

      ceph cephadm clear-ssh-config

#. You can configure a file location for the SSH configuration file with:

   .. prompt:: bash #

      ceph config set mgr mgr/cephadm/ssh_config_file <path>

   We do *not recommend* this approach. The path name must be
   visible to *any* mgr daemon, and cephadm runs all daemons as
   containers. That means that the file must either be placed
   inside a customized container image for your deployment, or
   manually distributed to the mgr data directory
   (``/var/lib/ceph/<cluster-fsid>/mgr.<id>`` on the host, visible at
   ``/var/lib/ceph/mgr/ceph-<id>`` from inside the container).

Setting up CA signed keys for the cluster
-----------------------------------------

Cephadm also supports using CA signed keys for SSH authentication
across cluster nodes. In this setup, instead of a private key and
public key, cephadm needs a private key and a certificate that was
created by signing the corresponding public key with a CA key. For more
info on setting up nodes for authentication using a CA signed key, see
:ref:`cephadm-bootstrap-ca-signed-keys`. Once you have your private
key and signed cert, they can be set up for cephadm to use by running:

.. prompt:: bash #

   ceph config-key set mgr/cephadm/ssh_identity_key -i <private-key-file>
   ceph config-key set mgr/cephadm/ssh_identity_cert -i <signed-cert-file>

.. _cephadm-fqdn:

Fully qualified domain names vs bare host names
===============================================

.. note::

  cephadm demands that the name of the host given via ``ceph orch host add``
  equals the output of ``hostname`` on remote hosts.

Otherwise cephadm can't be sure that names returned by
``ceph * metadata`` match the hosts known to cephadm. This might result
in a :ref:`cephadm-stray-host` warning.
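
A quick way to check this before adding a host (the hostname ``host2`` is
illustrative) is to compare the name you intend to pass to ``ceph orch host add``
with the output of ``hostname`` on that host:

.. prompt:: bash #

   ssh root@host2 hostname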

When configuring new hosts, there are two **valid** ways to set the
``hostname`` of a host:

1. Using the bare host name. In this case:

- ``hostname`` returns the bare host name.
- ``hostname -f`` returns the FQDN.

2. Using the fully qualified domain name as the host name. In this case:

- ``hostname`` returns the FQDN.
- ``hostname -s`` returns the bare host name.

Note that ``man hostname`` recommends that ``hostname`` return the bare
host name:

  The FQDN (Fully Qualified Domain Name) of the system is the
  name that the resolver(3) returns for the host name, for example
  ``ursula.example.com``. It is usually the short hostname followed by the DNS
  domain name (the part after the first dot). You can check the FQDN
  using ``hostname --fqdn`` or the domain name using ``dnsdomainname``.

  .. code-block:: none

     You cannot change the FQDN with hostname or dnsdomainname.

     The recommended method of setting the FQDN is to make the hostname
     be an alias for the fully qualified name using /etc/hosts, DNS, or
     NIS. For example, if the hostname was "ursula", one might have
     a line in /etc/hosts which reads

         127.0.1.1    ursula.example.com ursula

This means that ``man hostname`` recommends that ``hostname`` return the bare
host name. This in turn means that Ceph will return the bare host names
when executing ``ceph * metadata``. This in turn means cephadm also
requires the bare host name when adding a host to the cluster:
``ceph orch host add <bare-name>``.

..
  TODO: This chapter needs to provide a way for users to configure
  Grafana in the dashboard, as this is right now very hard to do.