1 .. _orchestrator-cli-host-management:
2
3 ===============
4 Host Management
5 ===============
6
7 Listing Hosts
8 =============
9
10 Run a command of this form to list hosts associated with the cluster:
11
12 .. prompt:: bash #
13
14 ceph orch host ls [--format yaml] [--host-pattern <name>] [--label <label>] [--host-status <status>] [--detail]
15
16 In commands of this form, the arguments "host-pattern", "label", and
17 "host-status" are optional and are used for filtering.
18
19 - "host-pattern" is a regex that matches against hostnames and returns only
20 matching hosts.
21 - "label" returns only hosts with the specified label.
22 - "host-status" returns only hosts with the specified status (currently
23 "offline" or "maintenance").
24 - Any combination of these filtering flags is valid. It is possible to filter
25 against name, label and status simultaneously, or to filter against any
26 proper subset of name, label and status.
27
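For example, to list only hosts that carry the ``_admin`` label and are currently
offline (the label and status values here are only illustrative), the filters can
be combined:

.. prompt:: bash #

    ceph orch host ls --label _admin --host-status offline
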
28 The ``--detail`` flag provides more host-related information for cephadm-based
29 clusters. For example:
30
31 .. prompt:: bash #
32
33 ceph orch host ls --detail
34
35 ::
36
37 HOSTNAME ADDRESS LABELS STATUS VENDOR/MODEL CPU HDD SSD NIC
38 ceph-master 192.168.122.73 _admin QEMU (Standard PC (Q35 + ICH9, 2009)) 4C/4T 4/1.6TB - 1
39 1 hosts in cluster
40
41 .. _cephadm-adding-hosts:
42
43 Adding Hosts
44 ============
45
46 Hosts must have these :ref:`cephadm-host-requirements` installed.
47 Hosts without all the necessary requirements will fail to be added to the cluster.
48
49 To add each new host to the cluster, perform two steps:
50
51 #. Install the cluster's public SSH key in the new host's root user's ``authorized_keys`` file:
52
53 .. prompt:: bash #
54
55 ssh-copy-id -f -i /etc/ceph/ceph.pub root@*<new-host>*
56
57 For example:
58
59 .. prompt:: bash #
60
61 ssh-copy-id -f -i /etc/ceph/ceph.pub root@host2
62 ssh-copy-id -f -i /etc/ceph/ceph.pub root@host3
63
64 #. Tell Ceph that the new node is part of the cluster:
65
66 .. prompt:: bash #
67
68 ceph orch host add *<newhost>* [*<ip>*] [*<label1> ...*]
69
70 For example:
71
72 .. prompt:: bash #
73
74 ceph orch host add host2 10.10.0.102
75 ceph orch host add host3 10.10.0.103
76
77 It is best to explicitly provide the host IP address. If an IP is
78 not provided, then the host name will be immediately resolved via
79 DNS and that IP will be used.
80
81 One or more labels can also be included to immediately label the
82 new host. For example, the ``_admin`` label will, by default, make
83 cephadm maintain a copy of the ``ceph.conf`` file and a
84 ``client.admin`` keyring file in ``/etc/ceph``:
85
86 .. prompt:: bash #
87
88 ceph orch host add host4 10.10.0.104 --labels _admin
89
90 .. _cephadm-removing-hosts:
91
92 Removing Hosts
93 ==============
94
95 A host can safely be removed from the cluster after all daemons are removed
96 from it.
97
98 To drain all daemons from a host, run a command of the following form:
99
100 .. prompt:: bash #
101
102 ceph orch host drain *<host>*
103
104 The ``_no_schedule`` and ``_no_conf_keyring`` labels will be applied to the
105 host. See :ref:`cephadm-special-host-labels`.
106
107 If you only want to drain daemons but leave managed ceph conf and keyring
108 files on the host, you may pass the ``--keep-conf-keyring`` flag to the
109 drain command.
110
111 .. prompt:: bash #
112
113 ceph orch host drain *<host>* --keep-conf-keyring
114
115 This will apply the ``_no_schedule`` label to the host but not the
116 ``_no_conf_keyring`` label.
117
118 All OSDs on the host will be scheduled to be removed. You can check the progress of the OSD removal operation with the following command:
119
120 .. prompt:: bash #
121
122 ceph orch osd rm status
123
124 See :ref:`cephadm-osd-removal` for more details about OSD removal.
125
126 The ``orch host drain`` command also supports a ``--zap-osd-devices``
127 flag. Setting this flag while draining a host will cause cephadm to zap
128 the devices of the OSDs it is removing as part of the drain process:
129
130 .. prompt:: bash #
131
132 ceph orch host drain *<host>* --zap-osd-devices
133
134 Use the following command to determine whether any daemons are still on the
135 host:
136
137 .. prompt:: bash #
138
139 ceph orch ps <host>
140
141 After all daemons have been removed from the host, remove the host from the
142 cluster by running the following command:
143
144 .. prompt:: bash #
145
146 ceph orch host rm <host>
147
148 Offline host removal
149 --------------------
150
151 Even if a host is offline and cannot be recovered, it can be removed from the
152 cluster by running a command of the following form:
153
154 .. prompt:: bash #
155
156 ceph orch host rm <host> --offline --force
157
158 .. warning:: This can potentially cause data loss. This command forcefully
159 purges OSDs from the cluster by calling ``osd purge-actual`` for each OSD.
160 Any service specs that still contain this host should be manually updated.
161
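To find service specs that still reference the removed host, the current specs can
be exported to a file, the host removed from any placements in that file, and the
edited specs re-applied (the file name here is illustrative):

.. prompt:: bash #

    ceph orch ls --export > specs.yaml
    ceph orch apply -i specs.yaml
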
162 .. _orchestrator-host-labels:
163
164 Host labels
165 ===========
166
167 The orchestrator supports assigning labels to hosts. Labels
168 are free form and have no particular meaning by themselves; each host
169 can have multiple labels. Labels can be used to specify the placement
170 of daemons. See :ref:`orch-placement-by-labels`.
171
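For example, a label can be referenced in a placement specification to select the
hosts that should run a service (the service and label names below are only
illustrative):

.. prompt:: bash #

    ceph orch apply prometheus --placement="label:my_label1"
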
172 Labels can be added when adding a host with the ``--labels`` flag:
173
174 .. prompt:: bash #
175
176 ceph orch host add my_hostname --labels=my_label1
177 ceph orch host add my_hostname --labels=my_label1,my_label2
178
179 To add a label to an existing host, run:
180
181 .. prompt:: bash #
182
183 ceph orch host label add my_hostname my_label
184
185 To remove a label, run:
186
187 .. prompt:: bash #
188
189 ceph orch host label rm my_hostname my_label
190
191
192 .. _cephadm-special-host-labels:
193
194 Special host labels
195 -------------------
196
197 The following host labels have a special meaning to cephadm. All start with ``_``.
198
199 * ``_no_schedule``: *Do not schedule or deploy daemons on this host*.
200
201 This label prevents cephadm from deploying daemons on this host. If it is added to
202 an existing host that already contains Ceph daemons, it will cause cephadm to move
203 those daemons elsewhere (except OSDs, which are not removed automatically).
204
205 * ``_no_conf_keyring``: *Do not deploy config files or keyrings on this host*.
206
207 This label is effectively the same as ``_no_schedule``, but instead of applying to
208 daemons it applies to the client keyrings and Ceph conf files that are being managed
209 by cephadm.
210
211 * ``_no_autotune_memory``: *Do not autotune memory on this host*.
212
213 This label will prevent daemon memory from being tuned even when the
214 ``osd_memory_target_autotune`` or similar option is enabled for one or more daemons
215 on that host.
216
217 * ``_admin``: *Distribute client.admin and ceph.conf to this host*.
218
219 By default, an ``_admin`` label is applied to the first host in the cluster (where
220 bootstrap was originally run), and the ``client.admin`` key is set to be distributed
221 to that host via the ``ceph orch client-keyring ...`` function. Adding this label
222 to additional hosts will normally cause cephadm to deploy config and keyring files
223 in ``/etc/ceph``. Starting with versions 16.2.10 (Pacific) and 17.2.1 (Quincy), in
224 addition to the default location ``/etc/ceph/``, cephadm also stores config and keyring
225 files in the ``/var/lib/ceph/<fsid>/config`` directory.
226
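These special labels are managed with the same commands as any other label. For
example, to stop cephadm from scheduling new daemons on an existing host (the host
name here is illustrative):

.. prompt:: bash #

    ceph orch host label add host2 _no_schedule
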
227 Maintenance Mode
228 ================
229
230 Place a host into, or take a host out of, maintenance mode (this stops all Ceph daemons on the host):
231
232 .. prompt:: bash #
233
234 ceph orch host maintenance enter <hostname> [--force] [--yes-i-really-mean-it]
235 ceph orch host maintenance exit <hostname>
236
237 The ``--force`` flag allows the user to bypass warnings (but not alerts). The ``--yes-i-really-mean-it``
238 flag bypasses all safety checks and will attempt to force the host into maintenance mode no
239 matter what.
240
241 .. warning:: Using the ``--yes-i-really-mean-it`` flag to force the host to enter maintenance
242 mode can potentially cause loss of data availability, the mon quorum to break down due
243 to too few running monitors, mgr module commands (such as ``ceph orch ...`` commands)
244 to become unresponsive, and a number of other possible issues. Please only use this
245 flag if you're absolutely certain you know what you're doing.
246
247 See also :ref:`cephadm-fqdn`.
248
249 Rescanning Host Devices
250 =======================
251
252 Some servers and external enclosures may not register device removal or insertion with the
253 kernel. In these scenarios, you'll need to perform a host rescan. A rescan is typically
254 non-disruptive, and can be performed with the following CLI command:
255
256 .. prompt:: bash #
257
258 ceph orch host rescan <hostname> [--with-summary]
259
260 The ``--with-summary`` flag provides a breakdown of the number of HBAs found and scanned, together
261 with any that failed:
262
263 .. prompt:: bash [ceph:root@rh9-ceph1/]#
264
265 ceph orch host rescan rh9-ceph1 --with-summary
266
267 ::
268
269 Ok. 2 adapters detected: 2 rescanned, 0 skipped, 0 failed (0.32s)
270
271 Creating many hosts at once
272 ===========================
273
274 Many hosts can be added at once using
275 ``ceph orch apply -i`` by submitting a multi-document YAML file:
276
277 .. code-block:: yaml
278
279 service_type: host
280 hostname: node-00
281 addr: 192.168.0.10
282 labels:
283 - example1
284 - example2
285 ---
286 service_type: host
287 hostname: node-01
288 addr: 192.168.0.11
289 labels:
290 - grafana
291 ---
292 service_type: host
293 hostname: node-02
294 addr: 192.168.0.12
295
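Assuming the spec above has been saved to a file (for example, ``hosts.yaml``), it
can be submitted with:

.. prompt:: bash #

    ceph orch apply -i hosts.yaml
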
296 This can be combined with :ref:`service specifications<orchestrator-cli-service-spec>`
297 to create a cluster spec file that deploys a whole cluster in one command. See
298 ``cephadm bootstrap --apply-spec`` for a way to do this during bootstrap. Cluster
299 SSH keys must be copied to hosts prior to adding them.
300
301 Setting the initial CRUSH location of host
302 ==========================================
303
304 Hosts can contain a ``location`` identifier which will instruct cephadm to
305 create a new CRUSH host located in the specified hierarchy.
306
307 .. code-block:: yaml
308
309 service_type: host
310 hostname: node-00
311 addr: 192.168.0.10
312 location:
313 rack: rack1
314
315 .. note::
316
317 The ``location`` attribute only affects the initial CRUSH location. Subsequent
318 changes to the ``location`` property will be ignored. Also, removing a host will not remove
319 any CRUSH buckets.
320
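Once OSDs have been deployed on the host, the resulting CRUSH hierarchy (for
example, the host bucket placed under ``rack1`` in the spec above) can be
inspected with:

.. prompt:: bash #

    ceph osd tree
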
321 See also :ref:`crush_map_default_types`.
322
323 OS Tuning Profiles
324 ==================
325
326 Cephadm can be used to manage operating-system-tuning profiles that apply sets
327 of sysctl settings to sets of hosts.
328
329 Create a YAML spec file in the following format:
330
331 .. code-block:: yaml
332
333 profile_name: 23-mon-host-profile
334 placement:
335 hosts:
336 - mon-host-01
337 - mon-host-02
338 settings:
339 fs.file-max: 1000000
340 vm.swappiness: '13'
341
342 Apply the tuning profile with the following command:
343
344 .. prompt:: bash #
345
346 ceph orch tuned-profile apply -i <tuned-profile-file-name>
347
348 This profile is written to ``/etc/sysctl.d/`` on each host that matches the
349 hosts specified in the placement block of the YAML spec, and ``sysctl --system`` is
350 run on the host.
351
352 .. note::
353
354 The exact filename that the profile is written to within ``/etc/sysctl.d/``
355 is ``<profile-name>-cephadm-tuned-profile.conf``, where ``<profile-name>`` is
356 the ``profile_name`` setting that you specify in the YAML spec. Because
357 sysctl settings are applied in lexicographical order (sorted by the filename
358 in which the setting is specified), you may want to set the ``profile_name``
359 in your spec so that it is applied before or after other conf files.
360
361 .. note::
362
363 These settings are applied only at the host level, and are not specific
364 to any particular daemon or container.
365
366 .. note::
367
368 Applying tuned profiles is idempotent when the ``--no-overwrite`` option is
369 passed. Moreover, if the ``--no-overwrite`` option is passed, existing
370 profiles with the same name are not overwritten.
371
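For example, to re-apply the spec shown above without replacing an existing
profile of the same name (the file name here is illustrative):

.. prompt:: bash #

    ceph orch tuned-profile apply -i 23-mon-host-profile.yaml --no-overwrite
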
372
373 Viewing Profiles
374 ----------------
375
376 Run the following command to view all the profiles that cephadm currently manages:
377
378 .. prompt:: bash #
379
380 ceph orch tuned-profile ls
381
382 .. note::
383
384 To make modifications and re-apply a profile, pass ``--format yaml`` to the
385 ``tuned-profile ls`` command. The ``tuned-profile ls --format yaml`` command
386 presents the profiles in a format that is easy to copy and re-apply.
387
388
389 Removing Profiles
390 -----------------
391
392 To remove a previously applied profile, run this command:
393
394 .. prompt:: bash #
395
396 ceph orch tuned-profile rm <profile-name>
397
398 When a profile is removed, cephadm cleans up the file previously written to ``/etc/sysctl.d``.
399
400
401 Modifying Profiles
402 ------------------
403
404 Profiles can be modified by re-applying a YAML spec with the same name as the
405 profile that you want to modify, but settings within existing profiles can be
406 adjusted with the following commands.
407
408 To add or modify a setting in an existing profile:
409
410 .. prompt:: bash #
411
412 ceph orch tuned-profile add-setting <profile-name> <setting-name> <value>
413
414 To remove a setting from an existing profile:
415
416 .. prompt:: bash #
417
418 ceph orch tuned-profile rm-setting <profile-name> <setting-name>
419
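For example, to adjust and then drop the ``vm.swappiness`` value in the profile
created earlier (the profile name and value are illustrative):

.. prompt:: bash #

    ceph orch tuned-profile add-setting 23-mon-host-profile vm.swappiness 10
    ceph orch tuned-profile rm-setting 23-mon-host-profile vm.swappiness
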
420 .. note::
421
422 Modifying the placement requires re-applying a profile with the same name.
423 Remember that profiles are tracked by their names, so when a profile with the
424 same name as an existing profile is applied, it overwrites the old profile
425 unless the ``--no-overwrite`` flag is passed.
426
427 SSH Configuration
428 =================
429
430 Cephadm uses SSH to connect to remote hosts. SSH uses a key to authenticate
431 with those hosts in a secure way.
432
433
434 Default behavior
435 ----------------
436
437 Cephadm stores an SSH key in the monitor that is used to
438 connect to remote hosts. When the cluster is bootstrapped, this SSH
439 key is generated automatically and no additional configuration
440 is necessary.
441
442 A *new* SSH key can be generated with:
443
444 .. prompt:: bash #
445
446 ceph cephadm generate-key
447
448 The public portion of the SSH key can be retrieved with:
449
450 .. prompt:: bash #
451
452 ceph cephadm get-pub-key
453
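For example, the public key can be written to a file and installed on a new host
(the host name here is illustrative):

.. prompt:: bash #

    ceph cephadm get-pub-key > ~/ceph.pub
    ssh-copy-id -f -i ~/ceph.pub root@host5
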
454 The currently stored SSH key can be deleted with:
455
456 .. prompt:: bash #
457
458 ceph cephadm clear-key
459
460 You can make use of an existing key by directly importing it with:
461
462 .. prompt:: bash #
463
464 ceph config-key set mgr/cephadm/ssh_identity_key -i <key>
465 ceph config-key set mgr/cephadm/ssh_identity_pub -i <pub>
466
467 You will then need to restart the mgr daemon to reload the configuration with:
468
469 .. prompt:: bash #
470
471 ceph mgr fail
472
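A minimal sketch of importing a freshly generated key pair (the key path is
illustrative); note that the new public key must also be installed in the
``authorized_keys`` file on every cluster host:

.. prompt:: bash #

    ssh-keygen -t ed25519 -N "" -f /tmp/cephadm_id
    ceph config-key set mgr/cephadm/ssh_identity_key -i /tmp/cephadm_id
    ceph config-key set mgr/cephadm/ssh_identity_pub -i /tmp/cephadm_id.pub
    ceph mgr fail
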
473 .. _cephadm-ssh-user:
474
475 Configuring a different SSH user
476 ----------------------------------
477
478 Cephadm must be able to log into all the Ceph cluster nodes as a user
479 that has enough privileges to download container images, start containers
480 and execute commands without prompting for a password. If you do not want
481 to use the "root" user (the default option in cephadm), you must provide
482 cephadm with the name of the user that is going to be used to perform all
483 cephadm operations. Use the command:
484
485 .. prompt:: bash #
486
487 ceph cephadm set-user <user>
488
489 Prior to running this, the cluster SSH key needs to be added to this user's
490 ``authorized_keys`` file and non-root users must have passwordless sudo access.
491
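A minimal sketch of preparing such a user on each host, assuming the user is named
``cephadm-user`` (run the second and third commands as root on each host):

.. prompt:: bash #

    ssh-copy-id -f -i /etc/ceph/ceph.pub cephadm-user@host2
    echo "cephadm-user ALL=(ALL) NOPASSWD: ALL" > /etc/sudoers.d/cephadm-user
    chmod 0440 /etc/sudoers.d/cephadm-user
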
492
493 Customizing the SSH configuration
494 ---------------------------------
495
496 Cephadm generates an appropriate ``ssh_config`` file that is
497 used for connecting to remote hosts. This configuration looks
498 something like this::
499
500 Host *
501 User root
502 StrictHostKeyChecking no
503 UserKnownHostsFile /dev/null
504
505 There are two ways to customize this configuration for your environment:
506
507 #. Import a customized configuration file that will be stored
508 by the monitor (an example file is shown after this list) with:
509
510 .. prompt:: bash #
511
512 ceph cephadm set-ssh-config -i <ssh_config_file>
513
514 To remove a customized SSH config and revert to the default behavior:
515
516 .. prompt:: bash #
517
518 ceph cephadm clear-ssh-config
519
520 #. You can configure a file location for the SSH configuration file with:
521
522 .. prompt:: bash #
523
524 ceph config set mgr mgr/cephadm/ssh_config_file <path>
525
526 We do *not recommend* this approach. The path name must be
527 visible to *any* mgr daemon, and cephadm runs all daemons as
528 containers. That means that the file must either be placed
529 inside a customized container image for your deployment, or
530 manually distributed to the mgr data directory
531 (``/var/lib/ceph/<cluster-fsid>/mgr.<id>`` on the host, visible at
532 ``/var/lib/ceph/mgr/ceph-<id>`` from inside the container).
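
For example, if the cluster hosts run ``sshd`` on a non-standard port (the port
number here is illustrative), a customized configuration to import with
``ceph cephadm set-ssh-config`` could look like this::

  Host *
    User root
    Port 2222
    StrictHostKeyChecking no
    UserKnownHostsFile /dev/null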
533
534 Setting up CA signed keys for the cluster
535 -----------------------------------------
536
537 Cephadm also supports using CA signed keys for SSH authentication
538 across cluster nodes. In this setup, instead of a private key and a
539 public key, we need a private key and a certificate that was
540 created by signing that private key with a CA key. For more info
541 on setting up nodes for authentication using a CA signed key, see
542 :ref:`cephadm-bootstrap-ca-signed-keys`. Once you have your private
543 key and signed cert, they can be set up for cephadm to use by running:
544
545 .. prompt:: bash #
546
547 ceph config-key set mgr/cephadm/ssh_identity_key -i <private-key-file>
548 ceph config-key set mgr/cephadm/ssh_identity_cert -i <signed-cert-file>
549
550 .. _cephadm-fqdn:
551
552 Fully qualified domain names vs bare host names
553 ===============================================
554
555 .. note::
556
557 cephadm demands that the name of the host given via ``ceph orch host add``
558 equals the output of ``hostname`` on remote hosts.
559
560 Otherwise cephadm can't be sure that names returned by
561 ``ceph * metadata`` match the hosts known to cephadm. This might result
562 in a :ref:`cephadm-stray-host` warning.
563
564 When configuring new hosts, there are two **valid** ways to set the
565 ``hostname`` of a host:
566
567 1. Using the bare host name. In this case:
568
569 - ``hostname`` returns the bare host name.
570 - ``hostname -f`` returns the FQDN.
571
572 2. Using the fully qualified domain name as the host name. In this case:
573
574 - ``hostname`` returns the FQDN.
575 - ``hostname -s`` returns the bare host name.
576
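For example, a host configured the first way would report something like this
(the host and domain names are illustrative)::

    $ hostname
    node-00
    $ hostname -f
    node-00.example.com
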
577 Note that ``man hostname`` recommends that ``hostname`` return the bare
578 host name:
579
580 The FQDN (Fully Qualified Domain Name) of the system is the
581 name that the resolver(3) returns for the host name, such as,
582 ursula.example.com. It is usually the hostname followed by the DNS
583 domain name (the part after the first dot). You can check the FQDN
584 using ``hostname --fqdn`` or the domain name using ``dnsdomainname``.
585
586 .. code-block:: none
587
588 You cannot change the FQDN with hostname or dnsdomainname.
589
590 The recommended method of setting the FQDN is to make the hostname
591 be an alias for the fully qualified name using /etc/hosts, DNS, or
592 NIS. For example, if the hostname was "ursula", one might have
593 a line in /etc/hosts which reads
594
595 127.0.1.1 ursula.example.com ursula
596
597 In other words, ``man hostname`` recommends that ``hostname`` return the bare
598 host name. This means that Ceph will return the bare host names when
599 ``ceph * metadata`` is executed, which in turn means that cephadm also
600 requires the bare host name when a host is added to the cluster:
601 ``ceph orch host add <bare-name>``.
602
603 ..
604 TODO: This chapter needs to provide way for users to configure
605 Grafana in the dashboard, as this is right now very hard to do.