1 .. _orchestrator-cli-host-management:
2
3 ===============
4 Host Management
5 ===============
6
7 Listing Hosts
8 =============
9
10 Run a command of this form to list hosts associated with the cluster:
11
12 .. prompt:: bash #
13
14 ceph orch host ls [--format yaml] [--host-pattern <name>] [--label <label>] [--host-status <status>] [--detail]
15
16 In commands of this form, the arguments "host-pattern", "label", and
17 "host-status" are optional and are used for filtering.
18
19 - "host-pattern" is a regex that matches against hostnames and returns only
20 matching hosts.
21 - "label" returns only hosts with the specified label.
22 - "host-status" returns only hosts with the specified status (currently
23 "offline" or "maintenance").
- Any combination of these filtering flags is valid. It is possible to filter
  against name, label, and status simultaneously, or to filter against any
  proper subset of name, label, and status, as shown in the examples below.
27
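
For example, the following invocations list only the hosts that carry a given
label, only the hosts whose names match a pattern, or only the offline hosts
that carry that label (the label and pattern shown are illustrative):

.. prompt:: bash #

   ceph orch host ls --label mon
   ceph orch host ls --host-pattern "host[12]"
   ceph orch host ls --label mon --host-status offline
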
The "detail" parameter provides more host-related information for cephadm-based
clusters. For example:
30
31 .. prompt:: bash #
32
33 ceph orch host ls --detail
34
35 ::
36
37 HOSTNAME ADDRESS LABELS STATUS VENDOR/MODEL CPU HDD SSD NIC
38 ceph-master 192.168.122.73 _admin QEMU (Standard PC (Q35 + ICH9, 2009)) 4C/4T 4/1.6TB - 1
39 1 hosts in cluster
40
41 .. _cephadm-adding-hosts:
42
43 Adding Hosts
44 ============
45
Hosts must have the :ref:`cephadm-host-requirements` installed.
Hosts that are missing any of these requirements will fail to be added to the cluster.
48
49 To add each new host to the cluster, perform two steps:
50
51 #. Install the cluster's public SSH key in the new host's root user's ``authorized_keys`` file:
52
53 .. prompt:: bash #
54
55 ssh-copy-id -f -i /etc/ceph/ceph.pub root@*<new-host>*
56
57 For example:
58
59 .. prompt:: bash #
60
61 ssh-copy-id -f -i /etc/ceph/ceph.pub root@host2
62 ssh-copy-id -f -i /etc/ceph/ceph.pub root@host3
63
64 #. Tell Ceph that the new node is part of the cluster:
65
66 .. prompt:: bash #
67
68 ceph orch host add *<newhost>* [*<ip>*] [*<label1> ...*]
69
70 For example:
71
72 .. prompt:: bash #
73
74 ceph orch host add host2 10.10.0.102
75 ceph orch host add host3 10.10.0.103
76
77 It is best to explicitly provide the host IP address. If an IP is
78 not provided, then the host name will be immediately resolved via
79 DNS and that IP will be used.
80
81 One or more labels can also be included to immediately label the
82 new host. For example, by default the ``_admin`` label will make
83 cephadm maintain a copy of the ``ceph.conf`` file and a
84 ``client.admin`` keyring file in ``/etc/ceph``:
85
86 .. prompt:: bash #
87
88 ceph orch host add host4 10.10.0.104 --labels _admin
89
90 .. _cephadm-removing-hosts:
91
92 Removing Hosts
93 ==============
94
95 A host can safely be removed from the cluster after all daemons are removed
96 from it.
97
98 To drain all daemons from a host, run a command of the following form:
99
100 .. prompt:: bash #
101
102 ceph orch host drain *<host>*
103
104 The ``_no_schedule`` label will be applied to the host. See
105 :ref:`cephadm-special-host-labels`.
106
107 All OSDs on the host will be scheduled to be removed. You can check the progress of the OSD removal operation with the following command:
108
109 .. prompt:: bash #
110
111 ceph orch osd rm status
112
113 See :ref:`cephadm-osd-removal` for more details about OSD removal.
114
115 Use the following command to determine whether any daemons are still on the
116 host:
117
118 .. prompt:: bash #
119
120 ceph orch ps <host>
121
122 After all daemons have been removed from the host, remove the host from the
123 cluster by running the following command:
124
125 .. prompt:: bash #
126
127 ceph orch host rm <host>
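
For example, a complete removal of a hypothetical host ``host4``, waiting for
each step to finish before running the next, might look like this:

.. prompt:: bash #

   ceph orch host drain host4
   ceph orch osd rm status
   ceph orch ps host4
   ceph orch host rm host4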
128
129 Offline host removal
130 --------------------
131
Even if a host is offline and cannot be recovered, it can still be removed from
the cluster by running a command of the following form:
134
135 .. prompt:: bash #
136
137 ceph orch host rm <host> --offline --force
138
139 .. warning:: This can potentially cause data loss. This command forcefully
140 purges OSDs from the cluster by calling ``osd purge-actual`` for each OSD.
141 Any service specs that still contain this host should be manually updated.
142
143 .. _orchestrator-host-labels:
144
145 Host labels
146 ===========
147
The orchestrator supports assigning labels to hosts. Labels are free-form and
have no particular meaning by themselves. Each host can have multiple labels.
Labels can be used to specify the placement of daemons. See
:ref:`orch-placement-by-labels`.
152
153 Labels can be added when adding a host with the ``--labels`` flag:
154
155 .. prompt:: bash #
156
157 ceph orch host add my_hostname --labels=my_label1
158 ceph orch host add my_hostname --labels=my_label1,my_label2
159
To add a label to an existing host, run:
161
162 .. prompt:: bash #
163
164 ceph orch host label add my_hostname my_label
165
166 To remove a label, run:
167
168 .. prompt:: bash #
169
170 ceph orch host label rm my_hostname my_label
171
172
173 .. _cephadm-special-host-labels:
174
175 Special host labels
176 -------------------
177
178 The following host labels have a special meaning to cephadm. All start with ``_``.
179
180 * ``_no_schedule``: *Do not schedule or deploy daemons on this host*.
181
182 This label prevents cephadm from deploying daemons on this host. If it is added to
183 an existing host that already contains Ceph daemons, it will cause cephadm to move
184 those daemons elsewhere (except OSDs, which are not removed automatically).
185
186 * ``_no_autotune_memory``: *Do not autotune memory on this host*.
187
188 This label will prevent daemon memory from being tuned even when the
189 ``osd_memory_target_autotune`` or similar option is enabled for one or more daemons
190 on that host.
191
192 * ``_admin``: *Distribute client.admin and ceph.conf to this host*.
193
194 By default, an ``_admin`` label is applied to the first host in the cluster (where
195 bootstrap was originally run), and the ``client.admin`` key is set to be distributed
  to that host via the ``ceph orch client-keyring ...`` function. Adding this label
  to additional hosts will normally cause cephadm to deploy config and keyring files
  in ``/etc/ceph``. Starting with versions 16.2.10 (Pacific) and 17.2.1 (Quincy), in
  addition to the default location ``/etc/ceph/``, cephadm also stores config and
  keyring files in the ``/var/lib/ceph/<fsid>/config`` directory.
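
  For example, to apply the ``_admin`` label to an additional host after it has
  already been added to the cluster (the host name here is illustrative):

  .. prompt:: bash #

     ceph orch host label add host2 _admin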
201
202 Maintenance Mode
203 ================
204
Place a host into and take it out of maintenance mode (this stops all Ceph daemons on the host):
206
207 .. prompt:: bash #
208
209 ceph orch host maintenance enter <hostname> [--force] [--yes-i-really-mean-it]
210 ceph orch host maintenance exit <hostname>
211
212 The ``--force`` flag allows the user to bypass warnings (but not alerts). The ``--yes-i-really-mean-it``
213 flag bypasses all safety checks and will attempt to force the host into maintenance mode no
214 matter what.
215
.. warning:: Using the ``--yes-i-really-mean-it`` flag to force the host to enter maintenance
   mode can potentially cause loss of data availability, the mon quorum to break down due
   to too few running monitors, mgr module commands (such as ``ceph orch ...`` commands)
   to become unresponsive, and a number of other possible issues. Please only use this
   flag if you're absolutely certain that you know what you're doing.
221
222 See also :ref:`cephadm-fqdn`
223
224 Rescanning Host Devices
225 =======================
226
227 Some servers and external enclosures may not register device removal or insertion with the
228 kernel. In these scenarios, you'll need to perform a host rescan. A rescan is typically
229 non-disruptive, and can be performed with the following CLI command:
230
231 .. prompt:: bash #
232
233 ceph orch host rescan <hostname> [--with-summary]
234
The ``--with-summary`` flag provides a breakdown of the number of HBAs found and scanned, together
with any that failed:
237
238 .. prompt:: bash [ceph:root@rh9-ceph1/]#
239
240 ceph orch host rescan rh9-ceph1 --with-summary
241
242 ::
243
244 Ok. 2 adapters detected: 2 rescanned, 0 skipped, 0 failed (0.32s)
245
246 Creating many hosts at once
247 ===========================
248
249 Many hosts can be added at once using
250 ``ceph orch apply -i`` by submitting a multi-document YAML file:
251
252 .. code-block:: yaml
253
254 service_type: host
255 hostname: node-00
256 addr: 192.168.0.10
257 labels:
258 - example1
259 - example2
260 ---
261 service_type: host
262 hostname: node-01
263 addr: 192.168.0.11
264 labels:
265 - grafana
266 ---
267 service_type: host
268 hostname: node-02
269 addr: 192.168.0.12
270
This can be combined with :ref:`service specifications<orchestrator-cli-service-spec>`
to create a cluster spec file that deploys a whole cluster in one command. See
``cephadm bootstrap --apply-spec`` to do this during bootstrap. The cluster
SSH key must be copied to hosts prior to adding them.
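
Assuming the multi-document YAML above has been saved to a file (the file name
``hosts.yaml`` here is illustrative), the hosts can be added with:

.. prompt:: bash #

   ceph orch apply -i hosts.yaml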
275
276 Setting the initial CRUSH location of host
277 ==========================================
278
279 Hosts can contain a ``location`` identifier which will instruct cephadm to
280 create a new CRUSH host located in the specified hierarchy.
281
282 .. code-block:: yaml
283
284 service_type: host
285 hostname: node-00
286 addr: 192.168.0.10
287 location:
288 rack: rack1
289
290 .. note::
291
   The ``location`` attribute only affects the initial CRUSH location. Subsequent
   changes to the ``location`` property will be ignored. Also, removing a host will
   not remove any CRUSH buckets.
295
296 See also :ref:`crush_map_default_types`.
297
298 OS Tuning Profiles
299 ==================
300
301 Cephadm can be used to manage operating-system-tuning profiles that apply sets
302 of sysctl settings to sets of hosts.
303
304 Create a YAML spec file in the following format:
305
306 .. code-block:: yaml
307
308 profile_name: 23-mon-host-profile
309 placement:
310 hosts:
311 - mon-host-01
312 - mon-host-02
313 settings:
314 fs.file-max: 1000000
315 vm.swappiness: '13'
316
317 Apply the tuning profile with the following command:
318
319 .. prompt:: bash #
320
321 ceph orch tuned-profile apply -i <tuned-profile-file-name>
322
The profile is written to ``/etc/sysctl.d/`` on each host that matches the
placement block of the YAML spec, and ``sysctl --system`` is then run on that
host.
326
327 .. note::
328
329 The exact filename that the profile is written to within ``/etc/sysctl.d/``
330 is ``<profile-name>-cephadm-tuned-profile.conf``, where ``<profile-name>`` is
331 the ``profile_name`` setting that you specify in the YAML spec. Because
332 sysctl settings are applied in lexicographical order (sorted by the filename
333 in which the setting is specified), you may want to set the ``profile_name``
334 in your spec so that it is applied before or after other conf files.
335
336 .. note::
337
338 These settings are applied only at the host level, and are not specific
339 to any particular daemon or container.
340
341 .. note::
342
343 Applying tuned profiles is idempotent when the ``--no-overwrite`` option is
344 passed. Moreover, if the ``--no-overwrite`` option is passed, existing
345 profiles with the same name are not overwritten.
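
For example, a hypothetical invocation that re-applies the spec shown above
without overwriting an existing profile of the same name (the file name is
illustrative):

.. prompt:: bash #

   ceph orch tuned-profile apply -i 23-mon-host-profile.yaml --no-overwrite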
346
347
348 Viewing Profiles
349 ----------------
350
351 Run the following command to view all the profiles that cephadm currently manages:
352
353 .. prompt:: bash #
354
355 ceph orch tuned-profile ls
356
357 .. note::
358
359 To make modifications and re-apply a profile, pass ``--format yaml`` to the
360 ``tuned-profile ls`` command. The ``tuned-profile ls --format yaml`` command
361 presents the profiles in a format that is easy to copy and re-apply.
362
363
364 Removing Profiles
365 -----------------
366
367 To remove a previously applied profile, run this command:
368
369 .. prompt:: bash #
370
371 ceph orch tuned-profile rm <profile-name>
372
373 When a profile is removed, cephadm cleans up the file previously written to ``/etc/sysctl.d``.
374
375
376 Modifying Profiles
377 ------------------
378
Profiles can be modified by re-applying a YAML spec with the same name as the
profile that you want to modify. Alternatively, individual settings within an
existing profile can be adjusted with the following commands.
382
383 To add or modify a setting in an existing profile:
384
385 .. prompt:: bash #
386
387 ceph orch tuned-profile add-setting <profile-name> <setting-name> <value>
388
389 To remove a setting from an existing profile:
390
391 .. prompt:: bash #
392
393 ceph orch tuned-profile rm-setting <profile-name> <setting-name>
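
For example, using the ``23-mon-host-profile`` spec shown earlier, the following
commands would raise ``fs.file-max`` and remove the ``vm.swappiness`` setting
(the new value is illustrative):

.. prompt:: bash #

   ceph orch tuned-profile add-setting 23-mon-host-profile fs.file-max 2097152
   ceph orch tuned-profile rm-setting 23-mon-host-profile vm.swappiness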
394
395 .. note::
396
397 Modifying the placement requires re-applying a profile with the same name.
398 Remember that profiles are tracked by their names, so when a profile with the
399 same name as an existing profile is applied, it overwrites the old profile
400 unless the ``--no-overwrite`` flag is passed.
401
402 SSH Configuration
403 =================
404
405 Cephadm uses SSH to connect to remote hosts. SSH uses a key to authenticate
406 with those hosts in a secure way.
407
408
409 Default behavior
410 ----------------
411
412 Cephadm stores an SSH key in the monitor that is used to
413 connect to remote hosts. When the cluster is bootstrapped, this SSH
414 key is generated automatically and no additional configuration
415 is necessary.
416
417 A *new* SSH key can be generated with:
418
419 .. prompt:: bash #
420
421 ceph cephadm generate-key
422
423 The public portion of the SSH key can be retrieved with:
424
425 .. prompt:: bash #
426
427 ceph cephadm get-pub-key
428
429 The currently stored SSH key can be deleted with:
430
431 .. prompt:: bash #
432
433 ceph cephadm clear-key
434
435 You can make use of an existing key by directly importing it with:
436
437 .. prompt:: bash #
438
439 ceph config-key set mgr/cephadm/ssh_identity_key -i <key>
440 ceph config-key set mgr/cephadm/ssh_identity_pub -i <pub>
441
442 You will then need to restart the mgr daemon to reload the configuration with:
443
444 .. prompt:: bash #
445
446 ceph mgr fail
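
For example, the commands above could be combined into a hypothetical
key-rotation workflow: generate a new key, then install its public part on
every host in the cluster (``host2`` is an illustrative host name):

.. prompt:: bash #

   ceph cephadm generate-key
   ceph cephadm get-pub-key > ~/ceph.pub
   ssh-copy-id -f -i ~/ceph.pub root@host2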
447
448 .. _cephadm-ssh-user:
449
450 Configuring a different SSH user
451 ----------------------------------
452
Cephadm must be able to log into all the Ceph cluster nodes as a user that has
sufficient privileges to download container images, start containers, and
execute commands without prompting for a password. If you do not want to use
the "root" user (the default in cephadm), you must provide cephadm with the
name of the user that will be used to perform all cephadm operations. Use the
command:
459
460 .. prompt:: bash #
461
462 ceph cephadm set-user <user>
463
Prior to running this command, the cluster SSH key needs to be added to this
user's ``authorized_keys`` file, and non-root users must have passwordless sudo
access.
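
For example, a minimal sketch of preparing a hypothetical ``cephadmin`` user on
a new host before switching cephadm over to it (the user name, host name, and
sudoers file name are all illustrative):

.. prompt:: bash #

   ssh-copy-id -f -i /etc/ceph/ceph.pub cephadmin@host2
   echo "cephadmin ALL=(ALL) NOPASSWD:ALL" | ssh root@host2 "tee /etc/sudoers.d/cephadmin"
   ceph cephadm set-user cephadmin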
466
467
468 Customizing the SSH configuration
469 ---------------------------------
470
471 Cephadm generates an appropriate ``ssh_config`` file that is
472 used for connecting to remote hosts. This configuration looks
473 something like this::
474
475 Host *
476 User root
477 StrictHostKeyChecking no
478 UserKnownHostsFile /dev/null
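
A customized configuration might, for example, add a non-default SSH port (the
port number shown is illustrative) and can then be imported as described below::

  Host *
  User root
  Port 2222
  StrictHostKeyChecking no
  UserKnownHostsFile /dev/null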
479
480 There are two ways to customize this configuration for your environment:
481
482 #. Import a customized configuration file that will be stored
483 by the monitor with:
484
485 .. prompt:: bash #
486
487 ceph cephadm set-ssh-config -i <ssh_config_file>
488
489 To remove a customized SSH config and revert back to the default behavior:
490
491 .. prompt:: bash #
492
493 ceph cephadm clear-ssh-config
494
495 #. You can configure a file location for the SSH configuration file with:
496
497 .. prompt:: bash #
498
499 ceph config set mgr mgr/cephadm/ssh_config_file <path>
500
   We do *not recommend* this approach. The path name must be
   visible to *any* mgr daemon, and cephadm runs all daemons as
   containers. That means that the file must either be placed
   inside a customized container image for your deployment, or
   manually distributed to the mgr data directory
   (``/var/lib/ceph/<cluster-fsid>/mgr.<id>`` on the host, visible at
   ``/var/lib/ceph/mgr/ceph-<id>`` from inside the container).
508
509 .. _cephadm-fqdn:
510
511 Fully qualified domain names vs bare host names
512 ===============================================
513
514 .. note::
515
516 cephadm demands that the name of the host given via ``ceph orch host add``
517 equals the output of ``hostname`` on remote hosts.
518
519 Otherwise cephadm can't be sure that names returned by
520 ``ceph * metadata`` match the hosts known to cephadm. This might result
521 in a :ref:`cephadm-stray-host` warning.
522
523 When configuring new hosts, there are two **valid** ways to set the
524 ``hostname`` of a host:
525
526 1. Using the bare host name. In this case:
527
528 - ``hostname`` returns the bare host name.
529 - ``hostname -f`` returns the FQDN.
530
531 2. Using the fully qualified domain name as the host name. In this case:
532
- ``hostname`` returns the FQDN.
- ``hostname -s`` returns the bare host name.
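
For example, with option 1 (a bare host name) on a host named ``node-00`` in a
hypothetical ``example.com`` domain, the two commands would behave as follows::

  # hostname
  node-00
  # hostname -f
  node-00.example.com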
535
Note that ``man hostname`` recommends that ``hostname`` return the bare
host name:
538
539 The FQDN (Fully Qualified Domain Name) of the system is the
540 name that the resolver(3) returns for the host name, such as,
541 ursula.example.com. It is usually the hostname followed by the DNS
542 domain name (the part after the first dot). You can check the FQDN
543 using ``hostname --fqdn`` or the domain name using ``dnsdomainname``.
544
545 .. code-block:: none
546
547 You cannot change the FQDN with hostname or dnsdomainname.
548
549 The recommended method of setting the FQDN is to make the hostname
550 be an alias for the fully qualified name using /etc/hosts, DNS, or
551 NIS. For example, if the hostname was "ursula", one might have
552 a line in /etc/hosts which reads
553
554 127.0.1.1 ursula.example.com ursula
555
In other words, ``man hostname`` recommends that ``hostname`` return the bare
host name. This means that Ceph will return the bare host names when executing
``ceph * metadata``, which in turn means that cephadm also requires the bare
host name when adding a host to the cluster: ``ceph orch host add <bare-name>``.
561
562 ..
563 TODO: This chapter needs to provide way for users to configure
564 Grafana in the dashboard, as this is right now very hard to do.