===================
 Manual Deployment
===================

All Ceph clusters require at least one monitor, and at least as many OSDs as
copies of an object stored on the cluster. Bootstrapping the initial monitor(s)
is the first step in deploying a Ceph Storage Cluster. Monitor deployment also
sets important criteria for the entire cluster, such as the number of replicas
for pools, the number of placement groups per OSD, the heartbeat intervals,
whether authentication is required, etc. Most of these values are set by
default, so it's useful to know about them when setting up your cluster for
production.

Following the same configuration as `Installation (Quick)`_, we will set up a
cluster with ``node1`` as the monitor node, and ``node2`` and ``node3`` for
OSD nodes.



.. ditaa::

   /------------------\         /----------------\
   |    Admin Node    |         |     node1      |
   |                  +-------->+                |
   |                  |         | cCCC           |
   \---------+--------/         \----------------/
             |
             |                  /----------------\
             |                  |     node2      |
             +----------------->+                |
             |                  | cCCC           |
             |                  \----------------/
             |
             |                  /----------------\
             |                  |     node3      |
             +----------------->|                |
                                | cCCC           |
                                \----------------/


Monitor Bootstrapping
=====================

Bootstrapping a monitor (a Ceph Storage Cluster, in theory) requires
a number of things:

- **Unique Identifier:** The ``fsid`` is a unique identifier for the cluster,
  and stands for File System ID from the days when the Ceph Storage Cluster was
  principally for the Ceph File System. Ceph now supports native interfaces,
  block devices, and object storage gateway interfaces too, so ``fsid`` is a
  bit of a misnomer.

- **Cluster Name:** Ceph clusters have a cluster name, which is a simple string
  without spaces. The default cluster name is ``ceph``, but you may specify
  a different cluster name. Overriding the default cluster name is
  especially useful when you are working with multiple clusters and you need to
  clearly understand which cluster you are working with.

  For example, when you run multiple clusters in a :ref:`multisite configuration <multisite>`,
  the cluster name (e.g., ``us-west``, ``us-east``) identifies the cluster for
  the current CLI session. **Note:** To identify the cluster name on the
  command line interface, specify the Ceph configuration file with the
  cluster name (e.g., ``ceph.conf``, ``us-west.conf``, ``us-east.conf``, etc.).
  Also see CLI usage (``ceph --cluster {cluster-name}``); a short example
  follows this list.

- **Monitor Name:** Each monitor instance within a cluster has a unique name.
  In common practice, the Ceph Monitor name is the host name (we recommend one
  Ceph Monitor per host, and no commingling of Ceph OSD Daemons with
  Ceph Monitors). You may retrieve the short hostname with ``hostname -s``.

- **Monitor Map:** Bootstrapping the initial monitor(s) requires you to
  generate a monitor map. The monitor map requires the ``fsid``, the cluster
  name (or uses the default), and at least one host name and its IP address.

- **Monitor Keyring**: Monitors communicate with each other via a
  secret key. You must generate a keyring with a monitor secret and provide
  it when bootstrapping the initial monitor(s).

- **Administrator Keyring**: To use the ``ceph`` CLI tools, you must have
  a ``client.admin`` user. So you must generate the admin user and keyring,
  and you must also add the ``client.admin`` user to the monitor keyring.

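
A quick way to confirm which cluster a CLI session targets is to pass the
cluster name explicitly. This is a minimal sketch; it assumes a second cluster
named ``us-west`` with its own ``/etc/ceph/us-west.conf``::

    ceph --cluster us-west -s
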
The foregoing requirements do not imply the creation of a Ceph configuration
file. However, as a best practice, we recommend creating a Ceph configuration
file and populating it with the ``fsid``, the ``mon initial members`` and the
``mon host`` settings.

You can get and set all of the monitor settings at runtime as well. However,
a Ceph configuration file may contain only those settings that override the
default values. When you add settings to a Ceph configuration file, these
settings override the default settings. Maintaining those settings in a
Ceph configuration file makes it easier to maintain your cluster.

The procedure is as follows:


#. Log in to the initial monitor node(s)::

       ssh {hostname}

   For example::

       ssh node1


#. Ensure you have a directory for the Ceph configuration file. By default,
   Ceph uses ``/etc/ceph``. When you install ``ceph``, the installer will
   create the ``/etc/ceph`` directory automatically. ::

       ls /etc/ceph

   **Note:** Deployment tools may remove this directory when purging a
   cluster (e.g., ``ceph-deploy purgedata {node-name}``, ``ceph-deploy purge
   {node-name}``).

#. Create a Ceph configuration file. By default, Ceph uses
   ``ceph.conf``, where ``ceph`` reflects the cluster name. ::

       sudo vim /etc/ceph/ceph.conf


#. Generate a unique ID (i.e., ``fsid``) for your cluster. ::

       uuidgen


#. Add the unique ID to your Ceph configuration file. ::

       fsid = {UUID}

   For example::

       fsid = a7f64266-0894-4f1e-a635-d0aeaca0e993


#. Add the initial monitor(s) to your Ceph configuration file. ::

       mon initial members = {hostname}[,{hostname}]

   For example::

       mon initial members = node1


#. Add the IP address(es) of the initial monitor(s) to your Ceph configuration
   file and save the file. ::

       mon host = {ip-address}[,{ip-address}]

   For example::

       mon host = 192.168.0.1

   **Note:** You may use IPv6 addresses instead of IPv4 addresses, but
   you must set ``ms bind ipv6`` to ``true``. See `Network Configuration
   Reference`_ for details about network configuration.

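
   For instance, a minimal sketch of the relevant configuration lines for the
   IPv6 case (the address below is illustrative only)::

       mon host = 2001:db8::10
       ms bind ipv6 = true
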
#. Create a keyring for your cluster and generate a monitor secret key. ::

       ceph-authtool --create-keyring /tmp/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'


#. Generate an administrator keyring, generate a ``client.admin`` user and add
   the user to the keyring. ::

       sudo ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring --gen-key -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *' --cap mgr 'allow *'

#. Generate a bootstrap-osd keyring, generate a ``client.bootstrap-osd`` user and add
   the user to the keyring. ::

       sudo ceph-authtool --create-keyring /var/lib/ceph/bootstrap-osd/ceph.keyring --gen-key -n client.bootstrap-osd --cap mon 'profile bootstrap-osd' --cap mgr 'allow r'

#. Add the generated keys to the ``ceph.mon.keyring``. ::

       sudo ceph-authtool /tmp/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
       sudo ceph-authtool /tmp/ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-osd/ceph.keyring

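
   To confirm that both keys were imported, you can list the keyring's
   contents (``-l`` prints the keys and capabilities it holds)::

       sudo ceph-authtool -l /tmp/ceph.mon.keyring
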
#. Change the owner for ``ceph.mon.keyring``. ::

       sudo chown ceph:ceph /tmp/ceph.mon.keyring

#. Generate a monitor map using the hostname(s), host IP address(es) and the FSID.
   Save it as ``/tmp/monmap``::

       monmaptool --create --add {hostname} {ip-address} --fsid {uuid} /tmp/monmap

   For example::

       monmaptool --create --add node1 192.168.0.1 --fsid a7f64266-0894-4f1e-a635-d0aeaca0e993 /tmp/monmap

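
   If you want to double-check the entries before proceeding, the map can be
   printed back with ``monmaptool``::

       monmaptool --print /tmp/monmap
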
#. Create a default data directory (or directories) on the monitor host(s). ::

       sudo mkdir /var/lib/ceph/mon/{cluster-name}-{hostname}

   For example::

       sudo -u ceph mkdir /var/lib/ceph/mon/ceph-node1

   See `Monitor Config Reference - Data`_ for details.

#. Populate the monitor daemon(s) with the monitor map and keyring. ::

       sudo -u ceph ceph-mon [--cluster {cluster-name}] --mkfs -i {hostname} --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring

   For example::

       sudo -u ceph ceph-mon --mkfs -i node1 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring


#. Consider settings for a Ceph configuration file. Common settings include
   the following::

       [global]
       fsid = {cluster-id}
       mon initial members = {hostname}[, {hostname}]
       mon host = {ip-address}[, {ip-address}]
       public network = {network}[, {network}]
       cluster network = {network}[, {network}]
       auth cluster required = cephx
       auth service required = cephx
       auth client required = cephx
       osd journal size = {n}
       osd pool default size = {n}      # Write an object n times.
       osd pool default min size = {n}  # Allow writing n copies in a degraded state.
       osd pool default pg num = {n}
       osd pool default pgp num = {n}
       osd crush chooseleaf type = {n}

   In the foregoing example, the ``[global]`` section of the configuration might
   look like this::

       [global]
       fsid = a7f64266-0894-4f1e-a635-d0aeaca0e993
       mon initial members = node1
       mon host = 192.168.0.1
       public network = 192.168.0.0/24
       auth cluster required = cephx
       auth service required = cephx
       auth client required = cephx
       osd journal size = 1024
       osd pool default size = 3
       osd pool default min size = 2
       osd pool default pg num = 333
       osd pool default pgp num = 333
       osd crush chooseleaf type = 1


#. Start the monitor(s).

   For most distributions, services are started via systemd now::

       sudo systemctl start ceph-mon@node1

   For older Debian/CentOS/RHEL, use sysvinit::

       sudo /etc/init.d/ceph start mon.node1

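
   If you want the monitor to come back automatically after a reboot, you can
   also enable the unit (systemd systems only; the unit name follows the
   ``ceph-mon@`` pattern used above)::

       sudo systemctl enable ceph-mon@node1
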
#. Verify that the monitor is running. ::

       ceph -s

   You should see output showing that the monitor you started is up and
   running. Because there are no OSDs yet, any placement groups that get
   created will be reported as stuck inactive. The output should look
   something like this::

       cluster:
         id:     a7f64266-0894-4f1e-a635-d0aeaca0e993
         health: HEALTH_OK

       services:
         mon: 1 daemons, quorum node1
         mgr: node1(active)
         osd: 0 osds: 0 up, 0 in

       data:
         pools:   0 pools, 0 pgs
         objects: 0 objects, 0 bytes
         usage:   0 kB used, 0 kB / 0 kB avail
         pgs:

   **Note:** Once you add OSDs and start them, the placement group health errors
   should disappear. See `Adding OSDs`_ for details.

Manager daemon configuration
============================

On each node where you run a ceph-mon daemon, you should also set up a ceph-mgr daemon.

See :ref:`mgr-administrator-guide`

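
As a minimal sketch of what the linked guide covers (the name ``node1``
follows this example's monitor host; adjust it for your own nodes), you create
a key for the manager, give it a data directory holding that key, and start
the daemon::

    ceph auth get-or-create mgr.node1 mon 'allow profile mgr' osd 'allow *' mds 'allow *'
    sudo -u ceph mkdir /var/lib/ceph/mgr/ceph-node1
    # save the key printed by the first command as
    # /var/lib/ceph/mgr/ceph-node1/keyring, then:
    sudo systemctl start ceph-mgr@node1
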
Adding OSDs
===========

Once you have your initial monitor(s) running, you should add OSDs. Your cluster
cannot reach an ``active + clean`` state until you have enough OSDs to handle the
number of copies of an object (e.g., ``osd pool default size = 2`` requires at
least two OSDs). After bootstrapping your monitor, your cluster has a default
CRUSH map; however, the CRUSH map doesn't have any Ceph OSD Daemons mapped to
a Ceph Node.

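You can see this for yourself before adding any OSDs; at this point the CRUSH
tree should show only the default ``root`` bucket with no OSDs under it::

    ceph osd tree
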
Short Form
----------

Ceph provides the ``ceph-volume`` utility, which can prepare a logical volume, disk, or partition
for use with Ceph. The ``ceph-volume`` utility creates the OSD ID by
incrementing the index. Additionally, ``ceph-volume`` will add the new OSD to the
CRUSH map under the host for you. Execute ``ceph-volume -h`` for CLI details.
The ``ceph-volume`` utility automates the steps of the `Long Form`_ below. To
create the first two OSDs with the short form procedure, execute the following
on ``node2`` and ``node3``:

bluestore
^^^^^^^^^
#. Create the OSD. ::

       ssh {node-name}
       sudo ceph-volume lvm create --data {data-path}

   For example::

       ssh node1
       sudo ceph-volume lvm create --data /dev/hdd1

Alternatively, the creation process can be split into two phases (prepare and
activate):

#. Prepare the OSD. ::

       ssh {node-name}
       sudo ceph-volume lvm prepare --data {data-path}

   For example::

       ssh node1
       sudo ceph-volume lvm prepare --data /dev/hdd1

   Once prepared, the ``ID`` and ``FSID`` of the prepared OSD are required for
   activation. These can be obtained by listing OSDs in the current server::

       sudo ceph-volume lvm list

#. Activate the OSD::

       sudo ceph-volume lvm activate {ID} {FSID}

   For example::

       sudo ceph-volume lvm activate 0 a7f64266-0894-4f1e-a635-d0aeaca0e993


filestore
^^^^^^^^^
#. Create the OSD. ::

       ssh {node-name}
       sudo ceph-volume lvm create --filestore --data {data-path} --journal {journal-path}

   For example::

       ssh node1
       sudo ceph-volume lvm create --filestore --data /dev/hdd1 --journal /dev/hdd2

Alternatively, the creation process can be split into two phases (prepare and
activate):

#. Prepare the OSD. ::

       ssh {node-name}
       sudo ceph-volume lvm prepare --filestore --data {data-path} --journal {journal-path}

   For example::

       ssh node1
       sudo ceph-volume lvm prepare --filestore --data /dev/hdd1 --journal /dev/hdd2

   Once prepared, the ``ID`` and ``FSID`` of the prepared OSD are required for
   activation. These can be obtained by listing OSDs in the current server::

       sudo ceph-volume lvm list

#. Activate the OSD::

       sudo ceph-volume lvm activate --filestore {ID} {FSID}

   For example::

       sudo ceph-volume lvm activate --filestore 0 a7f64266-0894-4f1e-a635-d0aeaca0e993


Long Form
---------

Without the benefit of any helper utilities, create an OSD and add it to the
cluster and CRUSH map with the following procedure. To create the first two
OSDs with the long form procedure, execute the following steps for each OSD.

.. note:: This procedure does not describe deployment on top of dm-crypt
          making use of the dm-crypt 'lockbox'.

#. Connect to the OSD host and become root. ::

       ssh {node-name}
       sudo bash

#. Generate a UUID for the OSD. ::

       UUID=$(uuidgen)

#. Generate a cephx key for the OSD. ::

       OSD_SECRET=$(ceph-authtool --gen-print-key)

#. Create the OSD. Note that an OSD ID can be provided as an
   additional argument to ``ceph osd new`` if you need to reuse a
   previously-destroyed OSD id. We assume that the
   ``client.bootstrap-osd`` key is present on the machine. You may
   alternatively execute this command as ``client.admin`` on a
   different host where that key is present::

       ID=$(echo "{\"cephx_secret\": \"$OSD_SECRET\"}" | \
          ceph osd new $UUID -i - \
          -n client.bootstrap-osd -k /var/lib/ceph/bootstrap-osd/ceph.keyring)

   It is also possible to include a ``crush_device_class`` property in the JSON
   to set an initial class other than the default (``ssd`` or ``hdd`` based on
   the auto-detected device type).

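
   For example, a sketch of the same command with an explicit class (``nvme``
   here is just an illustrative value)::

       ID=$(echo "{\"cephx_secret\": \"$OSD_SECRET\", \"crush_device_class\": \"nvme\"}" | \
          ceph osd new $UUID -i - \
          -n client.bootstrap-osd -k /var/lib/ceph/bootstrap-osd/ceph.keyring)
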
#. Create the default directory on your new OSD. ::

       mkdir /var/lib/ceph/osd/ceph-$ID

#. If the OSD is for a drive other than the OS drive, prepare it
   for use with Ceph, and mount it to the directory you just created. ::

       mkfs.xfs /dev/{DEV}
       mount /dev/{DEV} /var/lib/ceph/osd/ceph-$ID

#. Write the secret to the OSD keyring file. ::

       ceph-authtool --create-keyring /var/lib/ceph/osd/ceph-$ID/keyring \
            --name osd.$ID --add-key $OSD_SECRET

#. Initialize the OSD data directory. ::

       ceph-osd -i $ID --mkfs --osd-uuid $UUID

#. Fix ownership. ::

       chown -R ceph:ceph /var/lib/ceph/osd/ceph-$ID

#. After you add an OSD to Ceph, the OSD is in your configuration. However,
   it is not yet running. You must start your new OSD before it can begin
   receiving data.

   For modern systemd distributions::

       systemctl enable ceph-osd@$ID
       systemctl start ceph-osd@$ID

   For example::

       systemctl enable ceph-osd@12
       systemctl start ceph-osd@12


Adding MDS
==========

In the instructions below, ``{id}`` is an arbitrary name, such as the hostname of the machine.

#. Create the mds data directory. ::

       mkdir -p /var/lib/ceph/mds/{cluster-name}-{id}

#. Create a keyring. ::

       ceph-authtool --create-keyring /var/lib/ceph/mds/{cluster-name}-{id}/keyring --gen-key -n mds.{id}

#. Import the keyring and set caps. ::

       ceph auth add mds.{id} osd "allow rwx" mds "allow" mon "allow profile mds" -i /var/lib/ceph/mds/{cluster}-{id}/keyring

#. Add to ceph.conf. ::

       [mds.{id}]
       host = {id}

#. Start the daemon the manual way. ::

       ceph-mds --cluster {cluster-name} -i {id} -m {mon-hostname}:{mon-port} [-f]

#. Start the daemon the right way (using ceph.conf entry). ::

       service ceph start

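
   On a systemd-based system, the equivalent (assuming the ``ceph-mds@``
   unit shipped with recent Ceph packages) would be::

       sudo systemctl start ceph-mds@{id}
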
#. If starting the daemon fails with this error::

       mds.-1.0 ERROR: failed to authenticate: (22) Invalid argument

   then make sure you do not have a keyring set in the ``[global]`` section of
   ceph.conf; move it to the ``[client]`` section, or add a keyring setting
   specific to this mds daemon. Also verify that the key in the mds data
   directory matches the output of ``ceph auth get mds.{id}``.

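
   For example, to compare the two keys side by side (the path follows the
   data directory created earlier in this section)::

       cat /var/lib/ceph/mds/{cluster-name}-{id}/keyring
       ceph auth get mds.{id}
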
#. Now you are ready to `create a Ceph file system`_.


Summary
=======

Once you have your monitor and two OSDs up and running, you can watch the
placement groups peer by executing the following::

    ceph -w

To view the tree, execute the following::

    ceph osd tree

You should see output that looks something like this::

    # id    weight  type name      up/down reweight
    -1      2       root default
    -2      2               host node1
    0       1                       osd.0   up      1
    -3      1               host node2
    1       1                       osd.1   up      1

To add (or remove) additional monitors, see `Add/Remove Monitors`_.
To add (or remove) additional Ceph OSD Daemons, see `Add/Remove OSDs`_.


.. _Installation (Quick): ../../start
.. _Add/Remove Monitors: ../../rados/operations/add-or-rm-mons
.. _Add/Remove OSDs: ../../rados/operations/add-or-rm-osds
.. _Network Configuration Reference: ../../rados/configuration/network-config-ref
.. _Monitor Config Reference - Data: ../../rados/configuration/mon-config-ref#data
.. _create a Ceph file system: ../../cephfs/createfs