===================
 Manual Deployment
===================

All Ceph clusters require at least one monitor, and at least as many OSDs as
copies of an object stored on the cluster. Bootstrapping the initial monitor(s)
is the first step in deploying a Ceph Storage Cluster. Monitor deployment also
sets important criteria for the entire cluster, such as the number of replicas
for pools, the number of placement groups per OSD, the heartbeat intervals,
whether authentication is required, etc. Most of these values are set by
default, so it's useful to know about them when setting up your cluster for
production.

We will set up a cluster with ``mon-node1`` as the monitor node, and
``osd-node1`` and ``osd-node2`` as the OSD nodes.


.. ditaa::

           /------------------\         /----------------\
           |    Admin Node    |         |    mon-node1   |
           |                  +-------->+                |
           |                  |         | cCCC           |
           \---------+--------/         \----------------/
                     |
                     |                  /----------------\
                     |                  |    osd-node1   |
                     +----------------->+                |
                     |                  | cCCC           |
                     |                  \----------------/
                     |
                     |                  /----------------\
                     |                  |    osd-node2   |
                     +----------------->|                |
                                        | cCCC           |
                                        \----------------/


Monitor Bootstrapping
=====================

Bootstrapping a monitor (and, in effect, a Ceph Storage Cluster) requires
a number of things:

- **Unique Identifier:** The ``fsid`` is a unique identifier for the cluster.
  It stands for File System ID, from the days when the Ceph Storage Cluster
  was principally for the Ceph File System. Ceph now supports native
  interfaces, block devices, and object storage gateway interfaces too, so
  ``fsid`` is a bit of a misnomer.

- **Cluster Name:** Ceph clusters have a cluster name, which is a simple
  string without spaces. The default cluster name is ``ceph``, but you may
  specify a different cluster name. Overriding the default cluster name is
  especially useful when you are working with multiple clusters and you need
  to clearly understand which cluster you are working with.

  For example, when you run multiple clusters in a :ref:`multisite configuration <multisite>`,
  the cluster name (e.g., ``us-west``, ``us-east``) identifies the cluster for
  the current CLI session. **Note:** To identify the cluster name on the
  command line interface, specify the Ceph configuration file with the
  cluster name (e.g., ``ceph.conf``, ``us-west.conf``, ``us-east.conf``, etc.).
  Also see CLI usage (``ceph --cluster {cluster-name}``); a usage sketch
  follows this list.

- **Monitor Name:** Each monitor instance within a cluster has a unique name.
  In common practice, the Ceph Monitor name is the host name (we recommend
  one Ceph Monitor per host, and no commingling of Ceph OSD Daemons with
  Ceph Monitors). You may retrieve the short hostname with ``hostname -s``.

- **Monitor Map:** Bootstrapping the initial monitor(s) requires you to
  generate a monitor map. The monitor map requires the ``fsid``, the cluster
  name (or uses the default), and at least one host name and its IP address.

- **Monitor Keyring**: Monitors communicate with each other via a
  secret key. You must generate a keyring with a monitor secret and provide
  it when bootstrapping the initial monitor(s).

- **Administrator Keyring**: To use the ``ceph`` CLI tools, you must have
  a ``client.admin`` user, so you must generate the admin user and keyring,
  and you must also add the ``client.admin`` user to the monitor keyring.

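For instance, a hypothetical session against a cluster named ``us-west``
(the cluster name and its ``/etc/ceph/us-west.conf`` file are assumptions
for illustration) might look like this::

   # Commands read /etc/ceph/us-west.conf because of --cluster.
   sudo ceph --cluster us-west -s
   sudo ceph --cluster us-west health
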
The foregoing requirements do not imply the creation of a Ceph configuration
file. However, as a best practice, we recommend creating a Ceph configuration
file and populating it with the ``fsid``, the ``mon initial members`` and the
``mon host`` settings.

You can get and set all of the monitor settings at runtime as well; a short
sketch follows below. However, a Ceph configuration file needs to contain
only those settings that override the default values. When you add settings
to a Ceph configuration file, these settings override the default settings.
Maintaining those settings in a Ceph configuration file makes it easier to
maintain your cluster.

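As a minimal sketch of runtime access (assuming a monitor named
``mon-node1`` is already running on the local node), settings can be read
and changed through the monitor's admin socket::

   # Show the value the running monitor is actually using.
   sudo ceph daemon mon.mon-node1 config get mon_allow_pool_delete

   # Override it at runtime; the change does not persist across restarts
   # unless it is also written to the configuration file.
   sudo ceph daemon mon.mon-node1 config set mon_allow_pool_delete true
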
The procedure is as follows:


#. Log in to the initial monitor node(s)::

      ssh {hostname}

   For example::

      ssh mon-node1


#. Ensure you have a directory for the Ceph configuration file. By default,
   Ceph uses ``/etc/ceph``. When you install ``ceph``, the installer will
   create the ``/etc/ceph`` directory automatically. ::

      ls /etc/ceph


#. Create a Ceph configuration file. By default, Ceph uses
   ``ceph.conf``, where ``ceph`` reflects the cluster name. Add a line
   containing ``[global]`` to the configuration file. ::

      sudo vim /etc/ceph/ceph.conf


#. Generate a unique ID (i.e., ``fsid``) for your cluster. ::

      uuidgen


#. Add the unique ID to your Ceph configuration file. ::

      fsid = {UUID}

   For example::

      fsid = a7f64266-0894-4f1e-a635-d0aeaca0e993


#. Add the initial monitor(s) to your Ceph configuration file. ::

      mon initial members = {hostname}[,{hostname}]

   For example::

      mon initial members = mon-node1


#. Add the IP address(es) of the initial monitor(s) to your Ceph configuration
   file and save the file. ::

      mon host = {ip-address}[,{ip-address}]

   For example::

      mon host = 192.168.0.1

   **Note:** You may use IPv6 addresses instead of IPv4 addresses, but
   you must set ``ms bind ipv6`` to ``true``. See `Network Configuration
   Reference`_ for details about network configuration.
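
   As a hedged sketch, an IPv6 setup (the address is an assumption for
   illustration) might add the following to ``ceph.conf``::

      ms bind ipv6 = true
      mon host = [2001:db8::10]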

#. Create a keyring for your cluster and generate a monitor secret key. ::

      sudo ceph-authtool --create-keyring /tmp/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'


#. Generate an administrator keyring, generate a ``client.admin`` user and add
   the user to the keyring. ::

      sudo ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring --gen-key -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *' --cap mgr 'allow *'

#. Generate a bootstrap-osd keyring, generate a ``client.bootstrap-osd`` user and add
   the user to the keyring. ::

      sudo ceph-authtool --create-keyring /var/lib/ceph/bootstrap-osd/ceph.keyring --gen-key -n client.bootstrap-osd --cap mon 'profile bootstrap-osd' --cap mgr 'allow r'

#. Add the generated keys to the ``ceph.mon.keyring``. ::

      sudo ceph-authtool /tmp/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
      sudo ceph-authtool /tmp/ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
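
   If you want to confirm that the import worked, the keyring contents can
   be listed; you should see the ``mon.``, ``client.admin`` and
   ``client.bootstrap-osd`` entries::

      sudo ceph-authtool -l /tmp/ceph.mon.keyring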

#. Change the owner for ``ceph.mon.keyring``. ::

      sudo chown ceph:ceph /tmp/ceph.mon.keyring

#. Generate a monitor map using the hostname(s), host IP address(es) and the FSID.
   Save it as ``/tmp/monmap``::

      monmaptool --create --add {hostname} {ip-address} --fsid {uuid} /tmp/monmap

   For example::

      monmaptool --create --add mon-node1 192.168.0.1 --fsid a7f64266-0894-4f1e-a635-d0aeaca0e993 /tmp/monmap
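
   The generated map can be inspected with ``monmaptool`` as well, a quick
   check that the ``fsid`` and monitor address landed in the map::

      monmaptool --print /tmp/monmap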


#. Create a default data directory (or directories) on the monitor host(s). ::

      sudo mkdir /var/lib/ceph/mon/{cluster-name}-{hostname}

   For example::

      sudo -u ceph mkdir /var/lib/ceph/mon/ceph-mon-node1

   See `Monitor Config Reference - Data`_ for details.

#. Populate the monitor daemon(s) with the monitor map and keyring. ::

      sudo -u ceph ceph-mon [--cluster {cluster-name}] --mkfs -i {hostname} --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring

   For example::

      sudo -u ceph ceph-mon --mkfs -i mon-node1 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
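
   After ``--mkfs`` completes, the data directory should contain the monitor
   store; listing it is a reasonable sanity check (the exact contents vary
   by release)::

      ls /var/lib/ceph/mon/ceph-mon-node1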


#. Consider settings for a Ceph configuration file. Common settings include
   the following::

      [global]
      fsid = {cluster-id}
      mon initial members = {hostname}[, {hostname}]
      mon host = {ip-address}[, {ip-address}]
      public network = {network}[, {network}]
      cluster network = {network}[, {network}]
      auth cluster required = cephx
      auth service required = cephx
      auth client required = cephx
      osd journal size = {n}
      osd pool default size = {n}  # Write an object n times.
      osd pool default min size = {n} # Allow writing n copies in a degraded state.
      osd pool default pg num = {n}
      osd pool default pgp num = {n}
      osd crush chooseleaf type = {n}

   In the foregoing example, the ``[global]`` section of the configuration might
   look like this::

      [global]
      fsid = a7f64266-0894-4f1e-a635-d0aeaca0e993
      mon initial members = mon-node1
      mon host = 192.168.0.1
      public network = 192.168.0.0/24
      auth cluster required = cephx
      auth service required = cephx
      auth client required = cephx
      osd journal size = 1024
      osd pool default size = 3
      osd pool default min size = 2
      osd pool default pg num = 333
      osd pool default pgp num = 333
      osd crush chooseleaf type = 1


#. Start the monitor(s).

   Start the service with systemd::

      sudo systemctl start ceph-mon@mon-node1
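
   You will likely also want the monitor to come back after a reboot; with
   systemd that is typically::

      sudo systemctl enable ceph-mon@mon-node1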

#. Be sure to open the firewall ports that ``ceph-mon`` uses.

   Open the ports with firewalld::

      sudo firewall-cmd --zone=public --add-service=ceph-mon
      sudo firewall-cmd --zone=public --add-service=ceph-mon --permanent
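
   If your distribution's firewalld does not ship a ``ceph-mon`` service
   definition, opening the monitor ports directly should be equivalent
   (monitors listen on 3300 for the msgr2 protocol and 6789 for the legacy
   protocol)::

      sudo firewall-cmd --zone=public --add-port=3300/tcp --add-port=6789/tcp
      sudo firewall-cmd --zone=public --add-port=3300/tcp --add-port=6789/tcp --permanent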

#. Verify that the monitor is running. ::

      sudo ceph -s

   You should see output indicating that the monitor you started is up and
   running. Because there are no OSDs or pools yet, there are no placement
   groups to report, and the status should look something like this::

      cluster:
        id:     a7f64266-0894-4f1e-a635-d0aeaca0e993
        health: HEALTH_OK

      services:
        mon: 1 daemons, quorum mon-node1
        mgr: mon-node1(active)
        osd: 0 osds: 0 up, 0 in

      data:
        pools:   0 pools, 0 pgs
        objects: 0 objects, 0 bytes
        usage:   0 kB used, 0 kB / 0 kB avail
        pgs:


   **Note:** If pools are created before enough OSDs are available, the
   placement groups will be reported as stuck inactive; once you add OSDs
   and start them, those health warnings should disappear. See
   `Adding OSDs`_ for details.

Manager daemon configuration
============================

On each node where you run a ceph-mon daemon, you should also set up a
ceph-mgr daemon.

See :ref:`mgr-administrator-guide`. A minimal sketch of the manual steps
follows.
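
As a hedged sketch only (assuming the monitor node ``mon-node1`` doubles as
the manager, and deferring to the guide above for specifics), manual setup
looks roughly like this::

   # Create a data directory and an auth key for the mgr daemon.
   sudo mkdir /var/lib/ceph/mgr/ceph-mon-node1
   sudo ceph auth get-or-create mgr.mon-node1 mon 'allow profile mgr' osd 'allow *' mds 'allow *' | sudo tee /var/lib/ceph/mgr/ceph-mon-node1/keyring
   sudo chown -R ceph:ceph /var/lib/ceph/mgr/ceph-mon-node1

   # Start the daemon.
   sudo systemctl start ceph-mgr@mon-node1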

Adding OSDs
===========

Once you have your initial monitor(s) running, you should add OSDs. Your cluster
cannot reach an ``active + clean`` state until you have enough OSDs to handle the
number of copies of an object (e.g., ``osd pool default size = 2`` requires at
least two OSDs). After bootstrapping your monitor, your cluster has a default
CRUSH map; however, the CRUSH map doesn't have any Ceph OSD Daemons mapped to
a Ceph Node.


Short Form
----------

Ceph provides the ``ceph-volume`` utility, which can prepare a logical volume,
disk, or partition for use with Ceph. The ``ceph-volume`` utility creates the
OSD ID by incrementing the index. Additionally, ``ceph-volume`` will add the
new OSD to the CRUSH map under the host for you. Execute ``ceph-volume -h``
for CLI details. The ``ceph-volume`` utility automates the steps of the
`Long Form`_ below. To create the first two OSDs with the short form
procedure, execute the following for each OSD:

bluestore
^^^^^^^^^
#. Create the OSD. ::

      copy /var/lib/ceph/bootstrap-osd/ceph.keyring from monitor node (mon-node1) to /var/lib/ceph/bootstrap-osd/ceph.keyring on osd node (osd-node1)
      ssh {osd node}
      sudo ceph-volume lvm create --data {data-path}

   For example::

      scp -3 root@mon-node1:/var/lib/ceph/bootstrap-osd/ceph.keyring root@osd-node1:/var/lib/ceph/bootstrap-osd/ceph.keyring

      ssh osd-node1
      sudo ceph-volume lvm create --data /dev/hdd1

Alternatively, the creation process can be split in two phases (prepare and
activate):

#. Prepare the OSD. ::

      ssh {osd node}
      sudo ceph-volume lvm prepare --data {data-path}

   For example::

      ssh osd-node1
      sudo ceph-volume lvm prepare --data /dev/hdd1

   Once prepared, the ``ID`` and ``FSID`` of the prepared OSD are required for
   activation. These can be obtained by listing the OSDs on the current
   server::

      sudo ceph-volume lvm list

#. Activate the OSD::

      sudo ceph-volume lvm activate {ID} {FSID}

   For example::

      sudo ceph-volume lvm activate 0 a7f64266-0894-4f1e-a635-d0aeaca0e993


filestore
^^^^^^^^^
#. Create the OSD. ::

      ssh {osd node}
      sudo ceph-volume lvm create --filestore --data {data-path} --journal {journal-path}

   For example::

      ssh osd-node1
      sudo ceph-volume lvm create --filestore --data /dev/hdd1 --journal /dev/hdd2

Alternatively, the creation process can be split in two phases (prepare and
activate):

#. Prepare the OSD. ::

      ssh {node-name}
      sudo ceph-volume lvm prepare --filestore --data {data-path} --journal {journal-path}

   For example::

      ssh osd-node1
      sudo ceph-volume lvm prepare --filestore --data /dev/hdd1 --journal /dev/hdd2

   Once prepared, the ``ID`` and ``FSID`` of the prepared OSD are required for
   activation. These can be obtained by listing the OSDs on the current
   server::

      sudo ceph-volume lvm list

#. Activate the OSD::

      sudo ceph-volume lvm activate --filestore {ID} {FSID}

   For example::

      sudo ceph-volume lvm activate --filestore 0 a7f64266-0894-4f1e-a635-d0aeaca0e993

Long Form
---------

Without the benefit of any helper utilities, create an OSD and add it to the
cluster and CRUSH map with the following procedure. To create the first two
OSDs with the long form procedure, execute the following steps for each OSD.

.. note:: This procedure does not describe deployment on top of dm-crypt
          making use of the dm-crypt 'lockbox'.

#. Connect to the OSD host and become root. ::

      ssh {node-name}
      sudo bash

#. Generate a UUID for the OSD. ::

      UUID=$(uuidgen)

#. Generate a cephx key for the OSD. ::

      OSD_SECRET=$(ceph-authtool --gen-print-key)

#. Create the OSD. Note that an OSD ID can be provided as an
   additional argument to ``ceph osd new`` if you need to reuse a
   previously-destroyed OSD id. We assume that the
   ``client.bootstrap-osd`` key is present on the machine. You may
   alternatively execute this command as ``client.admin`` on a
   different host where that key is present. ::

      ID=$(echo "{\"cephx_secret\": \"$OSD_SECRET\"}" | \
         ceph osd new $UUID -i - \
         -n client.bootstrap-osd -k /var/lib/ceph/bootstrap-osd/ceph.keyring)

   It is also possible to include a ``crush_device_class`` property in the JSON
   to set an initial class other than the default (``ssd`` or ``hdd``, based on
   the auto-detected device type), as in the sketch below.
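
   For instance, an OSD that should be placed in a custom device class (the
   class name ``archive`` is an assumption for illustration) could be
   created like this::

      ID=$(echo "{\"cephx_secret\": \"$OSD_SECRET\", \"crush_device_class\": \"archive\"}" | \
         ceph osd new $UUID -i - \
         -n client.bootstrap-osd -k /var/lib/ceph/bootstrap-osd/ceph.keyring)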

#. Create the default directory on your new OSD. ::

      mkdir /var/lib/ceph/osd/ceph-$ID

#. If the OSD is for a drive other than the OS drive, prepare it
   for use with Ceph, and mount it to the directory you just created. ::

      mkfs.xfs /dev/{DEV}
      mount /dev/{DEV} /var/lib/ceph/osd/ceph-$ID

#. Write the secret to the OSD keyring file. ::

      ceph-authtool --create-keyring /var/lib/ceph/osd/ceph-$ID/keyring \
           --name osd.$ID --add-key $OSD_SECRET

#. Initialize the OSD data directory. ::

      ceph-osd -i $ID --mkfs --osd-uuid $UUID

#. Fix ownership. ::

      chown -R ceph:ceph /var/lib/ceph/osd/ceph-$ID

#. After you add an OSD to Ceph, the OSD is in your configuration. However,
   it is not yet running. You must start your new OSD before it can begin
   receiving data.

   For modern systemd distributions::

      systemctl enable ceph-osd@$ID
      systemctl start ceph-osd@$ID

   For example::

      systemctl enable ceph-osd@12
      systemctl start ceph-osd@12
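
   Once the OSDs are started, you can confirm that they registered with the
   monitor::

      ceph osd stat

   The output should report something like ``2 osds: 2 up, 2 in`` once both
   OSDs from this procedure are running.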


Adding MDS
==========

In the instructions below, ``{id}`` is an arbitrary name, such as the hostname
of the machine.

#. Create the mds data directory. ::

      mkdir -p /var/lib/ceph/mds/{cluster-name}-{id}

#. Create a keyring. ::

      ceph-authtool --create-keyring /var/lib/ceph/mds/{cluster-name}-{id}/keyring --gen-key -n mds.{id}

#. Import the keyring and set caps. ::

      ceph auth add mds.{id} osd "allow rwx" mds "allow *" mon "allow profile mds" -i /var/lib/ceph/mds/{cluster}-{id}/keyring

#. Add to ceph.conf. ::

      [mds.{id}]
      host = {id}

#. Start the daemon the manual way. ::

      ceph-mds --cluster {cluster-name} -i {id} -m {mon-hostname}:{mon-port} [-f]

#. Start the daemon the right way (using ceph.conf entry). ::

      service ceph start
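
   On systemd-based distributions the equivalent is typically (assuming the
   unit name follows the same pattern as the mon and osd units above)::

      sudo systemctl start ceph-mds@{id}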

#. If starting the daemon fails with this error::

      mds.-1.0 ERROR: failed to authenticate: (22) Invalid argument

   then make sure you do not have a keyring set in ceph.conf in the global
   section; move it to the client section, or add a keyring setting specific
   to this mds daemon. Also verify that the key in the mds data directory
   matches the output of ``ceph auth get mds.{id}``.

#. Now you are ready to `create a Ceph file system`_.


Summary
=======

Once you have your monitor and two OSDs up and running, you can watch the
placement groups peer by executing the following::

   ceph -w

To view the tree, execute the following::

   ceph osd tree

You should see output that looks something like this::

   # id    weight  type name       up/down reweight
   -1      2       root default
   -2      1               host osd-node1
   0       1                       osd.0   up      1
   -3      1               host osd-node2
   1       1                       osd.1   up      1

To add (or remove) additional monitors, see `Add/Remove Monitors`_.
To add (or remove) additional Ceph OSD Daemons, see `Add/Remove OSDs`_.


.. _Add/Remove Monitors: ../../rados/operations/add-or-rm-mons
.. _Add/Remove OSDs: ../../rados/operations/add-or-rm-osds
.. _Network Configuration Reference: ../../rados/configuration/network-config-ref
.. _Monitor Config Reference - Data: ../../rados/configuration/mon-config-ref#data
.. _create a Ceph file system: ../../cephfs/createfs