===================
 Manual Deployment
===================

All Ceph clusters require at least one monitor, and at least as many OSDs as
copies of an object stored on the cluster. Bootstrapping the initial monitor(s)
is the first step in deploying a Ceph Storage Cluster. Monitor deployment also
sets important criteria for the entire cluster, such as the number of replicas
for pools, the number of placement groups per OSD, the heartbeat intervals,
whether authentication is required, etc. Most of these values are set by
default, so it's useful to know about them when setting up your cluster for
production.

Following the same configuration as `Installation (Quick)`_, we will set up a
cluster with ``node1`` as the monitor node, and ``node2`` and ``node3`` for
OSD nodes.


.. ditaa::
           /------------------\         /----------------\
           |    Admin Node    |         |     node1      |
           |                  +-------->+                |
           |                  |         | cCCC           |
           \---------+--------/         \----------------/
                     |
                     |                   /----------------\
                     |                   |     node2      |
                     +------------------>+                |
                     |                   | cCCC           |
                     |                   \----------------/
                     |
                     |                   /----------------\
                     |                   |     node3      |
                     +------------------>|                |
                                         | cCCC           |
                                         \----------------/


Monitor Bootstrapping
=====================

Bootstrapping a monitor (a Ceph Storage Cluster, in theory) requires
a number of things:

- **Unique Identifier:** The ``fsid`` is a unique identifier for the cluster,
  and stands for File System ID from the days when the Ceph Storage Cluster was
  principally for the Ceph Filesystem. Ceph now supports native interfaces,
  block devices, and object storage gateway interfaces too, so ``fsid`` is a
  bit of a misnomer.

- **Cluster Name:** Ceph clusters have a cluster name, which is a simple string
  without spaces. The default cluster name is ``ceph``, but you may specify
  a different cluster name. Overriding the default cluster name is
  especially useful when you are working with multiple clusters and you need to
  clearly understand which cluster you are working with.

  For example, when you run multiple clusters in a `federated architecture`_,
  the cluster name (e.g., ``us-west``, ``us-east``) identifies the cluster for
  the current CLI session. **Note:** To identify the cluster name on the
  command line interface, specify the Ceph configuration file with the
  cluster name (e.g., ``ceph.conf``, ``us-west.conf``, ``us-east.conf``, etc.).
  Also see CLI usage (``ceph --cluster {cluster-name}``) and the example after
  this list.

- **Monitor Name:** Each monitor instance within a cluster has a unique name.
  In common practice, the Ceph Monitor name is the host name (we recommend one
  Ceph Monitor per host, and no commingling of Ceph OSD Daemons with
  Ceph Monitors). You may retrieve the short hostname with ``hostname -s``.

- **Monitor Map:** Bootstrapping the initial monitor(s) requires you to
  generate a monitor map. The monitor map requires the ``fsid``, the cluster
  name (or uses the default), and at least one host name and its IP address.

- **Monitor Keyring**: Monitors communicate with each other via a
  secret key. You must generate a keyring with a monitor secret and provide
  it when bootstrapping the initial monitor(s).

- **Administrator Keyring**: To use the ``ceph`` CLI tools, you must have
  a ``client.admin`` user. So you must generate the admin user and keyring,
  and you must also add the ``client.admin`` user to the monitor keyring.
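
For example (a sketch, not a required step; the ``us-west`` name is just the
illustration used in the note above), a CLI session against a non-default
cluster reads its settings from the configuration file that matches the
cluster name::

    # Reads /etc/ceph/us-west.conf rather than /etc/ceph/ceph.conf
    ceph --cluster us-west health
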
The foregoing requirements do not imply the creation of a Ceph configuration
file. However, as a best practice, we recommend creating a Ceph configuration
file and populating it with the ``fsid``, the ``mon initial members`` and the
``mon host`` settings.

You can get and set all of the monitor settings at runtime as well. However,
a Ceph configuration file may contain only those settings that override the
default values. When you add settings to a Ceph configuration file, these
settings override the default settings. Maintaining those settings in a
Ceph configuration file makes it easier to maintain your cluster.
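
For example, once the monitor from the procedure below is up, you can inspect
or change its settings at runtime through its admin socket (a minimal sketch;
``mon.node1`` matches this guide, and the options shown are only examples)::

    # Show the configuration the monitor is actually running with.
    sudo ceph daemon mon.node1 config show | grep mon_initial_members

    # Change a single option at runtime (not persisted to ceph.conf).
    sudo ceph daemon mon.node1 config set mon_allow_pool_delete true
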

The procedure is as follows:


#. Log in to the initial monitor node(s)::

      ssh {hostname}

   For example::

      ssh node1


#. Ensure you have a directory for the Ceph configuration file. By default,
   Ceph uses ``/etc/ceph``. When you install ``ceph``, the installer will
   create the ``/etc/ceph`` directory automatically. ::

      ls /etc/ceph

   **Note:** Deployment tools may remove this directory when purging a
   cluster (e.g., ``ceph-deploy purgedata {node-name}``, ``ceph-deploy purge
   {node-name}``).

#. Create a Ceph configuration file. By default, Ceph uses
   ``ceph.conf``, where ``ceph`` reflects the cluster name. ::

      sudo vim /etc/ceph/ceph.conf


#. Generate a unique ID (i.e., ``fsid``) for your cluster. ::

      uuidgen


#. Add the unique ID to your Ceph configuration file. ::

      fsid = {UUID}

   For example::

      fsid = a7f64266-0894-4f1e-a635-d0aeaca0e993


#. Add the initial monitor(s) to your Ceph configuration file. ::

      mon initial members = {hostname}[,{hostname}]

   For example::

      mon initial members = node1


#. Add the IP address(es) of the initial monitor(s) to your Ceph configuration
   file and save the file. ::

      mon host = {ip-address}[,{ip-address}]

   For example::

      mon host = 192.168.0.1

   **Note:** You may use IPv6 addresses instead of IPv4 addresses, but
   you must set ``ms bind ipv6`` to ``true``. See `Network Configuration
   Reference`_ for details about network configuration.

#. Create a keyring for your cluster and generate a monitor secret key. ::

      ceph-authtool --create-keyring /tmp/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'


#. Generate an administrator keyring, generate a ``client.admin`` user and add
   the user to the keyring. ::

      sudo ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring --gen-key -n client.admin --set-uid=0 --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow'


#. Add the ``client.admin`` key to the ``ceph.mon.keyring``. ::

      ceph-authtool /tmp/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring


#. Generate a monitor map using the hostname(s), host IP address(es) and the FSID.
   Save it as ``/tmp/monmap``::

      monmaptool --create --add {hostname} {ip-address} --fsid {uuid} /tmp/monmap

   For example::

      monmaptool --create --add node1 192.168.0.1 --fsid a7f64266-0894-4f1e-a635-d0aeaca0e993 /tmp/monmap


#. Create a default data directory (or directories) on the monitor host(s). ::

      sudo mkdir /var/lib/ceph/mon/{cluster-name}-{hostname}

   For example::

      sudo mkdir /var/lib/ceph/mon/ceph-node1

   See `Monitor Config Reference - Data`_ for details.

#. Populate the monitor daemon(s) with the monitor map and keyring. ::

      sudo -u ceph ceph-mon [--cluster {cluster-name}] --mkfs -i {hostname} --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring

   For example::

      sudo -u ceph ceph-mon --mkfs -i node1 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring


#. Consider settings for a Ceph configuration file. Common settings include
   the following::

      [global]
      fsid = {cluster-id}
      mon initial members = {hostname}[, {hostname}]
      mon host = {ip-address}[, {ip-address}]
      public network = {network}[, {network}]
      cluster network = {network}[, {network}]
      auth cluster required = cephx
      auth service required = cephx
      auth client required = cephx
      osd journal size = {n}
      osd pool default size = {n}  # Write an object n times.
      osd pool default min size = {n}  # Allow writing n copies in a degraded state.
      osd pool default pg num = {n}
      osd pool default pgp num = {n}
      osd crush chooseleaf type = {n}

   In the foregoing example, the ``[global]`` section of the configuration might
   look like this::

      [global]
      fsid = a7f64266-0894-4f1e-a635-d0aeaca0e993
      mon initial members = node1
      mon host = 192.168.0.1
      public network = 192.168.0.0/24
      auth cluster required = cephx
      auth service required = cephx
      auth client required = cephx
      osd journal size = 1024
      osd pool default size = 2
      osd pool default min size = 1
      osd pool default pg num = 333
      osd pool default pgp num = 333
      osd crush chooseleaf type = 1

#. Touch the ``done`` file.

   Mark that the monitor is created and ready to be started::

      sudo touch /var/lib/ceph/mon/ceph-node1/done

#. Start the monitor(s).

   For Ubuntu, use Upstart::

      sudo start ceph-mon id=node1 [cluster={cluster-name}]

   In this case, to allow the start of the daemon at each reboot you
   must create an empty file like this::

      sudo touch /var/lib/ceph/mon/{cluster-name}-{hostname}/upstart

   For example::

      sudo touch /var/lib/ceph/mon/ceph-node1/upstart

   For Debian/CentOS/RHEL, use sysvinit::

      sudo /etc/init.d/ceph start mon.node1


#. Verify that Ceph created the default pools. ::

      ceph osd lspools

   You should see output like this::

      0 data,1 metadata,2 rbd,


#. Verify that the monitor is running. ::

      ceph -s

   You should see output that the monitor you started is up and running, and
   you should see a health error indicating that placement groups are stuck
   inactive. It should look something like this::

      cluster a7f64266-0894-4f1e-a635-d0aeaca0e993
        health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no osds
        monmap e1: 1 mons at {node1=192.168.0.1:6789/0}, election epoch 1, quorum 0 node1
        osdmap e1: 0 osds: 0 up, 0 in
        pgmap v2: 192 pgs, 3 pools, 0 bytes data, 0 objects
           0 kB used, 0 kB / 0 kB avail
           192 creating

   **Note:** Once you add OSDs and start them, the placement group health errors
   should disappear. See the next section for details.

Manager daemon configuration
============================

On each node where you run a ceph-mon daemon, you should also set up a ceph-mgr daemon.

See :doc:`../mgr/administrator`
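
A minimal sketch of that setup (the linked manager documentation is the
authoritative reference; using ``node1`` as the mgr name and the default
``ceph`` cluster here are assumptions made to match this guide)::

    # Create an auth key for the mgr daemon and store it where ceph-mgr looks for it.
    sudo mkdir -p /var/lib/ceph/mgr/ceph-node1
    ceph auth get-or-create mgr.node1 mon 'allow profile mgr' osd 'allow *' mds 'allow *' | sudo tee /var/lib/ceph/mgr/ceph-node1/keyring

    # Start the manager daemon.
    sudo ceph-mgr -i node1
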
Adding OSDs
===========

Once you have your initial monitor(s) running, you should add OSDs. Your cluster
cannot reach an ``active + clean`` state until you have enough OSDs to handle the
number of copies of an object (e.g., ``osd pool default size = 2`` requires at
least two OSDs). After bootstrapping your monitor, your cluster has a default
CRUSH map; however, the CRUSH map doesn't have any Ceph OSD Daemons mapped to
a Ceph Node.
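
You can see this for yourself before adding any OSDs (an optional check, not a
required step): the CRUSH tree contains only the default root and no devices::

    ceph osd tree
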

Short Form
----------

Ceph provides the ``ceph-disk`` utility, which can prepare a disk, partition or
directory for use with Ceph. The ``ceph-disk`` utility creates the OSD ID by
incrementing the index. Additionally, ``ceph-disk`` will add the new OSD to the
CRUSH map under the host for you. Execute ``ceph-disk -h`` for CLI details.
The ``ceph-disk`` utility automates the steps of the `Long Form`_ below. To
create the first two OSDs with the short form procedure, execute the following
on ``node2`` and ``node3``:


#. Prepare the OSD. ::

      ssh {node-name}
      sudo ceph-disk prepare --cluster {cluster-name} --cluster-uuid {uuid} {data-path} [{journal-path}]

   For example::

      ssh node1
      sudo ceph-disk prepare --cluster ceph --cluster-uuid a7f64266-0894-4f1e-a635-d0aeaca0e993 --fs-type ext4 /dev/hdd1


#. Activate the OSD::

      sudo ceph-disk activate {data-path} [--activate-key {path}]

   For example::

      sudo ceph-disk activate /dev/hdd1

   **Note:** Use the ``--activate-key`` argument if you do not have a copy
   of ``/var/lib/ceph/bootstrap-osd/{cluster}.keyring`` on the Ceph Node.
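
For example, if the bootstrap keyring is not already on the node, a sketch of
the same activation with an explicit key path (assuming the default ``ceph``
cluster name) would be::

    sudo ceph-disk activate /dev/hdd1 --activate-key /var/lib/ceph/bootstrap-osd/ceph.keyring
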

Long Form
---------

Without the benefit of any helper utilities, create an OSD and add it to the
cluster and CRUSH map with the following procedure. To create the first two
OSDs with the long form procedure, execute the following on ``node2`` and
``node3``:

#. Connect to the OSD host. ::

      ssh {node-name}

#. Generate a UUID for the OSD. ::

      uuidgen


#. Create the OSD. If no UUID is given, it will be set automatically when the
   OSD starts up. The following command will output the OSD number, which you
   will need for subsequent steps. ::

      ceph osd create [{uuid} [{id}]]


#. Create the default directory on your new OSD. ::

      ssh {new-osd-host}
      sudo mkdir /var/lib/ceph/osd/{cluster-name}-{osd-number}


#. If the OSD is for a drive other than the OS drive, prepare it
   for use with Ceph, and mount it to the directory you just created::

      ssh {new-osd-host}
      sudo mkfs -t {fstype} /dev/{hdd}
      sudo mount -o user_xattr /dev/{hdd} /var/lib/ceph/osd/{cluster-name}-{osd-number}


#. Initialize the OSD data directory. ::

      ssh {new-osd-host}
      sudo ceph-osd -i {osd-num} --mkfs --mkkey --osd-uuid {uuid}

   The directory must be empty before you can run ``ceph-osd`` with the
   ``--mkkey`` option. In addition, if you use a custom cluster name, you must
   pass it to ``ceph-osd`` with the ``--cluster`` option.


#. Register the OSD authentication key. The value of ``ceph`` for
   ``ceph-{osd-num}`` in the path is the ``$cluster-$id``. If your
   cluster name differs from ``ceph``, use your cluster name instead. ::

      sudo ceph auth add osd.{osd-num} osd 'allow *' mon 'allow profile osd' -i /var/lib/ceph/osd/{cluster-name}-{osd-num}/keyring


#. Add your Ceph Node to the CRUSH map. ::

      ceph [--cluster {cluster-name}] osd crush add-bucket {hostname} host

   For example::

      ceph osd crush add-bucket node1 host


#. Place the Ceph Node under the root ``default``. ::

      ceph osd crush move node1 root=default


#. Add the OSD to the CRUSH map so that it can begin receiving data. You may
   also decompile the CRUSH map, add the OSD to the device list, add the host as a
   bucket (if it's not already in the CRUSH map), add the device as an item in the
   host, assign it a weight, recompile it and set it. ::

      ceph [--cluster {cluster-name}] osd crush add {id-or-name} {weight} [{bucket-type}={bucket-name} ...]

   For example::

      ceph osd crush add osd.0 1.0 host=node1


#. After you add an OSD to Ceph, the OSD is in your configuration. However,
   it is not yet running. The OSD is ``down`` and ``in``. You must start
   your new OSD before it can begin receiving data.

   For Ubuntu, use Upstart::

      sudo start ceph-osd id={osd-num} [cluster={cluster-name}]

   For example::

      sudo start ceph-osd id=0
      sudo start ceph-osd id=1

   For Debian/CentOS/RHEL, use sysvinit::

      sudo /etc/init.d/ceph start osd.{osd-num} [--cluster {cluster-name}]

   For example::

      sudo /etc/init.d/ceph start osd.0
      sudo /etc/init.d/ceph start osd.1

   In this case, to allow the start of the daemon at each reboot you
   must create an empty file like this::

      sudo touch /var/lib/ceph/osd/{cluster-name}-{osd-num}/sysvinit

   For example::

      sudo touch /var/lib/ceph/osd/ceph-0/sysvinit
      sudo touch /var/lib/ceph/osd/ceph-1/sysvinit

   Once you start your OSD, it is ``up`` and ``in``.
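
As a quick check after starting the daemons (optional; assuming the default
``ceph`` cluster name), you can confirm that the new OSDs report as up and in::

    ceph osd stat
    ceph osd tree
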

Adding MDS
==========

In the instructions below, ``{id}`` is an arbitrary name, such as the hostname of the machine.

#. Create the mds data directory.::

      mkdir -p /var/lib/ceph/mds/{cluster-name}-{id}

#. Create a keyring.::

      ceph-authtool --create-keyring /var/lib/ceph/mds/{cluster-name}-{id}/keyring --gen-key -n mds.{id}

#. Import the keyring and set caps.::

      ceph auth add mds.{id} osd "allow rwx" mds "allow" mon "allow profile mds" -i /var/lib/ceph/mds/{cluster}-{id}/keyring

#. Add to ceph.conf.::

      [mds.{id}]
      host = {id}

#. Start the daemon the manual way.::

      ceph-mds --cluster {cluster-name} -i {id} -m {mon-hostname}:{mon-port} [-f]

#. Start the daemon the right way (using ceph.conf entry).::

      service ceph start

#. If starting the daemon fails with this error::

      mds.-1.0 ERROR: failed to authenticate: (22) Invalid argument

   then make sure you do not have a keyring set in ceph.conf in the global
   section; move it to the client section, or add a keyring setting specific to
   this mds daemon (see the sketch after this list). Also verify that the key
   in the mds data directory matches the output of ``ceph auth get mds.{id}``.

#. Now you are ready to `create a Ceph filesystem`_.
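
A minimal sketch of the daemon-specific keyring setting mentioned in the
troubleshooting step above (the path matches the data directory created in the
first step; adjust it if yours differs)::

    [mds.{id}]
        keyring = /var/lib/ceph/mds/{cluster}-{id}/keyring
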

Summary
=======

Once you have your monitor and two OSDs up and running, you can watch the
placement groups peer by executing the following::

    ceph -w

To view the tree, execute the following::

    ceph osd tree

You should see output that looks something like this::

    # id    weight  type name       up/down reweight
    -1      2       root default
    -2      2               host node1
    0       1                       osd.0   up      1
    -3      1               host node2
    1       1                       osd.1   up      1

To add (or remove) additional monitors, see `Add/Remove Monitors`_.
To add (or remove) additional Ceph OSD Daemons, see `Add/Remove OSDs`_.


.. _federated architecture: ../../radosgw/federated-config
.. _Installation (Quick): ../../start
.. _Add/Remove Monitors: ../../rados/operations/add-or-rm-mons
.. _Add/Remove OSDs: ../../rados/operations/add-or-rm-osds
.. _Network Configuration Reference: ../../rados/configuration/network-config-ref
.. _Monitor Config Reference - Data: ../../rados/configuration/mon-config-ref#data
.. _create a Ceph filesystem: ../../cephfs/createfs