==============================
 Manual Deployment on FreeBSD
==============================

This is largely a copy of the regular Manual Deployment guide, with FreeBSD
specifics. The difference lies in two parts: the underlying disk format, and
the way the tools are used.

All Ceph clusters require at least one monitor, and at least as many OSDs as
copies of an object stored on the cluster. Bootstrapping the initial monitor(s)
is the first step in deploying a Ceph Storage Cluster. Monitor deployment also
sets important criteria for the entire cluster, such as the number of replicas
for pools, the number of placement groups per OSD, the heartbeat intervals,
whether authentication is required, etc. Most of these values are set by
default, so it's useful to know about them when setting up your cluster for
production.

We will set up a cluster with ``node1`` as the monitor node, and ``node2`` and
``node3`` for OSD nodes.


.. ditaa::

   /------------------\         /----------------\
   |    Admin Node    |         |     node1      |
   |                  +-------->+                |
   |                  |         | cCCC           |
   \---------+--------/         \----------------/
             |
             |                  /----------------\
             |                  |     node2      |
             +----------------->+                |
             |                  | cCCC           |
             |                  \----------------/
             |
             |                  /----------------\
             |                  |     node3      |
             +----------------->|                |
                                | cCCC           |
                                \----------------/


Disk layout on FreeBSD
======================

The current implementation works on ZFS pools.

* All Ceph data is created in /var/lib/ceph
* Log files go into /var/log/ceph
* PID files go into /var/run
* One ZFS pool is allocated per OSD, like::

    gpart create -s GPT ada1
    gpart add -t freebsd-zfs -l osd.1 ada1
    zpool create -m /var/lib/ceph/osd/osd.1 osd.1 gpt/osd.1

* Some cache and log (ZIL) can be attached.
  Please note that this is different from the Ceph journals. Cache and log are
  completely transparent to Ceph; they only help the file system stay
  consistent and improve performance.
  Assuming that ada2 is an SSD::

    gpart create -s GPT ada2
    gpart add -t freebsd-zfs -l osd.1-log -s 1G ada2
    zpool add osd.1 log gpt/osd.1-log
    gpart add -t freebsd-zfs -l osd.1-cache -s 10G ada2
    zpool add osd.1 cache gpt/osd.1-cache

* Note: *UFS2 does not allow large xattrs*

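After creating a pool as above, a quick way to confirm that it is mounted where
Ceph expects it is to ask ZFS for the mountpoint (a simple sanity check,
assuming the pool ``osd.1`` from the example)::

  zfs get mountpoint osd.1
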
Configuration
-------------

As per FreeBSD defaults, parts of extra software go into ``/usr/local/``. This
means that the default location for ``ceph.conf`` is
``/usr/local/etc/ceph/ceph.conf``. The smartest thing to do is to create a
softlink from ``/etc/ceph`` to ``/usr/local/etc/ceph``::

  ln -s /usr/local/etc/ceph /etc/ceph

A sample file is provided in ``/usr/local/share/doc/ceph/sample.ceph.conf``.
Note that ``/usr/local/etc/ceph/ceph.conf`` will be found by most tools;
linking it to ``/etc/ceph/ceph.conf`` also helps with scripts and extra tools
found elsewhere, for example on mailing lists.

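For example, one way to start is to copy the shipped sample into place (a
sketch only; review and adjust the values before use)::

  cp /usr/local/share/doc/ceph/sample.ceph.conf /usr/local/etc/ceph/ceph.conf
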
Monitor Bootstrapping
=====================

Bootstrapping a monitor (a Ceph Storage Cluster, in theory) requires
a number of things:

- **Unique Identifier:** The ``fsid`` is a unique identifier for the cluster,
  and stands for File System ID from the days when the Ceph Storage Cluster was
  principally for the Ceph File System. Ceph now supports native interfaces,
  block devices, and object storage gateway interfaces too, so ``fsid`` is a
  bit of a misnomer.

- **Cluster Name:** Ceph clusters have a cluster name, which is a simple string
  without spaces. The default cluster name is ``ceph``, but you may specify
  a different cluster name. Overriding the default cluster name is
  especially useful when you are working with multiple clusters and you need to
  clearly understand which cluster you are working with.

  For example, when you run multiple clusters in a :ref:`multisite configuration <multisite>`,
  the cluster name (e.g., ``us-west``, ``us-east``) identifies the cluster for
  the current CLI session. **Note:** To identify the cluster name on the
  command line interface, specify a Ceph configuration file with the
  cluster name (e.g., ``ceph.conf``, ``us-west.conf``, ``us-east.conf``, etc.).
  Also see CLI usage (``ceph --cluster {cluster-name}``), as sketched below.

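  For instance, a status query scoped to a hypothetical ``us-west`` cluster
  (assuming a ``us-west.conf`` is in place) might look like::

    ceph --cluster us-west status
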
- **Monitor Name:** Each monitor instance within a cluster has a unique name.
  In common practice, the Ceph Monitor name is the host name (we recommend one
  Ceph Monitor per host, and no commingling of Ceph OSD Daemons with
  Ceph Monitors). You may retrieve the short hostname with ``hostname -s``.

- **Monitor Map:** Bootstrapping the initial monitor(s) requires you to
  generate a monitor map. The monitor map requires the ``fsid``, the cluster
  name (or uses the default), and at least one host name and its IP address.

- **Monitor Keyring**: Monitors communicate with each other via a
  secret key. You must generate a keyring with a monitor secret and provide
  it when bootstrapping the initial monitor(s).

- **Administrator Keyring**: To use the ``ceph`` CLI tools, you must have
  a ``client.admin`` user. So you must generate the admin user and keyring,
  and you must also add the ``client.admin`` user to the monitor keyring.

The foregoing requirements do not imply the creation of a Ceph configuration
file. However, as a best practice, we recommend creating a Ceph configuration
file and populating it with the ``fsid``, the ``mon initial members`` and the
``mon host`` settings.

You can get and set all of the monitor settings at runtime as well. However,
a Ceph configuration file may contain only those settings that override the
default values. When you add settings to a Ceph configuration file, these
settings override the default settings. Maintaining those settings in a
Ceph configuration file makes it easier to maintain your cluster.

The procedure is as follows:


#. Log in to the initial monitor node(s)::

      ssh {hostname}

   For example::

      ssh node1


#. Ensure you have a directory for the Ceph configuration file. By default,
   Ceph uses ``/etc/ceph``. When you install ``ceph``, the installer will
   create the ``/etc/ceph`` directory automatically. ::

      ls /etc/ceph

#. Create a Ceph configuration file. By default, Ceph uses
   ``ceph.conf``, where ``ceph`` reflects the cluster name. ::

      sudo vim /etc/ceph/ceph.conf


#. Generate a unique ID (i.e., ``fsid``) for your cluster. ::

      uuidgen


#. Add the unique ID to your Ceph configuration file. ::

      fsid = {UUID}

   For example::

      fsid = a7f64266-0894-4f1e-a635-d0aeaca0e993


#. Add the initial monitor(s) to your Ceph configuration file. ::

      mon initial members = {hostname}[,{hostname}]

   For example::

      mon initial members = node1


#. Add the IP address(es) of the initial monitor(s) to your Ceph configuration
   file and save the file. ::

      mon host = {ip-address}[,{ip-address}]

   For example::

      mon host = 192.168.0.1

   **Note:** You may use IPv6 addresses instead of IPv4 addresses, but
   you must set ``ms bind ipv6`` to ``true``. See `Network Configuration
   Reference`_ for details about network configuration.

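   A minimal sketch of such an IPv6 configuration (the address below is only a
   placeholder from the IPv6 documentation range)::

      ms bind ipv6 = true
      mon host = 2001:db8::1
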
#. Create a keyring for your cluster and generate a monitor secret key. ::

      ceph-authtool --create-keyring /tmp/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'


#. Generate an administrator keyring, generate a ``client.admin`` user and add
   the user to the keyring. ::

      sudo ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring --gen-key -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *' --cap mgr 'allow *'


#. Add the ``client.admin`` key to the ``ceph.mon.keyring``. ::

      ceph-authtool /tmp/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring

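   To confirm that both the ``mon.`` and ``client.admin`` keys are now present,
   you can optionally list the keyring (a quick check, not required by the
   procedure)::

      ceph-authtool -l /tmp/ceph.mon.keyring
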
#. Generate a monitor map using the hostname(s), host IP address(es) and the FSID.
   Save it as ``/tmp/monmap``::

      monmaptool --create --add {hostname} {ip-address} --fsid {uuid} /tmp/monmap

   For example::

      monmaptool --create --add node1 192.168.0.1 --fsid a7f64266-0894-4f1e-a635-d0aeaca0e993 /tmp/monmap


#. Create a default data directory (or directories) on the monitor host(s). ::

      sudo mkdir /var/lib/ceph/mon/{cluster-name}-{hostname}

   For example::

      sudo mkdir /var/lib/ceph/mon/ceph-node1

   See `Monitor Config Reference - Data`_ for details.

#. Populate the monitor daemon(s) with the monitor map and keyring. ::

      sudo -u ceph ceph-mon [--cluster {cluster-name}] --mkfs -i {hostname} --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring

   For example::

      sudo -u ceph ceph-mon --mkfs -i node1 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring


#. Consider settings for a Ceph configuration file. Common settings include
   the following::

      [global]
      fsid = {cluster-id}
      mon initial members = {hostname}[, {hostname}]
      mon host = {ip-address}[, {ip-address}]
      public network = {network}[, {network}]
      cluster network = {network}[, {network}]
      auth cluster required = cephx
      auth service required = cephx
      auth client required = cephx
      osd journal size = {n}
      osd pool default size = {n}  # Write an object n times.
      osd pool default min size = {n} # Allow writing n copies in a degraded state.
      osd pool default pg num = {n}
      osd pool default pgp num = {n}
      osd crush chooseleaf type = {n}

   In the foregoing example, the ``[global]`` section of the configuration might
   look like this::

      [global]
      fsid = a7f64266-0894-4f1e-a635-d0aeaca0e993
      mon initial members = node1
      mon host = 192.168.0.1
      public network = 192.168.0.0/24
      auth cluster required = cephx
      auth service required = cephx
      auth client required = cephx
      osd journal size = 1024
      osd pool default size = 3
      osd pool default min size = 2
      osd pool default pg num = 333
      osd pool default pgp num = 333
      osd crush chooseleaf type = 1

#. Touch the ``done`` file.

   Mark that the monitor is created and ready to be started::

      sudo touch /var/lib/ceph/mon/ceph-node1/done

#. And for FreeBSD an entry for every monitor needs to be added to the config
   file. (This requirement will be removed in future releases.)

   The entry should look like::

      [mon]
      [mon.node1]
      host = node1      # this name must be resolvable


#. Start the monitor(s).

   For FreeBSD we use the rc.d init scripts (called bsdrc in Ceph)::

      sudo service ceph start mon.node1

   For this to work, ``/etc/rc.conf`` also needs an entry to enable ceph::

      echo 'ceph_enable="YES"' >> /etc/rc.conf

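   Alternatively, the same ``rc.conf`` entry can be managed with FreeBSD's
   ``sysrc(8)`` utility::

      sysrc ceph_enable=YES
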
#. Verify that Ceph created the default pools. ::

      ceph osd lspools

   You should see output like this::

      0 data
      1 metadata
      2 rbd

#. Verify that the monitor is running. ::

      ceph -s

   You should see output that the monitor you started is up and running, and
   you should see a health error indicating that placement groups are stuck
   inactive. It should look something like this::

      cluster a7f64266-0894-4f1e-a635-d0aeaca0e993
        health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no osds
        monmap e1: 1 mons at {node1=192.168.0.1:6789/0}, election epoch 1, quorum 0 node1
        osdmap e1: 0 osds: 0 up, 0 in
        pgmap v2: 192 pgs, 3 pools, 0 bytes data, 0 objects
               0 kB used, 0 kB / 0 kB avail
               192 creating

   **Note:** Once you add OSDs and start them, the placement group health errors
   should disappear. See the next section for details.

.. _freebsd_adding_osds:


Adding OSDs
===========

Once you have your initial monitor(s) running, you should add OSDs. Your cluster
cannot reach an ``active + clean`` state until you have enough OSDs to handle the
number of copies of an object (e.g., ``osd pool default size = 2`` requires at
least two OSDs). After bootstrapping your monitor, your cluster has a default
CRUSH map; however, the CRUSH map doesn't have any Ceph OSD Daemons mapped to
a Ceph Node.


Long Form
---------

Without the benefit of any helper utilities, create an OSD and add it to the
cluster and CRUSH map with the following procedure. To create the first two
OSDs with the long form procedure, execute the following on ``node2`` and
``node3``:

#. Connect to the OSD host. ::

      ssh {node-name}

#. Generate a UUID for the OSD. ::

      uuidgen


#. Create the OSD. If no UUID is given, it will be set automatically when the
   OSD starts up. The following command will output the OSD number, which you
   will need for subsequent steps. ::

      ceph osd create [{uuid} [{id}]]


#. Create the default directory on your new OSD. ::

      ssh {new-osd-host}
      sudo mkdir /var/lib/ceph/osd/{cluster-name}-{osd-number}

   On FreeBSD this directory is backed by a ZFS pool; see the ZFS instructions
   in the `Disk layout on FreeBSD`_ section above.


#. If the OSD is for a drive other than the OS drive, prepare it
   for use with Ceph, and mount it to the directory you just created, as
   sketched below.

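   For example, a minimal ZFS sketch (assuming the data drive is ``ada1``, the
   cluster name is ``ceph``, and the new OSD received number ``1``)::

      gpart create -s GPT ada1
      gpart add -t freebsd-zfs -l osd.1 ada1
      zpool create -m /var/lib/ceph/osd/ceph-1 osd.1 gpt/osd.1
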
#. Initialize the OSD data directory. ::

      ssh {new-osd-host}
      sudo ceph-osd -i {osd-num} --mkfs --mkkey --osd-uuid [{uuid}]

   The directory must be empty before you can run ``ceph-osd`` with the
   ``--mkkey`` option. In addition, the ceph-osd tool requires specification
   of custom cluster names with the ``--cluster`` option.


#. Register the OSD authentication key. The value of ``ceph`` for
   ``ceph-{osd-num}`` in the path is the ``$cluster-$id``. If your
   cluster name differs from ``ceph``, use your cluster name instead. ::

      sudo ceph auth add osd.{osd-num} osd 'allow *' mon 'allow profile osd' -i /var/lib/ceph/osd/{cluster-name}-{osd-num}/keyring


#. Add your Ceph Node to the CRUSH map. ::

      ceph [--cluster {cluster-name}] osd crush add-bucket {hostname} host

   For example::

      ceph osd crush add-bucket node1 host


#. Place the Ceph Node under the root ``default``. ::

      ceph osd crush move node1 root=default


#. Add the OSD to the CRUSH map so that it can begin receiving data. You may
   also decompile the CRUSH map, add the OSD to the device list, add the host as a
   bucket (if it's not already in the CRUSH map), add the device as an item in the
   host, assign it a weight, recompile it and set it. ::

      ceph [--cluster {cluster-name}] osd crush add {id-or-name} {weight} [{bucket-type}={bucket-name} ...]

   For example::

      ceph osd crush add osd.0 1.0 host=node1


#. After you add an OSD to Ceph, the OSD is in your configuration. However,
   it is not yet running. The OSD is ``down`` and ``in``. You must start
   your new OSD before it can begin receiving data.

   For FreeBSD, use the rc.d init scripts.

   After adding the OSD to ``ceph.conf``::

      sudo service ceph start osd.{osd-num}

   For example::

      sudo service ceph start osd.0
      sudo service ceph start osd.1

   In this case, to allow the start of the daemon at each reboot you
   must create an empty file like this::

      sudo touch /var/lib/ceph/osd/{cluster-name}-{osd-num}/bsdrc

   For example::

      sudo touch /var/lib/ceph/osd/ceph-0/bsdrc
      sudo touch /var/lib/ceph/osd/ceph-1/bsdrc

   Once you start your OSD, it is ``up`` and ``in``.



Adding MDS
==========

In the below instructions, ``{id}`` is an arbitrary name, such as the hostname of the machine.

#. Create the mds data directory. ::

      mkdir -p /var/lib/ceph/mds/{cluster-name}-{id}

#. Create a keyring. ::

      ceph-authtool --create-keyring /var/lib/ceph/mds/{cluster-name}-{id}/keyring --gen-key -n mds.{id}

#. Import the keyring and set caps. ::

      ceph auth add mds.{id} osd "allow rwx" mds "allow *" mon "allow profile mds" -i /var/lib/ceph/mds/{cluster}-{id}/keyring

#. Add to ceph.conf. ::

      [mds.{id}]
      host = {id}

#. Start the daemon the manual way. ::

      ceph-mds --cluster {cluster-name} -i {id} -m {mon-hostname}:{mon-port} [-f]

#. Start the daemon the right way (using the ceph.conf entry). ::

      service ceph start

#. If starting the daemon fails with this error::

      mds.-1.0 ERROR: failed to authenticate: (22) Invalid argument

   Then make sure you do not have a keyring set in ceph.conf in the global
   section; move it to the client section, or add a keyring setting specific
   to this mds daemon, as sketched below. Also verify that you see the same
   key in the mds data directory and in the output of ``ceph auth get mds.{id}``.

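   One hedged way to express that in ``ceph.conf`` (the paths follow the ones
   used earlier in this guide; adjust them to your setup)::

      [client]
      keyring = /etc/ceph/ceph.client.admin.keyring

      [mds.{id}]
      keyring = /var/lib/ceph/mds/{cluster-name}-{id}/keyring
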
#. Now you are ready to `create a Ceph file system`_.


Summary
=======

Once you have your monitor and two OSDs up and running, you can watch the
placement groups peer by executing the following::

  ceph -w

To view the tree, execute the following::

  ceph osd tree

You should see output that looks something like this::

  # id    weight  type name       up/down reweight
  -1      2       root default
  -2      2               host node1
  0       1                       osd.0   up      1
  -3      1               host node2
  1       1                       osd.1   up      1

To add (or remove) additional monitors, see `Add/Remove Monitors`_.
To add (or remove) additional Ceph OSD Daemons, see `Add/Remove OSDs`_.


.. _Add/Remove Monitors: ../../rados/operations/add-or-rm-mons
.. _Add/Remove OSDs: ../../rados/operations/add-or-rm-osds
.. _Network Configuration Reference: ../../rados/configuration/network-config-ref
.. _Monitor Config Reference - Data: ../../rados/configuration/mon-config-ref#data
.. _create a Ceph file system: ../../cephfs/createfs