===================
 Manual Deployment
===================

All Ceph clusters require at least one monitor, and at least as many OSDs as
copies of an object stored on the cluster. Bootstrapping the initial monitor(s)
is the first step in deploying a Ceph Storage Cluster. Monitor deployment also
sets important criteria for the entire cluster, such as the number of replicas
for pools, the number of placement groups per OSD, the heartbeat intervals,
whether authentication is required, etc. Most of these values are set by
default, so it's useful to know about them when setting up your cluster for
production.

We will set up a cluster with ``mon-node1`` as the monitor node, and ``osd-node1`` and
``osd-node2`` for OSD nodes.


.. ditaa::

           /------------------\         /----------------\
           |    Admin Node    |         |    mon-node1   |
           |                  +-------->+                |
           |                  |         |      cCCC      |
           \---------+--------/         \----------------/
                     |
                     |                  /----------------\
                     |                  |    osd-node1   |
                     +----------------->+                |
                     |                  |      cCCC      |
                     |                  \----------------/
                     |
                     |                  /----------------\
                     |                  |    osd-node2   |
                     +----------------->|                |
                                        |      cCCC      |
                                        \----------------/


Monitor Bootstrapping
=====================

Bootstrapping a monitor (and, by extension, a Ceph Storage Cluster) requires
a number of things:

- **Unique Identifier:** The ``fsid`` is a unique identifier for the cluster,
  and stands for File System ID from the days when the Ceph Storage Cluster was
  principally for the Ceph File System. Ceph now supports native interfaces,
  block devices, and object storage gateway interfaces too, so ``fsid`` is a
  bit of a misnomer.

- **Cluster Name:** Ceph clusters have a cluster name, which is a simple string
  without spaces. The default cluster name is ``ceph``, but you may specify
  a different cluster name. Overriding the default cluster name is
  especially useful when you are working with multiple clusters and you need to
  clearly understand which cluster you are working with.

  For example, when you run multiple clusters in a :ref:`multisite configuration <multisite>`,
  the cluster name (e.g., ``us-west``, ``us-east``) identifies the cluster for
  the current CLI session. **Note:** To identify the cluster name on the
  command line interface, specify the Ceph configuration file with the
  cluster name (e.g., ``ceph.conf``, ``us-west.conf``, ``us-east.conf``, etc.).
  Also see CLI usage (``ceph --cluster {cluster-name}``).

- **Monitor Name:** Each monitor instance within a cluster has a unique name.
  In common practice, the Ceph Monitor name is the host name (we recommend one
  Ceph Monitor per host, and no commingling of Ceph OSD Daemons with
  Ceph Monitors). You may retrieve the short hostname with ``hostname -s``.

- **Monitor Map:** Bootstrapping the initial monitor(s) requires you to
  generate a monitor map. The monitor map requires the ``fsid``, the cluster
  name (or uses the default), and at least one host name and its IP address.

- **Monitor Keyring**: Monitors communicate with each other via a
  secret key. You must generate a keyring with a monitor secret and provide
  it when bootstrapping the initial monitor(s).

- **Administrator Keyring**: To use the ``ceph`` CLI tools, you must have
  a ``client.admin`` user. So you must generate the admin user and keyring,
  and you must also add the ``client.admin`` user to the monitor keyring.

The foregoing requirements do not imply the creation of a Ceph configuration
file. However, as a best practice, we recommend creating a Ceph configuration
file and populating it with the ``fsid``, the ``mon initial members`` and the
``mon host`` settings.

You can get and set all of the monitor settings at runtime as well. However,
a Ceph configuration file may contain only those settings that override the
default values. When you add settings to a Ceph configuration file, these
settings override the default settings. Maintaining those settings in a
Ceph configuration file makes it easier to maintain your cluster.
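
Once the cluster is up, for example, an option can be inspected or changed at
runtime with the centralized configuration commands (a minimal illustration;
the option shown here is only an example)::

   ceph config get mon public_network
   ceph config set mon mon_allow_pool_delete false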

The procedure is as follows:


#. Log in to the initial monitor node(s)::

      ssh {hostname}

   For example::

      ssh mon-node1


#. Ensure you have a directory for the Ceph configuration file. By default,
   Ceph uses ``/etc/ceph``. When you install ``ceph``, the installer will
   create the ``/etc/ceph`` directory automatically. ::

      ls /etc/ceph


#. Create a Ceph configuration file. By default, Ceph uses
   ``ceph.conf``, where ``ceph`` reflects the cluster name. Add a line
   containing "[global]" to the configuration file. ::

      sudo vim /etc/ceph/ceph.conf


#. Generate a unique ID (i.e., ``fsid``) for your cluster. ::

      uuidgen


#. Add the unique ID to your Ceph configuration file. ::

      fsid = {UUID}

   For example::

      fsid = a7f64266-0894-4f1e-a635-d0aeaca0e993


#. Add the initial monitor(s) to your Ceph configuration file. ::

      mon_initial_members = {hostname}[,{hostname}]

   For example::

      mon_initial_members = mon-node1


#. Add the IP address(es) of the initial monitor(s) to your Ceph configuration
   file and save the file. ::

      mon_host = {ip-address}[,{ip-address}]

   For example::

      mon_host = 192.168.0.1

   **Note:** You may use IPv6 addresses instead of IPv4 addresses, but
   you must set ``ms_bind_ipv6`` to ``true``. See `Network Configuration
   Reference`_ for details about network configuration.
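
   For example, with a (hypothetical) IPv6 address the relevant lines might
   look like this::

      ms_bind_ipv6 = true
      mon_host = 2001:db8::10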

#. Create a keyring for your cluster and generate a monitor secret key. ::

      sudo ceph-authtool --create-keyring /tmp/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'


#. Generate an administrator keyring, generate a ``client.admin`` user and add
   the user to the keyring. ::

      sudo ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring --gen-key -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *' --cap mgr 'allow *'

#. Generate a bootstrap-osd keyring, generate a ``client.bootstrap-osd`` user and add
   the user to the keyring. ::

      sudo ceph-authtool --create-keyring /var/lib/ceph/bootstrap-osd/ceph.keyring --gen-key -n client.bootstrap-osd --cap mon 'profile bootstrap-osd' --cap mgr 'allow r'

#. Add the generated keys to the ``ceph.mon.keyring``. ::

      sudo ceph-authtool /tmp/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
      sudo ceph-authtool /tmp/ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
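
   You can optionally confirm that all three keys are now present in the
   monitor keyring::

      sudo ceph-authtool /tmp/ceph.mon.keyring --list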

#. Change the owner for ``ceph.mon.keyring``. ::

      sudo chown ceph:ceph /tmp/ceph.mon.keyring

#. Generate a monitor map using the hostname(s), host IP address(es) and the FSID.
   Save it as ``/tmp/monmap``::

      monmaptool --create --add {hostname} {ip-address} --fsid {uuid} /tmp/monmap

   For example::

      monmaptool --create --add mon-node1 192.168.0.1 --fsid a7f64266-0894-4f1e-a635-d0aeaca0e993 /tmp/monmap

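   You can inspect the resulting map to confirm its contents::

      monmaptool --print /tmp/monmap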

#. Create a default data directory (or directories) on the monitor host(s). ::

      sudo mkdir /var/lib/ceph/mon/{cluster-name}-{hostname}

   For example::

      sudo -u ceph mkdir /var/lib/ceph/mon/ceph-mon-node1

   See `Monitor Config Reference - Data`_ for details.

#. Populate the monitor daemon(s) with the monitor map and keyring. ::

      sudo -u ceph ceph-mon [--cluster {cluster-name}] --mkfs -i {hostname} --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring

   For example::

      sudo -u ceph ceph-mon --mkfs -i mon-node1 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring

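   At this point the monitor's data directory should be populated; it typically
   contains, among other things, a ``keyring`` file and a ``store.db`` directory::

      ls /var/lib/ceph/mon/ceph-mon-node1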

#. Consider settings for a Ceph configuration file. Common settings include
   the following::

      [global]
      fsid = {cluster-id}
      mon_initial_members = {hostname}[, {hostname}]
      mon_host = {ip-address}[, {ip-address}]
      public_network = {network}[, {network}]
      cluster_network = {network}[, {network}]
      auth_cluster_required = cephx
      auth_service_required = cephx
      auth_client_required = cephx
      osd_pool_default_size = {n}  # Write an object n times.
      osd_pool_default_min_size = {n} # Allow writing n copies in a degraded state.
      osd_pool_default_pg_num = {n}
      osd_crush_chooseleaf_type = {n}

   In the foregoing example, the ``[global]`` section of the configuration might
   look like this::

      [global]
      fsid = a7f64266-0894-4f1e-a635-d0aeaca0e993
      mon_initial_members = mon-node1
      mon_host = 192.168.0.1
      public_network = 192.168.0.0/24
      auth_cluster_required = cephx
      auth_service_required = cephx
      auth_client_required = cephx
      osd_pool_default_size = 3
      osd_pool_default_min_size = 2
      osd_pool_default_pg_num = 333
      osd_crush_chooseleaf_type = 1


#. Start the monitor(s).

   Start the service with systemd::

      sudo systemctl start ceph-mon@mon-node1

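   To have the monitor come back up automatically after a reboot, you will most
   likely also want to enable the unit::

      sudo systemctl enable ceph-mon@mon-node1
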
#. Make sure that the firewall ports required by ``ceph-mon`` are open.

   Open the ports with firewalld::

      sudo firewall-cmd --zone=public --add-service=ceph-mon
      sudo firewall-cmd --zone=public --add-service=ceph-mon --permanent
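
   If you do not use firewalld, open the monitor ports (``3300`` and ``6789``
   by default) with the tool of your choice; with iptables, for example, this
   might look like::

      sudo iptables -A INPUT -p tcp --dport 3300 -j ACCEPT
      sudo iptables -A INPUT -p tcp --dport 6789 -j ACCEPT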

#. Verify that the monitor is running. ::

      sudo ceph -s

   You should see output confirming that the monitor you started is up and
   running. Until you add OSDs, the cluster has no usable capacity, and once
   pools exist you will also see a health warning about placement groups being
   stuck inactive until enough OSDs are added. The output should look something
   like this::

      cluster:
        id:     a7f64266-0894-4f1e-a635-d0aeaca0e993
        health: HEALTH_OK

      services:
        mon: 1 daemons, quorum mon-node1
        mgr: mon-node1(active)
        osd: 0 osds: 0 up, 0 in

      data:
        pools:   0 pools, 0 pgs
        objects: 0 objects, 0 bytes
        usage:   0 kB used, 0 kB / 0 kB avail
        pgs:


   **Note:** Once you add OSDs and start them, the placement group health
   warnings should disappear. See `Adding OSDs`_ for details.

Manager daemon configuration
============================

On each node where you run a ``ceph-mon`` daemon, you should also set up a
``ceph-mgr`` daemon.

See :ref:`mgr-administrator-guide`
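
If you prefer to set the manager up by hand, the procedure boils down to
creating an auth key and a data directory for the daemon and then starting it.
A minimal sketch follows (assuming the manager is named after the monitor host,
``mon-node1``; treat the guide linked above as authoritative)::

   # Create an auth key for the manager and store it in the mgr data directory.
   sudo ceph auth get-or-create mgr.mon-node1 mon 'allow profile mgr' osd 'allow *' mds 'allow *'
   sudo -u ceph mkdir /var/lib/ceph/mgr/ceph-mon-node1
   sudo ceph auth get mgr.mon-node1 -o /var/lib/ceph/mgr/ceph-mon-node1/keyring
   sudo chown -R ceph:ceph /var/lib/ceph/mgr/ceph-mon-node1
   # Start the daemon.
   sudo systemctl start ceph-mgr@mon-node1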

Adding OSDs
===========

Once you have your initial monitor(s) running, you should add OSDs. Your cluster
cannot reach an ``active + clean`` state until you have enough OSDs to handle the
number of copies of an object (e.g., ``osd_pool_default_size = 2`` requires at
least two OSDs). After bootstrapping your monitor, your cluster has a default
CRUSH map; however, the CRUSH map doesn't have any Ceph OSD Daemons mapped to
a Ceph Node.


Short Form
----------

Ceph provides the ``ceph-volume`` utility, which can prepare a logical volume,
disk, or partition for use with Ceph. The ``ceph-volume`` utility assigns each
new OSD the next available OSD ID and adds the new OSD to the CRUSH map under
its host for you. Execute ``ceph-volume -h`` for CLI details.
The ``ceph-volume`` utility automates the steps of the `Long Form`_ below. To
create the first two OSDs with the short form procedure, execute the following for each OSD:

#. Create the OSD. ::

      # Copy /var/lib/ceph/bootstrap-osd/ceph.keyring from the monitor node
      # (mon-node1) to the same path on the OSD node (osd-node1), then:
      ssh {osd node}
      sudo ceph-volume lvm create --data {data-path}

   For example::

      scp -3 root@mon-node1:/var/lib/ceph/bootstrap-osd/ceph.keyring root@osd-node1:/var/lib/ceph/bootstrap-osd/ceph.keyring

      ssh osd-node1
      sudo ceph-volume lvm create --data /dev/hdd1

Alternatively, the creation process can be split into two phases (prepare and
activate):

#. Prepare the OSD. ::

      ssh {osd node}
      sudo ceph-volume lvm prepare --data {data-path}

   For example::

      ssh osd-node1
      sudo ceph-volume lvm prepare --data /dev/hdd1

   Once prepared, the ``ID`` and ``FSID`` of the prepared OSD are required for
   activation. These can be obtained by listing OSDs on the current server::

      sudo ceph-volume lvm list

#. Activate the OSD::

      sudo ceph-volume lvm activate {ID} {FSID}

   For example::

      sudo ceph-volume lvm activate 0 a7f64266-0894-4f1e-a635-d0aeaca0e993


Long Form
---------

Without the benefit of any helper utilities, create an OSD and add it to the
cluster and CRUSH map with the following procedure. To create the first two
OSDs with the long form procedure, execute the following steps for each OSD.

.. note:: This procedure does not describe deployment on top of dm-crypt
          making use of the dm-crypt 'lockbox'.

#. Connect to the OSD host and become root. ::

      ssh {node-name}
      sudo bash

#. Generate a UUID for the OSD. ::

      UUID=$(uuidgen)

#. Generate a cephx key for the OSD. ::

      OSD_SECRET=$(ceph-authtool --gen-print-key)

#. Create the OSD. Note that an OSD ID can be provided as an
   additional argument to ``ceph osd new`` if you need to reuse a
   previously-destroyed OSD id. We assume that the
   ``client.bootstrap-osd`` key is present on the machine. You may
   alternatively execute this command as ``client.admin`` on a
   different host where that key is present. ::

      ID=$(echo "{\"cephx_secret\": \"$OSD_SECRET\"}" | \
         ceph osd new $UUID -i - \
         -n client.bootstrap-osd -k /var/lib/ceph/bootstrap-osd/ceph.keyring)

   It is also possible to include a ``crush_device_class`` property in the JSON
   to set an initial class other than the default (``ssd`` or ``hdd``, based on
   the auto-detected device type).
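
   For example, a (hypothetical) ``nvme`` class could be requested at creation
   time like this::

      ID=$(echo "{\"cephx_secret\": \"$OSD_SECRET\", \"crush_device_class\": \"nvme\"}" | \
         ceph osd new $UUID -i - \
         -n client.bootstrap-osd -k /var/lib/ceph/bootstrap-osd/ceph.keyring)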

#. Create the default directory on your new OSD. ::

      mkdir /var/lib/ceph/osd/ceph-$ID

#. If the OSD is for a drive other than the OS drive, prepare it
   for use with Ceph, and mount it to the directory you just created. ::

      mkfs.xfs /dev/{DEV}
      mount /dev/{DEV} /var/lib/ceph/osd/ceph-$ID

#. Write the secret to the OSD keyring file. ::

      ceph-authtool --create-keyring /var/lib/ceph/osd/ceph-$ID/keyring \
           --name osd.$ID --add-key $OSD_SECRET

#. Initialize the OSD data directory. ::

      ceph-osd -i $ID --mkfs --osd-uuid $UUID

#. Fix ownership. ::

      chown -R ceph:ceph /var/lib/ceph/osd/ceph-$ID

#. The new OSD is now part of the cluster's configuration, but it is not yet
   running. You must start your new OSD before it can begin receiving data.

   For modern systemd distributions::

      systemctl enable ceph-osd@$ID
      systemctl start ceph-osd@$ID

   For example::

      systemctl enable ceph-osd@12
      systemctl start ceph-osd@12


Adding MDS
==========

In the instructions below, ``{id}`` is an arbitrary name, such as the hostname of the machine.

#. Create the mds data directory. ::

      mkdir -p /var/lib/ceph/mds/{cluster-name}-{id}

#. Create a keyring. ::

      ceph-authtool --create-keyring /var/lib/ceph/mds/{cluster-name}-{id}/keyring --gen-key -n mds.{id}

#. Import the keyring and set caps. ::

      ceph auth add mds.{id} osd "allow rwx" mds "allow *" mon "allow profile mds" -i /var/lib/ceph/mds/{cluster}-{id}/keyring

#. Add to ceph.conf. ::

      [mds.{id}]
      host = {id}

#. Start the daemon the manual way. ::

      ceph-mds --cluster {cluster-name} -i {id} -m {mon-hostname}:{mon-port} [-f]

#. Start the daemon the right way (using the ceph.conf entry). ::

      service ceph start

#. If starting the daemon fails with this error::

      mds.-1.0 ERROR: failed to authenticate: (22) Invalid argument

   then make sure that you do not have a ``keyring`` setting in the ``[global]``
   section of ceph.conf; move it to the ``[client]`` section, or add a keyring
   setting specific to this MDS daemon. Also verify that the key in the MDS data
   directory matches the key shown by ``ceph auth get mds.{id}``.

#. Now you are ready to `create a Ceph file system`_.
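
   As a preview of that step, creating a file system typically amounts to
   creating a data pool and a metadata pool and then tying them together
   (the pool names ``cephfs_data`` and ``cephfs_metadata`` are just
   conventional examples; see the linked page for details)::

      ceph osd pool create cephfs_data
      ceph osd pool create cephfs_metadata
      ceph fs new cephfs cephfs_metadata cephfs_data
      ceph fs ls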

Manually Installing RADOSGW
===========================

For a more involved discussion of the procedure presented here, see `this
thread on the ceph-users mailing list
<https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/LB3YRIKAPOHXYCW7MKLVUJPYWYRQVARU/>`_.

#. Install ``radosgw`` packages on the nodes that will be the RGW nodes.

#. From a monitor or from a node with admin privileges, run a command of the
   following form:

   .. prompt:: bash #

      ceph auth get-or-create client.short-hostname-of-rgw mon 'allow rw' osd 'allow rwx'

#. On one of the RGW nodes, do the following:

   a. Create a directory owned by the ``ceph`` user. For example:

      .. prompt:: bash #

         install -d -o ceph -g ceph /var/lib/ceph/radosgw/ceph-$(hostname -s)

   b. Enter the directory just created and create a ``keyring`` file:

      .. prompt:: bash #

         touch /var/lib/ceph/radosgw/ceph-$(hostname -s)/keyring

      Use a command similar to this one to put the key from the earlier ``ceph
      auth get-or-create`` step in the ``keyring`` file. Use your preferred
      editor:

      .. prompt:: bash #

         $EDITOR /var/lib/ceph/radosgw/ceph-$(hostname -s)/keyring

   c. Repeat these steps on every RGW node.

#. Start the RADOSGW service by running the following command:

   .. prompt:: bash #

      systemctl start ceph-radosgw@$(hostname -s).service
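
   If the gateway started successfully, it should answer anonymous S3 requests
   on the default frontend port (assumed here to be ``7480``) with an empty
   bucket listing:

   .. prompt:: bash #

      curl http://$(hostname -s):7480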


Summary
=======

Once you have your monitor and two OSDs up and running, you can watch the
placement groups peer by executing the following::

   ceph -w

To view the tree, execute the following::

   ceph osd tree

You should see output that looks something like this::

   # id    weight  type name         up/down reweight
   -1      2       root default
   -2      2           host osd-node1
   0       1               osd.0     up      1
   -3      1           host osd-node2
   1       1               osd.1     up      1

To add (or remove) additional monitors, see `Add/Remove Monitors`_.
To add (or remove) additional Ceph OSD Daemons, see `Add/Remove OSDs`_.


.. _Add/Remove Monitors: ../../rados/operations/add-or-rm-mons
.. _Add/Remove OSDs: ../../rados/operations/add-or-rm-osds
.. _Network Configuration Reference: ../../rados/configuration/network-config-ref
.. _Monitor Config Reference - Data: ../../rados/configuration/mon-config-ref#data
.. _create a Ceph file system: ../../cephfs/createfs