.. _cephfs_add_remote_mds:

.. note::
   It is highly recommended to use :doc:`/cephadm/index` or another Ceph
   orchestrator to set up the Ceph cluster. Use the approach described here
   only if you are setting up the Ceph cluster manually. If you still intend
   to deploy MDS daemons manually, :doc:`/cephadm/services/mds/` can also be
   used.

============================
 Deploying Metadata Servers
============================

Each CephFS file system requires at least one MDS. The cluster operator will
generally use an automated deployment tool to launch the required MDS daemons
as needed. Rook and Ansible (via the ceph-ansible playbooks) are recommended
tools for doing this. For clarity, we also show the systemd commands that the
deployment tool may run when deploying on bare metal.

See `MDS Config Reference`_ for details on configuring metadata servers.


Provisioning Hardware for an MDS
================================

The present version of the MDS is single-threaded and CPU-bound for most
activities, including responding to client requests. Even so, an MDS under the
most aggressive client loads uses about 2 to 3 CPU cores, because miscellaneous
upkeep threads run alongside the main thread.

It is therefore recommended that an MDS server be well provisioned with a
modern CPU with sufficient cores. Development is ongoing to make better use
of available CPU cores in the MDS; future versions of Ceph are expected to
improve MDS performance by taking advantage of more cores.

The other dimension to MDS performance is the RAM available for caching. The
MDS necessarily manages a distributed and cooperative metadata cache among all
clients and other active MDSs. Therefore it is essential to provide the MDS
with sufficient RAM to enable faster metadata access and mutation. The default
MDS cache size (see also :doc:`/cephfs/cache-configuration`) is 4GB. It is
recommended to provision at least 8GB of RAM for the MDS to support this cache
size.

Generally, an MDS serving a large cluster of clients (1000 or more) will use at
least 64GB of cache. Larger caches are not well explored, even in the largest
known community clusters; there may be diminishing returns where management of
such a large cache negatively impacts performance in surprising ways. It is
best to analyze the expected workloads to determine whether provisioning more
RAM is worthwhile.

In a bare-metal cluster, the best practice is to over-provision hardware for
the MDS server. Even if a single MDS daemon is unable to fully utilize the
hardware, it may be desirable later on to start more active MDS daemons on the
same node to fully utilize the available cores and memory. Additionally, it may
become clear with workloads on the cluster that performance improves with
multiple active MDS daemons on the same node rather than with a single
over-provisioned MDS.

Finally, be aware that CephFS achieves high availability by supporting standby
MDS daemons (see also :ref:`mds-standby`) for rapid failover. To get a real
benefit from deploying standbys, it is usually necessary to distribute MDS
daemons across at least two nodes in the cluster. Otherwise, a hardware failure
on a single node may result in the file system becoming unavailable.
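As a hedged illustration of the standby advice above, the sketch below prints
(rather than runs) a command that asks the monitors to raise a health warning
when fewer standbys than desired are available. The helper name
``standby_wanted_cmd`` and the file system name ``cephfs`` are assumptions
made for this example; ``ceph fs status`` can be used afterwards to review
which nodes host the active and standby daemons.

```shell
#!/bin/sh
# Hypothetical helper: print the command that sets the desired number of
# standby daemons ($2) for the file system named $1. Nothing is executed.
standby_wanted_cmd() {
    echo "ceph fs set $1 standby_count_wanted $2"
}

# "cephfs" is an assumed file system name; substitute your own.
standby_wanted_cmd cephfs 1
```

Printing the command first keeps the example side-effect free; run the printed
line manually once the file system name and count look right.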

Co-locating the MDS with other Ceph daemons (hyperconverged) is an effective
and recommended way to accomplish this, so long as all daemons are configured
to use the available hardware within certain limits. For the MDS, this
generally means limiting its cache size.
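A minimal sketch of limiting the cache size, assuming the limit is applied to
all MDS daemons through the ``mds_cache_memory_limit`` option, which takes a
value in bytes. The helper names ``gib_to_bytes`` and ``mds_cache_limit_cmd``
are hypothetical, and the 4 GiB figure simply mirrors the default mentioned
earlier. The script only prints the command so the value can be reviewed
before it is applied.

```shell
#!/bin/sh
# Convert a size in GiB to bytes.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Print the command that would set the cache limit for all MDS daemons.
# Nothing is executed against the cluster.
mds_cache_limit_cmd() {
    echo "ceph config set mds mds_cache_memory_limit $(gib_to_bytes "$1")"
}

mds_cache_limit_cmd 4
```

See :doc:`/cephfs/cache-configuration` for how the limit is interpreted before
choosing a value for a hyperconverged node.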


Adding an MDS
=============

#. Create an MDS data directory ``/var/lib/ceph/mds/ceph-${id}``. The daemon
   uses this directory only to store its keyring.

#. Create the authentication key, if you use CephX: ::

        $ sudo ceph auth get-or-create mds.${id} mon 'profile mds' mgr 'profile mds' mds 'allow *' osd 'allow *' > /var/lib/ceph/mds/ceph-${id}/keyring

#. Start the service: ::

        $ sudo systemctl start ceph-mds@${id}

#. The status of the cluster should show a line similar to: ::

        mds: ${id}:1 {0=${id}=up:active} 2 up:standby

#. Optionally, configure the file system the MDS should join (:ref:`mds-join-fs`): ::

        $ ceph config set mds.${id} mds_join_fs ${fs}
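The steps above can be sketched as a dry-run script that prints the commands
for a given MDS id instead of running them. The function name
``add_mds_plan``, the example id ``a``, the ``mkdir -p`` command for the first
step, and the closing ``ceph mds stat`` check are assumptions of this sketch;
the source does not give a command for creating the directory. Ownership and
permissions of the directory and keyring are left out and may need adjusting
for the user the daemon runs as.

```shell
#!/bin/sh
# Print the commands for adding an MDS with the id given as $1.
# This is a dry run: nothing is executed against the cluster.
add_mds_plan() {
    mds_id="$1"
    if [ -z "$mds_id" ]; then
        echo "usage: add_mds_plan <id>" >&2
        return 1
    fi
    mds_dir="/var/lib/ceph/mds/ceph-${mds_id}"

    # Step 1 (assumed command): create the directory that holds the keyring.
    echo "sudo mkdir -p ${mds_dir}"
    # Step 2: create the CephX key and store it in the keyring file.
    echo "sudo ceph auth get-or-create mds.${mds_id} mon 'profile mds' mgr 'profile mds' mds 'allow *' osd 'allow *' > ${mds_dir}/keyring"
    # Step 3: start the daemon.
    echo "sudo systemctl start ceph-mds@${mds_id}"
    # Step 4 (assumed command): show the MDS summary line.
    echo "ceph mds stat"
}

# "a" is a placeholder id.
add_mds_plan a
```

Reviewing the printed commands before running them makes it easy to catch a
wrong id or path early.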


Removing an MDS
===============

If you have a metadata server in your cluster that you would like to remove,
you may use the following method.

#. (Optional) Create a replacement metadata server. If no replacement MDS is
   available to take over once the MDS is removed, the file system will
   become unavailable to clients. If that is not desirable, consider adding a
   metadata server before tearing down the metadata server you would like to
   take offline.

#. Stop the MDS to be removed. ::

        $ sudo systemctl stop ceph-mds@${id}

   The MDS will automatically notify the Ceph monitors that it is going down.
   This enables the monitors to fail over immediately to an available
   standby, if one exists. It is unnecessary to use administrative commands
   such as ``ceph mds fail mds.${id}`` to effect this failover.

#. Remove the ``/var/lib/ceph/mds/ceph-${id}`` directory on the MDS. ::

        $ sudo rm -rf /var/lib/ceph/mds/ceph-${id}
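The removal steps can likewise be sketched as a dry run. The function name
``remove_mds_plan`` and the example id ``a`` are hypothetical, and the leading
``ceph fs status`` check is an assumption added here to confirm that a standby
exists before the daemon is stopped. The sketch refuses an empty id because
the final ``rm -rf`` path is built from it.

```shell
#!/bin/sh
# Print the commands for removing the MDS with the id given as $1.
# This is a dry run: nothing is stopped or deleted.
remove_mds_plan() {
    mds_id="$1"
    # Guard: an empty id would produce a truncated path for rm -rf.
    if [ -z "$mds_id" ]; then
        echo "usage: remove_mds_plan <id>" >&2
        return 1
    fi

    # Assumed pre-check: confirm that a standby is available to take over.
    echo "ceph fs status"
    # Stop the daemon; the monitors are notified automatically.
    echo "sudo systemctl stop ceph-mds@${mds_id}"
    # Remove the directory that held the keyring.
    echo "sudo rm -rf /var/lib/ceph/mds/ceph-${mds_id}"
}

# "a" is a placeholder id.
remove_mds_plan a
```

Run the printed commands on the node that hosts the MDS being removed.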

.. _MDS Config Reference: ../mds-config-ref