.. _cephfs_add_remote_mds:

.. note::
   It is highly recommended to use :doc:`/cephadm/index` or another Ceph
   orchestrator for setting up the Ceph cluster. Use this approach only if you
   are setting up the Ceph cluster manually. If you still intend to deploy MDS
   daemons manually, :doc:`/cephadm/services/mds/` can also be used.

==========================
Deploying Metadata Servers
==========================

Each CephFS file system requires at least one MDS. The cluster operator will
generally use their automated deployment tool to launch required MDS servers
as needed. Rook and Ansible (via the ceph-ansible playbooks) are recommended
tools for doing this. For clarity, we also show the systemd commands here,
which may be run by the deployment technology if executed on bare metal.

See `MDS Config Reference`_ for details on configuring metadata servers.

Provisioning Hardware for an MDS
================================

The present version of the MDS is single-threaded and CPU-bound for most
activities, including responding to client requests. An MDS under the most
aggressive client loads uses about 2 to 3 CPU cores. This is due to the other
miscellaneous upkeep threads working in tandem.

Even so, it is recommended that an MDS server be well provisioned with an
advanced CPU with sufficient cores. Development is ongoing to make better use
of available CPU cores in the MDS; it is expected that in future versions of
Ceph the MDS server will improve performance by taking advantage of more
cores.

The other dimension to MDS performance is the available RAM for caching. The
MDS necessarily manages a distributed and cooperative metadata cache among all
clients and other active MDSs. Therefore it is essential to provide the MDS
with sufficient RAM to enable faster metadata access and mutation. The default
MDS cache size (see also :doc:`/cephfs/cache-configuration`) is 4GB. It is
recommended to provision at least 8GB of RAM for the MDS to support this cache
size.
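
The cache size is controlled by the ``mds_cache_memory_limit`` option. As an
illustration, it could be raised to 8GB like so (the value is in bytes and is
only an example, not a sizing recommendation): ::

    $ ceph config set mds mds_cache_memory_limit 8589934592
    $ ceph config get mds mds_cache_memory_limit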

Generally, an MDS serving a large cluster of clients (1000 or more) will use
at least 64GB of cache. An MDS with a larger cache is not well explored in the
largest known community clusters; there may be diminishing returns where
management of such a large cache negatively impacts performance in surprising
ways. It would be best to do analysis with expected workloads to determine if
provisioning more RAM is worthwhile.

In a bare-metal cluster, the best practice is to over-provision hardware for
the MDS server. Even if a single MDS daemon is unable to fully utilize the
hardware, it may be desirable later on to start more active MDS daemons on the
same node to fully utilize the available cores and memory. Additionally, it
may become clear with workloads on the cluster that performance improves with
multiple active MDS daemons on the same node rather than over-provisioning a
single MDS.

Finally, be aware that CephFS is a highly-available file system: it supports
standby MDS daemons (see also :ref:`mds-standby`) for rapid failover. To get a
real benefit from deploying standbys, it is usually necessary to distribute
MDS daemons across at least two nodes in the cluster. Otherwise, a hardware
failure on a single node may result in the file system becoming unavailable.
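
Whether standby daemons are actually available can be checked at any time from
the cluster status; for example: ::

    $ ceph fs status
    $ ceph status

Both commands report the active ranks and the number of ``up:standby`` daemons
in the MDS map.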

Co-locating the MDS with other Ceph daemons (hyperconverged) is an effective
and recommended way to accomplish this, so long as all daemons are configured
to use available hardware within certain limits. For the MDS, this generally
means limiting its cache size.

Adding an MDS
=============

#. Create an mds directory ``/var/lib/ceph/mds/ceph-${id}``. The daemon only
   uses this directory to store its keyring.

#. Create the authentication key, if you use CephX: ::

       $ sudo ceph auth get-or-create mds.${id} mon 'profile mds' mgr 'profile mds' mds 'allow *' osd 'allow *' > /var/lib/ceph/mds/ceph-${id}/keyring

#. Start the service: ::

       $ sudo systemctl start ceph-mds@${id}

#. The status of the cluster should show: ::

       mds: ${id}:1 {0=${id}=up:active} 2 up:standby

#. Optionally, configure the file system the MDS should join
   (:ref:`mds-join-fs`): ::

       $ ceph config set mds.${id} mds_join_fs ${fs}


Removing an MDS
===============

If you have a metadata server in your cluster that you'd like to remove, you
may use the following method.

#. (Optionally:) Create a new replacement Metadata Server. If there is no
   replacement MDS to take over once the MDS is removed, the file system will
   become unavailable to clients. If that is not desirable, consider adding a
   metadata server before tearing down the metadata server you would like to
   take offline.

#. Stop the MDS to be removed: ::

       $ sudo systemctl stop ceph-mds@${id}

   The MDS will automatically notify the Ceph monitors that it is going down.
   This enables the monitors to perform instantaneous failover to an available
   standby, if one exists. It is unnecessary to use administrative commands to
   effect this failover, e.g. through the use of ``ceph mds fail mds.${id}``.

#. Remove the ``/var/lib/ceph/mds/ceph-${id}`` directory on the MDS: ::

       $ sudo rm -rf /var/lib/ceph/mds/ceph-${id}
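
Afterwards, confirm that a standby (if one was available) has taken over the
active rank and that the removed daemon is gone from the MDS map: ::

    $ ceph fs status

If the daemon will not be redeployed under the same name, its CephX key can
also be removed: ::

    $ ceph auth rm mds.${id}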

.. _MDS Config Reference: ../mds-config-ref