==========================
 Adding/Removing Monitors
==========================

When you have a cluster up and running, you may add or remove monitors
from the cluster at runtime. To bootstrap a monitor, see `Manual Deployment`_
or `Monitor Bootstrap`_.


Adding Monitors
===============

Ceph monitors are lightweight processes that maintain a master copy of the
cluster map. You can run a cluster with one monitor, but we recommend at least
three monitors for a production cluster. Ceph monitors use a variation of the
`Paxos`_ protocol to establish consensus about maps and other critical
information across the cluster. Due to the nature of Paxos, Ceph requires
a majority of monitors to be running in order to establish a quorum (and thus
to reach consensus).

Running an odd number of monitors is advisable but not mandatory, because
adding a monitor to reach an even number does not increase the number of
failures the cluster can tolerate. For instance, a two-monitor deployment
cannot tolerate any failure and still maintain a quorum; with three monitors,
one failure can be tolerated; with four monitors, still only one failure can
be tolerated; with five monitors, two failures can be tolerated. In summary,
Ceph needs a majority of monitors to be running (and able to communicate with
each other), and that majority can be achieved with a single monitor, 2 out
of 2 monitors, 2 out of 3, 3 out of 4, and so on.

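The majority rule behind these figures can be sketched in shell. This is a
minimal sketch, not part of Ceph: the helper names ``quorum_size`` and
``tolerated_failures`` are made up for illustration, and the script only
performs the arithmetic described above. ::

    #!/bin/sh
    # Hypothetical helpers (not Ceph commands): majority arithmetic only.

    # Smallest number of monitors that forms a majority of $1 monitors.
    quorum_size() {
        echo $(( $1 / 2 + 1 ))
    }

    # Number of monitors that may fail while a majority still remains.
    tolerated_failures() {
        echo $(( ($1 - 1) / 2 ))
    }

    # Print one row for each monitor count from 1 to 5.
    for n in 1 2 3 4 5; do
        echo "monitors=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
    done

For four monitors the loop prints ``monitors=4 quorum=3 tolerated_failures=1``,
the same tolerance as with three monitors.
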
For an initial deployment of a multi-node Ceph cluster, it is advisable to
deploy three monitors, and to increase the number two at a time if there is a
valid need for more than three.

Since monitors are lightweight, it is possible to run them on the same
host as an OSD; however, we recommend running them on separate hosts,
because fsync issues with the kernel may impair performance.

.. note:: A *majority* of monitors in your cluster must be able to
   reach each other in order to establish a quorum.

If you are adding a new host when adding a new monitor, see `Hardware
Recommendations`_ for details on minimum recommendations for monitor hardware.
To add a monitor host to your cluster, first make sure you have an up-to-date
version of Linux installed (typically Ubuntu 14.04 or RHEL 7).

Add your monitor host to a rack in your cluster, connect it to the network
and ensure that it has network connectivity.

.. _Hardware Recommendations: ../../../start/hardware-recommendations


Install the Required Software
-----------------------------

For manually deployed clusters, you must install Ceph packages
manually. See `Installing Packages`_ for details.
You should configure SSH to a user with password-less authentication.

.. _Installing Packages: ../../../install/install-storage-cluster


.. _Adding a Monitor (Manual):

Adding a Monitor (Manual)
-------------------------

This procedure creates a ``ceph-mon`` data directory, retrieves the monitor map
and monitor keyring, and adds a ``ceph-mon`` daemon to your cluster. If
this results in only two monitor daemons, you may add more monitors by
repeating this procedure until you have a sufficient number of ``ceph-mon``
daemons to achieve a quorum.

At this point you should define your monitor's id. Traditionally, monitors
have been named with single letters (``a``, ``b``, ``c``, ...), but you are
free to define the id as you see fit. In this document, ``{mon-id}`` is the
id you chose, without the ``mon.`` prefix (i.e., ``{mon-id}`` is the ``a``
in ``mon.a``).

#. Create the default directory on the machine that will host your
   new monitor. ::

      sudo mkdir /var/lib/ceph/mon/ceph-{mon-id}

#. Create a temporary directory ``{tmp}`` to keep the files needed during
   this process (for example, with ``mkdir {tmp}``). This directory should be
   different from the monitor's default directory created in the previous
   step, and can be removed after all the steps are executed.

#. Retrieve the keyring for your monitors, where ``{tmp}`` is the directory
   that will hold the retrieved keyring, and ``{key-filename}`` is the name of
   the file containing the retrieved monitor key. ::

      ceph auth get mon. -o {tmp}/{key-filename}

#. Retrieve the monitor map, where ``{tmp}`` is the directory that will hold
   the retrieved monitor map, and ``{map-filename}`` is the name of the file
   containing the retrieved monitor map. ::

      ceph mon getmap -o {tmp}/{map-filename}

#. Prepare the monitor's data directory created in the first step. You must
   specify the path to the monitor map so that you can retrieve the
   information about a quorum of monitors and their ``fsid``. You must also
   specify a path to the monitor keyring::

      sudo ceph-mon -i {mon-id} --mkfs --monmap {tmp}/{map-filename} --keyring {tmp}/{key-filename}

#. Start the new monitor; it will automatically join the cluster.
   The daemon needs to know which address to bind to, either via
   ``--public-addr {ip:port}`` or by setting ``mon addr`` in the
   appropriate section of ``ceph.conf``. For example::

      ceph-mon -i {mon-id} --public-addr {ip:port}

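The order of the steps above can be reviewed as a single listing before
anything is run. The following is a dry-run sketch under stated assumptions:
``print_add_monitor_plan`` is a hypothetical helper, the file names
``keyring`` and ``monmap`` and the example values in the last line are
placeholders, and the function only prints the commands from this procedure
instead of executing them. ::

    #!/bin/sh
    # Dry-run sketch: prints the commands, does not run them.
    # Usage: print_add_monitor_plan MON_ID TMP_DIR IP:PORT
    print_add_monitor_plan() {
        mon_id=$1     # id without the "mon." prefix, e.g. "d"
        tmp_dir=$2    # scratch directory for the keyring and monmap
        addr=$3       # ip:port the new monitor should bind to

        echo "sudo mkdir /var/lib/ceph/mon/ceph-${mon_id}"
        echo "mkdir ${tmp_dir}"
        echo "ceph auth get mon. -o ${tmp_dir}/keyring"
        echo "ceph mon getmap -o ${tmp_dir}/monmap"
        echo "sudo ceph-mon -i ${mon_id} --mkfs --monmap ${tmp_dir}/monmap --keyring ${tmp_dir}/keyring"
        echo "ceph-mon -i ${mon_id} --public-addr ${addr}"
    }

    # Example values only; substitute your own id, directory and address.
    print_add_monitor_plan d /tmp/add-mon 10.0.0.4:6789

Printing the plan first makes it easy to check that the same ``{tmp}`` paths
are used for retrieving the files and for ``--mkfs``.
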

Removing Monitors
=================

When you remove monitors from a cluster, consider that Ceph monitors use
Paxos to establish consensus about the master cluster map. You must have
a sufficient number of monitors to establish a quorum for consensus about
the cluster map.


.. _Removing a Monitor (Manual):

Removing a Monitor (Manual)
---------------------------

This procedure removes a ``ceph-mon`` daemon from your cluster. If this
procedure results in only two monitor daemons, you may add or remove another
monitor until you have a number of ``ceph-mon`` daemons that can achieve a
quorum.

#. Stop the monitor. ::

      service ceph -a stop mon.{mon-id}

#. Remove the monitor from the cluster. ::

      ceph mon remove {mon-id}

#. Remove the monitor entry from ``ceph.conf``.


Removing Monitors from an Unhealthy Cluster
-------------------------------------------

This procedure removes a ``ceph-mon`` daemon from an unhealthy
cluster, for example a cluster where the monitors cannot form a
quorum.

#. Stop all ``ceph-mon`` daemons on all monitor hosts. ::

      service ceph stop mon || stop ceph-mon-all
      # repeat on every monitor host

#. Identify a surviving monitor and log in to that host.

#. Extract a copy of the monmap file. ::

      ceph-mon -i {mon-id} --extract-monmap {map-path}
      # in most cases, that's
      ceph-mon -i `hostname` --extract-monmap /tmp/monmap

#. Remove the non-surviving or problematic monitors from the extracted map.
   For example, if you have three monitors, ``mon.a``, ``mon.b``, and
   ``mon.c``, where only ``mon.a`` will survive, follow the example below::

      monmaptool {map-path} --rm {mon-id}
      # for example,
      monmaptool /tmp/monmap --rm b
      monmaptool /tmp/monmap --rm c

#. Inject the edited map, from which the unwanted monitors have been removed,
   into the surviving monitor(s). For example, to inject a map into monitor
   ``mon.a``, follow the example below::

      ceph-mon -i {mon-id} --inject-monmap {map-path}
      # for example,
      ceph-mon -i a --inject-monmap /tmp/monmap

#. Start only the surviving monitors.

#. Verify that the monitors form a quorum (``ceph -s``).

#. You may wish to archive the removed monitors' data directories in
   ``/var/lib/ceph/mon`` in a safe location, or delete them if you are
   confident that the remaining monitors are healthy and sufficiently
   redundant.

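The extract, remove and inject steps above can be laid out for one surviving
monitor as follows. This is a dry-run sketch, not a Ceph tool:
``print_monmap_surgery`` is a hypothetical helper that only prints the
commands already shown in this procedure, and it assumes that every
``ceph-mon`` daemon has been stopped before the printed commands are run. ::

    #!/bin/sh
    # Dry-run sketch: prints the commands, does not run them.
    # Usage: print_monmap_surgery SURVIVOR_ID MAP_PATH REMOVED_ID...
    print_monmap_surgery() {
        survivor=$1    # id of the monitor that will be kept, e.g. "a"
        map_path=$2    # where the extracted monmap is written
        shift 2        # the remaining arguments are the ids to remove

        echo "ceph-mon -i ${survivor} --extract-monmap ${map_path}"
        for removed in "$@"; do
            echo "monmaptool ${map_path} --rm ${removed}"
        done
        echo "ceph-mon -i ${survivor} --inject-monmap ${map_path}"
    }

    # Example from the text: mon.a survives, mon.b and mon.c are removed.
    print_monmap_surgery a /tmp/monmap b c

With the example arguments the sketch prints four commands: one extract, one
``--rm`` for each of ``b`` and ``c``, and one inject.
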

.. _Changing a Monitor's IP address:

Changing a Monitor's IP Address
===============================

.. important:: Existing monitors are not supposed to change their IP addresses.

Monitors are critical components of a Ceph cluster, and they need to maintain a
quorum for the whole system to work properly. To establish a quorum, the
monitors need to discover each other. Ceph has strict requirements for
discovering monitors.

Ceph clients and other Ceph daemons use ``ceph.conf`` to discover monitors.
However, monitors discover each other using the monitor map, not ``ceph.conf``.
For example, if you refer to `Adding a Monitor (Manual)`_ you will see that you
need to obtain the current monmap for the cluster when creating a new monitor,
as it is one of the required arguments of ``ceph-mon -i {mon-id} --mkfs``. The
following sections explain the consistency requirements for Ceph monitors, and a
few safe ways to change a monitor's IP address.


Consistency Requirements
------------------------

A monitor always refers to the local copy of the monmap when discovering other
monitors in the cluster. Using the monmap instead of ``ceph.conf`` avoids
errors that could break the cluster (e.g., typos in ``ceph.conf`` when
specifying a monitor address or port). Since monitors use monmaps for discovery
and they share monmaps with clients and other Ceph daemons, the monmap provides
monitors with a strict guarantee that their consensus is valid.

Strict consistency also applies to updates to the monmap. As with any other
updates on the monitor, changes to the monmap always run through a distributed
consensus algorithm called `Paxos`_. The monitors must agree on each update to
the monmap, such as adding or removing a monitor, to ensure that each monitor in
the quorum has the same version of the monmap. Updates to the monmap are
incremental so that monitors have the latest agreed-upon version and a set of
previous versions, allowing a monitor that has an older version of the monmap to
catch up with the current state of the cluster.

If monitors discovered each other through the Ceph configuration file instead of
through the monmap, it would introduce additional risks because the Ceph
configuration files aren't updated and distributed automatically. Monitors
might inadvertently use an older ``ceph.conf`` file, fail to recognize a
monitor, fall out of a quorum, or develop a situation where `Paxos`_ isn't able
to determine the current state of the system accurately. Consequently, making
changes to an existing monitor's IP address must be done with great care.


Changing a Monitor's IP address (The Right Way)
-----------------------------------------------

Changing a monitor's IP address in ``ceph.conf`` alone is not sufficient to
ensure that other monitors in the cluster will receive the update. To change a
monitor's IP address, you must add a new monitor with the IP address you want
to use (as described in `Adding a Monitor (Manual)`_), ensure that the new
monitor successfully joins the quorum, and then remove the monitor that uses
the old IP address. Finally, update the ``ceph.conf`` file to ensure that
clients and other daemons know the IP address of the new monitor.

For example, let's assume there are three monitors in place: ``mon.a`` at
``10.0.0.1:6789``, ``mon.b`` at ``10.0.0.2:6789``, and ``mon.c`` at
``10.0.0.3:6789``.

To change ``mon.c`` to ``host04`` with the IP address ``10.0.0.4``, follow the
steps in `Adding a Monitor (Manual)`_ by adding a new monitor ``mon.d``. Ensure
that ``mon.d`` is running before removing ``mon.c``, or it will break the
quorum. Remove ``mon.c`` as described in `Removing a Monitor (Manual)`_. Moving
all three monitors would thus require repeating this process as many times as
needed.


Changing a Monitor's IP address (The Messy Way)
-----------------------------------------------

There may come a time when the monitors must be moved to a different network, a
different part of the datacenter or a different datacenter altogether. While
this is possible, the process is more hazardous.

In such a case, the solution is to generate a new monmap with updated IP
addresses for all the monitors in the cluster, and inject the new map on each
individual monitor. This is not the most user-friendly approach, but we do not
expect this to be something that needs to be done every other week. As stated
at the top of this section, monitors are not supposed to change IP addresses.

Using the previous monitor configuration as an example, assume you want to move
all the monitors from the ``10.0.0.x`` range to ``10.1.0.x``, and these
networks are unable to communicate. Use the following procedure:

#. Retrieve the monitor map, where ``{tmp}`` is the directory that will hold
   the retrieved monitor map, and ``{filename}`` is the name of the file
   containing the retrieved monitor map. ::

      ceph mon getmap -o {tmp}/{filename}

#. The following example demonstrates the contents of the monmap. ::

      $ monmaptool --print {tmp}/{filename}

      monmaptool: monmap file {tmp}/{filename}
      fsid 224e376d-c5fe-4504-96bb-ea6332a19e61
      last_changed 2012-12-17 02:46:41.591248
      created 2012-12-17 02:46:41.591248
      0: 10.0.0.1:6789/0 mon.a
      1: 10.0.0.2:6789/0 mon.b
      2: 10.0.0.3:6789/0 mon.c

#. Remove the existing monitors. ::

      $ monmaptool --rm a --rm b --rm c {tmp}/{filename}

      monmaptool: monmap file {tmp}/{filename}
      monmaptool: removing a
      monmaptool: removing b
      monmaptool: removing c
      monmaptool: writing epoch 1 to {tmp}/{filename} (0 monitors)

#. Add the new monitor locations. ::

      $ monmaptool --add a 10.1.0.1:6789 --add b 10.1.0.2:6789 --add c 10.1.0.3:6789 {tmp}/{filename}

      monmaptool: monmap file {tmp}/{filename}
      monmaptool: writing epoch 1 to {tmp}/{filename} (3 monitors)

#. Check the new contents. ::

      $ monmaptool --print {tmp}/{filename}

      monmaptool: monmap file {tmp}/{filename}
      fsid 224e376d-c5fe-4504-96bb-ea6332a19e61
      last_changed 2012-12-17 02:46:41.591248
      created 2012-12-17 02:46:41.591248
      0: 10.1.0.1:6789/0 mon.a
      1: 10.1.0.2:6789/0 mon.b
      2: 10.1.0.3:6789/0 mon.c

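The remove and add arguments used above can also be derived from the
``monmaptool --print`` listing. The following is a rough sketch under stated
assumptions: ``print_readdress_plan`` is a hypothetical helper, the prefixes
``10.0.0.`` and ``10.1.0.`` and the path ``/tmp/monmap`` are example values,
and the script only prints ``monmaptool`` commands for review instead of
running them. ::

    #!/bin/sh
    # Dry-run sketch: reads "monmaptool --print" output on stdin and prints
    # one --rm and one --add command per monitor, with the address prefix
    # rewritten.
    # Usage: print_readdress_plan OLD_PREFIX NEW_PREFIX MAP_PATH
    print_readdress_plan() {
        awk -v from="$1" -v to="$2" -v map="$3" '
            # Monitor lines look like: 0: 10.0.0.1:6789/0 mon.a
            /^[0-9]+: / {
                id = $3
                sub(/^mon\./, "", id)        # "mon.a" becomes "a"
                addr = $2
                sub(/\/[0-9]+$/, "", addr)   # drop the trailing "/0"
                if (index(addr, from) == 1) {
                    addr = to substr(addr, length(from) + 1)
                }
                print "monmaptool --rm " id " " map
                print "monmaptool --add " id " " addr " " map
            }
        '
    }

    # Example input copied from the listing above.
    printf '%s\n' \
        '0: 10.0.0.1:6789/0 mon.a' \
        '1: 10.0.0.2:6789/0 mon.b' \
        '2: 10.0.0.3:6789/0 mon.c' |
        print_readdress_plan 10.0.0. 10.1.0. /tmp/monmap

For the example input, the first two printed lines are
``monmaptool --rm a /tmp/monmap`` and
``monmaptool --add a 10.1.0.1:6789 /tmp/monmap``. Lines that do not describe
a monitor, such as ``fsid``, are ignored.
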
At this point, we assume the monitors (and stores) are installed at the new
location. The next step is to propagate the modified monmap to the new
monitors, and inject the modified monmap into each new monitor.

#. First, make sure to stop all your monitors. Injection must be done while
   the daemon is not running.

#. Inject the monmap. ::

      ceph-mon -i {mon-id} --inject-monmap {tmp}/{filename}

#. Restart the monitors.

After this step, migration to the new location is complete and
the monitors should operate successfully.


.. _Manual Deployment: ../../../install/manual-deployment
.. _Monitor Bootstrap: ../../../dev/mon-bootstrap
.. _Paxos: http://en.wikipedia.org/wiki/Paxos_(computer_science)