==========================
 Adding/Removing Monitors
==========================

When you have a cluster up and running, you may add or remove monitors
from the cluster at runtime. To bootstrap a monitor, see `Manual Deployment`_
or `Monitor Bootstrap`_.

Adding Monitors
===============

Ceph monitors are lightweight processes that maintain a master copy of the
cluster map. You can run a cluster with a single monitor, but we recommend at
least three monitors for a production cluster. Ceph monitors use a variation of
the `Paxos`_ protocol to establish consensus about maps and other critical
information across the cluster. Due to the nature of Paxos, Ceph requires
a majority of monitors to be running in order to establish a quorum (and thus
reach consensus).

It is advisable, but not mandatory, to run an odd number of monitors. Adding
one monitor to an odd-sized set does not increase the number of failures the
cluster can tolerate. For instance, a 2-monitor deployment cannot tolerate any
failure and still maintain a quorum; with 3 monitors, one failure can be
tolerated; with 4 monitors, still only one failure can be tolerated; with 5
monitors, two failures can be tolerated. This is why an odd number is
advisable. In summary, Ceph needs a majority of monitors to be running (and
able to communicate with each other), and that majority can be achieved using
a single monitor, or 2 out of 2 monitors, 2 out of 3, 3 out of 4, etc.

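The arithmetic behind these figures can be checked with a few lines of shell.
This is only an illustrative sketch: the function names ``quorum_size`` and
``tolerated_failures`` are made up for this example and are not part of Ceph.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names, not part of Ceph).

# Smallest number of monitors that forms a majority of $1 monitors.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of monitors that can fail while a majority is still running.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

# Print one row per cluster size, from 1 to 5 monitors.
for n in 1 2 3 4 5; do
    echo "monitors=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Going from 3 to 4 monitors raises the quorum size from 2 to 3 while the number
of tolerated failures stays at 1, which is the reason for preferring odd counts.
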
For an initial deployment of a multi-node Ceph cluster, it is advisable to
deploy three monitors, increasing the number two at a time if a valid need
for more than three exists.

Since monitors are light-weight, it is possible to run them on the same
host as an OSD; however, we recommend running them on separate hosts,
because fsync issues with the kernel may impair performance.

.. note:: A *majority* of monitors in your cluster must be able to
   reach each other in order to establish a quorum.

Deploy your Hardware
--------------------

If you are adding a new host when adding a new monitor, see `Hardware
Recommendations`_ for details on minimum recommendations for monitor hardware.
To add a monitor host to your cluster, first make sure you have an up-to-date
version of Linux installed (typically Ubuntu 14.04 or RHEL 7).

Add your monitor host to a rack in your cluster, connect it to the network
and ensure that it has network connectivity.

.. _Hardware Recommendations: ../../../start/hardware-recommendations

Install the Required Software
-----------------------------

For manually deployed clusters, you must install Ceph packages
manually. See `Installing Packages`_ for details.
You should configure SSH to a user with password-less authentication
and root permissions.

.. _Installing Packages: ../../../install/install-storage-cluster


.. _Adding a Monitor (Manual):

Adding a Monitor (Manual)
-------------------------

This procedure creates a ``ceph-mon`` data directory, retrieves the monitor map
and monitor keyring, and adds a ``ceph-mon`` daemon to your cluster. If
this results in only two monitor daemons, you may add more monitors by
repeating this procedure until you have a sufficient number of ``ceph-mon``
daemons to achieve a quorum.

At this point you should define your monitor's id. Traditionally, monitors
have been named with single letters (``a``, ``b``, ``c``, ...), but you are
free to define the id as you see fit. Throughout this document, ``{mon-id}``
is the id you chose, without the ``mon.`` prefix (i.e., ``{mon-id}`` is the
``a`` in ``mon.a``).

#. Create the default directory on the machine that will host your
   new monitor. ::

      ssh {new-mon-host}
      sudo mkdir /var/lib/ceph/mon/ceph-{mon-id}

#. Create a temporary directory ``{tmp}`` to keep the files needed during
   this process. This directory should be different from the monitor's default
   directory created in the previous step, and can be removed after all the
   steps are executed. ::

      mkdir {tmp}

#. Retrieve the keyring for your monitors, where ``{tmp}`` is the directory
   that will hold the retrieved keyring, and ``{key-filename}`` is the name of
   the file containing the retrieved monitor key. ::

      ceph auth get mon. -o {tmp}/{key-filename}

#. Retrieve the monitor map, where ``{tmp}`` is the directory that will hold
   the retrieved monitor map, and ``{map-filename}`` is the name of the file
   containing the retrieved monitor map. ::

      ceph mon getmap -o {tmp}/{map-filename}

#. Prepare the monitor's data directory created in the first step. You must
   specify the path to the monitor map so that the new monitor can retrieve
   the information about a quorum of monitors and their ``fsid``. You must
   also specify a path to the monitor keyring::

      sudo ceph-mon -i {mon-id} --mkfs --monmap {tmp}/{map-filename} --keyring {tmp}/{key-filename}


#. Start the new monitor and it will automatically join the cluster.
   The daemon needs to know which address to bind to, either via
   ``--public-addr {ip:port}`` or by setting ``mon addr`` in the
   appropriate section of ``ceph.conf``. For example::

      ceph-mon -i {mon-id} --public-addr {ip:port}


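The steps above can be strung together in a small script. The following is a
minimal sketch, not an official tool: the ``run`` and ``add_monitor`` helpers,
the ``DRY_RUN`` variable, and the file names ``keyring`` and ``monmap`` (standing
in for ``{key-filename}`` and ``{map-filename}``) are assumptions made for this
example. It is meant to be run on the new monitor host.

```shell
#!/bin/sh
# Sketch of the manual "add a monitor" procedure.
# run, add_monitor and DRY_RUN are hypothetical names, not part of Ceph.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# add_monitor MON_ID TMP_DIR PUBLIC_ADDR
# Stops at the first command that fails.
add_monitor() {
    mon_id=$1 tmp=$2 addr=$3
    run sudo mkdir "/var/lib/ceph/mon/ceph-$mon_id" &&
    run mkdir "$tmp" &&
    run ceph auth get mon. -o "$tmp/keyring" &&
    run ceph mon getmap -o "$tmp/monmap" &&
    run sudo ceph-mon -i "$mon_id" --mkfs --monmap "$tmp/monmap" --keyring "$tmp/keyring" &&
    run ceph-mon -i "$mon_id" --public-addr "$addr"
}
```

Calling ``DRY_RUN=1 add_monitor d /tmp/mon-d 10.0.0.4:6789`` only prints the
six commands, which is a convenient way to review them before running anything
against a live cluster.
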
Removing Monitors
=================

When you remove monitors from a cluster, consider that Ceph monitors use
`Paxos`_ to establish consensus about the master cluster map. You must have
a sufficient number of monitors to establish a quorum for consensus about
the cluster map.

.. _Removing a Monitor (Manual):

Removing a Monitor (Manual)
---------------------------

This procedure removes a ``ceph-mon`` daemon from your cluster. If this
procedure results in only two monitor daemons, you may add or remove another
monitor until you have a number of ``ceph-mon`` daemons that can achieve a
quorum.

#. Stop the monitor. ::

      service ceph -a stop mon.{mon-id}

#. Remove the monitor from the cluster. ::

      ceph mon remove {mon-id}

#. Remove the monitor entry from ``ceph.conf``.


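As a rough sketch, the first two steps can be wrapped in a helper that also
reports the monitor state afterwards. The ``run`` and ``remove_monitor`` names
and the ``DRY_RUN`` variable are hypothetical, and the trailing ``ceph mon stat``
call is an addition to the procedure above, included only to confirm the
result. Editing ``ceph.conf`` is still a manual step.

```shell
#!/bin/sh
# Sketch of the manual "remove a monitor" procedure.
# run, remove_monitor and DRY_RUN are hypothetical names, not part of Ceph.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# remove_monitor MON_ID
remove_monitor() {
    mon_id=$1
    run service ceph -a stop "mon.$mon_id" &&
    run ceph mon remove "$mon_id" &&
    run ceph mon stat   # not part of the original steps; shows remaining monitors
}
```

For example, ``DRY_RUN=1 remove_monitor c`` prints the commands that would stop
and remove ``mon.c`` without touching the cluster.
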
Removing Monitors from an Unhealthy Cluster
-------------------------------------------

This procedure removes a ``ceph-mon`` daemon from an unhealthy
cluster, for example a cluster where the monitors cannot form a
quorum.


#. Stop all ``ceph-mon`` daemons on all monitor hosts. ::

      ssh {mon-host}
      service ceph stop mon || stop ceph-mon-all
      # and repeat for all mons

#. Identify a surviving monitor and log in to that host. ::

      ssh {mon-host}

#. Extract a copy of the monmap file. ::

      ceph-mon -i {mon-id} --extract-monmap {map-path}
      # in most cases, that's
      ceph-mon -i `hostname` --extract-monmap /tmp/monmap

#. Remove the non-surviving or problematic monitors. For example, if
   you have three monitors, ``mon.a``, ``mon.b``, and ``mon.c``, where
   only ``mon.a`` will survive, follow the example below::

      monmaptool {map-path} --rm {mon-id}
      # for example,
      monmaptool /tmp/monmap --rm b
      monmaptool /tmp/monmap --rm c

#. Inject the modified map, from which the failed monitors have been
   removed, into the surviving monitor(s). For example, to inject a map
   into monitor ``mon.a``, follow the example below::

      ceph-mon -i {mon-id} --inject-monmap {map-path}
      # for example,
      ceph-mon -i a --inject-monmap /tmp/monmap

#. Start only the surviving monitors.

#. Verify the monitors form a quorum (``ceph -s``).

#. You may wish to archive the removed monitors' data directories in
   ``/var/lib/ceph/mon`` in a safe location, or delete them if you are
   confident the remaining monitors are healthy and are sufficiently
   redundant.

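The extract, remove, and inject steps can be sketched as a single helper. This
is an illustration under assumptions rather than a supported tool: ``run``,
``prune_monmap`` and ``DRY_RUN`` are hypothetical names, and the helper assumes
that every ``ceph-mon`` daemon has already been stopped and that it runs on the
surviving monitor's host.

```shell
#!/bin/sh
# Sketch: drop failed monitors from a surviving monitor's monmap.
# run, prune_monmap and DRY_RUN are hypothetical names, not part of Ceph.
# Assumes all ceph-mon daemons are already stopped.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# prune_monmap SURVIVOR_ID MAP_PATH FAILED_ID...
prune_monmap() {
    survivor=$1 map=$2
    shift 2
    # Extract the surviving monitor's copy of the monmap.
    run ceph-mon -i "$survivor" --extract-monmap "$map" || return 1
    # Remove each failed monitor from the extracted map.
    for failed in "$@"; do
        run monmaptool "$map" --rm "$failed" || return 1
    done
    # Inject the pruned map back into the surviving monitor.
    run ceph-mon -i "$survivor" --inject-monmap "$map"
}
```

With the example from the steps above, ``DRY_RUN=1 prune_monmap a /tmp/monmap b c``
prints the four commands for review; start the surviving monitor only after the
real run has completed.
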
.. _Changing a Monitor's IP address:

Changing a Monitor's IP Address
===============================

.. important:: Existing monitors are not supposed to change their IP addresses.

Monitors are critical components of a Ceph cluster, and they need to maintain a
quorum for the whole system to work properly. To establish a quorum, the
monitors need to discover each other. Ceph has strict requirements for
discovering monitors.

Ceph clients and other Ceph daemons use ``ceph.conf`` to discover monitors.
However, monitors discover each other using the monitor map, not ``ceph.conf``.
For example, if you refer to `Adding a Monitor (Manual)`_ you will see that you
need to obtain the current monmap for the cluster when creating a new monitor,
as it is one of the required arguments of ``ceph-mon -i {mon-id} --mkfs``. The
following sections explain the consistency requirements for Ceph monitors, and a
few safe ways to change a monitor's IP address.


Consistency Requirements
------------------------

A monitor always refers to the local copy of the monmap when discovering other
monitors in the cluster. Using the monmap instead of ``ceph.conf`` avoids
errors that could break the cluster (e.g., typos in ``ceph.conf`` when
specifying a monitor address or port). Since monitors use monmaps for discovery
and they share monmaps with clients and other Ceph daemons, the monmap provides
monitors with a strict guarantee that their consensus is valid.

Strict consistency also applies to updates to the monmap. As with any other
updates on the monitor, changes to the monmap always run through a distributed
consensus algorithm called `Paxos`_. The monitors must agree on each update to
the monmap, such as adding or removing a monitor, to ensure that each monitor in
the quorum has the same version of the monmap. Updates to the monmap are
incremental so that monitors have the latest agreed upon version, and a set of
previous versions, allowing a monitor that has an older version of the monmap to
catch up with the current state of the cluster.

If monitors discovered each other through the Ceph configuration file instead of
through the monmap, it would introduce additional risks because the Ceph
configuration files are not updated and distributed automatically. Monitors
might inadvertently use an older ``ceph.conf`` file, fail to recognize a
monitor, fall out of a quorum, or develop a situation where `Paxos`_ is not able
to determine the current state of the system accurately. Consequently, making
changes to an existing monitor's IP address must be done with great care.


Changing a Monitor's IP address (The Right Way)
-----------------------------------------------

Changing a monitor's IP address in ``ceph.conf`` only is not sufficient to
ensure that other monitors in the cluster will receive the update. To change a
monitor's IP address, you must add a new monitor with the IP address you want
to use (as described in `Adding a Monitor (Manual)`_), ensure that the new
monitor successfully joins the quorum, and then remove the monitor that uses
the old IP address. Finally, update the ``ceph.conf`` file to ensure that
clients and other daemons know the IP address of the new monitor.

For example, let's assume there are three monitors in place, such as ::

    [mon.a]
        host = host01
        addr = 10.0.0.1:6789
    [mon.b]
        host = host02
        addr = 10.0.0.2:6789
    [mon.c]
        host = host03
        addr = 10.0.0.3:6789

To change ``mon.c`` to ``host04`` with the IP address ``10.0.0.4``, follow the
steps in `Adding a Monitor (Manual)`_ by adding a new monitor ``mon.d``. Ensure
that ``mon.d`` is running before removing ``mon.c``, or it will break the
quorum. Remove ``mon.c`` as described in `Removing a Monitor (Manual)`_. Moving
all three monitors would thus require repeating this process as many times as
needed.


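Before removing the old monitor, it helps to confirm that the new one has
actually joined the quorum. The sketch below is one hedged way to do that from
the shell. It assumes that the JSON printed by ``ceph quorum_status`` contains a
``quorum_names`` array listing the monitors currently in quorum; the helper name
``mon_in_quorum`` is made up for this example, and the ``grep``-based matching
is deliberately simple rather than a full JSON parser.

```shell
#!/bin/sh
# Sketch: check whether a monitor name appears in the quorum.
# mon_in_quorum is a hypothetical helper, not part of Ceph.
# Assumption: the JSON on stdin has a "quorum_names" array, as printed by
# `ceph quorum_status`.

# mon_in_quorum NAME
# Reads quorum status JSON on stdin; succeeds if NAME is in "quorum_names".
mon_in_quorum() {
    tr -d ' \n\t' |
        grep -o '"quorum_names":\[[^]]*]' |
        grep -q "\"$1\""
}

# Example use (not executed here):
#   ceph quorum_status | mon_in_quorum d && ceph mon remove c
```

The commented example removes ``mon.c`` only if ``mon.d`` is reported as a
quorum member, which matches the ordering described above.
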
Changing a Monitor's IP address (The Messy Way)
-----------------------------------------------

There may come a time when the monitors must be moved to a different network, a
different part of the datacenter or a different datacenter altogether. While it
is possible to do it, the process becomes a bit more hazardous.

In such a case, the solution is to generate a new monmap with updated IP
addresses for all the monitors in the cluster, and inject the new map on each
individual monitor. This is not the most user-friendly approach, but we do not
expect this to be something that needs to be done every other week. As stated
at the top of this section, monitors are not supposed to change IP addresses.

Using the previous monitor configuration as an example, assume you want to move
all the monitors from the ``10.0.0.x`` range to ``10.1.0.x``, and these
networks are unable to communicate. Use the following procedure:

#. Retrieve the monitor map, where ``{tmp}`` is the directory that will hold
   the retrieved monitor map, and ``{filename}`` is the name of the file
   containing the retrieved monitor map. ::

      ceph mon getmap -o {tmp}/{filename}

#. The following example demonstrates the contents of the monmap. ::

      $ monmaptool --print {tmp}/{filename}

      monmaptool: monmap file {tmp}/{filename}
      epoch 1
      fsid 224e376d-c5fe-4504-96bb-ea6332a19e61
      last_changed 2012-12-17 02:46:41.591248
      created 2012-12-17 02:46:41.591248
      0: 10.0.0.1:6789/0 mon.a
      1: 10.0.0.2:6789/0 mon.b
      2: 10.0.0.3:6789/0 mon.c

#. Remove the existing monitors. ::

      $ monmaptool --rm a --rm b --rm c {tmp}/{filename}

      monmaptool: monmap file {tmp}/{filename}
      monmaptool: removing a
      monmaptool: removing b
      monmaptool: removing c
      monmaptool: writing epoch 1 to {tmp}/{filename} (0 monitors)

#. Add the new monitor locations. ::

      $ monmaptool --add a 10.1.0.1:6789 --add b 10.1.0.2:6789 --add c 10.1.0.3:6789 {tmp}/{filename}

      monmaptool: monmap file {tmp}/{filename}
      monmaptool: writing epoch 1 to {tmp}/{filename} (3 monitors)

#. Check new contents. ::

      $ monmaptool --print {tmp}/{filename}

      monmaptool: monmap file {tmp}/{filename}
      epoch 1
      fsid 224e376d-c5fe-4504-96bb-ea6332a19e61
      last_changed 2012-12-17 02:46:41.591248
      created 2012-12-17 02:46:41.591248
      0: 10.1.0.1:6789/0 mon.a
      1: 10.1.0.2:6789/0 mon.b
      2: 10.1.0.3:6789/0 mon.c

At this point, we assume the monitors (and stores) are installed at the new
location. The next step is to copy the modified monmap to each monitor host at
the new location and inject it into each monitor.

#. First, make sure to stop all your monitors. Injection must be done while
   the daemon is not running.

#. Inject the monmap. ::

      ceph-mon -i {mon-id} --inject-monmap {tmp}/{filename}

#. Restart the monitors.

After this step, migration to the new location is complete and
the monitors should operate successfully.


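The map-editing part of this procedure (remove the old entries, add the new
addresses, print the result) can be sketched as a helper that takes ``id=addr``
pairs. The names ``run``, ``readdress_monmap`` and ``DRY_RUN`` are hypothetical
and not part of Ceph; the helper only edits a monmap file that has already been
retrieved with ``ceph mon getmap`` and does not stop, inject into, or restart
any monitor.

```shell
#!/bin/sh
# Sketch: rewrite monitor addresses in an already-retrieved monmap file.
# run, readdress_monmap and DRY_RUN are hypothetical names, not part of Ceph.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# readdress_monmap MAP_FILE ID=ADDR...
readdress_monmap() {
    map=$1
    shift
    # First pass: remove every listed monitor from the map.
    for pair in "$@"; do
        run monmaptool --rm "${pair%%=*}" "$map" || return 1
    done
    # Second pass: add each monitor back with its new address.
    for pair in "$@"; do
        run monmaptool --add "${pair%%=*}" "${pair#*=}" "$map" || return 1
    done
    # Show the resulting map for review.
    run monmaptool --print "$map"
}
```

For the example networks above, a dry run would be
``DRY_RUN=1 readdress_monmap /tmp/monmap a=10.1.0.1:6789 b=10.1.0.2:6789 c=10.1.0.3:6789``.
Review the printed map before injecting it into the stopped monitors.
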
.. _Manual Deployment: ../../../install/manual-deployment
.. _Monitor Bootstrap: ../../../dev/mon-bootstrap
.. _Paxos: http://en.wikipedia.org/wiki/Paxos_(computer_science)