======================
 Adding/Removing OSDs
======================

When you have a cluster up and running, you may add OSDs or remove OSDs
from the cluster at runtime.

Adding OSDs
===========

When you want to expand a cluster, you may add an OSD at runtime. With Ceph, an
OSD is generally one Ceph ``ceph-osd`` daemon for one storage drive within a
host machine. If your host has multiple storage drives, you may map one
``ceph-osd`` daemon for each drive.

Generally, it's a good idea to check the capacity of your cluster to see if you
are reaching the upper end of its capacity. As your cluster reaches its ``near
full`` ratio, you should add one or more OSDs to expand your cluster's capacity.

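To check utilization before and after adding OSDs, ``ceph df`` and ``ceph osd
df`` give a cluster-wide and a per-OSD view, respectively. A minimal sketch
(the exact output columns vary slightly between releases)::

    # overall and per-pool usage, including the percentage of raw capacity used
    ceph df

    # per-OSD utilization and weight, useful for spotting OSDs that are
    # approaching the near full ratio
    ceph osd df
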
.. warning:: Do not let your cluster reach its ``full ratio`` before
   adding an OSD. OSD failures that occur after the cluster reaches
   its ``near full`` ratio may cause the cluster to exceed its
   ``full ratio``.

Deploy your Hardware
--------------------

If you are adding a new host when adding a new OSD, see `Hardware
Recommendations`_ for details on minimum recommendations for OSD hardware. To
add an OSD host to your cluster, first make sure you have an up-to-date version
of Linux installed, and you have made some initial preparations for your
storage drives. See `Filesystem Recommendations`_ for details.

Add your OSD host to a rack in your cluster, connect it to the network
and ensure that it has network connectivity. See the `Network Configuration
Reference`_ for details.

.. _Hardware Recommendations: ../../../start/hardware-recommendations
.. _Filesystem Recommendations: ../../configuration/filesystem-recommendations
.. _Network Configuration Reference: ../../configuration/network-config-ref

Install the Required Software
-----------------------------

For manually deployed clusters, you must install Ceph packages
manually. See `Installing Ceph (Manual)`_ for details.
You should also configure SSH access for a user with password-less
authentication and root permissions, as sketched below.

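One way to set this up uses ``ssh-keygen``, ``ssh-copy-id``, and a sudoers
drop-in file. The user name ``cephuser`` and the ``{new-osd-host}`` placeholder
are illustrative only; substitute your own::

    # on the admin host: generate a key pair (if needed) and copy the public
    # key to the new OSD host
    ssh-keygen -t ed25519
    ssh-copy-id cephuser@{new-osd-host}

    # on the new OSD host: allow the user to run commands as root without a
    # password
    echo "cephuser ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/cephuser
    sudo chmod 0440 /etc/sudoers.d/cephuser
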
.. _Installing Ceph (Manual): ../../../install


Adding an OSD (Manual)
----------------------

This procedure sets up a ``ceph-osd`` daemon, configures it to use one drive,
and configures the cluster to distribute data to the OSD. If your host has
multiple drives, you may add an OSD for each drive by repeating this procedure.

To add an OSD, create a data directory for it, mount a drive to that directory,
add the OSD to the cluster, and then add it to the CRUSH map. A consolidated
example of the full command sequence follows the procedure below.

When you add the OSD to the CRUSH map, consider the weight you give to the new
OSD. Hard drive capacity grows 40% per year, so newer OSD hosts may have larger
hard drives than older hosts in the cluster (i.e., they may have greater
weight).

.. tip:: Ceph prefers uniform hardware across pools. If you are adding drives
   of dissimilar size, you can adjust their weights. However, for best
   performance, consider a CRUSH hierarchy with drives of the same type/size.

#. Create the OSD. If no UUID is given, it will be set automatically when the
   OSD starts up. The following command will output the OSD number, which you
   will need for subsequent steps. ::

      ceph osd create [{uuid} [{id}]]

   If the optional parameter {id} is given it will be used as the OSD id.
   Note, in this case the command may fail if the number is already in use.

   .. warning:: In general, explicitly specifying {id} is not recommended.
      IDs are allocated as an array, and skipping entries consumes some extra
      memory. This can become significant if there are large gaps and/or
      clusters are large. If {id} is not specified, the smallest available is
      used.

#. Create the default directory on your new OSD. ::

      ssh {new-osd-host}
      sudo mkdir /var/lib/ceph/osd/ceph-{osd-number}

#. If the OSD is for a drive other than the OS drive, prepare it
   for use with Ceph, and mount it to the directory you just created::

      ssh {new-osd-host}
      sudo mkfs -t {fstype} /dev/{drive}
      sudo mount -o user_xattr /dev/{hdd} /var/lib/ceph/osd/ceph-{osd-number}

#. Initialize the OSD data directory. ::

      ssh {new-osd-host}
      ceph-osd -i {osd-num} --mkfs --mkkey

   The directory must be empty before you can run ``ceph-osd``.

#. Register the OSD authentication key. The ``ceph`` in the ``ceph-{osd-num}``
   path component is the ``$cluster-$id``; if your cluster name differs from
   ``ceph``, use your cluster name instead. ::

      ceph auth add osd.{osd-num} osd 'allow *' mon 'allow rwx' -i /var/lib/ceph/osd/ceph-{osd-num}/keyring

#. Add the OSD to the CRUSH map so that the OSD can begin receiving data. The
   ``ceph osd crush add`` command allows you to add OSDs to the CRUSH hierarchy
   wherever you wish. If you specify at least one bucket, the command
   will place the OSD into the most specific bucket you specify, *and* it will
   move that bucket underneath any other buckets you specify. **Important:** If
   you specify only the root bucket, the command will attach the OSD directly
   to the root, but CRUSH rules expect OSDs to be inside of hosts.

   For Argonaut (v0.48), execute the following::

      ceph osd crush add {id} {name} {weight} [{bucket-type}={bucket-name} ...]

   For Bobtail (v0.56) and later releases, execute the following::

      ceph osd crush add {id-or-name} {weight} [{bucket-type}={bucket-name} ...]

   You may also decompile the CRUSH map, add the OSD to the device list, add the
   host as a bucket (if it's not already in the CRUSH map), add the device as an
   item in the host, assign it a weight, recompile it and set it. See
   `Add/Move an OSD`_ for details.

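As a consolidated sketch of the procedure above, the commands below add a
single hypothetical OSD. The values are illustrative assumptions only: an OSD
id of ``12`` (use whatever ``ceph osd create`` actually prints), a data drive
``/dev/sdb``, an XFS filesystem (which enables extended attributes by default,
so the ``user_xattr`` mount option is not needed), and a CRUSH weight of
``1.0`` under ``host={new-osd-host}``::

    # 1. allocate an OSD id; the command prints the new id (assume it is 12)
    ceph osd create

    # 2./3. on the new OSD host: create the data directory, then make and
    #       mount a filesystem on the data drive
    ssh {new-osd-host}
    sudo mkdir /var/lib/ceph/osd/ceph-12
    sudo mkfs -t xfs /dev/sdb
    sudo mount /dev/sdb /var/lib/ceph/osd/ceph-12

    # 4. initialize the OSD data directory and generate its key
    ceph-osd -i 12 --mkfs --mkkey

    # 5. register the key with the cluster
    ceph auth add osd.12 osd 'allow *' mon 'allow rwx' \
        -i /var/lib/ceph/osd/ceph-12/keyring

    # 6. place the OSD under its host in the CRUSH hierarchy
    ceph osd crush add osd.12 1.0 root=default host={new-osd-host}
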
.. topic:: Argonaut (v0.48) Best Practices

   To limit impact on user I/O performance, add an OSD to the CRUSH map
   with an initial weight of ``0``. Then, ramp up the CRUSH weight a
   little bit at a time. For example, to ramp by increments of ``0.2``,
   start with::

      ceph osd crush reweight {osd-id} .2

   and allow migration to complete before reweighting to ``0.4``,
   ``0.6``, and so on until the desired CRUSH weight is reached.

   To limit the impact of OSD failures, you can set::

      mon osd down out interval = 0

   which prevents down OSDs from automatically being marked out, and then
   ramp them down manually with::

      ceph osd reweight {osd-num} .8

   Again, wait for the cluster to finish migrating data, and then adjust
   the weight further until you reach a weight of 0. Note that this
   setting prevents the cluster from automatically re-replicating data after
   a failure, so please ensure that sufficient monitoring is in place for
   an administrator to intervene promptly.

   Note that this practice will no longer be necessary in Bobtail and
   subsequent releases.

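   The incremental ramp-up can be scripted. A minimal sketch, assuming the new
   OSD is ``osd.12``, a target CRUSH weight of ``1.0``, and that the cluster
   returns to ``HEALTH_OK`` between steps (adjust the id, increments, and
   polling interval to your environment)::

      for w in 0.2 0.4 0.6 0.8 1.0; do
          ceph osd crush reweight osd.12 $w
          # wait for recovery/backfill to finish before the next increment
          until ceph health | grep -q HEALTH_OK; do
              sleep 60
          done
      done
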
.. _rados-replacing-an-osd:

Replacing an OSD
----------------

When disks fail, or if an administrator wants to reprovision OSDs with a new
backend, for instance when switching from FileStore to BlueStore, OSDs need to
be replaced. Unlike `Removing the OSD`_, the replaced OSD's id and CRUSH map
entry need to be kept intact after the OSD is destroyed for replacement. A
consolidated example of the replacement sequence follows the steps below.

#. Destroy the OSD first::

      ceph osd destroy {id} --yes-i-really-mean-it

#. Zap a disk for the new OSD, if the disk was used before for other purposes.
   It's not necessary for a new disk::

      ceph-disk zap /dev/sdX

#. Prepare the disk for replacement by using the previously destroyed OSD id::

      ceph-disk prepare --bluestore /dev/sdX  --osd-id {id} --osd-uuid `uuidgen`

#. And activate the OSD::

      ceph-disk activate /dev/sdX1

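For example, to replace ``osd.1`` with a BlueStore OSD on a previously used
disk (the id ``1`` and the device ``/dev/sdb`` are placeholders chosen for
illustration)::

    # keep the id and CRUSH entry, but mark the OSD as destroyed
    ceph osd destroy 1 --yes-i-really-mean-it

    # wipe the old partition table and contents (only needed for a used disk)
    ceph-disk zap /dev/sdb

    # provision the replacement OSD, reusing id 1
    ceph-disk prepare --bluestore /dev/sdb --osd-id 1 --osd-uuid `uuidgen`

    # start it up; the data partition is typically the first partition
    ceph-disk activate /dev/sdb1
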
Starting the OSD
----------------

After you add an OSD to Ceph, the OSD is in your configuration. However,
it is not yet running. The OSD is ``down`` and ``in``. You must start
your new OSD before it can begin receiving data. You may use
``service ceph`` from your admin host or start the OSD from its host
machine.

For Ubuntu Trusty use Upstart. ::

    sudo start ceph-osd id={osd-num}

For all other distros use systemd. ::

    sudo systemctl start ceph-osd@{osd-num}


Once you start your OSD, it is ``up`` and ``in``.

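You can confirm the state change with the standard status commands; this is a
quick check rather than a required step::

    # lists every OSD with its CRUSH location, weight, and up/down status
    ceph osd tree

    # overall cluster status, including the count of up/in OSDs
    ceph -s
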

Observe the Data Migration
--------------------------

Once you have added your new OSD to the CRUSH map, Ceph will begin rebalancing
the cluster by migrating placement groups to your new OSD. You can observe this
process with the `ceph`_ tool. ::

    ceph -w

You should see the placement group states change from ``active+clean`` to
``active, some degraded objects``, and finally ``active+clean`` when migration
completes. (Control-c to exit.)

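If you prefer a one-shot summary to the streaming output of ``ceph -w``, the
following commands (shown here only as a convenience) report the same
progress::

    # compact, single-line summary of placement group states
    ceph pg stat

    # more detail on any degraded or misplaced objects
    ceph health detail
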

.. _Add/Move an OSD: ../crush-map#addosd
.. _ceph: ../monitoring


Removing OSDs (Manual)
======================

When you want to reduce the size of a cluster or replace hardware, you may
remove an OSD at runtime. With Ceph, an OSD is generally one Ceph ``ceph-osd``
daemon for one storage drive within a host machine. If your host has multiple
storage drives, you may need to remove one ``ceph-osd`` daemon for each drive.
Generally, it's a good idea to check the capacity of your cluster to see if you
are reaching the upper end of its capacity. Ensure that when you remove an OSD,
your cluster is not at its ``near full`` ratio.

.. warning:: Do not let your cluster reach its ``full ratio`` when
   removing an OSD. Removing OSDs could cause the cluster to reach
   or exceed its ``full ratio``.


Take the OSD out of the Cluster
-----------------------------------

Before you remove an OSD, it is usually ``up`` and ``in``. You need to take it
out of the cluster so that Ceph can begin rebalancing and copying its data to
other OSDs. ::

    ceph osd out {osd-num}


Observe the Data Migration
--------------------------

Once you have taken your OSD ``out`` of the cluster, Ceph will begin
rebalancing the cluster by migrating placement groups out of the OSD you
removed. You can observe this process with the `ceph`_ tool. ::

    ceph -w

You should see the placement group states change from ``active+clean`` to
``active, some degraded objects``, and finally ``active+clean`` when migration
completes. (Control-c to exit.)

.. note:: Sometimes, typically in a "small" cluster with few hosts (for
   instance with a small testing cluster), taking the OSD ``out`` can spawn
   a CRUSH corner case where some PGs remain stuck in the
   ``active+remapped`` state. If you are in this case, you should mark
   the OSD ``in`` with:

   ``ceph osd in {osd-num}``

   to come back to the initial state and then, instead of marking ``out``
   the OSD, set its weight to 0 with:

   ``ceph osd crush reweight osd.{osd-num} 0``

   After that, you can observe the data migration, which should come to its
   end. The difference between marking the OSD ``out`` and reweighting it
   to 0 is that in the first case the weight of the bucket which contains
   the OSD is not changed, whereas in the second case the weight of the bucket
   is updated (decreased by the OSD weight). The reweight command may
   sometimes be preferable in the case of a "small" cluster.


Stopping the OSD
----------------

After you take an OSD out of the cluster, it may still be running.
That is, the OSD may be ``up`` and ``out``. You must stop
your OSD before you remove it from the configuration. ::

    ssh {osd-host}
    sudo systemctl stop ceph-osd@{osd-num}

Once you stop your OSD, it is ``down``.


Removing the OSD
----------------

This procedure removes an OSD from a cluster map, removes its authentication
key, removes the OSD from the OSD map, and removes the OSD from the
``ceph.conf`` file. If your host has multiple drives, you may need to remove an
OSD for each drive by repeating this procedure. A consolidated example of the
removal sequence follows the steps below.

#. Let the cluster forget the OSD first. This step removes the OSD from the
   CRUSH map, removes its authentication key, and removes it from the OSD map
   as well. Please note that the `purge subcommand`_ was introduced in Luminous;
   for older versions, see below. ::

      ceph osd purge {id} --yes-i-really-mean-it

#. Navigate to the host where you keep the master copy of the cluster's
   ``ceph.conf`` file. ::

      ssh {admin-host}
      cd /etc/ceph
      vim ceph.conf

#. Remove the OSD entry from your ``ceph.conf`` file (if it exists). ::

      [osd.1]
      host = {hostname}

#. From the host where you keep the master copy of the cluster's ``ceph.conf`` file,
   copy the updated ``ceph.conf`` file to the ``/etc/ceph`` directory of other
   hosts in your cluster.

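For a Luminous (or newer) cluster, the whole removal of a hypothetical
``osd.1`` therefore looks roughly like this (host names are placeholders)::

    # drain the OSD and wait for `ceph -w` to show the cluster healthy again
    ceph osd out 1

    # stop the daemon on its host
    ssh {osd-host} sudo systemctl stop ceph-osd@1

    # remove it from the CRUSH map, the OSD map, and the auth database
    ceph osd purge 1 --yes-i-really-mean-it

    # finally, delete any [osd.1] section from ceph.conf on the admin host
    # and redistribute the updated file to the other hosts
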
If your Ceph cluster is older than Luminous, instead of using ``ceph osd purge``,
you need to perform these steps manually:

#. Remove the OSD from the CRUSH map so that it no longer receives data. You may
   also decompile the CRUSH map, remove the OSD from the device list, remove the
   device as an item in the host bucket or remove the host bucket (if it's in the
   CRUSH map and you intend to remove the host), recompile the map and set it.
   See `Remove an OSD`_ for details. ::

      ceph osd crush remove {name}

#. Remove the OSD authentication key. ::

      ceph auth del osd.{osd-num}

   The ``ceph`` in ``ceph-{osd-num}`` in the keyring path is the ``$cluster-$id``;
   if your cluster name differs from ``ceph``, use your cluster name instead.

#. Remove the OSD. ::

      ceph osd rm {osd-num}
      # for example
      ceph osd rm 1

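Put together, the pre-Luminous removal of a hypothetical ``osd.1`` (after it
has been taken ``out`` and stopped as described above) is::

    ceph osd crush remove osd.1
    ceph auth del osd.1
    ceph osd rm 1
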
.. _Remove an OSD: ../crush-map#removeosd
.. _purge subcommand: /man/8/ceph#osd