When you have a cluster up and running, you may add OSDs or remove OSDs
from the cluster at runtime.

When you want to expand a cluster, you may add an OSD at runtime. In Ceph, an
OSD is generally one ``ceph-osd`` daemon for one storage drive within a
host machine. If your host has multiple storage drives, you may map one
``ceph-osd`` daemon to each drive.

Check the capacity of your cluster regularly to see whether it is approaching
its upper limit. As your cluster approaches its ``near full`` ratio, add one
or more OSDs to expand its capacity.
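
The capacity check above can be sketched as a small calculation. The helper name ``is_near_full`` is hypothetical, and the default threshold of 0.85 is an assumption; use the ratio configured on your own cluster. The used and total figures would come from a usage report such as ``ceph df``.

```shell
# Hypothetical helper: report whether used/total capacity has reached a
# near-full threshold. The 0.85 default is an assumption, not a value read
# from the cluster.
is_near_full() {
    local used="$1" total="$2" ratio="${3:-0.85}"
    awk -v u="$used" -v t="$total" -v r="$ratio" \
        'BEGIN { if (t > 0 && u / t >= r) print "yes"; else print "no" }'
}

# Example: 870 GiB used out of 1000 GiB total.
is_near_full 870 1000    # prints "yes"
# Example: 500 GiB used out of 1000 GiB total.
is_near_full 500 1000    # prints "no"
```

Both arguments only need to share a unit, because the function compares their ratio with the threshold.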

.. warning:: Do not let your cluster reach its ``full ratio`` before
   adding an OSD. OSD failures that occur after the cluster reaches
   its ``near full`` ratio may cause the cluster to exceed its
   ``full ratio``.

If you are adding a new host when adding a new OSD, see `Hardware
Recommendations`_ for the minimum recommended OSD hardware. To add an OSD
host to your cluster, first make sure that an up-to-date version of Linux is
installed and that you have made some initial preparations for your
storage drives. See `Filesystem Recommendations`_ for details.

Add your OSD host to a rack in your cluster, connect it to the network,
and ensure that it has network connectivity. See the `Network Configuration
Reference`_ for details.

.. _Hardware Recommendations: ../../../start/hardware-recommendations
.. _Filesystem Recommendations: ../../configuration/filesystem-recommendations
.. _Network Configuration Reference: ../../configuration/network-config-ref

Install the Required Software
-----------------------------

For manually deployed clusters, you must install Ceph packages
manually. See `Installing Ceph (Manual)`_ for details.
You should configure SSH to a user with password-less authentication

.. _Installing Ceph (Manual): ../../../install

Adding an OSD (Manual)
----------------------

This procedure sets up a ``ceph-osd`` daemon, configures it to use one drive,
and configures the cluster to distribute data to the OSD. If your host has
multiple drives, you may add an OSD for each drive by repeating this procedure.

To add an OSD, create a data directory for it, mount a drive to that directory,
add the OSD to the cluster, and then add it to the CRUSH map.

When you add the OSD to the CRUSH map, consider the weight you give to the new
OSD. Hard drive capacity grows 40% per year, so newer OSD hosts may have larger
hard drives than older hosts in the cluster (i.e., they may have greater
weight).

.. tip:: Ceph prefers uniform hardware across pools. If you are adding drives
   of dissimilar size, you can adjust their weights. However, for best
   performance, consider a CRUSH hierarchy with drives of the same type/size.
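
As a rough illustration of weighting drives of different sizes, the sketch below derives a weight from a drive's capacity. It assumes the common convention that a weight of 1.0 represents 1 TiB; the helper name ``crush_weight_from_bytes`` is hypothetical, and you may prefer a different scale for your hierarchy.

```shell
# Hypothetical helper: convert a capacity in bytes to a CRUSH weight,
# assuming a weight of 1.0 per TiB (1024^4 bytes).
crush_weight_from_bytes() {
    awk -v b="$1" 'BEGIN { printf "%.4f\n", b / (1024 ^ 4) }'
}

crush_weight_from_bytes 1099511627776    # 1 TiB, prints 1.0000
crush_weight_from_bytes 4398046511104    # 4 TiB, prints 4.0000
```

The result can be passed as the ``{weight}`` argument when the OSD is added to the CRUSH map.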

#. Create the OSD. If no UUID is given, it will be set automatically when the
   OSD starts up. The following command will output the OSD number, which you
   will need for subsequent steps. ::

      ceph osd create [{uuid} [{id}]]

   If the optional parameter {id} is given, it will be used as the OSD ID.
   Note that in this case the command may fail if the number is already in use.

   .. warning:: In general, explicitly specifying {id} is not recommended.
      IDs are allocated as an array, and skipping entries consumes some extra
      memory. This can become significant if there are large gaps or if
      clusters are large. If {id} is not specified, the smallest available ID
      is used.

#. Create the default directory on your new OSD. ::

      sudo mkdir /var/lib/ceph/osd/ceph-{osd-num}

#. If the OSD is for a drive other than the OS drive, prepare it
   for use with Ceph, and mount it to the directory you just created::

      sudo mkfs -t {fstype} /dev/{drive}
      sudo mount -o user_xattr /dev/{drive} /var/lib/ceph/osd/ceph-{osd-num}

#. Initialize the OSD data directory. ::

      ceph-osd -i {osd-num} --mkfs --mkkey

   The directory must be empty before you can run ``ceph-osd``.

#. Register the OSD authentication key. The ``ceph`` in
   ``ceph-{osd-num}`` in the path is the cluster name (the directory is named
   ``$cluster-$id``). If your cluster name differs from ``ceph``, use your
   cluster name instead. ::

      ceph auth add osd.{osd-num} osd 'allow *' mon 'allow rwx' -i /var/lib/ceph/osd/ceph-{osd-num}/keyring

#. Add the OSD to the CRUSH map so that the OSD can begin receiving data. The
   ``ceph osd crush add`` command allows you to add OSDs to the CRUSH hierarchy
   wherever you wish. If you specify at least one bucket, the command
   will place the OSD into the most specific bucket you specify, *and* it will
   move that bucket underneath any other buckets you specify. **Important:** If
   you specify only the root bucket, the command will attach the OSD directly
   to the root, but CRUSH rules expect OSDs to be inside of hosts.

   Execute the following::

      ceph osd crush add {id-or-name} {weight} [{bucket-type}={bucket-name} ...]

   You may also decompile the CRUSH map, add the OSD to the device list, add the
   host as a bucket (if it's not already in the CRUSH map), add the device as an
   item in the host, assign it a weight, recompile it and set it. See
   `Add/Move an OSD`_ for details.
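
To review the whole sequence before running anything, the steps above can be collected into a dry-run helper that only prints the commands. This is a sketch: the function name ``print_add_osd_commands`` and the example values (OSD number 12, ``ext4``, ``/dev/sdb``, weight 1.0, host ``node4``) are hypothetical, and it assumes that ``ceph osd create`` has already returned the OSD number and that the OSD is placed in a single ``host`` bucket.

```shell
# Hypothetical dry-run helper: print (do not execute) the commands for
# adding one OSD. Every argument is a placeholder value that you supply.
print_add_osd_commands() {
    local osd_num="$1" fstype="$2" device="$3" weight="$4" host="$5"
    local dir="/var/lib/ceph/osd/ceph-${osd_num}"
    cat <<EOF
sudo mkdir ${dir}
sudo mkfs -t ${fstype} ${device}
sudo mount -o user_xattr ${device} ${dir}
ceph-osd -i ${osd_num} --mkfs --mkkey
ceph auth add osd.${osd_num} osd 'allow *' mon 'allow rwx' -i ${dir}/keyring
ceph osd crush add osd.${osd_num} ${weight} host=${host}
EOF
}

# Example with made-up values; inspect the output before running any of it.
print_add_osd_commands 12 ext4 /dev/sdb 1.0 node4
```

Printing rather than executing keeps the destructive ``mkfs`` step under manual control.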

.. _rados-replacing-an-osd:

Replacing an OSD
----------------

When disks fail, or when an administrator wants to reprovision OSDs with a new
backend (for instance, when switching from FileStore to BlueStore), OSDs need
to be replaced. Unlike in `Removing the OSD`_, the replaced OSD's ID and CRUSH
map entry need to be kept intact after the OSD is destroyed for replacement.

#. Make sure it is safe to destroy the OSD::

      while ! ceph osd safe-to-destroy osd.{id} ; do sleep 10 ; done

#. Destroy the OSD::

      ceph osd destroy {id} --yes-i-really-mean-it

#. Zap the disk for the new OSD if the disk was previously used for other
   purposes. This is not necessary for a new disk::

      ceph-volume lvm zap /dev/sdX

#. Prepare the disk for replacement by using the previously destroyed OSD ID::

      ceph-volume lvm prepare --osd-id {id} --data /dev/sdX

#. Activate the OSD::

      ceph-volume lvm activate {id} {fsid}

Alternatively, instead of preparing and activating, the device can be recreated
in one step::

      ceph-volume lvm create --osd-id {id} --data /dev/sdX
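
The ``safe-to-destroy`` loop in the first step waits indefinitely. A bounded variant is sketched below; the helper name ``retry_until`` and the attempt counts are hypothetical, and the helper is generic, so it works with any command that signals readiness through its exit status.

```shell
# Hypothetical helper: run a command until it succeeds, giving up after
# max_tries attempts and sleeping `delay` seconds between attempts.
retry_until() {
    local max_tries="$1" delay="$2" try=1
    shift 2
    while [ "$try" -le "$max_tries" ]; do
        if "$@"; then
            return 0
        fi
        if [ "$try" -lt "$max_tries" ]; then
            sleep "$delay"
        fi
        try=$((try + 1))
    done
    return 1
}

# On a cluster this might be used as follows (not executed here):
#   retry_until 60 10 ceph osd safe-to-destroy osd.{id}
# Stand-in commands show the two outcomes:
retry_until 3 0 true && echo "ready"        # prints "ready"
retry_until 2 0 false || echo "gave up"     # prints "gave up"
```

A non-zero return lets a surrounding script stop before it reaches the ``ceph osd destroy`` step.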

Starting the OSD
----------------

After you add an OSD to Ceph, the OSD is in your configuration. However,
it is not yet running. The OSD is ``down`` and ``in``. You must start
your new OSD before it can begin receiving data. You may use
``service ceph`` from your admin host or start the OSD from its host
machine::

      sudo systemctl start ceph-osd@{osd-num}

Once you start your OSD, it is ``up`` and ``in``.

Observe the Data Migration
--------------------------

Once you have added your new OSD to the CRUSH map, Ceph will begin rebalancing
the cluster by migrating placement groups to your new OSD. You can observe this
process with the `ceph`_ tool. ::

      ceph -w

You should see the placement group states change from ``active+clean`` to
``active, some degraded objects``, and finally ``active+clean`` when migration
completes. (Control-c to exit.)

.. _Add/Move an OSD: ../crush-map#addosd
.. _ceph: ../monitoring

Removing OSDs (Manual)
======================

When you want to reduce the size of a cluster or replace hardware, you may
remove an OSD at runtime. In Ceph, an OSD is generally one ``ceph-osd``
daemon for one storage drive within a host machine. If your host has multiple
storage drives, you may need to remove one ``ceph-osd`` daemon for each drive.
Check the capacity of your cluster first to see whether it is approaching its
upper limit. Ensure that when you remove an OSD, your cluster is not at its
``near full`` ratio.

.. warning:: Do not let your cluster reach its ``full ratio`` when
   removing an OSD. Removing OSDs could cause the cluster to reach
   or exceed its ``full ratio``.

Take the OSD out of the Cluster
-------------------------------

Before you remove an OSD, it is usually ``up`` and ``in``. You need to take it
out of the cluster so that Ceph can begin rebalancing and copying its data to
other OSDs::

      ceph osd out {osd-num}

Observe the Data Migration
--------------------------

Once you have taken your OSD ``out`` of the cluster, Ceph will begin
rebalancing the cluster by migrating placement groups out of the OSD you
removed. You can observe this process with the `ceph`_ tool. ::

      ceph -w

You should see the placement group states change from ``active+clean`` to
``active, some degraded objects``, and finally ``active+clean`` when migration
completes. (Control-c to exit.)

.. note:: Sometimes, typically in a "small" cluster with few hosts (for
   instance, a small testing cluster), taking the OSD ``out`` can trigger a
   CRUSH corner case in which some PGs remain stuck in the
   ``active+remapped`` state. If this happens, mark the OSD ``in`` with:

       ``ceph osd in {osd-num}``

   to return to the initial state and then, instead of marking the OSD
   ``out``, set its weight to 0 with:

       ``ceph osd crush reweight osd.{osd-num} 0``

   After that, you can observe the data migration, which should run to
   completion. The difference between marking the OSD ``out`` and reweighting
   it to 0 is that in the first case the weight of the bucket that contains
   the OSD is not changed, whereas in the second case the weight of the bucket
   is updated (decreased by the OSD's weight). The reweight command may
   sometimes be preferable in a "small" cluster.
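
The difference described in the note can be worked through with a little arithmetic. The sketch below uses a hypothetical helper, ``sum_weights``, and made-up weights for a host bucket that holds three OSDs; it only adds numbers and does not query a cluster.

```shell
# Hypothetical helper: add up the CRUSH weights passed as arguments,
# standing in for the weight of the bucket that contains those OSDs.
sum_weights() {
    printf '%s\n' "$@" | awk '{ s += $1 } END { printf "%.2f\n", s }'
}

# A host bucket whose three OSDs have weights 1.0, 1.0, and 2.0.
sum_weights 1.0 1.0 2.0    # prints 4.00

# Marking the third OSD out leaves its CRUSH weight in place, so the
# bucket weight is still the sum of the same three values.
sum_weights 1.0 1.0 2.0    # prints 4.00

# Reweighting the third OSD to 0 lowers the bucket weight by 2.0.
sum_weights 1.0 1.0 0      # prints 2.00
```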

Stopping the OSD
----------------

After you take an OSD out of the cluster, it may still be running.
That is, the OSD may be ``up`` and ``out``. You must stop
your OSD before you remove it from the configuration. ::

      sudo systemctl stop ceph-osd@{osd-num}

Once you stop your OSD, it is ``down``.

Removing the OSD
----------------

This procedure removes an OSD from a cluster map, removes its authentication
key, removes the OSD from the OSD map, and removes the OSD from the
``ceph.conf`` file. If your host has multiple drives, you may need to remove an
OSD for each drive by repeating this procedure.

#. Let the cluster forget the OSD first. This step removes the OSD from the CRUSH
   map, removes its authentication key, and removes it from the OSD map.
   Note that the :ref:`purge subcommand <ceph-admin-osd>` was introduced in
   Luminous; for older versions, see below. ::

      ceph osd purge {id} --yes-i-really-mean-it

#. Navigate to the host where you keep the master copy of the cluster's
   ``ceph.conf`` file.

#. Remove the OSD entry from your ``ceph.conf`` file (if it exists).

#. From the host where you keep the master copy of the cluster's ``ceph.conf`` file,
   copy the updated ``ceph.conf`` file to the ``/etc/ceph`` directory of other
   hosts in your cluster.

If your Ceph cluster is older than Luminous, instead of using ``ceph osd purge``,
you need to perform these steps manually:

#. Remove the OSD from the CRUSH map so that it no longer receives data. You may
   also decompile the CRUSH map, remove the OSD from the device list, remove the
   device as an item in the host bucket or remove the host bucket (if it's in the
   CRUSH map and you intend to remove the host), recompile the map and set it.
   See `Remove an OSD`_ for details. ::

      ceph osd crush remove {name}

#. Remove the OSD authentication key. ::

      ceph auth del osd.{osd-num}

   The value of ``ceph`` for ``ceph-{osd-num}`` in the path is the ``$cluster-$id``.
   If your cluster name differs from ``ceph``, use your cluster name instead.

#. Remove the OSD. ::

      ceph osd rm {osd-num}
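
The choice between ``ceph osd purge`` and the manual steps can be sketched as a dry-run helper keyed on the release's major version. The function name ``print_remove_osd_commands`` is hypothetical, and the sketch assumes that Luminous corresponds to major version 12 and that the OSD's CRUSH name is ``osd.{osd-num}``. It prints commands and does not run them.

```shell
# Hypothetical dry-run helper: print the removal commands that apply to a
# given Ceph major version (Luminous is taken to be major version 12).
print_remove_osd_commands() {
    local osd_num="$1" major="$2"
    if [ "$major" -ge 12 ]; then
        # Luminous and later: one command removes the OSD from the CRUSH
        # map, deletes its key, and removes it from the OSD map.
        echo "ceph osd purge ${osd_num} --yes-i-really-mean-it"
    else
        # Older releases: perform the three steps individually.
        echo "ceph osd crush remove osd.${osd_num}"
        echo "ceph auth del osd.${osd_num}"
        echo "ceph osd rm ${osd_num}"
    fi
}

print_remove_osd_commands 3 12    # prints the single purge command
print_remove_osd_commands 3 10    # prints the three manual commands
```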

.. _Remove an OSD: ../crush-map#removeosd