Cephadm can safely upgrade Ceph from one bugfix release to the next. For
example, you can upgrade from v15.2.0 (the first Octopus release) to the next
point release, v15.2.1.
The automated upgrade process follows Ceph best practices. For example:
* The upgrade order starts with managers, monitors, then other daemons.
* Each daemon is restarted only after Ceph indicates that the cluster
  will remain available.
The Ceph cluster health status is likely to switch to
``HEALTH_WARN`` during the upgrade.
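To see the specific warnings raised while the upgrade is running, you can check
the detailed health output at any point:

.. code-block:: console

   ceph health detail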
If a host in the cluster is offline, the upgrade is paused.
A `Staggered Upgrade`_ of the mons/mgrs may be necessary to have access
to this feature.
Cephadm by default reduces `max_mds` to `1`. This can be disruptive for
large-scale CephFS deployments because the cluster cannot quickly reduce the
number of active MDS daemons to `1`, and a single active MDS cannot easily
handle the load of all clients even for a short time. Therefore, to upgrade MDS
daemons without reducing `max_mds`, the `fail_fs` option can be set to `true`
(default value is `false`) prior to initiating the upgrade:
ceph config set mgr mgr/orchestrator/fail_fs true
#. Fail CephFS filesystems, bringing active MDS daemon(s) to
   `up:standby` state.

#. Upgrade MDS daemons safely.

#. Bring CephFS filesystems back up, bringing the state of active
   MDS daemon(s) from `up:standby` to `up:active`.
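While the MDS daemons are being upgraded, you can watch their state transitions
with, for example:

.. code-block:: console

   ceph fs status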
Before you use cephadm to upgrade Ceph, verify that all hosts are currently online and that your cluster is healthy, for example by checking the output of ``ceph -s``.
To upgrade (or downgrade) to a specific release, run the following command:
ceph orch upgrade start --ceph-version <version>
For example, to upgrade to v16.2.6, run the following command:
ceph orch upgrade start --ceph-version 16.2.6
Starting with v16.2.6, the Docker Hub registry is no longer used. If you use Docker, you must point the upgrade at the image in the quay.io registry:
ceph orch upgrade start --image quay.io/ceph/ceph:v16.2.6
Monitoring the upgrade
======================
Determine (1) whether an upgrade is in progress and (2) which version the
cluster is upgrading to by running the following command:
ceph orch upgrade status
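The output is JSON; the exact fields vary by release, but during an upgrade it
resembles the following (the values here are illustrative):

.. code-block:: console

   {
       "target_image": "quay.io/ceph/ceph:v16.2.6",
       "in_progress": true,
       "services_complete": [],
       "message": ""
   }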
Watching the progress bar during a Ceph upgrade
-----------------------------------------------
During the upgrade, a progress bar is visible in the ceph status output. It
looks like this:
.. code-block:: console
   Upgrade to docker.io/ceph/ceph:v15.2.1 (00h 20m 12s)
   [=======.....................] (time remaining: 01h 43m 31s)
Watching the cephadm log during an upgrade
------------------------------------------
Watch the cephadm log by running ``ceph -W cephadm``.
You can stop the upgrade process at any time by running the following command:
ceph orch upgrade stop
If the new version is based on ``cephadm``, then once the upgrade is done the
``cephadm`` package (or the ``ceph-common`` package, if you do not use
``cephadm shell``) has to be updated to a version compatible with the new
version.
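For example, on an RPM-based distribution this might look like the following
(the package manager and repository setup vary by system):

.. code-block:: console

   dnf update cephadm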
There are a few health alerts that can arise during the upgrade process.
UPGRADE_NO_STANDBY_MGR
----------------------
This alert (``UPGRADE_NO_STANDBY_MGR``) means that Ceph does not detect an
active standby manager daemon. In order to proceed with the upgrade, Ceph
requires an active standby manager daemon (which you can think of in this
context as "a second manager").
You can ensure that Cephadm is configured to run 2 (or more) managers by
running the following command:
ceph orch apply mgr 2 # or more
You can check the status of existing mgr daemons by running the following
command:
ceph orch ps --daemon-type mgr
If an existing mgr daemon has stopped, you can try to restart it by running the
following command:
ceph orch daemon restart <name>
UPGRADE_FAILED_PULL
-------------------

This alert (``UPGRADE_FAILED_PULL``) means that Ceph was unable to pull the
container image for the target version. This can happen if you specify a
version or container image that does not exist (e.g. "1.2.3"), or if the
container registry cannot be reached by one or more hosts in the cluster.
To cancel the existing upgrade and to specify a different target version, run
the following commands:
ceph orch upgrade stop
ceph orch upgrade start --ceph-version <version>
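To check whether a particular host can reach the registry, you can try pulling
the target image manually on that host (assuming a Podman-based installation):

.. code-block:: console

   podman pull quay.io/ceph/ceph:v16.2.6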
Using customized container images
=================================
For most users, upgrading requires nothing more complicated than specifying the
Ceph version number to upgrade to. In such cases, cephadm locates the specific
Ceph container image to use by combining the ``container_image_base``
configuration option (default: ``docker.io/ceph/ceph``) with a tag of
``vX.Y.Z``.
But it is possible to upgrade to an arbitrary container image, if that's what
you need. For example, the following command upgrades to a development build:
ceph orch upgrade start --image quay.io/ceph-ci/ceph:recent-git-branch-name
For more information about available container images, see :ref:`containers`.
Staggered Upgrades
==================

Some users may prefer to upgrade components in phases rather than all at once.
Starting in 16.2.11 and 17.2.1, the upgrade command allows parameters
to limit which daemons are upgraded by a single upgrade command. The options
include ``daemon_types``, ``services``, ``hosts`` and ``limit``. ``daemon_types``
takes a comma-separated list of daemon types and will only upgrade daemons of those
types. ``services`` is mutually exclusive with ``daemon_types``, only takes services
of one type at a time (e.g. you cannot provide an OSD and an RGW service at the same time), and
will only upgrade daemons belonging to those services. ``hosts`` can be combined
with ``daemon_types`` or ``services``, or provided on its own. The ``hosts`` parameter
follows the same format as the command line options for :ref:`orchestrator-cli-placement-spec`.
``limit`` takes an integer greater than 0 and provides a numerical limit on the number of
daemons cephadm will upgrade. ``limit`` can be combined with any of the other
parameters. For example, if you specify to upgrade daemons of type osd on host
Host1 with ``limit`` set to 3, cephadm will upgrade (up to) 3 osd daemons on
Host1.
Example: specifying daemon types and hosts:
ceph orch upgrade start --image <image-name> --daemon-types mgr,mon --hosts host1,host2
Example: specifying services and using limit:
ceph orch upgrade start --image <image-name> --services rgw.example1,rgw.example2 --limit 2
Cephadm strictly enforces an order to the upgrade of daemons that is still present
in staggered upgrade scenarios. The current upgrade order is
``mgr -> mon -> crash -> osd -> mds -> rgw -> rbd-mirror -> cephfs-mirror -> iscsi -> nfs``.
If you specify parameters that would upgrade daemons out of order, the upgrade
command will block and note which daemons will be missed if you proceed.
Upgrade commands with limiting parameters will validate the options before beginning the
upgrade, which may require pulling the new container image. Do not be surprised
if the upgrade start command takes a while to return when limiting parameters are provided.
In staggered upgrade scenarios (when a limiting parameter is provided), monitoring
stack daemons, including Prometheus and node-exporter, are refreshed after the Manager
daemons have been upgraded. Do not be surprised if Manager upgrades therefore take longer
than expected. Note that the versions of monitoring stack daemons may not change between
Ceph releases, in which case they are only redeployed.
Upgrading to a version that supports staggered upgrade from one that doesn't
----------------------------------------------------------------------------
When upgrading from a version that already supports staggered upgrades, the process
simply requires providing the necessary arguments. However, if you wish to upgrade
to a version that supports staggered upgrade from one that does not, there is a
workaround: first manually upgrade the Manager daemons, and then pass
the limiting parameters as usual.
Make sure you have multiple running mgr daemons before attempting this procedure.
To start, determine which Manager is active and which are standby. This
can be done in a variety of ways, such as looking at the ``ceph -s`` output. Then,
manually upgrade each standby mgr daemon with:
ceph orch daemon redeploy mgr.example1.abcdef --image <new-image-name>
If you are on a very early version of cephadm (early Octopus), the ``orch daemon redeploy``
command may not have the ``--image`` flag. In that case, you must manually set the
Manager container image with ``ceph config set mgr container_image <new-image-name>`` and then
redeploy the Manager with ``ceph orch daemon redeploy mgr.example1.abcdef``.
At this point, a Manager failover should result in the active Manager being one
running the new version.
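The failover can be triggered manually:

.. code-block:: console

   ceph mgr fail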
Verify that the active Manager is now one running the new version. To complete
the upgrade of the Manager daemons, run:
ceph orch upgrade start --image <new-image-name> --daemon-types mgr
You should now have all your Manager daemons on the new version and be able to
specify the limiting parameters for the rest of the upgrade.
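For example, after the Manager daemons are upgraded, the remaining daemons can
be upgraded in stages (the daemon types shown here are illustrative; any
combination of the limiting parameters described above can be used):

.. code-block:: console

   ceph orch upgrade start --image <new-image-name> --daemon-types mon,crash,osd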