==============
Upgrading Ceph
==============

Cephadm can safely upgrade Ceph from one bugfix release to the next. For
example, you can upgrade from v15.2.0 (the first Octopus release) to the next
point release, v15.2.1.

The automated upgrade process follows Ceph best practices. For example:

* The upgrade order starts with managers, monitors, then other daemons.
* Each daemon is restarted only after Ceph indicates that the cluster
  will remain available.

.. note::

   The Ceph cluster health status is likely to switch to
   ``HEALTH_WARNING`` during the upgrade.

.. note::

   In case a host of the cluster is offline, the upgrade is paused.

Starting the upgrade
====================

.. note::

   A `Staggered Upgrade`_ of the mons/mgrs may be necessary to gain access
   to the ``fail_fs`` feature described below.

Cephadm by default reduces ``max_mds`` to ``1``. This can be disruptive for
large-scale CephFS deployments because the cluster cannot quickly reduce the
number of active MDS daemons to ``1``, and a single active MDS cannot easily
handle the load of all clients even for a short time. Therefore, to upgrade
MDS daemons without reducing ``max_mds``, set the ``fail_fs`` option to
``true`` (the default value is ``false``) prior to initiating the upgrade:

.. prompt:: bash #

   ceph config set mgr mgr/orchestrator/fail_fs true

This will:

#. Fail the CephFS filesystems, bringing the active MDS daemon(s) to the
   ``up:standby`` state.

#. Upgrade the MDS daemons safely.

#. Bring the CephFS filesystems back up, returning the active MDS daemon(s)
   from ``up:standby`` to ``up:active``.

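To confirm that the option took effect, you can read the value back:

.. prompt:: bash #

   ceph config get mgr mgr/orchestrator/fail_fs
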
Before you use cephadm to upgrade Ceph, verify that all hosts are currently
online and that your cluster is healthy by running the following command:

.. prompt:: bash #

   ceph -s

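A healthy cluster reports ``HEALTH_OK``. An abbreviated, illustrative
excerpt (your cluster ID, daemon counts, and versions will differ):

.. code-block:: console

   # ceph -s
     cluster:
       id:     <fsid>
       health: HEALTH_OK
   [...]
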
To upgrade (or downgrade) to a specific release, run the following command:

.. prompt:: bash #

   ceph orch upgrade start --ceph-version <version>

For example, to upgrade to v16.2.6, run the following command:

.. prompt:: bash #

   ceph orch upgrade start --ceph-version 16.2.6

.. note::

   As of v16.2.6 the Docker Hub registry is no longer used, so if you use
   Docker you have to point it to the image in the quay.io registry:

   .. prompt:: bash #

      ceph orch upgrade start --image quay.io/ceph/ceph:v16.2.6

Monitoring the upgrade
======================

Determine (1) whether an upgrade is in progress and (2) which version the
cluster is upgrading to by running the following command:

.. prompt:: bash #

   ceph orch upgrade status

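This prints JSON-formatted output similar to the following (the field
names and values shown here are illustrative and vary by release):

.. code-block:: console

   # ceph orch upgrade status
   {
       "target_image": "quay.io/ceph/ceph:v16.2.6",
       "in_progress": true,
       "services_complete": ["mgr", "mon"],
       "message": ""
   }
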
Watching the progress bar during a Ceph upgrade
-----------------------------------------------

During the upgrade, a progress bar is visible in the ceph status output. It
looks like this:

.. code-block:: console

   # ceph -s

   [...]
     progress:
       Upgrade to docker.io/ceph/ceph:v15.2.1 (00h 20m 12s)
         [=======.....................] (time remaining: 01h 43m 31s)

Watching the cephadm log during an upgrade
------------------------------------------

Watch the cephadm log by running the following command:

.. prompt:: bash #

   ceph -W cephadm

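During an upgrade, the log shows a progress message as each daemon is
updated. The lines below are abbreviated and illustrative; the exact
wording varies by release:

.. code-block:: console

   [INF] Upgrade: Target is quay.io/ceph/ceph:v16.2.6
   [INF] Upgrade: Updating mgr.host1.abcxyz
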
Canceling an upgrade
====================

You can stop the upgrade process at any time by running the following command:

.. prompt:: bash #

   ceph orch upgrade stop

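Stopping an upgrade does not roll anything back: daemons that have already
been upgraded remain on the new version. To confirm that no upgrade is in
progress, check the status again:

.. prompt:: bash #

   ceph orch upgrade status
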
Post upgrade actions
====================

If the new version is managed by ``cephadm``, then once the upgrade is
complete you must update the ``cephadm`` package (or the ``ceph-common``
package, if you do not use ``cephadm shell``) to a version compatible with
the new Ceph version.

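For example, on a Debian- or Ubuntu-based admin host (an illustrative
sketch; use your distribution's package manager and the repositories for
the new release):

.. prompt:: bash #

   apt update
   apt install --only-upgrade cephadm
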
Potential problems
==================

There are a few health alerts that can arise during the upgrade process.

UPGRADE_NO_STANDBY_MGR
----------------------

This alert (``UPGRADE_NO_STANDBY_MGR``) means that Ceph does not detect an
active standby manager daemon. In order to proceed with the upgrade, Ceph
requires an active standby manager daemon (which you can think of in this
context as "a second manager").

You can ensure that Cephadm is configured to run 2 (or more) managers by
running the following command:

.. prompt:: bash #

   ceph orch apply mgr 2  # or more

You can check the status of existing mgr daemons by running the following
command:

.. prompt:: bash #

   ceph orch ps --daemon-type mgr

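The output lists one row per mgr daemon. An abbreviated, illustrative
example (the exact columns vary by release):

.. code-block:: console

   NAME              HOST   STATUS   VERSION
   mgr.host1.abcxyz  host1  running  16.2.5
   mgr.host2.defuvw  host2  running  16.2.5
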
If an existing mgr daemon has stopped, you can try to restart it by running
the following command:

.. prompt:: bash #

   ceph orch daemon restart <name>

UPGRADE_FAILED_PULL
-------------------

This alert (``UPGRADE_FAILED_PULL``) means that Ceph was unable to pull the
container image for the target version. This can happen if you specify a
version or container image that does not exist (e.g. "1.2.3"), or if the
container registry cannot be reached by one or more hosts in the cluster.

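One way to check registry reachability is to pull the image manually on an
affected host (this assumes ``podman`` as the container engine; substitute
``docker`` if that is what your hosts use):

.. prompt:: bash #

   podman pull quay.io/ceph/ceph:v16.2.6
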
To cancel the existing upgrade and to specify a different target version, run
the following commands:

.. prompt:: bash #

   ceph orch upgrade stop
   ceph orch upgrade start --ceph-version <version>

Using customized container images
=================================

For most users, upgrading requires nothing more complicated than specifying the
Ceph version number to upgrade to. In such cases, cephadm locates the specific
Ceph container image to use by combining the ``container_image_base``
configuration option (default: ``docker.io/ceph/ceph``) with a tag of
``vX.Y.Z``.

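A minimal sketch of inspecting and overriding that option, assuming it is
exposed through the cephadm mgr module as ``mgr/cephadm/container_image_base``:

.. prompt:: bash #

   ceph config get mgr mgr/cephadm/container_image_base
   ceph config set mgr mgr/cephadm/container_image_base quay.io/ceph/ceph
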
But it is possible to upgrade to an arbitrary container image, if that's what
you need. For example, the following command upgrades to a development build:

.. prompt:: bash #

   ceph orch upgrade start --image quay.io/ceph-ci/ceph:recent-git-branch-name

For more information about available container images, see :ref:`containers`.

Staggered Upgrade
=================

Some users may prefer to upgrade components in phases rather than all at once.
Starting in 16.2.11 and 17.2.1, the upgrade command accepts parameters that
limit which daemons are upgraded by a single upgrade command. The options
include ``daemon_types``, ``services``, ``hosts`` and ``limit``. ``daemon_types``
takes a comma-separated list of daemon types and will only upgrade daemons of those
types. ``services`` is mutually exclusive with ``daemon_types``, only takes services
of one type at a time (e.g. it cannot take an OSD and an RGW service at the same
time), and will only upgrade daemons belonging to those services. ``hosts`` can be
combined with ``daemon_types`` or ``services``, or provided on its own. The
``hosts`` parameter follows the same format as the command line options for
:ref:`orchestrator-cli-placement-spec`. ``limit`` takes an integer > 0 and caps
the number of daemons cephadm will upgrade. ``limit`` can be combined with any of
the other parameters. For example, if you specify to upgrade daemons of type osd
on host Host1 with ``limit`` set to 3, cephadm will upgrade (up to) 3 osd daemons
on Host1.

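In releases that provide it, ``ceph orch upgrade check`` can be used to
preview which daemons would need to be upgraded to a given image before
starting a staggered upgrade:

.. prompt:: bash #

   ceph orch upgrade check --image <image-name>
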
Example: specifying daemon types and hosts:

.. prompt:: bash #

   ceph orch upgrade start --image <image-name> --daemon-types mgr,mon --hosts host1,host2

Example: specifying services and using limit:

.. prompt:: bash #

   ceph orch upgrade start --image <image-name> --services rgw.example1,rgw.example2 --limit 2

.. note::

   Cephadm strictly enforces an order to the upgrade of daemons that is still present
   in staggered upgrade scenarios. The current upgrade ordering is
   ``mgr -> mon -> crash -> osd -> mds -> rgw -> rbd-mirror -> cephfs-mirror -> iscsi -> nfs``.
   If you specify parameters that would upgrade daemons out of order, the upgrade
   command will block and note which daemons will be missed if you proceed.

.. note::

   Upgrade commands with limiting parameters will validate the options before beginning the
   upgrade, which may require pulling the new container image. Do not be surprised
   if the upgrade start command takes a while to return when limiting parameters are provided.

.. note::

   In staggered upgrade scenarios (when a limiting parameter is provided), monitoring
   stack daemons including Prometheus and node-exporter are refreshed after the Manager
   daemons have been upgraded. Do not be surprised if Manager upgrades thus take longer
   than expected. Note that the versions of monitoring stack daemons may not change between
   Ceph releases, in which case they are only redeployed.

Upgrading to a version that supports staggered upgrade from one that doesn't
----------------------------------------------------------------------------

When upgrading from a version that already supports staggered upgrades, the
process simply requires providing the necessary arguments. However, if you wish
to upgrade to a version that supports staggered upgrade from one that does not,
there is a workaround. It requires first manually upgrading the Manager daemons
and then passing the limiting parameters as usual.

.. warning::

   Make sure you have multiple running mgr daemons before attempting this
   procedure.

To start, determine which Manager is active and which are standby. This can be
done in a variety of ways, such as looking at the ``ceph -s`` output. Then,
manually upgrade each standby mgr daemon with:

.. prompt:: bash #

   ceph orch daemon redeploy mgr.example1.abcdef --image <new-image-name>

.. note::

   If you are on a very early version of cephadm (early Octopus), the ``orch daemon
   redeploy`` command may not have the ``--image`` flag. In that case, you must
   manually set the Manager container image (``ceph config set mgr container_image
   <new-image-name>``) and then redeploy the Manager
   (``ceph orch daemon redeploy mgr.example1.abcdef``).

At this point, a Manager failover should make the active Manager one that is
running the new version:

.. prompt:: bash #

   ceph mgr fail

Verify that the active Manager is now running the new version.

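One way to check is with ``ceph versions``, whose output maps each running
version to daemon counts (illustrative; your output will differ):

.. prompt:: bash #

   ceph versions

To complete the upgrade of the Manager daemons:
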
.. prompt:: bash #

   ceph orch upgrade start --image <new-image-name> --daemon-types mgr

You should now have all your Manager daemons on the new version and be able to
specify the limiting parameters for the rest of the upgrade.