==============
Upgrading Ceph
==============

.. DANGER:: DATE: 01 NOV 2021.

   DO NOT UPGRADE TO CEPH PACIFIC FROM AN OLDER VERSION.

   A recently-discovered bug (https://tracker.ceph.com/issues/53062) can cause
   data corruption. This bug occurs during OMAP format conversion for
   clusters that are updated to Pacific. New clusters are not affected by this
   bug.

   The trigger for this bug is BlueStore's repair/quick-fix functionality. This
   bug can be triggered in two known ways:

   (1) manually, via the ceph-bluestore-tool, or
   (2) automatically, by an OSD if ``bluestore_fsck_quick_fix_on_mount`` is set
       to true.

   The fix for this bug is expected to be available in Ceph v16.2.7.

   DO NOT set ``bluestore_fsck_quick_fix_on_mount`` to true. If it is currently
   set to true in your configuration, immediately set it to false.

   DO NOT run ``ceph-bluestore-tool``'s repair/quick-fix commands.

Cephadm can safely upgrade Ceph from one bugfix release to the next. For
example, you can upgrade from v15.2.0 (the first Octopus release) to the next
point release, v15.2.1.

The automated upgrade process follows Ceph best practices. For example:

* The upgrade order starts with managers, then monitors, and then other
  daemons.
* Each daemon is restarted only after Ceph indicates that the cluster
  will remain available.

.. note::

   The Ceph cluster health status is likely to switch to
   ``HEALTH_WARNING`` during the upgrade.

.. note::

   If a host in the cluster is offline, the upgrade is paused.
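
To check which hosts cephadm can currently reach, list the cluster's hosts;
offline hosts are flagged in the ``STATUS`` column (the exact column layout
varies by release):

.. prompt:: bash #

   ceph orch host ls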


Starting the upgrade
====================

Before you use cephadm to upgrade Ceph, verify that all hosts are currently
online and that your cluster is healthy by running the following command:

.. prompt:: bash #

   ceph -s

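If the cluster's health is not ``HEALTH_OK``, you can list the specific
warnings and resolve them before upgrading:

.. prompt:: bash #

   ceph health detail
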
To upgrade (or downgrade) to a specific release, run the following command:

.. prompt:: bash #

   ceph orch upgrade start --ceph-version <version>

For example, to upgrade to v16.2.6, run the following command:

.. prompt:: bash #

   ceph orch upgrade start --ceph-version 16.2.6

.. note::

   As of v16.2.6, the Docker Hub registry is no longer used. If you use
   Docker, you must instead point to the image in the quay.io registry:

   .. prompt:: bash #

      ceph orch upgrade start --image quay.io/ceph/ceph:v16.2.6


Monitoring the upgrade
======================

Determine (1) whether an upgrade is in progress and (2) which version the
cluster is upgrading to by running the following command:

.. prompt:: bash #

   ceph orch upgrade status

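If you want to poll the status periodically, the standard ``watch`` utility
(an optional convenience, not part of cephadm) works well:

.. prompt:: bash #

   watch -n 30 ceph orch upgrade status
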
Watching the progress bar during a Ceph upgrade
-----------------------------------------------

During the upgrade, a progress bar is visible in the ceph status output. It
looks like this:

.. code-block:: console

   # ceph -s

   [...]
     progress:
       Upgrade to docker.io/ceph/ceph:v15.2.1 (00h 20m 12s)
         [=======.....................] (time remaining: 01h 43m 31s)

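These progress events come from the mgr progress module; if your release
exposes its CLI, you can also print them directly, without the rest of the
status output:

.. prompt:: bash #

   ceph progress
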
Watching the cephadm log during an upgrade
------------------------------------------

Watch the cephadm log by running the following command:

.. prompt:: bash #

   ceph -W cephadm
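
If you need more detail, debug-level log messages can be streamed instead
(assuming your release supports the flag):

.. prompt:: bash #

   ceph -W cephadm --watch-debug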


Canceling an upgrade
====================

You can stop the upgrade process at any time by running the following command:

.. prompt:: bash #

   ceph orch upgrade stop
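
Note that stopping an upgrade partway through can leave the cluster running a
mix of versions. To see which version each daemon type is running, and
therefore how far the upgrade progressed, run:

.. prompt:: bash #

   ceph versions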


Potential problems
==================

There are a few health alerts that can arise during the upgrade process.

UPGRADE_NO_STANDBY_MGR
----------------------

This alert (``UPGRADE_NO_STANDBY_MGR``) means that Ceph does not detect an
active standby manager daemon. In order to proceed with the upgrade, Ceph
requires an active standby manager daemon (which you can think of in this
context as "a second manager").

You can ensure that Cephadm is configured to run 2 (or more) managers by
running the following command:

.. prompt:: bash #

   ceph orch apply mgr 2 # or more

You can check the status of existing mgr daemons by running the following
command:

.. prompt:: bash #

   ceph orch ps --daemon-type mgr

If an existing mgr daemon has stopped, you can try to restart it by running
the following command:

.. prompt:: bash #

   ceph orch daemon restart <name>
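
For example, if ``ceph orch ps`` reported a manager daemon named
``mgr.host02.xyzwvu`` (a hypothetical name; substitute one from your own
cluster), you would run:

.. prompt:: bash #

   ceph orch daemon restart mgr.host02.xyzwvu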

UPGRADE_FAILED_PULL
-------------------

This alert (``UPGRADE_FAILED_PULL``) means that Ceph was unable to pull the
container image for the target version. This can happen if you specify a
version or container image that does not exist (e.g. "1.2.3"), or if the
container registry cannot be reached by one or more hosts in the cluster.

To cancel the existing upgrade and to specify a different target version, run
the following commands:

.. prompt:: bash #

   ceph orch upgrade stop
   ceph orch upgrade start --ceph-version <version>
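
If the version is correct but the pull still fails, it can help to confirm
that an affected host can reach the registry at all, for example by pulling
the target image manually (``podman`` is shown here; substitute ``docker``
if that is your container engine):

.. prompt:: bash #

   podman pull quay.io/ceph/ceph:v16.2.6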


Using customized container images
=================================

For most users, upgrading requires nothing more complicated than specifying the
Ceph version number to upgrade to. In such cases, cephadm locates the specific
Ceph container image to use by combining the ``container_image_base``
configuration option (default: ``docker.io/ceph/ceph``) with a tag of
``vX.Y.Z``.

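To inspect the value currently in effect, you can query the cephadm module's
configuration (the ``mgr/cephadm/container_image_base`` key is assumed here;
consult your release's documentation for the exact name):

.. prompt:: bash #

   ceph config get mgr mgr/cephadm/container_image_base
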
But it is possible to upgrade to an arbitrary container image, if that's what
you need. For example, the following command upgrades to a development build:

.. prompt:: bash #

   ceph orch upgrade start --image quay.io/ceph-ci/ceph:recent-git-branch-name

For more information about available container images, see :ref:`containers`.