======
Quincy
======

Quincy is the 17th stable release of Ceph. It is named after Squidward
Quincy Tentacles from Spongebob Squarepants.

v17.2.0 Quincy
==============

This is the first stable release of Ceph Quincy.

Major Changes from Pacific
--------------------------

General
~~~~~~~

* Filestore has been deprecated in Quincy. BlueStore is Ceph's default object
  store.

* The `ceph-mgr-modules-core` debian package no longer recommends
  `ceph-mgr-rook`. `ceph-mgr-rook` depends on `python3-numpy`, which
  cannot be imported in different Python sub-interpreters multiple times
  when the version of `python3-numpy` is older than 1.19. Because
  `apt-get` installs the `Recommends` packages by default, `ceph-mgr-rook`
  was always installed along with the `ceph-mgr` debian package as an
  indirect dependency. If your workflow depends on this behavior, you
  might want to install `ceph-mgr-rook` separately.

* The ``device_health_metrics`` pool has been renamed ``.mgr``. It is now
  used as a common store for all ``ceph-mgr`` modules. After upgrading to
  Quincy, the ``device_health_metrics`` pool will be renamed to ``.mgr``
  on existing clusters.

* The ``ceph pg dump`` command now prints three additional columns:
  `LAST_SCRUB_DURATION` shows the duration (in seconds) of the last completed
  scrub;
  `SCRUB_SCHEDULING` conveys whether a PG is scheduled to be scrubbed at a
  specified time, whether it is queued for scrubbing, or whether it is being
  scrubbed;
  `OBJECTS_SCRUBBED` shows the number of objects scrubbed in a PG after a
  scrub begins.

* A health warning is now reported if the ``require-osd-release`` flag
  is not set to the appropriate release after a cluster upgrade.

* LevelDB support has been removed. ``WITH_LEVELDB`` is no longer a supported
  build option. Users *should* migrate their monitors and OSDs to RocksDB
  before upgrading to Quincy.

* Cephadm: ``osd_memory_target_autotune`` is enabled by default, which sets
  ``mgr/cephadm/autotune_memory_target_ratio`` to ``0.7`` of total RAM. This
  is unsuitable for hyperconverged infrastructures. For hyperconverged Ceph,
  please refer to the documentation or set
  ``mgr/cephadm/autotune_memory_target_ratio`` to ``0.2``.
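
  For example, on a hyperconverged deployment the ratio can be lowered with a
  single config command (the value shown is the one suggested above; pick a
  value appropriate for your hardware)::

    ceph config set mgr mgr/cephadm/autotune_memory_target_ratio 0.2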

* telemetry: Improved the opt-in flow so that users can keep sharing the same
  data, even when new data collections are available. A new 'perf' channel that
  collects various performance metrics is now available to opt into with:
  `ceph telemetry on`
  `ceph telemetry enable channel perf`
  See a sample report with `ceph telemetry preview`.
  Note that generating a telemetry report with 'perf' channel data might
  take a few moments in big clusters.
  For more details, see:
  https://docs.ceph.com/en/quincy/mgr/telemetry/

* MGR: The progress module disables the pg recovery event by default since the
  event is expensive and has interrupted other services when there are OSDs
  being marked in/out from the cluster. However, the user can still enable
  this event anytime. For more detail, see:

  https://docs.ceph.com/en/quincy/mgr/progress/

* https://tracker.ceph.com/issues/55383 is a known issue: when
  ``mon_cluster_log_to_file`` is set to true, ``mon_cluster_log_to_journald``
  must be set to false, otherwise cluster log messages are no longer written
  to the log file after log rotation.
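
  A minimal sketch of the workaround, using only the two options named above
  (set via the central config database)::

    ceph config set mon mon_cluster_log_to_file true
    ceph config set mon mon_cluster_log_to_journald false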

Cephadm
~~~~~~~

* SNMP Support
* Colocation of Daemons (mgr, mds, rgw)
* osd memory autotuning
* Integration with new NFS mgr module
* Ability to zap osds as they are removed
* cephadm agent for increased performance/scalability

Dashboard
~~~~~~~~~
* Day 1: the new "Cluster Expansion Wizard" will guide users through post-install steps:
  adding new hosts, storage devices or services.
* NFS: the Dashboard now allows users to fully manage all NFS exports from a single place.
* New mgr module (feedback): users can quickly report Ceph tracker issues
  or suggestions directly from the Dashboard or the CLI.
* New "Message of the Day": cluster admins can publish a custom message in a banner.
* Cephadm integration improvements:

  * Host management: maintenance, specs and labelling,
  * Service management: edit and display logs,
  * Daemon management (start, stop, restart, reload),
  * New services supported: ingress (HAProxy) and SNMP-gateway.

* Monitoring and alerting:

  * 43 new alerts have been added (totalling 68) improving observability of events affecting:
    cluster health, monitors, storage devices, PGs and CephFS.
  * Alerts can now be sent externally as SNMP traps via the new SNMP gateway service
    (the MIB is provided).
  * Improved integrated full/nearfull event notifications.
  * Grafana Dashboards now use grafonnet format (though they're still available
    in JSON format).
  * Stack update: images for monitoring containers have been updated.
    Grafana 8.3.5, Prometheus 2.33.4, Alertmanager 0.23.0 and Node Exporter 1.3.1.
    This reduced exposure to several Grafana vulnerabilities (CVE-2021-43798,
    CVE-2021-39226, CVE-2020-29510, CVE-2020-29511).

RADOS
~~~~~

* OSD: Ceph now uses `mclock_scheduler` for BlueStore OSDs as its default
  `osd_op_queue` to provide QoS. The 'mclock_scheduler' is not supported
  for Filestore OSDs. Therefore, the default 'osd_op_queue' is set to `wpq`
  for Filestore OSDs and is enforced even if the user attempts to change it.
  For more details on configuring mclock, see:

  https://docs.ceph.com/en/quincy/rados/configuration/mclock-config-ref/

  An outstanding issue exists during runtime where the mclock config options
  related to reservation, weight and limit cannot be modified after switching
  to the `custom` mclock profile using the `ceph config set ...` command.
  This is tracked by https://tracker.ceph.com/issues/55153. Until the issue
  is fixed, users are advised to avoid using the 'custom' profile or to use
  the workaround mentioned in the tracker.
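
  The built-in profiles can still be selected with ``ceph config set``; a
  brief sketch (``high_client_ops`` and ``osd.0`` are illustrative, assuming
  the profile names from the linked mclock documentation)::

    ceph config set osd osd_mclock_profile high_client_ops   # apply to all OSDs
    ceph config show osd.0 osd_mclock_profile                # verify on one OSD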

* MGR: The pg_autoscaler can now be turned `on` and `off` globally
  with the `noautoscale` flag. By default, it is set to `on`, but this flag
  can come in handy to prevent rebalancing triggered by autoscaling during
  cluster upgrade and maintenance. Pools can now be created with the `--bulk`
  flag, which allows the autoscaler to allocate more PGs to such pools. This
  can be useful to get better out-of-the-box performance for data-heavy pools.

  For more details about autoscaling, see:
  https://docs.ceph.com/en/quincy/rados/operations/placement-groups/
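
  For example, a pool that is expected to hold a large share of the cluster's
  data can be created with the ``--bulk`` flag (the pool name is illustrative)::

    ceph osd pool create mybigpool --bulk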

* OSD: Support for on-wire compression for osd-osd communication, `off` by
  default.

  For more details about compression modes, see:
  https://docs.ceph.com/en/quincy/rados/configuration/msgr2/#compression-modes
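
  A hedged example of turning it on for OSDs (assuming the
  ``ms_osd_compress_mode`` option described in the linked msgr2 documentation)::

    ceph config set osd ms_osd_compress_mode force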

* OSD: Concise reporting of slow operations in the cluster log. The old
  and more verbose logging behavior can be regained by setting
  `osd_aggregated_slow_ops_logging` to false.
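
  For example, to restore the old, more verbose behavior cluster-wide::

    ceph config set osd osd_aggregated_slow_ops_logging false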

* The "kvs" Ceph object class is no longer packaged. This object class
  offers a distributed flat b-tree key-value store that is implemented on
  top of the omap of librados objects. Because there are no existing
  internal users of this object class, it is no longer packaged.

RBD block storage
~~~~~~~~~~~~~~~~~

* rbd-nbd: `rbd device attach` and `rbd device detach` commands have been
  added. Since Linux kernel 5.14, these allow for safe reattachment after
  the `rbd-nbd` daemon is restarted.
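
  A brief, hedged sketch (the pool/image names and the nbd device node are
  illustrative)::

    rbd device attach -t nbd --device /dev/nbd0 rbd/myimage
    rbd device detach -t nbd rbd/myimage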

* rbd-nbd: `notrim` map option added to support thick-provisioned images,
  similar to krbd.

* Large stabilization effort for client-side persistent caching on SSD
  devices, also available in 16.2.8. For details on usage, see:

  https://docs.ceph.com/en/quincy/rbd/rbd-persistent-write-log-cache/

* Several bug fixes in diff calculation when using fast-diff image
  feature + whole object (inexact) mode. In some rare cases these
  long-standing issues could cause an incorrect `rbd export`. Also
  fixed in 15.2.16 and 16.2.8.

* Fix for a potential performance degradation when running Windows VMs
  on krbd. For details, see `rxbounce` map option description:

  https://docs.ceph.com/en/quincy/man/8/rbd/#kernel-rbd-krbd-options

RGW object storage
~~~~~~~~~~~~~~~~~~

* RGW now supports rate limiting by user and/or by bucket. With this
  feature it is possible to limit the number of operations and/or the
  number of bytes per minute that a user or a bucket is allowed to issue.
  The admin can also choose to limit only READ operations and/or only
  WRITE operations. The rate-limiting configuration can be applied to all
  users and all buckets by using a global configuration.
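
  A hedged example of limiting a single user (the uid and limits are
  illustrative; see the Quincy RGW documentation for the full
  ``radosgw-admin ratelimit`` syntax)::

    radosgw-admin ratelimit set --ratelimit-scope=user --uid=testuser \
        --max-read-ops=1024 --max-write-bytes=10485760
    radosgw-admin ratelimit enable --ratelimit-scope=user --uid=testuser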

* `radosgw-admin realm delete` has been renamed to `radosgw-admin realm
  rm`. This is consistent with the help message.

* S3 bucket notification events now contain an `eTag` key instead of
  `etag`, and eventName values no longer carry the `s3:` prefix, fixing
  deviations from the message format observed on AWS.

* It is now possible to specify SSL options and ciphers for the beast
  frontend. The default SSL options setting is
  "no_sslv2:no_sslv3:no_tlsv1:no_tlsv1_1". If you want to return to the old
  behavior, add 'ssl_options=' (empty) to the ``rgw frontends`` configuration.
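
  A sketch of a ``ceph.conf`` fragment (the section name and certificate path
  are illustrative; ``ssl_options`` carries the default shown above)::

    [client.rgw.myhost]
    rgw_frontends = beast ssl_port=443 ssl_certificate=/etc/ceph/rgw.crt ssl_options=no_sslv2:no_sslv3:no_tlsv1:no_tlsv1_1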

* The behavior for Multipart Upload was modified so that only
  CompleteMultipartUpload notification is sent at the end of the multipart
  upload. The POST notification at the beginning of the upload and the PUT
  notifications that were sent on each part are no longer sent.


CephFS distributed file system
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* fs: A file system can be created with a specific ID ("fscid"). This is
  useful in certain recovery scenarios (for example, when a monitor
  database has been lost and rebuilt, and the restored file system is
  expected to have the same ID as before).
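
  A hedged example (pool names and the fscid are illustrative; the exact
  flags, including the assumed ``--force``, are described in the CephFS
  recovery documentation)::

    ceph fs new cephfs cephfs_metadata cephfs_data --fscid 27 --force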

* fs: A file system can be renamed using the `fs rename` command. Any cephx
  credentials authorized for the old file system name will need to be
  reauthorized to the new file system name. Since the operations of the clients
  using these re-authorized IDs may be disrupted, this command requires the
  "--yes-i-really-mean-it" flag. Also, mirroring is expected to be disabled
  on the file system.
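
  For example (the file system names are illustrative)::

    ceph fs rename oldfs newfs --yes-i-really-mean-it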

* MDS upgrades no longer require all standby MDS daemons to be stopped before
  upgrading a file system's sole active MDS.

* CephFS: Failure to replay the journal by a standby-replay daemon now
  causes the rank to be marked "damaged".

Upgrading from Octopus or Pacific
---------------------------------

Quincy does not support LevelDB. Please migrate your OSDs and monitors
to RocksDB before upgrading to Quincy.

Before starting, make sure your cluster is stable and healthy (no down or
recovering OSDs). You can disable the autoscaler for all pools during the
upgrade using the ``noautoscale`` flag. (This is optional, but recommended.)
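
For example, using only the flag named above::

  ceph osd pool set noautoscale     # before the upgrade: pause PG autoscaling
  ceph osd pool unset noautoscale   # after the upgrade: resume autoscaling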

.. note::

   You can monitor the progress of your upgrade at each stage with the
   ``ceph versions`` command, which will tell you what ceph version(s) are
   running for each type of daemon.

Upgrading cephadm clusters
~~~~~~~~~~~~~~~~~~~~~~~~~~

If your cluster is deployed with cephadm (first introduced in Octopus), then
the upgrade process is entirely automated. To initiate the upgrade,

.. prompt:: bash #

   ceph orch upgrade start --ceph-version 17.2.0

The same process is used to upgrade to future minor releases.

Upgrade progress can be monitored with ``ceph -s`` (which provides a simple
progress bar) or more verbosely with

.. prompt:: bash #

   ceph -W cephadm

The upgrade can be paused or resumed with

.. prompt:: bash #

   ceph orch upgrade pause   # to pause
   ceph orch upgrade resume  # to resume

or canceled with

.. prompt:: bash #

   ceph orch upgrade stop

Note that canceling the upgrade simply stops the process; there is no ability to
downgrade back to Octopus or Pacific.


Upgrading non-cephadm clusters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. note::
   If your cluster is running Octopus (15.2.x) or later, you might choose
   to first convert it to use cephadm so that the upgrade to Quincy
   is automated (see above). For more information, see
   :ref:`cephadm-adoption`.

#. Set the ``noout`` flag for the duration of the upgrade. (Optional,
   but recommended.)::

     # ceph osd set noout

#. Upgrade monitors by installing the new packages and restarting the
   monitor daemons. For example, on each monitor host,::

     # systemctl restart ceph-mon.target

   Once all monitors are up, verify that the monitor upgrade is
   complete by looking for the ``quincy`` string in the mon
   map. The command::

     # ceph mon dump | grep min_mon_release

   should report::

     min_mon_release 17 (quincy)

   If it does not, that implies that one or more monitors have not been
   upgraded and restarted, and/or that the quorum does not include all monitors.

#. Upgrade ``ceph-mgr`` daemons by installing the new packages and
   restarting all manager daemons. For example, on each manager host,::

     # systemctl restart ceph-mgr.target

   Verify the ``ceph-mgr`` daemons are running by checking ``ceph
   -s``::

     # ceph -s

     ...
       services:
         mon: 3 daemons, quorum foo,bar,baz
         mgr: foo(active), standbys: bar, baz
     ...

#. Upgrade all OSDs by installing the new packages and restarting the
   ceph-osd daemons on all OSD hosts::

     # systemctl restart ceph-osd.target

#. Upgrade all CephFS MDS daemons. For each CephFS file system,

   #. Disable standby_replay::

        # ceph fs set <fs_name> allow_standby_replay false

   #. Reduce the number of ranks to 1. (Make note of the original
      number of MDS daemons first if you plan to restore it later.)::

        # ceph status
        # ceph fs set <fs_name> max_mds 1

   #. Wait for the cluster to deactivate any non-zero ranks by
      periodically checking the status::

        # ceph status

   #. Take all standby MDS daemons offline on the appropriate hosts with::

        # systemctl stop ceph-mds@<daemon_name>

   #. Confirm that only one MDS is online and is rank 0 for your FS::

        # ceph status

   #. Upgrade the last remaining MDS daemon by installing the new
      packages and restarting the daemon::

        # systemctl restart ceph-mds.target

   #. Restart all standby MDS daemons that were taken offline::

        # systemctl start ceph-mds.target

   #. Restore the original value of ``max_mds`` for the volume::

        # ceph fs set <fs_name> max_mds <original_max_mds>

#. Upgrade all radosgw daemons by upgrading packages and restarting
   daemons on all hosts::

     # systemctl restart ceph-radosgw.target

#. Complete the upgrade by disallowing pre-Quincy OSDs and enabling
   all new Quincy-only functionality::

     # ceph osd require-osd-release quincy

#. If you set ``noout`` at the beginning, be sure to clear it with::

     # ceph osd unset noout

#. Consider transitioning your cluster to use the cephadm deployment
   and orchestration framework to simplify cluster management and
   future upgrades. For more information on converting an existing
   cluster to cephadm, see :ref:`cephadm-adoption`.

Post-upgrade
~~~~~~~~~~~~

#. Verify the cluster is healthy with ``ceph health``. If your cluster is
   running Filestore, a deprecation warning is expected. This warning can
   be temporarily muted using the following command::

     ceph health mute OSD_FILESTORE

#. If you are upgrading from Mimic, or did not already do so when you
   upgraded to Nautilus, we recommend that you enable the new :ref:`v2
   network protocol <msgr2>`. To do so, issue the following command::

     ceph mon enable-msgr2

   This will instruct all monitors that bind to the old default port
   6789 for the legacy v1 protocol to also bind to the new 3300 v2
   protocol port. To see if all monitors have been updated, run::

     ceph mon dump

   and verify that each monitor has both a ``v2:`` and ``v1:`` address
   listed.

#. Consider enabling the :ref:`telemetry module <telemetry>` to send
   anonymized usage statistics and crash information to the Ceph
   upstream developers. To see what would be reported (without actually
   sending any information to anyone), run::

     ceph telemetry preview-all

   If you are comfortable with the data that is reported, you can opt in to
   automatically report the high-level cluster metadata with::

     ceph telemetry on

   The public dashboard that aggregates Ceph telemetry can be found at
   `https://telemetry-public.ceph.com/ <https://telemetry-public.ceph.com/>`_.

   For more information about the telemetry module, see :ref:`the
   documentation <telemetry>`.


Upgrading from pre-Octopus releases (like Nautilus)
---------------------------------------------------

You *must* first upgrade to Octopus (15.2.z) or Pacific (16.2.z) before
upgrading to Quincy.