>=19.0.0

* RGW: S3 multipart uploads using Server-Side Encryption now replicate correctly in
  multi-site. Previously, the replicas of such objects were corrupted on decryption.
  A new tool, ``radosgw-admin bucket resync encrypted multipart``, can be used to
  identify these original multipart uploads. The ``LastModified`` timestamp of any
  identified object is incremented by 1ns to cause peer zones to replicate it again.
  For multi-site deployments that make any use of Server-Side Encryption, we
  recommend running this command against every bucket in every zone after all
  zones have upgraded.
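  For example (the bucket name is illustrative; ``--bucket`` is the usual
  radosgw-admin bucket selector), run once per bucket in each zone:

    radosgw-admin bucket resync encrypted multipart --bucket mybucket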
* CEPHFS: The MDS now evicts clients which are not advancing their request tids.
  Such clients cause a large buildup of session metadata, which can result in the
  MDS going read-only when the corresponding RADOS operation exceeds the size
  threshold. The `mds_session_metadata_threshold` config option controls the
  maximum size that the (encoded) session metadata can grow to.
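  For example, the threshold can be adjusted with ``ceph config`` (the value shown
  is illustrative):

    ceph config set mds mds_session_metadata_threshold 16777216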
* RGW: New tools have been added to radosgw-admin for identifying and
  correcting issues with versioned bucket indexes. Historical bugs with the
  versioned bucket index transaction workflow made it possible for the index
  to accumulate extraneous "book-keeping" olh entries and plain placeholder
  entries. In some specific scenarios where clients made concurrent requests
  referencing the same object key, it was likely that a lot of extra index
  entries would accumulate. When a significant number of these entries are
  present in a single bucket index shard, they can cause high bucket listing
  latencies and lifecycle processing failures. To check whether a versioned
  bucket has unnecessary olh entries, users can now run ``radosgw-admin
  bucket check olh``. If the ``--fix`` flag is used, the extra entries will
  be safely removed. Separately from the issue described above, it is also
  possible that some versioned buckets are maintaining extra unlinked
  objects that are not listable from the S3/Swift APIs. These extra objects
  are typically a result of PUT requests that exited abnormally, in the middle
  of a bucket index transaction - so the client would not have received a
  successful response. Bugs in prior releases made these unlinked objects easy
  to reproduce with any PUT request that was made on a bucket that was actively
  resharding. Besides the extra space that these hidden, unlinked objects
  consume, there can be another side effect in certain scenarios, caused by
  the nature of the failure mode that produced them, where a client of a bucket
  that was a victim of this bug may find the object associated with the key to
  be in an inconsistent state. To check whether a versioned bucket has unlinked
  entries, users can now run ``radosgw-admin bucket check unlinked``. If the
  ``--fix`` flag is used, the unlinked objects will be safely removed. Finally,
  a third issue made it possible for versioned bucket index stats to be
  accounted inaccurately. The tooling for recalculating versioned bucket stats
  also had a bug, and was not previously capable of fixing these inaccuracies.
  This release resolves those issues, and users can now expect the existing
  ``radosgw-admin bucket check`` command to produce correct results. We
  recommend that users with versioned buckets, especially those that existed
  on prior releases, use these new tools to check whether their buckets are
  affected and to clean them up accordingly.
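  For example (the bucket name is illustrative; omit ``--fix`` first to inspect
  before repairing):

    radosgw-admin bucket check olh --bucket mybucket --fix
    radosgw-admin bucket check unlinked --bucket mybucket --fix
    radosgw-admin bucket check --bucket mybucket --fix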
* mgr/snap-schedule: For clusters with multiple CephFS file systems, all the
  snap-schedule commands now expect the '--fs' argument.
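  For example (path and file system name are illustrative):

    ceph fs snap-schedule add /volumes/mygroup/mysubvol 1h --fs cephfs_a
    ceph fs snap-schedule status /volumes/mygroup/mysubvol --fs cephfs_a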

>=18.0.0

* The RGW policy parser now rejects unknown principals by default. If you are
  mirroring policies between RGW and AWS, you may wish to set
  "rgw policy reject invalid principals" to "false". This affects only newly set
  policies, not policies that are already in place.
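  A minimal sketch of doing this via ``ceph config`` (the section name depends on
  how your RGW daemons are configured):

    ceph config set client.rgw rgw_policy_reject_invalid_principals false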
* RGW's default backend for `rgw_enable_ops_log` changed from RADOS to file.
  The default value of `rgw_ops_log_rados` is now false, and `rgw_ops_log_file_path`
  defaults to "/var/log/ceph/ops-log-$cluster-$name.log".
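  For example, to keep logging ops to RADOS as before (again, the section name
  depends on your RGW configuration):

    ceph config set client.rgw rgw_ops_log_rados true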
* The SPDK backend for BlueStore is now able to connect to an NVMeoF target.
  Please note that this is not an officially supported feature.
* RGW's pubsub interface now returns boolean fields using bool. Before this change,
  `/topics/<topic-name>` returned "stored_secret" and "persistent" as strings
  of "true" or "false" with quotes around them. After this change, these fields
  are returned without quotes so they can be decoded as boolean values in JSON.
  The same applies to the `is_truncated` field returned by `/subscriptions/<sub-name>`.
* RGW's response of `Action=GetTopicAttributes&TopicArn=<topic-arn>` REST API now
  returns `HasStoredSecret` and `Persistent` as boolean in the JSON string
  encoded in `Attributes/EndPoint`.
* All boolean fields previously rendered as strings by the `rgw-admin` command when
  the JSON format is used are now rendered as boolean. If your scripts/tools
  rely on this behavior, please update them accordingly. The impacted field names
  are:
  * absolute
  * add
  * admin
  * appendable
  * bucket_key_enabled
  * delete_marker
  * exists
  * has_bucket_info
  * high_precision_time
  * index
  * is_master
  * is_prefix
  * is_truncated
  * linked
  * log_meta
  * log_op
  * pending_removal
  * read_only
  * retain_head_object
  * rule_exist
  * start_with_full_sync
  * sync_from_all
  * syncstopped
  * system
  * truncated
  * user_stats_sync
* RGW: The beast frontend's HTTP access log line uses a new debug_rgw_access
  configurable. This has the same defaults as debug_rgw, but can now be controlled
  independently.
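  For example, to raise only the access log verbosity (the level and section name
  are illustrative):

    ceph config set client.rgw debug_rgw_access 20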
* RBD: The semantics of compare-and-write C++ API (`Image::compare_and_write`
  and `Image::aio_compare_and_write` methods) now match those of the C API. Both
  compare and write steps operate only on `len` bytes even if the respective
  buffers are larger. The previous behavior of comparing up to the size of
  the compare buffer was prone to subtle breakage upon straddling a stripe
  unit boundary.
* RBD: The compare-and-write operation is no longer limited to 512-byte sectors.
  Assuming proper alignment, it now allows operating on stripe units (4M by
  default).
* RBD: New `rbd_aio_compare_and_writev` API method to support scatter/gather
  on both compare and write buffers. This complements the existing `rbd_aio_readv`
  and `rbd_aio_writev` methods.
* The 'AT_NO_ATTR_SYNC' macro is deprecated; please use the standard 'AT_STATX_DONT_SYNC'
  macro instead. The 'AT_NO_ATTR_SYNC' macro will be removed in the future.
* Trimming of PGLog dups is now controlled by size instead of version.
  This fixes the PGLog inflation issue that was happening when the on-line
  (in OSD) trimming got jammed after a PG split operation. Also, a new off-line
  mechanism has been added: `ceph-objectstore-tool` gained a `trim-pg-log-dups` op
  that targets situations where an OSD is unable to boot because of those inflated
  dups. If that is the case, the "You can be hit by THE DUPS BUG" warning will be
  visible in the OSD logs.
  Relevant tracker: https://tracker.ceph.com/issues/53729
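  A minimal off-line invocation might look like this (the data path and PG id are
  illustrative; the OSD must be stopped first):

    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
      --pgid 2.7 --op trim-pg-log-dups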
* RBD: The `rbd device unmap` command has gained a `--namespace` option. Support
  for namespaces was added to RBD in Nautilus 14.2.0, and it has been possible to
  map and unmap images in namespaces using the `image-spec` syntax since then,
  but the corresponding option available in most other commands was missing.
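  For example (pool, namespace and image names are illustrative), the following
  invocations should now be equivalent:

    rbd device unmap mypool/myns/myimage
    rbd device unmap --pool mypool --namespace myns myimage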
* RGW: Compression is now supported for objects uploaded with Server-Side Encryption.
  When both are enabled, compression is applied before encryption. Earlier releases
  of multisite do not replicate such objects correctly, so all zones must upgrade to
  Reef before enabling the `compress-encrypted` zonegroup feature: see
  https://docs.ceph.com/en/reef/radosgw/multisite/#zone-features and note the
  security considerations.
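  Once all zones run Reef, the feature can be enabled along these lines (the
  zonegroup name is illustrative):

    radosgw-admin zonegroup modify --rgw-zonegroup=default --enable-feature=compress-encrypted
    radosgw-admin period update --commit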
* RGW: The "pubsub" functionality for storing bucket notifications inside Ceph
  has been removed. Together with it, the "pubsub" zone should no longer be used.
  The REST operations and radosgw-admin commands for manipulating subscriptions,
  as well as those for fetching and acking notifications, have been removed as
  well.
  In case the endpoint to which the notifications are sent may be down or
  disconnected, it is recommended to use persistent notifications to guarantee
  their delivery. If the system that consumes the notifications needs to pull them
  (instead of the notifications being pushed to it), an external message bus
  (e.g. RabbitMQ, Kafka) should be used for that purpose.
* RGW: The serialized format of notification and topics has changed, so that
  new/updated topics will be unreadable by old RGWs. We recommend completing
  the RGW upgrades before creating or modifying any notification topics.
* RBD: Trailing newline in passphrase files (`<passphrase-file>` argument in
  `rbd encryption format` command and `--encryption-passphrase-file` option
  in other commands) is no longer stripped.
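  If the passphrase is not meant to end with a newline, create the file without
  one, for example:

    printf '%s' 'my passphrase' > passphrase.bin
    rbd encryption format mypool/myimage luks2 passphrase.bin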
* RBD: Support for layered client-side encryption is added. Cloned images
  can now be encrypted each with its own encryption format and passphrase,
  potentially different from that of the parent image. The efficient
  copy-on-write semantics intrinsic to unformatted (regular) cloned images
  are retained.
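  For example, a clone can be formatted with its own passphrase, independent of
  its parent (names are illustrative):

    rbd clone mypool/myparent@snap mypool/myclone
    rbd encryption format mypool/myclone luks2 clone-passphrase.bin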
* CEPHFS: Rename the `mds_max_retries_on_remount_failure` option to
  `client_max_retries_on_remount_failure` and move it from mds.yaml.in to
  mds-client.yaml.in, because this option has only ever been used by the MDS
  client.
* The `perf dump` and `perf schema` commands are deprecated in favor of new
  `counter dump` and `counter schema` commands. These new commands add support
  for labeled perf counters and also emit existing unlabeled perf counters. Some
  unlabeled perf counters became labeled in this release, with more to follow in
  future releases; such converted perf counters are no longer emitted by the
  `perf dump` and `perf schema` commands.
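  For example, the labeled counters of a daemon can be inspected via its admin
  socket (the daemon name is illustrative):

    ceph daemon osd.0 counter dump
    ceph daemon osd.0 counter schema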
* `ceph mgr dump` command now outputs `last_failure_osd_epoch` and
  `active_clients` fields at the top level. Previously, these fields were
  output under `always_on_modules` field.
* `ceph mgr dump` command now displays the name of the mgr module that
  registered a RADOS client in the `name` field added to elements of the
  `active_clients` array. Previously, only the address of a module's RADOS
  client was shown in the `active_clients` array.
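  For example, assuming `jq` is available, the new fields can be inspected with:

    ceph mgr dump | jq '{last_failure_osd_epoch, active_clients: [.active_clients[].name]}'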
* RBD: All rbd-mirror daemon perf counters became labeled and as such are now
  emitted only by the new `counter dump` and `counter schema` commands. As part
  of the conversion, many also got renamed to better disambiguate journal-based
  and snapshot-based mirroring.
* RBD: list-watchers C++ API (`Image::list_watchers`) now clears the passed
  `std::list` before potentially appending to it, aligning with the semantics
  of the corresponding C API (`rbd_watchers_list`).
* The rados python binding is now able to process (opt-in) omap keys as bytes
  objects. This enables interacting with RADOS omap keys that are not decodable as
  UTF-8 strings.
* Telemetry: Users who are opted-in to telemetry can also opt-in to
  participating in a leaderboard in the telemetry public
  dashboards (https://telemetry-public.ceph.com/). Users can now also add a
  description of the cluster to publicly appear in the leaderboard.
  For more details, see:
  https://docs.ceph.com/en/latest/mgr/telemetry/#leaderboard
  See a sample report with `ceph telemetry preview`.
  Opt-in to telemetry with `ceph telemetry on`.
  Opt-in to the leaderboard with
  `ceph config set mgr mgr/telemetry/leaderboard true`.
  Add leaderboard description with:
  `ceph config set mgr mgr/telemetry/leaderboard_description 'Cluster description'`.
* CEPHFS: After recovering a Ceph File System by following the disaster recovery
  procedure, the recovered files under the `lost+found` directory can now be deleted.
* core: cache-tiering is now deprecated.
* mClock Scheduler: The mClock scheduler (default scheduler in Quincy) has
  undergone significant usability and design improvements to address the slow
  backfill issue. Some important changes are:
  * The 'balanced' profile is set as the default mClock profile because it
    represents a compromise between prioritizing client IO and recovery IO. Users
    can then choose either the 'high_client_ops' profile to prioritize client IO
    or the 'high_recovery_ops' profile to prioritize recovery IO.
  * QoS parameters like reservation and limit are now specified in terms of a
    fraction (range: 0.0 to 1.0) of the OSD's IOPS capacity.
  * The cost parameters (osd_mclock_cost_per_io_usec_* and
    osd_mclock_cost_per_byte_usec_*) have been removed. The cost of an operation
    is now determined using the random IOPS and maximum sequential bandwidth
    capability of the OSD's underlying device.
  * Degraded object recovery is given higher priority when compared to misplaced
    object recovery because degraded objects present a data safety issue not
    present with objects that are merely misplaced. Therefore, backfilling
    operations with the 'balanced' and 'high_client_ops' mClock profiles may
    progress slower than what was seen with the 'WeightedPriorityQueue' (WPQ)
    scheduler.
  * The QoS allocations in all the mClock profiles are optimized based on the above
    fixes and enhancements.
  * For more detailed information see:
    https://docs.ceph.com/en/reef/rados/configuration/mclock-config-ref/
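  For example, to prioritize recovery IO, or to tune a QoS fraction directly
  (values are illustrative; tuning individual QoS parameters requires the
  'custom' profile):

    ceph config set osd osd_mclock_profile high_recovery_ops
    ceph config set osd osd_mclock_profile custom
    ceph config set osd osd_mclock_scheduler_client_res 0.4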
* mgr/snap_schedule: The snap-schedule mgr module now retains one less snapshot
  than the number mentioned against the config tunable `mds_max_snaps_per_dir`
  so that a new snapshot can be created and retained during the next schedule
  run.

>=17.2.1

* The "BlueStore zero block detection" feature (first introduced to Quincy in
  https://github.com/ceph/ceph/pull/43337) has been turned off by default with a
  new global configuration called `bluestore_zero_block_detection`. This feature,
  intended for large-scale synthetic testing, does not interact well with some RBD
  and CephFS features. Any side effects experienced in previous Quincy versions
  would no longer occur, provided that the configuration remains set to false.
  Relevant tracker: https://tracker.ceph.com/issues/55521
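  To verify or change the setting, for example:

    ceph config get osd bluestore_zero_block_detection
    ceph config set global bluestore_zero_block_detection false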

* telemetry: Added new Rook metrics to the 'basic' channel to report Rook's
  version, Kubernetes version, node metrics, etc.
  See a sample report with `ceph telemetry preview`.
  Opt-in with `ceph telemetry on`.

  For more details, see:

  https://docs.ceph.com/en/latest/mgr/telemetry/

* OSD: The issue of high CPU utilization during recovery/backfill operations
  has been fixed. For more details, see: https://tracker.ceph.com/issues/56530.

>=15.2.17

* OSD: Octopus modified the SnapMapper key format from
  <LEGACY_MAPPING_PREFIX><snapid>_<shardid>_<hobject_t::to_str()>
  to
  <MAPPING_PREFIX><pool>_<snapid>_<shardid>_<hobject_t::to_str()>
  When this change was introduced, commit 94ebe0e also introduced a conversion
  with a crucial bug which essentially destroyed legacy keys by mapping them
  to
  <MAPPING_PREFIX><poolid>_<snapid>_
  without the object-unique suffix. The conversion is fixed in this release.
  Relevant tracker: https://tracker.ceph.com/issues/56147

* Cephadm may now be configured to carry out CephFS MDS upgrades without
  reducing ``max_mds`` to 1. Previously, Cephadm would reduce ``max_mds`` to 1 to
  avoid having two active MDS modifying on-disk structures with new versions,
  communicating cross-version-incompatible messages, or other potential
  incompatibilities. This could be disruptive for large-scale CephFS deployments
  because the cluster cannot easily reduce active MDS daemons to 1.
  NOTE: A staggered upgrade of the mons/mgrs may be necessary to take advantage
  of this feature; refer to this link on how to perform it:
  https://docs.ceph.com/en/quincy/cephadm/upgrade/#staggered-upgrade
  Relevant tracker: https://tracker.ceph.com/issues/55715
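  For example, a staggered upgrade that updates only the mgr and mon daemons first
  might look like this (the image is a placeholder; see the linked documentation):

    ceph orch upgrade start --image <new-ceph-image> --daemon-types mgr,mon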
* Introduced a new file system flag `refuse_client_session` that can be set using
  the `fs set` command. This flag allows blocking any incoming session
  request from client(s). This can be useful during some recovery situations
  where it's desirable to bring the MDS up but have no client workload.
  Relevant tracker: https://tracker.ceph.com/issues/57090
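  For example (the file system name is illustrative):

    ceph fs set myfs refuse_client_session true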