>=18.0.0

* The RGW policy parser now rejects unknown principals by default. If you are
  mirroring policies between RGW and AWS, you may wish to set
  "rgw policy reject invalid principals" to "false". This affects only newly set
  policies, not policies that are already in place.
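  As a minimal sketch (assuming the usual underscore form of the option name and
  a standard deployment where RGW options live under the client.rgw section),
  the permissive behavior can be restored with:

    ceph config set client.rgw rgw_policy_reject_invalid_principals false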
* RGW's default backend for `rgw_enable_ops_log` changed from RADOS to file.
  The default value of `rgw_ops_log_rados` is now false, and `rgw_ops_log_file_path`
  defaults to "/var/log/ceph/ops-log-$cluster-$name.log".
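  For example (a sketch only; adjust the section and path to your deployment),
  the previous RADOS-backed behavior can be restored, or the log destination
  changed, with:

    ceph config set client.rgw rgw_ops_log_rados true
    ceph config set client.rgw rgw_ops_log_file_path /var/log/ceph/rgw-ops.log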
* The SPDK backend for BlueStore is now able to connect to an NVMeoF target.
  Please note that this is not an officially supported feature.
* RGW's pubsub interface now returns boolean fields using bool. Before this change,
  `/topics/<topic-name>` returned "stored_secret" and "persistent" as strings of
  "true" or "false" with quotes around them. After this change, these fields are
  returned without quotes so they can be decoded as boolean values in JSON.
  The same applies to the `is_truncated` field returned by `/subscriptions/<sub-name>`.
* RGW's response of `Action=GetTopicAttributes&TopicArn=<topic-arn>` REST API now
  returns `HasStoredSecret` and `Persistent` as boolean in the JSON string
  encoded in `Attributes/EndPoint`.
* All boolean fields previously rendered as strings by the `rgw-admin` command
  when the JSON format is used are now rendered as booleans. If your scripts or
  tools rely on this behavior, please update them accordingly. The impacted field
  names are:
  * absolute
  * add
  * admin
  * appendable
  * bucket_key_enabled
  * delete_marker
  * exists
  * has_bucket_info
  * high_precision_time
  * index
  * is_master
  * is_prefix
  * is_truncated
  * linked
  * log_meta
  * log_op
  * pending_removal
  * read_only
  * retain_head_object
  * rule_exist
  * start_with_full_sync
  * sync_from_all
  * syncstopped
  * system
  * truncated
  * user_stats_sync
* RGW: The beast frontend's HTTP access log line uses a new debug_rgw_access
  configurable. This has the same defaults as debug_rgw, but can now be controlled
  independently.
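  As a hedged sketch (the log level shown is illustrative only), access logging
  can be silenced without lowering the general RGW debug level with:

    ceph config set client.rgw debug_rgw_access 0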
* RBD: The semantics of compare-and-write C++ API (`Image::compare_and_write`
  and `Image::aio_compare_and_write` methods) now match those of C API. Both
  compare and write steps operate only on `len` bytes even if the respective
  buffers are larger. The previous behavior of comparing up to the size of
  the compare buffer was prone to subtle breakage upon straddling a stripe
  unit boundary.
* RBD: compare-and-write operation is no longer limited to 512-byte sectors.
  Assuming proper alignment, it now allows operating on stripe units (4M by
  default).
* RBD: New `rbd_aio_compare_and_writev` API method to support scatter/gather
  on both compare and write buffers. This complements the existing `rbd_aio_readv`
  and `rbd_aio_writev` methods.
* The 'AT_NO_ATTR_SYNC' macro is deprecated; please use the standard
  'AT_STATX_DONT_SYNC' macro instead. The 'AT_NO_ATTR_SYNC' macro will be
  removed in the future.
* Trimming of PGLog dups is now controlled by size instead of version.
  This fixes the PGLog inflation issue that occurred when the online
  (in-OSD) trimming got jammed after a PG split operation. In addition, a new
  offline mechanism has been added: `ceph-objectstore-tool` gained a
  `trim-pg-log-dups` op that targets situations where an OSD is unable to boot
  because of those inflated dups. In that case, the "You can be hit by THE DUPS
  BUG" warning will be visible in the OSD logs.
  Relevant tracker: https://tracker.ceph.com/issues/53729
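  A rough offline-repair sketch (the OSD must be stopped first; the data path
  and PG id below are placeholders, and additional trimming options may apply):

    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
        --op trim-pg-log-dups --pgid 2.7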
* RBD: The `rbd device unmap` command gained a `--namespace` option. Support for
  namespaces was added to RBD in Nautilus 14.2.0, and since then it has been
  possible to map and unmap images in namespaces using the `image-spec` syntax,
  but the corresponding option that is available in most other commands was
  missing from `rbd device unmap`.
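  A brief sketch (pool, namespace and image names are placeholders; the
  option-based form is an assumption and may need adjusting to your setup):

    rbd device unmap mypool/myns/myimage
    rbd device unmap --pool mypool --namespace myns --image myimage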
* RGW: Compression is now supported for objects uploaded with Server-Side Encryption.
  When both are enabled, compression is applied before encryption. Earlier releases
  of multisite do not replicate such objects correctly, so all zones must upgrade to
  Reef before enabling the `compress-encrypted` zonegroup feature: see
  https://docs.ceph.com/en/reef/radosgw/multisite/#zone-features and note the
  security considerations.
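  Once every zone runs Reef, the feature could be enabled roughly as follows
  (a sketch assuming the default zonegroup name and the usual period-commit
  workflow):

    radosgw-admin zonegroup modify --rgw-zonegroup=default \
        --enable-feature=compress-encrypted
    radosgw-admin period update --commit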
* RGW: the "pubsub" functionality for storing bucket notifications inside Ceph
  has been removed and, together with it, the "pubsub" zone should no longer be
  used. The REST operations and the radosgw-admin commands for manipulating
  subscriptions, as well as those for fetching and acking notifications, have
  been removed as well.
  In case the endpoint to which notifications are sent may be down or
  disconnected, it is recommended to use persistent notifications to guarantee
  delivery. If the system that consumes the notifications needs to pull them
  (instead of having them pushed to it), an external message bus (e.g. RabbitMQ,
  Kafka) should be used for that purpose.
* RGW: The serialized format of notification and topics has changed, so that
  new/updated topics will be unreadable by old RGWs. We recommend completing
  the RGW upgrades before creating or modifying any notification topics.
* RBD: Trailing newline in passphrase files (`<passphrase-file>` argument in
  `rbd encryption format` command and `--encryption-passphrase-file` option
  in other commands) is no longer stripped.
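  To avoid accidentally embedding a newline in the passphrase, the file can be
  created with `printf` rather than `echo` (a sketch; names are placeholders):

    printf '%s' 'my-secret-passphrase' > passphrase.txt
    rbd encryption format mypool/myimage luks2 passphrase.txt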
* RBD: Support for layered client-side encryption has been added. Cloned images
  can now each be encrypted with their own encryption format and passphrase,
  potentially different from that of the parent image. The efficient
  copy-on-write semantics intrinsic to unformatted (regular) cloned images
  are retained.
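  A rough sketch of formatting a clone independently of its parent (image and
  passphrase-file names are placeholders):

    rbd clone mypool/parent@snap mypool/child
    rbd encryption format mypool/child luks2 child-passphrase.txt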
* CEPHFS: Rename the `mds_max_retries_on_remount_failure` option to
  `client_max_retries_on_remount_failure` and move it from mds.yaml.in to
  mds-client.yaml.in, because this option has only ever been used by the MDS
  client.
* The `perf dump` and `perf schema` commands are deprecated in favor of new
  `counter dump` and `counter schema` commands. These new commands add support
  for labeled perf counters and also emit existing unlabeled perf counters. Some
  unlabeled perf counters became labeled in this release, with more to follow in
  future releases; such converted perf counters are no longer emitted by the
  `perf dump` and `perf schema` commands.
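  For instance (a sketch; the daemon name is a placeholder), the new commands
  can be issued through the admin socket in the same way as the old ones:

    ceph daemon osd.0 counter dump
    ceph daemon osd.0 counter schema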
* The `ceph mgr dump` command now outputs `last_failure_osd_epoch` and
  `active_clients` fields at the top level. Previously, these fields were
  output under the `always_on_modules` field.
* The `ceph mgr dump` command now displays the name of the mgr module that
  registered a RADOS client in the `name` field added to elements of the
  `active_clients` array. Previously, only the address of a module's RADOS
  client was shown in the `active_clients` array.
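  A quick way to inspect the new layout (a sketch that assumes `jq` is
  installed):

    ceph mgr dump | jq '.last_failure_osd_epoch, [.active_clients[].name]'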
* RBD: All rbd-mirror daemon perf counters became labeled and as such are now
  emitted only by the new `counter dump` and `counter schema` commands. As part
  of the conversion, many also got renamed to better disambiguate journal-based
  and snapshot-based mirroring.
* RBD: list-watchers C++ API (`Image::list_watchers`) now clears the passed
  `std::list` before potentially appending to it, aligning with the semantics
  of the corresponding C API (`rbd_watchers_list`).
* The rados python binding is now able to process (opt-in) omap keys as bytes
  objects. This enables interacting with RADOS omap keys that are not decodable
  as UTF-8 strings.
* Telemetry: Users who are opted-in to telemetry can also opt-in to
  participating in a leaderboard in the telemetry public
  dashboards (https://telemetry-public.ceph.com/). Users can now also add a
  description of the cluster to publicly appear in the leaderboard.
  For more details, see:
  https://docs.ceph.com/en/latest/mgr/telemetry/#leaderboard
  See a sample report with `ceph telemetry preview`.
  Opt-in to telemetry with `ceph telemetry on`.
  Opt-in to the leaderboard with
  `ceph config set mgr mgr/telemetry/leaderboard true`.
  Add a leaderboard description with:
  `ceph config set mgr mgr/telemetry/leaderboard_description 'Cluster description'`.
* core: cache-tiering is now deprecated.
* mClock Scheduler: The mClock scheduler (the default scheduler in Quincy) has
  undergone significant usability and design improvements to address the slow
  backfill issue. The most important changes are:
  * The 'balanced' profile is now the default mClock profile, because it
    represents a compromise between prioritizing client IO and recovery IO.
    Users can choose the 'high_client_ops' profile to prioritize client IO or
    the 'high_recovery_ops' profile to prioritize recovery IO.
  * QoS parameters like reservation and limit are now specified in terms of a
    fraction (range: 0.0 to 1.0) of the OSD's IOPS capacity.
  * The cost parameters (osd_mclock_cost_per_io_usec_* and
    osd_mclock_cost_per_byte_usec_*) have been removed. The cost of an operation
    is now determined using the random IOPS and maximum sequential bandwidth
    capability of the OSD's underlying device.
  * Degraded object recovery is given higher priority than misplaced object
    recovery because degraded objects present a data safety issue not present
    with objects that are merely misplaced. Therefore, backfill operations with
    the 'balanced' and 'high_client_ops' mClock profiles may progress more
    slowly than they did with the 'WeightedPriorityQueue' (WPQ) scheduler.
  * The QoS allocations in all the mClock profiles are optimized based on the
    above fixes and enhancements.
  * For more detailed information see:
    https://docs.ceph.com/en/reef/rados/configuration/mclock-config-ref/
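  As a hedged example (profile names are taken from the list above; whether to
  switch at all depends on the workload), a profile other than the default can
  be selected with:

    ceph config set osd osd_mclock_profile high_recovery_ops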
* CEPHFS: After recovering a Ceph File System by following the disaster recovery
  procedure, the recovered files under the `lost+found` directory can now be
  deleted.

>=17.2.1

* The "BlueStore zero block detection" feature (first introduced to Quincy in
  https://github.com/ceph/ceph/pull/43337) has been turned off by default with a
  new global configuration called `bluestore_zero_block_detection`. This feature,
  intended for large-scale synthetic testing, does not interact well with some RBD
  and CephFS features. Any side effects experienced in previous Quincy versions
  would no longer occur, provided that the configuration remains set to false.
  Relevant tracker: https://tracker.ceph.com/issues/55521
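  If the feature is still wanted for synthetic testing, it could be re-enabled
  explicitly (a sketch; the option name comes from the note above):

    ceph config set global bluestore_zero_block_detection true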

* telemetry: Added new Rook metrics to the 'basic' channel to report Rook's
  version, Kubernetes version, node metrics, etc.
  See a sample report with `ceph telemetry preview`.
  Opt-in with `ceph telemetry on`.

  For more details, see:
  https://docs.ceph.com/en/latest/mgr/telemetry/

* OSD: The issue of high CPU utilization during recovery/backfill operations
  has been fixed. For more details, see: https://tracker.ceph.com/issues/56530.

>=15.2.17

* OSD: Octopus modified the SnapMapper key format from
  <LEGACY_MAPPING_PREFIX><snapid>_<shardid>_<hobject_t::to_str()>
  to
  <MAPPING_PREFIX><pool>_<snapid>_<shardid>_<hobject_t::to_str()>
  When this change was introduced, 94ebe0e also introduced a conversion
  with a crucial bug which essentially destroyed legacy keys by mapping them
  to
  <MAPPING_PREFIX><poolid>_<snapid>_
  without the object-unique suffix. The conversion is fixed in this release.
  Relevant tracker: https://tracker.ceph.com/issues/56147

* Cephadm may now be configured to carry out CephFS MDS upgrades without
  reducing ``max_mds`` to 1. Previously, Cephadm would reduce ``max_mds`` to 1 to
  avoid having two active MDS daemons modifying on-disk structures with new
  versions, communicating cross-version-incompatible messages, or hitting other
  potential incompatibilities. This could be disruptive for large-scale CephFS
  deployments because the cluster cannot easily reduce active MDS daemons to 1.
  NOTE: A staggered upgrade of the mons/mgrs may be necessary to take advantage
  of this feature; refer to this link on how to perform it:
  https://docs.ceph.com/en/quincy/cephadm/upgrade/#staggered-upgrade
  Relevant tracker: https://tracker.ceph.com/issues/55715
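  A rough sketch of such a staggered upgrade (the container image tag is a
  placeholder; see the linked documentation for the full set of options):

    ceph orch upgrade start --image quay.io/ceph/ceph:v17.2.1 \
        --daemon-types mgr,mon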

  Relevant tracker: https://tracker.ceph.com/issues/5614

* Introduced a new file system flag `refuse_client_session` that can be set using
  the `fs set` command. This flag blocks any incoming session request from
  clients, which can be useful during some recovery situations where it is
  desirable to bring the MDS up while keeping it free of any client workload.
  Relevant tracker: https://tracker.ceph.com/issues/57090
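  For example (a sketch; the file system name is a placeholder):

    ceph fs set cephfs refuse_client_session true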