bump version to 12.2.10-pve1

diff --git a/ceph/PendingReleaseNotes b/ceph/PendingReleaseNotes
index 7f57cc52a149b92b7bbb0eacd78362e14cd845f7..00ee957e046bd19a496684cd74f6a24e6e743fe5 100644
--- a/ceph/PendingReleaseNotes
+++ b/ceph/PendingReleaseNotes
@@ -1,184 +1,124 @@
->= 12.0.0
+>= 12.1.2
+---------
+* When running 'df' on a CephFS filesystem comprising exactly one data pool,
+  the result now reflects the file storage space used and available in that
+  data pool (fuse client only).
+* Added new commands "pg force-recovery" and "pg force-backfill". Use them
+  to boost recovery or backfill priority of specified pgs, so they're
+  recovered/backfilled before any other. Note that these commands don't
+  interrupt ongoing recovery/backfill, but merely queue specified pgs
+  before others so they're recovered/backfilled as soon as possible.
+  New commands "pg cancel-force-recovery" and "pg cancel-force-backfill"
+  restore default recovery/backfill priority of previously forced pgs.
+
+
+12.2.1
  ------
-* The "journaler allow split entries" config setting has been removed.
-* The 'apply' mode of cephfs-journal-tool has been removed
  
-12.0.0
+* Clusters will need to upgrade to 12.2.1 before upgrading to any
+  Mimic 13.y.z version (either a development release or an eventual
+  stable Mimic release).
+
+- *CephFS*:
+
+  * Limiting MDS cache via a memory limit is now supported using the new
+    mds_cache_memory_limit config option (1GB by default).  A cache reservation
+    can also be specified using mds_cache_reservation as a percentage of the
+    limit (5% by default). Limits by inode count are still supported using
+    mds_cache_size. Setting mds_cache_size to 0 (the default) disables the
+    inode limit.
+
+* The maximum number of PGs per OSD before the monitor issues a
+  warning has been reduced from 300 to 200 PGs.  200 is still twice
+  the generally recommended target of 100 PGs per OSD.  This limit can
+  be adjusted via the ``mon_max_pg_per_osd`` option on the
+  monitors.  The older ``mon_pg_warn_max_per_osd`` option has been removed.
+
+* Creating pools or adjusting pg_num will now fail if the change would
+  make the number of PGs per OSD exceed the configured
+  ``mon_max_pg_per_osd`` limit.  The option can be adjusted if it
+  is really necessary to create a pool with more PGs.
+
+12.2.3
  ------
  
- * When assigning a network to the public network and not to
-   the cluster network the network specification of the public
-   network will be used for the cluster network as well.
-   In older versions this would lead to cluster services
-   being bound to 0.0.0.0:<port>, thus making the
-   cluster service even more publicly available than the
-   public services. When only specifying a cluster network it
-   will still result in the public services binding to 0.0.0.0.
-
-*  Some variants of the omap_get_keys and omap_get_vals librados
-   functions have been deprecated in favor of omap_get_vals2 and
-   omap_get_keys2.  The new methods include an output argument
-   indicating whether there are additional keys left to fetch.
-   Previously this had to be inferred from the requested key count vs
-   the number of keys returned, but this breaks with new OSD-side
-   limits on the number of keys or bytes that can be returned by a
-   single omap request.  These limits were introduced by kraken but
-   are effectively disabled by default (by setting a very large limit
-   of 1 GB) because users of the newly deprecated interface cannot
-   tell whether they should fetch more keys or not.  In the case of
-   the standalone calls in the C++ interface
-   (IoCtx::get_omap_{keys,vals}), librados has been updated to loop on
-   the client side to provide a correct result via multiple calls to
-   the OSD.  In the case of the methods used for building
-   multi-operation transactions, however, client-side looping is not
-   practical, and the methods have been deprecated.  Note that use of
-   either the IoCtx methods on older librados versions or the
-   deprecated methods on any version of librados will lead to
-   incomplete results if/when the new OSD limits are enabled.
-
-* In previous versions, if a client sent an op to the wrong OSD, the OSD
-  would reply with ENXIO.  The rationale here is that the client or OSD is
-  clearly buggy and we want to surface the error as clearly as possible.
-  We now only send the ENXIO reply if the osd_enxio_on_misdirected_op option
-  is enabled (it's off by default).  This means that a VM using librbd that
-  previously would have gotten an EIO and gone read-only will now see a
-  blocked/hung IO instead.
-
-*  When configuring ceph-fuse mounts in /etc/fstab, a new syntax is
-   available that uses "ceph.<arg>=<val>" in the options column, instead
-   of putting configuration in the device column.  The old style syntax
-   still works.  See the documentation page "Mount CephFS in your
-   file systems table" for details.
-
-12.0.1
+- *RBD*:
+
+  * The RBD C API's rbd_discard method now enforces a maximum length of
+    2GB to match the C++ API's Image::discard method. This restriction
+    prevents overflow of the result code.
+
+- *CephFS*:
+
+  * The CephFS client now catches failures to clear dentries during startup
+    and refuses to start as consistency and untrimmable cache issues may
+    develop. The new option client_die_on_failed_dentry_invalidate (default:
+    true) may be turned off to allow the client to proceed (dangerous!).
+
+12.2.5
  ------
  
-* The original librados rados_objects_list_open (C) and objects_begin
-  (C++) object listing API, deprecated in Hammer, has finally been
-  removed.  Users of this interface must update their software to use
-  either the rados_nobjects_list_open (C) and nobjects_begin (C++) API or
-  the new rados_object_list_begin (C) and object_list_begin (C++) API
-  before updating the client-side librados library to Luminous.
+- *CephFS*:
  
-  Object enumeration (via any API) with the latest librados version
-  and pre-Hammer OSDs is no longer supported.  Note that no in-tree
-  Ceph services rely on object enumeration via the deprecated APIs, so
-  only external librados users might be affected.
+  * Upgrading an MDS cluster to 12.2.3+ will result in all active MDS
+    exiting due to feature incompatibilities once an upgraded MDS comes online
+    (even as standby). Operators may ignore the error messages and continue
+    upgrading/restarting or follow this upgrade sequence:
  
-  The newest (and recommended) rados_object_list_begin (C) and
-  object_list_begin (C++) API is only usable on clusters with the
-  SORTBITWISE flag enabled (Jewel and later).  (Note that this flag is
-  required to be set before upgrading beyond Jewel.)
+    Reduce the number of ranks to 1 (`ceph fs set <fs_name> max_mds 1`),
+    deactivate all other ranks (`ceph mds deactivate <fs_name>:<n>`), shut down
+    standbys leaving the one active MDS, upgrade the single active MDS, then
+    upgrade/start standbys. Finally, restore the previous max_mds.
  
-* The rados copy-get-classic operation has been removed since it has not been
-  used by the OSD since before hammer.  It is unlikely any librados user is
-  using this operation explicitly since there is also the more modern copy-get.
+    See also: https://tracker.ceph.com/issues/23172
  
-* The RGW api for getting object torrent has changed its params from 'get_torrent'
-  to 'torrent' so that it can be compatible with Amazon S3. Now the request for 
-  object torrent is like 'GET /ObjectName?torrent'.
+* *rados list-inconsistent-obj format changes:*
  
-* The configuration option "osd pool erasure code stripe width" has
-  been replaced by "osd pool erasure code stripe unit", and given the
-  ability to be overridden by the erasure code profile setting
-  "stripe_unit". For more details see "Erasure Code Profiles" in the
-  documentation.
+  * Various error strings have been improved.  For example, the "oi" or "oi_attr"
+    in errors which stands for object info is now "info" (e.g. oi_attr_missing is
+    now info_missing).
  
-* rbd and cephfs can use erasure coding with bluestore. This may be
-  enabled by setting 'allow_ec_overwrites' to 'true' for a pool. Since
-  this relies on bluestore's checksumming to do deep scrubbing,
-  enabling this on a pool stored on filestore is not allowed.
+  * The object's "selected_object_info" is now in json format instead of string.
  
-* The 'rados df' JSON output now prints numeric values as numbers instead of
-  strings.
+  * The attribute errors (attr_value_mismatch, attr_name_mismatch) only apply to user
+    attributes.  Only user attributes are output and have the internal leading underscore
+    stripped.
  
-* There was a bug introduced in Jewel (#19119) that broke the mapping behavior
-  when an "out" OSD that still existed in the CRUSH map was removed with 'osd rm'.
-  This could result in 'misdirected op' and other errors.  The bug is now fixed,
-  but the fix itself introduces the same risk because the behavior may vary between
-  clients and OSDs.  To avoid problems, please ensure that all OSDs are removed
-  from the CRUSH map before deleting them.  That is, be sure to do::
+  * If there are hash information errors (hinfo_missing, hinfo_corrupted,
+    hinfo_inconsistency) then "hashinfo" is added with the json format of the
+    information.  If the information is corrupt then "hashinfo" is a string
+    containing the value.
  
-     ceph osd crush rm osd.123
+  * If there are snapset errors (snapset_missing, snapset_corrupted,
+    snapset_inconsistency) then "snapset" is added with the json format of the
+    information.  If the information is corrupt then "snapset" is a string containing
+    the value.
  
-  before::
+  * If there are object information errors (info_missing, info_corrupted,
+    obj_size_info_mismatch, object_info_inconsistency) then "object_info" is added
+    with the json format of the information instead of a string.  If the information
+    is corrupt then "object_info" is a string containing the value.
  
-     ceph osd rm osd.123
+* *rados list-inconsistent-snapset format changes:*
  
-12.0.2
-------
+  * Various error strings have been improved.  For example, the "ss_attr" in
+    errors which stands for snapset info is now "snapset" (e.g. ss_attr_missing is
+    now snapset_missing).  The error snapset_mismatch has been renamed to snapset_error
+    to better reflect what it means.
+
+  * The head snapset information is output in json format as "snapset".  This means that
+    even when there are no head errors, the head object will be output when any shard
+    has an error.  This head object is there to show the snapset that was used in
+    determining errors.
  
-* The original librados rados_objects_list_open (C) and objects_begin
-  (C++) object listing API, deprecated in Hammer, has finally been
-  removed.  Users of this interface must update their software to use
-  either the rados_nobjects_list_open (C) and nobjects_begin (C++) API or
-  the new rados_object_list_begin (C) and object_list_begin (C++) API
-  before updating the client-side librados library to Luminous.
-
-  Object enumeration (via any API) with the latest librados version
-  and pre-Hammer OSDs is no longer supported.  Note that no in-tree
-  Ceph services rely on object enumeration via the deprecated APIs, so
-  only external librados users might be affected.
-
-  The newest (and recommended) rados_object_list_begin (C) and
-  object_list_begin (C++) API is only usable on clusters with the
-  SORTBITWISE flag enabled (Jewel and later).  (Note that this flag is
-  required to be set before upgrading beyond Jewel.)
-* CephFS clients without the 'p' flag in their authentication capability
-  string will no longer be able to set quotas or any layout fields.  This
-  flag previously only restricted modification of the pool and namespace
-  fields in layouts.
-* CephFS directory fragmentation (large directory support) is enabled
-  by default on new filesystems.  To enable it on existing filesystems
-  use "ceph fs set <fs_name> allow_dirfrags".
-* CephFS will generate a health warning if you have fewer standby daemons
-  than it thinks you wanted.  By default this will be 1 if you ever had
-  a standby, and 0 if you did not.  You can customize this using
-  ``ceph fs set <fs> standby_count_wanted <number>``.  Setting it
-  to zero will effectively disable the health check.
-* The "ceph mds tell ..." command has been removed.  It is superceded
-  by "ceph tell mds.<id> ..."
-
-12.1.0
-------
  
-* The ``mon_osd_max_op_age`` option has been renamed to
-  ``mon_osd_warn_op_age`` (default: 32 seconds), to indicate we
-  generate a warning at this age.  There is also a new
-  ``mon_osd_err_op_age_ratio`` that is a expressed as a multitple of
-  ``mon_osd_warn_op_age`` (default: 128, for roughly 60 minutes) to
-  control when an error is generated.
-
-* The default maximum size for a single RADOS object has been reduced from
-  100GB to 128MB.  The 100GB limit was completely impractical in practice
-  while the 128MB limit is a bit high but not unreasonable.  If you have an
-  application written directly to librados that is using objects larger than
-  128MB you may need to adjust ``osd_max_object_size``.
-
-* The semantics of the 'rados ls' and librados object listing
-  operations have always been a bit confusing in that "whiteout"
-  objects (which logically don't exist and will return ENOENT if you
-  try to access them) are included in the results.  Previously
-  whiteouts only occurred in cache tier pools.  In luminous, logically
-  deleted but snapshotted objects now result in a whiteout object, and
-  as a result they will appear in 'rados ls' results, even though
-  trying to read such an object will result in ENOENT.  The 'rados
-  listsnaps' operation can be used in such a case to enumerate which
-  snapshots are present.
-
-  This may seem a bit strange, but is less strange than having a
-  deleted-but-snapshotted object not appear at all and be completely
-  hidden from librados's ability to enumerate objects.  Future
-  versions of Ceph will likely include an alternative object
-  enumeration interface that makes it more natural and efficient to
-  enumerate all objects along with their snapshot and clone metadata.
-
-* The deprecated 'crush_ruleset' property has finally been removed; please use
-  'crush_rule' instead for the 'osd pool get ...' and 'osd pool set ..' commands.
-
-* The 'osd pool default crush replicated ruleset' option has been
-  removed and replaced by the 'osd pool default crush rule' option.
-  By default it is -1, which means the mon will pick the first type
-  replicated rule in the CRUSH map for replicated pools.  Erasure
-  coded pools have rules that are automatically created for them if they are
-  not specified at pool creation time.
+12.2.9
+------
+* 12.2.9 contains the pg hard limit patches (https://tracker.ceph.com/issues/23979).
+  A partial upgrade during recovery/backfill can cause the OSDs on the previous version
+  to fail with assert(trim_to <= info.last_complete). The workaround for users is to
+  upgrade and restart all OSDs to a version with the pg hard limit, or only upgrade
+  when all PGs are active+clean. This patch will be reverted in 12.2.10, until
+  a clean upgrade path is added to the pg log hard limit patches.
+
+  See also: http://tracker.ceph.com/issues/36686