->= 12.0.0
-------
-* The "journaler allow split entries" config setting has been removed.
-* The 'apply' mode of cephfs-journal-tool has been removed
-* Added new configuration "public bind addr" to support dynamic environments
- like Kubernetes. When set the Ceph MON daemon could bind locally to an IP
- address and advertise a different IP address "public addr" on the network.
-* RGW: bucket index resharding now uses the reshard namespace in upgrade scenarios as well
- this is a changed behaviour from RC1 where a new pool for reshard was created
-
-12.0.0
+14.2.4
------
- * When assigning a network to the public network and not to
- the cluster network the network specification of the public
- network will be used for the cluster network as well.
- In older versions this would lead to cluster services
- being bound to 0.0.0.0:<port>, thus making the
- cluster service even more publicly available than the
- public services. When only specifying a cluster network it
- will still result in the public services binding to 0.0.0.0.
-
-* Some variants of the omap_get_keys and omap_get_vals librados
- functions have been deprecated in favor of omap_get_vals2 and
- omap_get_keys2. The new methods include an output argument
- indicating whether there are additional keys left to fetch.
- Previously this had to be inferred from the requested key count vs
- the number of keys returned, but this breaks with new OSD-side
- limits on the number of keys or bytes that can be returned by a
- single omap request. These limits were introduced by kraken but
- are effectively disabled by default (by setting a very large limit
- of 1 GB) because users of the newly deprecated interface cannot
- tell whether they should fetch more keys or not. In the case of
- the standalone calls in the C++ interface
- (IoCtx::get_omap_{keys,vals}), librados has been updated to loop on
- the client side to provide a correct result via multiple calls to
- the OSD. In the case of the methods used for building
- multi-operation transactions, however, client-side looping is not
- practical, and the methods have been deprecated. Note that use of
- either the IoCtx methods on older librados versions or the
- deprecated methods on any version of librados will lead to
- incomplete results if/when the new OSD limits are enabled.
-
-* In previous versions, if a client sent an op to the wrong OSD, the OSD
- would reply with ENXIO. The rationale here is that the client or OSD is
- clearly buggy and we want to surface the error as clearly as possible.
- We now only send the ENXIO reply if the osd_enxio_on_misdirected_op option
- is enabled (it's off by default). This means that a VM using librbd that
- previously would have gotten an EIO and gone read-only will now see a
- blocked/hung IO instead.
-
-* When configuring ceph-fuse mounts in /etc/fstab, a new syntax is
- available that uses "ceph.<arg>=<val>" in the options column, instead
- of putting configuration in the device column. The old style syntax
- still works. See the documentation page "Mount CephFS in your
- file systems table" for details.
-
-12.0.1
+* In the Zabbix Mgr Module there was a typo in the key being sent
+  to Zabbix for PGs in backfill_wait state. The key that was sent
+  was 'wait_backfill', but the correct name is 'backfill_wait'.
+  Update your Zabbix template accordingly so that it accepts the
+  new key being sent to Zabbix.
+
+14.2.3
+------
+
+* Nautilus-based librbd clients can now open images on Jewel clusters.
+
+* The RGW "num_rados_handles" option has been removed.
+  If you were using a value of "num_rados_handles" greater than 1,
+  multiply your current "objecter_inflight_ops" and
+  "objecter_inflight_op_bytes" parameters by the old
+  "num_rados_handles" to get the same throttle behavior.
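+
+  As a worked example (the values here are only illustrative and assume the
+  defaults of 1024 inflight ops and 100 MiB of inflight bytes): if the old
+  "num_rados_handles" was 4, you would configure::
+
+    # e.g. in the [client.rgw.<name>] section of ceph.conf
+    objecter_inflight_ops = 4096              # 4 * 1024
+    objecter_inflight_op_bytes = 419430400    # 4 * 104857600 (100 MiB)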
+
+* The ``bluestore_no_per_pool_stats_tolerance`` config option has been
+ replaced with ``bluestore_fsck_error_on_no_per_pool_stats``
+ (default: false). The overall default behavior has not changed:
+ fsck will warn but not fail on legacy stores, and repair will
+ convert to per-pool stats.
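+
+  For example, to make fsck treat missing per-pool stats as an error instead
+  of a warning (a minimal sketch using the new option)::
+
+    ceph config set osd bluestore_fsck_error_on_no_per_pool_stats true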
+
+14.2.2
------
-* The original librados rados_objects_list_open (C) and objects_begin
- (C++) object listing API, deprecated in Hammer, has finally been
- removed. Users of this interface must update their software to use
- either the rados_nobjects_list_open (C) and nobjects_begin (C++) API or
- the new rados_object_list_begin (C) and object_list_begin (C++) API
- before updating the client-side librados library to Luminous.
-
- Object enumeration (via any API) with the latest librados version
- and pre-Hammer OSDs is no longer supported. Note that no in-tree
- Ceph services rely on object enumeration via the deprecated APIs, so
- only external librados users might be affected.
-
- The newest (and recommended) rados_object_list_begin (C) and
- object_list_begin (C++) API is only usable on clusters with the
- SORTBITWISE flag enabled (Jewel and later). (Note that this flag is
- required to be set before upgrading beyond Jewel.)
-
-* The rados copy-get-classic operation has been removed since it has not been
- used by the OSD since before hammer. It is unlikely any librados user is
- using this operation explicitly since there is also the more modern copy-get.
-
-* The RGW api for getting object torrent has changed its params from 'get_torrent'
- to 'torrent' so that it can be compatible with Amazon S3. Now the request for
- object torrent is like 'GET /ObjectName?torrent'.
-
-* The configuration option "osd pool erasure code stripe width" has
- been replaced by "osd pool erasure code stripe unit", and given the
- ability to be overridden by the erasure code profile setting
- "stripe_unit". For more details see "Erasure Code Profiles" in the
- documentation.
-
-* rbd and cephfs can use erasure coding with bluestore. This may be
- enabled by setting 'allow_ec_overwrites' to 'true' for a pool. Since
- this relies on bluestore's checksumming to do deep scrubbing,
- enabling this on a pool stored on filestore is not allowed.
-
-* The 'rados df' JSON output now prints numeric values as numbers instead of
- strings.
-
-* There was a bug introduced in Jewel (#19119) that broke the mapping behavior
- when an "out" OSD that still existed in the CRUSH map was removed with 'osd rm'.
- This could result in 'misdirected op' and other errors. The bug is now fixed,
- but the fix itself introduces the same risk because the behavior may vary between
- clients and OSDs. To avoid problems, please ensure that all OSDs are removed
- from the CRUSH map before deleting them. That is, be sure to do::
-
- ceph osd crush rm osd.123
-
- before::
-
- ceph osd rm osd.123
-
-12.0.2
-------
-
-* The original librados rados_objects_list_open (C) and objects_begin
- (C++) object listing API, deprecated in Hammer, has finally been
- removed. Users of this interface must update their software to use
- either the rados_nobjects_list_open (C) and nobjects_begin (C++) API or
- the new rados_object_list_begin (C) and object_list_begin (C++) API
- before updating the client-side librados library to Luminous.
-
- Object enumeration (via any API) with the latest librados version
- and pre-Hammer OSDs is no longer supported. Note that no in-tree
- Ceph services rely on object enumeration via the deprecated APIs, so
- only external librados users might be affected.
-
- The newest (and recommended) rados_object_list_begin (C) and
- object_list_begin (C++) API is only usable on clusters with the
- SORTBITWISE flag enabled (Jewel and later). (Note that this flag is
- required to be set before upgrading beyond Jewel.)
-* CephFS clients without the 'p' flag in their authentication capability
- string will no longer be able to set quotas or any layout fields. This
- flag previously only restricted modification of the pool and namespace
- fields in layouts.
-* CephFS directory fragmentation (large directory support) is enabled
- by default on new filesystems. To enable it on existing filesystems
- use "ceph fs set <fs_name> allow_dirfrags".
-* CephFS will generate a health warning if you have fewer standby daemons
- than it thinks you wanted. By default this will be 1 if you ever had
- a standby, and 0 if you did not. You can customize this using
- ``ceph fs set <fs> standby_count_wanted <number>``. Setting it
- to zero will effectively disable the health check.
-* The "ceph mds tell ..." command has been removed. It is superceded
- by "ceph tell mds.<id> ..."
-
-12.1.0
-------
-
-* The ``mon_osd_max_op_age`` option has been renamed to
- ``mon_osd_warn_op_age`` (default: 32 seconds), to indicate we
- generate a warning at this age. There is also a new
- ``mon_osd_err_op_age_ratio`` that is a expressed as a multitple of
- ``mon_osd_warn_op_age`` (default: 128, for roughly 60 minutes) to
- control when an error is generated.
-
-* The default maximum size for a single RADOS object has been reduced from
- 100GB to 128MB. The 100GB limit was completely impractical in practice
- while the 128MB limit is a bit high but not unreasonable. If you have an
- application written directly to librados that is using objects larger than
- 128MB you may need to adjust ``osd_max_object_size``.
-
-* The semantics of the 'rados ls' and librados object listing
- operations have always been a bit confusing in that "whiteout"
- objects (which logically don't exist and will return ENOENT if you
- try to access them) are included in the results. Previously
- whiteouts only occurred in cache tier pools. In luminous, logically
- deleted but snapshotted objects now result in a whiteout object, and
- as a result they will appear in 'rados ls' results, even though
- trying to read such an object will result in ENOENT. The 'rados
- listsnaps' operation can be used in such a case to enumerate which
- snapshots are present.
-
- This may seem a bit strange, but is less strange than having a
- deleted-but-snapshotted object not appear at all and be completely
- hidden from librados's ability to enumerate objects. Future
- versions of Ceph will likely include an alternative object
- enumeration interface that makes it more natural and efficient to
- enumerate all objects along with their snapshot and clone metadata.
-
-* The deprecated 'crush_ruleset' property has finally been removed; please use
- 'crush_rule' instead for the 'osd pool get ...' and 'osd pool set ..' commands.
-
-* The 'osd pool default crush replicated ruleset' option has been
- removed and replaced by the 'osd pool default crush rule' option.
- By default it is -1, which means the mon will pick the first type
- replicated rule in the CRUSH map for replicated pools. Erasure
- coded pools have rules that are automatically created for them if they are
- not specified at pool creation time.
-
-* The `status` ceph-mgr module is enabled by default, and initially provides two
- commands: `ceph tell mgr osd status` and `ceph tell mgr fs status`. These
- are high level colorized views to complement the existing CLI.
-
-12.1.1
-------
-
-* choose_args encoding has been changed to make it architecture-independent.
- If you deployed Luminous dev releases or 12.1.0 rc release and made use of
- the CRUSH choose_args feature, you need to remove all choose_args mappings
- from your CRUSH map before starting the upgrade.
-
-* The 'ceph health' structured output (JSON or XML) no longer contains
- a 'timechecks' section describing the time sync status. This
- information is now available via the 'ceph time-sync-status'
- command.
-
-* Certain extra fields in the 'ceph health' structured output that
- used to appear if the mons were low on disk space (which duplicated
- the information in the normal health warning messages) are now gone.
+* The no{up,down,in,out} related commands have been revamped.
+  There are now two ways to set the no{up,down,in,out} flags:
+  the old 'ceph osd [un]set <flag>' command, which sets cluster-wide flags,
+  and the new 'ceph osd [un]set-group <flags> <who>' command,
+  which sets flags in batch at the granularity of any CRUSH node
+  or device class (see the example below).
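+
+  A minimal sketch of the new syntax (the flags, OSD ids, and host name here
+  are only illustrative)::
+
+    ceph osd set-group noout,noup osd.0 osd.1
+    ceph osd unset-group noout,noup host-foo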
-* The "ceph -w" output no longer contains audit log entries by default.
- Add a "--watch-channel=audit" or "--watch-channel=*" to see them.
+* RGW: radosgw-admin introduces two subcommands for managing
+  expire-stale objects that might be left behind after a bucket
+  reshard in earlier versions of RGW. One subcommand lists such
+  objects and the other deletes them. Read the troubleshooting section
+  of the dynamic resharding docs for details.
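+
+  A hedged sketch of the intended usage (the exact subcommand names and the
+  bucket argument below are assumptions; check the dynamic resharding docs
+  for the definitive syntax)::
+
+    radosgw-admin objects expire-stale list --bucket <bucket>
+    radosgw-admin objects expire-stale rm --bucket <bucket>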
-12.1.2
+14.2.5
------
-* New "ceph -w" behavior - the "ceph -w" output no longer contains I/O rates,
- available space, pg info, etc. because these are no longer logged to the
- central log (which is what "ceph -w" shows). The same information can be
- obtained by running "ceph pg stat"; alternatively, I/O rates per pool can
- be determined using "ceph osd pool stats". Although these commands do not
- self-update like "ceph -w" did, they do have the ability to return formatted
- output by providing a "--format=<format>" option.
-
-* Pools are now expected to be associated with the application using them.
- Upon completing the upgrade to Luminous, the cluster will attempt to associate
- existing pools to known applications (i.e. CephFS, RBD, and RGW). In-use pools
- that are not associated to an application will generate a health warning. Any
- unassociated pools can be manually associated using the new
- "ceph osd pool application enable" command. For more details see
- "Associate Pool to Application" in the documentation.
-
-* ceph-mgr now has a Zabbix plugin. Using zabbix_sender it sends trapper
- events to a Zabbix server containing high-level information of the Ceph
- cluster. This makes it easy to monitor a Ceph cluster's status and send
- out notifications in case of a malfunction.
-
-* The 'mon_warn_osd_usage_min_max_delta' config option has been
- removed and the associated health warning has been disabled because
- it does not address clusters undergoing recovery or CRUSH rules that do
- not target all devices in the cluster.
-
-* Specifying user authorization capabilities for RBD clients has been
- simplified. The general syntax for using RBD capability profiles is
- "mon 'profile rbd' osd 'profile rbd[-read-only][ pool={pool-name}[, ...]]'".
- For more details see "User Management" in the documentation.
-
-* ``ceph config-key put`` has been deprecated in favor of ``ceph config-key set``.
\ No newline at end of file
+* The telemetry module now has a 'device' channel, enabled by default, that
+ will report anonymized hard disk and SSD health metrics to telemetry.ceph.com
+ in order to build and improve device failure prediction algorithms. Because
+  the content of telemetry reports has changed, you will need to re-opt-in
+ with::
+
+ ceph telemetry on
+
+ You can view exactly what information will be reported first with::
+
+ ceph telemetry show
+ ceph telemetry show device # specifically show the device channel
+
+ If you are not comfortable sharing device metrics, you can disable that
+  channel first before re-opting-in::
+
+    ceph config set mgr mgr/telemetry/channel_device false
+ ceph telemetry on
+
+* The telemetry module now reports more information about CephFS file systems,
+ including:
+
+ - how many MDS daemons (in total and per file system)
+ - which features are (or have been) enabled
+ - how many data pools
+ - approximate file system age (year + month of creation)
+ - how many files, bytes, and snapshots
+ - how much metadata is being cached
+
+ We have also added:
+
+ - which Ceph release the monitors are running
+ - whether msgr v1 or v2 addresses are used for the monitors
+ - whether IPv4 or IPv6 addresses are used for the monitors
+ - whether RADOS cache tiering is enabled (and which mode)
+ - whether pools are replicated or erasure coded, and
+ which erasure code profile plugin and parameters are in use
+ - how many hosts are in the cluster, and how many hosts have each type of daemon
+ - whether a separate OSD cluster network is being used
+ - how many RBD pools and images are in the cluster, and how many pools have RBD mirroring enabled
+ - how many RGW daemons, zones, and zonegroups are present; which RGW frontends are in use
+ - aggregate stats about the CRUSH map, like which algorithms are used, how big buckets are, how many rules are defined, and what tunables are in use
+
+ If you had telemetry enabled, you will need to re-opt-in with::
+
+ ceph telemetry on
+
+ You can view exactly what information will be reported first with::
+
+ ceph telemetry show # see everything
+ ceph telemetry show basic # basic cluster info (including all of the new info)
+
+* A health warning is now generated if the average OSD heartbeat ping
+  time exceeds a configurable threshold for any of the computed
+  intervals. The OSD computes 1 minute, 5 minute and 15 minute
+  intervals with average, minimum and maximum values. The new configuration
+  option ``mon_warn_on_slow_ping_ratio`` specifies a percentage of
+  ``osd_heartbeat_grace`` to determine the threshold; a value of zero
+  disables the warning. The new configuration option
+  ``mon_warn_on_slow_ping_time``, specified in milliseconds, overrides the
+  computed value and causes a warning when OSD heartbeat pings take longer
+  than the specified amount.
+  The new admin command ``ceph daemon mgr.# dump_osd_network [threshold]``
+  lists all connections whose average ping time, over any of the 3 intervals,
+  exceeds the specified threshold or the value determined by the config options.
+  The new admin command ``ceph daemon osd.# dump_osd_network [threshold]`` does
+  the same, but only includes heartbeats initiated by the specified OSD.
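+
+  A minimal sketch of using these options and commands (the 5% ratio, the
+  1000 ms override, and the daemon names below are only illustrative)::
+
+    ceph config set global mon_warn_on_slow_ping_ratio 0.05
+    ceph config set global mon_warn_on_slow_ping_time 1000
+    ceph daemon mgr.x dump_osd_network 1000
+    ceph daemon osd.0 dump_osd_network 1000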
+
+* The new OSD daemon command ``dump_recovery_reservations`` reveals the
+  recovery locks held (in_progress) and those waiting in priority queues
+  (see the example below).
+
+* The new OSD daemon command ``dump_scrub_reservations`` reveals the
+  scrub reservations held for local (primary) and remote (replica) PGs.
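+
+  Both are invoked through the daemon admin socket, for example (the OSD id
+  below is only illustrative)::
+
+    ceph daemon osd.0 dump_recovery_reservations
+    ceph daemon osd.0 dump_scrub_reservations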