======
Quincy
======

Quincy is the 17th stable release of Ceph. It is named after Squidward
Quincy Tentacles from Spongebob Squarepants.

v17.2.0 Quincy
==============

This is the first stable release of Ceph Quincy.

Major Changes from Pacific
--------------------------

General
~~~~~~~

* Filestore has been deprecated in Quincy. BlueStore is Ceph's default object
  store.

* The `ceph-mgr-modules-core` debian package no longer recommends
  `ceph-mgr-rook`. `ceph-mgr-rook` depends on `python3-numpy`, which
  cannot be imported in different Python sub-interpreters multiple times
  when the version of `python3-numpy` is older than 1.19. Because
  `apt-get` installs the `Recommends` packages by default, `ceph-mgr-rook`
  was always installed along with the `ceph-mgr` debian package as an
  indirect dependency. If your workflow depends on this behavior, you
  might want to install `ceph-mgr-rook` separately.

* The ``device_health_metrics`` pool has been renamed ``.mgr``. It is now
  used as a common store for all ``ceph-mgr`` modules. After upgrading to
  Quincy, the ``device_health_metrics`` pool will be renamed to ``.mgr``
  on existing clusters.

* The ``ceph pg dump`` command now prints three additional columns:
  `LAST_SCRUB_DURATION` shows the duration (in seconds) of the last completed
  scrub;
  `SCRUB_SCHEDULING` conveys whether a PG is scheduled to be scrubbed at a
  specified time, whether it is queued for scrubbing, or whether it is being
  scrubbed;
  `OBJECTS_SCRUBBED` shows the number of objects scrubbed in a PG after a
  scrub begins.

* A health warning is now reported if the ``require-osd-release`` flag
  is not set to the appropriate release after a cluster upgrade.

* LevelDB support has been removed. ``WITH_LEVELDB`` is no longer a supported
  build option. Users *should* migrate their monitors and OSDs to RocksDB
  before upgrading to Quincy.

* Cephadm: ``osd_memory_target_autotune`` is enabled by default, which sets
  ``mgr/cephadm/autotune_memory_target_ratio`` to ``0.7`` of total RAM. This
  is unsuitable for hyperconverged infrastructures. For hyperconverged Ceph,
  please refer to the documentation or set
  ``mgr/cephadm/autotune_memory_target_ratio`` to ``0.2``.
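
  For example, on a hyperconverged deployment the ratio can be lowered with a
  single config command (the value shown is the one suggested above; pick a
  value appropriate for your hardware)::

    ceph config set mgr mgr/cephadm/autotune_memory_target_ratio 0.2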

* telemetry: Improved the opt-in flow so that users can keep sharing the same
  data, even when new data collections are available. A new 'perf' channel that
  collects various performance metrics is now available to opt into with:
  `ceph telemetry on`
  `ceph telemetry enable channel perf`
  See a sample report with `ceph telemetry preview`.
  Note that generating a telemetry report with 'perf' channel data might
  take a few moments in big clusters.
  For more details, see:
  https://docs.ceph.com/en/quincy/mgr/telemetry/

* MGR: The progress module disables the pg recovery event by default since the
  event is expensive and has interrupted other services when there are OSDs
  being marked in/out from the cluster. However, the user can still enable
  this event anytime. For more detail, see:

  https://docs.ceph.com/en/quincy/mgr/progress/

* https://tracker.ceph.com/issues/55383 is a known issue: when
  ``mon_cluster_log_to_file`` is set to true, ``mon_cluster_log_to_journald``
  must be set to false, otherwise cluster log messages are no longer written
  to the log file after log rotation.
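
  A minimal sketch of the workaround, using only the two options named above
  (set via the central config database)::

    ceph config set mon mon_cluster_log_to_file true
    ceph config set mon mon_cluster_log_to_journald false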

Cephadm
~~~~~~~

* SNMP Support
* Colocation of Daemons (mgr, mds, rgw)
* osd memory autotuning
* Integration with new NFS mgr module
* Ability to zap osds as they are removed
* cephadm agent for increased performance/scalability

Dashboard
~~~~~~~~~
* Day 1: the new "Cluster Expansion Wizard" will guide users through post-install steps:
  adding new hosts, storage devices or services.
* NFS: the Dashboard now allows users to fully manage all NFS exports from a single place.
* New mgr module (feedback): users can quickly report Ceph tracker issues
  or suggestions directly from the Dashboard or the CLI.
* New "Message of the Day": cluster admins can publish a custom message in a banner.
* Cephadm integration improvements:

  * Host management: maintenance, specs and labelling,
  * Service management: edit and display logs,
  * Daemon management (start, stop, restart, reload),
  * New services supported: ingress (HAProxy) and SNMP-gateway.

* Monitoring and alerting:

  * 43 new alerts have been added (totalling 68) improving observability of events affecting:
    cluster health, monitors, storage devices, PGs and CephFS.
  * Alerts can now be sent externally as SNMP traps via the new SNMP gateway service
    (the MIB is provided).
  * Improved integrated full/nearfull event notifications.
  * Grafana Dashboards now use grafonnet format (though they're still available
    in JSON format).
  * Stack update: images for monitoring containers have been updated.
    Grafana 8.3.5, Prometheus 2.33.4, Alertmanager 0.23.0 and Node Exporter 1.3.1.
    This reduced exposure to several Grafana vulnerabilities (CVE-2021-43798,
    CVE-2021-39226, CVE-2020-29510, CVE-2020-29511).

RADOS
~~~~~

* OSD: Ceph now uses `mclock_scheduler` for BlueStore OSDs as its default
  `osd_op_queue` to provide QoS. The 'mclock_scheduler' is not supported
  for Filestore OSDs. Therefore, the default 'osd_op_queue' is set to `wpq`
  for Filestore OSDs and is enforced even if the user attempts to change it.
  For more details on configuring mclock, see:

  https://docs.ceph.com/en/quincy/rados/configuration/mclock-config-ref/

  An outstanding issue exists during runtime where the mclock config options
  related to reservation, weight and limit cannot be modified after switching
  to the `custom` mclock profile using the `ceph config set ...` command.
  This is tracked by https://tracker.ceph.com/issues/55153. Until the issue
  is fixed, users are advised to avoid using the 'custom' profile or to use
  the workaround mentioned in the tracker.
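
  The built-in profiles can still be selected with ``ceph config set``; a
  brief sketch (``high_client_ops`` and ``osd.0`` are illustrative, assuming
  the profile names from the linked mclock documentation)::

    ceph config set osd osd_mclock_profile high_client_ops   # apply to all OSDs
    ceph config show osd.0 osd_mclock_profile                # verify on one OSD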

* MGR: The pg_autoscaler can now be turned `on` and `off` globally
  with the `noautoscale` flag. By default, it is set to `on`, but this flag
  can come in handy to prevent rebalancing triggered by autoscaling during
  cluster upgrade and maintenance. Pools can now be created with the `--bulk`
  flag, which allows the autoscaler to allocate more PGs to such pools. This
  can be useful to get better out-of-the-box performance for data-heavy pools.

  For more details about autoscaling, see:
  https://docs.ceph.com/en/quincy/rados/operations/placement-groups/
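
  For example, a pool that is expected to hold a large share of the cluster's
  data can be created with the ``--bulk`` flag (the pool name is illustrative)::

    ceph osd pool create mybigpool --bulk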

* OSD: Support for on-wire compression for osd-osd communication, `off` by
  default.

  For more details about compression modes, see:
  https://docs.ceph.com/en/quincy/rados/configuration/msgr2/#compression-modes
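
  A hedged example of turning it on for OSDs (assuming the
  ``ms_osd_compress_mode`` option described in the linked msgr2 documentation)::

    ceph config set osd ms_osd_compress_mode force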

* OSD: Concise reporting of slow operations in the cluster log. The old
  and more verbose logging behavior can be regained by setting
  `osd_aggregated_slow_ops_logging` to false.
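
  For example, to restore the old, more verbose behavior cluster-wide::

    ceph config set osd osd_aggregated_slow_ops_logging false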

* The "kvs" Ceph object class is no longer packaged. This object class
  offers a distributed flat b-tree key-value store that is implemented on
  top of the omap of librados objects. Because there are no existing
  internal users of this object class, it is no longer packaged.

RBD block storage
~~~~~~~~~~~~~~~~~

* rbd-nbd: `rbd device attach` and `rbd device detach` commands have been
  added. Since Linux kernel 5.14, these allow for safe reattachment after
  the `rbd-nbd` daemon is restarted.
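
  A brief, hedged sketch (the pool/image names and the nbd device node are
  illustrative)::

    rbd device attach -t nbd --device /dev/nbd0 rbd/myimage
    rbd device detach -t nbd rbd/myimage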

* rbd-nbd: `notrim` map option added to support thick-provisioned images,
  similar to krbd.

* Large stabilization effort for client-side persistent caching on SSD
  devices, also available in 16.2.8. For details on usage, see:

  https://docs.ceph.com/en/quincy/rbd/rbd-persistent-write-log-cache/

* Several bug fixes in diff calculation when using fast-diff image
  feature + whole object (inexact) mode. In some rare cases these
  long-standing issues could cause an incorrect `rbd export`. Also
  fixed in 15.2.16 and 16.2.8.

* Fix for a potential performance degradation when running Windows VMs
  on krbd. For details, see `rxbounce` map option description:

  https://docs.ceph.com/en/quincy/man/8/rbd/#kernel-rbd-krbd-options

RGW object storage
~~~~~~~~~~~~~~~~~~

* RGW now supports rate limiting by user and/or by bucket. With this
  feature it is possible to limit the number of operations and/or the
  number of bytes per minute that a user or a bucket is allowed to issue.
  The admin can also choose to limit only READ operations and/or only
  WRITE operations. The rate-limiting configuration can be applied to all
  users and all buckets by using a global configuration.
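
  A hedged example of limiting a single user (the uid and limits are
  illustrative; see the Quincy RGW documentation for the full
  ``radosgw-admin ratelimit`` syntax)::

    radosgw-admin ratelimit set --ratelimit-scope=user --uid=testuser \
        --max-read-ops=1024 --max-write-bytes=10485760
    radosgw-admin ratelimit enable --ratelimit-scope=user --uid=testuser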

* `radosgw-admin realm delete` has been renamed to `radosgw-admin realm
  rm`. This is consistent with the help message.

* S3 bucket notification events now contain an `eTag` key instead of
  `etag`, and eventName values no longer carry the `s3:` prefix, fixing
  deviations from the message format observed on AWS.

* It is now possible to specify SSL options and ciphers for the beast
  frontend. The default SSL options setting is
  "no_sslv2:no_sslv3:no_tlsv1:no_tlsv1_1". If you want to return to the old
  behavior, add 'ssl_options=' (empty) to the ``rgw frontends`` configuration.
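
  A sketch of a ``ceph.conf`` fragment (the section name and certificate path
  are illustrative; ``ssl_options`` carries the default shown above)::

    [client.rgw.myhost]
    rgw_frontends = beast ssl_port=443 ssl_certificate=/etc/ceph/rgw.crt ssl_options=no_sslv2:no_sslv3:no_tlsv1:no_tlsv1_1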

* The behavior for Multipart Upload was modified so that only
  CompleteMultipartUpload notification is sent at the end of the multipart
  upload. The POST notification at the beginning of the upload and the PUT
  notifications that were sent on each part are no longer sent.


CephFS distributed file system
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* fs: A file system can be created with a specific ID ("fscid"). This is
  useful in certain recovery scenarios (for example, when a monitor
  database has been lost and rebuilt, and the restored file system is
  expected to have the same ID as before).
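
  A hedged example (pool names and the fscid are illustrative; the exact
  flags, including the assumed ``--force``, are described in the CephFS
  recovery documentation)::

    ceph fs new cephfs cephfs_metadata cephfs_data --fscid 27 --force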

* fs: A file system can be renamed using the `fs rename` command. Any cephx
  credentials authorized for the old file system name will need to be
  reauthorized to the new file system name. Since the operations of the clients
  using these re-authorized IDs may be disrupted, this command requires the
  "--yes-i-really-mean-it" flag. Also, mirroring is expected to be disabled
  on the file system.
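
  For example (the file system names are illustrative)::

    ceph fs rename oldfs newfs --yes-i-really-mean-it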

* MDS upgrades no longer require all standby MDS daemons to be stopped before
  upgrading a file system's sole active MDS.

* CephFS: Failure to replay the journal by a standby-replay daemon now
  causes the rank to be marked "damaged".

Upgrading from Octopus or Pacific
---------------------------------

Quincy does not support LevelDB. Please migrate your OSDs and monitors
to RocksDB before upgrading to Quincy.

Before starting, make sure your cluster is stable and healthy (no down or
recovering OSDs). You can disable the autoscaler for all pools during the
upgrade using the ``noautoscale`` flag. (This is optional, but recommended.)
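
For example, using only the flag named above::

  ceph osd pool set noautoscale     # before the upgrade: pause PG autoscaling
  ceph osd pool unset noautoscale   # after the upgrade: resume autoscaling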

.. note::

   You can monitor the progress of your upgrade at each stage with the
   ``ceph versions`` command, which will tell you what ceph version(s) are
   running for each type of daemon.

Upgrading cephadm clusters
~~~~~~~~~~~~~~~~~~~~~~~~~~

If your cluster is deployed with cephadm (first introduced in Octopus), then
the upgrade process is entirely automated. To initiate the upgrade,

.. prompt:: bash #

   ceph orch upgrade start --ceph-version 17.2.0

The same process is used to upgrade to future minor releases.

Upgrade progress can be monitored with ``ceph -s`` (which provides a simple
progress bar) or more verbosely with

.. prompt:: bash #

   ceph -W cephadm

The upgrade can be paused or resumed with

.. prompt:: bash #

   ceph orch upgrade pause   # to pause
   ceph orch upgrade resume  # to resume

or canceled with

.. prompt:: bash #

   ceph orch upgrade stop

Note that canceling the upgrade simply stops the process; there is no ability to
downgrade back to Octopus or Pacific.


Upgrading non-cephadm clusters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. note::
   If your cluster is running Octopus (15.2.x) or later, you might choose
   to first convert it to use cephadm so that the upgrade to Quincy
   is automated (see above). For more information, see
   :ref:`cephadm-adoption`.

#. Set the ``noout`` flag for the duration of the upgrade. (Optional,
   but recommended.)::

     # ceph osd set noout

#. Upgrade monitors by installing the new packages and restarting the
   monitor daemons. For example, on each monitor host,::

     # systemctl restart ceph-mon.target

   Once all monitors are up, verify that the monitor upgrade is
   complete by looking for the ``quincy`` string in the mon
   map. The command::

     # ceph mon dump | grep min_mon_release

   should report::

     min_mon_release 17 (quincy)

   If it does not, that implies that one or more monitors have not been
   upgraded and restarted, and/or that the quorum does not include all monitors.

#. Upgrade ``ceph-mgr`` daemons by installing the new packages and
   restarting all manager daemons. For example, on each manager host,::

     # systemctl restart ceph-mgr.target

   Verify the ``ceph-mgr`` daemons are running by checking ``ceph
   -s``::

     # ceph -s

     ...
       services:
         mon: 3 daemons, quorum foo,bar,baz
         mgr: foo(active), standbys: bar, baz
     ...

#. Upgrade all OSDs by installing the new packages and restarting the
   ceph-osd daemons on all OSD hosts::

     # systemctl restart ceph-osd.target

#. Upgrade all CephFS MDS daemons. For each CephFS file system,

   #. Disable standby_replay::

        # ceph fs set <fs_name> allow_standby_replay false

   #. Reduce the number of ranks to 1. (Make note of the original
      number of MDS daemons first if you plan to restore it later.)::

        # ceph status
        # ceph fs set <fs_name> max_mds 1

   #. Wait for the cluster to deactivate any non-zero ranks by
      periodically checking the status::

        # ceph status

   #. Take all standby MDS daemons offline on the appropriate hosts with::

        # systemctl stop ceph-mds@<daemon_name>

   #. Confirm that only one MDS is online and is rank 0 for your FS::

        # ceph status

   #. Upgrade the last remaining MDS daemon by installing the new
      packages and restarting the daemon::

        # systemctl restart ceph-mds.target

   #. Restart all standby MDS daemons that were taken offline::

        # systemctl start ceph-mds.target

   #. Restore the original value of ``max_mds`` for the volume::

        # ceph fs set <fs_name> max_mds <original_max_mds>

#. Upgrade all radosgw daemons by upgrading packages and restarting
   daemons on all hosts::

     # systemctl restart ceph-radosgw.target

#. Complete the upgrade by disallowing pre-Quincy OSDs and enabling
   all new Quincy-only functionality::

     # ceph osd require-osd-release quincy

#. If you set ``noout`` at the beginning, be sure to clear it with::

     # ceph osd unset noout

#. Consider transitioning your cluster to use the cephadm deployment
   and orchestration framework to simplify cluster management and
   future upgrades. For more information on converting an existing
   cluster to cephadm, see :ref:`cephadm-adoption`.

Post-upgrade
~~~~~~~~~~~~

#. Verify the cluster is healthy with ``ceph health``. If your cluster is
   running Filestore, a deprecation warning is expected. This warning can
   be temporarily muted using the following command::

     ceph health mute OSD_FILESTORE

#. If you are upgrading from Mimic, or did not already do so when you
   upgraded to Nautilus, we recommend that you enable the new :ref:`v2
   network protocol <msgr2>`. To do so, issue the following command::

     ceph mon enable-msgr2

   This will instruct all monitors that bind to the old default port
   6789 for the legacy v1 protocol to also bind to the new 3300 v2
   protocol port. To see if all monitors have been updated, run::

     ceph mon dump

   and verify that each monitor has both a ``v2:`` and ``v1:`` address
   listed.

#. Consider enabling the :ref:`telemetry module <telemetry>` to send
   anonymized usage statistics and crash information to the Ceph
   upstream developers. To see what would be reported (without actually
   sending any information to anyone), run::

     ceph telemetry preview-all

   If you are comfortable with the data that is reported, you can opt in to
   automatically report the high-level cluster metadata with::

     ceph telemetry on

   The public dashboard that aggregates Ceph telemetry can be found at
   `https://telemetry-public.ceph.com/ <https://telemetry-public.ceph.com/>`_.

   For more information about the telemetry module, see :ref:`the
   documentation <telemetry>`.


Upgrading from pre-Octopus releases (like Nautilus)
---------------------------------------------------

You *must* first upgrade to Octopus (15.2.z) or Pacific (16.2.z) before
upgrading to Quincy.