======
Quincy
======

Quincy is the 17th stable release of Ceph. It is named after Squidward
Quincy Tentacles from Spongebob Squarepants.

v17.2.0 Quincy
==============

This is the first stable release of Ceph Quincy.

Major Changes from Pacific
--------------------------

General
~~~~~~~

* Filestore has been deprecated in Quincy. BlueStore is Ceph's default object
  store.

* The `ceph-mgr-modules-core` debian package no longer recommends
  `ceph-mgr-rook`. `ceph-mgr-rook` depends on `python3-numpy`, which
  cannot be imported in different Python sub-interpreters multiple times
  when the version of `python3-numpy` is older than 1.19. Because
  `apt-get` installs the `Recommends` packages by default, `ceph-mgr-rook`
  was always installed along with the `ceph-mgr` debian package as an
  indirect dependency. If your workflow depends on this behavior, you
  might want to install `ceph-mgr-rook` separately.

* The ``device_health_metrics`` pool has been renamed ``.mgr``. It is now
  used as a common store for all ``ceph-mgr`` modules. After upgrading to
  Quincy, the ``device_health_metrics`` pool will be renamed to ``.mgr``
  on existing clusters.

* The ``ceph pg dump`` command now prints three additional columns:
  `LAST_SCRUB_DURATION` shows the duration (in seconds) of the last completed
  scrub;
  `SCRUB_SCHEDULING` conveys whether a PG is scheduled to be scrubbed at a
  specified time, whether it is queued for scrubbing, or whether it is being
  scrubbed;
  `OBJECTS_SCRUBBED` shows the number of objects scrubbed in a PG after a
  scrub begins.

* A health warning is now reported if the ``require-osd-release`` flag
  is not set to the appropriate release after a cluster upgrade.

* LevelDB support has been removed. ``WITH_LEVELDB`` is no longer a supported
  build option. Users *should* migrate their monitors and OSDs to RocksDB
  before upgrading to Quincy.

* Cephadm: ``osd_memory_target_autotune`` is enabled by default, which sets
  ``mgr/cephadm/autotune_memory_target_ratio`` to ``0.7`` of total RAM. This
  is unsuitable for hyperconverged infrastructures. For hyperconverged Ceph,
  please refer to the documentation or set
  ``mgr/cephadm/autotune_memory_target_ratio`` to ``0.2``.
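
  For example, to apply the lower ratio recommended above for hyperconverged
  deployments cluster-wide (a minimal sketch; the value ``0.2`` comes from the
  recommendation above and may need tuning for your environment)::

    ceph config set mgr mgr/cephadm/autotune_memory_target_ratio 0.2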

* telemetry: Improved the opt-in flow so that users can keep sharing the same
  data, even when new data collections are available. A new 'perf' channel that
  collects various performance metrics is now available to opt into with:
  `ceph telemetry on`
  `ceph telemetry enable channel perf`
  See a sample report with `ceph telemetry preview`.
  Note that generating a telemetry report with 'perf' channel data might
  take a few moments in big clusters.
  For more details, see:
  https://docs.ceph.com/en/quincy/mgr/telemetry/
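
  For example, the opt-in sequence using the commands named above, followed
  by a preview of what would be sent (a minimal sketch)::

    ceph telemetry on
    ceph telemetry enable channel perf
    ceph telemetry preview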

* MGR: The progress module disables the pg recovery event by default, because
  the event is expensive and has interrupted other services when OSDs are
  being marked in/out of the cluster. However, the user can still enable
  this event anytime. For more details, see:

  https://docs.ceph.com/en/quincy/mgr/progress/

* https://tracker.ceph.com/issues/55383 is a known issue:
  ``mon_cluster_log_to_journald`` must be set to false when
  ``mon_cluster_log_to_file`` is set to true, so that cluster log messages
  continue to be logged to file after log rotation.
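
  As a workaround, the two options named above can be set explicitly (a
  minimal sketch)::

    ceph config set mon mon_cluster_log_to_file true
    ceph config set mon mon_cluster_log_to_journald false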

Cephadm
~~~~~~~

* SNMP Support
* Colocation of Daemons (mgr, mds, rgw)
* osd memory autotuning
* Integration with new NFS mgr module
* Ability to zap osds as they are removed
* cephadm agent for increased performance/scalability

Dashboard
~~~~~~~~~

* Day 1: the new "Cluster Expansion Wizard" will guide users through post-install steps:
  adding new hosts, storage devices or services.
* NFS: the Dashboard now allows users to fully manage all NFS exports from a single place.
* New mgr module (feedback): users can quickly report Ceph tracker issues
  or suggestions directly from the Dashboard or the CLI.
* New "Message of the Day": cluster admins can publish a custom message in a banner.
* Cephadm integration improvements:

  * Host management: maintenance, specs and labelling,
  * Service management: edit and display logs,
  * Daemon management (start, stop, restart, reload),
  * New services supported: ingress (HAProxy) and SNMP-gateway.

* Monitoring and alerting:

  * 43 new alerts have been added (totalling 68) improving observability of events affecting:
    cluster health, monitors, storage devices, PGs and CephFS.
  * Alerts can now be sent externally as SNMP traps via the new SNMP gateway service
    (the MIB is provided).
  * Improved integrated full/nearfull event notifications.
  * Grafana Dashboards now use grafonnet format (though they're still available
    in JSON format).
  * Stack update: images for monitoring containers have been updated.
    Grafana 8.3.5, Prometheus 2.33.4, Alertmanager 0.23.0 and Node Exporter 1.3.1.
    This reduced exposure to several Grafana vulnerabilities (CVE-2021-43798,
    CVE-2021-39226, CVE-2020-29510, CVE-2020-29511).

RADOS
~~~~~

* OSD: Ceph now uses `mclock_scheduler` for BlueStore OSDs as its default
  `osd_op_queue` to provide QoS. The 'mclock_scheduler' is not supported
  for Filestore OSDs. Therefore, the default 'osd_op_queue' is set to `wpq`
  for Filestore OSDs and is enforced even if the user attempts to change it.
  For more details on configuring mclock, see:

  https://docs.ceph.com/en/quincy/rados/configuration/mclock-config-ref/

  An outstanding issue exists during runtime where the mclock config options
  related to reservation, weight and limit cannot be modified after switching
  to the `custom` mclock profile using the `ceph config set ...` command.
  This is tracked by https://tracker.ceph.com/issues/55153. Until the issue
  is fixed, users are advised to avoid using the 'custom' profile or to use
  the workaround mentioned in the tracker.
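
  For example, to check which scheduler a given OSD is running and to select
  one of the built-in mclock profiles (a hedged sketch; ``osd.0`` and the
  ``high_client_ops`` profile are illustrative choices, see the mclock
  reference above for the full list of profiles)::

    ceph config show osd.0 osd_op_queue
    ceph config set osd osd_mclock_profile high_client_ops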

* MGR: The pg_autoscaler can now be turned `on` and `off` globally
  with the `noautoscale` flag. By default, it is set to `on`, but this flag
  can come in handy to prevent rebalancing triggered by autoscaling during
  cluster upgrade and maintenance. Pools can now be created with the `--bulk`
  flag, which allows the autoscaler to allocate more PGs to such pools. This
  can be useful to get better out-of-the-box performance for data-heavy pools.

  For more details about autoscaling, see:
  https://docs.ceph.com/en/quincy/rados/operations/placement-groups/
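
  For example, to pause autoscaling globally for the duration of a maintenance
  window and to create a pool that is expected to hold the bulk of the data
  (a minimal sketch; ``mybigpool`` is a placeholder name)::

    ceph osd pool set noautoscale
    ceph osd pool create mybigpool --bulk
    ceph osd pool unset noautoscale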

* OSD: Added support for on-wire compression for osd-osd communication,
  which is `off` by default.

  For more details about compression modes, see:
  https://docs.ceph.com/en/quincy/rados/configuration/msgr2/#compression-modes
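
  As an illustration only (the option name below is an assumption based on the
  msgr2 compression settings described at the link above; verify it against
  that page before use)::

    ceph config set osd ms_osd_compress_mode force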

* OSD: Concise reporting of slow operations in the cluster log. The old
  and more verbose logging behavior can be regained by setting
  `osd_aggregated_slow_ops_logging` to false.
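
  For example, to restore the old, more verbose behavior (a minimal sketch
  using the option named above)::

    ceph config set osd osd_aggregated_slow_ops_logging false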

* The "kvs" Ceph object class is not packaged anymore. The "kvs" Ceph
  object class offers a distributed flat b-tree key-value store that
  is implemented on top of the omap of librados objects. Because there
  are no existing internal users of this object class, it is not
  packaged anymore.

RBD block storage
~~~~~~~~~~~~~~~~~

* rbd-nbd: `rbd device attach` and `rbd device detach` commands have been
  added. These allow safe reattachment after the `rbd-nbd` daemon is
  restarted, starting with Linux kernel 5.14.

* rbd-nbd: `notrim` map option added to support thick-provisioned images,
  similar to krbd.
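
  For example, mapping a thick-provisioned image via rbd-nbd with trim
  disabled (a hedged sketch; ``mypool/myimage`` is a placeholder image
  spec)::

    rbd device map -t nbd -o notrim mypool/myimage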

* Large stabilization effort for client-side persistent caching on SSD
  devices, also available in 16.2.8. For details on usage, see:

  https://docs.ceph.com/en/quincy/rbd/rbd-persistent-write-log-cache/

* Several bug fixes in diff calculation when using the fast-diff image
  feature + whole object (inexact) mode. In some rare cases these
  long-standing issues could cause an incorrect `rbd export`. Also
  fixed in 15.2.16 and 16.2.8.

* Fix for a potential performance degradation when running Windows VMs
  on krbd. For details, see the `rxbounce` map option description:

  https://docs.ceph.com/en/quincy/man/8/rbd/#kernel-rbd-krbd-options
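
  For example, mapping an image that backs a Windows VM with the krbd
  `rxbounce` option (a hedged sketch; ``mypool/win-disk`` is a placeholder
  image spec)::

    rbd device map -o rxbounce mypool/win-disk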

RGW object storage
~~~~~~~~~~~~~~~~~~

* RGW now supports rate limiting by user and/or by bucket. With this feature
  it is possible to cap the number of operations and/or the number of bytes
  per minute delivered for a user or a bucket, and to limit READ operations
  and/or WRITE operations independently. The rate-limit configuration can
  also be applied to all users and all buckets by using a global
  configuration.
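
  For example, limiting a single user's read and write operations per minute
  and then enabling the limit (a hedged sketch; ``jdoe`` and the values are
  placeholders, see ``radosgw-admin ratelimit --help`` for the exact
  options)::

    radosgw-admin ratelimit set --ratelimit-scope=user --uid=jdoe \
        --max-read-ops=1024 --max-write-ops=256
    radosgw-admin ratelimit enable --ratelimit-scope=user --uid=jdoe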

* `radosgw-admin realm delete` has been renamed to `radosgw-admin realm
  rm`. This is consistent with the help message.

* S3 bucket notification events now contain an `eTag` key instead of
  `etag`, and eventName values no longer carry the `s3:` prefix, fixing
  deviations from the message format that is observed on AWS.

* It is now possible to specify SSL options and ciphers for the beast
  frontend. The default ssl options setting is
  "no_sslv2:no_sslv3:no_tlsv1:no_tlsv1_1". If you want to return to the old
  behavior, add 'ssl_options=' (empty) to the ``rgw frontends`` configuration.
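
  For example, an ``rgw frontends`` line that clears ``ssl_options`` to
  restore the old behavior (a hedged sketch; the certificate path and the
  ``client.rgw`` target are placeholders for your deployment)::

    ceph config set client.rgw rgw_frontends \
        "beast ssl_port=443 ssl_certificate=/etc/ceph/rgw.pem ssl_options="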

* The behavior for Multipart Upload was modified so that only the
  CompleteMultipartUpload notification is sent at the end of the multipart
  upload. The POST notification at the beginning of the upload and the PUT
  notifications that were sent on each part are no longer sent.


CephFS distributed file system
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* fs: A file system can be created with a specific ID ("fscid"). This is
  useful in certain recovery scenarios (for example, when a monitor
  database has been lost and rebuilt, and the restored file system is
  expected to have the same ID as before).
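
  A rough sketch of what this might look like (the ``--fscid`` and ``--force``
  flags and the pool names here are assumptions for illustration; check
  ``ceph fs new --help`` on your cluster for the exact syntax)::

    ceph fs new myfs cephfs_metadata cephfs_data --fscid 27 --force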

* fs: A file system can be renamed using the `fs rename` command. Any cephx
  credentials authorized for the old file system name will need to be
  reauthorized to the new file system name. Since the operations of the clients
  using these re-authorized IDs may be disrupted, this command requires the
  "--yes-i-really-mean-it" flag. Also, mirroring is expected to be disabled
  on the file system.
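
  For example, using the command and flag named above (``myfs`` and ``newfs``
  are placeholder file system names)::

    ceph fs rename myfs newfs --yes-i-really-mean-it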

* MDS upgrades no longer require all standby MDS daemons to be stopped before
  upgrading a file system's sole active MDS.

* CephFS: Failure to replay the journal by a standby-replay daemon now
  causes the rank to be marked "damaged".

Upgrading from Octopus or Pacific
----------------------------------

Quincy does not support LevelDB. Please migrate your OSDs and monitors
to RocksDB before upgrading to Quincy.

Before starting, make sure your cluster is stable and healthy (no down or
recovering OSDs). (This is optional, but recommended.) You can disable
the autoscaler for all pools during the upgrade using the noautoscale flag.

.. note::

   You can monitor the progress of your upgrade at each stage with the
   ``ceph versions`` command, which will tell you what ceph version(s) are
   running for each type of daemon.

Upgrading cephadm clusters
~~~~~~~~~~~~~~~~~~~~~~~~~~

If your cluster is deployed with cephadm (first introduced in Octopus), then
the upgrade process is entirely automated. To initiate the upgrade,

.. prompt:: bash #

   ceph orch upgrade start --ceph-version 17.2.0

The same process is used to upgrade to future minor releases.

Upgrade progress can be monitored with ``ceph -s`` (which provides a simple
progress bar) or more verbosely with

.. prompt:: bash #

   ceph -W cephadm

The upgrade can be paused or resumed with

.. prompt:: bash #

   ceph orch upgrade pause   # to pause
   ceph orch upgrade resume  # to resume

or canceled with

.. prompt:: bash #

   ceph orch upgrade stop

Note that canceling the upgrade simply stops the process; there is no ability to
downgrade back to Octopus or Pacific.


Upgrading non-cephadm clusters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. note::

   If your cluster is running Octopus (15.2.x) or later, you might choose
   to first convert it to use cephadm so that the upgrade to Quincy
   is automated (see above). For more information, see
   :ref:`cephadm-adoption`.

#. Set the ``noout`` flag for the duration of the upgrade. (Optional,
   but recommended.)::

     # ceph osd set noout

#. Upgrade monitors by installing the new packages and restarting the
   monitor daemons. For example, on each monitor host::

     # systemctl restart ceph-mon.target

   Once all monitors are up, verify that the monitor upgrade is
   complete by looking for the ``quincy`` string in the mon
   map. The command::

     # ceph mon dump | grep min_mon_release

   should report::

     min_mon_release 17 (quincy)

   If it does not, that implies that one or more monitors have not been
   upgraded and restarted and/or that the quorum does not include all monitors.

#. Upgrade ``ceph-mgr`` daemons by installing the new packages and
   restarting all manager daemons. For example, on each manager host::

     # systemctl restart ceph-mgr.target

   Verify the ``ceph-mgr`` daemons are running by checking ``ceph
   -s``::

     # ceph -s

     ...
       services:
        mon: 3 daemons, quorum foo,bar,baz
        mgr: foo(active), standbys: bar, baz
     ...

#. Upgrade all OSDs by installing the new packages and restarting the
   ceph-osd daemons on all OSD hosts::

     # systemctl restart ceph-osd.target

#. Upgrade all CephFS MDS daemons. For each CephFS file system,

   #. Disable standby_replay::

        # ceph fs set <fs_name> allow_standby_replay false

   #. Reduce the number of ranks to 1. (Make note of the original
      number of MDS daemons first if you plan to restore it later.)::

        # ceph status
        # ceph fs set <fs_name> max_mds 1

   #. Wait for the cluster to deactivate any non-zero ranks by
      periodically checking the status::

        # ceph status

   #. Take all standby MDS daemons offline on the appropriate hosts with::

        # systemctl stop ceph-mds@<daemon_name>

   #. Confirm that only one MDS is online and is rank 0 for your FS::

        # ceph status

   #. Upgrade the last remaining MDS daemon by installing the new
      packages and restarting the daemon::

        # systemctl restart ceph-mds.target

   #. Restart all standby MDS daemons that were taken offline::

        # systemctl start ceph-mds.target

   #. Restore the original value of ``max_mds`` for the volume::

        # ceph fs set <fs_name> max_mds <original_max_mds>

#. Upgrade all radosgw daemons by upgrading packages and restarting
   daemons on all hosts::

     # systemctl restart ceph-radosgw.target

#. Complete the upgrade by disallowing pre-Quincy OSDs and enabling
   all new Quincy-only functionality::

     # ceph osd require-osd-release quincy

#. If you set ``noout`` at the beginning, be sure to clear it with::

     # ceph osd unset noout

#. Consider transitioning your cluster to use the cephadm deployment
   and orchestration framework to simplify cluster management and
   future upgrades. For more information on converting an existing
   cluster to cephadm, see :ref:`cephadm-adoption`.

Post-upgrade
~~~~~~~~~~~~

#. Verify the cluster is healthy with ``ceph health``. If your cluster is
   running Filestore, a deprecation warning is expected. This warning can
   be temporarily muted using the following command::

     ceph health mute OSD_FILESTORE

#. If you are upgrading from Mimic, or did not already do so when you
   upgraded to Nautilus, we recommend enabling the new :ref:`v2
   network protocol <msgr2>`. To do so, issue the following command::

     ceph mon enable-msgr2

   This will instruct all monitors that bind to the old default port
   6789 for the legacy v1 protocol to also bind to the new 3300 v2
   protocol port. To see if all monitors have been updated, run::

     ceph mon dump

   and verify that each monitor has both a ``v2:`` and ``v1:`` address
   listed.

#. Consider enabling the :ref:`telemetry module <telemetry>` to send
   anonymized usage statistics and crash information to the Ceph
   upstream developers. To see what would be reported (without actually
   sending any information to anyone), run::

     ceph telemetry preview-all

   If you are comfortable with the data that is reported, you can opt in to
   automatically report the high-level cluster metadata with::

     ceph telemetry on

   The public dashboard that aggregates Ceph telemetry can be found at
   `https://telemetry-public.ceph.com/ <https://telemetry-public.ceph.com/>`_.

   For more information about the telemetry module, see :ref:`the
   documentation <telemetry>`.


Upgrading from pre-Octopus releases (like Nautilus)
----------------------------------------------------

You *must* first upgrade to Octopus (15.2.z) or Pacific (16.2.z) before
upgrading to Quincy.