12.2.12
-------
* In 12.2.9 and earlier releases, keyring caps were not checked for validity,
  so the caps string could be anything. As of 12.2.10, caps strings are
  validated, and providing a keyring with an invalid caps string to, e.g.,
  "ceph auth add" will result in an error.

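A sketch of the affected workflow (the client name, pool, and keyring path are illustrative, not from this document):

```shell
# Create a keyring whose caps strings must now parse as valid
# capability grammar (client.foo and pool "mypool" are placeholders).
ceph-authtool --create-keyring client.foo.keyring \
  --gen-key -n client.foo \
  --cap mon 'allow r' --cap osd 'allow rw pool=mypool'

# As of 12.2.10, the caps are validated: a malformed string
# (e.g. --cap mon 'bogus') makes "ceph auth add" return an error
# instead of silently storing it.
ceph auth add client.foo -i client.foo.keyring
```

These commands require a running Ceph cluster and appropriate admin credentials.
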
12.2.11
-------
* `cephfs-journal-tool` now makes the rank argument (--rank) mandatory. The rank
  is of the format `filesystem:rank`, where `filesystem` is the CephFS file system
  and `rank` is the MDS rank on which the operation is to be executed. To
  operate on all ranks, use `all` or `*` as the rank specifier. Note that
  operations that dump journal information to a file will now dump to per-rank
  suffixed dump files. Importing journal information from dump files is
  disallowed if the operation targets all ranks.

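For example (the file system name "cephfs" and the rank numbers below are placeholders):

```shell
# Export the journal of a single rank:
cephfs-journal-tool --rank=cephfs:0 journal export backup.bin
# Operate on all ranks; dumps are written to per-rank suffixed files:
cephfs-journal-tool --rank=cephfs:all journal export backup.bin
# Import is only permitted for a single rank, never for "all":
cephfs-journal-tool --rank=cephfs:0 journal import backup.bin
```
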
* MDS cache trimming is now throttled. Dropping the MDS cache
  via the `ceph tell mds.<foo> cache drop` command or large reductions in the
  cache size will no longer cause service unavailability.

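A minimal sketch of the command mentioned above ("a" is an example MDS id):

```shell
# Ask MDS "a" to drop its cache; trimming is now throttled, so this
# no longer makes the MDS unavailable while the cache is released.
ceph tell mds.a cache drop
```
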
* The CephFS MDS behavior when recalling caps has been significantly improved:
  the MDS no longer attempts to recall too many caps at once, which previously
  led to instability. An MDS with a large cache (64GB+) should be more stable.

* The MDS now provides a config option `mds_max_caps_per_client` (default: 1M)
  to limit the number of caps a client session may hold. Long-running client
  sessions with a large number of caps have been a source of instability in the
  MDS when all of these caps need to be processed during certain session
  events. It is recommended not to increase this value unnecessarily.

* The MDS config option mds_recall_state_timeout has been removed. Late client
  recall warnings are now generated based on the number of caps the MDS has
  recalled which have not been released. The new config options
  mds_recall_warning_threshold (default: 32K) and mds_recall_warning_decay_rate
  (default: 60s) set the threshold for this warning.

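Both of the cap-related limits above can be tuned in ceph.conf; the values below are only illustrative (they restate the documented defaults), not recommendations:

```ini
[mds]
# Maximum caps a single client session may hold (default: 1M).
mds_max_caps_per_client = 1048576
# Warn when this many recalled caps remain unreleased (default: 32K),
# decayed over mds_recall_warning_decay_rate seconds (default: 60).
mds_recall_warning_threshold = 32768
mds_recall_warning_decay_rate = 60
```
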
>= 12.1.2
---------
* When running 'df' on a CephFS file system comprising exactly one data pool,
  the result now reflects the file storage space used and available in that
  data pool (FUSE client only).
* Added new commands "pg force-recovery" and "pg force-backfill". Use them
  to boost the recovery or backfill priority of specified PGs, so they are
  recovered/backfilled before any others. Note that these commands do not
  interrupt ongoing recovery/backfill, but merely queue the specified PGs
  before others so they are recovered/backfilled as soon as possible.
  The new commands "pg cancel-force-recovery" and "pg cancel-force-backfill"
  restore the default recovery/backfill priority of previously forced PGs.

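For example (the PG ids below are placeholders; `ceph pg ls` lists real ones):

```shell
# Move these PGs to the head of the recovery queue:
ceph pg force-recovery 2.1f 2.3a
# Restore their default recovery priority:
ceph pg cancel-force-recovery 2.1f 2.3a
```
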
12.2.1
------

* Clusters will need to upgrade to 12.2.1 before upgrading to any
  Mimic 13.y.z version (either a development release or an eventual
  stable Mimic release).

- *CephFS*:

  * Limiting the MDS cache via a memory limit is now supported using the new
    mds_cache_memory_limit config option (1GB by default). A cache reservation
    can also be specified using mds_cache_reservation as a percentage of the
    limit (5% by default). Limits by inode count are still supported using
    mds_cache_size. Setting mds_cache_size to 0 (the default) disables the
    inode limit.

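A ceph.conf sketch of the options above (the 4 GiB value is an example, not a recommendation):

```ini
[mds]
# Allow the MDS cache to use up to 4 GiB of memory (default: 1 GB),
# keeping the default 5% reservation free for new items.
mds_cache_memory_limit = 4294967296
mds_cache_reservation = .05
# The inode-count limit remains available; 0 (the default) disables it.
mds_cache_size = 0
```
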
* The maximum number of PGs per OSD before the monitor issues a
  warning has been reduced from 300 to 200 PGs. 200 is still twice
  the generally recommended target of 100 PGs per OSD. This limit can
  be adjusted via the ``mon_max_pg_per_osd`` option on the
  monitors. The older ``mon_pg_warn_max_per_osd`` option has been removed.

* Creating pools or adjusting pg_num will now fail if the change would
  make the number of PGs per OSD exceed the configured
  ``mon_max_pg_per_osd`` limit. The option can be adjusted if it
  is really necessary to create a pool with more PGs.

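A hedged sketch of adjusting the limit at runtime (the value 300 is illustrative; the option can also be set under ``[mon]`` in ceph.conf):

```shell
# Raise the PGs-per-OSD ceiling on all monitors, only if a larger
# pool is genuinely needed:
ceph tell mon.* injectargs '--mon_max_pg_per_osd=300'
```
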
12.2.3
------

- *RBD*:

  * The RBD C API's rbd_discard method now enforces a maximum length of
    2GB to match the C++ API's Image::discard method. This restriction
    prevents overflow of the result code.

- *CephFS*:

  * The CephFS client now catches failures to clear dentries during startup
    and refuses to start, as consistency and untrimmable cache issues may
    develop. The new option client_die_on_failed_dentry_invalidate (default:
    true) may be turned off to allow the client to proceed (dangerous!).

12.2.5
------

- *CephFS*:

  * Upgrading an MDS cluster to 12.2.3+ will result in all active MDS
    exiting due to feature incompatibilities once an upgraded MDS comes online
    (even as standby). Operators may ignore the error messages and continue
    upgrading/restarting, or follow this upgrade sequence:

    Reduce the number of ranks to 1 (`ceph fs set <fs_name> max_mds 1`),
    deactivate all other ranks (`ceph mds deactivate <fs_name>:<n>`), shut down
    standbys leaving the one active MDS, upgrade the single active MDS, then
    upgrade/start standbys. Finally, restore the previous max_mds.

    See also: https://tracker.ceph.com/issues/23172

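The sequence above can be sketched as follows (the file system name "cephfs" and the rank/max_mds values are examples):

```shell
ceph fs set cephfs max_mds 1        # reduce to a single rank
ceph mds deactivate cephfs:1        # repeat for every rank other than 0
# ...shut down standbys, upgrade the lone active MDS,
#    then upgrade and start the standbys...
ceph fs set cephfs max_mds 2        # finally, restore the previous max_mds
```
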
* *rados list-inconsistent-obj format changes:*

  * Various error strings have been improved. For example, the "oi" or "oi_attr"
    in errors, which stands for object info, is now "info" (e.g. oi_attr_missing
    is now info_missing).

  * The object's "selected_object_info" is now in json format instead of a string.

  * The attribute errors (attr_value_mismatch, attr_name_mismatch) apply only to
    user attributes. Only user attributes are output, and they have the internal
    leading underscore stripped.

  * If there are hash information errors (hinfo_missing, hinfo_corrupted,
    hinfo_inconsistency), then "hashinfo" is added with the json format of the
    information. If the information is corrupt, then "hashinfo" is a string
    containing the value.

  * If there are snapset errors (snapset_missing, snapset_corrupted,
    snapset_inconsistency), then "snapset" is added with the json format of the
    information. If the information is corrupt, then "snapset" is a string
    containing the value.

  * If there are object information errors (info_missing, info_corrupted,
    obj_size_info_mismatch, object_info_inconsistency), then "object_info" is
    added with the json format of the information instead of a string. If the
    information is corrupt, then "object_info" is a string containing the value.

* *rados list-inconsistent-snapset format changes:*

  * Various error strings have been improved. For example, the "ss_attr" in
    errors, which stands for snapset info, is now "snapset" (e.g. ss_attr_missing
    is now snapset_missing). The error snapset_mismatch has been renamed to
    snapset_error to better reflect what it means.

  * The head snapset information is output in json format as "snapset". This
    means that even when there are no head errors, the head object will be
    output when any shard has an error. This head object is there to show the
    snapset that was used in determining errors.

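The revised output can be inspected as follows (the PG id is a placeholder; `rados list-inconsistent-pg <pool>` lists candidates):

```shell
rados list-inconsistent-obj 2.1f --format=json-pretty
rados list-inconsistent-snapset 2.1f --format=json-pretty
```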

12.2.9
------
* 12.2.9 contains the pg hard limit patches (https://tracker.ceph.com/issues/23979).
  A partial upgrade during recovery/backfill can cause OSDs on the previous
  version to fail with assert(trim_to <= info.last_complete). The workaround for
  users is to upgrade and restart all OSDs to a version with the pg hard limit,
  or to upgrade only when all PGs are active+clean. This patch will be reverted
  in 12.2.10, until a clean upgrade path is added to the pg log hard limit
  patches.

  See also: http://tracker.ceph.com/issues/36686

12.2.11
-------

* The default memory utilization for the mons has been increased
  somewhat. RocksDB now uses 512 MB of RAM by default, which should
  be sufficient for small- to medium-sized clusters; large clusters
  should tune this up. Also, ``mon_osd_cache_size`` has been
  increased from 10 OSDMaps to 500, which will translate to an
  additional 500 MB to 1 GB of RAM for large clusters, and much less
  for small clusters.

* New CephFS file system attributes session_timeout and session_autoclose
  are configurable via `ceph fs set`. The MDS config options
  mds_session_timeout, mds_session_autoclose, and mds_max_file_size are now
  obsolete.

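For example (the file system name and the values, in seconds, are illustrative):

```shell
ceph fs set cephfs session_timeout 60
ceph fs set cephfs session_autoclose 300
```
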
* This release fixes the pg log hard limit bug (https://tracker.ceph.com/issues/23979).
  A flag called pglog_hardlimit has been introduced, which is off by default.
  This flag enables the feature that limits the length of the pg log. Users
  should run `ceph osd set pglog_hardlimit` after completely upgrading to
  12.2.11. Once all the OSDs have this flag set, the length of the pg log will
  be capped by a hard limit. We do not recommend unsetting this flag beyond
  this point.
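
The steps above can be sketched as (verification command is an assumption about where the flag is reported):

```shell
# Only after every OSD in the cluster runs 12.2.11+:
ceph osd set pglog_hardlimit
# Check the cluster flags for pglog_hardlimit:
ceph osd dump | grep pglog_hardlimit
```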