
Upgrading the MDS Cluster
=========================

Currently the MDS cluster does not have built-in versioning or file system
flags to support seamless upgrades of the MDSs without potentially causing
assertions or other faults due to incompatible messages or other functional
differences. For this reason, during any cluster upgrade you must first reduce
the number of active MDS daemons for a file system to one, so that two active
MDS daemons running different versions never communicate with each other.
You must also take standby daemons offline, because any new CompatSet flags
propagate via the MDSMap to all MDS daemons and cause older daemons to shut
themselves down.

The proper sequence for upgrading the MDS cluster is:

1. Disable and stop standby-replay daemons.

::

    ceph fs set <fs_name> allow_standby_replay false

In Pacific, the standby-replay daemons are stopped for you after running this
command. Older versions of Ceph require you to stop these daemons manually.

::

    ceph fs dump # find standby-replay daemons
    ceph mds fail mds.<X>


2. Reduce the number of ranks to 1:

::

    ceph fs set <fs_name> max_mds 1

3. Wait for the cluster to stop all non-zero ranks, so that only rank 0 is active and the rest are standbys.

::

    ceph status # wait for MDS to finish stopping

4. Take all standbys offline, e.g. using systemctl:

::

    systemctl stop ceph-mds.target

5. Confirm only one MDS is online and is rank 0 for your FS:

::

    ceph status

6. Upgrade the single active MDS, e.g. using systemctl:

::

    # use package manager to update cluster
    systemctl restart ceph-mds.target

7. Upgrade/start the standby daemons.

::

    # use package manager to update cluster
    systemctl restart ceph-mds.target

8. Restore the previous max_mds for your cluster:

::

    ceph fs set <fs_name> max_mds <old_max_mds>

9. Restore setting for ``allow_standby_replay`` (if applicable):

::

    ceph fs set <fs_name> allow_standby_replay true
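
Step 8 needs the ``max_mds`` value that was in effect before step 2. The
following is a minimal sketch of one way to record and restore it, assuming
that the plain-text output of ``ceph fs get <fs_name>`` contains a ``max_mds``
line with the value as its second field. The helper names ``parse_max_mds``,
``save_max_mds`` and ``restore_max_mds`` are hypothetical and not part of Ceph.

::

    # Sketch only: these helper functions are hypothetical, not Ceph tools.

    # Read the output of `ceph fs get <fs_name>` on stdin and print the
    # value of its max_mds field. Assumes a line whose first
    # whitespace-separated field is "max_mds" and whose second is the value.
    parse_max_mds() {
        awk '$1 == "max_mds" { print $2 }'
    }

    # Save the current max_mds of file system $1 into file $2.
    save_max_mds() {
        ceph fs get "$1" | parse_max_mds > "$2"
    }

    # Restore max_mds of file system $1 from the value stored in file $2.
    restore_max_mds() {
        ceph fs set "$1" max_mds "$(cat "$2")"
    }

For example, ``save_max_mds <fs_name> /tmp/old_max_mds`` could be run before
step 2 and ``restore_max_mds <fs_name> /tmp/old_max_mds`` at step 8. The file
path here is only an illustration.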


Upgrading pre-Firefly file systems past Jewel
=============================================

.. tip::

    This advice only applies to users with file systems
    created using versions of Ceph older than *Firefly* (0.80).
    Users creating new file systems may disregard this advice.

Pre-Firefly versions of Ceph used a now-deprecated format
for storing CephFS directory objects, called TMAPs.  Support
for reading these in RADOS will be removed after the Jewel
release of Ceph, so users upgrading CephFS must ensure
that any old directory objects have been converted.

After installing Jewel on all your MDS and OSD servers, and restarting
the services, run the following command:

::
    
    cephfs-data-scan tmap_upgrade <metadata pool name>

This only needs to be run once, and it is not necessary to
stop any other services while it runs.  The command may take some
time to execute, as it iterates over all objects in your metadata
pool.  It is safe to continue using your file system as normal while
it executes.  If the command aborts for any reason, it is safe
to simply run it again.
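
Because the command can safely be run again after an abort, it can be wrapped
in a simple retry loop. The following is a minimal sketch, not a Ceph tool:
the ``retry`` function is a hypothetical helper, and the attempt count of 3 in
the example is an arbitrary choice.

::

    # Sketch only: `retry` is a hypothetical helper, not part of Ceph.
    # Run a command up to $1 times, stopping at the first success.
    # Returns 0 on success, or 1 if every attempt failed.
    retry() {
        attempts=$1
        shift
        n=1
        while [ "$n" -le "$attempts" ]; do
            if "$@"; then
                return 0
            fi
            echo "attempt $n of $attempts failed" >&2
            n=$((n + 1))
        done
        return 1
    }

    # Example invocation (substitute your metadata pool name):
    # retry 3 cephfs-data-scan tmap_upgrade <metadata pool name>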

If you are upgrading a pre-Firefly CephFS file system to a newer Ceph version
than Jewel, you must first upgrade to Jewel and run the ``tmap_upgrade``
command before completing your upgrade to the latest version.