Upgrading the MDS Cluster
=========================

Currently the MDS cluster does not have built-in versioning or file system
flags to support seamless upgrades of the MDSs without potentially causing
assertions or other faults due to incompatible messages or other functional
differences. For this reason, during any cluster upgrade it is necessary to
first reduce the number of active MDS daemons for a file system to one, so
that active MDS daemons running different versions never communicate with
each other. It is also necessary to take standbys offline, because any new
CompatSet flags will propagate via the MDSMap to all MDS daemons and cause
older MDS daemons to shut themselves down ("suicide").

The proper sequence for upgrading the MDS cluster is:

1. Disable and stop standby-replay daemons.

::

    ceph fs set <fs_name> allow_standby_replay false

In Pacific, the standby-replay daemons are stopped for you after running this
command. Older versions of Ceph require you to stop these daemons manually.

::

    ceph fs dump # find standby-replay daemons
    ceph mds fail mds.<X>


2. Reduce the number of ranks to 1. Record the current ``max_mds`` value first (``ceph fs get <fs_name>`` displays it), because step 8 restores it:

::

    ceph fs set <fs_name> max_mds 1

3. Wait for the cluster to stop all non-zero ranks, so that only rank 0 is active and the remaining daemons are standbys.

::

    ceph status # wait for MDS to finish stopping

4. Take all standbys offline, e.g. using systemctl:

::

    systemctl stop ceph-mds.target

5. Confirm only one MDS is online and is rank 0 for your FS:

::

    ceph status

6. Upgrade the single active MDS, e.g. using systemctl:

::

    # use your package manager to upgrade the Ceph packages on this host
    systemctl restart ceph-mds.target

7. Upgrade/start the standby daemons.

::

    # use your package manager to upgrade the Ceph packages on each standby host
    systemctl restart ceph-mds.target

8. Restore the previous max_mds for your cluster:

::

    ceph fs set <fs_name> max_mds <old_max_mds>

9. Restore the setting for ``allow_standby_replay`` (if applicable):

::

    ceph fs set <fs_name> allow_standby_replay true


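The cluster-wide ``ceph`` commands from steps 1 to 3 and steps 8 to 9 can be
collected into a small shell helper. The following is a minimal sketch, not an
official Ceph tool: the function names, ``FS_NAME`` and ``DRY_RUN`` are
assumptions introduced here for illustration, and only the ``ceph`` invocations
themselves come from the steps above. It does not fail standby-replay daemons
manually on pre-Pacific releases, and it leaves out the per-host ``systemctl``
steps.

```shell
#!/usr/bin/env bash
# Sketch only: FS_NAME, DRY_RUN and the function names are illustrative
# and are not part of Ceph.

FS_NAME="${FS_NAME:-cephfs}"   # hypothetical file system name
DRY_RUN="${DRY_RUN:-1}"        # 1 = only print the commands, 0 = run them

run() {
    # Print the command; execute it only when DRY_RUN is 0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

prepare_mds_upgrade() {
    # Steps 1-3: disable standby-replay, reduce to one rank, show status.
    run ceph fs set "$FS_NAME" allow_standby_replay false &&
    run ceph fs set "$FS_NAME" max_mds 1 &&
    run ceph status
}

restore_mds_settings() {
    # Steps 8-9. $1 is the max_mds value recorded before the upgrade.
    run ceph fs set "$FS_NAME" max_mds "$1" &&
    run ceph fs set "$FS_NAME" allow_standby_replay true
}
```

With the default ``DRY_RUN=1``, calling ``prepare_mds_upgrade`` or
``restore_mds_settings 2`` only prints the commands, which allows the sequence
to be reviewed before anything is changed on the cluster.
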
Upgrading pre-Firefly file systems past Jewel
=============================================

.. tip::

    This advice only applies to users with file systems
    created using versions of Ceph older than *Firefly* (0.80).
    Users creating new file systems may disregard this advice.

Pre-Firefly versions of Ceph used a now-deprecated format
for storing CephFS directory objects, called TMAPs. Support
for reading these in RADOS will be removed after the Jewel
release of Ceph, so CephFS users who are upgrading must
ensure that all old directory objects have been converted.

After installing Jewel on all your MDS and OSD servers, and restarting
the services, run the following command:

::

    cephfs-data-scan tmap_upgrade <metadata pool name>

This only needs to be run once, and it is not necessary to
stop any other services while it runs. The command may take some
time to execute, as it iterates over all objects in your metadata
pool. It is safe to continue using your file system as normal while
it executes. If the command aborts for any reason, it is safe
to simply run it again.

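Because the command is safe to run again after an abort, it can be wrapped in
a simple retry loop. The following is a minimal sketch under stated
assumptions: the ``retry`` function and the ``RETRY_DELAY`` variable are
hypothetical helpers defined here, not part of Ceph, and the attempt count is
arbitrary.

```shell
# retry MAX_ATTEMPTS COMMAND [ARGS...]
# Re-run COMMAND until it exits with status 0 or MAX_ATTEMPTS is reached.
# RETRY_DELAY (seconds between attempts, default 5) is an illustrative knob.
retry() {
    local max_attempts="$1"
    shift
    local attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "${RETRY_DELAY:-5}"
    done
}

# Example (substitute your own metadata pool name for the placeholder):
# retry 3 cephfs-data-scan tmap_upgrade <metadata pool name>
```

The loop stops at the first successful run, so a command that completes on its
first attempt is executed exactly once.
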
If you are upgrading a pre-Firefly CephFS file system to a newer Ceph version
than Jewel, you must first upgrade to Jewel and run the ``tmap_upgrade``
command before completing your upgrade to the latest version.