CephFS Reclaim Interface
========================

Introduction
------------
NFS servers typically do not track ephemeral state on stable storage. If
the NFS server is restarted, then it will be resurrected with no
ephemeral state, and the NFS clients are expected to send requests to
reclaim what state they held during a grace period.

In order to support this use-case, libcephfs has grown several functions
that allow a client that has been stopped and restarted to destroy or
reclaim state held by a previous incarnation of itself. This allows the
client to reacquire state held by its previous incarnation, and to avoid
the long wait for the old session to time out before releasing the state
previously held.

As soon as an NFS server running over CephFS goes down, it's racing
against its MDS session timeout. If the Ceph session times out before
the NFS grace period is started, then conflicting state could be
acquired by another client. This mechanism also allows us to increase
the timeout for these clients, to ensure that the server has a long
window of time to be restarted.

Setting the UUID
----------------
In order to properly reset or reclaim against the old session, we need a
way to identify the old session. This is done by setting a unique opaque
value on the session using **ceph_set_uuid()**. The uuid value can be
any string and is treated as opaque by the client.

Setting the uuid directly can only be done on a new session, prior to
mounting. When reclaim is performed the current session will inherit the
old session's uuid.
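
Because the uuid is opaque, it is up to the application to decide how to
derive it. A restarted server has to be able to recompute the value its
previous incarnation used, so one option is to build it from something
stable, such as a node identifier. The following is a minimal sketch
under that assumption; ``build_session_uuid()`` and the
``<service>-<node_id>`` naming scheme are hypothetical and are not part
of libcephfs.

.. code-block:: c

  #include <stddef.h>
  #include <stdio.h>

  /*
   * Hypothetical helper (not a libcephfs function): build a session
   * uuid of the form "<service>-<node_id>" in buf.
   *
   * Returns 0 on success, or -1 if an argument is invalid or the
   * result does not fit in buflen bytes (including the trailing NUL).
   */
  static int build_session_uuid(char *buf, size_t buflen,
                                const char *service, const char *node_id)
  {
    int n;

    if (buf == NULL || buflen == 0 || service == NULL || node_id == NULL)
      return -1;

    n = snprintf(buf, buflen, "%s-%s", service, node_id);
    if (n < 0 || (size_t)n >= buflen)
      return -1;

    return 0;
  }

A string built this way could be passed to ceph_set_uuid on a fresh
session, and passed again to ceph_start_reclaim after a restart to name
the old session.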

Starting Reclaim
----------------
After calling ceph_create and ceph_init on the resulting struct
ceph_mount_info, the client should then issue ceph_start_reclaim,
passing in the uuid of the previous incarnation of the client with any
flags.

CEPH_RECLAIM_RESET
  This flag indicates that we do not intend to do any sort of reclaim
  against the old session indicated by the given uuid, and that it
  should just be discarded. Any state held by the previous client
  should be released immediately.

Finishing Reclaim
-----------------
After the Ceph client has completed all of its reclaim operations, the
client should issue ceph_finish_reclaim to indicate that the reclaim is
now complete.

Setting Session Timeout (Optional)
----------------------------------
When a client dies and is restarted, and we need to preserve its state,
we are effectively racing against the session expiration clock. In this
situation we generally want a longer timeout since we expect to
eventually kill off the old session manually. The example below requests
one with ceph_set_session_timeout, passing the timeout in seconds.
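
The race can be stated as a simple comparison: the old session is only
still around to be reset or reclaimed if the server comes back before
the session expires. The sketch below is illustrative only;
``session_survives_restart()`` is a hypothetical helper, not a
libcephfs function, and it assumes the restart time is measured from the
point at which the old session was last renewed.

.. code-block:: c

  #include <stdbool.h>

  /*
   * Hypothetical check (not part of libcephfs): returns true when a
   * session with the given timeout is still expected to exist once the
   * server has taken restart_secs to come back up.
   */
  static bool session_survives_restart(unsigned session_timeout_secs,
                                       unsigned restart_secs)
  {
    return restart_secs < session_timeout_secs;
  }

With a 300 second timeout, for instance, a restart that takes 120
seconds fits inside the window, while one that takes 400 seconds does
not.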

Example 1: Reset Old Session
----------------------------
This example just kills off the MDS session held by a previous instance
of itself. An NFS server can start a grace period and then ask the MDS
to tear down the old session. This allows clients to start reclaim
immediately.

(Note: error handling omitted for clarity)

.. code-block:: c

  struct ceph_mount_info *cmount;
  const char *uuid = "foobarbaz";
  int rc;

  /*
   * nodeid (the uuid for the new session) and rootpath (the path to
   * mount) are assumed to be defined elsewhere by the application.
   */

  /*
   * Set up a new cephfs session, but don't mount it yet. Passing NULL
   * as the id selects the default client id.
   */
  rc = ceph_create(&cmount, NULL);
  rc = ceph_init(cmount);

  /*
   * Set the timeout to 5 minutes to lengthen the window of time for
   * the server to restart, should it crash.
   */
  ceph_set_session_timeout(cmount, 300);

  /*
   * Start reclaim vs. session with old uuid. Before calling this,
   * all NFS servers that could acquire conflicting state _must_ be
   * enforcing their grace period locally.
   */
  rc = ceph_start_reclaim(cmount, uuid, CEPH_RECLAIM_RESET);

  /* Declare reclaim complete */
  ceph_finish_reclaim(cmount);

  /* Set uuid held by new session */
  ceph_set_uuid(cmount, nodeid);

  /*
   * Now mount up the file system and do normal open/lock operations to
   * satisfy reclaim requests.
   */
  ceph_mount(cmount, rootpath);
  ...