Ceph filesystem client eviction
===============================

When a filesystem client is unresponsive or otherwise misbehaving, it
may be necessary to forcibly terminate its access to the filesystem. This
process is called *eviction*.

The procedure has several steps, which together protect against data inconsistency
resulting from misbehaving clients.

OSD blacklisting
----------------

First, prevent the client from performing any more data operations by *blacklisting*
it at the RADOS level. You may be familiar with this concept as *fencing* in other
storage systems.

Identify the client to evict from the MDS session list:

::

    # ceph daemon mds.a session ls
    [
        { "id": 4117,
          "num_leases": 0,
          "num_caps": 1,
          "state": "open",
          "replay_requests": 0,
          "reconnecting": false,
          "inst": "client.4117 172.16.79.251:0\/3271",
          "client_metadata": { "entity_id": "admin",
              "hostname": "fedoravm.localdomain",
              "mount_point": "\/home\/user\/mnt"}}]

In this case the 'fedoravm' client has address ``172.16.79.251:0/3271``, so we blacklist
it as follows:

::

    # ceph osd blacklist add 172.16.79.251:0/3271
    blacklisting 172.16.79.251:0/3271 until 2014-12-09 13:09:56.569368 (3600 sec)

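The address lookup can also be scripted. The following is a minimal sketch, not
part of Ceph itself: it parses the sample ``session ls`` output shown above and
assumes that the ``inst`` field always has the form ``<entity name> <address>``.
The helper names ``client_address`` and ``find_sessions`` are hypothetical.

```python
import json

# Sample `session ls` output, copied from the listing above.
SESSION_LS = r'''
[
    { "id": 4117,
      "num_leases": 0,
      "num_caps": 1,
      "state": "open",
      "replay_requests": 0,
      "reconnecting": false,
      "inst": "client.4117 172.16.79.251:0\/3271",
      "client_metadata": { "entity_id": "admin",
          "hostname": "fedoravm.localdomain",
          "mount_point": "\/home\/user\/mnt"}}]
'''


def client_address(session):
    """Return the address part of a session's "inst" field.

    Assumption: "inst" looks like "<entity name> <address>", as in the sample.
    """
    _name, addr = session["inst"].split(" ", 1)
    return addr


def find_sessions(sessions, hostname_prefix):
    """Hypothetical helper: select sessions whose hostname starts with a prefix."""
    return [
        s for s in sessions
        if s.get("client_metadata", {}).get("hostname", "").startswith(hostname_prefix)
    ]


sessions = json.loads(SESSION_LS)  # JSON decoding turns "\/" into "/"
target = find_sessions(sessions, "fedoravm")[0]

# Build (but do not run) the blacklist command for the selected client.
blacklist_cmd = ["ceph", "osd", "blacklist", "add", client_address(target)]
print(target["id"], " ".join(blacklist_cmd))
```

The sketch only builds the command. Review the selected session before running
the command against a real cluster.
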
OSD epoch barrier
-----------------

Although the client is now marked as blacklisted in the central (mon) copy of the OSD
map, you must still ensure that this OSD map update has propagated to all daemons
involved in subsequent filesystem I/O. To do this, use the ``osdmap barrier`` MDS admin
socket command.

First read the latest OSD epoch:

::

    # ceph osd dump
    epoch 12
    fsid fd61ca96-53ff-4311-826c-f36b176d69ea
    created 2014-12-09 12:03:38.595844
    modified 2014-12-09 12:09:56.619957
    ...

In this case it is 12. Now ask the MDS to barrier on this epoch:

::

    # ceph daemon mds.a osdmap barrier 12

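Reading the epoch by eye is error-prone in a script. As a rough sketch, assuming
the plain-text ``ceph osd dump`` output contains an ``epoch N`` line as in the
sample above, the epoch can be extracted like this. The function name
``osd_epoch`` is hypothetical.

```python
import re

# Sample `ceph osd dump` output, copied from the listing above.
OSD_DUMP = """\
epoch 12
fsid fd61ca96-53ff-4311-826c-f36b176d69ea
created 2014-12-09 12:03:38.595844
modified 2014-12-09 12:09:56.619957
"""


def osd_epoch(dump_text):
    """Return the OSD map epoch from the text output of `ceph osd dump`.

    Assumption: the output contains a line of the form "epoch <number>".
    """
    match = re.search(r"^epoch[ \t]+(\d+)[ \t]*$", dump_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no 'epoch N' line found in osd dump output")
    return int(match.group(1))


epoch = osd_epoch(OSD_DUMP)

# Build (but do not run) the barrier command for MDS "mds.a".
barrier_cmd = ["ceph", "daemon", "mds.a", "osdmap", "barrier", str(epoch)]
print(" ".join(barrier_cmd))
```

Failing loudly when no epoch line is found is deliberate: issuing a barrier on a
guessed epoch would defeat the purpose of the step.
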
MDS session eviction
--------------------

Finally, it is safe to evict the client's MDS session, such that any capabilities it held
may be issued to other clients. The ID here is the ``id`` attribute from the ``session ls``
output:

::

    # ceph daemon mds.a session evict 4117

That's it! The client has now been evicted, and any resources it had locked will
now be available for other clients.

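The ordering matters: blacklist first, then barrier, then evict. The sketch
below strings the three steps together using only the commands shown above. It
is an illustration under stated assumptions, not a supported tool: ``run_ceph``,
``evict_client`` and ``fake_run`` are hypothetical names, the script assumes the
``inst`` and ``epoch`` formats from the sample output, and ``run_ceph`` must be
run on the host where the MDS admin socket lives. The dry run at the end uses
canned output so the flow can be tried without a cluster.

```python
import json
import re
import subprocess


def run_ceph(args):
    """Run `ceph <args>` and return its standard output as text."""
    result = subprocess.run(
        ["ceph", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def evict_client(mds, session_id, run=run_ceph):
    """Blacklist, barrier, then evict, in the order described above."""
    # Look up the client's address in the MDS session list.
    sessions = json.loads(run(["daemon", mds, "session", "ls"]))
    matches = [s for s in sessions if s["id"] == session_id]
    if not matches:
        raise LookupError(f"no session with id {session_id} on {mds}")
    # Assumption: "inst" looks like "<entity name> <address>".
    addr = matches[0]["inst"].split(" ", 1)[1]

    # Step 1: fence the client at the RADOS level.
    run(["osd", "blacklist", "add", addr])

    # Step 2: read the post-blacklist epoch and barrier the MDS on it.
    dump = run(["osd", "dump"])
    found = re.search(r"^epoch[ \t]+(\d+)", dump, flags=re.MULTILINE)
    if found is None:
        raise ValueError("no 'epoch N' line in osd dump output")
    epoch = int(found.group(1))
    run(["daemon", mds, "osdmap", "barrier", str(epoch)])

    # Step 3: only now evict the MDS session.
    run(["daemon", mds, "session", "evict", str(session_id)])
    return addr, epoch


# Dry run against canned output (no cluster needed).
issued = []


def fake_run(args):
    issued.append("ceph " + " ".join(args))
    if args[-2:] == ["session", "ls"]:
        return '[{"id": 4117, "inst": "client.4117 172.16.79.251:0\\/3271"}]'
    if args == ["osd", "dump"]:
        return "epoch 12\nfsid fd61ca96-53ff-4311-826c-f36b176d69ea\n"
    return ""


result = evict_client("mds.a", 4117, run=fake_run)
print("\n".join(issued))
```

Passing the command runner as a parameter keeps the ordering logic testable
without touching a live cluster.
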
Background: OSD epoch barrier
-----------------------------

The purpose of the barrier is to ensure that when we hand out any
capabilities which might allow touching the same RADOS objects, the
clients receiving those capabilities have a sufficiently recent
OSD map to not race with cancelled operations (from ENOSPC) or
blacklisted clients (from evictions).

More specifically, the cases where we set an epoch barrier are:

* Client eviction (where the client is blacklisted and other clients
  must wait for a post-blacklist epoch to touch the same objects).
* OSD map full flag handling in the client (where the client may
  cancel some OSD ops from a pre-full epoch, so other clients must
  wait until the full epoch or later before touching the same objects).
* MDS startup, because we don't persist the barrier epoch, so we must
  assume that the latest OSD map is always required after a restart.

Note that this is a global value for simplicity. We could maintain it on
a per-inode basis, but we don't, because:

* It would be more complicated.
* It would use an extra 4 bytes of memory for every inode.
* It would not be much more efficient: almost everyone has the latest
  OSD map almost all of the time, so in most cases everyone will breeze
  through this barrier rather than waiting.
* We only do this barrier in very rare cases, so any benefit from per-inode
  granularity would only very rarely be seen.

The epoch barrier is transmitted along with all capability messages, and
instructs the receiver of the message to avoid sending any more RADOS
operations to OSDs until it has seen this OSD epoch. This mainly applies
to clients (doing their data writes directly to files), but also applies
to the MDS because things like file size probing and file deletion are
done directly from the MDS.

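The receiver-side rule can be illustrated with a toy model. This is not Ceph's
client or MDS code: the class ``BarrierAwareClient`` and its methods are
hypothetical, and the sketch assumes that neither the barrier nor the locally
known OSD map epoch ever moves backwards. It shows only the gating behaviour
described above, where operations are held until the receiver has seen an OSD
map at least as new as the barrier epoch.

```python
class BarrierAwareClient:
    """Toy model of a capability receiver (illustrative only)."""

    def __init__(self, osdmap_epoch):
        self.osdmap_epoch = osdmap_epoch  # newest OSD map epoch seen so far
        self.epoch_barrier = 0            # barrier carried by cap messages
        self.pending = []                 # operations held back by the barrier
        self.sent = []                    # operations released to the OSDs

    def handle_cap_message(self, epoch_barrier):
        # Assumption: the barrier only ever increases.
        self.epoch_barrier = max(self.epoch_barrier, epoch_barrier)

    def handle_osdmap(self, epoch):
        # A newer OSD map may release operations that were waiting.
        self.osdmap_epoch = max(self.osdmap_epoch, epoch)
        self._flush()

    def submit(self, op):
        self.pending.append(op)
        self._flush()

    def _flush(self):
        # Send only once the local OSD map has reached the barrier epoch.
        if self.osdmap_epoch >= self.epoch_barrier:
            self.sent.extend(self.pending)
            self.pending.clear()


client = BarrierAwareClient(osdmap_epoch=11)
client.handle_cap_message(epoch_barrier=12)  # e.g. after an eviction at epoch 12
client.submit("write obj.0")
held = list(client.pending)                  # the write waits at epoch 11
client.handle_osdmap(12)                     # the map catches up, write is released
print(held, client.sent)
```

A receiver that already has the latest map never queues anything, which matches
the observation above that most clients pass through the barrier without waiting.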