]>
Commit | Line | Data |
---|---|---|
36c95590 DH |
1 | ==================================================== |
2 | IN-KERNEL CACHE OBJECT REPRESENTATION AND MANAGEMENT | |
3 | ==================================================== | |
4 | ||
5 | By: David Howells <dhowells@redhat.com> | |
6 | ||
7 | Contents: | |
8 | ||
9 | (*) Representation | |
10 | ||
11 | (*) Object management state machine. | |
12 | ||
13 | - Provision of cpu time. | |
14 | - Locking simplification. | |
15 | ||
16 | (*) The set of states. | |
17 | ||
18 | (*) The set of events. | |
19 | ||
20 | ||
21 | ============== | |
22 | REPRESENTATION | |
23 | ============== | |
24 | ||
25 | FS-Cache maintains an in-kernel representation of each object that a netfs is | |
26 | currently interested in. Such objects are represented by the fscache_cookie | |
27 | struct and are referred to as cookies. | |
28 | ||
29 | FS-Cache also maintains a separate in-kernel representation of the objects that | |
30 | a cache backend is currently actively caching. Such objects are represented by | |
31 | the fscache_object struct. The cache backends allocate these upon request, and | |
32 | are expected to embed them in their own representations. These are referred to | |
33 | as objects. | |
34 | ||
35 | There is a 1:N relationship between cookies and objects. A cookie may be | |
36 | represented by multiple objects - an index may exist in more than one cache - | |
37 | or even by no objects (it may not be cached). | |
38 | ||
39 | Furthermore, both cookies and objects are hierarchical. The two hierarchies | |
40 | correspond, but the cookies tree is a superset of the union of the object trees | |
41 | of multiple caches: | |
42 | ||
43 | NETFS INDEX TREE : CACHE 1 : CACHE 2 | |
44 | : : | |
45 | : +-----------+ : | |
46 | +----------->| IObject | : | |
47 | +-----------+ | : +-----------+ : | |
48 | | ICookie |-------+ : | : | |
49 | +-----------+ | : | : +-----------+ | |
50 | | +------------------------------>| IObject | | |
51 | | : | : +-----------+ | |
52 | | : V : | | |
53 | | : +-----------+ : | | |
54 | V +----------->| IObject | : | | |
55 | +-----------+ | : +-----------+ : | | |
56 | | ICookie |-------+ : | : V | |
57 | +-----------+ | : | : +-----------+ | |
58 | | +------------------------------>| IObject | | |
59 | +-----+-----+ : | : +-----------+ | |
60 | | | : | : | | |
61 | V | : V : | | |
62 | +-----------+ | : +-----------+ : | | |
63 | | ICookie |------------------------->| IObject | : | | |
64 | +-----------+ | : +-----------+ : | | |
65 | | V : | : V | |
66 | | +-----------+ : | : +-----------+ | |
67 | | | ICookie |-------------------------------->| IObject | | |
68 | | +-----------+ : | : +-----------+ | |
69 | V | : V : | | |
70 | +-----------+ | : +-----------+ : | | |
71 | | DCookie |------------------------->| DObject | : | | |
72 | +-----------+ | : +-----------+ : | | |
73 | | : : | | |
74 | +-------+-------+ : : | | |
75 | | | : : | | |
76 | V V : : V | |
77 | +-----------+ +-----------+ : : +-----------+ | |
78 | | DCookie | | DCookie |------------------------>| DObject | | |
79 | +-----------+ +-----------+ : : +-----------+ | |
80 | : : | |
81 | ||
82 | In the above illustration, ICookie and IObject represent indices and DCookie | |
83 | and DObject represent data storage objects. Indices may have representation in | |
84 | multiple caches, but currently, non-index objects may not. Objects of any type | |
85 | may also be entirely unrepresented. | |
86 | ||
87 | As far as the netfs API goes, the netfs is only actually permitted to see | |
88 | pointers to the cookies. The cookies themselves and any objects attached to | |
89 | those cookies are hidden from it. | |
90 | ||
91 | ||
92 | =============================== | |
93 | OBJECT MANAGEMENT STATE MACHINE | |
94 | =============================== | |
95 | ||
96 | Within FS-Cache, each active object is managed by its own individual state | |
97 | machine. The state for an object is kept in the fscache_object struct, in | |
98 | object->state. A cookie may point to a set of objects that are in different | |
99 | states. | |
100 | ||
101 | Each state has an action associated with it that is invoked when the machine | |
102 | wakes up in that state. There are four logical sets of states: | |
103 | ||
104 | (1) Preparation: states that wait for the parent objects to become ready. The | |
105 | representations are hierarchical, and it is expected that an object must | |
106 | be created or accessed with respect to its parent object. | |
107 | ||
108 | (2) Initialisation: states that perform lookups in the cache and validate | |
109 | what's found and that create on disk any missing metadata. | |
110 | ||
111 | (3) Normal running: states that allow netfs operations on objects to proceed | |
112 | and that update the state of objects. | |
113 | ||
114 | (4) Termination: states that detach objects from their netfs cookies, that | |
115 | delete objects from disk, that handle disk and system errors and that free | |
116 | up in-memory resources. | |
117 | ||
118 | ||
119 | In most cases, transitioning between states is in response to signalled events. | |
120 | When a state has finished processing, it will usually set the mask of events in | |
121 | which it is interested (object->event_mask) and relinquish the worker thread. | |
122 | Then when an event is raised (by calling fscache_raise_event()), if the event | |
123 | is not masked, the object will be queued for processing (by calling | |
124 | fscache_enqueue_object()). | |
125 | ||
126 | ||
127 | PROVISION OF CPU TIME | |
128 | --------------------- | |
129 | ||
395cf969 PB |
130 | The work to be done by the various states was given CPU time by the threads of |
131 | the slow work facility. This was used in preference to the workqueue facility | |
132 | because: | |
36c95590 DH |
133 | |
134 | (1) Threads may be completely occupied for very long periods of time by a | |
135 | particular work item. These state actions may be doing sequences of | |
136 | synchronous, journalled disk accesses (lookup, mkdir, create, setxattr, | |
137 | getxattr, truncate, unlink, rmdir, rename). | |
138 | ||
139 | (2) Threads may do little actual work, but may rather spend a lot of time | |
140 | sleeping on I/O. This means that single-threaded and 1-per-CPU-threaded | |
141 | workqueues don't necessarily have the right numbers of threads. | |
142 | ||
143 | ||
144 | LOCKING SIMPLIFICATION | |
145 | ---------------------- | |
146 | ||
147 | Because only one worker thread may be operating on any particular object's | |
148 | state machine at once, this simplifies the locking, particularly with respect | |
149 | to disconnecting the netfs's representation of a cache object (fscache_cookie) | |
150 | from the cache backend's representation (fscache_object) - which may be | |
151 | requested from either end. | |
152 | ||
153 | ||
154 | ================= | |
155 | THE SET OF STATES | |
156 | ================= | |
157 | ||
158 | The object state machine has a set of states that it can be in. There are | |
159 | preparation states in which the object sets itself up and waits for its parent | |
160 | object to transit to a state that allows access to its children: | |
161 | ||
162 | (1) State FSCACHE_OBJECT_INIT. | |
163 | ||
164 | Initialise the object and wait for the parent object to become active. In | |
165 | the cache, it is expected that it will not be possible to look an object | |
166 | up from the parent object, until that parent object itself has been looked | |
167 | up. | |
168 | ||
169 | There are initialisation states in which the object sets itself up and accesses | |
170 | disk for the object metadata: | |
171 | ||
172 | (2) State FSCACHE_OBJECT_LOOKING_UP. | |
173 | ||
174 | Look up the object on disk, using the parent as a starting point. | |
175 | FS-Cache expects the cache backend to probe the cache to see whether this | |
176 | object is represented there, and if it is, to see if it's valid (coherency | |
177 | management). | |
178 | ||
179 | The cache should call fscache_object_lookup_negative() to indicate lookup | |
180 | failure for whatever reason, and should call fscache_obtained_object() to | |
181 | indicate success. | |
182 | ||
183 | At the completion of lookup, FS-Cache will let the netfs go ahead with | |
184 | read operations, no matter whether the file is yet cached. If not yet | |
185 | cached, read operations will be immediately rejected with ENODATA until | |
186 | the first known page is uncached - as to that point there can be no data | |
187 | to be read out of the cache for that file that isn't currently also held | |
188 | in the pagecache. | |
189 | ||
190 | (3) State FSCACHE_OBJECT_CREATING. | |
191 | ||
192 | Create an object on disk, using the parent as a starting point. This | |
193 | happens if the lookup failed to find the object, or if the object's | |
194 | coherency data indicated what's on disk is out of date. In this state, | |
195 | FS-Cache expects the cache to create | |
196 | ||
197 | The cache should call fscache_obtained_object() if creation completes | |
198 | successfully, fscache_object_lookup_negative() otherwise. | |
199 | ||
200 | At the completion of creation, FS-Cache will start processing write | |
201 | operations the netfs has queued for an object. If creation failed, the | |
202 | write ops will be transparently discarded, and nothing recorded in the | |
203 | cache. | |
204 | ||
205 | There are some normal running states in which the object spends its time | |
206 | servicing netfs requests: | |
207 | ||
208 | (4) State FSCACHE_OBJECT_AVAILABLE. | |
209 | ||
210 | A transient state in which pending operations are started, child objects | |
211 | are permitted to advance from FSCACHE_OBJECT_INIT state, and temporary | |
212 | lookup data is freed. | |
213 | ||
214 | (5) State FSCACHE_OBJECT_ACTIVE. | |
215 | ||
216 | The normal running state. In this state, requests the netfs makes will be | |
217 | passed on to the cache. | |
218 | ||
ef778e7a DH |
219 | (6) State FSCACHE_OBJECT_INVALIDATING. |
220 | ||
221 | The object is undergoing invalidation. When the state comes here, it | |
222 | discards all pending read, write and attribute change operations as it is | |
223 | going to clear out the cache entirely and reinitialise it. It will then | |
224 | continue to the FSCACHE_OBJECT_UPDATING state. | |
225 | ||
226 | (7) State FSCACHE_OBJECT_UPDATING. | |
36c95590 DH |
227 | |
228 | The state machine comes here to update the object in the cache from the | |
229 | netfs's records. This involves updating the auxiliary data that is used | |
230 | to maintain coherency. | |
231 | ||
232 | And there are terminal states in which an object cleans itself up, deallocates | |
233 | memory and potentially deletes stuff from disk: | |
234 | ||
ef778e7a | 235 | (8) State FSCACHE_OBJECT_LC_DYING. |
36c95590 DH |
236 | |
237 | The object comes here if it is dying because of a lookup or creation | |
238 | error. This would be due to a disk error or system error of some sort. | |
239 | Temporary data is cleaned up, and the parent is released. | |
240 | ||
ef778e7a | 241 | (9) State FSCACHE_OBJECT_DYING. |
36c95590 DH |
242 | |
243 | The object comes here if it is dying due to an error, because its parent | |
244 | cookie has been relinquished by the netfs or because the cache is being | |
245 | withdrawn. | |
246 | ||
247 | Any child objects waiting on this one are given CPU time so that they too | |
248 | can destroy themselves. This object waits for all its children to go away | |
249 | before advancing to the next state. | |
250 | ||
ef778e7a | 251 | (10) State FSCACHE_OBJECT_ABORT_INIT. |
36c95590 DH |
252 | |
253 | The object comes to this state if it was waiting on its parent in | |
254 | FSCACHE_OBJECT_INIT, but its parent died. The object will destroy itself | |
255 | so that the parent may proceed from the FSCACHE_OBJECT_DYING state. | |
256 | ||
ef778e7a DH |
257 | (11) State FSCACHE_OBJECT_RELEASING. |
258 | (12) State FSCACHE_OBJECT_RECYCLING. | |
36c95590 DH |
259 | |
260 | The object comes to one of these two states when dying once it is rid of | |
261 | all its children, if it is dying because the netfs relinquished its | |
262 | cookie. In the first state, the cached data is expected to persist, and | |
263 | in the second it will be deleted. | |
264 | ||
ef778e7a | 265 | (13) State FSCACHE_OBJECT_WITHDRAWING. |
36c95590 DH |
266 | |
267 | The object transits to this state if the cache decides it wants to | |
268 | withdraw the object from service, perhaps to make space, but also due to | |
269 | error or just because the whole cache is being withdrawn. | |
270 | ||
ef778e7a | 271 | (14) State FSCACHE_OBJECT_DEAD. |
36c95590 DH |
272 | |
273 | The object transits to this state when the in-memory object record is | |
274 | ready to be deleted. The object processor shouldn't ever see an object in | |
275 | this state. | |
276 | ||
277 | ||
278 | THE SET OF EVENTS | |
279 | ----------------- | |
280 | ||
281 | There are a number of events that can be raised to an object state machine: | |
282 | ||
283 | (*) FSCACHE_OBJECT_EV_UPDATE | |
284 | ||
285 | The netfs requested that an object be updated. The state machine will ask | |
286 | the cache backend to update the object, and the cache backend will ask the | |
287 | netfs for details of the change through its cookie definition ops. | |
288 | ||
289 | (*) FSCACHE_OBJECT_EV_CLEARED | |
290 | ||
291 | This is signalled in two circumstances: | |
292 | ||
293 | (a) when an object's last child object is dropped and | |
294 | ||
295 | (b) when the last operation outstanding on an object is completed. | |
296 | ||
297 | This is used to proceed from the dying state. | |
298 | ||
299 | (*) FSCACHE_OBJECT_EV_ERROR | |
300 | ||
301 | This is signalled when an I/O error occurs during the processing of some | |
302 | object. | |
303 | ||
304 | (*) FSCACHE_OBJECT_EV_RELEASE | |
305 | (*) FSCACHE_OBJECT_EV_RETIRE | |
306 | ||
307 | These are signalled when the netfs relinquishes a cookie it was using. | |
308 | The event selected depends on whether the netfs asks for the backing | |
309 | object to be retired (deleted) or retained. | |
310 | ||
311 | (*) FSCACHE_OBJECT_EV_WITHDRAW | |
312 | ||
313 | This is signalled when the cache backend wants to withdraw an object. | |
314 | This means that the object will have to be detached from the netfs's | |
315 | cookie. | |
316 | ||
317 | Because the withdrawing releasing/retiring events are all handled by the object | |
318 | state machine, it doesn't matter if there's a collision with both ends trying | |
319 | to sever the connection at the same time. The state machine can just pick | |
320 | which one it wants to honour, and that effects the other. |