=================================
GFP masks used from FS/IO context
=================================

:Date: May, 2018
:Author: Michal Hocko <mhocko@kernel.org>

Introduction
============

Code paths in the filesystem and IO stacks must be careful when
allocating memory to prevent recursion deadlocks caused by direct
memory reclaim calling back into the FS or IO paths and blocking on
already held resources (e.g. locks, most commonly those used for the
transaction context).

The traditional way to avoid this deadlock problem is to clear __GFP_FS
or __GFP_IO (note that clearing the latter implies clearing the former
as well) in the gfp mask when calling an allocator. GFP_NOFS and
GFP_NOIO, respectively, can be used as shortcuts. It turned out, though,
that this approach has led to abuses where the restricted gfp mask is
used "just in case" without deeper consideration. This causes problems
because excessive use of GFP_NOFS/GFP_NOIO can lead to memory
over-reclaim or other memory reclaim issues.
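
The relationship between these masks can be sketched with a small
userspace model. The bit values below are placeholders chosen for
illustration (the kernel's real values differ), and every ``MODEL_``
and ``model_`` name is hypothetical. Only the composition of the masks
is meant to mirror the kernel: the GFP_KERNEL-like mask allows reclaim,
IO and FS, the NOFS variant drops the FS bit, and the NOIO variant drops
both.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bits for this userspace model; NOT the kernel's values. */
#define MODEL_GFP_RECLAIM 0x1u /* allocation may enter reclaim */
#define MODEL_GFP_IO      0x2u /* reclaim may start physical IO */
#define MODEL_GFP_FS      0x4u /* reclaim may call into the filesystem */

/* Convenience masks, composed the same way as their kernel namesakes. */
#define MODEL_GFP_KERNEL (MODEL_GFP_RECLAIM | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS   (MODEL_GFP_RECLAIM | MODEL_GFP_IO)
#define MODEL_GFP_NOIO   (MODEL_GFP_RECLAIM)

/* May reclaim triggered by this allocation recurse into FS code? */
static bool model_may_enter_fs(unsigned int gfp)
{
	return (gfp & MODEL_GFP_FS) != 0;
}

/* May reclaim triggered by this allocation issue IO? */
static bool model_may_do_io(unsigned int gfp)
{
	return (gfp & MODEL_GFP_IO) != 0;
}
```

In this model ``MODEL_GFP_NOFS`` still permits IO, while
``MODEL_GFP_NOIO`` permits neither IO nor FS recursion, which is the
implication noted above.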

New API
=======

Since 4.12 there is a generic scope API for both NOFS and NOIO contexts:
``memalloc_nofs_save`` and ``memalloc_nofs_restore`` for NOFS, and
``memalloc_noio_save`` and ``memalloc_noio_restore`` for NOIO. They
allow marking a scope as a critical section from a filesystem or I/O
point of view. Any allocation from within that scope will implicitly
drop __GFP_FS or __GFP_IO, respectively, from the given mask, so no
memory allocation can recurse back into the FS/IO paths.

.. kernel-doc:: include/linux/sched/mm.h
   :functions: memalloc_nofs_save memalloc_nofs_restore
.. kernel-doc:: include/linux/sched/mm.h
   :functions: memalloc_noio_save memalloc_noio_restore

FS/IO code then simply calls the appropriate save function before
any section that is critical with respect to reclaim is started, e.g.
before taking a lock shared with the reclaim context or when transaction
context nesting would be possible via reclaim. The restore function
should be called when the critical section ends. Ideally, all of this
comes with an explanation of what the reclaim context is, for easier
maintenance.

Please note that proper pairing of the save/restore functions
allows nesting, so it is safe to call ``memalloc_noio_save`` and the
matching ``memalloc_noio_restore`` from within an existing NOIO or NOFS
scope.
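
The save/restore pairing and its nesting behaviour can be illustrated
with a userspace sketch. It assumes that a save call marks the current
task and returns the previous state of that marker, and that restore
puts that state back. All ``model_`` and ``MODEL_`` names and bit values
are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>

/* Placeholder bits for this userspace model; NOT the kernel's values. */
#define MODEL_GFP_RECLAIM 0x1u
#define MODEL_GFP_IO      0x2u
#define MODEL_GFP_FS      0x4u
#define MODEL_GFP_KERNEL  (MODEL_GFP_RECLAIM | MODEL_GFP_IO | MODEL_GFP_FS)

#define MODEL_PF_NOFS 0x1u /* stands in for the per-task NOFS marker */
#define MODEL_PF_NOIO 0x2u /* stands in for the per-task NOIO marker */

/* Stands in for the flags of the current task. */
static unsigned int model_task_flags;

/* Enter a NOFS scope; returns the previous NOFS state for restore. */
static unsigned int model_nofs_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_NOFS;

	model_task_flags |= MODEL_PF_NOFS;
	return old;
}

/* Leave a NOFS scope, putting back the state that save observed. */
static void model_nofs_restore(unsigned int old)
{
	model_task_flags = (model_task_flags & ~MODEL_PF_NOFS) | old;
}

/* Enter a NOIO scope; returns the previous NOIO state for restore. */
static unsigned int model_noio_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_NOIO;

	model_task_flags |= MODEL_PF_NOIO;
	return old;
}

/* Leave a NOIO scope, putting back the state that save observed. */
static void model_noio_restore(unsigned int old)
{
	model_task_flags = (model_task_flags & ~MODEL_PF_NOIO) | old;
}

/* Mask an allocator would effectively use for the requested gfp. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
	if (model_task_flags & MODEL_PF_NOIO)
		gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
	else if (model_task_flags & MODEL_PF_NOFS)
		gfp &= ~MODEL_GFP_FS;
	return gfp;
}

/* Nested use: a NOIO scope opened and closed inside a NOFS scope. */
static void model_nested_scopes(void)
{
	unsigned int nofs = model_nofs_save();
	unsigned int noio;

	assert(!(model_effective_gfp(MODEL_GFP_KERNEL) & MODEL_GFP_FS));

	noio = model_noio_save();
	assert(model_effective_gfp(MODEL_GFP_KERNEL) == MODEL_GFP_RECLAIM);
	model_noio_restore(noio);

	/* The outer NOFS scope is still in force after the inner restore. */
	assert(model_effective_gfp(MODEL_GFP_KERNEL) ==
	       (MODEL_GFP_RECLAIM | MODEL_GFP_IO));
	model_nofs_restore(nofs);

	/* Outside every scope the requested mask is used unchanged. */
	assert(model_effective_gfp(MODEL_GFP_KERNEL) == MODEL_GFP_KERNEL);
}
```

Because each restore only reinstates the state seen by its own save,
closing an inner scope never ends an outer one, provided the calls are
properly paired.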

What about __vmalloc(GFP_NOFS)
==============================

vmalloc doesn't support GFP_NOFS semantics because there are hardcoded
GFP_KERNEL allocations deep inside the allocator which are quite
non-trivial to fix up. That means that calling ``vmalloc`` with
GFP_NOFS/GFP_NOIO is almost always a bug. The good news is that the
NOFS/NOIO semantics can be achieved with the scope API.

In an ideal world, upper layers should already mark dangerous contexts,
so no special care is required and vmalloc can be called without any
problems. Sometimes, if the context is not really clear or there are
layering violations, the recommended workaround is to wrap ``vmalloc``
in the scope API with a comment explaining the problem.
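
Why the scope helps where the mask does not can be shown with a
userspace sketch. ``model_vmalloc_may_enter_fs`` is a hypothetical
stand-in for an allocator that makes one internal allocation with a
hardcoded GFP_KERNEL-like mask; it is not how vmalloc is implemented.
All names, bit values and the locking scenario in the comments are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bits for this userspace model; NOT the kernel's values. */
#define MODEL_GFP_RECLAIM 0x1u
#define MODEL_GFP_IO      0x2u
#define MODEL_GFP_FS      0x4u
#define MODEL_GFP_KERNEL  (MODEL_GFP_RECLAIM | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS    (MODEL_GFP_RECLAIM | MODEL_GFP_IO)

#define MODEL_PF_NOFS 0x1u /* stands in for the per-task NOFS marker */

/* Stands in for the flags of the current task. */
static unsigned int model_task_flags;

/* Enter a NOFS scope; returns the previous NOFS state for restore. */
static unsigned int model_nofs_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_NOFS;

	model_task_flags |= MODEL_PF_NOFS;
	return old;
}

/* Leave a NOFS scope, putting back the state that save observed. */
static void model_nofs_restore(unsigned int old)
{
	model_task_flags = (model_task_flags & ~MODEL_PF_NOFS) | old;
}

/* Mask an allocator would effectively use for the requested gfp. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
	if (model_task_flags & MODEL_PF_NOFS)
		gfp &= ~MODEL_GFP_FS;
	return gfp;
}

/*
 * Hypothetical vmalloc-like allocator: the data pages honour the
 * caller's mask, but one internal allocation ignores it and uses a
 * hardcoded GFP_KERNEL-like mask. Returns true if any of the two
 * allocations could recurse into the filesystem.
 */
static bool model_vmalloc_may_enter_fs(unsigned int gfp)
{
	unsigned int pages = model_effective_gfp(gfp);
	unsigned int internal = model_effective_gfp(MODEL_GFP_KERNEL);

	return ((pages | internal) & MODEL_GFP_FS) != 0;
}

/* The recommended pattern: wrap the call in a NOFS scope. */
static bool model_scoped_vmalloc_may_enter_fs(void)
{
	/*
	 * Hypothetical reason for the scope: the caller holds a lock
	 * that the reclaim path may also take.
	 */
	unsigned int old = model_nofs_save();
	bool risky = model_vmalloc_may_enter_fs(MODEL_GFP_KERNEL);

	model_nofs_restore(old);
	return risky;
}
```

In this model, passing ``MODEL_GFP_NOFS`` directly still leaves the
internal allocation free to recurse into the filesystem, whereas the
scoped call restricts every allocation made inside it.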