.. _gfp_mask_from_fs_io:

=================================
GFP masks used from FS/IO context
=================================

:Date: May, 2018
:Author: Michal Hocko <mhocko@kernel.org>

Introduction
============

Code paths in the filesystem and IO stacks must be careful when
allocating memory to prevent recursion deadlocks caused by direct
memory reclaim calling back into the FS or IO paths and blocking on
already held resources (e.g. locks - most commonly those used for the
transaction context).

The traditional way to avoid this deadlock problem is to clear __GFP_FS
or __GFP_IO (note that clearing the latter implies clearing the former as
well) in the gfp mask when calling an allocator. GFP_NOFS and GFP_NOIO can
be used as shortcuts. It turned out, though, that this approach has led to
abuses: the restricted gfp mask is often used "just in case", without
deeper consideration. This causes problems because excessive use of
GFP_NOFS/GFP_NOIO can lead to memory over-reclaim or other memory
reclaim issues.

New API
========

Since 4.12 there is a generic scope API for both NOFS and NOIO contexts:
``memalloc_nofs_save`` and ``memalloc_nofs_restore`` for NOFS,
``memalloc_noio_save`` and ``memalloc_noio_restore`` for NOIO. These allow
marking a scope as a critical section from a filesystem or I/O point of
view. Any allocation from that scope will inherently drop __GFP_FS or
__GFP_IO, respectively, from the given mask, so no memory allocation can
recurse back into the FS/IO.

.. kernel-doc:: include/linux/sched/mm.h
   :functions: memalloc_nofs_save memalloc_nofs_restore
.. kernel-doc:: include/linux/sched/mm.h
   :functions: memalloc_noio_save memalloc_noio_restore

FS/IO code then simply calls the appropriate save function before
any critical section with respect to reclaim is started, e.g. when
taking a lock shared with the reclaim context or when transaction context
nesting would be possible via reclaim. The restore function should be
called when the critical section ends. Ideally, all of this comes with an
explanation of what the reclaim context is, for easier maintenance.

Please note that proper pairing of the save/restore functions
allows nesting, so it is safe to call ``memalloc_noio_save`` and
``memalloc_noio_restore`` from within an existing NOIO or NOFS
scope.
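
The save/restore pairing and its nesting behaviour can be sketched with a
small userspace model. This is not kernel code: ``task_flags`` stands in
for the per-task flags, the ``MODEL_*`` bit values are hypothetical, and
``effective_gfp`` is an illustrative helper assumed here to show which
bits survive inside a scope.

```c
/*
 * Userspace model of the scope API, NOT kernel code.
 * All MODEL_* values and effective_gfp() are hypothetical.
 */
#include <assert.h>

#define MODEL_GFP_IO		0x1u	/* models __GFP_IO */
#define MODEL_GFP_FS		0x2u	/* models __GFP_FS */
#define MODEL_GFP_KERNEL	(MODEL_GFP_IO | MODEL_GFP_FS)

#define MODEL_PF_NOIO		0x1u	/* task is inside a NOIO scope */
#define MODEL_PF_NOFS		0x2u	/* task is inside a NOFS scope */

static unsigned int task_flags;	/* stand-in for the per-task flags */

/* Enter a NOFS scope; the returned old state is what makes nesting safe. */
static unsigned int memalloc_nofs_save(void)
{
	unsigned int old = task_flags & MODEL_PF_NOFS;

	task_flags |= MODEL_PF_NOFS;
	return old;
}

/* Leave a NOFS scope by putting back the state seen at save time. */
static void memalloc_nofs_restore(unsigned int old)
{
	task_flags = (task_flags & ~MODEL_PF_NOFS) | old;
}

static unsigned int memalloc_noio_save(void)
{
	unsigned int old = task_flags & MODEL_PF_NOIO;

	task_flags |= MODEL_PF_NOIO;
	return old;
}

static void memalloc_noio_restore(unsigned int old)
{
	task_flags = (task_flags & ~MODEL_PF_NOIO) | old;
}

/* Mask an allocation would effectively use inside the current scope. */
static unsigned int effective_gfp(unsigned int gfp)
{
	if (task_flags & MODEL_PF_NOIO)
		return gfp & ~(MODEL_GFP_IO | MODEL_GFP_FS);
	if (task_flags & MODEL_PF_NOFS)
		return gfp & ~MODEL_GFP_FS;
	return gfp;
}

/* Nest a NOIO scope inside a NOFS scope; returns 1 if all state unwinds. */
static int nested_scope_demo(void)
{
	unsigned int nofs, noio;

	nofs = memalloc_nofs_save();
	assert(effective_gfp(MODEL_GFP_KERNEL) == MODEL_GFP_IO);

	noio = memalloc_noio_save();
	assert(effective_gfp(MODEL_GFP_KERNEL) == 0);

	memalloc_noio_restore(noio);	/* outer NOFS scope is still active */
	assert(effective_gfp(MODEL_GFP_KERNEL) == MODEL_GFP_IO);

	memalloc_nofs_restore(nofs);
	return effective_gfp(MODEL_GFP_KERNEL) == MODEL_GFP_KERNEL;
}
```

In this model each restore only puts back the state observed by its own
save call, which is why an inner scope cannot accidentally end an outer one.
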

What about __vmalloc(GFP_NOFS)
==============================

vmalloc doesn't support GFP_NOFS semantics because there are hardcoded
GFP_KERNEL allocations deep inside the allocator which are quite non-trivial
to fix up. That means that calling ``vmalloc`` with GFP_NOFS/GFP_NOIO is
almost always a bug. The good news is that the NOFS/NOIO semantics can be
achieved by the scope API.

In an ideal world, upper layers should already mark dangerous contexts,
so no special care is required and vmalloc can be called without
any problems. Sometimes, if the context is not really clear or there are
layering violations, the recommended workaround is to wrap ``vmalloc``
in the scope API, with a comment explaining the problem.
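
The difference between passing a restricted mask and using the scope API
can be illustrated with a userspace model. This is a sketch under stated
assumptions, not kernel code: ``model_vmalloc_internal_gfp`` is a
hypothetical stand-in for an allocator with a hardcoded GFP_KERNEL
allocation inside, and all ``MODEL_*`` values are invented for the example.

```c
/*
 * Userspace model, NOT kernel code. All MODEL_* values and the
 * model_*() and alloc_in_nofs_scope() helpers are hypothetical.
 */
#include <assert.h>

#define MODEL_GFP_IO		0x1u	/* models __GFP_IO */
#define MODEL_GFP_FS		0x2u	/* models __GFP_FS */
#define MODEL_GFP_KERNEL	(MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS		MODEL_GFP_IO	/* GFP_KERNEL without FS */

#define MODEL_PF_NOFS		0x1u	/* task is inside a NOFS scope */

static unsigned int task_flags;	/* stand-in for the per-task flags */

static unsigned int memalloc_nofs_save(void)
{
	unsigned int old = task_flags & MODEL_PF_NOFS;

	task_flags |= MODEL_PF_NOFS;
	return old;
}

static void memalloc_nofs_restore(unsigned int old)
{
	task_flags = (task_flags & ~MODEL_PF_NOFS) | old;
}

/* The scope is applied to every allocation, whatever mask was requested. */
static unsigned int effective_gfp(unsigned int gfp)
{
	return (task_flags & MODEL_PF_NOFS) ? (gfp & ~MODEL_GFP_FS) : gfp;
}

/*
 * Returns the mask used by the allocator's internal allocation. It is
 * hardcoded to GFP_KERNEL, so the caller's mask never reaches it.
 */
static unsigned int model_vmalloc_internal_gfp(unsigned int caller_gfp)
{
	(void)caller_gfp;	/* ignored by the hardcoded allocation */
	return effective_gfp(MODEL_GFP_KERNEL);
}

/* Hypothetical caller that must not recurse into the filesystem. */
static unsigned int alloc_in_nofs_scope(void)
{
	unsigned int flags, used;

	/* Reclaim must not re-enter the FS here: say which lock is held. */
	flags = memalloc_nofs_save();
	used = model_vmalloc_internal_gfp(MODEL_GFP_KERNEL);
	memalloc_nofs_restore(flags);

	return used;
}
```

In the model, passing ``MODEL_GFP_NOFS`` directly still leaves the FS bit
set on the internal allocation, while the scoped call clears it.
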