Documentation for /proc/sys/vm/*    kernel version 2.2.10
    (c) 1998, 1999, Rik van Riel <riel@nl.linux.org>

For general info and legal blurb, please look in README.

==============================================================

This file contains the documentation for the sysctl files in
/proc/sys/vm and is valid for Linux kernel version 2.2.

The files in this directory can be used to tune the operation
of the virtual memory (VM) subsystem of the Linux kernel and
the writeout of dirty data to disk.

Default values and initialization routines for most of these
files can be found in mm/swap.c.

Currently, these files are in /proc/sys/vm:
- overcommit_memory
- overcommit_ratio
- page-cluster
- dirty_ratio
- dirty_background_ratio
- dirty_expire_centisecs
- dirty_writeback_centisecs
- max_map_count
- min_free_kbytes
- percpu_pagelist_fraction
- laptop_mode
- block_dump
- drop-caches
- zone_reclaim_mode
- zone_reclaim_interval
- panic_on_oom

==============================================================

dirty_ratio, dirty_background_ratio, dirty_expire_centisecs,
dirty_writeback_centisecs, vfs_cache_pressure, laptop_mode,
block_dump, swap_token_timeout, drop-caches:

See Documentation/filesystems/proc.txt

==============================================================

overcommit_memory:

This value contains a flag that enables memory overcommitment.

When this flag is 0, the kernel attempts to estimate the amount
of free memory left when userspace requests more memory.

When this flag is 1, the kernel pretends there is always enough
memory until it actually runs out.

When this flag is 2, the kernel uses a "never overcommit"
policy that attempts to prevent any overcommit of memory.

This feature can be very useful because there are a lot of
programs that malloc() huge amounts of memory "just-in-case"
and don't use much of it.

The default value is 0.

See Documentation/vm/overcommit-accounting and
security/commoncap.c::cap_vm_enough_memory() for more information.

==============================================================

overcommit_ratio:

When overcommit_memory is set to 2, the committed address
space is not permitted to exceed swap plus this percentage
of physical RAM. See above.
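
As a rough worked example of the mode-2 ceiling (the swap and RAM sizes
below are made-up figures, and overcommit_ratio is assumed to sit at its
usual default of 50):

```python
# Hypothetical sketch of the committed-address-space limit applied when
# overcommit_memory == 2, following the rule described above:
#   limit = swap + ram * overcommit_ratio / 100
def commit_limit_kb(swap_kb, ram_kb, overcommit_ratio):
    return swap_kb + ram_kb * overcommit_ratio // 100

# 2 GB of swap, 4 GB of RAM, overcommit_ratio = 50:
print(commit_limit_kb(2 * 1024 * 1024, 4 * 1024 * 1024, 50))  # 4194304 (kB)
```

With the default ratio of 50, the limit here works out to swap plus half
of RAM; raising overcommit_ratio raises the ceiling accordingly.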

==============================================================

page-cluster:

The Linux VM subsystem avoids excessive disk seeks by reading
multiple pages on a page fault. The number of pages it reads
is dependent on the amount of memory in your machine.

The number of pages the kernel reads in at once is equal to
2 ^ page-cluster. Values above 2 ^ 5 don't make much sense
for swap because we only cluster swap data in 32-page groups.
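
The power-of-two relationship above can be tabulated with a small sketch
(nothing here reads the real sysctl; it just evaluates 2 ^ page-cluster):

```python
# Pages read in at once for each page-cluster value: 2 ** page_cluster.
# Beyond 5 the extra pages are wasted for swap, since swap data is
# clustered in 32-page groups.
pages_per_fault = {n: 2 ** n for n in range(6)}
for n, pages in pages_per_fault.items():
    print(f"page-cluster={n}: {pages} pages")
```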

==============================================================

max_map_count:

This file contains the maximum number of memory map areas a process
may have. Memory map areas are used as a side-effect of calling
malloc, directly by mmap and mprotect, and also when loading shared
libraries.

While most applications need less than a thousand maps, certain
programs, particularly malloc debuggers, may consume lots of them,
e.g., up to one or two maps per allocation.

The default value is 65536.

==============================================================

min_free_kbytes:

This is used to force the Linux VM to keep a minimum number
of kilobytes free. The VM uses this number to compute a pages_min
value for each lowmem zone in the system. Each lowmem zone gets
a number of reserved free pages in proportion to its size.

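A minimal sketch of the proportional split described above (the zone
sizes are invented, and the kernel's actual watermark computation is
more involved; this only illustrates "in proportion to its size"):

```python
# Hypothetical illustration: divide min_free_kbytes across lowmem zones
# in proportion to each zone's size (sizes in kB are made-up figures).
def split_min_free(min_free_kbytes, zone_sizes_kb):
    total = sum(zone_sizes_kb)
    return [min_free_kbytes * size // total for size in zone_sizes_kb]

# A small DMA zone next to a larger Normal zone:
print(split_min_free(4096, [16384, 49152]))  # [1024, 3072]
```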
==============================================================

percpu_pagelist_fraction

This is the fraction of pages in each zone that are, at most, allocated
to each per cpu page list (the high mark, pcp->high). The minimum value
for this is 8, which means that we don't allow more than 1/8th of the
pages in each zone to be allocated to any single per_cpu_pagelist. This
entry only changes the value of hot per cpu pagelists. The user can
specify a number like 100 to allocate 1/100th of each zone to each
per cpu page list.

The batch value of each per cpu pagelist is also updated as a result. It is
set to pcp->high/4. The upper limit of batch is (PAGE_SHIFT * 8).

The initial value is zero. The kernel does not use this value at boot time
to set the high water marks for each per cpu page list.
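
The relationship between the fraction, pcp->high, and batch described
above can be sketched as follows (the zone size is a made-up figure, and
PAGE_SHIFT is assumed to be 12, as on most architectures):

```python
PAGE_SHIFT = 12  # assumed value; this varies by architecture

def pcp_marks(zone_pages, fraction):
    # pcp->high is at most 1/fraction of the zone's pages ...
    high = zone_pages // fraction
    # ... and batch is high/4, capped at PAGE_SHIFT * 8.
    batch = min(high // 4, PAGE_SHIFT * 8)
    return high, batch

# A zone of 262144 pages (1 GB of 4 kB pages) at the minimum fraction, 8:
print(pcp_marks(262144, 8))  # (32768, 96)
```

Note how the batch cap dominates here: high/4 would be 8192, but batch is
clamped to PAGE_SHIFT * 8 = 96.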

===============================================================

zone_reclaim_mode:

Zone_reclaim_mode allows setting more or less aggressive approaches to
reclaiming memory when a zone runs out of memory. If it is set to zero then
no zone reclaim occurs. Allocations will be satisfied from other zones /
nodes in the system.

This value is a bitwise OR of

1 = Zone reclaim on
2 = Zone reclaim writes dirty pages out
4 = Zone reclaim swaps pages
8 = Also do a global slab reclaim pass
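
The bits above combine with a bitwise OR; a small sketch (the constant
names are purely illustrative, not kernel identifiers — the sysctl file
itself holds only the resulting integer):

```python
# Illustrative names for the zone_reclaim_mode bits listed above.
ZONE_RECLAIM_ON    = 1  # zone reclaim enabled
ZONE_RECLAIM_WRITE = 2  # may write out dirty pages
ZONE_RECLAIM_SWAP  = 4  # may swap pages
ZONE_RECLAIM_SLAB  = 8  # also do a global slab reclaim pass

# Reclaim locally with dirty-page writeout, but no swapping or slab pass:
mode = ZONE_RECLAIM_ON | ZONE_RECLAIM_WRITE
print(mode)  # 3
```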

zone_reclaim_mode is set during bootup to 1 if it is determined that pages
from remote zones will cause a measurable performance reduction. The
page allocator will then reclaim easily reusable pages (those page
cache pages that are currently not used) before allocating off node pages.

It may be beneficial to switch off zone reclaim if the system is
used for a file server and all of memory should be used for caching files
from disk. In that case the caching effect is more important than
data locality.

Allowing zone reclaim to write out pages stops processes that are
writing large amounts of data from dirtying pages on other nodes. Zone
reclaim will write out dirty pages if a zone fills up and so effectively
throttle the process. This may decrease the performance of a single process
since it cannot use all of system memory to buffer the outgoing writes
anymore, but it preserves the memory on other nodes so that the performance
of other processes running on other nodes will not be affected.

Allowing regular swap effectively restricts allocations to the local
node unless explicitly overridden by memory policies or cpuset
configurations.

It may be advisable to allow slab reclaim if the system makes heavy
use of files and builds up large slab caches. However, the slab
shrink operation is global, may take a long time, and frees slabs
in all nodes of the system.

================================================================

zone_reclaim_interval:

The time allowed for off node allocations after zone reclaim
has failed to reclaim enough pages to allow a local allocation.

Time is set in seconds and set by default to 30 seconds.

Reduce the interval if undesired off node allocations occur. However, too
frequent scans will have a negative impact on off-node allocation performance.

=============================================================

panic_on_oom

This enables or disables the panic-on-out-of-memory feature. If this is set
to 1, the kernel panics when out-of-memory happens. If this is set to 0,
the kernel will kill some rogue process via its oom_killer mechanism.
Usually, the oom_killer can kill rogue processes and the system will
survive. If you want to panic the system rather than killing rogue
processes, set this to 1.

The default value is 0.