Fault injection capabilities infrastructure
===========================================

See also drivers/md/faulty.c and "every_nth" module option for scsi_debug.


Available fault injection capabilities
--------------------------------------

o failslab

  injects slab allocation failures. (kmalloc(), kmem_cache_alloc(), ...)

o fail_page_alloc

  injects page allocation failures. (alloc_pages(), get_free_pages(), ...)

o fail_futex

  injects futex deadlock and uaddr fault errors.

o fail_make_request

  injects disk IO errors on devices permitted by setting
  /sys/block/<device>/make-it-fail or
  /sys/block/<device>/<partition>/make-it-fail. (generic_make_request())

o fail_mmc_request

  injects MMC data errors on devices permitted by setting
  debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request

Configure fault-injection capabilities behavior
-----------------------------------------------

o debugfs entries

The fault-inject-debugfs kernel module provides some debugfs entries for runtime
configuration of fault-injection capabilities.

- /sys/kernel/debug/fail*/probability:

	likelihood of failure injection, in percent.
	Format: <percent>

	Note that one-failure-per-hundred is a very high error rate
	for some testcases. Consider setting probability=100 and configuring
	/sys/kernel/debug/fail*/interval for such testcases.

- /sys/kernel/debug/fail*/interval:

	specifies the interval between failures, for calls to
	should_fail() that pass all the other tests.

	Note that if you enable this, by setting interval>1, you will
	probably want to set probability=100.

- /sys/kernel/debug/fail*/times:

	specifies how many times failures may happen at most.
	A value of -1 means "no limit".

- /sys/kernel/debug/fail*/space:

	specifies an initial resource "budget", decremented by "size"
	on each call to should_fail(,size). Failure injection is
	suppressed until "space" reaches zero.

- /sys/kernel/debug/fail*/verbose:

	Format: { 0 | 1 | 2 }
	specifies the verbosity of the messages when failure is
	injected. '0' means no messages; '1' will print only a single
	log line per failure; '2' will print a call trace too -- useful
	to debug the problems revealed by fault injection.

- /sys/kernel/debug/fail*/task-filter:

	Format: { 'Y' | 'N' }
	A value of 'N' disables filtering by process (default).
	A value of 'Y' limits failures to only those processes that have
	/proc/<pid>/make-it-fail set to 1.

- /sys/kernel/debug/fail*/require-start:
- /sys/kernel/debug/fail*/require-end:
- /sys/kernel/debug/fail*/reject-start:
- /sys/kernel/debug/fail*/reject-end:

	specifies the range of virtual addresses tested during
	stacktrace walking. Failure is injected only if some caller
	in the walked stacktrace lies within the required range, and
	none lies within the rejected range.
	Default required range is [0,ULONG_MAX) (whole of virtual address space).
	Default rejected range is [0,0).

- /sys/kernel/debug/fail*/stacktrace-depth:

	specifies the maximum stacktrace depth walked during search
	for a caller within [require-start,require-end) OR
	[reject-start,reject-end).

- /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:

	Format: { 'Y' | 'N' }
	Default is 'N'. Setting it to 'Y' suppresses failure injection
	for highmem/user allocations.

- /sys/kernel/debug/failslab/ignore-gfp-wait:
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:

	Format: { 'Y' | 'N' }
	Default is 'N'. Setting it to 'Y' injects failures
	only into non-sleep allocations (GFP_ATOMIC allocations).

- /sys/kernel/debug/fail_page_alloc/min-order:

	specifies the minimum order of page allocations into which
	failures are injected.

- /sys/kernel/debug/fail_futex/ignore-private:

	Format: { 'Y' | 'N' }
	Default is 'N'. Setting it to 'Y' disables failure injection
	when dealing with private (address space) futexes.
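
The debugfs attributes listed above can be written together from a small
shell helper. The following is a minimal sketch and not part of the kernel
tree: the function name set_fault_attrs and the FAULT_DEBUGFS variable are
assumptions made for this illustration. It assumes debugfs is mounted at
/sys/kernel/debug and that the caller is allowed to write there.

```shell
#!/bin/bash
# Hypothetical helper (not part of the kernel tree): write several
# fault-injection attributes of one capability in a single call.
# FAULT_DEBUGFS is an assumed override for the debugfs mount point.
FAULT_DEBUGFS=${FAULT_DEBUGFS:-/sys/kernel/debug}

# usage: set_fault_attrs <capability> <attribute>=<value> ...
set_fault_attrs()
{
	local dir="$FAULT_DEBUGFS/$1"
	local pair
	shift

	if [ ! -d "$dir" ]; then
		echo "no such capability directory: $dir" >&2
		return 1
	fi

	for pair in "$@"; do
		case "$pair" in
		*=*)
			# text before the first '=' is the file name,
			# text after it is the value to write
			printf '%s\n' "${pair#*=}" > "$dir/${pair%%=*}" || return 1
			;;
		*)
			echo "expected <attribute>=<value>, got: $pair" >&2
			return 1
			;;
		esac
	done
}
```

For example, the following call would ask failslab to fail every 1000th call
that passes the other tests, with no limit on the number of failures:

	set_fault_attrs failslab probability=100 interval=1000 times=-1 verbose=1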

o Boot option

In order to inject faults while debugfs is not available (early boot time),
use the boot options below. Each one takes the same argument format:

	failslab=
	fail_page_alloc=
	fail_make_request=
	fail_futex=
	mmc_core.fail_request=<interval>,<probability>,<space>,<times>
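
To illustrate the argument order, the sketch below builds such a boot
parameter string. The function name fault_boot_param is hypothetical; only
the field order <interval>,<probability>,<space>,<times> is taken from the
text above.

```shell
#!/bin/bash
# Hypothetical helper: print a fault-injection boot parameter in the
# <interval>,<probability>,<space>,<times> order described above.
# usage: fault_boot_param <name> <interval> <probability> <space> <times>
fault_boot_param()
{
	if [ $# -ne 5 ]; then
		echo "usage: fault_boot_param <name> <interval> <probability> <space> <times>" >&2
		return 1
	fi
	printf '%s=%s,%s,%s,%s\n' "$1" "$2" "$3" "$4" "$5"
}
```

For example, "fault_boot_param failslab 100 100 0 -1" prints
"failslab=100,100,0,-1", which could then be appended to the kernel command
line.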

o proc entries

- /proc/<pid>/fail-nth:
- /proc/self/task/<tid>/fail-nth:

	Writing an integer N to this file makes the N-th call in the task fail.
	Reading from this file returns an integer value. A value of '0' indicates
	that the fault set up by a previous write to this file was injected.
	A positive integer N indicates that the fault wasn't yet injected.
	Note that this file enables all types of faults (slab, futex, etc).
	This setting takes precedence over all other generic debugfs settings
	like probability, interval, times, etc. But per-capability settings
	(e.g. fail_futex/ignore-private) take precedence over it.

	This feature is intended for systematic testing of faults in a single
	system call. See the "Systematic faults using fail-nth" section below
	for an example.
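
The meaning of the value read back from a fail-nth file can be sketched as
follows. The function name fail_nth_status and its output strings are
hypothetical; the sketch only encodes the rule stated above (0 means the
fault was injected, a positive value means it was not yet injected).

```shell
#!/bin/bash
# Hypothetical helper: interpret the value read back from a fail-nth file.
# usage: fail_nth_status <path-to-fail-nth>
fail_nth_status()
{
	local n
	n=$(cat "$1") || return 1

	if [ "$n" -eq 0 ]; then
		# the fault armed by the previous write was injected
		echo injected
	else
		# the armed fault has not been reached yet
		echo "pending ($n)"
	fi
}
```

A caller would pass the path of the file it wrote to earlier, for example
"fail_nth_status /proc/<pid>/fail-nth".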

How to add new fault injection capability
-----------------------------------------

o #include <linux/fault-inject.h>

o define the fault attributes

  DECLARE_FAULT_ATTR(name);

  Please see the definition of struct fault_attr in fault-inject.h
  for details.

o provide a way to configure fault attributes

- boot option

  If you need to enable the fault injection capability from boot time, you can
  provide a boot option to configure it. There is a helper function for it:

	setup_fault_attr(attr, str);

- debugfs entries

  failslab, fail_page_alloc, and fail_make_request use this method.
  Helper function:

	fault_create_debugfs_attr(name, parent, attr);

- module parameters

  If the scope of the fault injection capability is limited to a
  single kernel module, it is better to provide module parameters to
  configure the fault attributes.

o add a hook to insert failures

  Upon should_fail() returning true, client code should inject a failure.

	should_fail(attr, size);

Application Examples
--------------------

o Inject slab allocation failures into module init/exit code

#!/bin/bash

FAILTYPE=failslab
echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
echo 10 > /sys/kernel/debug/$FAILTYPE/probability
echo 100 > /sys/kernel/debug/$FAILTYPE/interval
echo -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait

faulty_system()
{
	bash -c "echo 1 > /proc/self/make-it-fail && exec $*"
}

if [ $# -eq 0 ]
then
	echo "Usage: $0 modulename [ modulename ... ]"
	exit 1
fi

for m in $*
do
	echo inserting $m...
	faulty_system modprobe $m

	echo removing $m...
	faulty_system modprobe -r $m
done

------------------------------------------------------------------------------

o Inject page allocation failures only for a specific module

#!/bin/bash

FAILTYPE=fail_page_alloc
module=$1

if [ -z $module ]
then
	echo "Usage: $0 <modulename>"
	exit 1
fi

modprobe $module

if [ ! -d /sys/module/$module/sections ]
then
	echo Module $module is not loaded
	exit 1
fi

cat /sys/module/$module/sections/.text > /sys/kernel/debug/$FAILTYPE/require-start
cat /sys/module/$module/sections/.data > /sys/kernel/debug/$FAILTYPE/require-end

echo N > /sys/kernel/debug/$FAILTYPE/task-filter
echo 10 > /sys/kernel/debug/$FAILTYPE/probability
echo 100 > /sys/kernel/debug/$FAILTYPE/interval
echo -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth

trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT

echo "Injecting errors into the module $module... (interrupt to stop)"
sleep 1000000

Tool to run command with failslab or fail_page_alloc
----------------------------------------------------
In order to make it easier to accomplish the tasks mentioned above, we can use
tools/testing/fault-injection/failcmd.sh. Run
"./tools/testing/fault-injection/failcmd.sh --help" for more information and
see the following examples.

Examples:

Run the command "make -C tools/testing/selftests/ run_tests" while injecting
slab allocation failures.

	# ./tools/testing/fault-injection/failcmd.sh \
		-- make -C tools/testing/selftests/ run_tests

Same as above, but allow at most 100 failures instead of the default of at
most one.

	# ./tools/testing/fault-injection/failcmd.sh --times=100 \
		-- make -C tools/testing/selftests/ run_tests

Same as above, but inject page allocation failures instead of slab
allocation failures.

	# env FAILCMD_TYPE=fail_page_alloc \
		./tools/testing/fault-injection/failcmd.sh --times=100 \
		-- make -C tools/testing/selftests/ run_tests

Systematic faults using fail-nth
--------------------------------

The following code systematically injects a fault at the 1st, 2nd, 3rd and
so on fault injection point reached in the socketpair() system call.

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>

int main()
{
	int i, err, res, fail_nth, fds[2];
	char buf[128];

	system("echo N > /sys/kernel/debug/failslab/ignore-gfp-wait");
	sprintf(buf, "/proc/self/task/%ld/fail-nth", syscall(SYS_gettid));
	fail_nth = open(buf, O_RDWR);
	if (fail_nth < 0) {
		perror("open fail-nth");
		return 1;
	}
	for (i = 1;; i++) {
		/* arm the i-th fault for this thread */
		sprintf(buf, "%d", i);
		write(fail_nth, buf, strlen(buf));
		res = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
		err = errno;
		/* read back: 0 means the armed fault was injected */
		pread(fail_nth, buf, sizeof(buf), 0);
		if (res == 0) {
			close(fds[0]);
			close(fds[1]);
		}
		printf("%d-th fault %c: res=%d/%d\n", i, atoi(buf) ? 'N' : 'Y',
			res, err);
		/* non-zero: the call finished before the i-th point was reached */
		if (atoi(buf))
			break;
	}
	return 0;
}

An example output (each line shows the socketpair() return value followed by
the saved errno; on Linux, 12 is ENOMEM and 23 is ENFILE, and the errno
value is stale when res is 0):

1-th fault Y: res=-1/23
2-th fault Y: res=-1/23
3-th fault Y: res=-1/12
4-th fault Y: res=-1/12
5-th fault Y: res=-1/23
6-th fault Y: res=-1/23
7-th fault Y: res=-1/23
8-th fault Y: res=-1/12
9-th fault Y: res=-1/12
10-th fault Y: res=-1/12
11-th fault Y: res=-1/12
12-th fault Y: res=-1/12
13-th fault Y: res=-1/12
14-th fault Y: res=-1/12
15-th fault Y: res=-1/12
16-th fault N: res=0/12