Commit | Line | Data |
---|---|---|
da9bb1d2 AC |
1 | |
2 | ||
3 | EDAC - Error Detection And Correction | |
4 | ||
5 | Written by Doug Thompson <norsk5@xmission.com> | |
6 | 7 Dec 2005 | |
7 | ||
8 | ||
9 | EDAC was written by: | |
10 | Thayne Harbaugh, | |
11 | modified by Dave Peterson, Doug Thompson, et al, | |
12 | from the bluesmoke.sourceforge.net project. | |
13 | ||
14 | ||
15 | ============================================================================ | |
16 | EDAC PURPOSE | |
17 | ||
18 | The goal of the 'edac' kernel module is to detect and report errors that occur | |
19 | within the computer system. In the initial release, memory Correctable Errors | |
20 | (CE) and Uncorrectable Errors (UE) are the primary errors being harvested. | |
21 | ||
22 | Detecting CE events, then harvesting those events and reporting them, | |
23 | CAN be a predictor of future UE events. With CE events, the system can | |
f3479816 | 24 | continue to operate, but with less safety. Preventive maintenance and |
da9bb1d2 AC |
25 | proactive part replacement of memory DIMMs exhibiting CEs can reduce |
26 | the likelihood of the dreaded UE events and system 'panics'. | |
27 | ||
28 | ||
29 | In addition, PCI Bus Parity and SERR Errors are scanned for on PCI devices | |
30 | in order to determine if errors are occurring on data transfers. | |
31 | The presence of PCI Parity errors must be examined with a grain of salt. | |
f3479816 | 32 | There are several add-in adapters that do NOT follow the PCI specification |
da9bb1d2 AC |
33 | with regard to Parity generation and reporting. The specification says |
34 | the vendor should tie the parity status bits to 0 if they do not intend | |
35 | to generate parity. Some vendors do not do this, and thus the parity bit | |
36 | can "float" giving false positives. | |
37 | ||
f3479816 | 38 | The PCI Parity EDAC device has the ability to "skip" known flaky |
da9bb1d2 AC |
39 | cards during the parity scan. These are set by the parity "blacklist" |
40 | interface in the sysfs for PCI Parity. (See the PCI section in the sysfs | |
41 | section below.) There is also a parity "whitelist" which is used as | |
42 | an explicit list of devices to scan, while the blacklist is a list | |
43 | of devices to skip. | |
44 | ||
45 | Future error detectors that will be added to or integrated into EDAC | |
46 | include the following: | |
47 | ||
48 | MCE Machine Check Exception | |
49 | MCA Machine Check Architecture | |
50 | NMI NMI notification of ECC errors | |
51 | MSRs Machine Specific Register error cases | |
52 | and other mechanisms. | |
53 | ||
54 | These errors are usually bus errors, ECC errors, thermal throttling | |
55 | and the like. | |
56 | ||
57 | ||
58 | ============================================================================ | |
59 | EDAC VERSIONING | |
60 | ||
61 | EDAC is composed of a "core" module (edac_mc.ko) and several Memory | |
62 | Controller (MC) driver modules. On a given system, the CORE | |
63 | is loaded and one MC driver will be loaded. Both the CORE and | |
64 | the MC driver have individual versions that reflect current release | |
65 | level of their respective modules. Thus, to "report" on what version | |
66 | a system is running, one must report both the CORE's and the | |
67 | MC driver's versions. | |
68 | ||
69 | ||
70 | LOADING | |
71 | ||
72 | If 'edac' was statically linked with the kernel then no loading is | |
73 | necessary. If 'edac' was built as modules then simply modprobe the | |
74 | 'edac' pieces that you need. You should be able to modprobe | |
75 | hardware-specific modules and have the dependencies load the necessary core | |
76 | modules. | |
77 | ||
78 | Example: | |
79 | ||
80 | $> modprobe amd76x_edac | |
81 | ||
82 | loads both the amd76x_edac.ko memory controller module and the edac_mc.ko | |
83 | core module. | |
84 | ||
85 | ||
86 | ============================================================================ | |
87 | EDAC sysfs INTERFACE | |
88 | ||
89 | EDAC presents a 'sysfs' interface for control, reporting and attribute | |
90 | reporting purposes. | |
91 | ||
92 | EDAC lives in the /sys/devices/system/edac directory. Within this directory | |
93 | there currently reside 2 'edac' components: | |
94 | ||
95 | mc memory controller(s) system | |
96 | pci PCI status system | |
97 | ||
98 | ||
99 | ============================================================================ | |
100 | Memory Controller (mc) Model | |
101 | ||
102 | First a background on the memory controller's model abstracted in EDAC. | |
103 | Each mc device controls a set of DIMM memory modules. These modules are | |
f3479816 | 104 | laid out in a Chip-Select Row (csrowX) and Channel table (chX). There can |
da9bb1d2 AC |
105 | be multiple csrows and two channels. |
106 | ||
107 | Memory controllers allow for several csrows, with 8 csrows being a typical value. | |
108 | Yet, the actual number of csrows depends on the electrical "loading" | |
109 | of a given motherboard, memory controller and DIMM characteristics. | |
110 | ||
111 | Dual channels allow for 128-bit data transfers to the CPU from memory. | |
112 | ||
113 | ||
114 | Channel 0 Channel 1 | |
115 | =================================== | |
116 | csrow0 | DIMM_A0 | DIMM_B0 | | |
117 | csrow1 | DIMM_A0 | DIMM_B0 | | |
118 | =================================== | |
119 | ||
120 | =================================== | |
121 | csrow2 | DIMM_A1 | DIMM_B1 | | |
122 | csrow3 | DIMM_A1 | DIMM_B1 | | |
123 | =================================== | |
124 | ||
125 | In the above example table there are 4 physical slots on the motherboard | |
126 | for memory DIMMs: | |
127 | ||
128 | DIMM_A0 | |
129 | DIMM_B0 | |
130 | DIMM_A1 | |
131 | DIMM_B1 | |
132 | ||
133 | Labels for these slots are usually silk screened on the motherboard. Slots | |
f3479816 | 134 | labeled 'A' are channel 0 in this example. Slots labeled 'B' |
da9bb1d2 AC |
135 | are channel 1. Notice that there are two csrows possible on a |
136 | physical DIMM. These csrows are allocated their csrow assignment | |
137 | based on the slot into which the memory DIMM is placed. Thus, when 1 DIMM | |
138 | is placed in each Channel, the csrows cross both DIMMs. | |
139 | ||
140 | Memory DIMMs come single or dual "ranked". A rank is a populated csrow. | |
141 | Thus, 2 single ranked DIMMs, placed in slots DIMM_A0 and DIMM_B0 above | |
142 | will have 1 csrow, csrow0. csrow1 will be empty. On the other hand, | |
f3479816 | 143 | when 2 dual ranked DIMMs are similarly placed, then both csrow0 and |
da9bb1d2 AC |
144 | csrow1 will be populated. The pattern repeats itself for csrow2 and |
145 | csrow3. | |
146 | ||
147 | The representation of the above is reflected in the directory tree | |
148 | in EDAC's sysfs interface. Starting in directory | |
149 | /sys/devices/system/edac/mc each memory controller will be represented | |
150 | by its own 'mcX' directory, where 'X' is the index of the MC. | |
151 | ||
152 | ||
153 | ..../edac/mc/ | |
154 | | | |
155 | |->mc0 | |
156 | |->mc1 | |
157 | |->mc2 | |
158 | .... | |
159 | ||
160 | Under each 'mcX' directory each csrow is again represented by a | |
161 | 'csrowX' directory, where 'X' is the csrow index: | |
162 | ||
163 | ||
164 | .../mc/mc0/ | |
165 | | | |
166 | |->csrow0 | |
167 | |->csrow2 | |
168 | |->csrow3 | |
169 | .... | |
170 | ||
171 | Notice that there is no csrow1, which indicates that csrow0 is | |
172 | composed of single ranked DIMMs. This should also apply in both | |
173 | Channels, in order to have dual-channel mode be operational. Since | |
174 | both csrow2 and csrow3 are populated, this indicates a dual ranked | |
175 | set of DIMMs for channels 0 and 1. | |
176 | ||
177 | ||
178 | Within each of the 'mc','mcX' and 'csrowX' directories are several | |
179 | EDAC control and attribute files. | |
180 | ||
181 | ||
182 | ============================================================================ | |
183 | DIRECTORY 'mc' | |
184 | ||
185 | In directory 'mc' are EDAC system overall control and attribute files: | |
186 | ||
187 | ||
188 | Panic on UE control file: | |
189 | ||
190 | 'panic_on_ue' | |
191 | ||
192 | An uncorrectable error will cause a machine panic. This is usually | |
193 | desirable. It is a bad idea to continue when an uncorrectable error | |
194 | occurs - it is indeterminate what was uncorrected and the operating | |
195 | system context might be so mangled that continuing will lead to further | |
196 | corruption. If the kernel has MCE configured, then EDAC will never | |
197 | notice the UE. | |
198 | ||
199 | LOAD TIME: module/kernel parameter: panic_on_ue=[0|1] | |
200 | ||
201 | RUN TIME: echo "1" >/sys/devices/system/edac/mc/panic_on_ue | |
202 | ||
203 | ||
204 | Log UE control file: | |
205 | ||
206 | 'log_ue' | |
207 | ||
208 | Generate kernel messages describing uncorrectable errors. These errors | |
209 | are reported through the system message log system. UE statistics | |
210 | will be accumulated even when UE logging is disabled. | |
211 | ||
212 | LOAD TIME: module/kernel parameter: log_ue=[0|1] | |
213 | ||
214 | RUN TIME: echo "1" >/sys/devices/system/edac/mc/log_ue | |
215 | ||
216 | ||
217 | Log CE control file: | |
218 | ||
219 | 'log_ce' | |
220 | ||
221 | Generate kernel messages describing correctable errors. These | |
222 | errors are reported through the system message log system. | |
223 | CE statistics will be accumulated even when CE logging is disabled. | |
224 | ||
225 | LOAD TIME: module/kernel parameter: log_ce=[0|1] | |
226 | ||
227 | RUN TIME: echo "1" >/sys/devices/system/edac/mc/log_ce | |
228 | ||
229 | ||
230 | Polling period control file: | |
231 | ||
232 | 'poll_msec' | |
233 | ||
234 | The time period, in milliseconds, for polling for error information. | |
235 | Too small a value wastes resources. Too large a value might delay | |
236 | necessary handling of errors and might lose valuable information for | |
237 | locating the error. 1000 milliseconds (once each second) is about | |
238 | right for most uses. | |
239 | ||
240 | LOAD TIME: module/kernel parameter: poll_msec=<milliseconds> | |
241 | ||
242 | RUN TIME: echo "1000" >/sys/devices/system/edac/mc/poll_msec | |
243 | ||
244 | ||
245 | Module Version read-only attribute file: | |
246 | ||
247 | 'mc_version' | |
248 | ||
f3479816 | 249 | The EDAC CORE module's version and compile date are shown here to |
da9bb1d2 AC |
250 | indicate what EDAC is running. |
251 | ||
252 | ||
253 | ||
254 | ============================================================================ | |
255 | 'mcX' DIRECTORIES | |
256 | ||
257 | ||
258 | In 'mcX' directories are EDAC control and attribute files for | |
259 | this 'X' instance of the memory controllers: | |
260 | ||
261 | ||
262 | Counter reset control file: | |
263 | ||
264 | 'reset_counters' | |
265 | ||
266 | This write-only control file will zero all the statistical counters | |
267 | for UE and CE errors. Zeroing the counters will also reset the timer | |
268 | indicating how long since the last counter zero. This is useful | |
269 | for computing errors/time. Since the counters are always reset at | |
270 | driver initialization time, no module/kernel parameter is available. | |
271 | ||
272 | RUN TIME: echo "anything" >/sys/devices/system/edac/mc/mc0/reset_counters | |
273 | ||
274 | This resets the counters on memory controller 0 | |
275 | ||
276 | ||
277 | Seconds since last counter reset attribute file: | |
278 | ||
279 | 'seconds_since_reset' | |
280 | ||
281 | This attribute file displays how many seconds have elapsed since the | |
282 | last counter reset. This can be used with the error counters to | |
283 | measure error rates. | |
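
As a rough illustration of measuring an error rate, the sketch below
divides 'ce_count' by 'seconds_since_reset' for one memory controller.
The function name ce_rate_per_hour is hypothetical, and the sketch assumes
both attribute files contain a single decimal number.

```shell
#!/bin/sh
# Sketch: correctable errors per hour for one memory controller.
# ce_rate_per_hour is a hypothetical helper, not part of EDAC.
# The directory argument defaults to mc0; pass another path to override.
ce_rate_per_hour() {
    mc_dir=${1:-/sys/devices/system/edac/mc/mc0}
    ce=$(cat "$mc_dir/ce_count") || return 1
    secs=$(cat "$mc_dir/seconds_since_reset") || return 1
    # Avoid dividing by zero right after a counter reset.
    if [ "$secs" -eq 0 ]; then
        echo 0
        return 0
    fi
    # Integer arithmetic: errors * 3600 / elapsed seconds.
    echo $(( ce * 3600 / secs ))
}
```

Calling ce_rate_per_hour with no argument reads the mc0 directory. The
result is truncated to an integer, so low rates measured over long periods
may show as 0.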
284 | ||
285 | ||
286 | ||
287 | DIMM capability attribute file: | |
288 | ||
289 | 'edac_capability' | |
290 | ||
291 | The EDAC (Error Detection and Correction) capabilities/modes of | |
292 | the memory controller hardware. | |
293 | ||
294 | ||
295 | DIMM Current Capability attribute file: | |
296 | ||
297 | 'edac_current_capability' | |
298 | ||
299 | The EDAC capabilities available with the hardware | |
300 | configuration. This may not be the same as "EDAC capability" | |
301 | if the correct memory is not used. If a memory controller is | |
302 | capable of EDAC, but DIMMs without check bits are in use, then | |
303 | Parity, SECDED, S4ECD4ED capabilities will not be available | |
304 | even though the memory controller might be capable of those | |
305 | modes with the proper memory loaded. | |
306 | ||
307 | ||
308 | Memory Type supported on this controller attribute file: | |
309 | ||
310 | 'supported_mem_type' | |
311 | ||
312 | This attribute file displays the memory type, usually | |
313 | buffered and unbuffered DIMMs. | |
314 | ||
315 | ||
316 | Memory Controller name attribute file: | |
317 | ||
318 | 'mc_name' | |
319 | ||
320 | This attribute file displays the type of memory controller | |
321 | that is being utilized. | |
322 | ||
323 | ||
324 | Memory Controller Module name attribute file: | |
325 | ||
326 | 'module_name' | |
327 | ||
328 | This attribute file displays the memory controller module name, | |
329 | version and date built. The name of the memory controller | |
330 | hardware - some drivers work with multiple controllers and | |
331 | this field shows which hardware is present. | |
332 | ||
333 | ||
334 | Total memory managed by this memory controller attribute file: | |
335 | ||
336 | 'size_mb' | |
337 | ||
338 | This attribute file displays the amount of memory, in megabytes, | |
339 | that this instance of memory controller manages. | |
340 | ||
341 | ||
342 | Total Uncorrectable Errors count attribute file: | |
343 | ||
344 | 'ue_count' | |
345 | ||
346 | This attribute file displays the total count of uncorrectable | |
347 | errors that have occurred on this memory controller. If panic_on_ue | |
348 | is set this counter will not have a chance to increment, | |
349 | since EDAC will panic the system. | |
350 | ||
351 | ||
352 | Total UE count that had no information attribute file: | |
353 | ||
354 | 'ue_noinfo_count' | |
355 | ||
356 | This attribute file displays the number of UEs that | |
357 | have occurred with no information as to which DIMM | |
358 | slot is having errors. | |
359 | ||
360 | ||
361 | Total Correctable Errors count attribute file: | |
362 | ||
363 | 'ce_count' | |
364 | ||
365 | This attribute file displays the total count of correctable | |
366 | errors that have occurred on this memory controller. This | |
367 | count is very important to examine. CEs provide early | |
368 | indications that a DIMM is beginning to fail. This count | |
369 | field should be monitored for non-zero values, and any such | |
370 | values reported to the system administrator. | |
371 | ||
372 | ||
373 | Total CE count that had no information attribute file: | |
374 | ||
375 | 'ce_noinfo_count' | |
376 | ||
377 | This attribute file displays the number of CEs that | |
378 | have occurred with no information as to which DIMM slot | |
379 | is having errors. Memory is handicapped, but operational, | |
380 | yet no information is available to indicate which slot | |
381 | the failing memory is in. This count field should also | |
382 | be monitored for non-zero values. | |
383 | ||
384 | Device Symlink: | |
385 | ||
386 | 'device' | |
387 | ||
388 | Symlink to the memory controller device | |
389 | ||
390 | ||
391 | ||
392 | ============================================================================ | |
393 | 'csrowX' DIRECTORIES | |
394 | ||
395 | In the 'csrowX' directories are EDAC control and attribute files for | |
396 | this 'X' instance of csrow: | |
397 | ||
398 | ||
399 | Total Uncorrectable Errors count attribute file: | |
400 | ||
401 | 'ue_count' | |
402 | ||
403 | This attribute file displays the total count of uncorrectable | |
404 | errors that have occurred on this csrow. If panic_on_ue is set | |
405 | this counter will not have a chance to increment, since EDAC | |
406 | will panic the system. | |
407 | ||
408 | ||
409 | Total Correctable Errors count attribute file: | |
410 | ||
411 | 'ce_count' | |
412 | ||
413 | This attribute file displays the total count of correctable | |
414 | errors that have occurred on this csrow. This | |
415 | count is very important to examine. CEs provide early | |
416 | indications that a DIMM is beginning to fail. This count | |
417 | field should be monitored for non-zero values, and any such | |
418 | values reported to the system administrator. | |
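
One possible way to monitor these counters is sketched below: it walks
every 'mcX/csrowX' directory and prints the csrows whose 'ce_count' is
non-zero. The function name report_ce is hypothetical, and how the output
reaches the system administrator (mail, syslog, and so on) is left open.

```shell
#!/bin/sh
# Sketch: list every csrow with a non-zero correctable error count.
# report_ce is a hypothetical helper, not part of EDAC.
# Output format: "<mcX>/<csrowX> <count>", one line per affected csrow.
report_ce() {
    mc_root=${1:-/sys/devices/system/edac/mc}
    for f in "$mc_root"/mc*/csrow*/ce_count; do
        [ -r "$f" ] || continue      # glob matched nothing, or file unreadable
        count=$(cat "$f")
        if [ "$count" -gt 0 ]; then
            csrow_dir=${f%/ce_count} # strip the file name
            mc_dir=${csrow_dir%/*}   # strip the csrowX component
            echo "${mc_dir##*/}/${csrow_dir##*/} $count"
        fi
    done
}
```

Run periodically (for example from cron), an empty result means no csrow
has logged a CE since the last counter reset.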
419 | ||
420 | ||
421 | Total memory managed by this csrow attribute file: | |
422 | ||
423 | 'size_mb' | |
424 | ||
425 | This attribute file displays the amount of memory, in megabytes, | |
f3479816 | 426 | that this csrow contains. |
da9bb1d2 AC |
427 | |
428 | ||
429 | Memory Type attribute file: | |
430 | ||
431 | 'mem_type' | |
432 | ||
433 | This attribute file will display what type of memory is currently | |
434 | on this csrow. Normally, either buffered or unbuffered memory. | |
435 | ||
436 | ||
437 | EDAC Mode of operation attribute file: | |
438 | ||
439 | 'edac_mode' | |
440 | ||
441 | This attribute file will display what type of Error detection | |
442 | and correction is being utilized. | |
443 | ||
444 | ||
445 | Device type attribute file: | |
446 | ||
447 | 'dev_type' | |
448 | ||
449 | This attribute file will display what type of DIMM device is | |
450 | being utilized. Example: x4 | |
451 | ||
452 | ||
453 | Channel 0 CE Count attribute file: | |
454 | ||
455 | 'ch0_ce_count' | |
456 | ||
457 | This attribute file will display the count of CEs on this | |
458 | DIMM located in channel 0. | |
459 | ||
460 | ||
461 | Channel 0 UE Count attribute file: | |
462 | ||
463 | 'ch0_ue_count' | |
464 | ||
465 | This attribute file will display the count of UEs on this | |
466 | DIMM located in channel 0. | |
467 | ||
468 | ||
469 | Channel 0 DIMM Label control file: | |
470 | ||
471 | 'ch0_dimm_label' | |
472 | ||
473 | This control file allows this DIMM to have a label assigned | |
474 | to it. With this label in the module, when errors occur | |
475 | the output can provide the DIMM label in the system log. | |
476 | This becomes vital for panic events to isolate the | |
477 | cause of the UE event. | |
478 | ||
479 | DIMM Labels must be assigned after booting, with information | |
480 | that correctly identifies the physical slot with its | |
481 | silk screen label. This information is currently very | |
482 | motherboard specific and determination of this information | |
483 | must occur in userland at this time. | |
484 | ||
485 | ||
486 | Channel 1 CE Count attribute file: | |
487 | ||
488 | 'ch1_ce_count' | |
489 | ||
490 | This attribute file will display the count of CEs on this | |
491 | DIMM located in channel 1. | |
492 | ||
493 | ||
494 | Channel 1 UE Count attribute file: | |
495 | ||
496 | 'ch1_ue_count' | |
497 | ||
498 | This attribute file will display the count of UEs on this | |
499 | DIMM located in channel 1. | |
500 | ||
501 | ||
502 | Channel 1 DIMM Label control file: | |
503 | ||
504 | 'ch1_dimm_label' | |
505 | ||
506 | This control file allows this DIMM to have a label assigned | |
507 | to it. With this label in the module, when errors occur | |
508 | the output can provide the DIMM label in the system log. | |
509 | This becomes vital for panic events to isolate the | |
510 | cause of the UE event. | |
511 | ||
512 | DIMM Labels must be assigned after booting, with information | |
513 | that correctly identifies the physical slot with its | |
514 | silk screen label. This information is currently very | |
515 | motherboard specific and determination of this information | |
516 | must occur in userland at this time. | |
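
A userland script could assign the labels after boot along the lines of
the sketch below. The function name set_dimm_label is hypothetical, and
the labels in the usage note follow the example slot table earlier in this
document; a real motherboard needs its own csrow/channel to silk screen
mapping.

```shell
#!/bin/sh
# Sketch: write a DIMM label into a chN_dimm_label control file.
# set_dimm_label is a hypothetical helper, not part of EDAC.
# Arguments: <mc root dir> <mc index> <csrow index> <channel> <label>
set_dimm_label() {
    mc_root=$1; mc=$2; csrow=$3; ch=$4; label=$5
    file="$mc_root/mc$mc/csrow$csrow/ch${ch}_dimm_label"
    if [ ! -w "$file" ]; then
        echo "cannot write $file" >&2
        return 1
    fi
    printf '%s\n' "$label" > "$file"
}
```

For the example table, csrow0 could be labeled with
set_dimm_label /sys/devices/system/edac/mc 0 0 0 DIMM_A0 for channel 0 and
set_dimm_label /sys/devices/system/edac/mc 0 0 1 DIMM_B0 for channel 1.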
517 | ||
518 | ||
519 | ============================================================================ | |
520 | SYSTEM LOGGING | |
521 | ||
522 | If logging for UEs and CEs is enabled then system logs will have | |
523 | error notices indicating errors that have been detected: | |
524 | ||
525 | MC0: CE page 0x283, offset 0xce0, grain 8, syndrome 0x6ec3, row 0, | |
526 | channel 1 "DIMM_B1": amd76x_edac | |
527 | ||
528 | MC0: CE page 0x1e5, offset 0xfb0, grain 8, syndrome 0xb741, row 0, | |
529 | channel 1 "DIMM_B1": amd76x_edac | |
530 | ||
531 | ||
532 | The structure of the message is: | |
533 | the memory controller (MC0) | |
534 | Error type (CE) | |
535 | memory page (0x283) | |
536 | offset in the page (0xce0) | |
537 | the byte granularity (grain 8) | |
538 | or resolution of the error | |
539 | the error syndrome (0x6ec3) | |
540 | memory row (row 0) | |
541 | memory channel (channel 1) | |
542 | DIMM label, if set prior (DIMM_B1) | |
543 | and then an optional, driver-specific message that may | |
544 | have additional information. | |
545 | ||
546 | Both UEs and CEs with no info will lack all but memory controller, | |
547 | error type, a notice of "no info" and then an optional, | |
548 | driver-specific error message. | |
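
The message structure above can be split into fields with a small filter
such as the sketch below. The function name parse_edac_line is
hypothetical. The sketch assumes each message arrives on a single line in
the log (the examples above are wrapped for display) and that a DIMM label
is present. Messages with no info do not match and are skipped.

```shell
#!/bin/sh
# Sketch: turn an EDAC CE/UE log message into key=value pairs.
# parse_edac_line is a hypothetical helper, not part of EDAC.
# Reads log lines on stdin; lines that do not match print nothing.
parse_edac_line() {
    sed -n 's/.*MC\([0-9][0-9]*\): \([CU]E\) page \(0x[0-9a-f]*\), offset \(0x[0-9a-f]*\), grain \([0-9]*\), syndrome \(0x[0-9a-f]*\), row \([0-9]*\), channel \([0-9]*\) "\([^"]*\)".*/mc=\1 type=\2 page=\3 offset=\4 grain=\5 syndrome=\6 row=\7 channel=\8 label=\9/p'
}
```

For example, the kernel log can be piped through the filter with
dmesg | parse_edac_line.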
549 | ||
550 | ||
551 | ||
552 | ============================================================================ | |
553 | PCI Bus Parity Detection | |
554 | ||
555 | ||
556 | On Header Type 00 devices the primary status is looked at | |
557 | for any parity error regardless of whether Parity is enabled on the | |
558 | device. (The spec indicates parity is generated in some cases). | |
559 | On Header Type 01 bridges, the secondary status register is also | |
f3479816 | 560 | looked at to see if a parity error occurred on the bus on the other side of |
da9bb1d2 AC |
561 | the bridge. |
562 | ||
563 | ||
564 | SYSFS CONFIGURATION | |
565 | ||
566 | Under /sys/devices/system/edac/pci are control and attribute files as follows: | |
567 | ||
568 | ||
569 | Enable/Disable PCI Parity checking control file: | |
570 | ||
571 | 'check_pci_parity' | |
572 | ||
573 | ||
574 | This control file enables or disables the PCI Bus Parity scanning | |
575 | operation. Writing a 1 to this file enables the scanning. Writing | |
576 | a 0 to this file disables the scanning. | |
577 | ||
578 | Enable: | |
579 | echo "1" >/sys/devices/system/edac/pci/check_pci_parity | |
580 | ||
581 | Disable: | |
582 | echo "0" >/sys/devices/system/edac/pci/check_pci_parity | |
583 | ||
584 | ||
585 | ||
586 | Panic on PCI PARITY Error: | |
587 | ||
588 | 'panic_on_pci_parity' | |
589 | ||
590 | ||
f3479816 | 591 | This control file enables or disables panicking when a parity |
da9bb1d2 AC |
592 | error has been detected. |
593 | ||
594 | ||
595 | module/kernel parameter: panic_on_pci_parity=[0|1] | |
596 | ||
597 | Enable: | |
598 | echo "1" >/sys/devices/system/edac/pci/panic_on_pci_parity | |
599 | ||
600 | Disable: | |
601 | echo "0" >/sys/devices/system/edac/pci/panic_on_pci_parity | |
602 | ||
603 | ||
604 | Parity Count: | |
605 | ||
606 | 'pci_parity_count' | |
607 | ||
608 | This attribute file will display the number of parity errors that | |
609 | have been detected. | |
610 | ||
611 | ||
612 | ||
613 | PCI Device Whitelist: | |
614 | ||
615 | 'pci_parity_whitelist' | |
616 | ||
617 | This control file allows for an explicit list of PCI devices to be | |
618 | scanned for parity errors. Only devices found on this list will | |
f3479816 | 619 | be examined. The list is a line of hexadecimal VENDOR and DEVICE |
da9bb1d2 AC |
620 | ID tuples: |
621 | ||
622 | 1022:7450,1434:16a6 | |
623 | ||
f3479816 | 624 | One or more can be inserted, separated by a comma. |
da9bb1d2 AC |
625 | |
626 | To write the above list, do the following as one command line: | |
627 | ||
628 | echo "1022:7450,1434:16a6" | |
629 | > /sys/devices/system/edac/pci/pci_parity_whitelist | |
630 | ||
631 | ||
632 | ||
633 | To display what the whitelist is, simply 'cat' the same file. | |
634 | ||
635 | ||
636 | PCI Device Blacklist: | |
637 | ||
638 | 'pci_parity_blacklist' | |
639 | ||
640 | This control file allows for a list of PCI devices to be | |
641 | skipped for scanning. | |
f3479816 | 642 | The list is a line of hexadecimal VENDOR and DEVICE ID tuples: |
da9bb1d2 AC |
643 | |
644 | 1022:7450,1434:16a6 | |
645 | ||
f3479816 | 646 | One or more can be inserted, separated by a comma. |
da9bb1d2 AC |
647 | |
648 | To write the above list, do the following as one command line: | |
649 | ||
650 | echo "1022:7450,1434:16a6" | |
651 | > /sys/devices/system/edac/pci/pci_parity_blacklist | |
652 | ||
653 | ||
f3479816 | 654 | To display what the blacklist currently contains, |
da9bb1d2 AC |
655 | simply 'cat' the same file. |
656 | ||
657 | ======================================================================= | |
658 | ||
659 | PCI Vendor and Devices IDs can be obtained with the lspci command. Using | |
660 | the -n option lspci will display the vendor and device IDs. The system | |
f3479816 | 661 | administrator will have to determine which devices should be scanned or |
da9bb1d2 AC |
662 | skipped. |
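
As a sketch of how such a list might be built, the filter below takes
'lspci -n' output on stdin and joins the VENDOR:DEVICE pairs it finds into
the comma-separated form used by the whitelist and blacklist files. The
function name pci_ids_to_list is hypothetical. The sketch assumes each
device line contains exactly one token of the form xxxx:xxxx in lowercase
hexadecimal.

```shell
#!/bin/sh
# Sketch: convert 'lspci -n' output into a "vendor:device,..." list.
# pci_ids_to_list is a hypothetical helper, not part of EDAC or pciutils.
# Duplicate IDs are emitted only once, in order of first appearance.
pci_ids_to_list() {
    awk '
    {
        for (i = 1; i <= NF; i++) {
            # A vendor:device pair is exactly 4 hex digits, colon, 4 hex digits.
            if ($i ~ /^[0-9a-f][0-9a-f][0-9a-f][0-9a-f]:[0-9a-f][0-9a-f][0-9a-f][0-9a-f]$/) {
                if (!($i in seen)) {
                    seen[$i] = 1
                    list = (list == "" ? $i : list "," $i)
                }
                break
            }
        }
    }
    END { if (list != "") print list }'
}
```

Used unfiltered, lspci -n | pci_ids_to_list lists every device in the
system, so the lspci output would normally be narrowed down (for example
with grep) to the devices chosen by the administrator before the result is
written to one of the list files.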
663 | ||
664 | ||
665 | ||
666 | The two lists (white and black) are prioritized. The blacklist has the lower | |
667 | priority and will NOT be utilized when a whitelist has been set. | |
668 | Turn OFF a whitelist by an empty echo command: | |
669 | ||
670 | echo > /sys/devices/system/edac/pci/pci_parity_whitelist | |
671 | ||
f3479816 | 672 | and any previous blacklist will be utilized. |
da9bb1d2 | 673 |