Commit | Line | Data |
---|---|---|
9538cc28 SR |
1 | From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 |
2 | From: Yazen Ghannam <yazen.ghannam@amd.com> | |
3 | Date: Thu, 21 Nov 2019 08:15:08 -0600 | |
4 | Subject: [PATCH] x86/MCE/AMD: Allow Reserved types to be overwritten in | |
5 | smca_banks[] | |
6 | ||
7 | Each logical CPU in Scalable MCA systems controls a unique set of MCA | |
8 | banks in the system. These banks are not shared between CPUs. The bank | |
9 | types and ordering will be the same across CPUs on currently available | |
10 | systems. | |
11 | ||
12 | However, some CPUs may see a bank as Reserved/Read-as-Zero (RAZ) while | |
13 | other CPUs do not. In this case, the bank seen as Reserved on one CPU is | |
14 | assumed to be the same type as the bank seen as a known type on another | |
15 | CPU. | |
16 | ||
17 | In general, this occurs when the hardware represented by the MCA bank | |
18 | is disabled, e.g. disabled memory controllers on certain models, etc. | |
19 | The MCA bank is disabled in the hardware, so there is no possibility of | |
20 | getting an MCA/MCE from it even if it is assumed to have a known type. | |
21 | ||
22 | For example: | |
23 | ||
24 | Full system: | |
25 | Bank | Type seen on CPU0 | Type seen on CPU1 | |
26 | ------------------------------------------------ | |
27 | 0 | LS | LS | |
28 | 1 | UMC | UMC | |
29 | 2 | CS | CS | |
30 | ||
31 | System with hardware disabled: | |
32 | Bank | Type seen on CPU0 | Type seen on CPU1 | |
33 | ------------------------------------------------ | |
34 | 0 | LS | LS | |
35 | 1 | UMC | RAZ | |
36 | 2 | CS | CS | |
37 | ||
38 | For this reason, there is a single, global struct smca_banks[] that is | |
39 | initialized at boot time. Each CPU fills in entries of this array as it | |
40 | comes online. However, an entry will not be updated if it already | |
41 | exists. | |
42 | ||
43 | This works as expected when the first CPU (usually CPU0) has all | |
44 | possible MCA banks enabled. But if the first CPU has a subset, then it | |
45 | will save a "Reserved" type in smca_banks[]. Subsequent CPUs will then | |
46 | not be able to update smca_banks[] even if they encounter a known bank | |
47 | type. | |
48 | ||
49 | This may result in unexpected behavior. Depending on the system | |
50 | configuration, a user may observe issues enumerating the MCA | |
51 | thresholding sysfs interface. The issues may be as trivial as sysfs | |
52 | entries not being available, or as severe as system hangs. | |
53 | ||
54 | For example: | |
55 | ||
56 | Bank | Type seen on CPU0 | Type seen on CPU1 | |
57 | ------------------------------------------------ | |
58 | 0 | LS | LS | |
59 | 1 | RAZ | UMC | |
60 | 2 | CS | CS | |
61 | ||
62 | Extend the smca_banks[] entry check to return if the entry is a | |
63 | non-reserved type. Otherwise, continue so that CPUs that encounter a | |
64 | known bank type can update smca_banks[]. | |
65 | ||
66 | Fixes: 68627a697c19 ("x86/mce/AMD, EDAC/mce_amd: Enumerate Reserved SMCA bank type") | |
67 | Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> | |
68 | Signed-off-by: Borislav Petkov <bp@suse.de> | |
de6f4b1d | 69 | Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com> |
9538cc28 SR |
70 | --- |
71 | arch/x86/kernel/cpu/mce/amd.c | 2 +- | |
72 | 1 file changed, 1 insertion(+), 1 deletion(-) | |
73 | ||
74 | diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c | |
75 | index 6ea7fdc82f3c..08e09c8c269f 100644 | |
76 | --- a/arch/x86/kernel/cpu/mce/amd.c | |
77 | +++ b/arch/x86/kernel/cpu/mce/amd.c | |
78 | @@ -266,7 +266,7 @@ static void smca_configure(unsigned int bank, unsigned int cpu) | |
79 | smca_set_misc_banks_map(bank, cpu); | |
80 | ||
81 | /* Return early if this bank was already initialized. */ | |
82 | - if (smca_banks[bank].hwid) | |
83 | + if (smca_banks[bank].hwid && smca_banks[bank].hwid->hwid_mcatype != 0) | |
84 | return; | |
85 | ||
86 | if (rdmsr_safe_on_cpu(cpu, MSR_AMD64_SMCA_MCx_IPID(bank), &low, &high)) { |
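
The effect of the changed early-return check can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: every `model_`/`MODEL_` name is hypothetical, the numeric type values are invented, and the only detail taken from the patch is that a Reserved entry has `hwid_mcatype == 0`. The `patched` flag selects between the old check and the new one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of smca_banks[]; not the kernel's types. */
#define MODEL_NR_BANKS 3

/* Invented bank types. Only "Reserved == 0" mirrors the patch. */
enum model_bank_type { MODEL_RESERVED = 0, MODEL_LS, MODEL_UMC, MODEL_CS };

struct model_hwid {
	unsigned int hwid_mcatype;	/* 0 means Reserved/RAZ */
};

struct model_bank {
	const struct model_hwid *hwid;	/* NULL until some CPU configures it */
};

/* Invented type values; any non-zero value stands for a known type. */
static const struct model_hwid model_hwids[] = {
	[MODEL_RESERVED] = { 0x0 },
	[MODEL_LS]       = { 0x1 },
	[MODEL_UMC]      = { 0x2 },
	[MODEL_CS]       = { 0x3 },
};

/* Single global array shared by all modeled CPUs. */
static struct model_bank model_banks[MODEL_NR_BANKS];

/* Forget all entries, as if no CPU had come online yet. */
static void model_reset(void)
{
	for (size_t i = 0; i < MODEL_NR_BANKS; i++)
		model_banks[i].hwid = NULL;
}

/*
 * Record the type one CPU sees for a bank.
 * patched == false: old check, return if any entry exists.
 * patched == true:  new check, return only if a non-reserved entry exists.
 */
static void model_configure(unsigned int bank, enum model_bank_type seen,
			    bool patched)
{
	assert(bank < MODEL_NR_BANKS);

	const struct model_hwid *cur = model_banks[bank].hwid;

	if (patched) {
		if (cur && cur->hwid_mcatype != 0)
			return;
	} else {
		if (cur)
			return;
	}

	model_banks[bank].hwid = &model_hwids[seen];
}

/* Return the stored type value for a bank, or 0 if nothing is stored. */
static unsigned int model_bank_mcatype(unsigned int bank)
{
	assert(bank < MODEL_NR_BANKS);
	return model_banks[bank].hwid ? model_banks[bank].hwid->hwid_mcatype : 0;
}
```

In this model, configuring bank 1 first as `MODEL_RESERVED` (CPU0) and then as `MODEL_UMC` (CPU1) leaves the entry Reserved with the old check, while the new check lets the second call store the known type. A later Reserved observation does not overwrite a known type in either case.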