BugLink: http://bugs.launchpad.net/bugs/1747090
The new firmware interfaces for branch prediction behaviour changes
are transparently available for the guest. Nevertheless, there is
new state attached that should be migrated and properly resetted.
Provide a mechanism for handling reset, migration and VSIE.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[Changed capability number to 152. - Radim] Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
(back ported from commit 35b3fde6203b932b2b1a5b53b3d8808abc9c4f60) Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Colin Ian King [Fri, 9 Feb 2018 10:08:54 +0000 (10:08 +0000)]
Revert "mm, memory_hotplug: do not associate hotadded memory to zones until online"
BugLink: http://bugs.launchpad.net/bugs/1747069
Hotplug removal causes i386 crashes when exercised with the kernel
selftest mem-on-off-test script. A fix occurs in 4.15 however this
requires a large set of changes that are way too large to be SRU'able
and the least risky way forward is to revert the offending commit.
This fix reverts commit f1dd2cd13c4b ("mm, memory_hotplug: do not
associate hotadded memory to zones until online"), however the
revert required some manual fix-ups because of subsequent fixes after
this commit.
Note that running the mem-on-off-script is not always sufficient to
trigger the bug. A good reproducer is to run in a 4 CPU VM with 2GB
of memory and after running the script run sync and then re-install
the kernel packages to trip the issue. This has been thoroughly
tested on i386 and amd64 and given a solid soak test using the
ADT tests too.
Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Andy Whitcroft [Thu, 8 Feb 2018 20:37:45 +0000 (20:37 +0000)]
UBUNTU: SAUCE: turn off IBPB when full retpoline is present
CVE-2017-5715 (Spectre v2 Intel)
When we have full retpoline enabled then we do not actually require IBPB
flushes when entering the kernel. Add a new use_ibpb bit to represent
when we have retpoline enabled. Further split the enable bit into two
0x1 representing whether entry IBPB is enabled and 0x10 representing
whether kernel flushes for userspace/VMs etc are applied.
Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tom Lendacky [Wed, 20 Dec 2017 10:55:47 +0000 (10:55 +0000)]
KVM: x86: Add speculative control CPUID support for guests
CVE-2017-5715 (Spectre v2 Intel)
Provide the guest with the speculative control CPUID related values.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tom Lendacky [Wed, 20 Dec 2017 10:55:47 +0000 (10:55 +0000)]
x86/svm: Set IBPB when running a different VCPU
CVE-2017-5715 (Spectre v2 Intel)
Set IBPB (Indirect Branch Prediction Barrier) when the current CPU is
going to run a VCPU different from what was previously run.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tom Lendacky [Wed, 20 Dec 2017 10:55:47 +0000 (10:55 +0000)]
x86/svm: Set IBRS value on VM entry and exit
CVE-2017-5715 (Spectre v2 Intel)
Set/restore the guests IBRS value on VM entry. On VM exit back to the
kernel save the guest IBRS value and then set IBRS to 1.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tom Lendacky [Wed, 20 Dec 2017 10:55:47 +0000 (10:55 +0000)]
KVM: SVM: Do not intercept new speculative control MSRs
CVE-2017-5715 (Spectre v2 Intel)
Allow guest access to the speculative control MSRs without being
intercepted.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tom Lendacky [Wed, 20 Dec 2017 10:55:47 +0000 (10:55 +0000)]
x86/microcode: Extend post microcode reload to support IBPB feature
CVE-2017-5715 (Spectre v2 Intel)
Add an IBPB feature check to the speculative control update check after
a microcode reload.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tom Lendacky [Wed, 20 Dec 2017 10:52:54 +0000 (10:52 +0000)]
x86/cpu/AMD: Add speculative control support for AMD
CVE-2017-5715 (Spectre v2 Intel)
Add speculative control support for AMD processors. For AMD, speculative
control is indicated as follows:
CPUID EAX=0x00000007, ECX=0x00 return EDX[26] indicates support for
both IBRS and IBPB.
CPUID EAX=0x80000008, ECX=0x00 return EBX[12] indicates support for
just IBPB.
On AMD family 0x10, 0x12 and 0x16 processors where either of the above
features are not supported, IBPB can be achieved by disabling
indirect branch predictor support in MSR 0xc0011021[14] at boot.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Mon, 20 Nov 2017 21:47:54 +0000 (13:47 -0800)]
x86/spec_ctrl: Add lock to serialize changes to ibrs and ibpb control
CVE-2017-5715 (Spectre v2 Intel)
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Thu, 16 Nov 2017 12:47:48 +0000 (04:47 -0800)]
x86/spec_ctrl: Add sysctl knobs to enable/disable SPEC_CTRL feature
CVE-2017-5715 (Spectre v2 Intel)
There are 2 ways to control IBPB and IBRS
1. At boot time
noibrs kernel boot parameter will disable IBRS usage
noibpb kernel boot parameter will disable IBPB usage
Otherwise if the above parameters are not specified, the system
will enable ibrs and ibpb usage if the cpu supports it.
2. At run time
echo 0 > /proc/sys/kernel/ibrs_enabled will turn off IBRS
echo 1 > /proc/sys/kernel/ibrs_enabled will turn on IBRS in kernel
echo 2 > /proc/sys/kernel/ibrs_enabled will turn on IBRS in both userspace and kernel
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
[marcelo.cerri@canonical.com: add x86 guards to kernel/smp.c]
[marcelo.cerri@canonical.com: include asm/msr.h under x86 guard in kernel/sysctl.c] Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Sat, 21 Oct 2017 00:04:35 +0000 (17:04 -0700)]
x86/kvm: Toggle IBRS on VM entry and exit
CVE-2017-5715 (Spectre v2 Intel)
Restore guest IBRS on VM entry and set it to 1 on VM exit
back to kernel.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Fri, 13 Oct 2017 21:31:46 +0000 (14:31 -0700)]
x86/kvm: Set IBPB when switching VM
CVE-2017-5715 (Spectre v2 Intel)
Set IBPB (Indirect branch prediction barrier) when switching VM.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Wei Wang [Tue, 7 Nov 2017 08:47:53 +0000 (16:47 +0800)]
x86/kvm: add MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD to kvm
CVE-2017-5715 (Spectre v2 Intel)
Add field to access guest MSR_IA332_SPEC_CTRL and MSR_IA32_PRED_CMD state.
Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Wed, 15 Nov 2017 01:16:30 +0000 (17:16 -0800)]
x86/entry: Stuff RSB for entry to kernel for non-SMEP platform
CVE-2017-5715 (Spectre v2 Intel)
Stuff RSB to prevent RSB underflow on non-SMEP platform.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Tue, 7 Nov 2017 21:52:42 +0000 (13:52 -0800)]
x86/mm: Only set IBPB when the new thread cannot ptrace current thread
CVE-2017-5715 (Spectre v2 Intel)
To reduce overhead of setting IBPB, we only do that when
the new thread cannot ptrace the current one. If the new
thread has ptrace capability on current thread, it is safe.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Fri, 20 Oct 2017 19:56:29 +0000 (12:56 -0700)]
x86/mm: Set IBPB upon context switch
CVE-2017-5715 (Spectre v2 Intel)
Set IBPB on context switch with changing of page table.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Wed, 15 Nov 2017 20:24:19 +0000 (12:24 -0800)]
x86/idle: Disable IBRS when offlining cpu and re-enable on wakeup
CVE-2017-5715 (Spectre v2 Intel)
Clear IBRS when cpu is offlined and set it when brining it back online.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Tue, 7 Nov 2017 02:19:14 +0000 (18:19 -0800)]
x86/idle: Disable IBRS entering idle and enable it on wakeup
CVE-2017-5715 (Spectre v2 Intel)
Clear IBRS on idle entry and set it on idle exit into kernel on mwait.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Fri, 13 Oct 2017 21:25:00 +0000 (14:25 -0700)]
x86/enter: Use IBRS on syscall and interrupts
CVE-2017-5715 (Spectre v2 Intel)
Set IBRS upon kernel entrance via syscall and interrupts. Clear it upon exit.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Sat, 16 Sep 2017 01:04:53 +0000 (18:04 -0700)]
x86/enter: MACROS to set/clear IBRS and set IBPB
CVE-2017-5715 (Spectre v2 Intel)
Setup macros to control IBRS and IBPB
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Wed, 27 Sep 2017 19:09:14 +0000 (12:09 -0700)]
x86/feature: Report presence of IBPB and IBRS control
CVE-2017-5715 (Spectre v2 Intel)
Report presence of IBPB and IBRS.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Thu, 24 Aug 2017 16:34:41 +0000 (09:34 -0700)]
x86/feature: Enable the x86 feature to control Speculation
CVE-2017-5715 (Spectre v2 Intel)
cpuid ax=0x7, return rdx bit 26 to indicate presence of this feature
IA32_SPEC_CTRL (0x48) and IA32_PRED_CMD (0x49)
IA32_SPEC_CTRL, bit0 – Indirect Branch Restricted Speculation (IBRS)
IA32_PRED_CMD, bit0 – Indirect Branch Prediction Barrier (IBPB)
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Craig Gallek [Mon, 30 Oct 2017 22:50:11 +0000 (18:50 -0400)]
tun/tap: sanitize TUNSETSNDBUF input
CVE-2013-4343
Syzkaller found several variants of the lockup below by setting negative
values with the TUNSETSNDBUF ioctl. This patch adds a sanity check
to both the tun and tap versions of this ioctl.
Julien Gomes [Wed, 25 Oct 2017 18:50:50 +0000 (11:50 -0700)]
tun: allow positive return values on dev_get_valid_name() call
CVE-2013-4343
If the name argument of dev_get_valid_name() contains "%d", it will try
to assign it a unit number in __dev__alloc_name() and return either the
unit number (>= 0) or an error code (< 0).
Considering positive values as error values prevent tun device creations
relying this mechanism, therefor we should only consider negative values
as errors here.
Signed-off-by: Julien Gomes <julien@arista.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Cong Wang [Fri, 13 Oct 2017 18:58:53 +0000 (11:58 -0700)]
tun: call dev_get_valid_name() before register_netdevice()
CVE-2013-4343
register_netdevice() could fail early when we have an invalid
dev name, in which case ->ndo_uninit() is not called. For tun
device, this is a problem because a timer etc. are already
initialized and it expects ->ndo_uninit() to clean them up.
We could move these initializations into a ->ndo_init() so
that register_netdevice() knows better, however this is still
complicated due to the logic in tun_detach().
Therefore, I choose to just call dev_get_valid_name() before
register_netdevice(), which is quicker and much easier to audit.
And for this specific case, it is already enough.
Fixes: 96442e42429e ("tuntap: choose the txq based on rxq") Reported-by: Dmitry Alexeev <avekceeb@gmail.com> Cc: Jason Wang <jasowang@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
BugLink: https://launchpad.net/bugs/1742759
Add quirks for handling PX/HG systems. In this case, add
a quirk for a weston dGPU that only seems to properly power
down using ATPX power control rather than HG (_PR3).
v2: append a new weston XT
Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com> (v2) Reviewed-and-Tested-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Timo Aaltonen <timo.aaltonen@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Khaled Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
It doesn't make sense to have an indirect call thunk with esp/rsp as
retpoline code won't work correctly with the stack pointer register.
Removing it will help compiler writers to catch error in case such
a thunk call is emitted incorrectly.
Fixes: 76b043848fd2 ("x86/retpoline: Add initial retpoline support") Suggested-by: Jeff Law <law@redhat.com> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Kees Cook <keescook@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1516658974-27852-1-git-send-email-longman@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 04007c4ac40ca141e9160b33368e592394d1b6d3) Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
The PAUSE instruction is currently used in the retpoline and RSB filling
macros as a speculation trap. The use of PAUSE was originally suggested
because it showed a very, very small difference in the amount of
cycles/time used to execute the retpoline as compared to LFENCE. On AMD,
the PAUSE instruction is not a serializing instruction, so the pause/jmp
loop will use excess power as it is speculated over waiting for return
to mispredict to the correct target.
The RSB filling macro is applicable to AMD, and, if software is unable to
verify that LFENCE is serializing on AMD (possible when running under a
hypervisor), the generic retpoline support will be used and, so, is also
applicable to AMD. Keep the current usage of PAUSE for Intel, but add an
LFENCE instruction to the speculation trap for AMD.
The same sequence has been adopted by GCC for the GCC generated retpolines.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@alien8.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Kees Cook <keescook@google.com> Link: https://lkml.kernel.org/r/20180113232730.31060.36287.stgit@tlendack-t1.amdoffice.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 956ec9e7b59a1fb3fd4c9bc8a13a7f7700e9d7d2) Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
On context switch from a shallow call stack to a deeper one, as the CPU
does 'ret' up the deeper side it may encounter RSB entries (predictions for
where the 'ret' goes to) which were populated in userspace.
This is problematic if neither SMEP nor KPTI (the latter of which marks
userspace pages as NX for the kernel) are active, as malicious code in
userspace may then be executed speculatively.
Overwrite the CPU's return prediction stack with calls which are predicted
to return to an infinite loop, to "capture" speculation if this
happens. This is required both for retpoline, and also in conjunction with
IBRS for !SMEP && !KPTI.
On Skylake+ the problem is slightly different, and an *underflow* of the
RSB may cause errant branch predictions to occur. So there it's not so much
overwrite, as *filling* the RSB to attempt to prevent it getting
empty. This is only a partial solution for Skylake+ since there are many
other conditions which may result in the RSB becoming empty. The full
solution on Skylake+ is to use IBRS, which will prevent the problem even
when the RSB becomes empty. With IBRS, the RSB-stuffing will not be
required on context switch.
[ tglx: Added missing vendor check and slighty massaged comments and
changelog ]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515779365-9032-1-git-send-email-dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 051547583bdda4b74953053a1034026c56b55c4c) Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
It is possible to run into some problems when executing a host reset
when cq_tasklet_vx_hw() is being executed.
So, prior to host reset, execute tasklet_kill() to ensure that all CQ
tasklets are complete.
Besides, as the function hisi_sas_wait_tasklets_done() is added to do
tasklet_kill(), this patch refactors some code where tasklet_kill() is
used.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(backported from commit 571295f8055c0b69c9911021ae6cf1a6973cf517)
[ dannf: dropped change in hisi_sas_v3_hw.c reset handler, which didn't
yet exit; resolved trivial merge conflict in hisi_sas.h ] Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit d499669facfdc9e8a7c479e86fd11f3d49e065ee) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Xiaofei Tan [Tue, 24 Oct 2017 15:51:38 +0000 (23:51 +0800)]
scsi: hisi_sas: fix the risk of freeing slot twice
BugLink: https://bugs.launchpad.net/bugs/1739807
The function hisi_sas_slot_task_free() is used to free the slot and do
tidy-up of LLDD resources. The LLDD generally should know the state of
a slot and decide when to free it, and it should only be done once.
For some scenarios, we really don't know the state, like when TMF
timeout. In this case, we check task->lldd_task before calling
hisi_sas_slot_task_free().
However, we may miss some scenarios when we should also check
task->lldd_task, and it is not SMP safe to check task->lldd_task as we
don't protect it within spin lock.
This patch is to fix this risk of freeing slot twice, as follows:
1. Check task->lldd_task in the hisi_sas_slot_task_free(), and give
up freeing of this time if task->lldd_task is NULL.
2. Set slot->buf to NULL after it is freed.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 6ba0fbc35aa9f3bc8c12be3b4047055c9ce2ac92) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
This is to guard against the scenario of the slot being freed during
the from the preceding internal abort.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 378c233bcb21dfb2d9c2548b9a1fa6a8d35c78dd) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Xiang Chen [Tue, 24 Oct 2017 15:51:36 +0000 (23:51 +0800)]
scsi: hisi_sas: us start_phy in PHY_FUNC_LINK_RESET
BugLink: https://bugs.launchpad.net/bugs/1739807
When a PHY_FUNC_LINK_RESET is issued, we need to fill the transport
identify_frame to SAS controller before the PHYs are enabled.
Without this, we may find that if a PHY which belonged to a wideport
before the reset may generate a new port id.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 1eb8eeac17ee808b50b422f5ef2e27f5497f82ad) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1739807
When an internal abort times out in hisi_sas_internal_task_abort(),
goto the exit label in and not go through the other task status
checks.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit f692a677e2cb0ccab4ef91be97d8fb020cca246d) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Before, we polled to ITCT CLR interrupt to check if a device is free.
This was error prone, as if the interrupt doesn't occur in 10us, we miss
processing it.
To avoid this situation, service this interrupt and sync the event with
a completion.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 640acc9a9693e729b7936fb4801f2dd59041a141) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Xiang Chen [Thu, 10 Aug 2017 16:09:32 +0000 (00:09 +0800)]
scsi: hisi_sas: add irq and tasklet cleanup in v2 hw
BugLink: https://bugs.launchpad.net/bugs/1739807
This patch adds support to clean-up allocated IRQs and kill tasklets
when probe fails and for driver removal.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 8a253888bfc395a7a169f20d710d9289e59a8592) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Xiaofei Tan [Thu, 10 Aug 2017 16:09:29 +0000 (00:09 +0800)]
scsi: hisi_sas: add v2 hw DFX feature
BugLink: https://bugs.launchpad.net/bugs/1739807
Add DFX feature for v2 hw. We are adding support for
the following errors:
- loss_of_dword_sync_count
- invalid_dword_count
- phy_reset_problem_count
- running_disparity_error_count
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit c52108c61bd3e97495858e6c7423d312093fcfba) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Xiang Chen [Thu, 10 Aug 2017 16:09:28 +0000 (00:09 +0800)]
scsi: hisi_sas: fix v2 hw underflow residual value
BugLink: https://bugs.launchpad.net/bugs/1739807
The value dw0 is the residual bytes when UNDERFLOW error happens, but we
filled the residual with the value of dw3 before. So change the residual
from dw3 to dw0.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 01b361fc900dd8ef7f8537c20e3a1986ab63f4d1) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1739807
When some interrupts happen together, we need to process every interrupt
one-by-one, and should not return immediately when one interrupt process
is finished being processed.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit c16db736653f5319d1d9b0d176f1f80394bd2614) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
1. Fix issue of controller reset required to send commands. For reset
process, it may be required to send commands to the controller, but
not during soft reset. So add HISI_SAS_NOT_ACCEPT_CMD_BIT to prevent
executing a task during this period.
2. Send a broadcast event in rescan topology to detect any topology
changes during reset.
3. Previously it was not ensured that libsas has processed the PHY up
and down events after reset. Potentially this could cause an issue
that we still process the PHY event after reset. So resolve this by
flushing shot workqueue in LLDD reset.
4. Port ID requires refresh after reset. The port ID generated after
reset is not guaranteed to be the same as before reset, so it needs
to be refreshed for each device's ITCT.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 917d3bdaf8f2ab3bace2bd60b78d83a2b3096d98) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Tyler Baicar [Tue, 16 Jan 2018 23:14:48 +0000 (17:14 -0600)]
ACPI / APEI: clear error status before acknowledging the error
BugLink: https://launchpad.net/bugs/1732990
Currently we acknowledge errors before clearing the error status.
This could cause a new error to be populated by firmware in-between
the error acknowledgment and the error status clearing which would
cause the second error's status to be cleared without being handled.
So, clear the error status before acknowledging the errors.
Also, make sure to acknowledge the error if the error status read
fails.
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org> Reviewed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit aaf2c2fb0f51f91c699039440862b6ae9c25c10e) Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
gengdongjiu [Tue, 16 Jan 2018 23:14:47 +0000 (17:14 -0600)]
ACPI: APEI: fix the wrong iteration of generic error status block
BugLink: https://launchpad.net/bugs/1732990
The revision 0x300 generic error data entry is different
from the old version, but currently iterating through the
GHES estatus blocks does not take into account this difference.
This will lead to failure to get the right data entry if GHES
has revision 0x300 error data entry.
Update the GHES estatus iteration macro to properly increment using
acpi_hest_get_next(), and correct the iteration termination condition
because the status block data length only includes error data
length.
Convert the CPER estatus checking and printing iteration logic
to use same macro.
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com> Tested-by: Tyler Baicar <tbaicar@codeaurora.org> Reviewed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit c4335fdd38227788178953c101b77180504d7ea0) Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Alex Williamson [Tue, 16 Jan 2018 23:40:40 +0000 (17:40 -0600)]
vfio/pci: Virtualize Maximum Read Request Size
BugLink: https://launchpad.net/bugs/1732804
MRRS defines the maximum read request size a device is allowed to
make. Drivers will often increase this to allow more data transfer
with a single request. Completions to this request are bound by the
MPS setting for the bus. Aside from device quirks (none known), it
doesn't seem to make sense to set an MRRS value less than MPS, yet
this is a likely scenario given that user drivers do not have a
system-wide view of the PCI topology. Virtualize MRRS such that the
user can set MRRS >= MPS, but use MPS as the floor value that we'll
write to hardware.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
(cherry picked from commit cf0d53ba4947aad6e471491d5b20a567cbe92e56) Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com> Acked-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Alex Williamson [Tue, 16 Jan 2018 23:40:39 +0000 (17:40 -0600)]
vfio/pci: Virtualize Maximum Payload Size
BugLink: https://launchpad.net/bugs/1732804
With virtual PCI-Express chipsets, we now see userspace/guest drivers
trying to match the physical MPS setting to a virtual downstream port.
Of course a lone physical device surrounded by virtual interconnects
cannot make a correct decision for a proper MPS setting. Instead,
let's virtualize the MPS control register so that writes through to
hardware are disallowed. Userspace drivers like QEMU assume they can
write anything to the device and we'll filter out anything dangerous.
Since mismatched MPS can lead to AER and other faults, let's add it
to the kernel side rather than relying on userspace virtualization to
handle it.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>
(cherry picked from commit 523184972b282cd9ca17a76f6ca4742394856818) Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Xiaofei Tan [Thu, 4 Jan 2018 17:52:00 +0000 (18:52 +0100)]
scsi: hisi_sas: support zone management commands
BugLink: https://bugs.launchpad.net/bugs/1739891
Add two ATA commands, ATA_CMD_ZAC_MGMT_IN and ATA_CMD_ZAC_MGMT_OUT in
hisi_sas_get_ata_protocol(), to support SATA SMR disk.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit c3fe8a2bbbc22bd4945ea69ab5a29913baeb35e4) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Jayachandran C [Thu, 4 Jan 2018 17:37:10 +0000 (10:37 -0700)]
i2c: xlp9xx: Get clock frequency with clk API
BugLink: https://bugs.launchpad.net/bugs/1738073
Get the input clock frequency to the controller from the linux clk
API, if it is available. This allows us to pass in the block input
frequency either from ACPI (using APD) or from device tree.
The old hardcoded frequency is used as default for backwards compatibility.
Signed-off-by: Jayachandran C <jnair@caviumnetworks.com> Signed-off-by: Kamlakant Patel <kamlakant.patel@cavium.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
(cherry picked from commit c347b8fc22b21899154cc153a4951aaf226b4e1a) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
arm64: Add software workaround for Falkor erratum 1041
BugLink: https://bugs.launchpad.net/bugs/1738497
The ARM architecture defines the memory locations that are permitted
to be accessed as the result of a speculative instruction fetch from
an exception level for which all stages of translation are disabled.
Specifically, the core is permitted to speculatively fetch from the
4KB region containing the current program counter 4K and next 4K.
When translation is changed from enabled to disabled for the running
exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the
Falkor core may errantly speculatively access memory locations outside
of the 4KB region permitted by the architecture. The errant memory
access may lead to one of the following unexpected behaviors.
1) A System Error Interrupt (SEI) being raised by the Falkor core due
to the errant memory access attempting to access a region of memory
that is protected by a slave-side memory protection unit.
2) Unpredictable device behavior due to a speculative read from device
memory. This behavior may only occur if the instruction cache is
disabled prior to or coincident with translation being changed from
enabled to disabled.
The conditions leading to this erratum will not occur when either of the
following occur:
1) A higher exception level disables translation of a lower exception level
(e.g. EL2 changing SCTLR_EL1[M] from a value of 1 to 0).
2) An exception level disabling its stage-1 translation if its stage-2
translation is enabled (e.g. EL1 changing SCTLR_EL1[M] from a value of 1
to 0 when HCR_EL2[VM] has a value of 1).
To avoid the errant behavior, software must execute an ISB immediately
prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0.
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
(backported from commit 932b50c7c1c65e6f23002e075b97ee083c4a9e71)
[ dannf: trivial offset fix in Kconfig ] Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
[kleber.souza: solve conflicts during cherry-pick] Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1738497 Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
[kleber.souza: fixed BugLink on the commit message.] Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
This fixes a previous patch which disabled checksum offloading
for both IPv4 and IPv6 packets. So L3 checksum offload was
getting disabled for IPv4 pkts. And HW is dropping these pkts
for some reason.
Without this patch, IPv4 TSO appears to be broken:
WIthout this patch I get ~16kbyte/s, with patch close to 2mbyte/s
when copying files via scp from test box to my home workstation.
Looking at tcpdump on sender it looks like hardware drops IPv4 TSO skbs.
This patch restores performance for me, ipv6 looks good too.
Fixes: fa6d7cb5d76c ("net: thunderx: Fix TCP/UDP checksum offload for IPv6 pkts") Cc: Sunil Goutham <sgoutham@cavium.com> Cc: Aleksey Makarov <aleksey.makarov@auriga.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 134059fd2775be79e26c2dff87d25cc2f6ea5626) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
This fixes a previous patch which enabled checksum offloading
for both IPv4 and IPv6 packets. So L3 checksum offload was
getting enabled for IPv6 pkts. And HW is dropping these pkts
as it assumes the pkt is IPv4 when IP csum offload is set
in the SQ descriptor.
Fixes: 3a9024f52c2e ("net: thunderx: Enable TSO and checksum offloads for ipv6") Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@auriga.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit fa6d7cb5d76cf0467c61420fc9238045aedfd379) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Vadim Lomovtsev [Thu, 4 Jan 2018 17:31:00 +0000 (18:31 +0100)]
PCI: Set Cavium ACS capability quirk flags to assert RR/CR/SV/UF
BugLink: https://bugs.launchpad.net/bugs/1736774
The Cavium ThunderX (CN8XXX) family of PCIe Root Ports does not advertise
an ACS capability. However, the RTL internally implements similar
protection as if ACS had Request Redirection, Completion Redirection,
Source Validation, and Upstream Forwarding features enabled.
Will Deacon [Fri, 5 Jan 2018 01:48:37 +0000 (18:48 -0700)]
locking/qrwlock: Prevent slowpath writers getting held up by fastpath
BugLink: https://bugs.launchpad.net/bugs/1732238
When a prospective writer takes the qrwlock locking slowpath due to the
lock being held, it attempts to cmpxchg the wmode field from 0 to
_QW_WAITING so that concurrent lockers also take the slowpath and queue
on the spinlock accordingly, allowing the lockers to drain.
Unfortunately, this isn't fair, because a fastpath writer that comes in
after the lock is made available but before the _QW_WAITING flag is set
can effectively jump the queue. If there is a steady stream of prospective
writers, then the waiter will be held off indefinitely.
This patch restores fairness by separating _QW_WAITING and _QW_LOCKED
into two distinct fields: _QW_LOCKED continues to occupy the bottom byte
of the lockword so that it can be cleared unconditionally when unlocking,
but _QW_WAITING now occupies what used to be the bottom bit of the reader
count. This then forces the slow-path for concurrent lockers.
Tested-by: Waiman Long <longman@redhat.com> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Tested-by: Adam Wallis <awallis@codeaurora.org> Tested-by: Jan Glauber <jglauber@cavium.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Jeremy.Linton@arm.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1507810851-306-6-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit d133166146333e1f13fc81c0e6c43c8d99290a8a) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Will Deacon [Fri, 5 Jan 2018 01:48:36 +0000 (18:48 -0700)]
locking/qrwlock, arm64: Move rwlock implementation over to qrwlocks
BugLink: https://bugs.launchpad.net/bugs/1732238
Now that the qrwlock can make use of WFE, remove our homebrewed rwlock
code in favour of the generic queued implementation.
Tested-by: Waiman Long <longman@redhat.com> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Tested-by: Adam Wallis <awallis@codeaurora.org> Tested-by: Jan Glauber <jglauber@cavium.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Jeremy.Linton@arm.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boqun.feng@gmail.com Cc: linux-arm-kernel@lists.infradead.org Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1507810851-306-5-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 087133ac90763cd339b6b67f2998f87dcc136c52) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Will Deacon [Fri, 5 Jan 2018 01:48:35 +0000 (18:48 -0700)]
locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwlock
BugLink: https://bugs.launchpad.net/bugs/1732238
The qrwlock slowpaths involve spinning when either a prospective reader
is waiting for a concurrent writer to drain, or a prospective writer is
waiting for concurrent readers to drain. In both of these situations,
atomic_cond_read_acquire() can be used to avoid busy-waiting and make use
of any backoff functionality provided by the architecture.
This patch replaces the open-code loops and rspin_until_writer_unlock()
implementation with atomic_cond_read_acquire(). The write mode transition
zero to _QW_WAITING is left alone, since (a) this doesn't need acquire
semantics and (b) should be fast.
Tested-by: Waiman Long <longman@redhat.com> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Tested-by: Adam Wallis <awallis@codeaurora.org> Tested-by: Jan Glauber <jglauber@cavium.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Jeremy.Linton@arm.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1507810851-306-4-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit b519b56e378ee82caf9b079b04f5db87dedc3251) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Will Deacon [Fri, 5 Jan 2018 01:48:34 +0000 (18:48 -0700)]
locking/atomic: Add atomic_cond_read_acquire()
BugLink: https://bugs.launchpad.net/bugs/1732238
smp_cond_load_acquire() provides a way to spin on a variable with acquire
semantics until some conditional expression involving the variable is
satisfied. Architectures such as arm64 can potentially enter a low-power
state, waking up only when the value of the variable changes, which
reduces the system impact of tight polling loops.
This patch makes the same interface available to users of atomic_t,
atomic64_t and atomic_long_t, rather than require messy accesses to the
structure internals.
Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Jeremy.Linton@arm.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Waiman Long <longman@redhat.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1507810851-306-3-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 4df714be4dcf40bfb0d4af0f851a6e1977afa02e) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Will Deacon [Fri, 5 Jan 2018 01:48:33 +0000 (18:48 -0700)]
locking/qrwlock: Use 'struct qrwlock' instead of 'struct __qrwlock'
BugLink: https://bugs.launchpad.net/bugs/1732238
There's no good reason to keep the internal structure of struct qrwlock
hidden from qrwlock.h, particularly as it's actually needed for unlock
and ends up being abstracted independently behind the __qrwlock_write_byte()
function.
Stop pretending we can hide this stuff, and move the __qrwlock definition
into qrwlock, removing the __qrwlock_write_byte() nastiness and using the
same struct definition everywhere instead.
Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Jeremy.Linton@arm.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Waiman Long <longman@redhat.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1507810851-306-2-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit e0d02285f16e8d5810f3d5d5e8a5886ca0015d3b) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
scsi: libiscsi: Allow sd_shutdown on bad transport
BugLink: https://bugs.launchpad.net/bugs/1569925
If, for any reason, userland shuts down iscsi transport interfaces
before proper logouts - like when logging in to LUNs manually, without
logging out on server shutdown, or when automated scripts can't
umount/logout from logged LUNs - kernel will hang forever on its
sd_sync_cache() logic, after issuing the SYNCHRONIZE_CACHE cmd to all
still existent paths.
This happens because iscsi_eh_cmd_timed_out(), the transport layer
timeout helper, would tell the queue timeout function (scsi_times_out)
to reset the request timer over and over, until the session state is
back to logged in state. Unfortunately, during server shutdown, this
might never happen again.
Other option would be "not to handle" the issue in the transport
layer. That would trigger the error handler logic, which would also need
the session state to be logged in again.
Best option, for such case, is to tell upper layers that the command was
handled during the transport layer error handler helper, marking it as
DID_NO_CONNECT, which will allow completion and inform about the
problem.
After the session was marked as ISCSI_STATE_FAILED, due to the first
timeout during the server shutdown phase, all subsequent cmds will fail
to be queued, allowing upper logic to fail faster.
Signed-off-by: Rafael David Tinoco <rafael.tinoco@canonical.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry-picked from commit d754941225a7dbc61f6dd2173fa9498049f9a7ee next-20180117) Signed-off-by: Rafael David Tinoco <rafael.tinoco@canonical.com> Acked-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com> Acked-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Jens Axboe [Fri, 19 Jan 2018 12:43:12 +0000 (10:43 -0200)]
blk-mq-tag: check for NULL rq when iterating tags
BugLink: https://bugs.launchpad.net/bugs/1744300
Since we introduced blk-mq-sched, the tags->rqs[] array has been
dynamically assigned. So we need to check for NULL when iterating,
since there's a window of time where the bit is set, but we haven't
dynamically assigned the tags->rqs[] array position yet.
This is perfectly safe, since the memory backing of the request is
never going away while the device is alive.
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry-pick from commit 7f5562d5ecc44c757599b201df928ba52fa05047) Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
This fixes crash in ttm_bo_vm_fault when hibmc driver is used.
Signed-off-by: Michal Srb <msrb@suse.com>
(cherry-picked from https://lists.freedesktop.org/archives/dri-devel/2017-November/159002.html)
[upstream fix at
https://lists.freedesktop.org/archives/dri-devel/2017-December/161184.html
is quite intrusive - this is the minimal fix] Signed-off-by: Daniel Axtens <daniel.axtens@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Jayachandran C [Tue, 9 Jan 2018 12:47:02 +0000 (04:47 -0800)]
UBUNTU: SAUCE: arm64: Branch predictor hardening for Cavium ThunderX2
When upstream applied this commit, the existing hardening function was used
instead of the new one. This commit applies the delta between upstream and
vendor commit.
CVE-2017-5754 ARM64 KPTI fixes
Use PSCI based mitigation for speculative execution attacks targeting
the branch predictor. The approach is similar to the one used for
Cortex-A CPUs, but in case of ThunderX2 we add another SMC call to
test if the firmware supports the capability.
If the secure firmware has been updated with the mitigation code to
invalidate the branch target buffer, we use the PSCI version call to
invoke it.
Signed-off-by: Jayachandran C <jnair@caviumnetworks.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
UBUNTU: SAUCE: arm64: Implement branch predictor hardening for Falkor
When upstream applied this commit, FALKOR_V1 was missing and only FALKOR
support was added. This commit applies the delta between upstream and vendor
commit.
CVE-2017-5754 ARM64 KPTI fixes
Falkor is susceptible to branch predictor aliasing and can
theoretically be attacked by malicious code. This patch
implements a mitigation for these attacks, preventing any
malicious entries from affecting other victim contexts.
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
[will: fix label name when !CONFIG_KVM] Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Under speculation, CPUs may mis-predict branches in bounds checks. Thus,
memory accesses under a bounds check may be speculated even if the
bounds check fails, providing a primitive for building a side channel.
The EBPF map code has a number of such bounds-checks accesses in
map_lookup_elem implementations. This patch modifies these to use the
nospec helpers to inhibit such side channels.
The JITted lookup_elem implementations remain potentially vulnerable,
and are disabled (with JITted code falling back to the C
implementations).
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Mark Rutland [Sat, 6 Jan 2018 01:10:14 +0000 (17:10 -0800)]
UBUNTU: SAUCE: arm: implement nospec_ptr()
CVE-2017-5754 ARM64 KPTI fixes
This patch implements nospec_ptr() for arm, following the recommended
architectural sequences for the arm and thumb instruction sets.
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
This patch implements nospec_load() and nospec_ptr() for arm64,
following the recommended architectural sequence.
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Document the rationale and usage of the new nospec*() helpers.
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Under speculation, CPUs may mis-predict branches in bounds checks. Thus,
memory accesses under a bounds check may be speculated even if the
bounds check fails, providing a primitive for building a side channel.
This patch adds helpers which can be used to inhibit the use of
out-of-bounds pointers and/or valeus read from these under speculation.
A generic implementation is provided for compatibility, but does not
guarantee safety under speculation. Architectures are expected to
override these helpers as necessary.
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Marc Zyngier [Thu, 11 Jan 2018 14:28:42 +0000 (14:28 +0000)]
UBUNTU: SAUCE: arm: KVM: Invalidate icache on guest exit for Cortex-A15
CVE-2017-5754 ARM64 KPTI fixes
In order to avoid aliasing attacks against the branch predictor
on Cortex-A15, let's invalidate the BTB on guest exit, which can
only be done by invalidating the icache (with ACTLR[0] being set).
We use the same hack as for A12/A17 to perform the vector decoding.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Marc Zyngier [Thu, 11 Jan 2018 14:28:41 +0000 (14:28 +0000)]
UBUNTU: SAUCE: arm: Invalidate icache on prefetch abort outside of user mapping on Cortex-A15
CVE-2017-5754 ARM64 KPTI fixes
In order to prevent aliasing attacks on the branch predictor,
invalidate the icache on Cortex-A15, which has the side effect
of invalidating the BTB. This requires ACTLR[0] to be set to 1
(secure operation).
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Marc Zyngier [Thu, 11 Jan 2018 14:28:40 +0000 (14:28 +0000)]
UBUNTU: SAUCE: arm: Add icache invalidation on switch_mm for Cortex-A15
CVE-2017-5754 ARM64 KPTI fixes
In order to avoid aliasing attacks against the branch predictor,
Cortex-A15 require to invalidate the BTB when switching
from one user context to another. The only way to do so on this
CPU is to perform an ICIALLU, having set ACTLR[0] to 1 from secure
mode.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Marc Zyngier [Thu, 11 Jan 2018 14:28:39 +0000 (14:28 +0000)]
UBUNTU: SAUCE: arm: KVM: Invalidate BTB on guest exit
CVE-2017-5754 ARM64 KPTI fixes
In order to avoid aliasing attacks against the branch predictor,
let's invalidate the BTB on guest exit. This is made complicated
by the fact that we cannot take a branch before invalidating the
BTB.
Another thing is that we perform the invalidation on all
implementations, no matter if they are affected or not.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Marc Zyngier [Thu, 11 Jan 2018 14:28:38 +0000 (14:28 +0000)]
UBUNTU: SAUCE: arm: Invalidate BTB on prefetch abort outside of user mapping on Cortex A8, A9, A12 and A17
CVE-2017-5754 ARM64 KPTI fixes
In order to prevent aliasing attacks on the branch predictor,
invalidate the BTB on CPUs that are known to be affected when taking
a prefetch abort on a address that is outside of a user task limit.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Marc Zyngier [Thu, 11 Jan 2018 14:28:37 +0000 (14:28 +0000)]
UBUNTU: SAUCE: arm: Add BTB invalidation on switch_mm for Cortex-A9, A12 and A17
CVE-2017-5754 ARM64 KPTI fixes
In order to avoid aliasing attacks against the branch predictor,
some implementations require to invalidate the BTB when switching
from one user context to another.
For this, we reuse the existing implementation for Cortex-A8, and
apply it to A9, A12 and A17.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Catalin Marinas [Mon, 5 Feb 2018 14:36:10 +0000 (12:36 -0200)]
arm64: kpti: Fix the interaction between ASID switching and software PAN
CVE-2017-5754 ARM64 KPTI fixes
With ARM64_SW_TTBR0_PAN enabled, the exception entry code checks the
active ASID to decide whether user access was enabled (non-zero ASID)
when the exception was taken. On return from exception, if user access
was previously disabled, it re-instates TTBR0_EL1 from the per-thread
saved value (updated in switch_mm() or efi_set_pgd()).
Commit 7655abb95386 ("arm64: mm: Move ASID from TTBR0 to TTBR1") makes a
TTBR0_EL1 + ASID switching non-atomic. Subsequently, commit 27a921e75711
("arm64: mm: Fix and re-enable ARM64_SW_TTBR0_PAN") changes the
__uaccess_ttbr0_disable() function and asm macro to first write the
reserved TTBR0_EL1 followed by the ASID=0 update in TTBR1_EL1. If an
exception occurs between these two, the exception return code will
re-instate a valid TTBR0_EL1. Similar scenario can happen in
cpu_switch_mm() between setting the reserved TTBR0_EL1 and the ASID
update in cpu_do_switch_mm().
This patch reverts the entry.S check for ASID == 0 to TTBR0_EL1 and
disables the interrupts around the TTBR0_EL1 and ASID switching code in
__uaccess_ttbr0_disable(). It also ensures that, when returning from the
EFI runtime services, efi_set_pgd() doesn't leave a non-zero ASID in
TTBR1_EL1 by using uaccess_ttbr0_{enable,disable}.
The accesses to current_thread_info()->ttbr0 are updated to use
READ_ONCE/WRITE_ONCE.
As a safety measure, __uaccess_ttbr0_enable() always masks out any
existing non-zero ASID TTBR1_EL1 before writing in the new ASID.
Fixes: 27a921e75711 ("arm64: mm: Fix and re-enable ARM64_SW_TTBR0_PAN") Acked-by: Will Deacon <will.deacon@arm.com> Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: James Morse <james.morse@arm.com> Tested-by: James Morse <james.morse@arm.com> Co-developed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(backported from commit 6b88a32c7af68895134872cdec3b6bfdb532d94e) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
arm64: capabilities: Handle duplicate entries for a capability
CVE-2017-5754 ARM64 KPTI fixes
Sometimes a single capability could be listed multiple times with
differing matches(), e.g, CPU errata for different MIDR versions.
This breaks verify_local_cpu_feature() and this_cpu_has_cap() as
we stop checking for a capability on a CPU with the first
entry in the given table, which is not sufficient. Make sure we
run the checks for all entries of the same capability. We do
this by fixing __this_cpu_has_cap() to run through all the
entries in the given table for a match and reuse it for
verify_local_cpu_feature().
Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 67948af41f2e6818edeeba5182811c704d484949) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Jayachandran C [Fri, 19 Jan 2018 12:22:47 +0000 (04:22 -0800)]
arm64: Branch predictor hardening for Cavium ThunderX2
CVE-2017-5754 ARM64 KPTI fixes
Use PSCI based mitigation for speculative execution attacks targeting
the branch predictor. We use the same mechanism as the one used for
Cortex-A CPUs, we expect the PSCI version call to have a side effect
of clearing the BTBs.
Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Jayachandran C <jnair@caviumnetworks.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit f3d795d9b360523beca6d13ba64c2c532f601149) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Daniel Borkmann [Tue, 17 Oct 2017 14:55:54 +0000 (16:55 +0200)]
bpf: do not test for PCPU_MIN_UNIT_SIZE before percpu allocations
CVE-2017-5754 ARM64 KPTI fixes
PCPU_MIN_UNIT_SIZE is an implementation detail of the percpu
allocator. Given we support __GFP_NOWARN now, lets just let
the allocation request fail naturally instead. The two call
sites from BPF mistakenly assumed __GFP_NOWARN would work, so
no changes needed to their actual __alloc_percpu_gfp() calls
which use the flag already.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit bc6d5031b43a2291de638ab9304320b4cae61689) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Yonghong Song [Thu, 5 Oct 2017 16:19:19 +0000 (09:19 -0700)]
bpf: perf event change needed for subsequent bpf helpers
CVE-2017-5754 ARM64 KPTI fixes
This patch does not impact existing functionalities.
It contains the changes in perf event area needed for
subsequent bpf_perf_event_read_value and
bpf_perf_prog_read_value helpers.
Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 97562633bcbac4a07d605ae628d7655fa71caaf5) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Daniel Borkmann [Sat, 19 Aug 2017 01:12:46 +0000 (03:12 +0200)]
bpf: inline map in map lookup functions for array and htab
CVE-2017-5754 ARM64 KPTI fixes
Avoid two successive functions calls for the map in map lookup, first
is the bpf_map_lookup_elem() helper call, and second the callback via
map->ops->map_lookup_elem() to get to the map in map implementation.
Implementation inlines array and htab flavor for map in map lookups.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 7b0c2a0508b90fce79d3782b2e55d0e8bf6a283e) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Jayachandran C [Mon, 8 Jan 2018 06:53:35 +0000 (22:53 -0800)]
arm64: cputype: Add MIDR values for Cavium ThunderX2 CPUs
CVE-2017-5754 ARM64 KPTI fixes
Add the older Broadcom ID as well as the new Cavium ID for ThunderX2
CPUs.
Signed-off-by: Jayachandran C <jnair@caviumnetworks.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 0d90718871fe80f019b7295ec9d2b23121e396fb) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
arm64: Implement branch predictor hardening for Falkor
CVE-2017-5754 ARM64 KPTI fixes
Falkor is susceptible to branch predictor aliasing and can
theoretically be attacked by malicious code. This patch
implements a mitigation for these attacks, preventing any
malicious entries from affecting other victim contexts.
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
[will: fix label name when !CONFIG_KVM and remove references to MIDR_FALKOR] Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(backported from commit ec82b567a74fbdffdf418d4bb381d55f6a9096af) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>