Dave Martin [Tue, 1 Aug 2017 14:35:53 +0000 (15:35 +0100)]
arm64: syscallno is secretly an int, make it official
The upper 32 bits of the syscallno field in thread_struct are
handled inconsistently, being sometimes zero extended and sometimes
sign-extended. In fact, only the lower 32 bits seem to have any
real significance for the behaviour of the code: it's been OK to
handle the upper bits inconsistently because they don't matter.
Currently, the only place I can find where those bits are
significant is in calling trace_sys_enter(), which may be
unintentional: for example, if a compat tracer attempts to cancel a
syscall by passing -1 to (COMPAT_)PTRACE_SET_SYSCALL at the
syscall-enter-stop, it will be traced as syscall 4294967295
rather than -1 as might be expected (and as occurs for a native
tracer doing the same thing). Elsewhere, reads of syscallno cast
it to an int or truncate it.
There's also a conspicuous amount of code and casting to bodge
around the fact that although semantically an int, syscallno is
stored as a u64.
Let's not pretend any more.
In order to preserve the stp x instruction that stores the syscall
number in entry.S, this patch special-cases the layout of struct
pt_regs for big endian so that the newly 32-bit syscallno field
maps onto the low bits of the stored value. This is not beautiful,
but benchmarking of the getpid syscall on Juno suggests indicates a
minor slowdown if the stp is split into an stp x and stp w.
Signed-off-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 35d0e6fb4d219d64ab3b7cffef7a11a0662140f5)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
KVM doesn't follow the SMCCC when it comes to unimplemented calls,
and inject an UNDEF instead of returning an error. Since firmware
calls are now used for security mitigation, they are becoming more
common, and the undef is counter productive.
Instead, let's follow the SMCCC which states that -1 must be returned
to the caller when getting an unknown function number.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 14c7f5bc7a9059037feeb7666ca646c88eaacb5f)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Jiri Slaby [Thu, 24 Aug 2017 07:31:05 +0000 (09:31 +0200)]
futex: Remove duplicated code and fix undefined behaviour
There is code duplicated over all architecture's headers for
futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr,
and comparison of the result.
Remove this duplication and leave up to the arches only the needed
assembly which is now in arch_futex_atomic_op_inuser.
This effectively distributes the Will Deacon's arm64 fix for undefined
behaviour reported by UBSAN to all architectures. The fix was done in
commit 5f16a046f8e1 (arm64: futex: Fix undefined behaviour with
FUTEX_OP_OPARG_SHIFT usage). Look there for an example dump.
And as suggested by Thomas, check for negative oparg too, because it was
also reported to cause undefined behaviour report.
Note that s390 removed access_ok check in d12a29703 ("s390/uaccess:
remove pointless access_ok() checks") as access_ok there returns true.
We introduce it back to the helper for the sake of simplicity (it gets
optimized away anyway).
Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390] Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile] Reviewed-by: Darren Hart (VMware) <dvhart@infradead.org> Reviewed-by: Will Deacon <will.deacon@arm.com> [core/arm64] Cc: linux-mips@linux-mips.org Cc: Rich Felker <dalias@libc.org> Cc: linux-ia64@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: peterz@infradead.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: sparclinux@vger.kernel.org Cc: Jonas Bonn <jonas@southpole.se> Cc: linux-s390@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: linux-hexagon@vger.kernel.org Cc: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: linux-snps-arc@lists.infradead.org Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-xtensa@linux-xtensa.org Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: openrisc@lists.librecores.org Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Stafford Horne <shorne@gmail.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Richard Henderson <rth@twiddle.net> Cc: Chris Zankel <chris@zankel.net> Cc: Michal Simek <monstr@monstr.eu> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-parisc@vger.kernel.org Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: linux-alpha@vger.kernel.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: "David S. Miller" <davem@davemloft.net> Link: http://lkml.kernel.org/r/20170824073105.3901-1-jslaby@suse.cz
(cherry picked from commit 30d6e0a4190d37740e9447e4e4815f06992dd8c3)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Thomas Garnier [Thu, 7 Sep 2017 15:30:47 +0000 (08:30 -0700)]
arm64/syscalls: Move address limit check in loop
A bug was reported on ARM where set_fs might be called after it was
checked on the work pending function. ARM64 is not affected by this bug
but has a similar construct. In order to avoid any similar problems in
the future, the addr_limit_user_check function is moved at the beginning
of the loop.
Fixes: cf7de27ab351 ("arm64/syscalls: Check address limit on user-mode return") Reported-by: Leonard Crestez <leonard.crestez@nxp.com> Signed-off-by: Thomas Garnier <thgarnie@google.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Pratyush Anand <panand@redhat.com> Cc: Dave Martin <Dave.Martin@arm.com> Cc: Will Drewry <wad@chromium.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Andy Lutomirski <luto@amacapital.net> Cc: David Howells <dhowells@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-api@vger.kernel.org Cc: Yonghong Song <yhs@fb.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1504798247-48833-5-git-send-email-keescook@chromium.org
(cherry picked from commit a2048e34d4655c06d31400646ae495bbfeb16b27)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Thomas Garnier [Thu, 7 Sep 2017 15:30:46 +0000 (08:30 -0700)]
arm/syscalls: Optimize address limit check
Disable the generic address limit check in favor of an architecture
specific optimized implementation. The generic implementation using
pending work flags did not work well with ARM and alignment faults.
The address limit is checked on each syscall return path to user-mode
path as well as the irq user-mode return function. If the address limit
was changed, a function is called to report data corruption (stopping
the kernel or process based on configuration).
The address limit check has to be done before any pending work because
they can reset the address limit and the process is killed using a
SIGKILL signal. For example the lkdtm address limit check does not work
because the signal to kill the process will reset the user-mode address
limit.
Signed-off-by: Thomas Garnier <thgarnie@google.com> Signed-off-by: Kees Cook <keescook@chromium.org> Tested-by: Kees Cook <keescook@chromium.org> Tested-by: Leonard Crestez <leonard.crestez@nxp.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Pratyush Anand <panand@redhat.com> Cc: Dave Martin <Dave.Martin@arm.com> Cc: Will Drewry <wad@chromium.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Andy Lutomirski <luto@amacapital.net> Cc: David Howells <dhowells@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-api@vger.kernel.org Cc: Yonghong Song <yhs@fb.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1504798247-48833-4-git-send-email-keescook@chromium.org
(cherry picked from commit e33f8d32677fa4f4f8996ef46748f86aac81ccff)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
The work pending loop can call set_fs after addr_limit_user_check
removed the _TIF_FSCHECK flag. This may happen at anytime based on how
ARM handles alignment exceptions. It leads to an infinite loop condition.
After discussion, it has been agreed that the generic approach is not
tailored to the ARM architecture and any fix might not be complete. This
patch will be replaced by an architecture specific implementation. The
work flag approach will be kept for other architectures.
Reported-by: Leonard Crestez <leonard.crestez@nxp.com> Signed-off-by: Thomas Garnier <thgarnie@google.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Pratyush Anand <panand@redhat.com> Cc: Dave Martin <Dave.Martin@arm.com> Cc: Will Drewry <wad@chromium.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Andy Lutomirski <luto@amacapital.net> Cc: David Howells <dhowells@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-api@vger.kernel.org Cc: Yonghong Song <yhs@fb.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1504798247-48833-3-git-send-email-keescook@chromium.org
(cherry picked from commit 2404269bc4e77a67875c8db6667be34c9913c96e)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Thomas Garnier [Thu, 15 Jun 2017 01:12:03 +0000 (18:12 -0700)]
arm64/syscalls: Check address limit on user-mode return
Ensure the address limit is a user-mode segment before returning to
user-mode. Otherwise a process can corrupt kernel-mode memory and
elevate privileges [1].
The set_fs function sets the TIF_SETFS flag to force a slow path on
return. In the slow path, the address limit is checked to be USER_DS if
needed.
Thomas Garnier [Thu, 15 Jun 2017 01:12:02 +0000 (18:12 -0700)]
arm/syscalls: Check address limit on user-mode return
Ensure the address limit is a user-mode segment before returning to
user-mode. Otherwise a process can corrupt kernel-mode memory and
elevate privileges [1].
The set_fs function sets the TIF_SETFS flag to force a slow path on
return. In the slow path, the address limit is checked to be USER_DS if
needed.
The TIF_SETFS flag is added to _TIF_WORK_MASK shifting _TIF_SYSCALL_WORK
for arm instruction immediate support. The global work mask is too big
to used on a single instruction so adapt ret_fast_syscall.
Thomas Garnier [Thu, 15 Jun 2017 01:12:01 +0000 (18:12 -0700)]
x86/syscalls: Check address limit on user-mode return
Ensure the address limit is a user-mode segment before returning to
user-mode. Otherwise a process can corrupt kernel-mode memory and elevate
privileges [1].
The set_fs function sets the TIF_SETFS flag to force a slow path on
return. In the slow path, the address limit is checked to be USER_DS if
needed.
The addr_limit_user_check function is added as a cross-architecture
function to check the address limit.
enter_lazy_tlb is called when a kernel thread rides on the back of
another mm, due to a context switch or an explicit call to unuse_mm
where a call to switch_mm is elided.
In these cases, it's important to keep the saved ttbr value up to date
with the active mm, otherwise we can end up with a stale value which
points to a potentially freed page table.
This patch implements enter_lazy_tlb for arm64, so that the saved ttbr0
is kept up-to-date with the active mm for kernel threads.
Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Fixes: 39bc88e5e38e9b21 ("arm64: Disable TTBR0_EL1 during normal kernel execution") Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Vinayak Menon <vinmenon@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit e7ef4e829fe1a7e783a5e8233cde4889f4b65729)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
update_saved_ttbr0 mandates that mm->pgd is not swapper, since swapper
contains kernel mappings and should never be installed into ttbr0. However,
this means that callers must avoid passing the init_mm to update_saved_ttbr0
which in turn can cause the saved ttbr0 value to be out-of-date in the context
of the idle thread. For example, EFI runtime services may leave the saved ttbr0
pointing at the EFI page table, and kernel threads may end up with stale
references to freed page tables.
This patch changes update_saved_ttbr0 so that the init_mm points the saved
ttbr0 value to the empty zero page, which always exists and never contains
valid translations. EFI and switch can then call into update_saved_ttbr0
unconditionally.
Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Fixes: 39bc88e5e38e9b21 ("arm64: Disable TTBR0_EL1 during normal kernel execution") Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Vinayak Menon <vinmenon@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit a5347596586db3ab5201ff24be75286dd911f897)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Add cputype definition macros for Qualcomm Datacenter Technologies
Falkor CPU in cputype.h. It's unfortunate that the first revision
of the Falkor CPU used the wrong part number 0x800, got fixed in v2
chip with part number 0xC00, and would be used the same value for
future revisions.
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Cc: Timur Tabi <timur@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 0b69ec336d3d5c50710e19aaf0ca5f738fc20bc3)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Mark Rutland [Tue, 3 Oct 2017 17:25:46 +0000 (18:25 +0100)]
arm64: Use larger stacks when KASAN is selected
AddressSanitizer instrumentation can significantly bloat the stack, and
with GCC 7 this can result in stack overflows at boot time in some
configurations.
We can avoid this by doubling our stack size when KASAN is in use, as is
already done on x86 (and has been since KASAN was introduced).
Regardless of other patches to decrease KASAN's stack utilization,
kernels built with KASAN will always require more stack space than those
built without, and we should take this into account.
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit b02faed15d86f846b0f23f47b92e0782baa873ed)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Christoffer Dall [Thu, 31 Aug 2017 20:24:25 +0000 (22:24 +0200)]
KVM: arm/arm64: Support uaccess of GICC_APRn
When migrating guests around we need to know the active priorities to
ensure functional virtual interrupt prioritization by the GIC.
This commit clarifies the API and how active priorities of interrupts in
different groups are represented, and implements the accessor functions
for the uaccess register range.
We live with a slight layering violation in accessing GICv3 data
structures from vgic-mmio-v2.c, because anything else just adds too much
complexity for us to deal with (it's not like there's a benefit
elsewhere in the code of an intermediate representation as is the case
with the VMCR). We accept this, because while doing v3 processing from
a file named something-v2.c can look strange at first, this really is
specific to dealing with the user space interface for something that
looks like a GICv2.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
(cherry picked from commit 9b87e7a8bfb5098129836757608b3cbbdc11245a)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
KVM: arm/arm64: Extract GICv3 max APRn index calculation
As we are about to access the APRs from the GICv2 uaccess interface,
make this logic generally available.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
(cherry picked from commit 50f5bd5718df9e71d1f4ba69a6480dbad54b2f24)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Marc Zyngier [Fri, 1 Sep 2017 16:51:56 +0000 (17:51 +0100)]
KVM: arm/arm64: vITS: Drop its_ite->lpi field
For unknown reasons, the its_ite data structure carries an "lpi" field
which contains the intid of the LPI. This is an obvious duplication
of the vgic_irq->intid field, so let's fix the only user and remove
the now useless field.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
(cherry picked from commit 7c7d2fa1cd1e9aa9b925bac409e91544eee52c03)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
James Morse [Tue, 18 Jul 2017 12:37:41 +0000 (13:37 +0100)]
KVM: arm/arm64: Fix guest external abort matching
The ARM-ARM has two bits in the ESR/HSR relevant to external aborts.
A range of {I,D}FSC values (of which bit 5 is always set) and bit 9 'EA'
which provides:
> an IMPLEMENTATION DEFINED classification of External Aborts.
This bit is in addition to the {I,D}FSC range, and has an implementation
defined meaning. KVM should always ignore this bit when handling external
aborts from a guest.
Remove the ESR_ELx_EA definition and rewrite its helper
kvm_vcpu_dabt_isextabt() to check the {I,D}FSC range. This merges
kvm_vcpu_dabt_isextabt() and the recently added is_abort_sea() helper.
CC: Tyler Baicar <tbaicar@codeaurora.org> Reported-by: gengdongjiu <gengdj.1984@gmail.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
(cherry picked from commit bb428921b777a5e36753b5d6aa0ba8d46705cc0d)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Yury Norov [Sun, 20 Aug 2017 10:20:48 +0000 (13:20 +0300)]
arm64: cleanup {COMPAT_,}SET_PERSONALITY() macro
There is some work that should be done after setting the personality.
Currently it's done in the macro, which is not the best idea.
In this patch new arch_setup_new_exec() routine is introduced, and all
setup code is moved there, as suggested by Catalin:
https://lkml.org/lkml/2017/8/4/494
Yury Norov [Sun, 20 Aug 2017 10:20:47 +0000 (13:20 +0300)]
arm64: introduce separated bits for mm_context_t flags
Currently mm->context.flags field uses thread_info flags which is not
the best idea for many reasons. For example, mm_context_t doesn't need
most of thread_info flags. And it would be difficult to add new mm-related
flag if needed because it may easily interfere with TIF ones.
To deal with it, the new MMCF_AARCH32 flag is introduced for
mm_context_t->flags, where MMCF prefix stands for mm_context_t flags.
Also, mm_context_t flag doesn't require atomicity and ordering of the
access, so using set/clear_bit() is replaced with simple masks.
arm64: Remove the !CONFIG_ARM64_HW_AFDBM alternative code paths
Since the pte handling for hardware AF/DBM works even when the hardware
feature is not present, make the pte accessors implementation permanent
and remove the corresponding #ifdefs. The Kconfig option is kept as it
can still be used to disable the feature at the hardware level.
Reviewed-by: Will Deacon <will.deacon@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit af29678fe785ad79e7386e97b57093482f0dd7c4)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
arm64: Ignore hardware dirty bit updates in ptep_set_wrprotect()
ptep_set_wrprotect() is only called on CoW mappings which are private
(!VM_SHARED) with the pte either read-only (!PTE_WRITE && PTE_RDONLY) or
writable and software-dirty (PTE_WRITE && !PTE_RDONLY && PTE_DIRTY).
There is no race with the hardware update of the dirty state: clearing
of PTE_RDONLY when PTE_WRITE (a.k.a. PTE_DBM) is set. This patch removes
the code setting the software PTE_DIRTY bit in ptep_set_wrprotect() as
superfluous. A VM_WARN_ONCE is introduced in case the above logic is
wrong or the core mm code changes its use of ptep_set_wrprotect().
Reviewed-by: Will Deacon <will.deacon@arm.com> Acked-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 64c26841b34957ef8f33f7a9e8663aeee59c3ded)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
arm64: Move PTE_RDONLY bit handling out of set_pte_at()
Currently PTE_RDONLY is treated as a hardware only bit and not handled
by the pte_mkwrite(), pte_wrprotect() or the user PAGE_* definitions.
The set_pte_at() function is responsible for setting this bit based on
the write permission or dirty state. This patch moves the PTE_RDONLY
handling out of set_pte_at into the pte_mkwrite()/pte_wrprotect()
functions. The PAGE_* definitions to need to be updated to explicitly
include PTE_RDONLY when !PTE_WRITE.
The patch also removes the redundant PAGE_COPY(_EXEC) definitions as
they are identical to the corresponding PAGE_READONLY(_EXEC).
Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 73e86cb03cf2ec0aa3789dc8621c6d53619cac5e)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
kvm: arm64: Convert kvm_set_s2pte_readonly() from inline asm to cmpxchg()
To take advantage of the LSE atomic instructions and also make the code
cleaner, convert the kvm_set_s2pte_readonly() function to use the more
generic cmpxchg().
Cc: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 0966253d7ccddc42a5211b3488bb4f202c04de1b)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Catalin Marinas [Mon, 26 Jun 2017 13:27:36 +0000 (14:27 +0100)]
arm64: Convert pte handling from inline asm to using (cmp)xchg
With the support for hardware updates of the access and dirty states,
the following pte handling functions had to be implemented using
exclusives: __ptep_test_and_clear_young(), ptep_get_and_clear(),
ptep_set_wrprotect() and ptep_set_access_flags(). To take advantage of
the LSE atomic instructions and also make the code cleaner, convert
these pte functions to use the more generic cmpxchg()/xchg().
Reviewed-by: Will Deacon <will.deacon@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 3bbf7157ac66a88d94b291d4d5e2b2a9319a0f90)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Mark Rutland [Fri, 14 Jul 2017 19:30:35 +0000 (20:30 +0100)]
arm64: add VMAP_STACK overflow detection
This patch adds stack overflow detection to arm64, usable when vmap'd stacks
are in use.
Overflow is detected in a small preamble executed for each exception entry,
which checks whether there is enough space on the current stack for the general
purpose registers to be saved. If there is not enough space, the overflow
handler is invoked on a per-cpu overflow stack. This approach preserves the
original exception information in ESR_EL1 (and where appropriate, FAR_EL1).
Task and IRQ stacks are aligned to double their size, enabling overflow to be
detected with a single bit test. For example, a 16K stack is aligned to 32K,
ensuring that bit 14 of the SP must be zero. On an overflow (or underflow),
this bit is flipped. Thus, overflow (of less than the size of the stack) can be
detected by testing whether this bit is set.
The overflow check is performed before any attempt is made to access the
stack, avoiding recursive faults (and the loss of exception information
these would entail). As logical operations cannot be performed on the SP
directly, the SP is temporarily swapped with a general purpose register
using arithmetic operations to enable the test to be performed.
This gives us a useful error message on stack overflow, as can be trigger with
the LKDTM overflow test:
Mark Rutland [Tue, 1 Aug 2017 17:51:15 +0000 (18:51 +0100)]
arm64: add on_accessible_stack()
Both unwind_frame() and dump_backtrace() try to check whether a stack
address is sane to access, with very similar logic. Both will need
updating in order to handle overflow stacks.
Factor out this logic into a helper, so that we can avoid further
duplication when we add overflow stacks.
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com>
(cherry picked from commit 12964443e8d1914010f9269f9f9abc4e122bc6ca)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Mark Rutland [Fri, 21 Jul 2017 13:25:33 +0000 (14:25 +0100)]
arm64: add basic VMAP_STACK support
This patch enables arm64 to be built with vmap'd task and IRQ stacks.
As vmap'd stacks are mapped at page granularity, stacks must be a multiple of
PAGE_SIZE. This means that a 64K page kernel must use stacks of at least 64K in
size.
To minimize the increase in Image size, IRQ stacks are dynamically allocated at
boot time, rather than embedding the boot CPU's IRQ stack in the kernel image.
This patch was co-authored by Ard Biesheuvel and Mark Rutland.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com>
(cherry picked from commit e3067861ba6650a566a6273738c23c956ad55c02)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Mark Rutland [Mon, 31 Jul 2017 20:17:03 +0000 (21:17 +0100)]
arm64: use an irq stack pointer
We allocate our IRQ stacks using a percpu array. This allows us to generate our
IRQ stack pointers with adr_this_cpu, but bloats the kernel Image with the boot
CPU's IRQ stack. Additionally, these are packed with other percpu variables,
and aren't guaranteed to have guard pages.
When we enable VMAP_STACK we'll want to vmap our IRQ stacks also, in order to
provide guard pages and to permit more stringent alignment requirements. Doing
so will require that we use a percpu pointer to each IRQ stack, rather than
allocating a percpu IRQ stack in the kernel image.
This patch updates our IRQ stack code to use a percpu pointer to the base of
each IRQ stack. This will allow us to change the way the stack is allocated
with minimal changes elsewhere. In some cases we may try to backtrace before
the IRQ stack pointers are initialised, so on_irq_stack() is updated to account
for this.
In testing with cyclictest, there was no measureable difference between using
adr_this_cpu (for irq_stack) and ldr_this_cpu (for irq_stack_ptr) in the IRQ
entry path.
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com>
(cherry picked from commit f60fe78f133243e6de0f05fdefc3ed2f3c5085ca)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
arm64: assembler: allow adr_this_cpu to use the stack pointer
Given that adr_this_cpu already requires a temp register in addition
to the destination register, tweak the instruction sequence so that sp
may be used as well.
This will simplify switching to per-cpu stacks in subsequent patches. While
this limits the range of adr_this_cpu, to +/-4GiB, we don't currently use
adr_this_cpu in modules, and this is not problematic for the main kernel image.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[Mark: add more commit text] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com>
(cherry picked from commit 8ea41b11ef746e1ac97f8c90911e5c61f8bd5cc0)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Mark Rutland [Wed, 19 Jul 2017 16:24:49 +0000 (17:24 +0100)]
arm64: factor out entry stack manipulation
In subsequent patches, we will detect stack overflow in our exception
entry code, by verifying the SP after it has been decremented to make
space for the exception regs.
This verification code is small, and we can minimize its impact by
placing it directly in the vectors. To avoid redundant modification of
the SP, we also need to move the initial decrement of the SP into the
vectors.
As a preparatory step, this patch introduces kernel_ventry, which
performs this decrement, and updates the entry code accordingly.
Subsequent patches will fold SP verification into kernel_ventry.
There should be no functional change as a result of this patch.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[Mark: turn into prep patch, expand commit msg] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com>
(cherry picked from commit b11e5759bfac0c474d95ec4780b1566350e64cad)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Mark Rutland [Fri, 14 Jul 2017 14:54:36 +0000 (15:54 +0100)]
efi/arm64: add EFI_KIMG_ALIGN
The EFI stub is intimately coupled with the kernel, and takes advantage
of this by relocating the kernel at a weaker alignment than the
documented boot protocol mandates.
However, it does so by assuming it can align the kernel to the segment
alignment, and assumes that this is 64K. In subsequent patches, we'll
have to consider other details to determine this de-facto alignment
constraint.
This patch adds a new EFI_KIMG_ALIGN definition that will track the
kernel's de-facto alignment requirements. Subsequent patches will modify
this as required.
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Matt Fleming <matt@codeblueprint.co.uk>
(cherry picked from commit 170976bcab073870af059b5e848c80689bd5e931)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Mark Rutland [Thu, 20 Jul 2017 11:26:48 +0000 (12:26 +0100)]
arm64: clean up irq stack definitions
Before we add yet another stack to the kernel, it would be nice to
ensure that we consistently organise stack definitions and related
helper functions.
This patch moves the basic IRQ stack defintions to <asm/memory.h> to
live with their task stack counterparts. Helpers used for unwinding are
moved into <asm/stacktrace.h>, where subsequent patches will add helpers
for other stacks. Includes are fixed up accordingly.
This patch is a pure refactoring -- there should be no functional
changes as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com>
(cherry picked from commit f60ad4edcf07238a3d2646d65d8d217032452550)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Mark Rutland [Fri, 14 Jul 2017 15:39:21 +0000 (16:39 +0100)]
arm64: clean up THREAD_* definitions
Currently we define THREAD_SIZE and THREAD_SIZE_ORDER separately, with
the latter dependent on particular CONFIG_ARM64_*K_PAGES definitions.
This is somewhat opaque, and will get in the way of future modifications
to THREAD_SIZE.
This patch cleans this up, defining both in terms of a common
THREAD_SHIFT, and using PAGE_SHIFT to calculate THREAD_SIZE_ORDER,
rather than using a number of definitions dependent on config symbols.
Subsequent patches will make use of this to alter the stack size used in
some configurations.
At the same time, these are moved into <asm/memory.h>, which will avoid
circular include issues in subsequent patches. To ensure that existing
code isn't adversely affected, <asm/thread_info.h> is updated to
transitively include these definitions.
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com>
(cherry picked from commit dbc9344a68e506f19f80a9affc8fe7023a9cdc4c)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Mark Rutland [Fri, 14 Jul 2017 18:43:56 +0000 (19:43 +0100)]
arm64: factor out PAGE_* and CONT_* definitions
Some headers rely on PAGE_* definitions from <asm/page.h>, but cannot
include this due to potential circular includes. For example, a number
of definitions in <asm/memory.h> rely on PAGE_SHIFT, and <asm/page.h>
includes <asm/memory.h>.
This requires users of these definitions to include both headers, which
is fragile and error-prone.
This patch ameliorates matters by moving the basic definitions out to a
new header, <asm/page-def.h>. Both <asm/page.h> and <asm/memory.h> are
updated to include this, avoiding this fragility, and avoiding the
possibility of circular include dependencies.
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com>
(cherry picked from commit b6531456ba279bb7cea5dd2e7b3ec6a3ed0f4668)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
For historical reasons, we leave the top 16 bytes of our task and IRQ
stacks unused, a practice used to ensure that the SP can always be
masked to find the base of the current stack (historically, where
thread_info could be found).
However, this is not necessary, as:
* When an exception is taken from a task stack, we decrement the SP by
S_FRAME_SIZE and stash the exception registers before we compare the
SP against the task stack. In such cases, the SP must be at least
S_FRAME_SIZE below the limit, and can be safely masked to determine
whether the task stack is in use.
* When transitioning to an IRQ stack, we'll place a dummy frame onto the
IRQ stack before enabling asynchronous exceptions, or executing code
we expect to trigger faults. Thus, if an exception is taken from the
IRQ stack, the SP must be at least 16 bytes below the limit.
* We no longer mask the SP to find the thread_info, which is now found
via sp_el0. Note that historically, the offset was critical to ensure
that cpu_switch_to() found the correct stack for new threads that
hadn't yet executed ret_from_fork().
Given that, this initial offset serves no purpose, and can be removed.
This brings us in-line with other architectures (e.g. x86) which do not
rely on this masking.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[Mark: rebase, kill THREAD_START_SP, commit msg additions] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com>
(cherry picked from commit 34be98f4944f99076f049a6806fc5f5207a755d3)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Mark Rutland [Fri, 14 Jul 2017 11:23:09 +0000 (12:23 +0100)]
fork: allow arch-override of VMAP stack alignment
In some cases, an architecture might wish its stacks to be aligned to a
boundary larger than THREAD_SIZE. For example, using an alignment of
double THREAD_SIZE can allow for stack overflows smaller than
THREAD_SIZE to be detected by checking a single bit of the stack
pointer.
This patch allows architectures to override the alignment of VMAP'd
stacks, by defining THREAD_ALIGN. Where not defined, this defaults to
THREAD_SIZE, as is the case today.
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: linux-kernel@vger.kernel.org
(cherry picked from commit 48ac3c18cc62d4a23d5dc5c59f8720589d0de14b)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Commit a1d5ebaf8ccd ("arm64: big-endian: don't treat code as data when
copying sigret code") moved the 32-bit sigreturn trampoline code from
the aarch32_sigret_code array to kuser32.S. The commit removed the
array definition from signal32.c, but not its declaration in
signal32.h. Remove the leftover declaration.
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Signed-off-by: Mark Salyzyn <salyzyn@android.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 82d24d114f249d919b918ff8eefde4117db8f088)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Arnd Bergmann [Thu, 10 Aug 2017 14:52:31 +0000 (16:52 +0200)]
arm64: fix pmem interface definition
Defining the two functions as 'static inline' and exporting them
leads to the interesting case where we can use the interface
from loadable modules, but not from built-in drivers, as shown
in this link failure:
vers/nvdimm/claim.o: In function `nsio_rw_bytes':
claim.c:(.text+0x1b8): undefined reference to `arch_invalidate_pmem'
drivers/nvdimm/pmem.o: In function `pmem_dax_flush':
pmem.c:(.text+0x11c): undefined reference to `arch_wb_cache_pmem'
drivers/nvdimm/pmem.o: In function `pmem_make_request':
pmem.c:(.text+0x5a4): undefined reference to `arch_invalidate_pmem'
pmem.c:(.text+0x650): undefined reference to `arch_invalidate_pmem'
pmem.c:(.text+0x6d4): undefined reference to `arch_invalidate_pmem'
This removes the bogus 'static inline'.
Fixes: d50e071fdaa3 ("arm64: Implement pmem API support") Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit caf5ef7d15c511bbef691d0931adad56c2967435)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
The unwind code sets the sp member of struct stackframe to
'frame pointer + 0x10' unconditionally, without regard for whether
doing so produces a legal value. So let's simply remove it now that
we have stopped using it anyway.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 31e43ad3b74a5d7b282023b72f25fc677c14c727)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
arm64: unwind: reference pt_regs via embedded stack frame
As it turns out, the unwind code is slightly broken, and probably has
been for a while. The problem is in the dumping of the exception stack,
which is intended to dump the contents of the pt_regs struct at each
level in the call stack where an exception was taken and routed to a
routine marked as __exception (which means its stack frame is right
below the pt_regs struct on the stack).
'Right below the pt_regs struct' is ill defined, though: the unwind
code assigns 'frame pointer + 0x10' to the .sp member of the stackframe
struct at each level, and dump_backtrace() happily dereferences that as
the pt_regs pointer when encountering an __exception routine. However,
the actual size of the stack frame created by this routine (which could
be one of many __exception routines we have in the kernel) is not known,
and so frame.sp is pretty useless to figure out where struct pt_regs
really is.
So it seems the only way to ensure that we can find our struct pt_regs
when walking the stack frames is to put it at a known fixed offset of
the stack frame pointer that is passed to such __exception routines.
The simplest way to do that is to put it inside pt_regs itself, which is
the main change implemented by this patch. As a bonus, doing this allows
us to get rid of a fair amount of cruft related to walking from one stack
to the other, which is especially nice since we intend to introduce yet
another stack for overflow handling once we add support for vmapped
stacks. It also fixes an inconsistency where we only add a stack frame
pointing to ELR_EL1 if we are executing from the IRQ stack but not when
we are executing from the task stack.
To consistly identify exceptions regs even in the presence of exceptions
taken from entry code, we must check whether the next frame was created
by entry text, rather than whether the current frame was crated by
exception text.
To avoid backtracing using PCs that fall in the idmap, or are controlled
by userspace, we must explcitly zero the FP and LR in startup paths, and
must ensure that the frame embedded in pt_regs is zeroed upon entry from
EL0. To avoid these NULL entries showin in the backtrace, unwind_frame()
is updated to avoid them.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[Mark: compare current frame against .entry.text, avoid bogus PCs] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 7326749801396105aef0ed9229df746ac9e24300)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
vDSO VMA address is saved in mm_context for the purpose of using
restorer from vDSO page to return to userspace after signal handling.
In Checkpoint Restore in Userspace (CRIU) project we place vDSO VMA
on restore back to the place where it was on the dump.
With the exception for x86 (where there is API to map vDSO with
arch_prctl()), we move vDSO inherited from CRIU task to restoree
position by mremap().
CRIU does support arm64 architecture, but kernel doesn't update
context.vdso pointer after mremap(). Which results in translation
fault after signal handling on restored application:
https://github.com/xemul/criu/issues/288
Make vDSO code track the VMA address by supplying .mremap() fops
the same way it's done for x86 and arm32 by:
commit b059a453b1cf ("x86/vdso: Add mremap hook to vm_special_mapping")
commit 280e87e98c09 ("ARM: 8683/1: ARM32: Support mremap() for sigpage/vDSO").
Cc: Russell King <rmk+kernel@armlinux.org.uk> Cc: linux-arm-kernel@lists.infradead.org Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Pavel Emelyanov <xemul@virtuozzo.com> Cc: Christopher Covington <cov@codeaurora.org> Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 739586951b8abe381a98797a5e27a0a9336333d6)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Robin Murphy [Tue, 25 Jul 2017 10:55:41 +0000 (11:55 +0100)]
arm64: Handle trapped DC CVAP
Cache clean to PoP is subject to the same access controls as to PoC, so
if we are trapping userspace cache maintenance with SCTLR_EL1.UCI, we
need to be prepared to handle it. To avoid getting into complicated
fights with binutils about ARMv8.2 options, we'll just cheat and use the
raw SYS instruction rather than the 'proper' DC alias.
Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit e1bc5d1b8e0547c258e65dd97a03560f4d69e635)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Robin Murphy [Tue, 25 Jul 2017 10:55:40 +0000 (11:55 +0100)]
arm64: Expose DC CVAP to userspace
The ARMv8.2-DCPoP feature introduces persistent memory support to the
architecture, by defining a point of persistence in the memory
hierarchy, and a corresponding cache maintenance operation, DC CVAP.
Expose the support via HWCAP and MRS emulation.
Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 7aac405ebb3224037efd56b73d82d181111cdac3)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Robin Murphy [Tue, 25 Jul 2017 10:55:39 +0000 (11:55 +0100)]
arm64: Convert __inval_cache_range() to area-based
__inval_cache_range() is already the odd one out among our data cache
maintenance routines as the only remaining range-based one; as we're
going to want an invalidation routine to call from C code for the pmem
API, let's tweak the prototype and name to bring it in line with the
clean operations, and to make its relationship with __dma_inv_area()
neatly mirror that of __clean_dcache_area_poc() and __dma_clean_area().
The loop clearing the early page tables gets mildly massaged in the
process for the sake of consistency.
Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit d46befef4c03fb61a62b3319ff5265aaac7bc465)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Robin Murphy [Tue, 25 Jul 2017 10:55:38 +0000 (11:55 +0100)]
arm64: mm: Fix set_memory_valid() declaration
Clearly, set_memory_valid() has never been seen in the same room as its
declaration... Whilst the type mismatch is such that kexec probably
wasn't broken in practice, fix it to match the definition as it should.
Fixes: 9b0aa14e3155 ("arm64: mm: add set_memory_valid()") Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 09c2a7dc4ca2755892b74858fe3bf62b652ff9d0)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
arm64: unwind: disregard frame.sp when validating frame pointer
Currently, when unwinding the call stack, we validate the frame pointer
of each frame against frame.sp, whose value is not clearly defined, and
which makes it more difficult to link stack frames together across
different stacks. It is far better to simply check whether the frame
pointer itself points into a valid stack.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com>
(cherry picked from commit c7365330753c55a061db0a1837a27fd5e44b1408)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Mark Rutland [Thu, 20 Jul 2017 13:01:01 +0000 (14:01 +0100)]
arm64: unwind: avoid percpu indirection for irq stack
Our IRQ_STACK_PTR() and on_irq_stack() helpers both take a cpu argument,
used to generate a percpu address. In all cases, they are passed
{raw_,}smp_processor_id(), so this parameter is redundant.
Since {raw_,}smp_processor_id() use a percpu variable internally, this
approach means we generate a percpu offset to find the current cpu, then
use this to index an array of percpu offsets, which we then use to find
the current CPU's IRQ stack pointer. Thus, most of the work is
redundant.
Instead, we can consistently use raw_cpu_ptr() to generate the CPU's
irq_stack pointer by simply adding the percpu offset to the irq_stack
address, which is simpler in both respects.
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 096683724cb2eb95fea759a2580996df1039fdd0)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Mark Rutland [Wed, 26 Jul 2017 15:05:20 +0000 (16:05 +0100)]
arm64: move non-entry code out of .entry.text
Currently, cpu_switch_to and ret_from_fork both live in .entry.text,
though neither form the critical path for an exception entry.
In subsequent patches, we will require that code in .entry.text is part
of the critical path for exception entry, for which we can assume
certain properties (e.g. the presence of exception regs on the stack).
Neither cpu_switch_to nor ret_from_fork will meet these requirements, so
we must move them out of .entry.text. To ensure that neither are kprobed
after being moved out of .entry.text, we must explicitly blacklist them,
requiring a new NOKPROBE() asm helper.
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com>
(cherry picked from commit ed84b4e9582bdfeffc617589fe17dddfc5fe6672)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Mark Rutland [Wed, 26 Jul 2017 10:14:53 +0000 (11:14 +0100)]
arm64: consistently use bl for C exception entry
In most cases, our exception entry assembly branches to C handlers with
a BL instruction, but in cases where we do not expect to return, we use
B instead.
While this is correct today, it means that backtraces for fatal
exceptions miss the entry assembly (as the LR is stale at the point we
call C code), while non-fatal exceptions have the entry assembly in the
LR. In subsequent patches, we will need the LR to be set in these cases
in order to backtrace reliably.
This patch updates these sites to use a BL, ensuring consistency, and
preparing for backtrace rework. An ASM_BUG() is added after each of
these new BLs, which both catches unexpected returns, and ensures that
the LR value doesn't point to another function label.
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 2d0e751a4789fc5ab4a5c9de5d6407b41fdfbbf0)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Mark Rutland [Wed, 26 Jul 2017 13:41:40 +0000 (14:41 +0100)]
arm64: Add ASM_BUG()
Currently. we can only use BUG() from C code, though there are
situations where we would like an equivalent mechanism in assembly code.
This patch refactors our BUG() definition such that it can be used in
either C or assembly, in the form of a new ASM_BUG().
The refactoring requires the removal of escape sequences, such as '\n'
and '\t', but these aren't strictly necessary as we can use ';' to
terminate assembler statements.
The low-level assembly is factored out into <asm/asm-bug.h>, with
<asm/bug.h> retained as the C wrapper.
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Martin <dave.martin@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com>
(cherry picked from commit db44e9c5ecf1b60336ccfc176b07ba7d81d855e0)
CVE-2017-5753
CVE-2017-5715
CVE-2017-5754
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
BugLink: http://bugs.launchpad.net/bugs/1747090
The new firmware interfaces for branch prediction behaviour changes
are transparently available for the guest. Nevertheless, there is
new state attached that should be migrated and properly resetted.
Provide a mechanism for handling reset, migration and VSIE.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[Changed capability number to 152. - Radim] Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
(back ported from commit 35b3fde6203b932b2b1a5b53b3d8808abc9c4f60) Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Colin Ian King [Fri, 9 Feb 2018 10:08:54 +0000 (10:08 +0000)]
Revert "mm, memory_hotplug: do not associate hotadded memory to zones until online"
BugLink: http://bugs.launchpad.net/bugs/1747069
Hotplug removal causes i386 crashes when exercised with the kernel
selftest mem-on-off-test script. A fix occurs in 4.15 however this
requires a large set of changes that are way too large to be SRU'able
and the least risky way forward is to revert the offending commit.
This fix reverts commit f1dd2cd13c4b ("mm, memory_hotplug: do not
associate hotadded memory to zones until online"), however the
revert required some manual fix-ups because of subsequent fixes after
this commit.
Note that running the mem-on-off-script is not always sufficient to
trigger the bug. A good reproducer is to run in a 4 CPU VM with 2GB
of memory and after running the script run sync and then re-install
the kernel packages to trip the issue. This has been thoroughly
tested on i386 and amd64 and given a solid soak test using the
ADT tests too.
Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Andy Whitcroft [Thu, 8 Feb 2018 20:37:45 +0000 (20:37 +0000)]
UBUNTU: SAUCE: turn off IBPB when full retpoline is present
CVE-2017-5715 (Spectre v2 Intel)
When we have full retpoline enabled then we do not actually require IBPB
flushes when entering the kernel. Add a new use_ibpb bit to represent
when we have retpoline enabled. Further split the enable bit into two
0x1 representing whether entry IBPB is enabled and 0x10 representing
whether kernel flushes for userspace/VMs etc are applied.
Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tom Lendacky [Wed, 20 Dec 2017 10:55:47 +0000 (10:55 +0000)]
KVM: x86: Add speculative control CPUID support for guests
CVE-2017-5715 (Spectre v2 Intel)
Provide the guest with the speculative control CPUID related values.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tom Lendacky [Wed, 20 Dec 2017 10:55:47 +0000 (10:55 +0000)]
x86/svm: Set IBPB when running a different VCPU
CVE-2017-5715 (Spectre v2 Intel)
Set IBPB (Indirect Branch Prediction Barrier) when the current CPU is
going to run a VCPU different from what was previously run.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tom Lendacky [Wed, 20 Dec 2017 10:55:47 +0000 (10:55 +0000)]
x86/svm: Set IBRS value on VM entry and exit
CVE-2017-5715 (Spectre v2 Intel)
Set/restore the guests IBRS value on VM entry. On VM exit back to the
kernel save the guest IBRS value and then set IBRS to 1.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tom Lendacky [Wed, 20 Dec 2017 10:55:47 +0000 (10:55 +0000)]
KVM: SVM: Do not intercept new speculative control MSRs
CVE-2017-5715 (Spectre v2 Intel)
Allow guest access to the speculative control MSRs without being
intercepted.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tom Lendacky [Wed, 20 Dec 2017 10:55:47 +0000 (10:55 +0000)]
x86/microcode: Extend post microcode reload to support IBPB feature
CVE-2017-5715 (Spectre v2 Intel)
Add an IBPB feature check to the speculative control update check after
a microcode reload.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tom Lendacky [Wed, 20 Dec 2017 10:52:54 +0000 (10:52 +0000)]
x86/cpu/AMD: Add speculative control support for AMD
CVE-2017-5715 (Spectre v2 Intel)
Add speculative control support for AMD processors. For AMD, speculative
control is indicated as follows:
CPUID EAX=0x00000007, ECX=0x00 return EDX[26] indicates support for
both IBRS and IBPB.
CPUID EAX=0x80000008, ECX=0x00 return EBX[12] indicates support for
just IBPB.
On AMD family 0x10, 0x12 and 0x16 processors where either of the above
features are not supported, IBPB can be achieved by disabling
indirect branch predictor support in MSR 0xc0011021[14] at boot.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Mon, 20 Nov 2017 21:47:54 +0000 (13:47 -0800)]
x86/spec_ctrl: Add lock to serialize changes to ibrs and ibpb control
CVE-2017-5715 (Spectre v2 Intel)
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Thu, 16 Nov 2017 12:47:48 +0000 (04:47 -0800)]
x86/spec_ctrl: Add sysctl knobs to enable/disable SPEC_CTRL feature
CVE-2017-5715 (Spectre v2 Intel)
There are 2 ways to control IBPB and IBRS
1. At boot time
noibrs kernel boot parameter will disable IBRS usage
noibpb kernel boot parameter will disable IBPB usage
Otherwise if the above parameters are not specified, the system
will enable ibrs and ibpb usage if the cpu supports it.
2. At run time
echo 0 > /proc/sys/kernel/ibrs_enabled will turn off IBRS
echo 1 > /proc/sys/kernel/ibrs_enabled will turn on IBRS in kernel
echo 2 > /proc/sys/kernel/ibrs_enabled will turn on IBRS in both userspace and kernel
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
[marcelo.cerri@canonical.com: add x86 guards to kernel/smp.c]
[marcelo.cerri@canonical.com: include asm/msr.h under x86 guard in kernel/sysctl.c] Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Sat, 21 Oct 2017 00:04:35 +0000 (17:04 -0700)]
x86/kvm: Toggle IBRS on VM entry and exit
CVE-2017-5715 (Spectre v2 Intel)
Restore guest IBRS on VM entry and set it to 1 on VM exit
back to kernel.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Fri, 13 Oct 2017 21:31:46 +0000 (14:31 -0700)]
x86/kvm: Set IBPB when switching VM
CVE-2017-5715 (Spectre v2 Intel)
Set IBPB (Indirect branch prediction barrier) when switching VM.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Wei Wang [Tue, 7 Nov 2017 08:47:53 +0000 (16:47 +0800)]
x86/kvm: add MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD to kvm
CVE-2017-5715 (Spectre v2 Intel)
Add field to access guest MSR_IA332_SPEC_CTRL and MSR_IA32_PRED_CMD state.
Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Wed, 15 Nov 2017 01:16:30 +0000 (17:16 -0800)]
x86/entry: Stuff RSB for entry to kernel for non-SMEP platform
CVE-2017-5715 (Spectre v2 Intel)
Stuff RSB to prevent RSB underflow on non-SMEP platform.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Tue, 7 Nov 2017 21:52:42 +0000 (13:52 -0800)]
x86/mm: Only set IBPB when the new thread cannot ptrace current thread
CVE-2017-5715 (Spectre v2 Intel)
To reduce overhead of setting IBPB, we only do that when
the new thread cannot ptrace the current one. If the new
thread has ptrace capability on current thread, it is safe.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Fri, 20 Oct 2017 19:56:29 +0000 (12:56 -0700)]
x86/mm: Set IBPB upon context switch
CVE-2017-5715 (Spectre v2 Intel)
Set IBPB on context switch with changing of page table.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Wed, 15 Nov 2017 20:24:19 +0000 (12:24 -0800)]
x86/idle: Disable IBRS when offlining cpu and re-enable on wakeup
CVE-2017-5715 (Spectre v2 Intel)
Clear IBRS when cpu is offlined and set it when brining it back online.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Tue, 7 Nov 2017 02:19:14 +0000 (18:19 -0800)]
x86/idle: Disable IBRS entering idle and enable it on wakeup
CVE-2017-5715 (Spectre v2 Intel)
Clear IBRS on idle entry and set it on idle exit into kernel on mwait.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Fri, 13 Oct 2017 21:25:00 +0000 (14:25 -0700)]
x86/enter: Use IBRS on syscall and interrupts
CVE-2017-5715 (Spectre v2 Intel)
Set IBRS upon kernel entrance via syscall and interrupts. Clear it upon exit.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Sat, 16 Sep 2017 01:04:53 +0000 (18:04 -0700)]
x86/enter: MACROS to set/clear IBRS and set IBPB
CVE-2017-5715 (Spectre v2 Intel)
Setup macros to control IBRS and IBPB
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Wed, 27 Sep 2017 19:09:14 +0000 (12:09 -0700)]
x86/feature: Report presence of IBPB and IBRS control
CVE-2017-5715 (Spectre v2 Intel)
Report presence of IBPB and IBRS.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Tim Chen [Thu, 24 Aug 2017 16:34:41 +0000 (09:34 -0700)]
x86/feature: Enable the x86 feature to control Speculation
CVE-2017-5715 (Spectre v2 Intel)
cpuid ax=0x7, return rdx bit 26 to indicate presence of this feature
IA32_SPEC_CTRL (0x48) and IA32_PRED_CMD (0x49)
IA32_SPEC_CTRL, bit0 – Indirect Branch Restricted Speculation (IBRS)
IA32_PRED_CMD, bit0 – Indirect Branch Prediction Barrier (IBPB)
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
Craig Gallek [Mon, 30 Oct 2017 22:50:11 +0000 (18:50 -0400)]
tun/tap: sanitize TUNSETSNDBUF input
BugLink: https://launchpad.net/bugs/1748846
Syzkaller found several variants of the lockup below by setting negative
values with the TUNSETSNDBUF ioctl. This patch adds a sanity check
to both the tun and tap versions of this ioctl.
Julien Gomes [Wed, 25 Oct 2017 18:50:50 +0000 (11:50 -0700)]
tun: allow positive return values on dev_get_valid_name() call
BugLink: https://launchpad.net/bugs/1748846
If the name argument of dev_get_valid_name() contains "%d", it will try
to assign it a unit number in __dev__alloc_name() and return either the
unit number (>= 0) or an error code (< 0).
Considering positive values as error values prevent tun device creations
relying this mechanism, therefor we should only consider negative values
as errors here.
Signed-off-by: Julien Gomes <julien@arista.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 5c25f65fd1e42685f7ccd80e0621829c105785d9) Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Cong Wang [Fri, 13 Oct 2017 18:58:53 +0000 (11:58 -0700)]
tun: call dev_get_valid_name() before register_netdevice()
BugLink: https://launchpad.net/bugs/1748846
register_netdevice() could fail early when we have an invalid
dev name, in which case ->ndo_uninit() is not called. For tun
device, this is a problem because a timer etc. are already
initialized and it expects ->ndo_uninit() to clean them up.
We could move these initializations into a ->ndo_init() so
that register_netdevice() knows better, however this is still
complicated due to the logic in tun_detach().
Therefore, I choose to just call dev_get_valid_name() before
register_netdevice(), which is quicker and much easier to audit.
And for this specific case, it is already enough.
Fixes: 96442e42429e ("tuntap: choose the txq based on rxq") Reported-by: Dmitry Alexeev <avekceeb@gmail.com> Cc: Jason Wang <jasowang@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 0ad646c81b2182f7fa67ec0c8c825e0ee165696d) Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
BugLink: https://launchpad.net/bugs/1742759
Add quirks for handling PX/HG systems. In this case, add
a quirk for a weston dGPU that only seems to properly power
down using ATPX power control rather than HG (_PR3).
v2: append a new weston XT
Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com> (v2) Reviewed-and-Tested-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Timo Aaltonen <timo.aaltonen@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Khaled Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
It doesn't make sense to have an indirect call thunk with esp/rsp as
retpoline code won't work correctly with the stack pointer register.
Removing it will help compiler writers to catch error in case such
a thunk call is emitted incorrectly.
Fixes: 76b043848fd2 ("x86/retpoline: Add initial retpoline support") Suggested-by: Jeff Law <law@redhat.com> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Kees Cook <keescook@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1516658974-27852-1-git-send-email-longman@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 04007c4ac40ca141e9160b33368e592394d1b6d3) Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
The PAUSE instruction is currently used in the retpoline and RSB filling
macros as a speculation trap. The use of PAUSE was originally suggested
because it showed a very, very small difference in the amount of
cycles/time used to execute the retpoline as compared to LFENCE. On AMD,
the PAUSE instruction is not a serializing instruction, so the pause/jmp
loop will use excess power as it is speculated over waiting for return
to mispredict to the correct target.
The RSB filling macro is applicable to AMD, and, if software is unable to
verify that LFENCE is serializing on AMD (possible when running under a
hypervisor), the generic retpoline support will be used and, so, is also
applicable to AMD. Keep the current usage of PAUSE for Intel, but add an
LFENCE instruction to the speculation trap for AMD.
The same sequence has been adopted by GCC for the GCC generated retpolines.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@alien8.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Kees Cook <keescook@google.com> Link: https://lkml.kernel.org/r/20180113232730.31060.36287.stgit@tlendack-t1.amdoffice.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 956ec9e7b59a1fb3fd4c9bc8a13a7f7700e9d7d2) Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>