Michael Ellerman [Tue, 27 Mar 2018 12:01:44 +0000 (23:01 +1100)]
powerpc: Add security feature flags for Spectre/Meltdown
This commit adds security feature flags to reflect the settings we
receive from firmware regarding Spectre/Meltdown mitigations.
The feature names reflect the names we are given by firmware on bare
metal machines. See the hostboot source for details.
Arguably these could be firmware features, but that then requires them
to be read early in boot so they're available prior to asm feature
patching, but we don't actually want to use them for patching. We may
also want to dynamically update them in future, which would be
incompatible with the way firmware features work (at the moment at
least). So for now just make them separate flags.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
CVE-2018-3639 (powerpc)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
powerpc/rfi-flush: Differentiate enabled and patched flush types
Currently the rfi-flush messages print 'Using <type> flush' for all
enabled_flush_types, but that is not necessarily true -- as now the
fallback flush is always enabled on pseries, but the fixup function
overwrites its nop/branch slot with other flush types, if available.
So, replace the 'Using <type> flush' messages with '<type> flush is
available'.
Also, print the patched flush types in the fixup function, so users
can know what is (not) being used (e.g., the slower, fallback flush,
or no flush type at all if flush is disabled via the debugfs switch).
Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
[backport: setup_64.c: split hunks due to L1D_FLUSH_FALLBACK location] Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
CVE-2018-3639 (powerpc)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Michael Ellerman [Wed, 14 Mar 2018 22:40:40 +0000 (19:40 -0300)]
powerpc/rfi-flush: Always enable fallback flush on pseries
This ensures the fallback flush area is always allocated on pseries,
so in case a LPAR is migrated from a patched to an unpatched system,
it is possible to enable the fallback flush in the target system.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
CVE-2018-3639 (powerpc)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Michael Ellerman [Wed, 14 Mar 2018 22:40:38 +0000 (19:40 -0300)]
powerpc/rfi-flush: Move the logic to avoid a redo into the debugfs code
rfi_flush_enable() includes a check to see if we're already
enabled (or disabled), and in that case does nothing.
But that means calling setup_rfi_flush() a 2nd time doesn't actually
work, which is a bit confusing.
Move that check into the debugfs code, where it really belongs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 1e2a9fc7496955faacbbed49461d611b704a7505)
CVE-2018-3639 (powerpc)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mauricio: backport: hunk 2: update context lines for end of file] Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
CVE-2018-3639 (powerpc)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Jiri Kosina [Thu, 10 May 2018 20:47:18 +0000 (22:47 +0200)]
x86/bugs: Fix __ssb_select_mitigation() return type
__ssb_select_mitigation() returns one of the members of enum ssb_mitigation,
not ssb_mitigation_cmd; fix the prototype to reflect that.
Fixes: 24f7fc83b9204 ("x86/bugs: Provide boot parameters for the spec_store_bypass_disable mitigation") Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
CVE-2018-3639 (x86)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
The style for the 'status' file is CamelCase or this. _.
Fixes: fae1fa0fc ("proc: Provide details on speculation flaw mitigations") Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
CVE-2018-3639 (x86)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Intel collateral will reference the SSB mitigation bit in IA32_SPEC_CTL[2]
as SSBD (Speculative Store Bypass Disable).
Hence changing it.
It is unclear yet what the MSR_IA32_ARCH_CAPABILITIES (0x10a) Bit(4) name
is going to be. Following the rename it would be SSBD_NO but that rolls out
to Speculative Store Bypass Disable No.
Also fixed the missing space in X86_FEATURE_AMD_SSBD.
[ tglx: Fixup x86_amd_rds_enable() and rds_tif_to_amd_ls_cfg() as well ]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
CVE-2018-3639 (x86)
[smb: adjustments for context and kvm missing v2 upstream code] Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Thomas Gleixner [Fri, 4 May 2018 13:12:06 +0000 (15:12 +0200)]
seccomp: Move speculation migitation control to arch code
The migitation control is simpler to implement in architecture code as it
avoids the extra function call to check the mode. Aside of that having an
explicit seccomp enabled mode in the architecture mitigations would require
even more workarounds.
Move it into architecture code and provide a weak function in the seccomp
code. Remove the 'which' argument as this allows the architecture to decide
which mitigations are relevant for seccomp.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
CVE-2018-3639 (x86)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Kees Cook [Thu, 3 May 2018 21:56:12 +0000 (14:56 -0700)]
seccomp: Add filter flag to opt-out of SSB mitigation
If a seccomp user is not interested in Speculative Store Bypass mitigation
by default, it can set the new SECCOMP_FILTER_FLAG_SPEC_ALLOW flag when
adding filters.
Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
CVE-2018-3639 (x86)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Thomas Gleixner [Thu, 3 May 2018 20:09:15 +0000 (22:09 +0200)]
prctl: Add force disable speculation
For certain use cases it is desired to enforce mitigations so they cannot
be undone afterwards. That's important for loader stubs which want to
prevent a child from disabling the mitigation again. Will also be used for
seccomp(). The extra state preserving of the prctl state for SSB is a
preparatory step for EBPF dymanic speculation control.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
CVE-2018-3639 (x86)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Stefan Bader [Mon, 7 May 2018 14:47:02 +0000 (16:47 +0200)]
UBUNTU: SAUCE: x86/bugs: Honour SPEC_CTRL default
Upstream implementation reads the content of the SPEC_CTRL
MSR once during boot to record the state of reserved bits.
Any access to this MSR (to enable/disable IBRS) needs to
preserve those reserved bits.
This tries to catch and convert all occurrances of the
Intel based IBRS changes we carry.
CVE-2018-3639 (x86)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Kees Cook [Tue, 1 May 2018 22:07:31 +0000 (15:07 -0700)]
seccomp: Enable speculation flaw mitigations
When speculation flaw mitigations are opt-in (via prctl), using seccomp
will automatically opt-in to these protections, since using seccomp
indicates at least some level of sandboxing is desired.
Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
CVE-2018-3639 (x86)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Thomas Gleixner [Sun, 29 Apr 2018 13:26:40 +0000 (15:26 +0200)]
x86/speculation: Add prctl for Speculative Store Bypass mitigation
Add prctl based control for Speculative Store Bypass mitigation and make it
the default mitigation for Intel and AMD.
Andi Kleen provided the following rationale (slightly redacted):
There are multiple levels of impact of Speculative Store Bypass:
1) JITed sandbox.
It cannot invoke system calls, but can do PRIME+PROBE and may have call
interfaces to other code
2) Native code process.
No protection inside the process at this level.
3) Kernel.
4) Between processes.
The prctl tries to protect against case (1) doing attacks.
If the untrusted code can do random system calls then control is already
lost in a much worse way. So there needs to be system call protection in
some way (using a JIT not allowing them or seccomp). Or rather if the
process can subvert its environment somehow to do the prctl it can already
execute arbitrary code, which is much worse than SSB.
To put it differently, the point of the prctl is to not allow JITed code
to read data it shouldn't read from its JITed sandbox. If it already has
escaped its sandbox then it can already read everything it wants in its
address space, and do much worse.
The ability to control Speculative Store Bypass allows to enable the
protection selectively without affecting overall system performance.
Based on an initial patch from Tim Chen. Completely rewritten.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CVE-2018-3639 (x86)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Thomas Gleixner [Sun, 29 Apr 2018 13:21:42 +0000 (15:21 +0200)]
x86/process: Allow runtime control of Speculative Store Bypass
The Speculative Store Bypass vulnerability can be mitigated with the
Reduced Data Speculation (RDS) feature. To allow finer grained control of
this eventually expensive mitigation a per task mitigation control is
required.
Add a new TIF_RDS flag and put it into the group of TIF flags which are
evaluated for mismatch in switch_to(). If these bits differ in the previous
and the next task, then the slow path function __switch_to_xtra() is
invoked. Implement the TIF_RDS dependent mitigation control in the slow
path.
If the prctl for controlling Speculative Store Bypass is disabled or no
task uses the prctl then there is no overhead in the switch_to() fast
path.
Update the KVM related speculation control functions to take TID_RDS into
account as well.
Based on a patch from Tim Chen. Completely rewritten.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CVE-2018-3639 (x86)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Thomas Gleixner [Sun, 29 Apr 2018 13:20:11 +0000 (15:20 +0200)]
prctl: Add speculation control prctls
Add two new prctls to control aspects of speculation related vulnerabilites
and their mitigations to provide finer grained control over performance
impacting mitigations.
PR_GET_SPECULATION_CTRL returns the state of the speculation misfeature
which is selected with arg2 of prctl(2). The return value uses bit 0-2 with
the following meaning:
Bit Define Description
0 PR_SPEC_PRCTL Mitigation can be controlled per task by
PR_SET_SPECULATION_CTRL
1 PR_SPEC_ENABLE The speculation feature is enabled, mitigation is
disabled
2 PR_SPEC_DISABLE The speculation feature is disabled, mitigation is
enabled
If all bits are 0 the CPU is not affected by the speculation misfeature.
If PR_SPEC_PRCTL is set, then the per task control of the mitigation is
available. If not set, prctl(PR_SET_SPECULATION_CTRL) for the speculation
misfeature will fail.
PR_SET_SPECULATION_CTRL allows to control the speculation misfeature, which
is selected by arg2 of prctl(2) per task. arg3 is used to hand in the
control value, i.e. either PR_SPEC_ENABLE or PR_SPEC_DISABLE.
The common return values are:
EINVAL prctl is not implemented by the architecture or the unused prctl()
arguments are not 0
ENODEV arg2 is selecting a not supported speculation misfeature
PR_SET_SPECULATION_CTRL has these additional return values:
ERANGE arg3 is incorrect, i.e. it's not either PR_SPEC_ENABLE or PR_SPEC_DISABLE
ENXIO prctl control of the selected speculation misfeature is disabled
The first supported controlable speculation misfeature is
PR_SPEC_STORE_BYPASS. Add the define so this can be shared between
architectures.
Based on an initial patch from Tim Chen and mostly rewritten.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CVE-2018-3639 (x86)
[tyhicks: Minor backport for SAUCE patch context] Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
[smb: Created nospec.h] Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Thomas Gleixner [Sun, 29 Apr 2018 13:01:37 +0000 (15:01 +0200)]
x86/speculation: Create spec-ctrl.h to avoid include hell
Having everything in nospec-branch.h creates a hell of dependencies when
adding the prctl based switching mechanism. Move everything which is not
required in nospec-branch.h to spec-ctrl.h and fix up the includes in the
relevant files.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Ingo Molnar <mingo@kernel.org>
CVE-2018-3639 (x86)
[tyhicks: Minor backport for context] Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
[smb: Additionally move vmexit_fill_RSB()] Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
x86/bugs/AMD: Add support to disable RDS on Fam[15,16,17]h if requested
AMD does not need the Speculative Store Bypass mitigation to be enabled.
The parameters for this are already available and can be done via MSR
C001_1020. Each family uses a different bit in that MSR for this.
[ tglx: Expose the bit mask via a variable and move the actual MSR fiddling
into the bugs code as that's the right thing to do and also required
to prepare for dynamic enable/disable ]
Suggested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org>
CVE-2018-3639 (x86)
[tyhicks: Minor backport for context] Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
x86/bugs/intel: Set proper CPU features and setup RDS
Intel CPUs expose methods to:
- Detect whether RDS capability is available via CPUID.7.0.EDX[31],
- The SPEC_CTRL MSR(0x48), bit 2 set to enable RDS.
- MSR_IA32_ARCH_CAPABILITIES, Bit(4) no need to enable RRS.
With that in mind if spec_store_bypass_disable=[auto,on] is selected set at
boot-time the SPEC_CTRL MSR to enable RDS if the platform requires it.
Note that this does not fix the KVM case where the SPEC_CTRL is exposed to
guests which can muck with it, see patch titled :
KVM/SVM/VMX/x86/spectre_v2: Support the combination of guest and host IBRS.
And for the firmware (IBRS to be set), see patch titled:
x86/spectre_v2: Read SPEC_CTRL MSR during boot and re-use reserved bits
[ tglx: Distangled it from the intel implementation and kept the call order ]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Ingo Molnar <mingo@kernel.org>
CVE-2018-3639 (x86)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
[backport drops clear feature for bad microcode] Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
x86/bugs: Provide boot parameters for the spec_store_bypass_disable mitigation
Contemporary high performance processors use a common industry-wide
optimization known as "Speculative Store Bypass" in which loads from
addresses to which a recent store has occurred may (speculatively) see an
older value. Intel refers to this feature as "Memory Disambiguation" which
is part of their "Smart Memory Access" capability.
Memory Disambiguation can expose a cache side-channel attack against such
speculatively read values. An attacker can create exploit code that allows
them to read memory outside of a sandbox environment (for example,
malicious JavaScript in a web page), or to perform more complex attacks
against code running within the same privilege level, e.g. via the stack.
As a first step to mitigate against such attacks, provide two boot command
line control knobs:
By default affected x86 processors will power on with Speculative
Store Bypass enabled. Hence the provided kernel parameters are written
from the point of view of whether to enable a mitigation or not.
The parameters are as follows:
- auto - Kernel detects whether your CPU model contains an implementation
of Speculative Store Bypass and picks the most appropriate
mitigation.
- on - disable Speculative Store Bypass
- off - enable Speculative Store Bypass
[ tglx: Reordered the checks so that the whole evaluation is not done
when the CPU does not support RDS ]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Ingo Molnar <mingo@kernel.org>
CVE-2018-3639 (x86)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
x86/bugs, KVM: Support the combination of guest and host IBRS
A guest may modify the SPEC_CTRL MSR from the value used by the
kernel. Since the kernel doesn't use IBRS, this means a value of zero is
what is needed in the host.
But the 336996-Speculative-Execution-Side-Channel-Mitigations.pdf refers to
the other bits as reserved so the kernel should respect the boot time
SPEC_CTRL value and use that.
This allows to deal with future extensions to the SPEC_CTRL interface if
any at all.
Note: This uses wrmsrl() instead of native_wrmsl(). I does not make any
difference as paravirt will over-write the callq *0xfff.. with the wrmsrl
assembler code.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Ingo Molnar <mingo@kernel.org>
[tyhicks: Minor backport for context]
CVE-2018-3639 (x86)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
[backport to review smv.c/vmx.c] Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
x86/bugs: Read SPEC_CTRL MSR during boot and re-use reserved bits
The 336996-Speculative-Execution-Side-Channel-Mitigations.pdf refers to all
the other bits as reserved. The Intel SDM glossary defines reserved as
implementation specific - aka unknown.
As such at bootup this must be taken it into account and proper masking for
the bits in use applied.
A copy of this document is available at
https://bugzilla.kernel.org/show_bug.cgi?id=199511
[ tglx: Made x86_spec_ctrl_base __ro_after_init ]
Suggested-by: Jon Masters <jcm@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Ingo Molnar <mingo@kernel.org>
CVE-2018-3639 (x86)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
David Woodhouse [Thu, 25 Jan 2018 16:14:13 +0000 (16:14 +0000)]
x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown
Also, for CPUs which don't speculate at all, don't report that they're
vulnerable to the Spectre variants either.
Leave the cpu_no_meltdown[] match table with just X86_VENDOR_AMD in it
for now, even though that could be done with a simple comparison, on the
assumption that we'll have more to add.
Based on suggestions from Dave Hansen and Alan Cox.
Linus Torvalds [Tue, 1 May 2018 13:55:51 +0000 (15:55 +0200)]
x86/nospec: Simplify alternative_msr_write()
The macro is not type safe and I did look for why that "g" constraint for
the asm doesn't work: it's because the asm is more fundamentally wrong.
It does
movl %[val], %%eax
but "val" isn't a 32-bit value, so then gcc will pass it in a register,
and generate code like
movl %rsi, %eax
and gas will complain about a nonsensical 'mov' instruction (it's moving a
64-bit register to a 32-bit one).
Passing it through memory will just hide the real bug - gcc still thinks
the memory location is 64-bit, but the "movl" will only load the first 32
bits and it all happens to work because x86 is little-endian.
Convert it to a type safe inline function with a little trick which hands
the feature into the ALTERNATIVE macro.
Stefan Bader [Mon, 7 May 2018 19:35:06 +0000 (21:35 +0200)]
UBUNTU: SAUCE: Add X86_FEATURE_ARCH_CAPABILITIES
Upstream added the whole CPUID 0x00000007.EDX bits under a new
individual element. Right now we have all those under the scattered
bits, so add atch capabilities there as well.
CVE-2018-3639 (x86)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Tyler Hicks [Fri, 4 May 2018 20:30:12 +0000 (20:30 +0000)]
UBUNTU: SAUCE: LSM stacking: adjust prctl values
Since LSM Stacking is provided as an early preview in the Ubuntu
kernels, we should use unusually high values for the LSM Stacking prctls
to reduce the chances of colliding with an upstream feature.
Mark Rutland [Fri, 20 Apr 2018 22:05:00 +0000 (00:05 +0200)]
arm64: fix CONFIG_DEBUG_WX address reporting
BugLink: https://bugs.launchpad.net/bugs/1765850
In ptdump_check_wx(), we pass walk_pgd() a start address of 0 (rather
than VA_START) for the init_mm. This means that any reported W&X
addresses are offset by VA_START, which is clearly wrong and can make
them appear like userspace addresses.
Fix this by telling the ptdump code that we're walking init_mm starting
at VA_START. We don't need to update the addr_markers, since these are
still valid bounds regardless.
Cc: <stable@vger.kernel.org> Fixes: 1404d6f13e47 ("arm64: dump: Add checking for writable and exectuable pages") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Kees Cook <keescook@chromium.org> Cc: Laura Abbott <labbott@redhat.com> Reported-by: Timur Tabi <timur@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit 1d08a044cf12aee37dfd54837558e3295287b343) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
dann frazier [Sat, 21 Apr 2018 16:50:00 +0000 (18:50 +0200)]
net: hns: Avoid action name truncation
BugLink: https://bugs.launchpad.net/bugs/1765977
When longer interface names are used, the action names exposed in
/proc/interrupts and /proc/irq/* maybe truncated. For example, when
using the predictable name algorithm in systemd on a HiSilicon D05,
I see:
Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit f4ea89110df237da6fbcaab76af431e85f07d904) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
mm/madvise.c: fix madvise() infinite loop under special circumstances
CVE-2017-18208
MADVISE_WILLNEED has always been a noop for DAX (formerly XIP) mappings.
Unfortunately madvise_willneed() doesn't communicate this information
properly to the generic madvise syscall implementation. The calling
convention is quite subtle there. madvise_vma() is supposed to either
return an error or update &prev otherwise the main loop will never
advance to the next vma and it will keep looping for ever without a way
to get out of the kernel.
It seems this has been broken since introduction. Nobody has noticed
because nobody seems to be using MADVISE_WILLNEED on these DAX mappings.
[mhocko@suse.com: rewrite changelog] Link: http://lkml.kernel.org/r/20171127115318.911-1-guoxuenan@huawei.com Fixes: fe77ba6f4f97 ("[PATCH] xip: madvice/fadvice: execute in place") Signed-off-by: chenjie <chenjie6@huawei.com> Signed-off-by: guoxuenan <guoxuenan@huawei.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@kernel.org> Cc: zhangyi (F) <yi.zhang@huawei.com> Cc: Miao Xie <miaoxie@huawei.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Shaohua Li <shli@fb.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Rik van Riel <riel@redhat.com> Cc: Carsten Otte <cotte@de.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 6ea8d958a2c95a1d514015d4e29ba21a8c0a1a91) Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Colin King <colin.king@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Dan Carpenter [Mon, 23 Apr 2018 08:45:00 +0000 (10:45 +0200)]
staging: ncpfs: memory corruption in ncp_read_kernel()
CVE-2018-8822
If the server is malicious then *bytes_read could be larger than the
size of the "target" buffer. It would lead to memory corruption when we
do the memcpy().
Reported-by: Dr Silvio Cesare of InfoSect <Silvio Cesare <silvio.cesare@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(backported from commit 4c41aa24baa4ed338241d05494f2c595c885af8f) Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Colin King <colin.king@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
The bug can be easily triggered, if an extra delay (e.g. 10ms) is added
between the test of DMF_FREEING & DMF_DELETING and dm_get() in
dm_get_from_kobject().
To fix it, we need to ensure the test of DMF_FREEING & DMF_DELETING and
dm_get() are done in an atomic way, so _minor_lock is used.
The other callers of dm_get() have also been checked to be OK: some
callers invoke dm_get() under _minor_lock, some callers invoke it under
_hash_lock, and dm_start_request() invoke it after increasing
md->open_count.
Cc: stable@vger.kernel.org Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
(cherry picked from commit b9a41d21dceadf8104812626ef85dc56ee8a60ed) Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Colin King <colin.king@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Kevin Cernekee [Thu, 19 Apr 2018 12:23:00 +0000 (14:23 +0200)]
netlink: Add netns check on taps
CVE-2017-17449
Currently, a nlmon link inside a child namespace can observe systemwide
netlink activity. Filter the traffic so that nlmon can only sniff
netlink messages from its own netns.
Test case:
vpnns -- bash -c "ip link add nlmon0 type nlmon; \
ip link set nlmon0 up; \
tcpdump -i nlmon0 -q -w /tmp/nlmon.pcap -U" &
sudo ip xfrm state add src 10.1.1.1 dst 10.1.1.2 proto esp \
spi 0x1 mode transport \
auth sha1 0x6162633132330000000000000000000000000000 \
enc aes 0x00000000000000000000000000000000
grep --binary abc123 /tmp/nlmon.pcap
Signed-off-by: Kevin Cernekee <cernekee@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 93c647643b48f0131f02e45da3bd367d80443291) Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Juerg Haefliger <juerg.haefliger@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Oliver Neukum [Thu, 19 Apr 2018 14:02:00 +0000 (16:02 +0200)]
media: usbtv: prevent double free in error case
CVE-2017-17975
Quoting the original report:
It looks like there is a double-free vulnerability in Linux usbtv driver
on an error path of usbtv_probe function. When audio registration fails,
usbtv_video_free function ends up freeing usbtv data structure, which
gets freed the second time under usbtv_video_fail label.
usbtv_audio_fail:
usbtv_video_free(usbtv); =>
v4l2_device_put(&usbtv->v4l2_dev);
=> v4l2_device_put
=> kref_put
=> v4l2_device_release
=> usbtv_release (CALLBACK)
=> kfree(usbtv) (1st time)
usbtv_video_fail:
usb_set_intfdata(intf, NULL);
usb_put_dev(usbtv->udev);
kfree(usbtv); (2nd time)
So, as we have refcounting, use it
Reported-by: Yavuz, Tuba <tuba@ece.ufl.edu> Signed-off-by: Oliver Neukum <oneukum@suse.com> CC: stable@vger.kernel.org Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
(cherry picked from commit 50e7044535537b2a54c7ab798cd34c7f6d900bd2) Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Khaled Elmously <khalid.elmously@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
drm/i915/edp: Do not do link training fallback or prune modes on EDP
BugLink: https://bugs.launchpad.net/bugs/1763271
In case of eDP because the panel has a fixed mode, the link rate
and lane count at which it is trained corresponds to the link BW
required to support the native resolution of the panel. In case of
panles with lower resolutions where fewer lanes are hooked up internally,
that number is reflected in the MAX_LANE_COUNT DPCD register of the panel.
So it is pointless to fallback to lower link rate/lane count in case
of link training failure on eDP connector since the lower link BW
will not support the native resolution of the panel and we cannot
prune the preferred mode on the eDP connector.
In case of Link training failure on the eDP panel, something is wrong
in the HW internally and hence driver errors out with a loud
and clear DRM_ERROR message.
v2:
* Fix the DEBUG_ERROR and add {} in else (Ville Syrjala)
Cc: Clinton Taylor <clinton.a.taylor@intel.com> Cc: Jim Bride <jim.bride@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Ville Syrjala <ville.syrjala@linux.intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Manasi Navare <manasi.d.navare@intel.com> Reviewed-by: Ville Syrjala <ville.syrjala@linux.intel.com>
Reference: https://bugs.freedesktop.org/show_bug.cgi?id=103369 Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1507835618-23051-1-git-send-email-manasi.d.navare@intel.com
(cherry picked from commit c0cfb10d9e1de490e36d3b9d4228c0ea0ca30677) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(backported from commit a306343bcd7df89d9d45a601929e26866e7b7a81) Signed-off-by: AceLan Kao <acelan.kao@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Jani Nikula [Thu, 12 Apr 2018 07:28:00 +0000 (09:28 +0200)]
drm/i915/dp: rename intel_dp_is_edp to intel_dp_is_port_edp
BugLink: https://bugs.launchpad.net/bugs/1763271
Emphasize that this is based on the port, not intel_dp. This is also in
line with the underlying intel_bios_is_port_edp() function. No
functional changes.
Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Jim Bride <jim.bride@linux.intel.com> Reviewed-by: Jim Bride <jim.bride@linux.intel.com> Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170818093020.19160-1-jani.nikula@intel.com
(cherry picked from commit 7b91bf7f9196c8192b65a904ac7de7ecf989904f) Signed-off-by: AceLan Kao <acelan.kao@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Jim Bride [Thu, 12 Apr 2018 07:28:00 +0000 (09:28 +0200)]
drm/i915/edp: Allow alternate fixed mode for eDP if available.
BugLink: https://bugs.launchpad.net/bugs/1763271
Some fixed resolution panels actually support more than one mode,
with the only thing different being the refresh rate. Having this
alternate mode available to us is desirable, because it allows us to
test PSR on panels whose setup time at the preferred mode is too long.
With this patch we allow the use of the alternate mode if it's
available and it was specifically requested.
v2 and v3: Rebase
v4: * Fix up some leaky mode stuff (Chris)
* Rebase
v5: * Fix a NULL pointer derefrence (David Weinehall)
v6: * Whitespace / spelling / checkpatch clean-up; no functional
change. (David)
* Rebase
Cc: David Weinehall <david.weinehall@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: David Weinehall <david.weinehall@linux.intel.com> Signed-off-by: Jim Bride <jim.bride@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1502308133-26892-1-git-send-email-jim.bride@linux.intel.com
(cherry picked from commit dc911f5bd8aacfcf8aabd5c26c88e04c837a938e) Signed-off-by: AceLan Kao <acelan.kao@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ganapatrao Kulkarni <gpkulkarni@gklkml16.com> Cc: Jayachandran C <jnair@caviumnetworks.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@cavium.com> Cc: William Cohen <wcohen@redhat.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20180307110803.32418-1-ganapatrao.kulkarni@cavium.com
[ Fixup wrt recent patchset by John Garry ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
(backported from commit a8685f088819d21cd5aea5de4c184de427c3625d) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1762693
e1000e_check_for_copper_link() and e1000_check_for_copper_link_ich8lan()
are the two functions that may be assigned to mac.ops.check_for_link when
phy.media_type == e1000_media_type_copper. Commit 19110cfbb34d ("e1000e:
Separate signaling for link check/link up") changed the meaning of the
return value of check_for_link for copper media but only adjusted the first
function. This patch adjusts the second function likewise.
Reported-by: Christian Hesse <list@eworm.de> Reported-by: Gabriel C <nix.or.die@gmail.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=198047 Fixes: 19110cfbb34d ("e1000e: Separate signaling for link check/link up") Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Tested-by: Christian Hesse <list@eworm.de> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 4110e02eb45ea447ec6f5459c9934de0a273fb91) Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-by: Aaron Ma <aaron.ma@canonical.com> Acked-by: Po-Hsu Lin <po-hsu.lin@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
George Cherian [Tue, 10 Apr 2018 19:29:00 +0000 (21:29 +0200)]
i2c: xlp9xx: Handle NACK on DATA properly
BugLink: https://bugs.launchpad.net/bugs/1762812
In case we receive NACK on DATA we shouldn't be resetting the controller,
rather we should issue STOP command. This will terminate the current
transaction and -EIO is returned.
While at that handle the SMBus Quick Command properly.
We shouldn't be setting the XLP9XX_I2C_CMD_READ/WRITE for such
transactions.
Signed-off-by: George Cherian <george.cherian@cavium.com> Reviewed-by: Jan Glauber <jglauber@cavium.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
(cherry picked from commit e349d7d08e7044caf37a36409305edbd5af013c7) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
George Cherian [Tue, 10 Apr 2018 19:29:00 +0000 (21:29 +0200)]
i2c: xlp9xx: Check for Bus state before every transfer
BugLink: https://bugs.launchpad.net/bugs/1762812
I2C bus enters the STOP condition after the DATA_DONE interrupt is raised.
Essentially the driver should be checking the bus state before sending
any transaction. In case a transaction is initiated while the
bus is busy, the prior transaction's stop condition is not achieved.
Add the check to make sure the bus is not busy before every transaction.
Signed-off-by: George Cherian <george.cherian@cavium.com> Reviewed-by: Jan Glauber <jglauber@cavium.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
(cherry picked from commit d3898a78521cd383d287b3ed5683f914c48c3be9) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
George Cherian [Tue, 10 Apr 2018 19:29:00 +0000 (21:29 +0200)]
i2c: xlp9xx: Handle transactions with I2C_M_RECV_LEN properly
BugLink: https://bugs.launchpad.net/bugs/1762812
In case of transaction with I2C_M_RECV_LEN set, make sure the driver reads
the first byte and then updates the RX fifo with the expected length. Set
threshold to 1 byte so that driver gets an interrupt on receiving the first byte.
After which the transfer length is updated depending on the received length.
Also report SMBus block read functionality.
Signed-off-by: George Cherian <george.cherian@cavium.com> Tested-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
(cherry picked from commit 41b1d4de96323e84c0a902e7e4b2c0f367e77f92) Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Jay Vosburgh [Thu, 5 Apr 2018 18:12:00 +0000 (20:12 +0200)]
virtio-net: Fix operstate for virtio when no VIRTIO_NET_F_STATUS
BugLink: https://bugs.launchpad.net/bugs/1761534
The operstate update logic will leave an interface in the
default UNKNOWN operstate if the interface carrier state never changes
from the default carrier up state set at creation. This includes the
case of an explicit call to netif_carrier_on, as the carrier on to on
transition has no effect on operstate.
This affects virtio-net for the case that the virtio peer does
not support VIRTIO_NET_F_STATUS (the feature that provides carrier state
updates). Without this feature, the virtio specification states that
"the link should be assumed active," so, logically, the operstate should
be UP instead of UNKNOWN. This has impact on user space applications
that use the operstate to make availability decisions for the interface.
Resolve this by changing the virtio probe logic slightly to call
netif_carrier_off for both the "with" and "without" VIRTIO_NET_F_STATUS
cases, and then the existing call to netif_carrier_on for the "without"
case will cause an operstate transition.
Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry-picked from commit bda7fab54828bbef2164bb23c0f6b1a7d05cc718) Signed-off-by: Eric Desrochers <eric.desrochers@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
"- The original patch 3d79a728f9b2e6ddcce4e02c91c4de1076548a4c changed
the call to arch_add_memory in mm/memory_hotplug.c to call with the
boolean argument set to true instead of false, and inverted the
semantics of that argument in the arch layers.
- The revert patch 4fe85d5a7c50f003fe4863a1a87f5d8cc121c75c reverted the
semantic change in the arch layers, but didn't revert the change to the
arch_add_memory call in mm/memory_hotplug.c"
And also:
"It looks like the problem here is that the online_type is _MOVABLE but
can_online_high_movable(nid=255) is returning false:
if ((zone_idx(zone) > ZONE_NORMAL ||
online_type == MMOP_ONLINE_MOVABLE) &&
!can_online_high_movable(pfn_to_nid(pfn)))
This check was removed by upstream commit 57c0a17238e22395428248c53f8e390c051c88b8, and I've verified that if I
apply that commit (partially) to the 4.13.0-37.42 tree along with the
previous arch_add_memory patch to make the probe work, I can fully online
the GPU device memory as expected.
Commit 57c0a172.. implies that the can_online_high_movable() checks
weren't useful anyway, so in addition to the arch_add_memory fix, does
it make sense to revert the pieces of 4fe85d5a7c50f003fe4863a1a87f5d8cc121c75c that added back the
can_online_high_movable() check?"
This patch fixes the partial backport from bug #1747069, by removing
can_online_high_movable and fix the incorrectly set boolean argument
to arch_add_memory().
I've exercised this fix through the ADT tests for i386, amd64 and
ppc64el (in VMs) and observe no regressions.
Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Kamal Mostafa [Wed, 4 Apr 2018 16:32:58 +0000 (09:32 -0700)]
UBUNTU: SAUCE: remove ibrs_dump sysctl interface
BugLink: http://bugs.launchpad.net/bugs/1755627
The ibrs_dump sysctl interface landed in the Ubuntu backport df043b74ba71 ("x86/spec_ctrl: Add sysctl knobs to enable/disable
SPEC_CTRL feature") but nothing like it reached mainline.
This debug interface spams dmesg with many lines of output each time
/proc/sys/kernel/ibrs_dump is accessed (notably, every run of 'sysctl -a')
The interface returns only a dummy sysctl value; it has no other purpose
aside from generating dmesg output.
Remove the interface to squelch the excessive dmesg logging by 'sysctl -a'.
Fixes: df043b74ba71 ("x86/spec_ctrl: Add sysctl knobs to enable/disable SPEC_CTRL feature") Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Leann Ogasawara <leann.ogasawara@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Andy Lutomirski [Thu, 23 Jul 2015 22:37:48 +0000 (15:37 -0700)]
x86/entry/64: Don't use IST entry for #BP stack
CVE-2018-8897
There's nothing IST-worthy about #BP/int3. We don't allow kprobes
in the small handful of places in the kernel that run at CPL0 with
an invalid stack, and 32-bit kernels have used normal interrupt
gates for #BP forever.
Furthermore, we don't allow kprobes in places that have usergs while
in kernel mode, so "paranoid" is also unnecessary.
Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org
(backported from commit d8ba61ba58c88d5207c1ba2f7d9a2280e7d03be9) Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Linus Torvalds [Tue, 20 Mar 2018 19:16:59 +0000 (12:16 -0700)]
kvm/x86: fix icebp instruction handling
CVE-2018-1087
The undocumented 'icebp' instruction (aka 'int1') works pretty much like
'int3' in the absense of in-circuit probing equipment (except,
obviously, that it raises #DB instead of raising #BP), and is used by
some validation test-suites as such.
But Andy Lutomirski noticed that his test suite acted differently in kvm
than on bare hardware.
The reason is that kvm used an inexact test for the icebp instruction:
it just assumed that an all-zero VM exit qualification value meant that
the VM exit was due to icebp.
That is not unlike the guess that do_debug() does for the actual
exception handling case, but it's purely a heuristic, not an absolute
rule. do_debug() does it because it wants to ascribe _some_ reasons to
the #DB that happened, and an empty %dr6 value means that 'icebp' is the
most likely casue and we have no better information.
But kvm can just do it right, because unlike the do_debug() case, kvm
actually sees the real reason for the #DB in the VM-exit interruption
information field.
So instead of relying on an inexact heuristic, just use the actual VM
exit information that says "it was 'icebp'".
Right now the 'icebp' instruction isn't technically documented by Intel,
but that will hopefully change. The special "privileged software
exception" information _is_ actually mentioned in the Intel SDM, even
though the cause of it isn't enumerated.
Reported-by: Andy Lutomirski <luto@kernel.org> Tested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 32d43cd391bacb5f0814c2624399a5dad3501d09) Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Annoyingly, modify_user_hw_breakpoint() unnecessarily complicates the
modification of a breakpoint - simplify it and remove the pointless
local variables.
Also update the stale Docbook while at it.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit f67b15037a7a50c57f72e69a6d59941ad90a0f0f) Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
When Linux brings a CPU down and back up, it switches to init_mm and then
loads swapper_pg_dir into CR3. With PCID enabled, this has the side effect
of masking off the ASID bits in CR3.
This can result in some confusion in the TLB handling code. If we
bring a CPU down and back up with any ASID other than 0, we end up
with the wrong ASID active on the CPU after resume. This could
cause our internal state to become corrupt, although major
corruption is unlikely because init_mm doesn't have any user pages.
More obviously, if CONFIG_DEBUG_VM=y, we'll trip over an assertion
in the next context switch. The result of *that* is a failure to
resume from suspend with probability 1 - 1/6^(cpus-1).
Fix it by reinitializing cpu_tlbstate on resume and CPU bringup.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Jiri Kosina <jikos@kernel.org> Fixes: 10af6235e0d3 ("x86/mm: Implement PCID based optimization: try to preserve old TLB entries using PCID") Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(backported from commit 72c0098d92cedb11c7e0151e84918840a4e96b31)
[tyhicks: initialize_tlbstate_and_flush() was added in 72be211ba] Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Flush indirect branches when switching into a process that marked itself
non dumpable. This protects high value processes like gpg better,
without having too high performance overhead.
If done naïvely, we could switch to a kernel idle thread and then back
to the original process, such as:
process A -> idle -> process A
In such scenario, we do not have to do IBPB here even though the process
is non-dumpable, as we are switching back to the same process after a
hiatus.
To avoid the redundant IBPB, which is expensive, we track the last mm
user context ID. The cost is to have an extra u64 mm context id to track
the last mm we were using before switching to the init_mm used by idle.
Avoiding the extra IBPB is probably worth the extra memory for this
common scenario.
For those cases where tlb_defer_switch_to_init_mm() returns true (non
PCID), lazy tlb will defer switch to init_mm, so we will not be changing
the mm for the process A -> idle -> process A switch. So IBPB will be
skipped for this case.
Thanks to the reviewers and Andy Lutomirski for the suggestion of
using ctx_id which got rid of the problem of mm pointer recycling.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ak@linux.intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: linux@dominikbrodowski.net Cc: peterz@infradead.org Cc: bp@alien8.de Cc: luto@kernel.org Cc: pbonzini@redhat.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1517263487-3708-1-git-send-email-dwmw@amazon.co.uk
(backported from commit 18bf3c3ea8ece8f03b6fc58508f2dfd23c7711c7) Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Using a ptrace access check in the middle of a task switch was causing
a hard lockup in some cases when the old task was confined by AppArmor.
If the AppArmor policy for the the old task didn't allow the task to
ptrace the new task, AppArmor would attempt to emit an audit message and
deadlock on the task's pi_lock would occur. The fix is to revert this
change and go with upstream's implementation that uses the task's
dumpable state to determine if IBPB should be used.
Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Andy Whitcroft [Tue, 3 Apr 2018 16:52:00 +0000 (18:52 +0200)]
UBUNTU: [Packaging] include the retpoline extractor in the headers
BugLink: http://bugs.launchpad.net/bugs/1760876
Out of tree builds utilise the kernel Makefiles and therefore
we need to include all direct dependencies of those Makefiles.
Now that we call out to the repoline extractor during builds we
must carry the extractor with the headers. Move the extractor
to the kernel scripts directory and ensure its name is unique.
Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
UBUNTU: [Packaging] final-checks -- remove check for empty retpoline files
With the new support for removing known safe retpoline sequences
from the ones which were detected it is now completely valid to
end up with an empty retpoline file in the abi. Remove the
check for empty retpoline files so this will not cause an error.
BugLink: http://bugs.launchpad.net/bugs/1758856 Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Peter Zijlstra [Tue, 16 Jan 2018 09:38:09 +0000 (10:38 +0100)]
x86/boot, objtool: Annotate indirect jump in secondary_startup_64()
BugLink: http://bugs.launchpad.net/bugs/1758856
The objtool retpoline validation found this indirect jump. Seeing how
it's on CPU bringup before we run userspace it should be safe, annotate
it.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit bd89004f6305cbf7352238f61da093207ee518d6) Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: rga@amazon.de Cc: Dave Hansen <dave.hansen@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jason Baron <jbaron@akamai.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Link: https://lkml.kernel.org/r/20180125095843.645776917@infradead.org
(cherry picked from commit c940a3fb1e2e9b7d03228ab28f375fb5a47ff699) Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: rga@amazon.de Cc: Dave Hansen <dave.hansen@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jason Baron <jbaron@akamai.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Link: https://lkml.kernel.org/r/20180125095843.595615683@infradead.org
(cherry picked from commit 1a29b5b7f347a1a9230c1e0af5b37e3e571588ab) Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Andy Whitcroft [Thu, 8 Mar 2018 15:48:31 +0000 (15:48 +0000)]
UBUNTU: [Packaging] retpoline -- add safe usage hint support
BugLink: http://bugs.launchpad.net/bugs/1758856
Use the upstream retpoline safe hinting support to clear out known
safe retpoline sequences from those detected. At the same time
switch to extracting the indirect sequences and associated hints
to .o generation time. This allows it to be run on the cache hot
data and to be run in parallel on the builders.
Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Peter Zijlstra [Wed, 17 Jan 2018 15:58:11 +0000 (16:58 +0100)]
x86/paravirt, objtool: Annotate indirect calls
BugLink: http://bugs.launchpad.net/bugs/1758856
Paravirt emits indirect calls which get flagged by objtool retpoline
checks, annotate it away because all these indirect calls will be
patched out before we start userspace.
This patching happens through alternative_instructions() ->
apply_paravirt() -> pv_init_ops.patch() which will eventually end up
in paravirt_patch_default(). This function _will_ write direct
alternatives.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 3010a0663fd949d122eca0561b06b0a9453f7866) Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
9e0e3c5130e9 ("x86/speculation, objtool: Annotate indirect calls/jumps for objtool")
... we added annotations for CALL_NOSPEC/JMP_NOSPEC on 64-bit x86 kernels,
but we did not annotate the 32-bit path.
Annotate it similarly.
Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20180314112427.22351-1-apw@canonical.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit a14bff131108faf50cc0cf864589fd71ee216c96) Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
[ klebers: changed commit message for the upstream one. ] Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 9e0e3c5130e949c389caabc8033e9799b129e429) Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1754584
Fix mmap'd libaio read on non-prefaulted page deadlock. This is a hot fix
from ZFS upstream that ensure pages do not deadlock.
Performing a read with the target data in a mmap'd page that is map'd into
the same blocks that are being read causes a lock on the page and a further
lock on the same page when the page is being faulted in, causing deadlock.
Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>