Export these symbols such that UBIFS can implement
->migratepage.
Signed-off-by: Richard Weinberger <richard@nod.at> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Copy __kvm_mips_vcpu_run() into unmapped memory, so that we can never
get a TLB refill exception in it when KVM is built as a module.
This was observed to happen with the host MIPS kernel running under
QEMU, due to a not entirely transparent optimisation in the QEMU TLB
handling where TLB entries replaced with TLBWR are copied to a separate
part of the TLB array. Code in those pages continue to be executable,
but those mappings persist only until the next ASID switch, even if they
are marked global.
An ASID switch happens in __kvm_mips_vcpu_run() at exception level after
switching to the guest exception base. Subsequent TLB mapped kernel
instructions just prior to switching to the guest trigger a TLB refill
exception, which enters the guest exception handlers without updating
EPC. This appears as a guest triggered TLB refill on a host kernel
mapped (host KSeg2) address, which is not handled correctly as user
(guest) mode accesses to kernel (host) segments always generate address
error exceptions.
Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: kvm@vger.kernel.org Cc: linux-mips@linux-mips.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Currently pmd_mknotpresent will use a zero entry to respresent an
invalidated pmd.
Unfortunately this definition clashes with pmd_none, thus it is
possible for a race condition to occur if zap_pmd_range sees pmd_none
whilst __split_huge_pmd_locked is running too with pmdp_invalidate
just called.
This patch fixes the race condition by modifying pmd_mknotpresent to
create non-zero faulting entries (as is done in other architectures),
removing the ambiguity with pmd_none.
[catalin.marinas@arm.com: using L_PMD_SECT_VALID instead of PMD_TYPE_SECT]
Fixes: 8d9625070073 ("ARM: mm: Transparent huge page support for LPAE systems.") Reported-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Russell King <linux@armlinux.org.uk> Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
In a subsequent patch, pmd_mknotpresent will clear the valid bit of the
pmd entry, resulting in a not-present entry from the hardware's
perspective. Unfortunately, pmd_present simply checks for a non-zero pmd
value and will therefore continue to return true even after a
pmd_mknotpresent operation. Since pmd_mknotpresent is only used for
managing huge entries, this is only an issue for the 3-level case.
This patch fixes the 3-level pmd_present implementation to take into
account the valid bit. For bisectability, the change is made before the
fix to pmd_mknotpresent.
Fixes: 8d9625070073 ("ARM: mm: Transparent huge page support for LPAE systems.") Cc: Russell King <linux@armlinux.org.uk> Cc: Steve Capper <Steve.Capper@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Olga Kornievskaia reports that the following test fails to trigger
an OPEN_DOWNGRADE on the wire, and only triggers the final CLOSE.
fd0 = open(foo, RDRW) -- should be open on the wire for "both"
fd1 = open(foo, RDONLY) -- should be open on the wire for "read"
close(fd0) -- should trigger an open_downgrade
read(fd1)
close(fd1)
The issue is that we're missing a check for whether or not the current
state transitioned from an O_RDWR state as opposed to having transitioned
from a combination of O_RDONLY and O_WRONLY.
Reported-by: Olga Kornievskaia <aglo@umich.edu> Fixes: cd9288ffaea4 ("NFSv4: Fix another bug in the close/open_downgrade code") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
In "NFSv4: Move dentry instantiation into the NFSv4-specific atomic open code"
unconditional d_drop() after the ->open_context() had been removed. It had
been correct for success cases (there ->open_context() itself had been doing
dcache manipulations), but not for error ones. Only one of those (ENOENT)
got a compensatory d_drop() added in that commit, but in fact it should've
been done for all errors. As it is, the case of O_CREAT non-exclusive open
on a hashed negative dentry racing with e.g. symlink creation from another
client ended up with ->open_context() getting an error and proceeding to
call nfs_lookup(). On a hashed dentry, which would've instantly triggered
BUG_ON() in d_materialise_unique() (or, these days, its equivalent in
d_splice_alias()).
Tested-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Use set_posix_acl, which includes proper permission checks, instead of
calling ->set_acl directly. Without this anyone may be able to grant
themselves permissions to a file by setting the ACL.
Lock the inode to make the new checks atomic with respect to set_acl.
(Also, nfsd was the only caller of set_acl not locking the inode, so I
suspect this may fix other races.)
This also simplifies the code, and ensures our ACLs are checked by
posix_acl_valid.
The permission checks and the inode locking were lost with commit 4ac7249e, which changed nfsd to use the set_acl inode operation directly
instead of going through xattr handlers.
Reported-by: David Sinquin <david@sinquin.eu>
[agreunba@redhat.com: use set_posix_acl] Fixes: 4ac7249e Cc: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Factor out part of posix_acl_xattr_set into a common function that takes
a posix_acl, which nfsd can also call.
The prototype already exists in include/linux/posix_acl.h.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
It used to be the case that state had an rwlock that was locked for write
by downgrades, but for read for upgrades (opens). Well, the problem is
if there are two competing opens for the same state, they step on
each other toes potentially leading to leaking file descriptors
from the state structure, since access mode is a bitmap only set once.
Signed-off-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
As vm.dirty_[background_]bytes can't be applied verbatim to multiple
cgroup writeback domains, they get converted to percentages in
domain_dirty_limits() and applied the same way as
vm.dirty_[background]ratio. However, if the specified bytes is lower
than 1% of available memory, the calculated ratios become zero and the
writeback domain gets throttled constantly.
Fix it by using per-PAGE_SIZE instead of percentage for ratio
calculations. Also, the updated DIV_ROUND_UP() usages now should
yield 1/4096 (0.0244%) as the minimum ratio as long as the specified
bytes are above zero.
Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Miao Xie <miaoxie@huawei.com> Link: http://lkml.kernel.org/g/57333E75.3080309@huawei.com Fixes: 9fc3a43e1757 ("writeback: separate out domain_dirty_limits()") Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Adjusted comment based on Jan's suggestion. Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
The freq_table array is not populated before calling
thermal_of_cooling_register. The code which populates the freq table was
introduced in commit f6859014.
This should be done before registering new thermal cooling device.
The log shows effects of this wrong decision.
[ 2.172614] cpu cpu1: Failed to get voltage for frequency 1984518656000: -34
[ 2.220863] cpu cpu0: Failed to get voltage for frequency 1984524416000: -34
Fixes: f6859014c7e7 ("thermal: cpu_cooling: Store frequencies in descending order") Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Javi Merino <javi.merino@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Make sure consumers do not overwrite gpio flags for pins that have
already been claimed.
While adding support for gpio drivers to refuse a request using
unsupported flags, the order of when the requested flag was checked and
the new flags were applied was reversed to that consumers could
overwrite flags for already requested gpios.
This not only affects device-tree setups where two drivers could request
the same gpio using conflicting configurations, but also allowed user
space to clear gpio flags for already claimed pins simply by attempting
to export them through the sysfs interface. By for example clearing the
FLAG_ACTIVE_LOW flag this way, user space could effectively change the
polarity of a signal.
Reverting this change obviously prevents gpio drivers from doing sanity
checks on the flags in their request callbacks. Fortunately only one
recently added driver (gpio-tps65218 in v4.6) appears to do this, and a
follow up patch could restore this functionality through a different
interface.
Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Fix boot crash that triggers if this driver is built into a kernel and
run on non-AMD systems.
AMD northbridges users call amd_cache_northbridges() and it returns
a negative value to signal that we weren't able to cache/detect any
northbridges on the system.
At least, it should do so as all its callers expect it to do so. But it
does return a negative value only when kmalloc() fails.
Fix it to return -ENODEV if there are no NBs cached as otherwise, amd_nb
users like amd64_edac, for example, which relies on it to know whether
it should load or not, gets loaded on systems like Intel Xeons where it
shouldn't.
Reported-and-tested-by: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1466097230-5333-2-git-send-email-bp@alien8.de Link: https://lkml.kernel.org/r/5761BEB0.9000807@cybernetics.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Fix kprobe_fault_handler() to clear the TF (trap flag) bit of
the flags register in the case of a fault fixup on single-stepping.
If we put a kprobe on the instruction which caused a
page fault (e.g. actual mov instructions in copy_user_*),
that fault happens on the single-stepping buffer. In this
case, kprobes resets running instance so that the CPU can
retry execution on the original ip address.
However, current code forgets to reset the TF bit. Since this
fault happens with TF bit set for enabling single-stepping,
when it retries, it causes a debug exception and kprobes
can not handle it because it already reset itself.
On the most of x86-64 platform, it can be easily reproduced
by using kprobe tracer. E.g.
# cd /sys/kernel/debug/tracing
# echo p copy_user_enhanced_fast_string+5 > kprobe_events
# echo 1 > events/kprobes/enable
And you'll see a kernel panic on do_debug(), since the debug
trap is not handled by kprobes.
To fix this problem, we just need to clear the TF bit when
resetting running kprobe.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: systemtap@sourceware.org Link: http://lkml.kernel.org/r/20160611140648.25885.37482.stgit@devbox
[ Updated the comments. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
For newer versions of Syslinux, we need ldlinux.c32 in addition to
isolinux.bin to reside on the boot disk, so if the latter is found,
copy it, too, to the isoimage tree.
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
CPU 1 CPU 2
static_key_slow_inc()
atomic_inc_not_zero()
-> key.enabled == 0, no increment
jump_label_lock()
atomic_inc_return()
-> key.enabled == 1 now
static_key_slow_inc()
atomic_inc_not_zero()
-> key.enabled == 1, inc to 2
return
** static key is wrong!
jump_label_update()
jump_label_unlock()
Testing the static key at the point marked by (**) will follow the
wrong path for jumps that have not been patched yet. This can
actually happen when creating many KVM virtual machines with userspace
LAPIC emulation; just run several copies of the following program:
int main(void)
{
for (;;) {
int kvmfd = open("/dev/kvm", O_RDONLY);
int vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0);
close(ioctl(vmfd, KVM_CREATE_VCPU, 1));
close(vmfd);
close(kvmfd);
}
return 0;
}
Every KVM_CREATE_VCPU ioctl will attempt a static_key_slow_inc() call.
The static key's purpose is to skip NULL pointer checks and indeed one
of the processes eventually dereferences NULL.
As explained in the commit that introduced the bug:
jump_label_update() needs key.enabled to be true. The solution adopted
here is to temporarily make key.enabled == -1, and use go down the
slow path when key.enabled <= 0.
Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 706249c222f6 ("locking/static_keys: Rework update logic") Link: http://lkml.kernel.org/r/1466527937-69798-1-git-send-email-pbonzini@redhat.com
[ Small stylistic edits to the changelog and the code. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
54cf809b9512 ("locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()")
... fixes spin_is_locked() and spin_unlock_wait() for the usage
in ipc/sem and netfilter, it does not in fact work right for the
usage in task_work and futex.
So while the 2 locks crossed problem:
spin_lock(A) spin_lock(B)
if (!spin_is_locked(B)) spin_unlock_wait(A)
foo() foo();
... works with the smp_mb() injected by both spin_is_locked() and
spin_unlock_wait(), this is not sufficient for:
flag = 1;
smp_mb(); spin_lock()
spin_unlock_wait() if (!flag)
// add to lockless list
// iterate lockless list
... because in this scenario, the store from spin_lock() can be delayed
past the load of flag, uncrossing the variables and loosing the
guarantee.
This patch reworks spin_is_locked() and spin_unlock_wait() to work in
both cases by exploiting the observation that while the lock byte
store can be delayed, the contender must have registered itself
visibly in other state contained in the word.
It also allows for architectures to override both functions, as PPC
and ARM64 have an additional issue for which we currently have no
generic solution.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Giovanni Gherdovich <ggherdovich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Waiman Long <waiman.long@hpe.com> Cc: Will Deacon <will.deacon@arm.com> Fixes: 54cf809b9512 ("locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()") Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Recursive locking for ww_mutexes was originally conceived as an
exception. However, it is heavily used by the DRM atomic modesetting
code. Currently, the recursive deadlock is checked after we have queued
up for a busy-spin and as we never release the lock, we spin until
kicked, whereupon the deadlock is discovered and reported.
A simple solution for the now common problem is to move the recursive
deadlock discovery to the first action when taking the ww_mutex.
Suggested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1464293297-19777-1-git-send-email-chris@chris-wilson.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
The kernel-doc for the of_irq_get[_byname]() is clearly inadequate in
describing the return values -- of_irq_get_byname() is documented better
than of_irq_get() but it still doesn't mention that 0 is returned iff
irq_create_of_mapping() fails (it doesn't return an error code in this
case). Document all possible return value variants, making the writing
of the word "IRQ" consistent, while at it...
Fixes: 9ec36cafe43b ("of/irq: do irq resolution in platform_get_irq") Fixes: ad69674e73a1 ("of/irq: do irq resolution in platform_get_irq_byname()") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Because of an improper dereference, a stray 'C' character was output to
the modalias when no 'compatible' was specified. This is the case for
some old PowerMac drivers which only set the 'name' property. Fix it to
let them match again.
Reported-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Tested-by: Mathieu Malaterre <malat@debian.org> Cc: Philipp Zabel <p.zabel@pengutronix.de> Cc: Andreas Schwab <schwab@linux-m68k.org> Fixes: 6543becf26fff6 ("mod/file2alias: make modalias generation safe for cross compiling") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Fixes: 1b852bceb0d1 ("mnt: Refactor the logic for mounting sysfs and proc in a user namespace") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
In rare cases it is possible for s_flags & MS_RDONLY to be set but
MNT_READONLY to be clear. This starting combination can cause
fs_fully_visible to fail to ensure that the new mount is readonly.
Therefore force MNT_LOCK_READONLY in the new mount if MS_RDONLY
is set on the source filesystem of the mount.
In general both MS_RDONLY and MNT_READONLY are set at the same for
mounts so I don't expect any programs to care. Nor do I expect
MS_RDONLY to be set on proc or sysfs in the initial user namespace,
which further decreases the likelyhood of problems.
Which means this change should only affect system configurations by
paranoid sysadmins who should welcome the additional protection
as it keeps people from wriggling out of their policies.
Fixes: 8c6cf9cc829f ("mnt: Modify fs_fully_visible to deal with locked ro nodev and atime") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
MNT_LOCKED implies on a child mount implies the child is locked to the
parent. So while looping through the children the children should be
tested (not their parent).
Typically an unshare of a mount namespace locks all mounts together
making both the parent and the slave as locked but there are a few
corner cases where other things work.
Fixes: ceeb0e5d39fc ("vfs: Ignore unlocked mounts in fs_fully_visible") Reported-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Fix warning about tainted kernel because usb-otg-fsm has no license.
WARNING: with this patch usb-otg-fsm module can be loaded
but then the kernel will hang. Tested with a udoo quad board.
Signed-off-by: Oscar <oscar@naiandei.net> Signed-off-by: Peter Chen <peter.chen@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
The HOSTPC extension registers found in some EHCI implementations form
a variable-length array, with one element for each port. Therefore
the hostpc field in struct ehci_regs should be declared as a
zero-length array, not a single-element array.
This fixes a problem reported by UBSAN.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Wilfried Klaebe <linux-kernel@lebenslange-mailadresse.de> Tested-by: Wilfried Klaebe <linux-kernel@lebenslange-mailadresse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
A patch that went into Linux-4.4 to fix big-endian mode on a Lantiq
MIPS system unfortunately broke big-endian operation on PowerPC
APM82181 as reported by Christian Lamparter, and likely other
systems.
It actually introduced multiple issues:
- it broke big-endian ARM kernels: any machine that was working
correctly with a little-endian kernel is no longer using byteswaps
on big-endian kernels, which clearly breaks them.
- On PowerPC the same thing must be true: if it was working before,
using big-endian kernels is now broken. Unlike ARM, 32-bit PowerPC
usually uses big-endian kernels, so they are likely all broken.
- The barrier for dwc2_writel is on the wrong side of the __raw_writel(),
so the MMIO no longer synchronizes with DMA operations.
- On architectures that require specific CPU instructions for MMIO
access, using the __raw_ variant may turn this into a pointer
dereference that does not have the same effect as the readl/writel.
This patch is a simple revert for all architectures other than MIPS,
in the hope that we can more easily backport it to fix the regression
on PowerPC and ARM systems without breaking the Lantiq system again.
We should follow this up with a more elaborate change to add runtime
detection of endianness, to make sure it also works on all other
combinations of architectures and implementations of the usb-dwc2
device. That patch however will be fairly large and not appropriate
for backports to stable kernels.
Felipe suggested a different approach, using an endianness switching
register to always put the device into LE mode, but unfortunately
the dwc2 hardware does not provide a generic way to do that. Also,
I see no practical way of addressing the problem more generally by
patching architecture specific code on MIPS.
Fixes: 95c8bc360944 ("usb: dwc2: Use platform endianness when accessing registers") Acked-by: John Youn <johnyoun@synopsys.com> Tested-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Userspace can quite legitimately perform an exec() syscall with a
suspended transaction. exec() does not return to the old process, rather
it load a new one and starts that, the expectation therefore is that the
new process starts not in a transaction. Currently exec() is not treated
any differently to any other syscall which creates problems.
Firstly it could allow a new process to start with a suspended
transaction for a binary that no longer exists. This means that the
checkpointed state won't be valid and if the suspended transaction were
ever to be resumed and subsequently aborted (a possibility which is
exceedingly likely as exec()ing will likely doom the transaction) the
new process will jump to invalid state.
Secondly the incorrect attempt to keep the transactional state while
still zeroing state for the new process creates at least two TM Bad
Things. The first triggers on the rfid to return to userspace as
start_thread() has given the new process a 'clean' MSR but the suspend
will still be set in the hardware MSR. The second TM Bad Thing triggers
in __switch_to() as the processor is still transactionally suspended but
__switch_to() wants to zero the TM sprs for the new process.
This is an example of the outcome of calling exec() with a suspended
transaction. Note the first 700 is likely the first TM bad thing
decsribed earlier only the kernel can't report it as we've loaded
userspace registers. c000000000009980 is the rfid in
fast_exception_return()
The recent commit 7cc851039d64 ("powerpc/pseries: Add POWER8NVL support
to ibm,client-architecture-support call") added a new PVR mask & value
to the start of the ibm_architecture_vec[] array.
However it missed the fact that further down in the array, we hard code
the offset of one of the fields, and then at boot use that value to
patch the value in the array. This means every update to the array must
also update the #define, ugh.
This means that on pseries machines we will misreport to firmware the
number of cores we support, by a factor of threads_per_core.
Fix it for now by updating the #define.
Fixes: 7cc851039d64 ("powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
When this code was reworked for IBoE support the order of assignments
for the sl_tclass_flowlabel got flipped around resulting in
TClass & FlowLabel being permanently set to 0 in the packet headers.
This breaks IB routers that rely on these headers, but only affects
kernel users - libmlx4 does this properly for user space.
Fixes: fa417f7b520e ("IB/mlx4: Add support for IBoE") Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
If a user space program (e.g., wpa_supplicant) deletes a STA entry that
is currently in NL80211_PLINK_ESTAB state, the number of established
plinks counter was not decremented and this could result in rejecting
new plink establishment before really hitting the real maximum plink
limit. For !user_mpm case, this decrementation is handled by
mesh_plink_deactive().
Fix this by decrementing estab_plinks on STA deletion
(mesh_sta_cleanup() gets called from there) so that the counter has a
correct value and the Beacon frame advertisement in Mesh Configuration
element shows the proper value for capability to accept additional
peers.
Signed-off-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
A wmediumd that does not send this attribute causes a NULL pointer
dereference, as the attribute is accessed even if it does not exist.
The attribute was required but never checked ever since userspace frame
forwarding has been introduced. The issue gets more problematic once we
allow wmediumd registration from user namespaces.
Fixes: 7882513bacb1 ("mac80211_hwsim driver support userspace frame tx/rx") Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Fix both of these by unconditionally flushing paths before destroying
the sta, and by adding a synchronize_net() after path flush to ensure
no active readers can still dereference the sta.
Reported-by: Fred Veldini <fred.veldini@gmail.com> Tested-by: Fred Veldini <fred.veldini@gmail.com> Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
The header field is defined as u8[] but also accessed as struct
ieee80211_hdr. Enforce an alignment of 2 to prevent unnecessary
unaligned accesses, which can be very harmful for performance on many
platforms.
Fixes: e495c24731a2 ("mac80211: extend fast-xmit for more ciphers") Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Paolo Pisati [Thu, 28 Jul 2016 13:45:18 +0000 (15:45 +0200)]
UBUNTU: SAUCE: snapcraft: cleanup and remove unnecessary elements
1) the branch name is only useful when distributing the snapcraft.yaml alone -
when it is part of a git tree, the branch name should be omitted if the
intention is to snap the branch containing the snapcraft.yaml file itself
3) d101m_ucode.bin, d101s_ucode.bin and d102e_ucode.bin are only necessary when
using an Intel(R) PRO/100 nic, while the main amd64 target is kvm (and that uses
an Intel e1000 firmware-less nic)
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
UBUNTU: SAUCE: (namespace) fuse: Permit requests from other pid namespaces
BugLink: http://bugs.launchpad.net/bugs/1605344
As a precaution, the pid namespace support in fuse was written
to refuse to send requests from processes whose pid has no
mapping into the pid namespace of the userspace fuse process.
This has caused a regression for at least one user, who is
mounting a fuse filesystem within a container and exporting
a file within the fuse fs to the host via a loop device.
Change this to send the request when the pid has no mapping and
fill in the pid field in the fuse request with 0. This behavior
was settled on in consultation with upstream. The risk of doing
this is that a fuse fs which receives this invalid pid might not
be prepared to handle it, but it would already be receiving pids
not valid in its namespace if used in this manner.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Fix a memory leak on probe error of the airspy usb device driver.
The problem is triggered when more than 64 usb devices register with
v4l2 of type VFL_TYPE_SDR or VFL_TYPE_SUBDEV.
The memory leak is caused by the probe function of the airspy driver
mishandeling errors and not freeing the corresponding control structures
when an error occours registering the device to v4l2 core.
A badusb device can emulate 64 of these devices, and then through
continual emulated connect/disconnect of the 65th device, cause the
kernel to run out of RAM and crash the kernel, thus causing a local DOS
vulnerability.
Fixes CVE-2016-5400
Signed-off-by: James Patrick-Evans <james@jmp-e.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org # 3.17+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit aa93d1fee85c890a34f2510a310e55ee76a27848)
CVE-2016-5400 Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
UBUNTU: SAUCE: xenbus: Use proc_create_mount_point() to create /proc/xen
BugLink: http://bugs.launchpad.net/bugs/1607374
Mounting proc in user namespace containers fails if the xenbus
filesystem is mounted on /proc/xen because this directory fails
the "permanently empty" test. proc_create_mount_point() exists
specifically to create such mountpoints in proc but is currently
proc-internal. Export this interface to modules, then use it in
xenbus when creating /proc/xen.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Alex Hung [Thu, 28 Jul 2016 02:47:54 +0000 (10:47 +0800)]
ideapad_laptop: Add an event for mic mute hotkey
BugLink: http://bugs.launchpad.net/bugs/1607153
Newer ideapads support a new mic hotkey implemented via an ACPI
interface. This patch converts the mic mute event to a keycode
KEY_MICMUTE.
Signed-off-by: Alex Hung <alex.hung@canonical.com> Acked-by: Ike Panhc <ike.pan@canonical.com> Signed-off-by: Darren Hart <dvhart@linux.intel.com>
(cherry picked from commit 48f67d62194952617dcade08194abc7f5cb3f50c) Signed-off-by: Alex Hung <alex.hung@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
block: atari: Return early for unsupported sector size
BugLink: http://bugs.launchpad.net/bugs/1604995
For 4K LBA or very large disks, atari_partition can easily get tricked
into thinking it has found an Atari partition table. Depending on the
data in the disk, it ends up creating partitions with awkward lengths.
Since the atari partition, according to the spec, doesn't even support
sector sizes with more than 512, a quick sanity check is reasonable to
just bail out early, before even attempting to read sector 0.
Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from linux-next commit 9c8747167e204b3fe0ff143e1642c921e382e1d6) Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Stefan Bader [Tue, 21 Apr 2015 12:54:52 +0000 (14:54 +0200)]
UBUNTU: SAUCE: vesafb: Set mtrr:3 (write-combining) as default
We had this setting from Oneiric to Saucy[1]. In Saucy it was part
of the combined patch to make vesafb a module, which was unnessesary
in Trusty as we moved back to built-in.
Was added back in Trusty but again lost after Vivid.
It will still be possible to disable mtrr by putting
video=vesafb:nomtrr or video=vesafb:mtrr:<x>
on the kernel command-line but without that knowledge the default
will be noticably slower (especially on higher resolutions).
BugLink: http://bugs.launchpad.net/bugs/1605662
(forward-ported from trusty by adapting file location) Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Keith Busch [Tue, 19 Jul 2016 21:36:59 +0000 (15:36 -0600)]
NVMe: Always use MSI/MSI-x interrupts
BugLink: http://bugs.launchpad.net/bugs/1602724
Multiple users have reported device initialization failure due the driver
not receiving legacy PCI interrupts. This is not unique to any particular
controller, but has been observed on multiple platforms.
There have been no issues reported or observed when with message signaled
interrupts, so this patch attempts to use MSI-x during initialization,
falling back to MSI. If that fails, legacy would become the default.
The setup_io_queues error handling had to change as a result: the admin
queue's msix_entry used to be initialized to the legacy IRQ. The case
where nr_io_queues is 0 would fail request_irq when setting up the admin
queue's interrupt since re-enabling MSI-x fails with 0 vectors, leaving
the admin queue's msix_entry invalid. Instead, return success immediately.
Reported-by: Tim Muhlemmer <muhlemmer@gmail.com> Reported-by: Jon Derrick <jonathan.derrick@intel.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
(back ported from commit a5229050b69cfffb690b546c357ca5a60434c0c8) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Conflicts:
drivers/nvme/host/pci.c
Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
nvme: Avoid reset work on watchdog timer function during error recovery
BugLink: http://bugs.launchpad.net/bugs/1602724
This patch adds a check on nvme_watchdog_timer() function to avoid the
call to reset_work() when an error recovery process is ongoing on
controller. The check is made by looking at pci_channel_offline()
result.
If we don't check for this on nvme_watchdog_timer(), error recovery
mechanism can't recover well, because reset_work() won't be able to
do its job (since we're in the middle of an error) and so the
controller is removed from the system before error recovery mechanism
can perform slot reset (which would allow the adapter to recover).
In this patch we also have split the huge condition expression on
nvme_watchdog_timer() by introducing an auxiliary function to help
make the code more readable.
Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit c875a7093f0479215cf9bf51356d7638f2ec5746) Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Keith Busch [Tue, 19 Jul 2016 21:36:57 +0000 (15:36 -0600)]
NVMe: Fix reset/remove race
BugLink: http://bugs.launchpad.net/bugs/1602724
This fixes a scenario where device is present and being reset, but a
request to unbind the driver occurs.
A previous patch series addressing a device failure removal scenario
flushed reset_work after controller disable to unblock reset_work waiting
on a completion that wouldn't occur. This isn't safe as-is. The broken
scenario can potentially be induced with:
modprobe nvme && modprobe -r nvme
To fix, the reset work is flushed immediately after setting the controller
removing flag, and any subsequent reset will not proceed with controller
initialization if the flag is set.
The controller status must be polled while active, so the watchdog timer
is also left active until the controller is disabled to cleanup requests
that may be stuck during namespace removal.
[Fixes: ff23a2a15a2117245b4599c1352343c8b8fb4c43] Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 9bf2b972afeaffd173fe2ce211ebc555ea7e8a87) Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
nvme: replace the kthread with a per-device watchdog timer
BugLink: http://bugs.launchpad.net/bugs/1602724
The only work left in the kthread is the periodic health check for each
controller. There is no need to run this from process context or keep
a thread context around for it, so replace it with a simpler timer.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@fb.com>
(back ported from commit 2d55cd5f511d6fc377734473b237ac50820bfb9f) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Conflicts:
drivers/nvme/host/pci.c
Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
nvme: use a work item to submit async event requests
BugLink: http://bugs.launchpad.net/bugs/1602724
Use a dedicated work item to submit async event requests instead of the
global kthread. This simplifies the code and reduces the latencies to
resubmit a request once an even notification happened.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 9396dec916c052855dbb5b876c13d163df397319) Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Using SEEK_DATA in a huge sparse file can easily lead to sotflockups as
ext4_seek_data() iterates hole block-by-block. Fix the problem by using
returned hole size from ext4_map_blocks() and thus skip the hole in one
go.
Update also SEEK_HOLE implementation to follow the same pattern as
SEEK_DATA to make future maintenance easier.
Furthermore we add cond_resched() to both ext4_seek_data() and
ext4_seek_hole() to avoid softlockups in case evil user creates huge
fragmented file and we have to go through lots of extents.
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Currently, ext4_map_blocks() just returns 0 when it finds a hole and
allocation is not requested. However we have all the information
available to tell how large the hole actually is and there are callers
of ext4_map_blocks() which would save some block-by-block hole iteration
if they knew this information. So fill in struct ext4_map_blocks even
for holes with the information we have. We keep returning 0 for holes to
maintain backward compatibility of the function.
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
ext4_ext_put_gap_in_cache() determines hole size in the extent tree,
then trims this with possible delayed allocated blocks, and inserts the
result into the extent status tree. Factor out determination of the size
of the hole in the extent tree as we will need this information in
ext4_ext_map_blocks() as well.
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Andy Whitcroft [Tue, 19 Jul 2016 15:20:25 +0000 (16:20 +0100)]
UBUNTU: avoid duplicate CVE numbers in changelog
BugLink: http://bugs.launchpad.net/bugs/1604344 Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
BugLink: http://bugs.launchpad.net/bugs/1603449
The PE primary bus cannot be got from its child devices when having
full hotplug in error recovery. The PE primary bus is cached, which
is done in commit <05ba75f84864> ("powerpc/eeh: Fix stale cached primary
bus"). In eeh_reset_device(), the flag (EEH_PE_PRI_BUS) is cleared
before the PCI hot remove. eeh_pe_bus_get() then returns NULL as the
PE primary bus in pnv_eeh_reset() and it crashes the kernel eventually.
This fixes the issue by clearing the flag (EEH_PE_PRI_BUS) before the
PCI hot add. With it, the PowerNV EEH reset backend (pnv_eeh_reset())
can get valid PE primary bus through eeh_pe_bus_get().
Fixes: 67086e32b564 ("powerpc/eeh: powerpc/eeh: Support error recovery for VF PE") Reported-by: Pridhiviraj Paidipeddi <ppaiddipe@in.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
(backported from commit a3aa256b7258b3d19f8b44557cc64525a993b941) Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
powerpc/pci: Assign fixed PHB number based on device-tree properties
BugLink: http://bugs.launchpad.net/bugs/1603574
The domain/PHB field of PCI addresses has its value obtained from a
global variable, incremented each time a new domain (represented by
struct pci_controller) is added on the system. The domain addition
process happens during boot or due to PHB hotplug add.
As recent kernels are using predictable naming for network interfaces,
the network stack is more tied to PCI naming. This can be a problem in
hotplug scenarios, because PCI addresses will change if devices are
removed and then re-added. This situation seems unusual, but it can
happen if a user wants to replace a NIC without rebooting the machine,
for example.
This patch changes the way PCI domain values are generated: now, we use
device-tree properties to assign fixed PHB numbers to PCI addresses
when available (meaning pSeries and PowerNV cases). We also use a bitmap
to allow dynamic PHB numbering when device-tree properties are not
used. This bitmap keeps track of used PHB numbers and if a PHB is
released (by hotplug operations for example), it allows the reuse of
this PHB number, avoiding PCI address to change in case of device remove
and re-add soon after. No functional changes were introduced.
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Ian Munsie <imunsie@au1.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
[mpe: Drop unnecessary machine_is(pseries) test] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from linux-next commit 63a72284b159c569ec52f380c9a8dd9342d43bb8) Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
nvme/quirk: Add a delay before checking for adapter readiness
BugLink: http://bugs.launchpad.net/bugs/1602726
When disabling the controller, the specification says the register
NVME_REG_CC should be written and then driver needs to wait the
adapter to be ready, which is checked by reading another register
bit (NVME_CSTS_RDY). There's a timeout validation in this checking,
so in case this timeout is reached the driver gives up and removes
the adapter from the system.
After a firmware activation procedure, the PCI_DEVICE(0x1c58, 0x0003)
(HGST adapter) end up being removed if we issue a reset_controller,
because driver keeps verifying the NVME_REG_CSTS until the timeout is
reached. This patch adds a necessary quirk for this adapter, by
introducing a delay before nvme_wait_ready(), so the reset procedure
is able to be completed. This quirk is needed because just increasing
the timeout is not enough in case of this adapter - the driver must
wait before start reading NVME_REG_CSTS register on this specific
device.
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
(back ported from linux-next commit 54adc01055b75ec8769c5a36574c7a0895c0c0b2) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Conflicts:
drivers/nvme/host/nvme.h
Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Keith Busch [Mon, 18 Jul 2016 15:27:03 +0000 (09:27 -0600)]
NVMe: Create discard zero quirk white list
BugLink: http://bugs.launchpad.net/bugs/1602726
The NVMe specification does not require discarded blocks return zeroes on
read, but provides that behavior as a possibility. Some applications more
efficiently use an SSD if reads on discarded blocks were deterministically
zero, based on the "discard_zeroes_data" queue attribute.
There is no specification defined way to determine device behavior on
discarded blocks, so the driver always left the queue setting disabled. We
can only know behavior based on individual device models, so this patch
adds a flag to the NVMe "quirk" list that vendors may set if they know
their controller works that way. The patch also sets the new flag for one
such known device.
Signed-off-by: Keith Busch <keith.busch@intel.com> Suggested-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 08095e70783f1d8296f858d37a9e1878f5da0623) Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Samuel Gauthier [Mon, 18 Jul 2016 14:14:38 +0000 (08:14 -0600)]
openvswitch: fix conntrack netlink event delivery
BugLink: http://bugs.launchpad.net/bugs/1603468
Only the first and last netlink message for a particular conntrack are
actually sent. The first message is sent through nf_conntrack_confirm when
the conntrack is committed. The last one is sent when the conntrack is
destroyed on timeout. The other conntrack state change messages are not
advertised.
When the conntrack subsystem is used from netfilter, nf_conntrack_confirm
is called for each packet, from the postrouting hook, which in turn calls
nf_ct_deliver_cached_events to send the state change netlink messages.
This commit fixes the problem by calling nf_ct_deliver_cached_events in the
non-commit case as well.
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action") CC: Joe Stringer <joestringer@nicira.com> CC: Justin Pettit <jpettit@nicira.com> CC: Andy Zhou <azhou@nicira.com> CC: Thomas Graf <tgraf@suug.ch> Signed-off-by: Samuel Gauthier <samuel.gauthier@6wind.com> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
(back ported from commit d913d3a763a6f66a862a6eafcf6da89a7905832a) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Conflicts:
net/openvswitch/conntrack.c
Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
cxl: Ignore CAPI adapters misplaced in switched slots
BugLink: http://bugs.launchpad.net/bugs/1602785
One should not attempt to switch a PHB into CAPI mode if there is
a switch between the PHB and the adapter. This patch modifies the
cxl driver to ignore CAPI adapters misplaced in switched slots.
Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from linux-next commit 3b3dcd61fa4e3604d8f1bdfd8471fca7b7c012e4) Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Ashutosh Dixit [Wed, 20 Jul 2016 15:15:37 +0000 (16:15 +0100)]
misc: mic: Fix for double fetch security bug in VOP driver
The MIC VOP driver does two successive reads from user space to read a
variable length data structure. Kernel memory corruption can result if
the data structure changes between the two reads. This patch disallows
the chance of this happening.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=116651
Reported by: Pengfei Wang <wpengfeinudt@gmail.com> Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(backported from commit 9bf292bfca94694a721449e3fd752493856710f6)
[ luis: apply changes to mic_copy_dp_entry(), in file
drivers/misc/mic/host/mic_virtio.c; adjust context ]
CVE-2016-5728 Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Kangjie Lu [Wed, 20 Jul 2016 11:08:57 +0000 (12:08 +0100)]
rds: fix an infoleak in rds_inc_info_copy
The last field "flags" of object "minfo" is not initialized.
Copying this object out may leak kernel stack data.
Assign 0 to it to avoid leak.
Signed-off-by: Kangjie Lu <kjlu@gatech.edu> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 4116def2337991b39919f3b448326e21c40e0dbb)
CVE-2016-5244 BugLink: https://bugs.launchpad.net/bugs/1589041 Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Peter Zijlstra [Tue, 26 Jul 2016 08:33:17 +0000 (16:33 +0800)]
UBUNTU: SAUCE: x86/cpu: Add workaround for MONITOR instruction erratum on Goldmont based CPUs
BugLink: http://bugs.launchpad.net/bugs/1606147
Monitored cached line may not wake up from mwait on certain
Goldmont based CPUs. This patch will avoid calling
current_set_polling_and_test() and thereby not set the TIF_ flag.
The result is that we'll always send IPIs for wakeups.
Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1468867270-18493-1-git-send-email-jacob.jun.pan@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
(backported from linux-next commit 08e237fa56a1d95c1372033bc29c4a2517b3c0fa) Signed-off-by: Phidias Chiang <phidias.chiang@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Andy Whitcroft <apw@canonical.com Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
We have a boatload of open-coded family-6 model numbers. Half of
them have these model numbers in hex and the other half in
decimal. This makes grepping for them tons of fun, if you were
to try.
Solution:
Consolidate all the magic numbers. Put all the definitions in
one header.
The names here are closely derived from the comments describing
the models from arch/x86/events/intel/core.c. We could easily
make them shorter by doing things like s/SANDYBRIDGE/SNB/, but
they seemed fine even with the longer versions to me.
Do not take any of these names too literally, like "DESKTOP"
or "MOBILE". These are all colloquial names and not precise
descriptions of everywhere a given model will show up.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Eduardo Valentin <edubezval@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com> Cc: Souvik Kumar Chakravarty <souvik.k.chakravarty@intel.com> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Vishwanath Somayaji <vishwanath.somayaji@intel.com> Cc: Zhang Rui <rui.zhang@intel.com> Cc: jacob.jun.pan@intel.com Cc: linux-acpi@vger.kernel.org Cc: linux-edac@vger.kernel.org Cc: linux-mmc@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: platform-driver-x86@vger.kernel.org Link: http://lkml.kernel.org/r/20160603001927.F2A7D828@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 970442c599b22ccd644ebfe94d1d303bf6f87c05) Signed-off-by: Phidias Chiang <phidias.chiang@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Andy Whitcroft <apw@canonical.com Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
UBUNTU: SAUCE: (namespace) Bypass sget() capability check for nfs
BugLink: http://bugs.launchpad.net/bugs/1603719 302cabb "UBUNTU: SAUCE: (namespace) Sync with upstream s_user_ns
patches" added a capability check to sget() which causes a
regression for automatic submounts, which may happen in the
context of an unprivileged user. The capability check is not
necessary in this case.
The check can be bypassed by using sget_userns() instead.
init_user_namespace should be used for the user ns since nfs does
not support unprivileged mounting. This change makes the nfs
mount behavior in xenial functionally identical to upstream.
Acked-by: Andy Whitcroft <apw@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Jason Gerecke [Tue, 19 Jul 2016 14:55:44 +0000 (09:55 -0500)]
HID: wacom: Support switching from vendor-defined device mode on G9 and G11
BugLink: http://bugs.launchpad.net/bugs/1603975
A tablet PC booted into Windows may have its pen/touch hardware switched
into "Wacom mode" similar to what we do with explicitly-supported hardware.
Some devices appear to maintain this state across reboots, preventing their
use with the generic HID driver. This patch adds support for detecting the
presence of the mode switch feature report used by devices based on the G9
and G11 chips and has the HID codepath always attempt to reset the device
back to sending standard HID reports.
Fixes: https://sourceforge.net/p/linuxwacom/bugs/307/ Fixes: https://sourceforge.net/p/linuxwacom/bugs/310/ Fixes: https://github.com/linuxwacom/input-wacom/issues/15 Co-authored-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
(cherry picked from commit 326ea2a90500fe4add86c5fb95d914d46910e780) Signed-off-by: Chris J Arges <chris.j.arges@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Jason Gerecke [Tue, 19 Jul 2016 14:55:43 +0000 (09:55 -0500)]
HID: wacom: Initialize hid_data.inputmode to -1
BugLink: http://bugs.launchpad.net/bugs/1603975
Commit 5ae6e89 introduced hid_data.inputmode with a comment that it
would have the value -1 if undefined, but then forgot to actually
perform the initialization. Although this doesn't appear to have
caused any problems in practice, it should still be remedied.
Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
(cherry picked from commit c6fa1aeba02111ed8676494ac7cd453a03efef3c) Signed-off-by: Chris J Arges <chris.j.arges@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
HID: wacom: break out parsing of device and registering of input
BugLink: http://bugs.launchpad.net/bugs/1603975
Simplifies the .probe() and will allow to reuse this path in the future.
Few things are reshuffled in .probe():
- init wacom struct earlier
- then retrieve the report descriptor
- then parse it and allocate/register inputs.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Acked-by: Ping Cheng <pingc@wacom.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
(cherry picked from commit c58ac3a88d1e8a44fed152e80bf525a66a5647e2) Signed-off-by: Chris J Arges <chris.j.arges@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
This program is free software; you can redistribute it and/or modify it
under the terms and conditions of the GNU General Public License,
version 2, as published by the Free Software Foundation.
This program is distributed in the hope it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details.
Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Richard Alpe [Thu, 14 Jul 2016 14:02:07 +0000 (15:02 +0100)]
tipc: fix nl compat regression for link statistics
Fix incorrect use of nla_strlcpy() where the first NLA_HDRLEN bytes
of the link name where left out.
Making the output of tipc-config -ls look something like:
Link statistics:
dcast-link
1:data0-1.1.2:data0
1:data0-1.1.3:data0
Also, for the record, the patch that introduce this regression
claims "Sending the whole object out can cause a leak". Which isn't
very likely as this is a compat layer, where the data we are parsing
is generated by us and we know the string to be NULL terminated. But
you can of course never be to secure.
Fixes: 5d2be1422e02 (tipc: fix an infoleak in tipc_nl_compat_link_dump) Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 55e77a3e8297581c919b45adcc4d0815b69afa84)
CVE-2016-5243 BugLink: https://bugs.launchpad.net/bugs/1589036 Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Kangjie Lu [Thu, 14 Jul 2016 14:02:06 +0000 (15:02 +0100)]
tipc: fix an infoleak in tipc_nl_compat_link_dump
link_info.str is a char array of size 60. Memory after the NULL
byte is not initialized. Sending the whole object out can cause
a leak.
Signed-off-by: Kangjie Lu <kjlu@gatech.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 5d2be1422e02ccd697ccfcd45c85b4a26e6178e2)
CVE-2016-5243 BugLink: https://bugs.launchpad.net/bugs/1589036 Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Dan Carpenter [Wed, 13 Jul 2016 10:43:47 +0000 (11:43 +0100)]
KEYS: potential uninitialized variable
If __key_link_begin() failed then "edit" would be uninitialized. I've
added a check to fix that.
This allows a random user to crash the kernel, though it's quite
difficult to achieve. There are three ways it can be done as the user
would have to cause an error to occur in __key_link():
(1) Cause the kernel to run out of memory. In practice, this is difficult
to achieve without ENOMEM cropping up elsewhere and aborting the
attempt.
(2) Revoke the destination keyring between the keyring ID being looked up
and it being tested for revocation. In practice, this is difficult to
time correctly because the KEYCTL_REJECT function can only be used
from the request-key upcall process. Further, users can only make use
of what's in /sbin/request-key.conf, though this does including a
rejection debugging test - which means that the destination keyring
has to be the caller's session keyring in practice.
(3) Have just enough key quota available to create a key, a new session
keyring for the upcall and a link in the session keyring, but not then
sufficient quota to create a link in the nominated destination keyring
so that it fails with EDQUOT.
The bug can be triggered using option (3) above using something like the
following:
echo 80 >/proc/sys/kernel/keys/root_maxbytes
keyctl request2 user debug:fred negate @t
The above sets the quota to something much lower (80) to make the bug
easier to trigger, but this is dependent on the system. Note also that
the name of the keyring created contains a random number that may be
between 1 and 10 characters in size, so may throw the test off by
changing the amount of quota used.
Assuming the failure occurs, something like the following will be seen:
Ben Hawkes says:
integer overflow in xt_alloc_table_info, which on 32-bit systems can
lead to small structure allocation and a copy_from_user based heap
corruption.
Reported-by: Ben Hawkes <hawkes@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
(cherry picked from commit d157bd761585605b7882935ffb86286919f62ea1)
CVE-2016-3135 BugLink: https://bugs.launchpad.net/bugs/1555353 Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Roman Kagan [Wed, 13 Jul 2016 10:44:28 +0000 (11:44 +0100)]
kvm:vmx: more complete state update on APICv on/off
The function to update APICv on/off state (in particular, to deactivate
it when enabling Hyper-V SynIC) is incomplete: it doesn't adjust
APICv-related fields among secondary processor-based VM-execution
controls. As a result, Windows 2012 guests get stuck when SynIC-based
auto-EOI interrupt intersected with e.g. an IPI in the guest.
In addition, the MSR intercept bitmap isn't updated every time "virtualize
x2APIC mode" is toggled. This path can only be triggered by a malicious
guest, because Windows didn't use x2APIC but rather their own synthetic
APIC access MSRs; however a guest running in a SynIC-enabled VM could
switch to x2APIC and thus obtain direct access to host APIC MSRs
(CVE-2016-4440).
The patch fixes those omissions.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com> Reported-by: Steve Rutherford <srutherford@google.com> Reported-by: Yang Zhang <yang.zhang.wz@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 3ce424e45411cf5a13105e0386b6ecf6eeb4f66f)
CVE-2016-4440 BugLink: https://bugs.launchpad.net/bugs/1584192 Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Andy Shevchenko [Wed, 13 Jul 2016 09:36:13 +0000 (17:36 +0800)]
ACPI / LPSS: override power state for LPSS DMA device
BugLink: http://bugs.launchpad.net/bugs/1602579
This is a third approach to workaround long standing issue with LPSS on
BayTrail. First one [1] was reverted since it didn't resolve the issue
comprehensively. Second one [2] was rejected by internal review.
The LPSS DMA controller does not have neither _PS0 nor _PS3 method. Moreover it
can be powered off automatically whenever the last LPSS device goes down. In
case of no power any access to the DMA controller will hang the system. The
behaviour is reproduced on some HP laptops based on Intel BayTrail [3,4] as
well as on ASuS T100TA transformer.
Power on the LPSS island through the registers accessible in a specific way.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit eebb3e8d8aaf28f4bcaf12fd3645350bfd2f0b36) Signed-off-by: Hui Wang <hui.wang@canonical.com>
Conflicts:
arch/x86/include/asm/iosf_mbi.h
[To fix the conflicts and guarantee the success of building after
applying this patch, I copied _READ and _WRITE definitions from
the commit 4077a387 into this patch].
Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Andy Shevchenko [Wed, 13 Jul 2016 09:36:12 +0000 (17:36 +0800)]
dmaengine: dw: platform: power on device on shutdown
BugLink: http://bugs.launchpad.net/bugs/1602579
We have to call dw_dma_disable() to stop any ongoing transfer. On some
platforms we can't do that since DMA device is powered off. Moreover we have no
possibility at that point to check if the platform is affected or not. That's
why we call pm_runtime_get_sync() / pm_runtime_put() unconditionally. On the
other hand we can't use pm_runtime_suspended() because runtime PM framework is
not fully used by the driver.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 3214658855c01a0dd62f02feb2ce79846524c6a0) Signed-off-by: Hui Wang <hui.wang@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Allen Hung [Wed, 13 Jul 2016 08:45:34 +0000 (16:45 +0800)]
HID: multitouch: enable palm rejection for Windows Precision Touchpad
BugLink: http://bugs.launchpad.net/bugs/1593124
The usage Confidence is mandary to Windows Precision Touchpad devices. If
it is examined in input_mapping on a WIndows Precision Touchpad, a new add
quirk MT_QUIRK_CONFIDENCE desgned for such devices will be applied to the
device. A touch with the confidence bit is not set is determined as
invalid.
Tested on Dell XPS13 9343
Cc: stable@vger.kernel.org # v4.5+ Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Tested-by: Andy Lutomirski <luto@kernel.org> # XPS 13 9350, BIOS 1.4.3 Signed-off-by: Allen Hung <allen_hung@dell.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
(cherry picked from commit 6dd2e27a103d716921cc4a1a96a9adc0a8e3ab57) Signed-off-by: Phidias Chiang <phidias.chiang@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
The commit enables palm rejection for Win8 Precision Touchpad devices but
the quirk MT_QUIRK_VALID_IS_CONFIDENCE it is using is not working very
properly. This quirk is originally designed for some WIn7 touchscreens. Use
of this for a Win8 Precision Touchpad will cause unexpected pointer jumping
problem.
Cc: stable@vger.kernel.org # v4.5+ Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Tested-by: Andy Lutomirski <luto@kernel.org> # XPS 13 9350, BIOS 1.4.3 Signed-off-by: Allen Hung <allen_hung@dell.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
(cherry picked from commit 62630ea768869beeb1e514b0bf5607a0c9b93d12) Signed-off-by: Phidias Chiang <phidias.chiang@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
percpu: fix synchronization between synchronous map extension and chunk destruction
For non-atomic allocations, pcpu_alloc() can try to extend the area
map synchronously after dropping pcpu_lock; however, the extension
wasn't synchronized against chunk destruction and the chunk might get
freed while extension is in progress.
This patch fixes the bug by putting most of non-atomic allocations
under pcpu_alloc_mutex to synchronize against pcpu_balance_work which
is responsible for async chunk management including destruction.
percpu: fix synchronization between chunk->map_extend_work and chunk destruction
Atomic allocations can trigger async map extensions which is serviced
by chunk->map_extend_work. pcpu_balance_work which is responsible for
destroying idle chunks wasn't synchronizing properly against
chunk->map_extend_work and may end up freeing the chunk while the work
item is still in flight.
This patch fixes the bug by rolling async map extension operations
into pcpu_balance_work.
Signed-off-by: Tejun Heo <tj@kernel.org> Reported-and-tested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Reported-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Sasha Levin <sasha.levin@oracle.com> Cc: stable@vger.kernel.org # v3.18+ Fixes: 9c824b6a172c ("percpu: make sure chunk->map array has available space")
(cherry picked from commit 4f996e234dad488e5d9ba0858bc1bae12eff82c3)
CVE-2016-4794 BugLink: https://bugs.launchpad.net/bugs/1581871 Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Acked-by: Christopher Arges <chris.j.arges@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>