Let's fix this by using p*d_offset() instead of p*d_page_vaddr() for
page walk.
Reported-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Fixes: f2a6a7050109 ("x86: Convert the rest of the code to support p4d_t") Link: http://lkml.kernel.org/r/20170425092557.21852-1-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
Dan Williams reported dax-pmem kernel warnings with the following signature:
WARNING: CPU: 8 PID: 245 at lib/percpu-refcount.c:155 percpu_ref_switch_to_atomic_rcu+0x1f5/0x200
percpu ref (dax_pmem_percpu_release [dax_pmem]) <= 0 (0) after switching to atomic
... and bisected it to this commit, which suggests possible memory corruption
caused by the x86 fast-GUP conversion.
He also pointed out:
"
This is similar to the backtrace when we were not properly handling
pud faults and was fixed with this commit: 220ced1676c4 "mm: fix
get_user_pages() vs device-dax pud mappings"
I've found some missing _devmap checks in the generic
get_user_pages_fast() path, but this does not fix the regression
[...]
"
So given that there are known bugs, and a pretty robust looking bisection
points to this commit suggesting that are unknown bugs in the conversion
as well, revert it for the time being - we'll re-try in v4.13.
Reported-by: Dan Williams <dan.j.williams@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: aneesh.kumar@linux.vnet.ibm.com Cc: dann.frazier@canonical.com Cc: dave.hansen@intel.com Cc: steve.capper@linaro.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
x86/mm: Fix dump pagetables for 4 levels of page tables
Commit fdd3d8ce0ea62 ("x86/dump_pagetables: Add support for 5-level
paging") introduced an error for dumping with only 4 levels by setting
PGD_LEVEL_MULT to a wrong value.
This is leading to e.g. addresses printed as "(null)" for ranges:
x86/mm: Found insecure W+X mapping at address (null)/(null)
Make PGD_LEVEL_MULT a multiple of PTRS_PER_P4D instead of PTRS_PER_PUD
Fixes: fdd3d8ce0ea62 ("x86/dump_pagetables: Add support for 5-level paging") Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Link: http://lkml.kernel.org/r/20170412143634.6846-1-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
x86/mpx, selftests: Only check bounds-vs-shadow when we keep shadow
The check between the hardware state and our shadow of it is
checked in the signal handler for all bounds exceptions,
even for the ones where we don't keep the shadow up2date.
This is a problem because when no shadow is kept the handler
fails at this point and hides the real reason of the
exception.
Move the check into the code-path evaluating normal bounds
exceptions to prevent this.
Signed-off-by: Joerg Roedel <jroedel@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kselftest@vger.kernel.org Link: http://lkml.kernel.org/r/1491488598-27346-1-git-send-email-joro@8bytes.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
x86/mpx: Correctly report do_mpx_bt_fault() failures to user-space
When this function fails it just sends a SIGSEGV signal to
user-space using force_sig(). This signal is missing
essential information about the cause, e.g. the trap_nr or
an error code.
Fix this by propagating the error to the only caller of
mpx_handle_bd_fault(), do_bounds(), which sends the correct
SIGSEGV signal to the process.
Signed-off-by: Joerg Roedel <jroedel@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: fe3d197f84319 ('x86, mpx: On-demand kernel allocation of bounds tables') Link: http://lkml.kernel.org/r/1491488362-27198-1-git-send-email-joro@8bytes.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Merge branch 'x86/boot' into x86/mm, to avoid conflict
There's a conflict between ongoing level-5 paging support and
the E820 rewrite. Since the E820 rewrite is essentially ready,
merge it into x86/mm to reduce tree conflicts.
x86/mm: Define virtual memory map for 5-level paging
The first part of memory map (up to %esp fixup) simply scales existing
map for 4-level paging by factor of 9 -- number of bits addressed by
the additional page table level.
The rest of the map is unchanged.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20170330080731.65421-4-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
In this initial implementation we force-require 5-level paging support
from the hardware, when compiled with CONFIG_X86_5LEVEL=y. (The kernel
will panic during boot on CPUs that don't support 5-level paging.)
We will implement boot-time switch between 4- and 5-level paging later.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20170330080731.65421-2-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
Wei Yang [Tue, 14 Mar 2017 03:08:01 +0000 (11:08 +0800)]
x86/mm/numa: Remove numa_nodemask_from_meminfo()
numa_nodemask_from_meminfo() generates a nodemask of nodes which have
memory according to a meminfo descriptor.
The two callsites of that function both set bits in copies of the
numa_nodes_parsed nodemask. In both cases, the information in supplied
numa_meminfo is a subset of numa_nodes_parsed. So setting those bits
again is not really necessary.
Here are the three call paths which show that the supplied numa_meminfo
argument describes memory regions in nodes which are already in
numa_nodes_parsed:
Thus, in all three cases, the respective bit in numa_nodes_parsed is
set, which means it is not necessary to set it again in a copy of
numa_nodes_parsed.
alloc_node_data() tries to allocate from the local node first and, if
that attempt fails, falls back to any node. Improve the error message to
issue the initial node for ease during debugging.
Merge tag 'dmaengine-fix-4.11-rc5' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine fixes from Vinod Koul:
"A couple of minor fixes for 4.11:
- array bound fix for __get_unmap_pool()
- cyclic period splitting for bcm2835"
* tag 'dmaengine-fix-4.11-rc5' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: Fix array index out of bounds warning in __get_unmap_pool()
dmaengine: bcm2835: Fix cyclic DMA period splitting
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"This update provides:
- prevent KASLR from randomizing EFI regions
- restrict the usage of -maccumulate-outgoing-args and document when
and why it is required.
- make the Global Physical Address calculation for UV4 systems work
correctly.
- address a copy->paste->forgot-edit problem in the MCE exception
table entries.
- assign a name to AMD MCA bank 3, so the sysfs file registration
works.
- add a missing include in the boot code"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot: Include missing header file
x86/mce/AMD: Give a name to MCA bank 3 when accessed with legacy MSRs
x86/build: Mostly disable '-maccumulate-outgoing-args'
x86/mm/KASLR: Exclude EFI region from KASLR VA space randomization
x86/mce: Fix copy/paste error in exception table entries
x86/platform/uv: Fix calculation of Global Physical Address
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Thomas Gleixner:
"This update provides:
- make the scheduler clock switch to unstable mode smooth so the
timestamps stay at microseconds granularity instead of switching to
tick granularity.
- unbreak perf test tsc by taking the new offset into account which
was added in order to proveide better sched clock continuity
- switching sched clock to unstable mode runs all clock related
computations which affect the sched clock output itself from a work
queue. In case of preemption sched clock uses half updated data and
provides wrong timestamps. Keep the math in the protected context
and delegate only the static key switch to workqueue context.
- remove a duplicate header include"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/headers: Remove duplicate #include <linux/sched/debug.h> line
sched/clock: Fix broken stable to unstable transfer
sched/clock, x86/perf: Fix "perf test tsc"
sched/clock: Fix clear_sched_clock_stable() preempt wobbly
Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fix from Thomas Gleixner:
"Downgrade the missing ESRT header printk to warning level and remove a
useless error printk which just generates noise for no value"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/esrt: Cleanup bad memory map log messages
Merge branch 'parisc-4.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
"Al Viro reported that - in case of read faults - our copy_from_user()
implementation may claim to have copied more bytes than it actually
did. In order to fix this bug and because of the way how gcc optimizes
register usage for inline assembly in C code, we had to replace our
pa_memcpy() function with a pure assembler implementation.
While fixing the memcpy bug we noticed some other issues with our
get_user() and put_user() functions, e.g. nested faults may return
wrong data. This is now fixed by a common fixup handler for
get_user/put_user in the exception handler which additionally makes
generated code smaller and faster.
The third patch is a trivial one-line fix for a patch which went in
during 4.11-rc and which avoids stalled CPU warnings after power
shutdown (for parisc machines which can't plug power off themselves).
Due to the rewrite of pa_memcpy() into assembly this patch got bigger
than what I wanted to have sent at this stage.
Those patches have been running in production during the last few days
on our debian build servers without any further issues"
* 'parisc-4.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Avoid stalled CPU warnings after system shutdown
parisc: Clean up fixup routines for get_user()/put_user()
parisc: Fix access fault handling in pa_memcpy()
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Thirteen small fixes: The hopefully final effort to get the lpfc nvme
kconfig problems sorted, there's one important sg fix (user can induce
read after end of buffer) and one minor enhancement (adding an extra
PCI ID to qedi). The rest are a set of minor fixes, which mostly occur
as user visible in error legs or on specific devices"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: remove the duplicated checking for supporting clkscaling
scsi: lpfc: fix building without debugfs support
scsi: lpfc: Fix PT2PT PRLI reject
scsi: hpsa: fix volume offline state
scsi: libsas: fix ata xfer length
scsi: scsi_dh_alua: Warn if the first argument of alua_rtpg_queue() is NULL
scsi: scsi_dh_alua: Ensure that alua_activate() calls the completion function
scsi: scsi_dh_alua: Check scsi_device_get() return value
scsi: sg: check length passed to SG_NEXT_CMD_LEN
scsi: ufshcd-platform: remove the useless cast in ERR_PTR/IS_ERR
scsi: qedi: Add PCI device-ID for QL41xxx adapters.
scsi: aacraid: Fix potential null access
scsi: qla2xxx: Fix crash in qla2xxx_eh_abort on bad ptr
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
kasan: do not sanitize kexec purgatory
drivers/rapidio/devices/tsi721.c: make module parameter variable name unique
mm/hugetlb.c: don't call region_abort if region_chg fails
kasan: report only the first error by default
hugetlbfs: initialize shared policy as part of inode allocation
mm: fix section name for .data..ro_after_init
mm, hugetlb: use pte_present() instead of pmd_present() in follow_huge_pmd()
mm: workingset: fix premature shadow node shrinking with cgroups
mm: rmap: fix huge file mmap accounting in the memcg stats
mm: move mm_percpu_wq initialization earlier
mm: migrate: fix remove_migration_pte() for ksm pages
Merge tag 'usb-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB fixes for 4.11-rc5.
The usual xhci fixes are here, as well as a fix for yet-another-bug-
found-by-KASAN, those developers are doing great stuff here.
And there's a phy build warning fix that showed up in 4.11-rc1.
All of these have been in linux-next with no reported issues"
* tag 'usb-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: phy: isp1301: Fix build warning when CONFIG_OF is disabled
xhci: Manually give back cancelled URB if we can't queue it for cancel
xhci: Set URB actual length for stopped control transfers
xhci: plat: Register shutdown for xhci_plat
USB: fix linked-list corruption in rh_call_control()
Merge tag 'tty-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are some small fixes for some serial drivers and Kconfig help
text for 4.11-rc5. Nothing major here at all, a few things resolving
reported bugs in some random serial drivers.
I don't think these made the last linux-next due to me getting to them
yesterday, but I am not sure, they might have snuck in. The patches
only affect drivers that the maintainers of sent me these patches for,
so we should be safe here :)"
* tag 'tty-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty: pl011: fix earlycon work-around for QDF2400 erratum 44
serial: 8250_EXAR: fix duplicate Kconfig text and add missing help text
tty/serial: atmel: fix TX path in atmel_console_write()
tty/serial: atmel: fix race condition (TX+DMA)
serial: mxs-auart: Fix baudrate calculation
Merge tag 'acpi-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix two issues related to IOAPIC hotplug, an overzealous build
optimization that prevents the function graph tracer from working with
the ACPI subsystem correctly and an RCU synchronization issue in the
ACPI APEI code.
Specifics:
- drop the unconditional setting of the '-Os' gcc flag from the ACPI
Makefile to make the function graph tracer work correctly with the
ACPI subsystem (Josh Poimboeuf).
- add missing synchronize_rcu() to ghes_remove() which removes an
element from an RCU-protected list, but fails to synchronize it
properly afterward (James Morse).
- fix two problems related to IOAPIC hotplug, a local variable
initialization in setup_res() and the creation of platform device
objects for IO(x)APICs which are (a) unused and (b) leaked on
hot-removal (Joerg Roedel)"
* tag 'acpi-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: Fix incompatibility with mcount-based function graph tracing
ACPI / APEI: Add missing synchronize_rcu() on NOTIFY_SCI removal
ACPI: Do not create a platform_device for IOAPIC/IOxAPIC
ACPI: ioapic: Clear on-stack resource before using it
Merge tag 'pm-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix a cpufreq core issue with the initialization of the cpufreq
sysfs interface and a cpuidle powernv driver initialization issue.
Specifics:
- symbolic links from CPU directories to the corresponding cpufreq
policy directories in sysfs are not created during initialization
in some cases which confuses user space, so prevent that from
happening (Rafael Wysocki).
- the powernv cpuidle driver fails to pass a correct cpumaks to the
cpuidle core in some cases which causes subsequent failures to
occur, so fix it (Vaidyanathan Srinivasan)"
* tag 'pm-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpuidle: powernv: Pass correct drv->cpumask for registration
cpufreq: Fix creation of symbolic links to policy directories
Merge tag 'arc-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
"Accumulated fixes for ARC which I've been been sitting on for a while:
- reading clk from driver vs device tree [Vlad]
- fix support for UIO in VDK platform [Alexey]
- SLC busy bit reading workaround
- build warning with kprobes header reorg"
* tag 'arc-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: fix build warnings with !CONFIG_KPROBES
ARCv2: SLC: Make sure busy bit is set properly on SLC flushing
ARC: vdk: Fix support of UIO
ARCv2: make unimplemented vectors as no-ops rather than halt core
ARC: get rate from clk driver instead of reading device tree
ARC: [dts] add cpu nodes to ARCHS SMP device tree
ARC: [dts] add input clocks for cpu nodes
Merge tag 'nfsd-4.11-1' of git://linux-nfs.org/~bfields/linux
Pull nfsd fixes from Bruce Fields:
"The restriction of NFSv4 to TCP went overboard and also broke the
backchannel; fix.
Also some minor refinements to the nfsd version-setting interface that
we'd like to get fixed before release"
* tag 'nfsd-4.11-1' of git://linux-nfs.org/~bfields/linux:
svcrdma: set XPT_CONG_CTRL flag for bc xprt
NFSD: fix nfsd_reset_versions for NFSv4.
NFSD: fix nfsd_minorversion(.., NFSD_AVAIL)
NFSD: further refinement of content of /proc/fs/nfsd/versions
nfsd: map the ENOKEY to nfserr_perm for avoiding warning
SUNRPC/backchanel: set XPT_CONG_CTRL flag for bc xprt
Timur Tabi [Fri, 31 Mar 2017 22:05:02 +0000 (17:05 -0500)]
tty: pl011: fix earlycon work-around for QDF2400 erratum 44
The work-around for the Qualcomm Datacenter Technologies QDF2400
erratum 44 sets the "qdf2400_e44_present" global variable if the
work-around is needed. However, this check does not happen until after
earlycon is initialized, which means the work-around is not
used, and the console hangs as soon as it displays one character.
Fixes: d8a4995bcea1 ("tty: pl011: Work around QDF2400 E44 stuck BUSY bit") Signed-off-by: Timur Tabi <timur@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge branch 'for-linus-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"We have three small fixes queued up in my for-linus-4.11 branch"
* 'for-linus-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: fix an integer overflow check
btrfs: Change qgroup_meta_rsv to 64bit
Btrfs: bring back repair during read
Randy Dunlap [Fri, 31 Mar 2017 22:12:10 +0000 (15:12 -0700)]
drivers/rapidio/devices/tsi721.c: make module parameter variable name unique
kbuild test robot reported a non-static variable name collision between
a staging driver and a RapidIO driver, with a generic variable name of
'dbg_level'.
Both drivers should be changed so that they don't use this generic
public variable name. This patch fixes the RapidIO driver but does not
change the user interface (name) for the module parameter.
drivers/staging/built-in.o:(.bss+0x109d0): multiple definition of `dbg_level'
drivers/rapidio/built-in.o:(.bss+0x16c): first defined here
Mike Kravetz [Fri, 31 Mar 2017 22:12:07 +0000 (15:12 -0700)]
mm/hugetlb.c: don't call region_abort if region_chg fails
Changes to hugetlbfs reservation maps is a two step process. The first
step is a call to region_chg to determine what needs to be changed, and
prepare that change. This should be followed by a call to call to
region_add to commit the change, or region_abort to abort the change.
The error path in hugetlb_reserve_pages called region_abort after a
failed call to region_chg. As a result, the adds_in_progress counter in
the reservation map is off by 1. This is caught by a VM_BUG_ON in
resv_map_release when the reservation map is freed.
syzkaller fuzzer (when using an injected kmalloc failure) found this
bug, that resulted in the following:
Mark Rutland [Fri, 31 Mar 2017 22:12:04 +0000 (15:12 -0700)]
kasan: report only the first error by default
Disable kasan after the first report. There are several reasons for
this:
- Single bug quite often has multiple invalid memory accesses causing
storm in the dmesg.
- Write OOB access might corrupt metadata so the next report will print
bogus alloc/free stacktraces.
- Reports after the first easily could be not bugs by itself but just
side effects of the first one.
Given that multiple reports usually only do harm, it makes sense to
disable kasan after the first one. If user wants to see all the
reports, the boot-time parameter kasan_multi_shot must be used.
[aryabinin@virtuozzo.com: wrote changelog and doc, added missing include] Link: http://lkml.kernel.org/r/20170323154416.30257-1-aryabinin@virtuozzo.com Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Kravetz [Fri, 31 Mar 2017 22:12:01 +0000 (15:12 -0700)]
hugetlbfs: initialize shared policy as part of inode allocation
Any time after inode allocation, destroy_inode can be called. The
hugetlbfs inode contains a shared_policy structure, and
mpol_free_shared_policy is unconditionally called as part of
hugetlbfs_destroy_inode. Initialize the policy as part of inode
allocation so that any quick (error path) calls to destroy_inode will be
handed an initialized policy.
syzkaller fuzzer found this bug, that resulted in the following:
BUG: KASAN: user-memory-access in atomic_inc
include/asm-generic/atomic-instrumented.h:87 [inline] at addr 000000131730bd7a
BUG: KASAN: user-memory-access in __lock_acquire+0x21a/0x3a80
kernel/locking/lockdep.c:3239 at addr 000000131730bd7a
Write of size 4 by task syz-executor6/14086
CPU: 3 PID: 14086 Comm: syz-executor6 Not tainted 4.11.0-rc3+ #364
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
atomic_inc include/asm-generic/atomic-instrumented.h:87 [inline]
__lock_acquire+0x21a/0x3a80 kernel/locking/lockdep.c:3239
lock_acquire+0x1ee/0x590 kernel/locking/lockdep.c:3762
__raw_write_lock include/linux/rwlock_api_smp.h:210 [inline]
_raw_write_lock+0x33/0x50 kernel/locking/spinlock.c:295
mpol_free_shared_policy+0x43/0xb0 mm/mempolicy.c:2536
hugetlbfs_destroy_inode+0xca/0x120 fs/hugetlbfs/inode.c:952
alloc_inode+0x10d/0x180 fs/inode.c:216
new_inode_pseudo+0x69/0x190 fs/inode.c:889
new_inode+0x1c/0x40 fs/inode.c:918
hugetlbfs_get_inode+0x40/0x420 fs/hugetlbfs/inode.c:734
hugetlb_file_setup+0x329/0x9f0 fs/hugetlbfs/inode.c:1282
newseg+0x422/0xd30 ipc/shm.c:575
ipcget_new ipc/util.c:285 [inline]
ipcget+0x21e/0x580 ipc/util.c:639
SYSC_shmget ipc/shm.c:673 [inline]
SyS_shmget+0x158/0x230 ipc/shm.c:657
entry_SYSCALL_64_fastpath+0x1f/0xc2
Analysis provided by Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Link: http://lkml.kernel.org/r/1490477850-7944-1-git-send-email-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Michal Hocko <mhocko@suse.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The latter adds incorrect wrapping around the existing s390 section, and
came later. I'd prefer the s390 naming, so this moves the s390-specific
name up to the asm-generic/sections.h and renames the section as used by
kmemleak (and in the future, kernel/extable.c).
Link: http://lkml.kernel.org/r/20170327192213.GA129375@beast Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390 parts] Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Eddie Kovsky <ewk@edkovsky.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This bug is triggered when pmd_present() returns true for non-present
hugetlb, so fixing the present check in follow_huge_pmd() prevents it.
Using pmd_present() to determine present/non-present for hugetlb is not
correct, because pmd_present() checks multiple bits (not only
_PAGE_PRESENT) for historical reason and it can misjudge hugetlb state.
Fixes: e66f17ff7177 ("mm/hugetlb: take page table lock in follow_huge_pmd()") Link: http://lkml.kernel.org/r/1490149898-20231-1-git-send-email-n-horiguchi@ah.jp.nec.com Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Hugh Dickins <hughd@google.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: <stable@vger.kernel.org> [4.0+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Fri, 31 Mar 2017 22:11:52 +0000 (15:11 -0700)]
mm: workingset: fix premature shadow node shrinking with cgroups
Commit 0a6b76dd23fa ("mm: workingset: make shadow node shrinker memcg
aware") enabled cgroup-awareness in the shadow node shrinker, but forgot
to also enable cgroup-awareness in the list_lru the shadow nodes sit on.
Consequently, all shadow nodes are sitting on a global (per-NUMA node)
list, while the shrinker applies the limits according to the amount of
cache in the cgroup its shrinking. The result is excessive pressure on
the shadow nodes from cgroups that have very little cache.
Enable memcg-mode on the shadow node LRUs, such that per-cgroup limits
are applied to per-cgroup lists.
Fixes: 0a6b76dd23fa ("mm: workingset: make shadow node shrinker memcg aware") Link: http://lkml.kernel.org/r/20170322005320.8165-1-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vladimir Davydov <vdavydov@tarantool.org> Cc: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> [4.6+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Fri, 31 Mar 2017 22:11:50 +0000 (15:11 -0700)]
mm: rmap: fix huge file mmap accounting in the memcg stats
Huge pages are accounted as single units in the memcg's "file_mapped"
counter. Account the correct number of base pages, like we do in the
corresponding node counter.
Link: http://lkml.kernel.org/r/20170322005111.3156-1-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: <stable@vger.kernel.org> [4.8+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Hocko [Fri, 31 Mar 2017 22:11:47 +0000 (15:11 -0700)]
mm: move mm_percpu_wq initialization earlier
Yang Li has reported that drain_all_pages triggers a WARN_ON which means
that this function is called earlier than the mm_percpu_wq is
initialized on arm64 with CMA configured:
WARNING: CPU: 2 PID: 1 at mm/page_alloc.c:2423 drain_all_pages+0x244/0x25c
Modules linked in:
CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.11.0-rc1-next-20170310-00027-g64dfbc5 #127
Hardware name: Freescale Layerscape 2088A RDB Board (DT)
task: ffffffc07c4a6d00 task.stack: ffffffc07c4a8000
PC is at drain_all_pages+0x244/0x25c
LR is at start_isolate_page_range+0x14c/0x1f0
[...]
drain_all_pages+0x244/0x25c
start_isolate_page_range+0x14c/0x1f0
alloc_contig_range+0xec/0x354
cma_alloc+0x100/0x1fc
dma_alloc_from_contiguous+0x3c/0x44
atomic_pool_init+0x7c/0x208
arm64_dma_init+0x44/0x4c
do_one_initcall+0x38/0x128
kernel_init_freeable+0x1a0/0x240
kernel_init+0x10/0xfc
ret_from_fork+0x10/0x20
Fix this by moving the whole setup_vmstat which is an initcall right now
to init_mm_internals which will be called right after the WQ subsystem
is initialized.
Link: http://lkml.kernel.org/r/20170315164021.28532-1-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Yang Li <pku.leo@gmail.com> Tested-by: Yang Li <pku.leo@gmail.com> Tested-by: Xiaolong Ye <xiaolong.ye@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
new = page - pvmw.page->index +
linear_page_index(vma, pvmw.address);
The 'new' is calculated with 'page' which is given by the caller as a
destination page and some offset adjustment for thp. But this doesn't
properly work for ksm pages because pvmw.page->index doesn't change for
each address but linear_page_index() changes, which means that 'new'
points to different pages for each addresses backed by the ksm page. As
a result, we try to set totally unrelated pages as destination pages,
and that causes kernel crash.
This patch fixes the miscalculation and makes ksm page migration work
fine.
Fixes: 3fe87967c536 ("mm: convert remove_migration_pte() to use page_vma_mapped_walk()") Link: http://lkml.kernel.org/r/1489717683-29905-1-git-send-email-n-horiguchi@ah.jp.nec.com Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 31 Mar 2017 19:29:03 +0000 (12:29 -0700)]
Merge tag 'nfs-for-4.11-3' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client fixes from Anna Schumaker:
"Here are a few more bugfixes that came in over the last couple of
weeks. Most of these fix various hangs and loops that people found,
but we also had a few error handling fixes.
Stable Bugfixes:
- fix infinite loop on BAD_STATEID error
Other Bugfixes:
- fix old dentry rehash after move
- fix pnfs GETDEVINFO hangs
- fix pnfs fallback to MDS on commit errors
- fix flexfiles kernel oops"
* tag 'nfs-for-4.11-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
nfs: flexfiles: fix kernel OOPS if MDS returns unsupported DS type
NFSv4.1 fix infinite loop on IO BAD_STATEID error
PNFS fix fallback to MDS if got error on commit to DS
NFS filelayout:call GETDEVICEINFO after pnfs_layout_process completes
NFS store nfs4_deviceid in struct nfs4_filelayout_segment
NFS cleanup struct nfs4_filelayout_segment
NFS: Fix old dentry rehash after move
Linus Torvalds [Fri, 31 Mar 2017 19:21:48 +0000 (12:21 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"The main thing is a fix for a NULL dereference on systems that boot
using spin-tables or the ACPI parking protocol, but there are also a
couple of trivial one-liners too.
We're currently debugging a page flags corruption issue under
syzkaller, but we're still some way from fixing that as it's proving
fiddly to reproduce.
Summary:
- fix cpu_die() NULL dereference when booting secondary CPUs using
spin-table
- remove redundant #include
- remove obsolete .gitignore entry"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: drop non-existing vdso-offsets.h from .gitignore
arm64: remove redundant header file in current.h
arm64: fix NULL dereference in have_cpu_die()
Linus Torvalds [Fri, 31 Mar 2017 19:05:05 +0000 (12:05 -0700)]
Merge tag 'mmc-v4.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"Here are a couple of mmc fixes intended for v4.11 rc5.
MMC host:
- sdhci: Fix bug when using SDIO IRQ
- sdhci-of-at91: Fix eMMC DDR52 card detection"
* tag 'mmc-v4.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-of-at91: fix MMC_DDR_52 timing selection
mmc: sdhci: Disable runtime pm when the sdio_irq is enabled
Linus Torvalds [Fri, 31 Mar 2017 18:53:49 +0000 (11:53 -0700)]
Merge tag 'sound-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"At this time, most of changes are for ASoC, while we got one fix for
yet another race of ALSA sequencer core and a usual HD-audio quirk.
The ASoC changes are mostly small and device-specific fixes. A
slightly large volume is seen in sun8i-codec, which is a new code in
4.11, and we'd like to fix user-visible stuff before the official 4.1
release"
* tag 'sound-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (27 commits)
ALSA: hda - fix a problem for lineout on a Dell AIO machine
ASoC: simple-card: fix simple_dai clk lookup
ASoC: STI: Fix reader substream pointer set
ALSA: seq: Fix race during FIFO resize
ARM: dts: sun8i: Update audio-routing with renamed widgets
ASoC: sun8i-codec: Convert to use SND_SOC_DAPM_AIF_IN
ASoC: sun8i-codec: Fix space on audio-routing widget
ASoC: sun8i-codec: Update mixer to use SOC_DAPM_DOUBLE
ASoC: sun8i-codec: Remove analog "HP" widget
ASoC: rt5665: fix wrong shift rt5665_if2_1_adc_in_enum
ASoC: rt5665: fix define of RT5665_HP_DRIVER_5X
ASoC: rcar: dma: remove unnecessary "volatile"
ASoC: rcar: clear DE bit only in PDMACHCR when it stops
ASoC: rsnd: fix sound route path when using SRC6/SRC9
ASoC: don't dereference NULL pcm_{new,free}
ASoC: rt5665: CLKDET is also a power of ASRC
ASoC: rt5665: Vref3 is necessary for Mono Amp
ASoC: rt5665: increase LDO level
ASoC: rt5665: fix getting wrong work handler container
ASoC: atmel-classd: fix audio clock rate
...
Linus Torvalds [Fri, 31 Mar 2017 18:50:31 +0000 (11:50 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID fixes from Jiri Kosina:
- Wacom regression fixes, from Aaron Armstrong Skomra
- new device ID addition by Peter Stein
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: wacom: call _query_tablet_data() for BAMBOO_TOUCH
HID: wacom: Don't add ghost interface as shared data
HID: xinmo: fix for out of range for THT 2P arcade controller.
Linus Torvalds [Fri, 31 Mar 2017 18:34:06 +0000 (11:34 -0700)]
Merge tag 'drm-fixes-for-v4.11-rc5' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Seems to be quietening down, which means someone will make a liar of
me for rc6.
Just one vc4, one etnvaiv, one radeon, and a few i915 GVT fixes, and
one i915 normal fixes"
* tag 'drm-fixes-for-v4.11-rc5' of git://people.freedesktop.org/~airlied/linux:
drm/vc4: Allocate the right amount of space for boot-time CRTC state.
drm/etnaviv: (re-)protect fence allocation with GPU mutex
drm/radeon: Override fpfn for all VRAM placements in radeon_evict_flags
drm/i915: Restore marking context objects as dirty on pinning
drm/i915/gvt: Use force single submit flag to distinguish gvt request from i915 request
drm/i915/gvt: set shadow entry to scratch page while p2m failed
drm/i915/gvt: Fix guest fail to read EDID leading to black guest console issue.
drm/i915/gvt: fix wrong offset when loading RCS mocs
drm/i915/gvt: add write handler for mmio mbctl
drm/i915/kvmgt: Hold struct kvm reference
Commit 63d63cbf5e03 "NFSv4.1: Don't recheck delegations that
have already been checked" introduced a regression where when a
client received BAD_STATEID error it would not send any TEST_STATEID
and instead go into an infinite loop of resending the IO that caused
the BAD_STATEID.
Fixes: 63d63cbf5e03 ("NFSv4.1: Don't recheck delegations that have already been checked") Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Cc: stable@vger.kernel.org # 4.9+ Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Paul Gortmaker [Mon, 27 Mar 2017 23:39:10 +0000 (19:39 -0400)]
serial: 8250_EXAR: fix duplicate Kconfig text and add missing help text
In commit d0aeaa83f0b0f7a92615bbdd6b1f96812f7dcfd2 ("serial: exar:
split out the exar code from 8250_pci") the exar driver got its own
Kconfig. However the text for the new option was never changed from
the original 8250_PCI text, and hence it appears confusing when you
get asked the same question twice:
8250/16550 PCI device support (SERIAL_8250_PCI) [Y/n/m/?] (NEW)
8250/16550 PCI device support (SERIAL_8250_EXAR) [Y/n/m] (NEW)
Adding to the confusion, is that there is no help text for this new
option to indicate it is specific to a certain family of cards.
Fix both issues at the same time, as well as the space vs. tab issues
introduced in the same commit.
Fixes: d0aeaa83f0b0 ("serial: exar: split out the exar code from 8250_pci") Cc: Jiri Slaby <jslaby@suse.com> Cc: linux-serial@vger.kernel.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nicolas Ferre [Mon, 20 Mar 2017 15:38:57 +0000 (16:38 +0100)]
tty/serial: atmel: fix TX path in atmel_console_write()
A side effect of 89d8232411a8 ("tty/serial: atmel_serial: BUG: stop DMA
from transmitting in stop_tx") is that the console can be called with
TX path disabled. Then the system would hang trying to push charecters
out in atmel_console_putchar().
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Fixes: 89d8232411a8 ("tty/serial: atmel_serial: BUG: stop DMA from transmitting
in stop_tx") Cc: stable <stable@vger.kernel.org> #4.4+ Acked-by: Richard Genoud <richard.genoud@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Richard Genoud [Mon, 20 Mar 2017 10:52:41 +0000 (11:52 +0100)]
tty/serial: atmel: fix race condition (TX+DMA)
If uart_flush_buffer() is called between atmel_tx_dma() and
atmel_complete_tx_dma(), the circular buffer has been cleared, but not
atmel_port->tx_len.
That leads to a circular buffer overflow (dumping (UART_XMIT_SIZE -
atmel_port->tx_len) bytes).
Tested-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: Richard Genoud <richard.genoud@gmail.com> Cc: stable <stable@vger.kernel.org> # 3.12+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The reference manual for the i.MX28 recommends to calculate the divisor
as
divisor = (UARTCLK * 32) / baud rate, rounded to the nearest integer
, so let's do this. For a typical setup of UARTCLK = 24 MHz and baud
rate = 115200 this changes the divisor from 6666 to 6667 and so the
actual baud rate improves from 115211.521 Bd (error ≅ 0.01 %) to
115194.240 Bd (error ≅ 0.005 %).
Dmitry Safonov [Fri, 31 Mar 2017 11:11:37 +0000 (14:11 +0300)]
x86/mm: Make in_compat_syscall() work during exec
The x86 mmap() code selects the mmap base for an allocation depending on
the bitness of the syscall. For 64bit sycalls it select mm->mmap_base and
for 32bit mm->mmap_compat_base.
On execve the registers of the task invoking exec() are copied to the child
pt_regs. So child->pt_regs->orig_ax contains the execve syscall number of the
parent.
exec() calls mmap() which in turn uses in_compat_syscall() to check whether
the mapping is for a 32bit or a 64bit task. The decision is made on the
following criteria:
child->thread.status is corretly set up in set_personality_*(), but the
syscall number in child->pt_regs.orig_ax is left unmodified.
Therefore the parent/child combinations work or fail in the following way:
Parent Child Child->thread_status child->pt_regs.orig_ax in_compat() Works
ia64 ia64 TS_COMPAT == 0 __X32_SYSCALL_BIT == 0 false Y
ia64 ia32 TS_COMPAT == 1 __X32_SYSCALL_BIT == 0 true Y
ia64 x32 TS_COMPAT == 0 __X32_SYSCALL_BIT == 0 false N
ia32 ia64 TS_COMPAT == 0 __X32_SYSCALL_BIT == 0 false Y
ia32 ia32 TS_COMPAT == 1 __X32_SYSCALL_BIT == 0 true Y
ia32 x32 TS_COMPAT == 0 __X32_SYSCALL_BIT == 0 false N
x32 ia64 TS_COMPAT == 0 __X32_SYSCALL_BIT == 1 true N
x32 ia32 TS_COMPAT == 1 __X32_SYSCALL_BIT == 1 true Y
x32 x32 TS_COMPAT == 0 __X32_SYSCALL_BIT == 1 true Y
Make set_personality_*() store the syscall number incl. __X32_SYSCALL_BIT
which corresponds to the newly started ELF executable in the childs
pt_regs, i.e. pretend that the exec was invoked from a task with the same
executable format.
So both thread.status and pt_regs.orig_ax correspond to the new ELF format
and in_compat_syscall() returns the correct result.
[ tglx: Rewrote changelog ]
Fixes: commit 1b028f784e8c ("x86/mm: Introduce mmap_compat_base() for 32-bit mmap()") Reported-by: Adam Borowski <kilobyte@angband.pl> Suggested-by: H. Peter Anvin <hpa@zytor.com> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: 0x7f454c46@gmail.com Cc: linux-mm@kvack.org Cc: Andrei Vagin <avagin@gmail.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Borislav Petkov <bp@suse.de> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: http://lkml.kernel.org/r/20170331111137.28170-1-dsafonov@virtuozzo.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Matt Redfearn [Fri, 31 Mar 2017 11:05:32 +0000 (12:05 +0100)]
irqchip/mips-gic: Fix Local compare interrupt
Commit 4cfffcfa5106 ("irqchip/mips-gic: Fix local interrupts") added
mapping of several local interrupts during initialisation of the gic
driver. This associates virq numbers with these interrupts.
Unfortunately, as not all of the interrupts are mapped in hardware
order, when drivers subsequently request these interrupts they conflict
with the mappings that have already been set up. For example, this
manifests itself in the gic clocksource driver, which fails to probe
with the message:
This is because virq 25 (the correct IRQ number specified via device
tree) was allocated to the PERFCTR interrupt (and 24 to the timer, 26 to
the FDC). To fix this, map all of these local interrupts in the hardware
order so as to associate their virq numbers with the correct hw
interrupts.
Fixes: 4cfffcfa5106 ("irqchip/mips-gic: Fix local interrupts") Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Hui Wang [Fri, 31 Mar 2017 02:31:40 +0000 (10:31 +0800)]
ALSA: hda - fix a problem for lineout on a Dell AIO machine
On this Dell AIO machine, the lineout jack does not work.
We found the pin 0x1a is assigned to lineout on this machine, and in
the past, we applied ALC298_FIXUP_DELL1_MIC_NO_PRESENCE to fix the
heaset-set mic problem for this machine, this fixup will redefine
the pin 0x1a to headphone-mic, as a result the lineout doesn't
work anymore.
After consulting with Dell, they told us this machine doesn't support
microphone via headset jack, so we add a new fixup which only defines
the pin 0x18 as the headset-mic.
[rearranged the fixup insertion position by tiwai in order to make the
merge with other branches easier -- tiwai]
Fixes: 59ec4b57bcae ("ALSA: hda - Fix headset mic detection problem for two dell machines") Cc: <stable@vger.kernel.org> Signed-off-by: Hui Wang <hui.wang@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
Zhengyi Shen [Wed, 29 Mar 2017 07:00:20 +0000 (15:00 +0800)]
x86/boot: Include missing header file
Sparse complains about missing forward declarations:
arch/x86/boot/compressed/error.c:8:6:
warning: symbol 'warn' was not declared. Should it be static?
arch/x86/boot/compressed/error.c:15:6:
warning: symbol 'error' was not declared. Should it be static?
Yazen Ghannam [Thu, 30 Mar 2017 11:17:14 +0000 (13:17 +0200)]
x86/mce/AMD: Give a name to MCA bank 3 when accessed with legacy MSRs
MCA bank 3 is reserved on systems pre-Fam17h, so it didn't have a name.
However, MCA bank 3 is defined on Fam17h systems and can be accessed
using legacy MSRs. Without a name we get a stack trace on Fam17h systems
when trying to register sysfs files for bank 3 on kernels that don't
recognize Scalable MCA.
Call MCA bank 3 "decode_unit" since this is what it represents on
Fam17h. This will allow kernels without SMCA support to see this bank on
Fam17h+ and prevent the stack trace. This will not affect older systems
since this bank is reserved on them, i.e. it'll be ignored.
Tested on AMD Fam15h and Fam17h systems.
WARNING: CPU: 26 PID: 1 at lib/kobject.c:210 kobject_add_internal
kobject: (ffff88085bb256c0): attempted to be registered with empty name!
...
Call Trace:
kobject_add_internal
kobject_add
kobject_create_and_add
threshold_create_device
threshold_init_device
Zhengyi Shen [Wed, 29 Mar 2017 07:00:20 +0000 (15:00 +0800)]
x86/boot: Fix Sparse warning by including required header file
Include declarations for various symbols defined in the error.h header file
to fix the following Sparse warnings:
arch/x86/boot/compressed/error.c:8:6:
warning: symbol 'warn' was not declared. Should it be static?
arch/x86/boot/compressed/error.c:15:6:
warning: symbol 'error' was not declared. Should it be static?
Borislav Petkov [Thu, 30 Mar 2017 07:44:05 +0000 (09:44 +0200)]
x86/boot/32: Flip the logic in test_wp_bit()
... to have a natural "likely()" in the code flow and thus have the
success case with a branch 99.999% of the times non-taken and function
return code following it instead of jumping to it each time.
This puts the panic() call at the end of the function - it is going to
be practically unreachable anyway.
The C code is a bit more readable too.
No functionality change.
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boris.ostrovsky@oracle.com Cc: jgross@suse.com Cc: thgarnie@google.com Link: http://lkml.kernel.org/r/20170330080101.ywsf5rg6ilzu4itk@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>
Dave Airlie [Fri, 31 Mar 2017 01:50:56 +0000 (11:50 +1000)]
Merge tag 'drm-intel-fixes-2017-03-29' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes
drm/i915 fixes for v4.11-rc5
* tag 'drm-intel-fixes-2017-03-29' of git://anongit.freedesktop.org/git/drm-intel:
drm/i915: Restore marking context objects as dirty on pinning
drm/i915/gvt: Use force single submit flag to distinguish gvt request from i915 request
drm/i915/gvt: set shadow entry to scratch page while p2m failed
drm/i915/gvt: Fix guest fail to read EDID leading to black guest console issue.
drm/i915/gvt: fix wrong offset when loading RCS mocs
drm/i915/gvt: add write handler for mmio mbctl
drm/i915/kvmgt: Hold struct kvm reference
Dave Airlie [Fri, 31 Mar 2017 01:50:04 +0000 (11:50 +1000)]
Merge branch 'etnaviv/fixes' of https://git.pengutronix.de/git/lst/linux into drm-fixes
a single fix to keep fence seqnos of completed jobs monotonically
increasing, as expected in various locations of the driver code. Also
tagged for stable.
* 'etnaviv/fixes' of https://git.pengutronix.de/git/lst/linux:
drm/etnaviv: (re-)protect fence allocation with GPU mutex
Vineet Gupta [Thu, 30 Mar 2017 17:02:57 +0000 (10:02 -0700)]
ARC: fix build warnings with !CONFIG_KPROBES
| CC lib/nmi_backtrace.o
| In file included from ../include/linux/kprobes.h:43:0,
| from ../lib/nmi_backtrace.c:17:
| ../arch/arc/include/asm/kprobes.h:57:13: warning: 'trap_is_kprobe' defined but not used [-Wunused-function]
| static void trap_is_kprobe(unsigned long address, struct pt_regs *regs)
| ^~~~~~~~~~~~~~
The warning started with 7d134b2ce6 ("kprobes: move kprobe declarations
to asm-generic/kprobes.h") which started including <asm/kprobes.h>
unconditionally into <linux/kprobes.h> exposing a stub function for
!CONFIG_KPROBES to rest of world. Fix that by making the stub a macro
Alexey Brodkin [Wed, 29 Mar 2017 14:15:11 +0000 (17:15 +0300)]
ARCv2: SLC: Make sure busy bit is set properly on SLC flushing
As reported in STAR 9001165532, an SLC control reg read (for checking
busy state) right after SLC invalidate command may incorrectly return
NOT busy causing software to NOT spin-wait while operation is underway.
(and for some reason this only happens if L1 cache is also disabled - as
required by IOC programming model)
Suggested workaround is to do an additional Control Reg read, which
ensures the 2nd read gets the right status.
Linus Torvalds [Thu, 30 Mar 2017 22:08:38 +0000 (15:08 -0700)]
Merge tag 'pci-v4.11-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
- fix iProc memory corruption
- fix ThunderX usage of unregistered PNP/ACPI ID
- fix ThunderX resource reservation on early firmware
* tag 'pci-v4.11-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: thunder-pem: Add legacy firmware support for Cavium ThunderX host controller
PCI: thunder-pem: Use Cavium assigned hardware ID for ThunderX host controller
PCI: iproc: Save host bridge window resource in struct iproc_pcie
The controller has different timings for MMC_TIMING_UHS_DDR50 and
MMC_TIMING_MMC_DDR52. Configuring the controller with SDHCI_CTRL_UHS_DDR50,
when MMC_TIMING_MMC_DDR52 timings are requested, is not correct and can
lead to unexpected behavior.
Hans de Goede [Sun, 26 Mar 2017 11:14:45 +0000 (13:14 +0200)]
mmc: sdhci: Disable runtime pm when the sdio_irq is enabled
SDIO cards may need clock to send the card interrupt to the host.
On a cherrytrail tablet with a RTL8723BS wifi chip, without this patch
pinging the tablet results in:
PING 192.168.1.14 (192.168.1.14) 56(84) bytes of data.
64 bytes from 192.168.1.14: icmp_seq=1 ttl=64 time=78.6 ms
64 bytes from 192.168.1.14: icmp_seq=2 ttl=64 time=1760 ms
64 bytes from 192.168.1.14: icmp_seq=3 ttl=64 time=753 ms
64 bytes from 192.168.1.14: icmp_seq=4 ttl=64 time=3.88 ms
64 bytes from 192.168.1.14: icmp_seq=5 ttl=64 time=795 ms
64 bytes from 192.168.1.14: icmp_seq=6 ttl=64 time=1841 ms
64 bytes from 192.168.1.14: icmp_seq=7 ttl=64 time=810 ms
64 bytes from 192.168.1.14: icmp_seq=8 ttl=64 time=1860 ms
64 bytes from 192.168.1.14: icmp_seq=9 ttl=64 time=812 ms
64 bytes from 192.168.1.14: icmp_seq=10 ttl=64 time=48.6 ms
Where as with this patch I get:
PING 192.168.1.14 (192.168.1.14) 56(84) bytes of data.
64 bytes from 192.168.1.14: icmp_seq=1 ttl=64 time=3.96 ms
64 bytes from 192.168.1.14: icmp_seq=2 ttl=64 time=1.97 ms
64 bytes from 192.168.1.14: icmp_seq=3 ttl=64 time=17.2 ms
64 bytes from 192.168.1.14: icmp_seq=4 ttl=64 time=2.46 ms
64 bytes from 192.168.1.14: icmp_seq=5 ttl=64 time=2.83 ms
64 bytes from 192.168.1.14: icmp_seq=6 ttl=64 time=1.40 ms
64 bytes from 192.168.1.14: icmp_seq=7 ttl=64 time=2.10 ms
64 bytes from 192.168.1.14: icmp_seq=8 ttl=64 time=1.40 ms
64 bytes from 192.168.1.14: icmp_seq=9 ttl=64 time=2.04 ms
64 bytes from 192.168.1.14: icmp_seq=10 ttl=64 time=1.40 ms
Cc: Dong Aisheng <b29396@freescale.com> Cc: Ian W MORRISON <ianwmorrison@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Dong Aisheng <aisheng.dong@nxp.com> Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Masahiro Yamada [Thu, 30 Mar 2017 17:44:17 +0000 (02:44 +0900)]
arm64: drop non-existing vdso-offsets.h from .gitignore
Since commit a66649dab350 ("arm64: fix vdso-offsets.h dependency"),
include/generated/vdso-offsets.h is directly generated without
arch/arm64/kernel/vdso/vdso-offsets.h.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
Takashi Iwai [Thu, 30 Mar 2017 18:03:25 +0000 (20:03 +0200)]
Merge tag 'asoc-fix-v4.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v4.11
A relatively large pile of fixes for mainline, the first since the merge
window. The biggest block of changes here by volume is the sun8i-codec
set, the driver was newly added in the merge window but it was realized
that renaming some of the user visible controls was required so these
are being pushed for v4.11 to avoid the original code appearing in a
release. Otherwise it's all fairly standard bugfix stuff.
Mark Salter [Fri, 24 Mar 2017 13:53:56 +0000 (09:53 -0400)]
arm64: fix NULL dereference in have_cpu_die()
Commit 5c492c3f5255 ("arm64: smp: Add function to determine if cpus are
stuck in the kernel") added a helper function to determine if die() is
supported in cpu_ops. This function assumes a cpu will have a valid
cpu_ops entry, but that may not be the case for cpu0 is spin-table or
parking protocol is used to boot secondary cpus. In that case, there
is a NULL dereference if have_cpu_die() is called by cpu0. So add a
check for a valid cpu_ops before dereferencing it.
Fixes: 5c492c3f5255 ("arm64: smp: Add function to determine if cpus are stuck in the kernel") Signed-off-by: Mark Salter <msalter@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
Eric Anholt [Tue, 28 Mar 2017 20:13:43 +0000 (13:13 -0700)]
drm/vc4: Allocate the right amount of space for boot-time CRTC state.
Without this, the first modeset would dereference past the allocation
when trying to free the mm node.
Signed-off-by: Eric Anholt <eric@anholt.net> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170328201343.4884-1-eric@anholt.net Fixes: d8dbf44f13b9 ("drm/vc4: Make the CRTCs cooperate on allocating display lists.") Cc: <stable@vger.kernel.org> # v4.6+ Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The GCC '-maccumulate-outgoing-args' flag is enabled for most configs,
mostly because of issues which are no longer relevant. For most
configs, and with most recent versions of GCC, it's no longer needed.
Clarify which cases need it, and only enable it for those cases. Also
produce a compile-time error for the ftrace graph + mcount + '-Os' case,
which will otherwise cause runtime failures.
The main benefit of '-maccumulate-outgoing-args' is that it prevents an
ugly prologue for functions which have aligned stacks. But removing the
option also has some benefits: more readable argument saves, smaller
text size, and (presumably) slightly improved performance.
Here are the object size savings for 32-bit and 64-bit defconfig
kernels:
HID: wacom: call _query_tablet_data() for BAMBOO_TOUCH
Commit a544c619a54b ("HID: wacom: do not attempt to switch mode
while in probe") introduces delayed work for querying (setting the
mode) on all tablets. Bamboo Touch (056a:00d0) has a ghost
interface which claims to be a pen device. Though this device can
be removed, we have to set the mode on the ghost pen interface
before we remove it. After the aforementioned delay was introduced
the device was being removed before the mode setting could be
executed.
HID: wacom: Don't add ghost interface as shared data
A previous commit (below) adds a check for already probed interfaces to
Wacom's matching heuristic. Unfortunately this causes the Bamboo Pen
(CTL-460) to match itself to its 'ghost' touch interface. After
subsequent changes to the driver this match to the ghost causes the
kernel to crash. This patch avoids calling wacom_add_shared_data()
for the BAMBOO_PEN's ghost touch interface.
Fixes: 41372d5d40e7 ("HID: wacom: Augment 'oVid' and 'oPid' with heuristics for HID_GENERIC") Cc: stable <stable@vger.kernel.org> # 4.9 Signed-off-by: Aaron Armstrong Skomra <aaron.skomra@wacom.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Andy Lutomirski [Wed, 29 Mar 2017 23:47:35 +0000 (16:47 -0700)]
x86/boot/32: Rewrite test_wp_bit()
This code seems to be very old and has gotten only minor updates.
It's overcomplicated and has a bunch of comments that are, at best,
of purely historical interest. Nowadays we have a shiny function
probe_kernel_write() that does more or less exactly what we need.
Use it.
I switched the page that we test from swapper_pg_dir to
empty_zero_page because writing zero to empty_zero_page is more
obviously safe than writing to the paging structures. (It's
extremely unlikely that any of this would cause problems in practice
because the write will fail on any supported CPU.)
Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/0b9e64ab0236de30e7572213cea77bf95ae2e990.1490831211.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Thu, 30 Mar 2017 02:59:49 +0000 (19:59 -0700)]
Merge branch 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management fixes from Zhang Rui:
- Fix a potential deadlock in cpu_cooling driver, which was introduced
in 4.11-rc1. (Matthew Wilcox)
- Fix the cpu_cooling and devfreq_cooling code to handle possible error
return value from OPP calls, together with three minor fixes in the
same patch series. (Viresh Kumar)
* 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
thermal: cpu_cooling: Check OPP for errors
thermal: cpu_cooling: Replace dev_warn with dev_err
thermal: devfreq: Check OPP for errors
thermal: devfreq_cooling: Replace dev_warn with dev_err
thermal: devfreq: Simplify expression
thermal: Fix potential deadlock in cpu_cooling
Linus Torvalds [Wed, 29 Mar 2017 21:30:19 +0000 (14:30 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Five fixes for this series:
- a fix from me to ensure that blk-mq drivers that terminate IO in
their ->queue_rq() handler by returning QUEUE_ERROR don't stall
with a scheduler enabled.
- four nbd fixes from Josef and Ratna, fixing various problems that
are critical enough to go in for this cycle. They have been well
tested"
* 'for-linus' of git://git.kernel.dk/linux-block:
nbd: replace kill_bdev() with __invalidate_device()
nbd: set queue timeout properly
nbd: set rq->errors to actual error code
nbd: handle ERESTARTSYS properly
blk-mq: include errors in did_work calculation
cpuidle: powernv: Pass correct drv->cpumask for registration
drv->cpumask defaults to cpu_possible_mask in __cpuidle_driver_init().
On PowerNV platform cpu_present could be less than cpu_possible in cases
where firmware detects the cpu, but it is not available to the OS. When
CONFIG_HOTPLUG_CPU=n, such cpus are not hotplugable at runtime and hence
we skip creating cpu_device.
This breaks cpuidle on powernv where register_cpu() is not called for
cpus in cpu_possible_mask that cannot be hot-added at runtime.
Trying cpuidle_register_device() on cpu without cpu_device will cause
crash like this: