Joonsoo Kim [Wed, 16 Oct 2013 20:46:48 +0000 (13:46 -0700)]
mm/hugetlb.c: correct missing private flag clearing
We should clear the page's private flag when returing the page to the
hugepage pool. Otherwise, marked hugepage can be allocated to the user
who tries to allocate the non-reserved hugepage. If this user fail to
map this hugepage, he would try to return the page to the hugepage pool.
Since this page has a private flag, resv_huge_pages would mistakenly
increase. This patch fixes this situation.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.cz> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hughd@google.com> Cc: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Hillf Danton <dhillf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Manfred Spraul [Wed, 16 Oct 2013 20:46:45 +0000 (13:46 -0700)]
ipc/sem.c: synchronize semop and semctl with IPC_RMID
After acquiring the semlock spinlock, operations must test that the
array is still valid.
- semctl() and exit_sem() would walk stale linked lists (ugly, but
should be ok: all lists are empty)
- semtimedop() would sleep forever - and if woken up due to a signal -
access memory after free.
The patch also:
- standardizes the tests for .deleted, so that all tests in one
function leave the function with the same approach.
- unconditionally tests for .deleted immediately after every call to
sem_lock - even it it means that for semctl(GETALL), .deleted will be
tested twice.
Both changes make the review simpler: After every sem_lock, there must
be a test of .deleted, followed by a goto to the cleanup code (if the
function uses "goto cleanup").
The only exception is semctl_down(): If sem_ids().rwsem is locked, then
the presence in ids->ipcs_idr is equivalent to !.deleted, thus no
additional test is required.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Mike Galbraith <efault@gmx.de> Acked-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Davidlohr Bueso [Wed, 16 Oct 2013 20:46:45 +0000 (13:46 -0700)]
ipc: update locking scheme comments
The initial documentation was a bit incomplete, update accordingly.
[akpm@linux-foundation.org: make it more readable in 80 columns] Signed-off-by: Davidlohr Bueso <davidlohr@hp.com> Acked-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Wed, 16 Oct 2013 20:46:43 +0000 (13:46 -0700)]
mm, memcg: protect mem_cgroup_read_events for cpu hotplug
for_each_online_cpu() needs the protection of {get,put}_online_cpus() so
cpu_online_mask doesn't change during the iteration.
cpu_hotplug.lock is held while a cpu is going down, it's a coarse lock
that is used kernel-wide to synchronize cpu hotplug activity. Memcg has
a cpu hotplug notifier, called while there may not be any cpu hotplug
refcounts, which drains per-cpu event counts to memcg->nocpu_base.events
to maintain a cumulative event count as cpus disappear. Without
get_online_cpus() in mem_cgroup_read_events(), it's possible to account
for the event count on a dying cpu twice, and this value may be
significantly large.
In fact, all memcg->pcp_counter_lock use should be nested by
{get,put}_online_cpus().
This fixes that issue and ensures the reported statistics are not vastly
over-reported during cpu hotplug.
Signed-off-by: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 16 Oct 2013 00:14:13 +0000 (17:14 -0700)]
Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux
Pull device tree fixes and reverts from Grant Likely:
"One bug fix and three reverts. The reverts back out the slightly
controversial feeding the entire device tree into the random pool and
the reserved-memory binding which isn't fully baked yet. Expect the
reserved-memory patches at least to resurface for v3.13.
The bug fixes removes a scary but harmless warning on SPARC that was
introduced in the v3.12 merge window. v3.13 will contain a proper fix
that makes the new code work on SPARC.
On the plus side, the diffstat looks *awesome*. I love removing lines
of code"
* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux:
Revert "drivers: of: add initialization code for dma reserved memory"
Revert "ARM: init: add support for reserved memory defined by device tree"
Revert "of: Feed entire flattened device tree into the random pool"
of: fix unnecessary warning on missing /cpus node
Linus Torvalds [Tue, 15 Oct 2013 23:22:11 +0000 (16:22 -0700)]
Merge tag 'stable/for-linus-3.12-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull Xen fixes from Stefano Stabellini:
"A small fix for Xen on x86_32 and a build fix for xen-tpmfront on
arm64"
* tag 'stable/for-linus-3.12-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: Fix possible user space selector corruption
tpm: xen-tpmfront: fix missing declaration of xen_domain
Raghavendra K T [Wed, 9 Oct 2013 09:03:21 +0000 (14:33 +0530)]
KVM: Enable pvspinlock after jump_label_init() to avoid VM hang
We use jump label to enable pv-spinlock. With the changes in (442e0973e927
Merge branch 'x86/jumplabel'), the jump label behaviour has changed
that would result in eventual hang of the VM since we would end up in a
situation where slow path locks would halt the vcpus but we will not be
able to wakeup the vcpu by lock releaser using unlock kick.
Similar problem in Xen and more detailed description is available in a945928ea270 (xen: Do not enable spinlocks before jump_label_init()
has executed)
This patch splits kvm_spinlock_init to separate jump label changes with
pvops patching and also make jump label enabling after jump_label_init().
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Gleb Natapov <gleb@redhat.com>
Marek Szyprowski [Fri, 11 Oct 2013 07:27:28 +0000 (09:27 +0200)]
Revert "drivers: of: add initialization code for dma reserved memory"
This reverts commit 9d8eab7af79cb4ce2de5de39f82c455b1f796963. There is
still no consensus on the bindings for the reserved memory and various
drawbacks of the proposed solution has been shown, so the best now is to
revert it completely and start again from scratch later.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Grant Likely <grant.likely@linaro.org>
Marek Szyprowski [Fri, 11 Oct 2013 07:27:27 +0000 (09:27 +0200)]
Revert "ARM: init: add support for reserved memory defined by device tree"
This reverts commit 10bcdfb8ba24760f715f0a700c3812747eddddf5. There is
no consensus on the bindings for the reserved memory, so the code for
handing it will be reverted.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Grant Likely <grant.likely@linaro.org>
Linus Torvalds [Tue, 15 Oct 2013 00:43:33 +0000 (17:43 -0700)]
Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull infiniband updates from Roland Dreier:
"Last batch of IB changes for 3.12: many mlx5 hardware driver fixes
plus one trivial semicolon cleanup"
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB: Remove unnecessary semicolons
IB/mlx5: Ensure proper synchronization accessing memory
IB/mlx5: Fix alignment of reg umr gather buffers
IB/mlx5: Fix eq names to display nicely in /proc/interrupts
mlx5: Fix error code translation from firmware to driver
IB/mlx5: Fix opt param mask according to firmware spec
mlx5: Fix opt param mask for sq err to rts transition
IB/mlx5: Disable atomic operations
mlx5: Fix layout of struct mlx5_init_seg
mlx5: Keep polling to reclaim pages while any returned
IB/mlx5: Avoid async events on invalid port number
IB/mlx5: Decrease memory consumption of mr caches
mlx5: Remove checksum on command interface commands
IB/mlx5: Fix memory leak in mlx5_ib_create_srq
IB/mlx5: Flush cache workqueue before destroying it
IB/mlx5: Fix send work queue size calculation
Linus Torvalds [Mon, 14 Oct 2013 17:02:23 +0000 (10:02 -0700)]
Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM fixes from Russell King:
"Some more ARM fixes, nothing particularly major here. The biggest
change is to fix the SMP_ON_UP code so that it works with TI's Aegis
cores"
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: 7851/1: check for number of arguments in syscall_get/set_arguments()
ARM: 7846/1: Update SMP_ON_UP code to detect A9MPCore with 1 CPU devices
ARM: 7845/1: sharpsl_param.c: fix invalid memory access for pxa devices
ARM: 7843/1: drop asm/types.h from generic-y
ARM: 7842/1: MCPM: don't explode if invoked without being initialized first
Linus Torvalds [Mon, 14 Oct 2013 16:31:08 +0000 (09:31 -0700)]
Merge tag 'pm+acpi-3.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
"These fix two recent bugs in ACPIPHP (ACPI-based PCI hotplug) and
update a bunch of web links and e-mail addresses in MAINTAINERS, docs
and Kconfig that either are stale or will expire soon.
Specifics:
- The WARN_ON() in acpiphp_enumerate_slots() triggers as a false
positive in some cases, so drop it.
- Add a missing pci_dev_put() to an error code path in
acpiphp_enumerate_slots().
- Replace my old e-mail address that's going to expire with a new
one.
- Update ACPI web links and git tree information in MAINTAINERS.
- Update links to the Linux-ACPI project's page in MAINTAINERS.
- Update some stale links and e-mail addresses under Documentation
and in the ACPI Kconfig file"
* tag 'pm+acpi-3.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / hotplug / PCI: Drop WARN_ON() from acpiphp_enumerate_slots()
ACPI / hotplug / PCI: Fix error code path in acpiphp_enumerate_slots()
ACPI / PM / Documentation: Replace outdated project links and addresses
MAINTAINERS / ACPI: Update links to the Linux-ACPI project web page
MAINTAINERS / ACPI: Update links and git tree information
MAINTAINERS / Documentation: Update Rafael's e-mail address
Tim Bird expressed concern that this will have a bad effect on boot
time, and while simple tests have shown it to be okay with simple tree,
a device tree blob can potentially be quite large and
add_device_randomness() is not a fast function. Rather than do this for
all platforms unconditionally, I'm reverting this patch and would like
to see it revisited. Instead of feeding the entire tree into the random
pool, it would probably be appropriate to hash the tree and feed the
hash result into the pool. There really isn't a lot of randomness in a
device tree anyway. In the majority of cases only a handful of
properties are going to be different between machines with the same
baseboard.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Grant Likely [Thu, 3 Oct 2013 20:04:31 +0000 (21:04 +0100)]
of: fix unnecessary warning on missing /cpus node
Not all DT platforms have all the cpus collected under a /cpus node.
That just happens to be a details of FDT, ePAPR and PowerPC platforms.
Sparc does something different, but unfortunately the current code
complains with a warning if /cpus isn't there. This became a problem
with commit f86e4718, "driver/core cpu: initialize of_node in cpu's
device structure", which caused the function to get called for all
architectures.
This commit is a temporary fix to fail silently if the cpus node isn't
present. A proper fix will come later to allow arch code to provide a
custom mechanism for decoding the CPU hwid if the 'reg' property isn't
appropriate.
Signed-off-by: Grant Likely <grant.likely@linaro.org> Cc: David Miller <davem@davemloft.net> Cc: Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com> Cc: Rob Herring <rob.herring@calxeda.com>
Linus Torvalds [Sun, 13 Oct 2013 18:41:26 +0000 (11:41 -0700)]
Merge git://www.linux-watchdog.org/linux-watchdog
Pull watchdog fixes from Wim Van Sebroeck:
"This will fix a deadlock on the ts72xx_wdt driver, fix bitmasks in the
kempld_wdt driver and fix a section mismatch in the sunxi_wdt driver"
* git://www.linux-watchdog.org/linux-watchdog:
watchdog: sunxi: Fix section mismatch
watchdog: kempld_wdt: Fix bit mask definition
watchdog: ts72xx_wdt: locking bug in ioctl
Maxime Ripard [Sat, 5 Oct 2013 14:20:17 +0000 (16:20 +0200)]
watchdog: sunxi: Fix section mismatch
This driver has a section mismatch, for probe and remove functions,
leading to the following warning during the compilation.
WARNING: drivers/watchdog/built-in.o(.data+0x24): Section mismatch in
reference from the variable sunxi_wdt_driver to the function
.init.text:sunxi_wdt_probe()
The variable sunxi_wdt_driver references
the function __init sunxi_wdt_probe()
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
AKASHI Takahiro [Wed, 9 Oct 2013 14:58:29 +0000 (15:58 +0100)]
ARM: 7851/1: check for number of arguments in syscall_get/set_arguments()
In ftrace_syscall_enter(),
syscall_get_arguments(..., 0, n, ...)
if (i == 0) { <handle ORIG_r0> ...; n--;}
memcpy(..., n * sizeof(args[0]));
If 'number of arguments(n)' is zero and 'argument index(i)' is also zero in
syscall_get_arguments(), none of arguments should be copied by memcpy().
Otherwise 'n--' can be a big positive number and unexpected amount of data
will be copied. Tracing system calls which take no argument, say sync(void),
may hit this case and eventually make the system corrupted.
This patch fixes the issue both in syscall_get_arguments() and
syscall_set_arguments().
Cc: <stable@vger.kernel.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Linus Torvalds [Sun, 13 Oct 2013 16:59:10 +0000 (09:59 -0700)]
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A small batch of fixes this week, mostly OMAP related. Nothing stands
out as particularly controversial.
Also a fix for a 3.12-rc1 timer regression for Exynos platforms,
including the Chromebooks"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: exynos: dts: Update 5250 arch timer node with clock frequency
ARM: OMAP2: RX-51: Add missing max_current to rx51_lp5523_led_config
ARM: mach-omap2: board-generic: fix undefined symbol
ARM: dts: Fix pinctrl mask for omap3
ARM: OMAP3: Fix hardware detection for omap3630 when booted with device tree
ARM: OMAP2: gpmc-onenand: fix sync mode setup with DT
ARM: exynos: dts: Update 5250 arch timer node with clock frequency
Without the "clock-frequency" property in arch timer node, could able
to see the below crash dump.
[<c0014e28>] (unwind_backtrace+0x0/0xf4) from [<c0011808>] (show_stack+0x10/0x14)
[<c0011808>] (show_stack+0x10/0x14) from [<c036ac1c>] (dump_stack+0x7c/0xb0)
[<c036ac1c>] (dump_stack+0x7c/0xb0) from [<c01ab760>] (Ldiv0_64+0x8/0x18)
[<c01ab760>] (Ldiv0_64+0x8/0x18) from [<c0062f60>] (clockevents_config.part.2+0x1c/0x74)
[<c0062f60>] (clockevents_config.part.2+0x1c/0x74) from [<c0062fd8>] (clockevents_config_and_register+0x20/0x2c)
[<c0062fd8>] (clockevents_config_and_register+0x20/0x2c) from [<c02b8e8c>] (arch_timer_setup+0xa8/0x134)
[<c02b8e8c>] (arch_timer_setup+0xa8/0x134) from [<c04b47b4>] (arch_timer_init+0x1f4/0x24c)
[<c04b47b4>] (arch_timer_init+0x1f4/0x24c) from [<c04b40d8>] (clocksource_of_init+0x34/0x58)
[<c04b40d8>] (clocksource_of_init+0x34/0x58) from [<c049ed8c>] (time_init+0x20/0x2c)
[<c049ed8c>] (time_init+0x20/0x2c) from [<c049b95c>] (start_kernel+0x1e0/0x39c)
THis is because the Exynos u-boot, for example on the Chromebooks, doesn't set
up the CNTFRQ register as expected by arch_timer. Instead, we have to specify
the frequency in the device tree like this.
Signed-off-by: Yuvaraj Kumar C D <yuvaraj.cd@samsung.com>
[olof: Changed subject, added comment, elaborated on commit message] Signed-off-by: Olof Johansson <olof@lixom.net>
Olof Johansson [Sun, 13 Oct 2013 16:33:32 +0000 (09:33 -0700)]
Merge tag 'fixes-against-v3.12-rc3-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
From Tony Lindgren:
Few fixes for omap3 related hangs and errors that people have
noticed now that people are actually using the device tree
based booting for omap3.
Also one regression fix for timer compile for dra7xx when
omap5 is not selected, and a LED regression fix for n900.
* tag 'fixes-against-v3.12-rc3-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP2: RX-51: Add missing max_current to rx51_lp5523_led_config
ARM: mach-omap2: board-generic: fix undefined symbol
ARM: dts: Fix pinctrl mask for omap3
ARM: OMAP3: Fix hardware detection for omap3630 when booted with device tree
ARM: OMAP2: gpmc-onenand: fix sync mode setup with DT
Linus Torvalds [Sun, 13 Oct 2013 16:13:28 +0000 (09:13 -0700)]
Merge branch 'parisc-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
"This patchset includes a bugfix to prevent a kernel crash when memory
in page zero is accessed by the kernel itself, e.g. via
probe_kernel_read().
Furthermore we now export flush_cache_page() which is needed
(indirectly) by the lustre filesystem. The other patches remove
unused functions and optimizes the page fault handler to only evaluate
variables if needed, which again protects against possible kernel
crashes"
* 'parisc-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: let probe_kernel_read() capture access to page zero
parisc: optimize variable initialization in do_page_fault
parisc: fix interruption handler to respect pagefault_disable()
parisc: mark parisc_terminate() noreturn and cold.
parisc: remove unused syscall_ipi() function.
parisc: kill SMP single function call interrupt
parisc: Export flush_cache_page() (needed by lustre)
Linus Torvalds [Sun, 13 Oct 2013 16:02:03 +0000 (09:02 -0700)]
Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma
Pull slave-dmaengine fixes from Vinod Koul:
"Another week, time to send another fixes request taking time out of
extended weekend for the festivities in this part of the world.
We have two fixes from Sergei for rcar driver and one fixing memory
leak of edma driver by Geyslan"
* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
dma: edma.c: remove edma_desc leakage
rcar-hpbdma: add parameter to set_slave() method
rcar-hpbdma: remove shdma_free_irq() calls
parisc: optimize variable initialization in do_page_fault
The attached change defers the initialization of the variables tsk, mm
and flags until they are needed. As a result, the code won't crash if a
kernel probe is done with a corrupt context and the code will be better
optimized.
Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
Helge Deller [Tue, 1 Oct 2013 19:54:46 +0000 (21:54 +0200)]
parisc: fix interruption handler to respect pagefault_disable()
Running an "echo t > /proc/sysrq-trigger" crashes the parisc kernel. The
problem is, that in print_worker_info() we try to read the workqueue info via
the probe_kernel_read() functions which use pagefault_disable() to avoid
crashes like this:
probe_kernel_read(&pwq, &worker->current_pwq, sizeof(pwq));
probe_kernel_read(&wq, &pwq->wq, sizeof(wq));
probe_kernel_read(name, wq->name, sizeof(name) - 1);
The problem here is, that the first probe_kernel_read(&pwq) might return zero
in pwq and as such the following probe_kernel_reads() try to access contents of
the page zero which is read protected and generate a kernel segfault.
With this patch we fix the interruption handler to call parisc_terminate()
directly only if pagefault_disable() was not called (in which case
preempt_count()==0). Otherwise we hand over to the pagefault handler which
will try to look up the faulting address in the fixup tables.
Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v3.0+ Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
Jiang Liu [Wed, 11 Sep 2013 16:07:18 +0000 (00:07 +0800)]
parisc: kill SMP single function call interrupt
Commit 9a46ad6d6df3b54 "smp: make smp_call_function_many() use logic
similar to smp_call_function_single()" has unified the way to handle
single and multiple cross-CPU function calls. Now only one interrupt
is needed for architecture specific code to support generic SMP function
call interfaces, so kill the redundant single function call interrupt.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Cc: Jiang Liu <liuj97@gmail.com> Signed-off-by: Helge Deller <deller@gmx.de>
Olga reported that file descriptors opened with O_PATH do not work with
fstatfs(), found during further development of ksh93's thread support.
There is no reason to not allow O_PATH file descriptors here (fstatfs is
very much a path operation), so use "fdget_raw()". See commit 55815f70147d ("vfs: make O_PATH file descriptors usable for 'fstat()'")
for a very similar issue reported for fstat() by the same team.
Reported-and-tested-by: ольга крыжановская <olga.kryzhanovska@gmail.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@kernel.org # O_PATH introduced in 3.0+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 12 Oct 2013 19:55:15 +0000 (12:55 -0700)]
Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 bugfixes from Ted Ts'o:
"A bug fix and performance regression fix for ext4"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix memory leak in xattr
ext4: fix performance regression in writeback of random writes
Linus Torvalds [Sat, 12 Oct 2013 19:54:24 +0000 (12:54 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"We've got more bug fixes in my for-linus branch:
One of these fixes another corner of the compression oops from last
time. Miao nailed down some problems with concurrent snapshot
deletion and drive balancing.
I kept out one of his patches for more testing, but these are all
stable"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: fix oops caused by the space balance and dead roots
Btrfs: insert orphan roots into fs radix tree
Btrfs: limit delalloc pages outside of find_delalloc_range
Btrfs: use right root when checking for hash collision
Linus Torvalds [Sat, 12 Oct 2013 18:53:43 +0000 (11:53 -0700)]
Merge tag 'sound-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"All stable fixes except for a trivial headset mic fixup: the removal
of bogus frame checks in snd-usb-usx2y driver that have regressed in
the recent kernel versions, the HD-audio HDMI channel map fix, and a
few HD-audio device-specific fixes"
* tag 'sound-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Sony VAIO Pro 13 (haswell) now has a working headset jack
ALSA: hda - Add a headset mic model for ALC269 and friends
ALSA: hda - Fix microphone for Sony VAIO Pro 13 (Haswell model)
ALSA: hda - Add fixup for ASUS N56VZ
ALSA: hda - hdmi: Fix channel map switch not taking effect
ALSA: hda - Fix mono speakers and headset mic on Dell Vostro 5470
ALSA: snd-usb-usx2y: remove bogus frame checks
Linus Torvalds [Sat, 12 Oct 2013 18:52:40 +0000 (11:52 -0700)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"We had various reports of problems with deferred probing in the I2C
subsystem, so this pull requst is a little bigger than usual.
Most issues should be addressed now so devices will be found
correctly. A few ususal driver bugfixes are in here, too"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: i2c-mux-pinctrl: use deferred probe when adapter not found
i2c: i2c-arb-gpio-challenge: use deferred probe when adapter not found
i2c: i2c-mux-gpio: use deferred probing
i2c: i2c-mux-gpio: don't ignore of_get_named_gpio errors
i2c: omap: Clear ARDY bit twice
i2c: Not all adapters have a parent
i2c: i2c-stu300: replace platform_driver_probe to support deferred probing
i2c: i2c-mxs: replace platform_driver_probe to support deferred probing
i2c: i2c-imx: replace platform_driver_probe to support deferred probing
i2c: i2c-designware-platdrv: replace platform_driver_probe to support deferred probing
Dave Jones [Fri, 11 Oct 2013 00:05:35 +0000 (20:05 -0400)]
ext4: fix memory leak in xattr
If we take the 2nd retry path in ext4_expand_extra_isize_ea, we
potentionally return from the function without having freed these
allocations. If we don't do the return, we over-write the previous
allocation pointers, so we leak either way.
Spotted with Coverity.
[ Fixed by tytso to set is and bs to NULL after freeing these
pointers, in case in the retry loop we later end up triggering an
error causing a jump to cleanup, at which point we could have a double
free bug. -- Ted ]
Signed-off-by: Dave Jones <davej@fedoraproject.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Cc: stable@vger.kernel.org
Linus Torvalds [Sat, 12 Oct 2013 18:06:18 +0000 (11:06 -0700)]
Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull gcc "asm goto" miscompilation workaround from Ingo Molnar:
"This is the fix for the GCC miscompilation discussed in the following
lkml thread:
[x86] BUG: unable to handle kernel paging request at 00740060
The bug in GCC has been fixed by Jakub and the fix will be part of the
GCC 4.8.2 release expected to be released next week - so the quirk's
version test checks for <= 4.8.1.
The quirk is only added to compiler-gcc4.h and not to the higher level
compiler.h because all asm goto uses are behind a feature check"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
compiler/gcc4: Add quirk for 'asm goto' miscompilation bug
Linus Torvalds [Sat, 12 Oct 2013 17:36:03 +0000 (10:36 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"A build fix and a reboot quirk"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/reboot: Add reboot quirk for Dell Latitude E5410
x86, build, pci: Fix PCI_MSI build on !SMP
Linus Torvalds [Sat, 12 Oct 2013 17:34:14 +0000 (10:34 -0700)]
Merge tag 'arc-fixes-for-3.12-part3' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fix from Vineet Gupta:
"Fix for broken gdb 'jump'"
* tag 'arc-fixes-for-3.12-part3' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: Ignore ptrace SETREGSET request for synthetic register "stop_pc"
Vineet Gupta [Thu, 10 Oct 2013 14:03:57 +0000 (19:33 +0530)]
ARC: Ignore ptrace SETREGSET request for synthetic register "stop_pc"
ARCompact TRAP_S insn used for breakpoints, commits before exception is
taken (updating architectural PC). So ptregs->ret contains next-PC and
not the breakpoint PC itself. This is different from other restartable
exceptions such as TLB Miss where ptregs->ret has exact faulting PC.
gdb needs to know exact-PC hence ARC ptrace GETREGSET provides for
@stop_pc which returns ptregs->ret vs. EFA depending on the
situation.
However, writing stop_pc (SETREGSET request), which updates ptregs->ret
doesn't makes sense stop_pc doesn't always correspond to that reg as
described above.
This was not an issue so far since user_regs->ret / user_regs->stop_pc
had same value and both writing to ptregs->ret was OK, needless, but NOT
broken, hence not observed.
With gdb "jump", they diverge, and user_regs->ret updating ptregs is
overwritten immediately with stop_pc, which this patch fixes.
Reported-by: Anton Kolesov <akolesov@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ACPI / hotplug / PCI: Drop WARN_ON() from acpiphp_enumerate_slots()
The WARN_ON() in acpiphp_enumerate_slots() triggers unnecessarily for
devices whose bridges are going to be handled by native PCIe hotplug
(pciehp) and the simplest way to prevent that from happening is to
drop the WARN_ON().
References: https://bugzilla.kernel.org/show_bug.cgi?id=62831 Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Linus Torvalds [Fri, 11 Oct 2013 17:41:21 +0000 (10:41 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"All over the map..
- nouveau:
disable MSI, needs more work, will try again next merge window
- radeon:
audio + uvd regression fixes, dpm fixes, reset fixes
- i915:
the dpms fix might fix your haswell
And one pain in the ass revert, so we have VGA arbitration that when
implemented 4-5 years ago really hoped that GPUs could remove
themselves from arbitration completely once they had a kernel driver.
It seems Intel hw designers decided that was too nice a facility to
allow us to have so they removed it when they went on-die (so since
Ironlake at least). Now Alex Williamson added support for VGA
arbitration for newer GPUs however this now exposes itself to
userspace as requireing arbitration of GPU VGA regions and the X
server gets involved and disables things that it can't handle when VGA
access is possibly required around every operation.
So in order to not break userspace we just reverted things back to the
old known broken status so maybe we can try and design out way out.
Ville also had a patch to use stop machine for the two times Intel
needs to access VGA space, that might be acceptable with some rework,
but for now myself and Daniel agreed to just go back"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (23 commits)
Revert "i915: Update VGA arbiter support for newer devices"
Revert "drm/i915: Delay disabling of VGA memory until vgacon->fbcon handoff is done"
drm/radeon: re-enable sw ACR support on pre-DCE4
drm/radeon/dpm: disable bapm on TN asics
drm/radeon: improve soft reset on CIK
drm/radeon: improve soft reset on SI
drm/radeon/dpm: off by one in si_set_mc_special_registers()
drm/radeon/dpm/btc: off by one in btc_set_mc_special_registers()
drm/radeon: forever loop on error in radeon_do_test_moves()
drm/radeon: fix hw contexts for SUMO2 asics
drm/radeon: fix typo in CP DMA register headers
drm/radeon/dpm: disable multiple UVD states
drm/radeon: use hw generated CTS/N values for audio
drm/radeon: fix N/CTS clock matching for audio
drm/radeon: use 64-bit math to calculate CTS values for audio (v2)
drm/edid: catch kmalloc failure in drm_edid_to_speaker_allocation
Revert "drm/fb-helper: don't sleep for screen unblank when an oops is in progress"
drm/gma500: fix things after get/put page helpers
drm/nouveau/mc: disable msi support by default, it's busted in tons of places
drm/i915: Only apply DPMS to the encoder if enabled
...
Antonios Motakis [Fri, 11 Oct 2013 16:40:46 +0000 (10:40 -0600)]
VFIO: vfio_iommu_type1: fix bug caused by break in nested loop
In vfio_iommu_type1.c there is a bug in vfio_dma_do_map, when checking
that pages are not already mapped. Since the check is being done in a
for loop nested within the main loop, breaking out of it does not create
the intended behavior. If the underlying IOMMU driver returns a non-NULL
value, this will be ignored and mapping the DMA range will be attempted
anyway, leading to unpredictable behavior.
This interracts badly with the ARM SMMU driver issue fixed in the patch
that was submitted with the title:
"[PATCH 2/2] ARM: SMMU: return NULL on error in arm_smmu_iova_to_phys"
Both fixes are required in order to use the vfio_iommu_type1 driver
with an ARM SMMU.
This patch refactors the function slightly, in order to also make this
kind of bug less likely.
Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
ALSA: hda - Fix microphone for Sony VAIO Pro 13 (Haswell model)
The external mic showed up with a precense detect of "always present",
essentially disabling the internal mic. Therefore turn off presence
detection for this pin.
Note: The external mic seems not yet working, but an internal mic is
certainly better than no mic at all.
Ingo Molnar [Thu, 10 Oct 2013 08:16:30 +0000 (10:16 +0200)]
compiler/gcc4: Add quirk for 'asm goto' miscompilation bug
Fengguang Wu, Oleg Nesterov and Peter Zijlstra tracked down
a kernel crash to a GCC bug: GCC miscompiles certain 'asm goto'
constructs, as outlined here:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670
Implement a workaround suggested by Jakub Jelinek.
Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com> Reported-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Suggested-by: Jakub Jelinek <jakub@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
Adding drm/i915 into the vga arbiter chain means that X (in a piece of
well-meant paranoia) will do a get/put on the vga decoding around
_every_ accel call down into the ddx. Which results in some nice
performance disasters [1]. This really breaks userspace, by disabling
DRI for everyone, and stops OpenGL from working, this isn't limited
to just the i915 but both the integrated and discrete GPUs on
multi-gpu systems, in other words this causes untold worlds of pain,
Ville tried to come up with a Great Hack to fiddle the required VGA
I/O ops behind everyone's back using stop_machine, but that didn't
really work out [2]. Given that we're fairly late in the -rc stage for
such games let's just revert this all.
One thing we might want to keep is to delay the disabling of the vga
decoding until the fbdev emulation and the fbcon screen is set up. If
we kill vga mem decoding beforehand fbcon can end up with a white
square in the top-left corner it tried to save from the vga memory for
a seamless transition. And we have bug reports on older platforms
which seem to match these symptoms.
But again that's something to play around with in -next.
References: [1] http://lists.x.org/archives/xorg-devel/2013-September/037763.html
References: [2] http://www.spinics.net/lists/intel-gfx/msg34062.html Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 11 Oct 2013 03:07:15 +0000 (13:07 +1000)]
Merge branch 'drm-fixes-3.12' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Regression fixes for audio and UVD, several hang fixes,
some DPM fixes.
* 'drm-fixes-3.12' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: re-enable sw ACR support on pre-DCE4
drm/radeon/dpm: disable bapm on TN asics
drm/radeon: improve soft reset on CIK
drm/radeon: improve soft reset on SI
drm/radeon/dpm: off by one in si_set_mc_special_registers()
drm/radeon/dpm/btc: off by one in btc_set_mc_special_registers()
drm/radeon: forever loop on error in radeon_do_test_moves()
drm/radeon: fix hw contexts for SUMO2 asics
drm/radeon: fix typo in CP DMA register headers
drm/radeon/dpm: disable multiple UVD states
drm/radeon: use hw generated CTS/N values for audio
drm/radeon: fix N/CTS clock matching for audio
drm/radeon: use 64-bit math to calculate CTS values for audio (v2)
drm/edid: catch kmalloc failure in drm_edid_to_speaker_allocation
Sergei Shtylyov [Wed, 25 Sep 2013 22:31:37 +0000 (02:31 +0400)]
rcar-hpbdma: add parameter to set_slave() method
Commit 4981c4dc194efb18f0e9a02f1b43e926f2f0d2bb (DMA: shdma: switch DT mode to
use configuration data from a match table) added a new parameter to set_slave()
method but unfortunately got merged later than commit c4f6c41ba790bbbfcebb4c47a
(dma: add driver for R-Car HPB-DMAC), so that the HPB-DMAC driver retained the
old prototype which caused this warning:
drivers/dma/sh/rcar-hpbdma.c:485: warning: initialization from incompatible
pointer type
The newly added parameter is used to override DMA slave address from 'struct
hpb_dmae_slave_config', so we have to add the 'slave_addr' field to 'struct
hpb_dmae_chan', conditionally assign it in set_slave() method, and return in
slave_addr() method.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Tested-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
drivers/dma/sh/rcar-hpbdma.c: In function `hpb_dmae_alloc_chan_resources':
drivers/dma/sh/rcar-hpbdma.c:435: error: implicit declaration of function
`shdma_free_irq'
Fix this compilation error by removing the remaining shdma_free_irq() calls.
Reported-by: Simon Horman <horms@verge.net.au> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Tested-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
It is because we returned the error number if the reference of the root was 0
when doing space relocation. It was not right here, because though the root
was dead(refs == 0), but the space it held still need be relocated, or we
could not remove the block group. So in this case, we should return the root
no matter it is dead or not.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Now we don't drop all the deleted snapshots/subvolumes before the space
balance. It means we have to relocate the space which is held by the dead
snapshots/subvolumes. So we must into them into fs radix tree, or we would
forget to commit the change of them when doing transaction commit, and it
would corrupt the metadata.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Tue, 8 Oct 2013 02:11:09 +0000 (22:11 -0400)]
Btrfs: limit delalloc pages outside of find_delalloc_range
Liu fixed part of this problem and unfortunately I steered him in slightly the
wrong direction and so didn't completely fix the problem. The problem is we
limit the size of the delalloc range we are looking for to max bytes and then we
try to lock that range. If we fail to lock the pages in that range we will
shrink the max bytes to a single page and re loop. However if our first page is
inside of the delalloc range then we will end up limiting the end of the range
to a period before our first page. This is illustrated below
[0 -------- delalloc range --------- 256mb]
[page]
So find_delalloc_range will return with delalloc_start as 0 and end as 128mb,
and then we will notice that delalloc_start < *start and adjust it up, but not
adjust delalloc_end up, so things go sideways. To fix this we need to not limit
the max bytes in find_delalloc_range, but in find_lock_delalloc_range and that
way we don't end up with this confusion. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Wed, 9 Oct 2013 16:24:04 +0000 (12:24 -0400)]
Btrfs: use right root when checking for hash collision
btrfs_rename was using the root of the old dir instead of the root of the new
dir when checking for a hash collision, so if you tried to move a file into a
subvol it would freak out because it would see the file you are trying to move
in its current root. This fixes the bug where this would fail
Cc: stable@vger.kernel.org Reported-by: Chris Murphy <lists@colorremedies.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Kent Overstreet [Fri, 11 Oct 2013 00:31:15 +0000 (17:31 -0700)]
bcache: Fix a null ptr deref regression
Commit c0f04d88e46d ("bcache: Fix flushes in writeback mode") was fixing
a reported data corruption bug, but it seems some last minute
refactoring or rebasing introduced a null pointer deref.
Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10 Reported-by: Gabriel de Perthuis <g2p.code@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fengguang Wu [Wed, 9 Oct 2013 01:26:21 +0000 (09:26 +0800)]
kobject: show debug info on delayed kobject release
Useful for locating buggy drivers on kernel oops.
It may add dozens of new lines to boot dmesg. DEBUG_KOBJECT_RELEASE is
hopefully only enabled in debug kernels (like maybe the Fedora rawhide
one, or at developers), so being a bit more verbose is likely ok.
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
watchdog: hpwdt: Patch to ignore auxilary iLO devices
This patch is to prevent hpwdt from loading on any auxilary iLO devices defined
after the initial (or main) iLO device. All auxilary iLO devices will have a
subsystem device ID set to 0x1979 in order for hpwdt to differentiate between
the two types.
Signed-off-by: Thomas Mingarelli <thomas.mingarelli@hp.com> Tested-by: Lisa Mitchell <lisa.mitchell@hp.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Linus Torvalds [Thu, 10 Oct 2013 19:31:43 +0000 (12:31 -0700)]
Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random
Pull /dev/random changes from Ted Ts'o:
"These patches are designed to enable improvements to /dev/random for
non-x86 platforms, in particular MIPS and ARM"
* tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
random: allow architectures to optionally define random_get_entropy()
random: run random_int_secret_init() run after all late_initcalls
Linus Torvalds [Thu, 10 Oct 2013 18:33:02 +0000 (11:33 -0700)]
Merge tag 'spi-v3.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"This is all driver updates, mostly fixes for error handling paths
except for the s3c64xx and hspi fixes for trying to use runtime PM
before it is enabled and the pxa2xx fix for interactions between power
management and interrupt handling"
* tag 'spi-v3.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: atmel: Fix incorrect error path
spi/hspi: fixup Runtime PM enable timing
spi/s3c64xx: Ensure runtime PM is enabled prior to registration
spi/clps711x: drop clk_put for devm_clk_get in spi_clps711x_probe()
spi: fix return value check in dspi_probe()
spi: mpc512x: fix error return code in mpc512x_psc_spi_do_probe()
spi: clps711x: Don't call kfree() after spi_master_put/spi_unregister_master
spi/pxa2xx: check status register as well to determine if the device is off
random: allow architectures to optionally define random_get_entropy()
Allow architectures which have a disabled get_cycles() function to
provide a random_get_entropy() function which provides a fine-grained,
rapidly changing counter that can be used by the /dev/random driver.
For example, an architecture might have a rapidly changing register
used to control random TLB cache eviction, or DRAM refresh that
doesn't meet the requirements of get_cycles(), but which is good
enough for the needs of the random driver.
Eli Cohen [Wed, 11 Sep 2013 13:35:28 +0000 (16:35 +0300)]
mlx5: Keep polling to reclaim pages while any returned
Change mlx5_reclaim_startup_pages() to keep polling while any pages
are returned. If none are returned, keep polling for five more seconds
before exiting with an error message.
Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
Eli Cohen [Wed, 11 Sep 2013 13:35:27 +0000 (16:35 +0300)]
IB/mlx5: Avoid async events on invalid port number
On a single ported Connect-IB, its possible for the firmware to issue
events on the non-existing 2nd port. Make sure to ignore events
generated for such ports.
Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
Eli Cohen [Wed, 11 Sep 2013 13:35:26 +0000 (16:35 +0300)]
IB/mlx5: Decrease memory consumption of mr caches
Change the logic so we do not allocate memory nor map the device
before actually posting to the REG_UMR QP. In addition, unmap and free
the memory after we get completion.
Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
IB/mlx5: Flush cache workqueue before destroying it
Destroying the workqueue without flushing it first can lead to a case
in which the kernel tries to push a delayed work to the workqueue
which does not exist anymore.
Signed-off-by: Moshe Lazer <moshel@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
Eli Cohen [Wed, 11 Sep 2013 13:35:22 +0000 (16:35 +0300)]
IB/mlx5: Fix send work queue size calculation
1. Make sure wqe_cnt does not exceed the limit published by firmware.
2. There is no requirement that the number of outstanding work
requests will be a power of two. Remove the ilog2 in the
calculation of sq.max_post to fix that.
3. Add case for IB_QPT_XRC_TGT in sq_overhead and return 0 as XRC
target QPs do not have a send queue.
Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
Frediano Ziglio [Thu, 10 Oct 2013 14:39:37 +0000 (14:39 +0000)]
xen: Fix possible user space selector corruption
Due to the way kernel is initialized under Xen is possible that the
ring1 selector used by the kernel for the boot cpu end up to be copied
to userspace leading to segmentation fault in the userspace.
Xen code in the kernel initialize no-boot cpus with correct selectors (ds
and es set to __USER_DS) but the boot one keep the ring1 (passed by Xen).
On task context switch (switch_to) we assume that ds, es and cs already
point to __USER_DS and __KERNEL_CSso these selector are not changed.
If processor is an Intel that support sysenter instruction sysenter/sysexit
is used so ds and es are not restored switching back from kernel to
userspace. In the case the selectors point to a ring1 instead of __USER_DS
the userspace code will crash on first memory access attempt (to be
precise Xen on the emulated iret used to do sysexit will detect and set ds
and es to zero which lead to GPF anyway).
Now if an userspace process call kernel using sysenter and get rescheduled
(for me it happen on a specific init calling wait4) could happen that the
ring1 selector is set to ds and es.
This is quite hard to detect cause after a while these selectors are fixed
(__USER_DS seems sticky).