Vineet Gupta [Thu, 9 May 2013 16:24:51 +0000 (21:54 +0530)]
ARC: [mm] Aliasing VIPT dcache support 2/4
This is the meat of the series which prevents any dcache alias creation
by always keeping the U and K mapping of a page congruent.
If a mapping already exists, and other tries to access the page, prev
one is flushed to physical page (wback+inv)
Essentially flush_dcache_page()/copy_user_highpage() create K-mapping
of a page, but try to defer flushing, unless U-mapping exist.
When page is actually mapped to userspace, update_mmu_cache() flushes
the K-mapping (in certain cases this can be optimised out)
Additonally flush_cache_mm(), flush_cache_range(), flush_cache_page()
handle the puring of stale userspace mappings on exit/munmap...
flush_anon_page() handles the existing U-mapping for anon page before
kernel reads it via the GUP path.
Note that while not complete, this is enough to boot a simple
dynamically linked Busybox based rootfs
flush_dcache_page( ) is MM hook to ensure that a page has consistent
views between kernel and userspace. Thus it is called when
* kernel writes to a page which at some later point could get mapped to
userspace (so kernel mapping needs to be flushed-n-inv)
* kernel is about to read from a page with possible userspace mappings
(so userspace mappings needs to be made coherent with kernel ones)
However for Non aliasing VIPT dcache, any userspace mapping will always
be congruent to kernel mapping. Thus d-cache need need not be flushed at
all (or delayed indefinitely).
The only reason it does need to be flushed is when mapping code pages.
Since icache doesn't snoop dcache, those dirty dcache lines need to be
written back to memory and icache line invalidated so that icache lines
fetch will get the right data.
Decent gains on LMBench fork/exec/sh and File I/O micro-benchmarks.
(1) FPGA @ 80 MHZ
Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host OS Mhz null null open slct sig sig fork exec sh
call I/O stat clos TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
3.9-rc6-a Linux 3.9.0-r 80 4.79 8.72 66.7 116. 239. 8.39 30.4 4798 14.K 34.K
3.9-rc6-b Linux 3.9.0-r 80 4.79 8.62 65.4 111. 239. 8.35 29.0 3995 12.K 30.K
3.9-rc7-c Linux 3.9.0-r 80 4.79 9.00 66.1 106. 239. 8.61 30.4 2858 10.K 24.K
^^^^ ^^^^ ^^^
File & VM system latencies in microseconds - smaller is better
-------------------------------------------------------------------------------
Host OS 0K File 10K File Mmap Prot Page 100fd
Create Delete Create Delete Latency Fault Fault selct
--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
3.9-rc6-a Linux 3.9.0-r 317.8 204.2 1122.3 375.1 3522.0 4.288 20.7 126.8
3.9-rc6-b Linux 3.9.0-r 298.7 223.0 1141.6 367.8 3531.0 4.866 20.9 126.4
3.9-rc7-c Linux 3.9.0-r 278.4 179.2 862.1 339.3 3705.0 3.223 20.3 126.6
^^^^^ ^^^^^ ^^^^^ ^^^^
(2) Customer Silicon @ 500 MHz (166 MHz mem)
------------------------------------------------------------------------------
Host OS Mhz null null open slct sig sig fork exec sh
call I/O stat clos TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
abilis-ba Linux 3.9.0-r 497 0.71 1.38 4.58 12.0 35.5 1.40 3.89 2070 5525 13.K
abilis-ca Linux 3.9.0-r 497 0.71 1.40 4.61 11.8 35.6 1.37 3.92 1411 4317 10.K
^^^^ ^^^^ ^^^
Now that we have same helper used for all icache invalidates (i.e.
vaddr+paddr based exact line invalidate), consolidate the open coded
calls into one place.
Also rename flush_icache_range_vaddr => __sync_icache_dcache
ARC: [mm] optimise icache flush for kernel mappings
This change continues the theme from prev commit - this time icache
handling for kernel's own code modification (vmalloc: loadable modules,
breakpoints for kprobes/kgdb...)
flush_icache_range() calls the CDU icache helper with vaddr to enable
exact line invalidate.
For a true kernel-virtual mapping, the vaddr is actually virtual hence
valid as index into cache. For kprobes breakpoint however, the vaddr arg
is actually paddr - since that's how normal kernel is mapped in ARC
memory map. This implies that CDU will use the same addr for
indexing as for tag match - which is fine since kernel code would only
have that "implicit" mapping and none other.
This should speed up module loading significantly - specially on default
ARC700 icache configurations (32k) which alias.
ARC icache doesn't snoop dcache thus executable pages need to be made
coherent before mapping into userspace in flush_icache_page().
However ARC700 CDU (hardware cache flush module) requires both vaddr
(index in cache) as well as paddr (tag match) to correctly identify a
line in the VIPT cache. A typical ARC700 SoC has aliasing icache, thus
the paddr only based flush_icache_page() API couldn't be implemented
efficiently. It had to loop thru all possible alias indexes and perform
the invalidate operation (ofcourse the cache op would only succeed at
the index(es) where tag matches - typically only 1, but the cost of
visiting all the cache-bins needs to paid nevertheless).
Turns out however that the vaddr (along with paddr) is available in
update_mmu_cache() hence better suits ARC icache flush semantics.
With both vaddr+paddr, exactly one flush operation per line is done.
So even a single page munmap, a frequent operation when uClibc dynamic
linker (ldso) is loading the dependent shared libraries, would move the
the ASID multiple times - needlessly invalidating the pre-faulted TLB
entries (and increasing the rate of ASID wraparound + full TLB flush).
This is now optimised to only be called if tlb->full_mm (which means
for exit/execve) cases only. And for those cases, flush_tlb_mm() is
already optimised to be a no-op for mm->mm_users == 0.
So essentially there are no mmore full mm flushes - except for fork which
anyhow needs it for properly COW'ing parent address space.
munmap now needs to do TLB range flush, which is implemented with
tlb_end_vma()
Results
-------
1. ASID now consistenly moves by 4 during a simple ls (as opposed to 5 or
7 before).
2. LMBench microbenchmark also shows improvements
Basic system parameters
------------------------------------------------------------------------------
Host OS Description Mhz tlb cache mem scal
pages line par load
bytes
--------- ------------- ----------------------- ---- ----- ----- ------ ----
3.9-rc5-0 Linux 3.9.0-r 3.9-rc5-0404-gcc-4.4-ba 80 8 64 1.1000 1
3.9-rc5-0 Linux 3.9.0-r 3.9-rc5-0405-avoid-full 80 8 64 1.1200 1
Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host OS Mhz null null open slct sig sig fork exec sh
call I/O stat clos TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
3.9-rc5-0 Linux 3.9.0-r 80 4.81 8.69 68.6 118. 239. 8.53 31.6 4839 13.K 34.K
3.9-rc5-0 Linux 3.9.0-r 80 4.46 8.36 53.8 91.3 223. 8.12 24.2 4725 13.K 33.K
File & VM system latencies in microseconds - smaller is better
-------------------------------------------------------------------------------
Host OS 0K File 10K File Mmap Prot Page 100fd
Create Delete Create Delete Latency Fault Fault selct
--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
3.9-rc5-0 Linux 3.9.0-r 314.7 223.2 1054.9 390.2 3615.0 1.590 20.1 126.6
3.9-rc5-0 Linux 3.9.0-r 265.8 183.8 1014.2 314.1 3193.0 6.910 18.8 110.4
This adds support for an ARC Virtual Platform. This platform is based on the
System C standard promoted by the OSCI (Open System C Initiative) and uses
nSIM to simulate the ARC CPU core itself.
Users can build a virtual SoC by combining System C models of peripherals
and CPU cores.
ARC: [TB10x] Adapt device tree to new compatible string
The original device tree was written using a slightly different
implementation of the fixed-factor-clock device tree binding. The
compatible string must be modified in order to be compatible with the
new implementation.
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Infrastructure required to make the Linux kernel compile and boot on the
Abilis Systems TB10x series of SOCs based on ARC700 CPUs:
- Kmake related files (Kconfig, Makefile, tb10x_defconfig)
- TB10x platform initialisation
ARC: [TB10x] Device tree of TB100 and TB101 Development Kits
These are the device tree files for the Abilis Systems TB100 and TB101 ICs and
their respective development kit PCBs. These files are committed in preparation
of the following patch set which adds support for these chips to the ARC
platform.
ARC: Prepare interrupt code for external controllers
This patch adds some room for CPU-external interrupt controllers in the
Linux interrupt space. Until now, only the 32 CPU internal interrupt lines
were supported which does not allow for external interrupt controllers such
as GPIO modules etc.
ARC: Allow embedded arc-intc to be properly placed in DT intc hierarchy
arc-intc is initialized in arc common code as it is applicable to all
platforms. However platforms with their own external intc still need to
refer to it for correct DT interrupt tree hierarchy setup,
The fix is to use the generic irqchip framework to tie all irqchips in
a special linker section and then call irqchip_init() which calls the
DT of_irq_init() for all the intc in one go.
That way the platform code need not be aware of arc-intc at all.
ARC: Remove non existent refs to GENERIC_KERNEL_EXECVE & GENERIC_KERNEL_THREAD
This tracks mainline commit ae903caae267 "Bury the conditionals from
kernel_thread/kernel_execve series" which we missed out as ARC port was
not yet mainline.
[vgupta: commit log modified]
Signed-off-by: Alexander Shiyan <shc_work@mail.ru> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
commit "fs: make binfmt support for #! scripts modular and removable"
made support for #!scripts optional - thus need to include the Kconfig
file to get all relevant BINFMT_*
Vineet Gupta [Sat, 30 Mar 2013 09:37:47 +0000 (15:07 +0530)]
ARC: [kbuild] Allow platforms to disable LLSC for !SMP as well
Currently ARC_HAS_LLSC can be influenced by platform for SMP only using
ARC_HAS_COH_LLSC. For !SMP it defaults to "y".
It turns out that some customers can't support it all, even in UP.
So we change the semantics, and use a negative dependency ARC_CANT_LLSC.
Any platform (independent of SMP or !SMP) can select it to disable LLSC.
Vineet Gupta [Fri, 22 Mar 2013 16:34:27 +0000 (22:04 +0530)]
ARC: [build] Allow uncompressed uImage
The existing uImage target always generates gzip compressed image which
drags bootup for some very slow FPGA customer boards.
So introduce seperate make targets:uImage.{bin,gz} with uncompressed
being default. Also tie gz generation to CONFIG_KERNEL_GZIP, which a
platform can select in it's Kconfig if it wishes gz to be default.
Vineet Gupta [Thu, 21 Mar 2013 06:52:06 +0000 (12:22 +0530)]
ARC: [build] silence make defconfig warnings with host gcc 4.7
We do cross compiles for ARC Linux.
With gcc 4.7, a make defconfig spews out the following:
------------------->8--------------------------
make ARCH=arc defconfig
gcc: error: unrecognized command line option '-marc600'
gcc: error: unrecognized command line option '-mA7'
gcc: error: unrecognized command line option '-mno-sdata'
gcc: error: unrecognized command line option '-mno-mpy'
*** Default configuration is based on 'fpga_defconfig'
------------------->8--------------------------
This apparently is coming from LIBGCC line - which is strange to be
invoked for defconfig generation.
Paul Bolle [Fri, 15 Mar 2013 16:16:17 +0000 (17:16 +0100)]
ARC: remove #ifdef-ed out include of dead header
There's no (Kconfig) macro CONFIG_BLOCK_DEV_RAM. (CONFIG_BLK_DEV_RAM
does exist though.) But linux/blk.h got killed in 2005 anyway (in a
patch titled "kill blk.h"), so these three lines can be removed.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Sachin Kamat [Wed, 6 Mar 2013 11:23:45 +0000 (16:53 +0530)]
ARC: Fix coding style issues
Fixes the following coding style issues as detected by checkpatch:
ERROR: space required before the open parenthesis '('
ERROR: "foo * bar" should be "foo *bar"
WARNING: space prohibited between function name and open parenthesis '('
WARNING: please, no spaces at the start of a line
Sachin Kamat [Wed, 6 Mar 2013 11:23:44 +0000 (16:53 +0530)]
ARC: Use <linux/*> headers instead of <asm/*>
Silences the following checkpatch warnings:
WARNING: Use #include <linux/ptrace.h> instead of <asm/ptrace.h>
WARNING: Use #include <linux/kprobes.h> instead of <asm/kprobes.h>
WARNING: Use #include <linux/kgdb.h> instead of <asm/kgdb.h>
WARNING: Use #include <linux/uaccess.h> instead of <asm/uaccess.h>
WARNING: Use #include <linux/cache.h> instead of <asm/cache.h>
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sfr/next-fixes
Pull powerpc bugfix from Stephen Rothwell:
"A single BUG_ON fix for a condition that could happen for machines
with certain hardware installed."
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sfr/next-fixes:
powerpc: pSeries_lpar_hpte_remove fails from Adjunct partition being performed before the ANDCOND test
ARC: Add implicit compiler barrier to raw_local_irq* functions
ARC irqsave/restore macros were missing the compiler barrier, causing a
stale load in irq-enabled region be used in irq-safe region, despite
being changed, because the register holding the value was still live.
The problem manifested as random crashes in timer code when stress
testing ARCLinux (3.9-rc3) on a !SMP && !PREEMPT_COUNT
Here's the exact sequence which caused this:
(0). tv1[x] <----> t1 <---> t2
(1). mod_timer(t1) interrupted after it calls timer_pending()
(2). mod_timer(t2) completes
(3). mod_timer(t1) resumes but messes up the list
(4). __runt_timers( ) uses bogus timer_list entry / crashes in
timer->function
Essentially mod_timer() was racing against itself and while the spinlock
serialized the tv1[] timer link list, timer_pending() called outside the
spinlock, cached timer link list element in a register.
With low register pressure (and a deep register file), lack of barrier
in raw_local_irqsave() as well as preempt_disable (!PREEMPT_COUNT
version), there was nothing to force gcc to reload across the spinlock,
causing a stale value in reg be used for link list manipulation - ensuing
a corruption.
ARcompact disassembly which shows the culprit generated code:
tst_s r3,r3 <--------------
# timer_pending( ) checks timer->entry.next
# r3 is NOT reloaded by gcc, using stale value
beq.d @.L169
mov.eq r0,0
##### detach_timer( ): __list_del( )
ld r4,[r13,4] # <variable>.entry.prev, D.31439
st r4,[r3,4] # <variable>.prev, D.31439
st r3,[r4] # <variable>.next, D.30246
We initially tried to fix this by adding barrier() to preempt_* macros
for !PREEMPT_COUNT but Linus clarified that it was anything but wrong.
http://www.spinics.net/lists/kernel/msg1512709.html
[vgupta: updated commitlog]
Reported-by/Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com> Cc: Christian Ruppert <christian.ruppert@abilis.com> Cc: Pierrick Hascoet <pierrick.hascoet@abilis.com>
Debugged-by/Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Merge tag 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
Pull libata fixes from Jeff Garzik:
"The HDIO_DRIVE_* fix is really the biggie.
1) Fix ATAPI regression, noticed mainly on tape drives, due to a
commit which mistakenly changed an 'int' return type to a 'bool'.
Broken by commit 4dce8ba94c75 ("libata: Use 'bool' return value for
ata_id_XXX")
2) Add Slimtype DVD A DS8A8SH ATAPI quirk
3) ata_piix: Intel Haswell platform quirk
4) Avoid DMA'ing to stack buffer, when obtaining DEVSLP timings. IMO
a mild regression, given that libata previously did not DMA to a
stack buffer. Broken by commit commit 803739d25c23 ("[libata]
replace sata_settings with devslp_timing")
5) Fix regression impacting SMART and smartd, broken by commit 84a9a8cd9d0a ("[libata] Set proper SK when CK_COND is set")"
* tag 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[libata] Fix HDIO_DRIVE_* ioctl() Linux 3.9 regression
libata: fix DMA to stack in reading devslp_timing parameters
ata_piix: Fix DVD not dectected at some Haswell platforms
libata: Set max sector to 65535 for Slimtype DVD A DS8A8SH drive
libata: Use integer return value for atapi_command_packet_set
Merge tag 'trace-fixes-3.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"This includes three fixes. Two fix features added in 3.9 and one
fixes a long time minor bug.
The first patch fixes a race that can happen if the user switches from
the irqsoff tracer to another tracer. If a irqs off latency is
detected, it will try to use the snapshot buffer, but the new tracer
wont have it allocated. There's a nasty warning that gets printed and
the trace is ignored. Nothing crashes, just a nasty WARN_ON is shown.
The second patch fixes an issue where if the sysctl is used to disable
and enable function tracing, it can put the function tracing into an
unstable state.
The third patch fixes an issue with perf using the function tracer.
An update was done, where the stub function could be called during the
perf function tracing, and that stub function wont have the "control"
flag set and cause a nasty warning when running perf."
* tag 'trace-fixes-3.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace: Do not call stub functions in control loop
ftrace: Consistently restore trace function on sysctl enabling
tracing: Fix race with update_max_tr_single and changing tracers
ftrace: Do not call stub functions in control loop
The function tracing control loop used by perf spits out a warning
if the called function is not a control function. This is because
the control function references a per cpu allocated data structure
on struct ftrace_ops that is not allocated for other types of
functions.
commit 0a016409e42 "ftrace: Optimize the function tracer list loop"
Had an optimization done to all function tracing loops to optimize
for a single registered ops. Unfortunately, this allows for a slight
race when tracing starts or ends, where the stub function might be
called after the current registered ops is removed. In this case we
get the following dump:
To fix this a new ftrace_ops flag is added that denotes the ftrace_list_end
ftrace_ops stub as just that, a stub. This flag is now checked in the
control loop and the function is not called if the flag is set.
Thanks to Jovi for not just reporting the bug, but also pointing out
where the bug was in the code.
Jan Kiszka [Tue, 26 Mar 2013 16:53:03 +0000 (17:53 +0100)]
ftrace: Consistently restore trace function on sysctl enabling
If we reenable ftrace via syctl, we currently set ftrace_trace_function
based on the previous simplistic algorithm. This is inconsistent with
what update_ftrace_function does. So better call that helper instead.
tracing: Fix race with update_max_tr_single and changing tracers
The commit 34600f0e9 "tracing: Fix race with max_tr and changing tracers"
fixed the updating of the main buffers with the race of changing
tracers, but left out the fix to the updating of just a per cpu buffer.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Michael Wolf [Fri, 5 Apr 2013 10:41:40 +0000 (10:41 +0000)]
powerpc: pSeries_lpar_hpte_remove fails from Adjunct partition being performed before the ANDCOND test
Some versions of pHyp will perform the adjunct partition test before the
ANDCOND test. The result of this is that H_RESOURCE can be returned and
cause the BUG_ON condition to occur. The HPTE is not removed. So add a
check for H_RESOURCE, it is ok if this HPTE is not removed as
pSeries_lpar_hpte_remove is looking for an HPTE to remove and not a
specific HPTE to remove. So it is ok to just move on to the next slot
and try again.
Cc: stable@vger.kernel.org Signed-off-by: Michael Wolf <mjw@linux.vnet.ibm.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Peter Anvin:
"Two quite small fixes: one a build problem, and the other fixes
seccomp filters on x32."
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Fix rebuild with EFI_STUB enabled
x86: remove the x32 syscall bitmask from syscall_get_nr()
Will Deacon [Sun, 7 Apr 2013 09:36:12 +0000 (21:36 +1200)]
alpha: irq: remove deprecated use of IRQF_DISABLED
Interrupt handlers are always invoked with interrupts disabled, so
remove all uses of the deprecated IRQF_DISABLED flag.
Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Will Deacon [Sun, 7 Apr 2013 09:36:11 +0000 (21:36 +1200)]
alpha: irq: run all handlers with interrupts disabled
Linux has expected that interrupt handlers are executed with local
interrupts disabled for a while now, so ensure that this is the case on
Alpha even for non-device interrupts such as IPIs.
Without this patch, secondary boot results in the following backtrace:
A similar dump occurs if you try to reboot using magic-sysrq.
Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Will Deacon [Sun, 7 Apr 2013 09:36:10 +0000 (21:36 +1200)]
alpha: makefile: don't enforce small data model for kernel builds
Due to all of the goodness being packed into today's kernels, the
resulting image isn't as slim as it once was.
In light of this, don't pass -msmall-data to gcc, which otherwise results
in link failures due to impossible relocations when compiling anything but
the most trivial configurations.
Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Thorsten Kranzkowski <dl8bcu@dl8bcu.de> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Honig [Fri, 29 Mar 2013 16:35:21 +0000 (09:35 -0700)]
KVM: Allow cross page reads and writes from cached translations.
This patch adds support for kvm_gfn_to_hva_cache_init functions for
reads and writes that will cross a page. If the range falls within
the same memslot, then this will be a fast operation. If the range
is split between two memslots, then the slower kvm_read_guest and
kvm_write_guest are used.
Tested: Test against kvm_clock unit tests.
Signed-off-by: Andrew Honig <ahonig@google.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
Merge tag 'dm-3.9-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm
Pull device-mapper fixes from Alasdair Kergon:
"A pair of patches to fix the writethrough mode of the device-mapper
cache target when the device being cached is not itself wrapped with
device-mapper."
* tag 'dm-3.9-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm:
dm cache: reduce bio front_pad size in writeback mode
dm cache: fix writes to cache device in writethrough mode
Merge tag 'pci-v3.9-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
"PCI updates for v3.9:
ASPM
Revert "PCI/ACPI: Request _OSC control before scanning PCI root bus"
kexec
PCI: Don't try to disable Bus Master on disconnected PCI devices
Platform ROM images
PCI: Add PCI ROM helper for platform-provided ROM images
nouveau: Attempt to use platform-provided ROM image
radeon: Attempt to use platform-provided ROM image
Hotplug
PCI/ACPI: Always resume devices on ACPI wakeup notifications
PCI/PM: Disable runtime PM of PCIe ports
EISA
EISA/PCI: Fix bus res reference
EISA/PCI: Init EISA early, before PNP"
* tag 'pci-v3.9-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI/PM: Disable runtime PM of PCIe ports
PCI/ACPI: Always resume devices on ACPI wakeup notifications
PCI: Don't try to disable Bus Master on disconnected PCI devices
Revert "PCI/ACPI: Request _OSC control before scanning PCI root bus"
radeon: Attempt to use platform-provided ROM image
nouveau: Attempt to use platform-provided ROM image
EISA/PCI: Init EISA early, before PNP
EISA/PCI: Fix bus res reference
PCI: Add PCI ROM helper for platform-provided ROM images
1) Fix erroneous sock_orphan() leading to crashes and double
kfree_skb() in NFC protocol. From Thierry Escande and Samuel Ortiz.
2) Fix use after free in remain-on-channel mac80211 code, from Johannes
Berg.
3) nf_reset() needs to reset the NF tracing cookie, otherwise we can
leak it from one namespace into another. Fix from Gao Feng and
Patrick McHardy.
4) Fix overflow in channel scanning array of mwifiex driver, from Stone
Piao.
5) Fix loss of link after suspend/shutdown in r8169, from Hayes Wang.
6) Synchronization of unicast address lists to the undelying device
doesn't work because whether to sync is maintained as a boolean
rather than a true count. Fix from Vlad Yasevich.
7) Fix corruption of TSO packets in atl1e by limiting the segmented
packet length. From Hannes Frederic Sowa.
8) Revert bogus AF_UNIX credential passing change and fix the
coalescing issue properly, from Eric W Biederman.
9) Changes of ipv4 address lifetime settings needs to generate a
notification, from Jiri Pirko.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (22 commits)
netfilter: don't reset nf_trace in nf_reset()
net: ipv4: notify when address lifetime changes
ixgbe: fix registration order of driver and DCA nofitication
af_unix: If we don't care about credentials coallesce all messages
Revert "af_unix: dont send SCM_CREDENTIAL when dest socket is NULL"
bonding: remove sysfs before removing devices
atl1e: limit gso segment size to prevent generation of wrong ip length fields
net: count hw_addr syncs so that unsync works properly.
r8169: fix auto speed down issue
netfilter: ip6t_NPT: Fix translation for non-multiple of 32 prefix lengths
mwifiex: limit channel number not to overflow memory
NFC: microread: Fix build failure due to a new MEI bus API
iwlwifi: dvm: fix the passive-no-RX workaround
netfilter: nf_conntrack: fix error return code
NFC: llcp: Keep the connected socket parent pointer alive
mac80211: fix idle handling sequence
netfilter: nfnetlink_acct: return -EINVAL if object name is empty
netfilter: nfnetlink_queue: fix error return code in nfnetlink_queue_init()
netfilter: reset nf_trace in nf_reset
mac80211: fix remain-on-channel cancel crash
...
Jan Beulich [Wed, 3 Apr 2013 14:47:33 +0000 (15:47 +0100)]
x86: Fix rebuild with EFI_STUB enabled
eboot.o and efi_stub_$(BITS).o didn't get added to "targets", and hence
their .cmd files don't get included by the build machinery, leading to
the files always getting rebuilt.
Rather than adding the two files individually, take the opportunity and
add $(VMLINUX_OBJS) to "targets" instead, thus allowing the assignment
at the top of the file to be shrunk quite a bit.
At the same time, remove a pointless flags override line - the variable
assigned to was misspelled anyway, and the options added are
meaningless for assembly sources.
[ hpa: the patch is not minimal, but I am taking it for -urgent anyway
since the excess impact of the patch seems to be small enough. ]
Patrick McHardy [Fri, 5 Apr 2013 18:42:05 +0000 (20:42 +0200)]
netfilter: don't reset nf_trace in nf_reset()
Commit 130549fe ("netfilter: reset nf_trace in nf_reset") added code
to reset nf_trace in nf_reset(). This is wrong and unnecessary.
nf_reset() is used in the following cases:
- when passing packets up the the socket layer, at which point we want to
release all netfilter references that might keep modules pinned while
the packet is queued. nf_trace doesn't matter anymore at this point.
- when encapsulating or decapsulating IPsec packets. We want to continue
tracing these packets after IPsec processing.
- when passing packets through virtual network devices. Only devices on
that encapsulate in IPv4/v6 matter since otherwise nf_trace is not
used anymore. Its not entirely clear whether those packets should
be traced after that, however we've always done that.
- when passing packets through virtual network devices that make the
packet cross network namespace boundaries. This is the only cases
where we clearly want to reset nf_trace and is also what the
original patch intended to fix.
Add a new function nf_reset_trace() and use it in dev_forward_skb() to
fix this properly.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
"Fixes for a number of small glitches in various corners of the MIPS
tree. No particular areas is standing out.
With this applied all MIPS defconfigs are building fine. No merge
conflicts are expected."
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Delete definition of SA_RESTORER.
MIPS: Fix ISA level which causes secondary cache init bypassing and more
MIPS: Fix build error cavium-octeon without CONFIG_SMP
MIPS: Kconfig: Rename SNIPROM too
MIPS: Alchemy: Fix typo "CONFIG_DEBUG_PCI"
MIPS: Unbreak function tracer for 64-bit kernel.
Pull GFS2 fixes from Steven Whitehouse:
"There are two patches which fix up a couple of minor issues in the DLM
interface code, a missing error path in gfs2_rs_alloc(), one patch
which fixes a problem during "withdraw" and a fix for discards/FITRIM
when using 4k sector sized devices."
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes:
GFS2: Issue discards in 512b sectors
GFS2: Fix unlock of fcntl locks during withdrawn state
GFS2: return error if malloc failed in gfs2_rs_alloc()
GFS2: use memchr_inv
GFS2: use kmalloc for lvb bitmap
Merge tag 'spi-fix-v3.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/misc
Pull spi fixes from Mark Brown:
"A bunch of small driver fixes plus a fix for error handling in the
core - nothing too exciting overall."
* tag 'spi-fix-v3.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/misc:
spi/mpc512x-psc: optionally keep PSC SS asserted across xfer segmensts
spi: Unlock a spinlock before calling into the controller driver.
spi/s3c64xx: modified error interrupt handling and init
spi/bcm63xx: don't disable non enabled clocks in probe error path
spi/bcm63xx: Remove unused variable
spi: slink-tegra20: move runtime pm calls to transfer_one_message
Bob Peterson [Fri, 22 Mar 2013 14:07:24 +0000 (10:07 -0400)]
GFS2: Issue discards in 512b sectors
This patch changes GFS2's discard issuing code so that it calls
function sb_issue_discard rather than blkdev_issue_discard. The
code was calling blkdev_issue_discard and specifying the correct
sector offset and sector size, but blkdev_issue_discard expects
these values to be in terms of 512 byte sectors, even if the native
sector size for the device is different. Calling sb_issue_discard
with the BLOCK size instead ensures the correct block-to-512b-sector
translation. I verified that "minlen" is specified in blocks, so
comparing it to a number of blocks is correct.
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This patch introduced a few races which cannot be easily fixed with a
small follow-up patch. Furthermore, the SoC with the broken hardware
register, which this patch intended to add support for, can only be used
with device trees, which this driver currently does not support.
[ Here is the discussion that led to this "revert" patch:
https://lkml.org/lkml/2013/4/3/176 ]
Cc: stable <stable@vger.kernel.org> Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'fbdev-fixes-3.9-rc6' of git://gitorious.org/linux-omap-dss2/linux
Pull fbdev fixes from Tomi Valkeinen:
"Fix uvesafb crash bug and typoed flag name in fbmon's new videomode
code"
* tag 'fbdev-fixes-3.9-rc6' of git://gitorious.org/linux-omap-dss2/linux:
video:uvesafb: Fix dereference NULL pointer code path
fbmon: use VESA_DMT_VSYNC_HIGH to fix typo
Merge tag 'sound-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"This contains slightly more volumes than usual at this stage, mostly
because of my vacation in the last week. Nothing to scare, all small
and/or trivial fixes:
- Fix loop path handling in ASoC DAPM
- Some memory handling fixes in ASoC core
- Fix spear_pcm to adapt to the updated API
- HD-audio HDMI ELD handling fixes
- Fix for CM6331 USB-audio SRC change bugs
- Revert power_save_controller option change due to user-space usage
- A few other small ASoC and HD-audio fixes"
* tag 'sound-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/generic - fix uninitialized variable
Revert "ALSA: hda - Allow power_save_controller option override DCAPS"
ALSA: hda - fix typo in proc output
ALSA: hda - Enabling Realtek ALC 671 codec
ALSA: usb: Work around CM6631 sample rate change bug
ALSA: hda - bug fix on HDMI ELD debug message
ALSA: hda - bug fix on return value when getting HDMI ELD info
ASoC: dma-sh7760: Fix compile error
ASoC: core: fix invalid free of devm_ allocated data
ASoC: spear_pcm: Update to new pcm_new() API
ASoC:: max98090: Remove executable bit
ASoC: dapm: Fix pointer dereference in is_connected_output_ep()
ASoC: pcm030 audio fabric: remove __init from probe
ASoC: imx-ssi: Fix occasional AC97 reset failure
ASoC: core: fix possible memory leak in snd_soc_bytes_put()
ASoC: wm_adsp: fix possible memory leak in wm_adsp_load_coeff()
ASoC: dapm: Fix handling of loops
ASoC: si476x: Add missing break for SNDRV_PCM_FORMAT_S8 switch case
Mike Snitzer [Fri, 5 Apr 2013 14:36:34 +0000 (15:36 +0100)]
dm cache: reduce bio front_pad size in writeback mode
A recent patch to fix the dm cache target's writethrough mode extended
the bio's front_pad to include a 1056-byte struct dm_bio_details.
Writeback mode doesn't need this, so this patch reduces the
per_bio_data_size to 16 bytes in this case instead of 1096.
The dm_bio_details structure was added in "dm cache: fix writes to
cache device in writethrough mode" which fixed commit e2e74d617e ("dm
cache: fix race in writethrough implementation"). In writeback mode
we avoid allocating the writethrough-specific members of the
per_bio_data structure (the dm_bio_details structure included).
Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Darrick J. Wong [Fri, 5 Apr 2013 14:36:32 +0000 (15:36 +0100)]
dm cache: fix writes to cache device in writethrough mode
The dm-cache writethrough strategy introduced by commit e2e74d617eadc15
("dm cache: fix race in writethrough implementation") issues a bio to
the origin device, remaps and then issues the bio to the cache device.
This more conservative in-series approach was selected to favor
correctness over performance (of the previous parallel writethrough).
However, this in-series implementation that reuses the same bio to write
both the origin and cache device didn't take into account that the block
layer's req_bio_endio() modifies a completing bio's bi_sector and
bi_size. So the new writethrough strategy needs to preserve these bio
fields, and restore them before submission to the cache device,
otherwise nothing gets written to the cache (because bi_size is 0).
This patch adds a struct dm_bio_details field to struct per_bio_data,
and uses dm_bio_record() and dm_bio_restore() to ensure the bio is
restored before reissuing to the cache device. Adding such a large
structure to the per_bio_data is not ideal but we can improve this
later, for now correctness is the important thing.
This problem initially went unnoticed because the dm-cache test-suite
uses a linear DM device for the dm-cache device's origin device.
Writethrough worked as expected because DM submits a *clone* of the
original bio, so the original bio which was reused for the cache was
never touched.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Ralf Baechle [Mon, 25 Mar 2013 12:43:14 +0000 (13:43 +0100)]
MIPS: Delete definition of SA_RESTORER.
SA_RESTORER used to be defined as 0x04000000 but only the O32 ABI ever
supported its use and no libc was using it, so the entire sa-restorer
functionality was removed with lmo commit 39bffc12c3580ab [Zap sa_restorer.]
for 2.5.48 retaining only the SA_RESTORER definition as a reminder to avoid
accidental reuse of the mask bit.
Upstream cdef9602fbf1871a43f0f1b5cea10dd0f275167d [signal: always clear
sa_restorer on execve] adds code that assumes sa_sigaction has an
sa_restorer field, if SA_RESTORER is defined which would break MIPS.
So remove the SA_RESTORER definition before the v3.8.4 merge.
MIPS: Fix ISA level which causes secondary cache init bypassing and more
The commit a96102be70 introduced set_isa() where compatible ISA info is
also set aside from the one gets passed in. It means, for example, 1004K
will have MIPS_CPU_ISA_M32R2/M32R1/II/I flags. This leads to things like
the following inappropriate:
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com> Cc: Steven J. Hill <Steven.Hill@imgtec.com> Cc: linux-mips@linux-mips.org Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Bolle [Thu, 4 Apr 2013 12:47:01 +0000 (12:47 +0000)]
MIPS: Alchemy: Fix typo "CONFIG_DEBUG_PCI"
Commit 7517de348663b08a808aff44b5300e817157a568 ("MIPS: Alchemy: Redo
PCI as platform driver") added a reference to CONFIG_DEBUG_PCI. Change
it to CONFIG_PCI_DEBUG, as that is a valid Kconfig macro.
Also add a newline to a debugging printk that this fix enables.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
David Daney [Tue, 2 Apr 2013 22:59:29 +0000 (22:59 +0000)]
MIPS: Unbreak function tracer for 64-bit kernel.
Commit 58b69401c797 [MIPS: Function tracer: Fix broken function tracing]
completely broke the function tracer for 64-bit kernels. The symptom is
a system hang very early in the boot process.
The fix: Remove/fix $sp adjustments for 64-bit case.
Signed-off-by: David Daney <david.daney@cavium.com> Cc: linux-mips@linux-mips.org Cc: Al Cooper <alcooperx@gmail.com> Cc: viric@viric.name Cc: stable@vger.kernel.org # 3.8.x Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Jakub Kicinski [Wed, 3 Apr 2013 16:50:54 +0000 (16:50 +0000)]
ixgbe: fix registration order of driver and DCA nofitication
ixgbe_notify_dca cannot be called before driver registration
because it expects driver's klist_devices to be allocated and
initialized. While on it make sure debugfs files are removed
when registration fails.
Cc: stable <stable@vger.kernel.org> Signed-off-by: Jakub Kicinski <jakub.kicinski@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
af_unix: If we don't care about credentials coallesce all messages
It was reported that the following LSB test case failed
https://lsbbugs.linuxfoundation.org/attachment.cgi?id=2144 because we
were not coallescing unix stream messages when the application was
expecting us to.
The problem was that the first send was before the socket was accepted
and thus sock->sk_socket was NULL in maybe_add_creds, and the second
send after the socket was accepted had a non-NULL value for sk->socket
and thus we could tell the credentials were not needed so we did not
bother.
The unnecessary credentials on the first message cause
unix_stream_recvmsg to start verifying that all messages had the same
credentials before coallescing and then the coallescing failed because
the second message had no credentials.
Ignoring credentials when we don't care in unix_stream_recvmsg fixes a
long standing pessimization which would fail to coallesce messages when
reading from a unix stream socket if the senders were different even if
we did not care about their credentials.
I have tested this and verified that the in the LSB test case mentioned
above that the messages do coallesce now, while the were failing to
coallesce without this change.
Reported-by: Karel Srot <ksrot@redhat.com> Reported-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The problem that the above patch was meant to address is that af_unix
messages are not being coallesced because we are sending unnecesarry
credentials. Not sending credentials in maybe_add_creds totally
breaks unconnected unix domain sockets that wish to send credentails
to other sockets.
In practice this break some versions of udev because they receive a
message and the sending uid is bogus so they drop the message.
Reported-by: Sven Joachim <svenjoac@gmx.de> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
We have a race condition if we try to rmmod bonding and simultaneously add
a bond master through sysfs. In bonding_exit() we first remove the devices
(through rtnl_link_unregister() ) and only after that we remove the sysfs.
If we manage to add a device through sysfs after that the devices were
removed - we'll end up with that device/sysfs structure and with the module
unloaded.
Fix this by first removing the sysfs and only after that calling
rtnl_link_unregister().
Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
atl1e: limit gso segment size to prevent generation of wrong ip length fields
The limit of 0x3c00 is taken from the windows driver.
Suggested-by: Huang, Xiong <xiong@qca.qualcomm.com> Cc: Huang, Xiong <xiong@qca.qualcomm.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
net: count hw_addr syncs so that unsync works properly.
A few drivers use dev_uc_sync/unsync to synchronize the
address lists from master down to slave/lower devices. In
some cases (bond/team) a single address list is synched down
to multiple devices. At the time of unsync, we have a leak
in these lower devices, because "synced" is treated as a
boolean and the address will not be unsynced for anything after
the first device/call.
Treat "synced" as a count (same as refcount) and allow all
unsync calls to work.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'pm+acpi-3.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
- Revert of a recent cpuidle change that caused Nehalem machines to
hang on boot from Alex Shi.
- USB power management fix addressing a crash in the port device
object's release routine from Rafael J Wysocki.
- Device PM QoS fix for a potential deadlock related to sysfs interface
from Rafael J Wysocki.
- Fix for a cpufreq crash when the /cpus Device Tree node is missing
from Paolo Pisati.
- Fix for a build issue on ia64 related to the Boot Graphics Resource
Table (BGRT) from Tony Luck.
- Two fixes for ACPI handles being set incorrectly for device objects
that don't correspond to any ACPI namespace nodes in the I2C and SPI
subsystems from Rafael J Wysocki.
- Fix for compiler warnings related to CONFIG_PM_DEVFREQ being unset
from Rajagopal Venkat.
- Fix for a symbol definition typo in cpufreq_governor.h from Borislav
Petkov.
* tag 'pm+acpi-3.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / BGRT: Don't let users configure BGRT on non X86 systems
cpuidle / ACPI: recover percpu ACPI processor cstate
ACPI / I2C: Use parent's ACPI_HANDLE() in acpi_i2c_register_devices()
cpufreq: Correct header guards typo
ACPI / SPI: Use parent's ACPI_HANDLE() in acpi_register_spi_devices()
cpufreq: check OF node /cpus presence before dereferencing it
PM / devfreq: Fix compiler warnings for CONFIG_PM_DEVFREQ unset
PM / QoS: Avoid possible deadlock related to sysfs access
USB / PM: Don't try to hide PM QoS flags from usb_port_device_release()
hayeswang [Sun, 31 Mar 2013 17:02:04 +0000 (17:02 +0000)]
r8169: fix auto speed down issue
It would cause no link after suspending or shutdowning when the
nic changes the speed to 10M and connects to a link partner which
forces the speed to 100M.
Check the link partner ability to determine which speed to set.
Signed-off-by: Hayes Wang <hayeswang@realtek.com> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 4 Apr 2013 21:39:06 +0000 (17:39 -0400)]
Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into wireless
John W. Linville says:
====================
Here are some more fixes intended for the 3.9 stream...
Regarding the mac80211 bits, Johannes says:
"I had changed the idle handling to simplify it, but broken the
sequencing of commands, at least for ath9k-htc, one patch restores the
sequence. The other patch fixes a crash Jouni found while stress-testing
the remain-on-channel code, when an item is deleted the work struct can
run twice and crash the second time."
As for the iwlwifi bits, Johannes says:
"The only fix here is to the passive-no-RX firmware regulatory
enforcement driver support code to not drop auth frames in quick
succession, leading to not being able to connect to APs on passive
channels in certain circumstances."
Don't forget the NFC bits, about which Samuel says:
"This time we have:
- A crash fix for when a DGRAM LLCP socket is listening while the NFC adapter
is physically removed.
- A potential double skb free when the LLCP socket receive queue is full.
- A fix for properly handling multiple and consecutive LLCP connections, and
not trash the socket ack log.
- A build failure for the MEI microread physical layer, now that the MEI bus
APIs have been merged into char-misc-next."
On top of that, Stone Piao provides an mwifiex fix to avoid accessing
beyond the end of a buffer.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Stancek [Thu, 4 Apr 2013 18:35:10 +0000 (11:35 -0700)]
mm: prevent mmap_cache race in find_vma()
find_vma() can be called by multiple threads with read lock
held on mm->mmap_sem and any of them can update mm->mmap_cache.
Prevent compiler from re-fetching mm->mmap_cache, because other
readers could update it in the meantime:
thread 1 thread 2
|
find_vma() | find_vma()
struct vm_area_struct *vma = NULL; |
vma = mm->mmap_cache; |
if (!(vma && vma->vm_end > addr |
&& vma->vm_start <= addr)) { |
| mm->mmap_cache = vma;
return vma; |
^^ compiler may optimize this |
local variable out and re-read |
mm->mmap_cache |
This issue can be reproduced with gcc-4.8.0-1 on s390x by running
mallocstress testcase from LTP, which triggers:
Merge tag 'upstream-3.9-rc6' of git://git.infradead.org/linux-ubifs
Pull UBIFS fix from Artem Bityutskiy:
"Make the space fixup feature work in the case when the file-system is
first mounted R/O and then remounted R/W."
* tag 'upstream-3.9-rc6' of git://git.infradead.org/linux-ubifs:
UBIFS: make space fixup work in the remount case
* pm-fixes:
cpufreq: Correct header guards typo
cpufreq: check OF node /cpus presence before dereferencing it
PM / devfreq: Fix compiler warnings for CONFIG_PM_DEVFREQ unset
PM / QoS: Avoid possible deadlock related to sysfs access
USB / PM: Don't try to hide PM QoS flags from usb_port_device_release()
* acpi-fixes:
ACPI / BGRT: Don't let users configure BGRT on non X86 systems
cpuidle / ACPI: recover percpu ACPI processor cstate
ACPI / I2C: Use parent's ACPI_HANDLE() in acpi_i2c_register_devices()
ACPI / SPI: Use parent's ACPI_HANDLE() in acpi_register_spi_devices()
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID fixes from Jiri Kosina:
- Workaround for device ID conflict between Masterkit MA901 usb radio
device and Atmel V-USB devices, to avoid regressions from older
kernels, by Alexey Klimov
- fix for possible race during input device registration in magicmouse
driver, by Benjamin Tissoires
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: magicmouse: fix race between input_register() and probe()
media: radio-ma901: return ENODEV in probe if usb_device doesn't match
HID: fix Masterkit MA901 hid quirks
Merge tag 'gpio-fixes-v3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"Two GPIO fixes for the v3.9 series:
- Fix erroneous return value in the ICH driver
- Make the STMPE driver proper properly on device tree boots"
* tag 'gpio-fixes-v3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: stmpe: pass DT node to irqdomain
gpio-ich: Fix value returned by ichx_gpio_request
The commit [6ab317419c: ALSA: hda - Allow power_save_controller option
override DCAPS] changed the behavior of power_save_controller so that
it can override the driver capability. This assumed that this option
is rarely changed dynamically unlike power_save option. Too naive.
It turned out that the user-space power-management tool tries to set
power_save_controller option to 1 together with power_save option
without knowing what's actually doing. This enabled forcibly the
runtime PM of the controller, which is known to be broken om many
chips thus disabled as default.
So, the only sane fix is to revert this commit again. It was intended
to ease debugging/testing for runtime PM enablement, but obviously we
need another way for it.
GFS2: Fix unlock of fcntl locks during withdrawn state
When withdraw occurs, we need to continue to allow unlocks of fcntl
locks to occur, however these will only be local, since the node has
withdrawn from the cluster. This prevents triggering a VFS level
bug trap due to locks remaining when a file is closed.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>