Akinobu Mita [Wed, 4 Jun 2014 23:06:48 +0000 (16:06 -0700)]
x86: make dma_alloc_coherent() return zeroed memory if CMA is enabled
This patchset enhances the DMA Contiguous Memory Allocator on x86.
Currently the DMA CMA is only supported with pci-nommu dma_map_ops and
furthermore it can't be enabled on x86_64. But I would like to allocate
big contiguous memory with dma_alloc_coherent() and tell it to the device
that requires it, regardless of which dma mapping implementation is
actually used in the system.
So this makes it work with swiotlb and intel-iommu dma_map_ops, too. And
this also extends "cma=" kernel parameter to specify placement constraint
by the physical address range of memory allocations. For example, CMA
allocates memory below 4GB by "cma=64M@0-4G", it is required for the
devices only supporting 32-bit addressing on 64-bit systems without iommu.
This patch (of 5):
Calling dma_alloc_coherent() with __GFP_ZERO must return zeroed memory.
But when the contiguous memory allocator (CMA) is enabled on x86 and the
memory region is allocated by dma_alloc_from_contiguous(), it doesn't
return zeroed memory. Because dma_generic_alloc_coherent() forgot to fill
the memory region with zero if it was allocated by
dma_alloc_from_contiguous()
Most implementations of dma_alloc_coherent() return zeroed memory
regardless of whether __GFP_ZERO is specified. So this fixes it by
unconditionally zeroing the allocated memory region.
Alternatively, we could fix dma_alloc_from_contiguous() to return zeroed
out memory and remove memset() from all caller of it. But we can't simply
remove the memset on arm because __dma_clear_buffer() is used there for
ensuring cache flushing and it is used in many places. Of course we can
do redundant memset in dma_alloc_from_contiguous(), but I think this patch
is less impact for fixing this problem.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Don Dutile <ddutile@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For single threaded workloads, we can avoid flushing and iterating through
the entire list of tasks, making the whole function a lot faster,
requiring only a single atomic read for the mm_users.
mm: only force scan in reclaim when none of the LRUs are big enough.
Prior to this change, we would decide whether to force scan a LRU during
reclaim if that LRU itself was too small for the current priority.
However, this can lead to the file LRU getting force scanned even if
there are a lot of anonymous pages we can reclaim, leading to hot file
pages getting needlessly reclaimed.
To address this, we instead only force scan when none of the reclaimable
LRUs are big enough.
Gives huge improvements with zswap. For example, when doing -j20 kernel
build in a 500MB container with zswap enabled, runtime (in seconds) is
greatly reduced:
x without this change
+ with this change
N Min Max Median Avg Stddev
x 5 700.997 790.076 763.928 754.05 39.59493
+ 5 141.634 197.899 155.706 161.9 21.270224
Difference at 95.0% confidence
-592.15 +/- 46.3521
-78.5293% +/- 6.14709%
(Student's t, pooled s = 31.7819)
Should also give some improvements in regular (non-zswap) swap cases.
Yes, hughd found significant speedup using regular swap, with several
memcgs under pressure; and it should also be effective in the non-memcg
case, whenever one or another zone LRU is forced too small.
Signed-off-by: Suleiman Souhlal <suleiman@google.com> Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Mel Gorman <mgorman@suse.de> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Rafael Aquini <aquini@redhat.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Yuanhan Liu <yuanhan.liu@linux.intel.com> Cc: Seth Jennings <sjennings@variantweb.net> Cc: Bob Liu <bob.liu@oracle.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Luigi Semenzato <semenzato@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cyrill Gorcunov [Wed, 4 Jun 2014 23:06:43 +0000 (16:06 -0700)]
mm: softdirty: clear VM_SOFTDIRTY flag inside clear_refs_write() instead of clear_soft_dirty()
clear_refs_write() is called earlier than clear_soft_dirty() and it is
more natural to clear VM_SOFTDIRTY (which belongs to VMA entry but not
PTEs) that early instead of clearing it a way deeper inside call chain.
Cyrill Gorcunov [Wed, 4 Jun 2014 23:06:42 +0000 (16:06 -0700)]
mm: softdirty: don't forget to save file map softdiry bit on unmap
pte_file_mksoft_dirty operates with argument passed by a value and
returns modified result thus we need to assign @ptfile here, otherwise
itis a no-op which may lead to loss of the softdirty bit.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cyrill Gorcunov [Wed, 4 Jun 2014 23:06:41 +0000 (16:06 -0700)]
mm: softdirty: make freshly remapped file pages being softdirty unconditionally
Hugh reported:
| I noticed your soft_dirty work in install_file_pte(): which looked
| good at first, until I realized that it's propagating the soft_dirty
| of a pte it's about to zap completely, to the unrelated entry it's
| about to insert in its place. Which seems very odd to me.
Indeed this code ends up being nop in result -- pte_file_mksoft_dirty()
operates with pte_t argument and returns new pte_t which were never used
after. After looking more I think what we need is to soft-dirtify all
newely remapped file pages because it should look like a new mapping for
memory tracker.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Reported-by: Hugh Dickins <hughd@google.com> Cc: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently to allocate a page that should be charged to kmemcg (e.g.
threadinfo), we pass __GFP_KMEMCG flag to the page allocator. The page
allocated is then to be freed by free_memcg_kmem_pages. Apart from
looking asymmetrical, this also requires intrusion to the general
allocation path. So let's introduce separate functions that will
alloc/free pages charged to kmemcg.
The new functions are called alloc_kmem_pages and free_kmem_pages. They
should be used when the caller actually would like to use kmalloc, but
has to fall back to the page allocator for the allocation is large.
They only differ from alloc_pages and free_pages in that besides
allocating or freeing pages they also charge them to the kmem resource
counter of the current memory cgroup.
[sfr@canb.auug.org.au: export kmalloc_order() to modules] Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Acked-by: Greg Thelen <gthelen@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We have only a few places where we actually want to charge kmem so
instead of intruding into the general page allocation path with
__GFP_KMEMCG it's better to explictly charge kmem there. All kmem
charges will be easier to follow that way.
This is a step towards removing __GFP_KMEMCG. It removes __GFP_KMEMCG
from memcg caches' allocflags. Instead it makes slab allocation path
call memcg_charge_kmem directly getting memcg to charge from the cache's
memcg params.
This also eliminates any possibility of misaccounting an allocation
going from one memcg's cache to another memcg, because now we always
charge slabs against the memcg the cache belongs to. That's why this
patch removes the big comment to memcg_kmem_get_cache.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Acked-by: Greg Thelen <gthelen@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Hansen [Wed, 4 Jun 2014 23:06:37 +0000 (16:06 -0700)]
mm: slub: fix ALLOC_SLOWPATH stat
There used to be only one path out of __slab_alloc(), and ALLOC_SLOWPATH
got bumped in that exit path. Now there are two, and a bunch of gotos.
ALLOC_SLOWPATH can now get set more than once during a single call to
__slab_alloc() which is pretty bogus. Here's the sequence:
1. Enter __slab_alloc(), fall through all the way to the
stat(s, ALLOC_SLOWPATH);
2. hit 'if (!freelist)', and bump DEACTIVATE_BYPASS, jump to
new_slab (goto #1)
3. Hit 'if (c->partial)', bump CPU_PARTIAL_ALLOC, goto redo
(goto #2)
4. Fall through in the same path we did before all the way to
stat(s, ALLOC_SLOWPATH)
5. bump ALLOC_REFILL stat, then return
Doing this is obviously bogus. It keeps us from being able to
accurately compare ALLOC_SLOWPATH vs. ALLOC_FASTPATH. It also means
that the total number of allocs always exceeds the total number of
frees.
This patch moves stat(s, ALLOC_SLOWPATH) to be called from the same
place that __slab_alloc() is. This makes it much less likely that
ALLOC_SLOWPATH will get botched again in the spaghetti-code inside
__slab_alloc().
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Wed, 4 Jun 2014 23:06:36 +0000 (16:06 -0700)]
mm, slab: suppress out of memory warning unless debug is enabled
When the slab or slub allocators cannot allocate additional slab pages,
they emit diagnostic information to the kernel log such as current
number of slabs, number of objects, active objects, etc. This is always
coupled with a page allocation failure warning since it is controlled by
!__GFP_NOWARN.
Suppress this out of memory warning if the allocator is configured
without debug supported. The page allocation failure warning will
indicate it is a failed slab allocation, the order, and the gfp mask, so
this is only useful to diagnose allocator issues.
Since CONFIG_SLUB_DEBUG is already enabled by default for the slub
allocator, there is no functional change with this patch. If debug is
disabled, however, the warnings are now suppressed.
Signed-off-by: David Rientjes <rientjes@google.com> Cc: Pekka Enberg <penberg@kernel.org> Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Inspired by Joe Perches suggestion in ntfs logging clean-up.
Signed-off-by: Fabian Frederick <fabf@skynet.be> Acked-by: Christoph Lameter <cl@linux.com> Cc: Joe Perches <joe@perches.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Fabian Frederick <fabf@skynet.be> Acked-by: Christoph Lameter <cl@linux.com> Cc: Joe Perches <joe@perches.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yinghai Lu [Wed, 4 Jun 2014 23:06:32 +0000 (16:06 -0700)]
x86, mm: probe memory block size for generic x86 64bit
On system with 2TiB ram, current x86_64 have 128M as section size, and
one memory_block only include one section. So will have 16400 entries
under /sys/devices/system/memory/.
Current code try to use block id to find block pointer in /sys for any
section, and reuse that block pointer. that finding will take some time
even after commit 7c243c7168dc ("mm: speedup in __early_pfn_to_nid")
that will skip the search in that case during booting up.
So solution could be increase block size just like SGI UV system did.
(harded code to 2g).
This patch is trying to probe the block size to make it match mmio remap
size. for example, Intel Nehalem later system will have memory range [0,
TOML), [4g, TOMH]. If the memory hole is 2g and total is 128g, TOM will
be 2g, and TOM2 will be 130g.
We could use 2g as block size instead of default 128M. That will reduce
number of entries in /sys/devices/system/memory/
On system 6TiB system will reduce boot time by 35 seconds.
Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Wed, 4 Jun 2014 23:06:30 +0000 (16:06 -0700)]
x86: define _PAGE_NUMA by reusing software bits on the PMD and PTE levels
_PAGE_NUMA is currently an alias of _PROT_PROTNONE to trap NUMA hinting
faults on x86. Care is taken such that _PAGE_NUMA is used only in
situations where the VMA flags distinguish between NUMA hinting faults
and prot_none faults. This decision was x86-specific and conceptually
it is difficult requiring special casing to distinguish between PROTNONE
and NUMA ptes based on context.
Fundamentally, we only need the _PAGE_NUMA bit to tell the difference
between an entry that is really unmapped and a page that is protected
for NUMA hinting faults as if the PTE is not present then a fault will
be trapped.
Swap PTEs on x86-64 use the bits after _PAGE_GLOBAL for the offset.
This patch shrinks the maximum possible swap size and uses the bit to
uniquely distinguish between NUMA hinting ptes and swap ptes.
Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Anvin <hpa@zytor.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Noonan <steven@uplinklabs.net> Cc: Rik van Riel <riel@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Wed, 4 Jun 2014 23:06:29 +0000 (16:06 -0700)]
x86: require x86-64 for automatic NUMA balancing
32-bit support for NUMA is an oddity on its own but with automatic NUMA
balancing on top there is a reasonable risk that the CPUPID information
cannot be stored in the page flags. This patch removes support for
automatic NUMA support on 32-bit x86.
Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Anvin <hpa@zytor.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Noonan <steven@uplinklabs.net> Cc: Rik van Riel <riel@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Description by Jan Kara:
"A lot of older filesystems don't properly flush volatile disk caches
on fsync(2) which can lead to loss of fsynced data after power failure.
This patch makes generic_file_fsync() issue proper cache flush to fix the
problem. Sysadmin can use /sys/devices/.../cache_type to tell the system
it should not send the cache flush."
[akpm@linux-foundation.org: nuke ifdef]
[akpm@linux-foundation.org: fix warning] Signed-off-by: Fabian Frederick <fabf@skynet.be> Suggested-by: Jan Kara <jack@suse.cz> Suggested-by: Christoph Hellwig <hch@infradead.org> Cc: Jan Kara <jack@suse.cz> Cc: Christoph Hellwig <hch@infradead.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
v9fs_sysfs_init is only called by __init init_v9fs
Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@sandia.gov> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joseph Qi [Wed, 4 Jun 2014 23:06:17 +0000 (16:06 -0700)]
ocfs2: fix incorrect i_size of global bitmap inode after resize
Ocfs2 cluster size may be 1MB, which has 20 bits. When resize, the
input new clusters is mostly the number of clusters in a group
descriptor(32256).
Since the input clusters is defined as type int, so it will overflow
when shift left 20 bits and then lead to incorrect global bitmap i_size.
Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Xue jiufei [Wed, 4 Jun 2014 23:06:14 +0000 (16:06 -0700)]
ocfs2/dlm: disallow node joining when recovery is on going
We found a race situation when dlm recovery and node joining occurs
simultaneously if the network state is bad.
N1 N4
start joining dlm and send
query join to all live nodes
set joining node to N1, return OK
send query join to other
live nodes and it may take
a while
call dlm_send_join_assert()
to send assert join message
when N2 is down, so keep
trying to send message to N2
until find N2 is down
send assert join message to
N3, but connection is down
with N3, so it may take a
while
become the recovery master for N2
and send begin reco message to other
nodes in domain map but no N1
connection with N3 is rebuild,
then send assert join to N4
call dlm_assert_joined_handler(),
add N1 to domain_map
dlm recovery done, send finalize message
to nodes in domain map, including N1
receiving finalize message,
trigger the BUG() because
recovery master mismatch.
Signed-off-by: joyce.xue <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Xue jiufei [Wed, 4 Jun 2014 23:06:13 +0000 (16:06 -0700)]
ocfs2: fix umount hang while shutting down truncate log
Revert commit 75f82eaa502c ("ocfs2: fix NULL pointer dereference when
dismount and ocfs2rec simultaneously") because it may cause a umount
hang while shutting down the truncate log.
fix NULL pointer dereference when dismount and ocfs2rec simultaneously
The situation is as followes:
ocfs2_dismout_volume
-> ocfs2_recovery_exit
-> free osb->recovery_map
-> ocfs2_truncate_shutdown
-> lock global bitmap inode
-> ocfs2_wait_for_recovery
-> check whether osb->recovery_map->rm_used is zero
Because osb->recovery_map is already freed, rm_used can be any other
values, so it may yield umount hang.
To prevent NULL pointer dereference while getting sys_root_inode, we use
a osb_tl_disable flag to disable schedule osb_truncate_log_wq after
truncate log shutdown.
Signed-off-by: joyce.xue <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ocfs_info_foo() and ocfs2_get_request_ptr functions are only used in ioctl.c
Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Xue jiufei [Wed, 4 Jun 2014 23:06:10 +0000 (16:06 -0700)]
ocfs2/dlm: fix possible convert=sion deadlock
We found there is a conversion deadlock when the owner of lockres
happened to crash before send DLM_PROXY_AST_MSG for a downconverting
lock. The situation is as follows:
Node1 Node2 Node3
the owner of lockresA
lock_1 granted at EX mode
and call ocfs2_cluster_unlock
to decrease ex_holders.
converting lock_3 from
NL to EX
send DLM_PROXY_AST_MSG
to Node1, asking Node 1
to downconvert.
receiving DLM_PROXY_AST_MSG,
thread ocfs2dc send
DLM_CONVERT_LOCK_MSG
to Node2 to downconvert
lock_1(EX->NL).
lock_1 can be granted and
put it into pending_asts
list, return DLM_NORMAL.
then something happened
and Node2 crashed.
received DLM_NORMAL, waiting
for DLM_PROXY_AST_MSG.
selected as the recovery
master, receving migrate
lock from Node1, queue
lock_1 to the tail of
converting list.
After dlm recovery, converting list in the master of lockresA(Node3)
will be: converting list head <-> lock_3(NL->EX) <->lock_1(EX<->NL).
Requested mode of lock_3 is not compatible with the granted mode of
lock_1, so it can not be granted. and lock_1 can not downconvert
because covnerting queue is strictly FIFO. So a deadlock is created.
We think function dlm_process_recovery_data() should queue_ast for
lock_1 or alter the order of lock_1 and lock_3, so dlm_thread can
process lock_1 first. And if there are multiple downconverting locks,
they must convert form PR to NL, so no need to sort them.
Signed-off-by: joyce.xue <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joseph Qi [Wed, 4 Jun 2014 23:06:09 +0000 (16:06 -0700)]
ocfs2: limit printk when journal is aborted
Once JBD2_ABORT is set, ocfs2_commit_cache will fail in
ocfs2_commit_thread. Then it will get into a loop with mass logs. This
will meaninglessly consume a larger number of resource and may lead to
the system hanging. So limit printk in this case.
[akpm@linux-foundation.org: document the msleep] Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
George Spelvin [Wed, 4 Jun 2014 23:06:08 +0000 (16:06 -0700)]
ocfs2: remove some redundant casting
There are two standard techniques for dereferencing structures pointed
to by void *: cast to the right type each time they're used, or assign
to local variables of the right type.
But there's no need to do *both*.
Signed-off-by: George Spelvin <linux@horizon.com> Cc: Mark Fasheh <mfasheh@suse.com> Acked-by: Joel Becker <jlbec@evilplan.org> Reviewed-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fs/ocfs2/super.c: use OCFS2_MAX_VOL_LABEL_LEN and strlcpy
Replace strncpy(size 63) by defined value.
Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Static values are automatically initialized to NULL.
Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
__get_cpu_var() is used for multiple purposes in the kernel source. One
of them is address calculation via the form &__get_cpu_var(x). This
calculates the address for the instance of the percpu variable of the
current processor based on an offset.
Other use cases are for storing and retrieving data from the current
processors percpu area. __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.
__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.
this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.
This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset. Thereby address calculations are avoided and less
registers are used when code is generated.
At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.
The patch set includes passes over all arches as well. Once these
operations are used throughout then specialized macros can be defined in
non -x86 arches as well in order to optimize per cpu access by f.e. using
a global register that may be set to the per cpu base.
Transformations done to __get_cpu_var()
1. Determine the address of the percpu instance of the current processor.
DEFINE_PER_CPU(int, y);
int *x = &__get_cpu_var(y);
Converts to
int *x = this_cpu_ptr(&y);
2. Same as #1 but this time an array structure is involved.
DEFINE_PER_CPU(int, y[20]);
int *x = __get_cpu_var(y);
Converts to
int *x = this_cpu_ptr(y);
3. Retrieve the content of the current processors instance of a per cpu
variable.
DEFINE_PER_CPU(int, y);
int x = __get_cpu_var(y)
Converts to
int x = __this_cpu_read(y);
4. Retrieve the content of a percpu struct
DEFINE_PER_CPU(struct mystruct, y);
struct mystruct x = __get_cpu_var(y);
Converts to
memcpy(&x, this_cpu_ptr(&y), sizeof(x));
5. Assignment to a per cpu variable
DEFINE_PER_CPU(int, y)
__get_cpu_var(y) = x;
Converts to
__this_cpu_write(y, x);
6. Increment/Decrement etc of a per cpu variable
DEFINE_PER_CPU(int, y);
__get_cpu_var(y)++
Converts to
__this_cpu_inc(y)
Signed-off-by: Christoph Lameter <cl@linux.com> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> [compilation only] Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fanotify: check file flags passed in fanotify_init
Without this patch fanotify_init does not validate the value passed in
event_f_flags.
When a fanotify event is read from the fanotify file descriptor a new
file descriptor is created where file.f_flags = event_f_flags.
Internal and external open flags are stored together in field f_flags of
struct file. Hence, an application might create file descriptors with
internal flags like FMODE_EXEC, FMODE_NOCMTIME set.
Jan Kara and Eric Paris both aggreed that this is a bug and the value of
event_f_flags should be checked:
https://lkml.org/lkml/2014/4/29/522
https://lkml.org/lkml/2014/4/29/539
This updated patch version considers the comments by Michael Kerrisk in
https://lkml.org/lkml/2014/5/4/10
With the patch the value of event_f_flags is checked.
When specifying an invalid value error EINVAL is returned.
Internal flags are disallowed.
File creation flags are disallowed:
O_CREAT, O_DIRECTORY, O_EXCL, O_NOCTTY, O_NOFOLLOW, O_TRUNC, and O_TTY_INIT.
Flags which do not make sense with fanotify are disallowed:
__O_TMPFILE, O_PATH, FASYNC, and O_DIRECT.
This leaves us with the following allowed values:
O_RDONLY, O_WRONLY, O_RDWR are basic functionality. The are stored in the
bits given by O_ACCMODE.
O_APPEND is working as expected. The value might be useful in a logging
application which appends the current status each time the log is opened.
O_LARGEFILE is needed for files exceeding 4GB on 32bit systems.
O_NONBLOCK may be useful when monitoring slow devices like tapes.
O_NDELAY is equal to O_NONBLOCK except for platform parisc.
To avoid code breaking on parisc either both flags should be
allowed or none. The patch allows both.
__O_SYNC and O_DSYNC may be used to avoid data loss on power disruption.
O_NOATIME may be useful to reduce disk activity.
O_CLOEXEC may be useful, if separate processes shall be used to scan files.
Once this patch is accepted, the fanotify_init.2 manpage has to be updated.
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fs/notify/fanotify/fanotify_user.c: fix FAN_MARK_FLUSH flag checking
If fanotify_mark is called with illegal value of arguments flags and
marks it usually returns EINVAL.
When fanotify_mark is called with FAN_MARK_FLUSH the argument flags is
not checked for irrelevant flags like FAN_MARK_IGNORED_MASK.
The patch removes this inconsistency.
If an irrelevant flag is set error EINVAL is returned.
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Acked-by: Michael Kerrisk <mtk.manpages@gmail.com> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Cohen [Wed, 4 Jun 2014 23:05:42 +0000 (16:05 -0700)]
fs/notify/mark.c: trivial cleanup
Do not initialize private_destroy_list twice. list_replace_init()
already takes care of initializing private_destroy_list. We don't need
to initialize it with LIST_HEAD() beforehand.
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Before the patch, read creates FAN_ACCESS_PERM and FAN_ACCESS events,
readdir creates only FAN_ACCESS_PERM events.
This is inconsistent.
After the patch, readdir creates FAN_ACCESS_PERM and FAN_ACCESS events.
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fanotify: FAN_MARK_FLUSH: avoid having to provide a fake/invalid fd and path
Originally from Tvrtko Ursulin (https://lkml.org/lkml/2011/1/12/112)
Avoid having to provide a fake/invalid fd and path when flushing marks
Currently for a group to flush marks it has set it needs to provide a
fake or invalid (but resolvable) file descriptor and path when calling
fanotify_mark. This patch pulls the flush handling a bit up so file
descriptor and path are completely ignored when flushing.
I reworked the patch to be applicable again (the signature of
fanotify_mark has changed since Tvrtko's work).
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tetsuo Handa [Wed, 4 Jun 2014 23:05:36 +0000 (16:05 -0700)]
kthread: fix return value of kthread_create() upon SIGKILL.
Commit 786235eeba0e ("kthread: make kthread_create() killable") meant
for allowing kthread_create() to abort as soon as killed by the
OOM-killer. But returning -ENOMEM is wrong if killed by SIGKILL from
userspace. Change kthread_create() to return -EINTR upon SIGKILL.
Naoya Horiguchi [Wed, 4 Jun 2014 23:05:35 +0000 (16:05 -0700)]
hugetlb: restrict hugepage_migration_support() to x86_64
Currently hugepage migration is available for all archs which support
pmd-level hugepage, but testing is done only for x86_64 and there're
bugs for other archs. So to avoid breaking such archs, this patch
limits the availability strictly to x86_64 until developers of other
archs get interested in enabling this feature.
Simply disabling hugepage migration on non-x86_64 archs is not enough to
fix the reported problem where sys_move_pages() hits the BUG_ON() in
follow_page(FOLL_GET), so let's fix this by checking if hugepage
migration is supported in vma_migratable().
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reported-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Hugh Dickins <hughd@google.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Miller <davem@davemloft.net> Cc: <stable@vger.kernel.org> [3.12+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Right, since conversion to mutex then rwsem, we should not put_anon_vma()
from inside an rcu_read_lock()ed section: fix the two places that did so.
And add might_sleep() to anon_vma_free(), as suggested by Peter Zijlstra.
Fixes: 88c22088bf23 ("mm: optimize page_lock_anon_vma() fast-path") Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andi Kleen [Wed, 4 Jun 2014 23:05:32 +0000 (16:05 -0700)]
MAINTAINERS: pass on hwpoison maintainership to Naoya Horiguchi
Horiguchi-san has done most of the work on hwpoison in the last years
and he also does most of the reviewing. So I'm passing on the hwpoison
maintainership to him.
tools/vm/page-types.c: catch sigbus if raced with truncate
Recently added page-cache dumping is known to be a little bit racy.
But after race with truncate it just dies due to unhandled SIGBUS
when it tries to poke pages beyond the new end of file.
This patch adds handler for SIGBUS which skips the rest of the file.
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 4 Jun 2014 17:02:38 +0000 (10:02 -0700)]
Merge tag 'devicetree-for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux into next
Pull DeviceTree updates from Rob Herring:
- Another round of clean-up of FDT related code in architecture code.
This removes knowledge of internal FDT details from most
architectures except powerpc.
- Conversion of kernel's custom FDT parsing code to use libfdt.
- DT based initialization for generic serial earlycon. The
introduction of generic serial earlycon support went in through the
tty tree.
- Improve the platform device naming for DT probed devices to ensure
unique naming and use parent names instead of a global index.
- Fix a race condition in of_update_property.
- Unify the various linker section OF match tables and fix several
function prototype errors.
- Update platform_get_irq_byname to work in deferred probe cases.
- 2 binding doc updates
* tag 'devicetree-for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (58 commits)
of: handle NULL node in next_child iterators
of/irq: provide more wrappers for !CONFIG_OF
devicetree: bindings: Document micrel vendor prefix
dt: bindings: dwc2: fix required value for the phy-names property
of_pci_irq: kill useless variable in of_irq_parse_pci()
of/irq: do irq resolution in platform_get_irq_byname()
of: Add a testcase for of_find_node_by_path()
of: Make of_find_node_by_path() handle /aliases
of: Create unlocked version of for_each_child_of_node()
lib: add glibc style strchrnul() variant
of: Handle memory@0 node on PPC32 only
pci/of: Remove dead code
of: fix race between search and remove in of_update_property()
of: Use NULL for pointers
of: Stop naming platform_device using dcr address
of: Ensure unique names without sacrificing determinism
tty/serial: pl011: add DT based earlycon support
of/fdt: add FDT serial scanning for earlycon
of/fdt: add FDT address translation support
serial: earlycon: add DT support
...
Linus Torvalds [Wed, 4 Jun 2014 16:08:25 +0000 (09:08 -0700)]
Merge tag 'sound-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound into next
Pull sound updates from Takashi Iwai:
"At this time, majority of changes come from ASoC world while we got a
few new drivers in other places for FireWire and USB. There have been
lots of ASoC core cleanups / refactoring, but very little visible to
external users.
ASoC:
- Support for specifying aux CODECs in DT
- Removal of the deprecated mux and enum macros
- More moves towards full componentisation
- Removal of some unused I/O code
- Lots of cleanups, fixes and enhancements to the davinci, Freescale,
Haswell and Realtek drivers
- Several drivers exposed directly in Kconfig for use with
simple-card
- GPIO descriptor support for jacks
- More updates and fixes to the Freescale SSI, Intel and rsnd drivers
- New drivers for Cirrus CS42L56, Realtek RT5639, RT5642 and RT5651
and ST STA350, Analog Devices ADAU1361, ADAU1381, ADAU1761 and
ADAU1781, and Realtek RT5677
HD-audio:
- Clean up Dell headset quirks
- Noise fixes for Dell and Sony laptops
- Thinkpad T440 dock fix
- Realtek codec updates (ALC293,ALC233,ALC3235)
- Tegra HD-audio HDMI support
FireWire-audio:
- FireWire audio stack enhancement (AMDTP, MIDI), support for
incoming isochronous stream and duplex streams with timestamp
synchronization
- BeBoB-based devices support
- Fireworks-based device support
Misc:
- Clean up of a few old drivers, atmel, fm801, etc"
* tag 'sound-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (480 commits)
ASoC: Fix wrong argument for card remove callbacks
ASoC: free jack GPIOs before the sound card is freed
ALSA: firewire-lib: Remove a comment about restriction of asynchronous operation
ASoC: cache: Fix error code when not using ASoC level cache
ALSA: hda/realtek - Fix COEF widget NID for ALC260 replacer fixup
ALSA: hda/realtek - Correction of fixup codes for PB V7900 laptop
ALSA: firewire-lib: Use IEC 61883-6 compliant labels for Raw Audio data
ASoC: add RT5677 CODEC driver
ASoC: intel: The Baytrail/MAX98090 driver depends on I2C
ASoC: rt5640: Add the function "get_clk_info" to RL6231 shared support
ASoC: rt5640: Add the function of the PLL clock calculation to RL6231 shared support
ASoC: rt5640: Add RL6231 class device shared support for RT5640, RT5645 and RT5651
ASoC: cache: Fix possible ZERO_SIZE_PTR pointer dereferencing error.
ASoC: Add helper functions to cast from DAPM context to CODEC/platform
ALSA: bebob: sizeof() vs ARRAY_SIZE() typo
ASoC: wm9713: correct mono out PGA sources
ALSA: synth: emux: soundfont.c: Cleaning up memory leak
ASoC: fsl: Remove dependencies of boards for SND_SOC_EUKREA_TLV320
ASoC: fsl-ssi: Use regmap
ASoC: fsl-ssi: reorder and document fsl_ssi_private
...
Linus Torvalds [Wed, 4 Jun 2014 16:05:12 +0000 (09:05 -0700)]
Merge tag 'fbdev-main-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux into next
Pull main fbdev changes from Tomi Valkeinen:
"Mainly fixes and small improvements. The biggest change seems to be
backlight control support for mx3fb"
* tag 'fbdev-main-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: (31 commits)
drivers/video/fbdev/fb-puv3.c: Add header files for function unifb_mmap
video: fbdev: s3fb.c: Fix for possible null pointer dereference
video: fbdev: grvga.c: Fix for possible null pointer dereference
matroxfb: perform a dummy read of M_STATUS
video: of: display_timing: fix default native-mode setting
video: delete unneeded call to platform_get_drvdata
video: mx3fb: Add backlight control support
video: omap: delete support for early fbmem allocation
video: of: display_timing: remove two unsafe error messages
fbdev: fbmem: remove positive test on unsigned values
fbcon: Fix memory leak in con2fb_release_oldinfo()
video: Kconfig: Add a dependency to the Goldfish framebuffer driver
video: exynos: Add a dependency to the menu
video: mx3fb: Use devm_kzalloc
video/nuc900: allow modular build
video: atmel needs FB_BACKLIGHT
video: export fb_prepare_logo
video/mbx: fix building debugfs support
video/omap: fix modular build
video: clarify I2C dependencies
...
Linus Torvalds [Wed, 4 Jun 2014 15:57:16 +0000 (08:57 -0700)]
Merge tag 'pm+acpi-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm into next
Pull ACPI and power management updates from Rafael Wysocki:
"ACPICA is the leader this time (63 commits), followed by cpufreq (28
commits), devfreq (15 commits), system suspend/hibernation (12
commits), ACPI video and ACPI device enumeration (10 commits each).
We have no major new features this time, but there are a few
significant changes of how things work. The most visible one will
probably be that we are now going to create platform devices rather
than PNP devices by default for ACPI device objects with _HID. That
was long overdue and will be really necessary to be able to use the
same drivers for the same hardware blocks on ACPI and DT-based systems
going forward. We're not expecting fallout from this one (as usual),
but it's something to watch nevertheless.
The second change having a chance to be visible is that ACPI video
will now default to using native backlight rather than the ACPI
backlight interface which should generally help systems with broken
Win8 BIOSes. We're hoping that all problems with the native backlight
handling that we had previously have been addressed and we are in a
good enough shape to flip the default, but this change should be easy
enough to revert if need be.
In addition to that, the system suspend core has a new mechanism to
allow runtime-suspended devices to stay suspended throughout system
suspend/resume transitions if some extra conditions are met
(generally, they are related to coordination within device hierarchy).
However, enabling this feature requires cooperation from the bus type
layer and for now it has only been implemented for the ACPI PM domain
(used by ACPI-enumerated platform devices mostly today).
Also, the acpidump utility that was previously shipped as a separate
tool will now be provided by the upstream ACPICA along with the rest
of ACPICA code, which will allow it to be more up to date and better
supported, and we have one new cpuidle driver (ARM clps711x).
The rest is improvements related to certain specific use cases,
cleanups and fixes all over the place.
Specifics:
- ACPICA update to upstream version 20140424. That includes a number
of fixes and improvements related to things like GPE handling,
table loading, headers, memory mapping and unmapping, DSDT/SSDT
overriding, and the Unload() operator. The acpidump utility from
upstream ACPICA is included too. From Bob Moore, Lv Zheng, David
Box, David Binderman, and Colin Ian King.
- Fixes and cleanups related to ACPI video and backlight interfaces
from Hans de Goede. That includes blacklist entries for some new
machines and using native backlight by default.
- ACPI device enumeration changes to create platform devices rather
than PNP devices for ACPI device objects with _HID by default. PNP
devices will still be created for the ACPI device object with
device IDs corresponding to real PNP devices, so that change should
not break things left and right, and we're expecting to see more
and more ACPI-enumerated platform devices in the future. From
Zhang Rui and Rafael J Wysocki.
- Updates for the ACPI LPSS (Low-Power Subsystem) driver allowing it
to handle system suspend/resume on Asus T100 correctly. From
Heikki Krogerus and Rafael J Wysocki.
- PM core update introducing a mechanism to allow runtime-suspended
devices to stay suspended over system suspend/resume transitions if
certain additional conditions related to coordination within device
hierarchy are met. Related PM documentation update and ACPI PM
domain support for the new feature. From Rafael J Wysocki.
- Fixes and improvements related to the "freeze" sleep state. They
affect several places including cpuidle, PM core, ACPI core, and
the ACPI battery driver. From Rafael J Wysocki and Zhang Rui.
- Miscellaneous fixes and updates of the ACPI core from Aaron Lu,
Bjørn Mork, Hanjun Guo, Lan Tianyu, and Rafael J Wysocki.
- Fixes and cleanups for the ACPI processor and ACPI PAD (Processor
Aggregator Device) drivers from Baoquan He, Manuel Schölling, Tony
Camuso, and Toshi Kani.
- System suspend/resume optimization in the ACPI battery driver from
Lan Tianyu.
- OPP (Operating Performance Points) subsystem updates from Chander
Kashyap, Mark Brown, and Nishanth Menon.
- cpufreq core fixes, updates and cleanups from Srivatsa S Bhat,
Stratos Karafotis, and Viresh Kumar.
- Updates, fixes and cleanups for the Tegra, powernow-k8, imx6q,
s5pv210, nforce2, and powernv cpufreq drivers from Brian Norris,
Jingoo Han, Paul Bolle, Philipp Zabel, Stratos Karafotis, and
Viresh Kumar.
- intel_pstate driver fixes and cleanups from Dirk Brandewie, Doug
Smythies, and Stratos Karafotis.
- Enabling the big.LITTLE cpufreq driver on arm64 from Mark Brown.
- Fix for the cpuidle menu governor from Chander Kashyap.
- New ARM clps711x cpuidle driver from Alexander Shiyan.
- Hibernate core fixes and cleanups from Chen Gang, Dan Carpenter,
Fabian Frederick, Pali Rohár, and Sebastian Capella.
- Intel RAPL (Running Average Power Limit) driver updates from Jacob
Pan.
- PNP subsystem updates from Bjorn Helgaas and Fabian Frederick.
- devfreq core updates from Chanwoo Choi and Paul Bolle.
- devfreq updates for exynos4 and exynos5 from Chanwoo Choi and
Bartlomiej Zolnierkiewicz.
- turbostat tool fix from Jean Delvare.
- cpupower tool updates from Prarit Bhargava, Ramkumar Ramachandra
and Thomas Renninger.
- New ACPI ec_access.c tool for poking at the EC in a safe way from
Thomas Renninger"
* tag 'pm+acpi-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (187 commits)
ACPICA: Namespace: Remove _PRP method support.
intel_pstate: Improve initial busy calculation
intel_pstate: add sample time scaling
intel_pstate: Correct rounding in busy calculation
intel_pstate: Remove C0 tracking
PM / hibernate: fixed typo in comment
ACPI: Fix x86 regression related to early mapping size limitation
ACPICA: Tables: Add mechanism to control early table checksum verification.
ACPI / scan: use platform bus type by default for _HID enumeration
ACPI / scan: always register ACPI LPSS scan handler
ACPI / scan: always register memory hotplug scan handler
ACPI / scan: always register container scan handler
ACPI / scan: Change the meaning of missing .attach() in scan handlers
ACPI / scan: introduce platform_id device PNP type flag
ACPI / scan: drop unsupported serial IDs from PNP ACPI scan handler ID list
ACPI / scan: drop IDs that do not comply with the ACPI PNP ID rule
ACPI / PNP: use device ID list for PNPACPI device enumeration
ACPI / scan: .match() callback for ACPI scan handlers
ACPI / battery: wakeup the system only when necessary
power_supply: allow power supply devices registered w/o wakeup source
...
Linus Torvalds [Wed, 4 Jun 2014 15:52:36 +0000 (08:52 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid into next
Pull HID patches from Jiri Kosina:
- RMI driver for Synaptics touchpads, by Benjamin Tissoires, Andrew
Duggan and Jiri Kosina
- cleanup of hid-sony driver and improved support for Sixaxis and
Dualshock 4, by Frank Praznik
- other usual small fixes and support for new device IDs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (29 commits)
HID: thingm: thingm_fwinfo[] doesn't need to be global
HID: core: add two new usages for digitizer
HID: hid-sensor-hub: new device id and quirk for STM Sensor hub
HID: usbhid: enable NO_INIT_REPORTS quirk for Semico USB Keykoard
HID: hid-sensor-hub: Set report quirk for Microsoft Surface
HID: debug: add labels for HID Sensor Usages
HID: uhid: Use kmemdup instead of kmalloc + memcpy
HID: rmi: do not handle touchscreens through hid-rmi
HID: quirk for Saitek RAT7 and MMO7 mices' mode button
HID: core: fix validation of report id 0
HID: rmi: fix masks for x and w_x data
HID: rmi: fix wrong struct field name
HID: rmi: do not fetch more than 16 bytes in a query
HID: rmi: check for the existence of some optional queries before reading query 12
HID: i2c-hid: hid report descriptor retrieval changes
HID: add missing hid usages
HID: hid-sony - allow 3rd party INTEC controller to turn off all leds
HID: sony: Add blink support to the Sixaxis and DualShock 4 LEDs
HID: sony: Initialize the controller LEDs with a device ID value
HID: sony: Use the controller Bluetooth MAC address as the unique value in the battery name string
...
Linus Torvalds [Wed, 4 Jun 2014 15:47:12 +0000 (08:47 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm into next
Pull KVM updates from Paolo Bonzini:
"At over 200 commits, covering almost all supported architectures, this
was a pretty active cycle for KVM. Changes include:
- a lot of s390 changes: optimizations, support for migration, GDB
support and more
- ARM changes are pretty small: support for the PSCI 0.2 hypercall
interface on both the guest and the host (the latter acked by
Catalin)
- initial POWER8 and little-endian host support
- support for running u-boot on embedded POWER targets
- pretty large changes to MIPS too, completing the userspace
interface and improving the handling of virtualized timer hardware
- for x86, a larger set of changes is scheduled for 3.17. Still, we
have a few emulator bugfixes and support for running nested
fully-virtualized Xen guests (para-virtualized Xen guests have
always worked). And some optimizations too.
The only missing architecture here is ia64. It's not a coincidence
that support for KVM on ia64 is scheduled for removal in 3.17"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (203 commits)
KVM: add missing cleanup_srcu_struct
KVM: PPC: Book3S PR: Rework SLB switching code
KVM: PPC: Book3S PR: Use SLB entry 0
KVM: PPC: Book3S HV: Fix machine check delivery to guest
KVM: PPC: Book3S HV: Work around POWER8 performance monitor bugs
KVM: PPC: Book3S HV: Make sure we don't miss dirty pages
KVM: PPC: Book3S HV: Fix dirty map for hugepages
KVM: PPC: Book3S HV: Put huge-page HPTEs in rmap chain for base address
KVM: PPC: Book3S HV: Fix check for running inside guest in global_invalidates()
KVM: PPC: Book3S: Move KVM_REG_PPC_WORT to an unused register number
KVM: PPC: Book3S: Add ONE_REG register names that were missed
KVM: PPC: Add CAP to indicate hcall fixes
KVM: PPC: MPIC: Reset IRQ source private members
KVM: PPC: Graciously fail broken LE hypercalls
PPC: ePAPR: Fix hypercall on LE guest
KVM: PPC: BOOK3S: Remove open coded make_dsisr in alignment handler
KVM: PPC: BOOK3S: Always use the saved DAR value
PPC: KVM: Make NX bit available with magic page
KVM: PPC: Disable NX for old magic page using guests
KVM: PPC: BOOK3S: HV: Add mixed page-size support for guest
...
Linus Torvalds [Wed, 4 Jun 2014 15:39:03 +0000 (08:39 -0700)]
Merge tag 'jfs-3.16' of git://github.com/kleikamp/linux-shaggy into next
Pull jfs changes from Dave Kleikamp.
* tag 'jfs-3.16' of git://github.com/kleikamp/linux-shaggy:
fs/jfs/super.c: convert simple_str to kstr
fs/jfs/jfs_dmap.c: replace min/casting by min_t
fs/jfs/super.c: remove 0 assignment to static + code clean-up
fs/jfs/jfs_logmgr.c: remove NULL assignment on static
JFS: Check for NULL before calling posix_acl_equiv_mode()
fs/jfs/jfs_inode.c: atomically set inode->i_flags
Linus Torvalds [Wed, 4 Jun 2014 15:30:10 +0000 (08:30 -0700)]
Merge tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw into next
Pull gfs2 updates from Steven Whitehouse:
"This must be about the smallest merge window patch set ever for GFS2.
It is probably also the first one without a single patch from me.
That is down to a combination of factors, and I have some things in
the works that are not quite ready yet, that I hope to put in next
time around.
Returning to what is here this time... we have 3 patches which fix
various warnings. Two are bug fixes (for quotas and also a rare
recovery race condition). The final patch, from Ben Marzinski, is an
important change in the freeze code which has been in progress for
some time. This removes the need to take and drop the transaction
lock for every single transaction, when the only time it was used, was
at file system freeze time. Ben's patch integrates the freeze
operation into the journal flush code as an alternative with lower
overheads and also lands up resolving some difficult to fix races at
the same time"
* tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw:
GFS2: Prevent recovery before the local journal is set
GFS2: fs/gfs2/file.c: kernel-doc warning fixes
GFS2: fs/gfs2/bmap.c: kernel-doc warning fixes
GFS2: remove transaction glock
GFS2: lops.c: replace 0 by NULL for pointers
GFS2: quotas not being refreshed in gfs2_adjust_quota
Linus Torvalds [Wed, 4 Jun 2014 15:12:50 +0000 (08:12 -0700)]
Merge tag 'locks-v3.16' of git://git.samba.org/jlayton/linux into next
Pull file locking changes from Jeff Layton:
"Pretty quiet on the file-locking related front this cycle. Just some
small cleanups and the addition of some tracepoints in the lease
handling code"
* tag 'locks-v3.16' of git://git.samba.org/jlayton/linux:
locks: add some tracepoints in the lease handling code
fs/locks.c: replace seq_printf by seq_puts
locks: ensure that fl_owner is always initialized properly in flock and lease codepaths
Linus Torvalds [Wed, 4 Jun 2014 14:47:03 +0000 (07:47 -0700)]
Merge tag 'firewire-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 into next
Pull firewire updates from Stefan Richter:
"IEEE 1394 (FireWire) subsystem changes: One optimization for some VIA
controllers, one fix, one kconfig brushup"
* tag 'firewire-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
firewire: ohci: enable MSI for VIA VT6315 rev 1, drop cycle timer quirk
firewire: Use COMPILE_TEST for build testing
firewire: net: fix NULL derefencing in fwnet_probe()
Thomas Gleixner [Wed, 4 Jun 2014 10:34:15 +0000 (12:34 +0200)]
clocksource: versatile: Use sched_clock_register()
The newly merged versatile sched clock support uses a deprecated
interface. Of course that patch got routed through the ARM tree instead
of going through the relevant maintainer tree.
Use the proper interface so we can get rid of the cruft.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Florian Fainelli [Wed, 28 May 2014 17:39:02 +0000 (10:39 -0700)]
of: handle NULL node in next_child iterators
Add an early check for the node argument in __of_get_next_child and
of_get_next_available_child() to avoid dereferencing a NULL node pointer
a few lines after.
CC: Daniel Mack <zonque@gmail.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Grant Likely <grant.likely@linaro.org>
Arnd Bergmann [Wed, 4 Jun 2014 09:40:19 +0000 (04:40 -0500)]
of/irq: provide more wrappers for !CONFIG_OF
The pci-rcar driver is enabled for compile tests, and this has
now shown that the driver cannot build without CONFIG_OF,
following the inclusion of f8f2fe7355fb "PCI: rcar: Use new OF
interrupt mapping when possible":
drivers/built-in.o: In function `rcar_pci_map_irq':
:(.text+0x1cc7c): undefined reference to `of_irq_parse_and_map_pci'
pci/host/pcie-rcar.c: In function 'pci_dma_range_parser_init':
pci/host/pcie-rcar.c:875:2: error: implicit declaration of function 'of_n_addr_cells' [-Werror=implicit-function-declaration]
As pointed out by Ben Dooks and Geert Uytterhoeven, this is actually
supposed to build fine, which we can achieve if we make the
declaration of of_irq_parse_and_map_pci conditional on CONFIG_OF
and provide an empty inline function otherwise, as we do for
a lot of other of interfaces.
This lets us build the rcar_pci driver again without CONFIG_OF
for build testing. All platforms using this driver select OF,
so this doesn't change anything for the users.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: devicetree@vger.kernel.org Cc: Rob Herring <robh+dt@kernel.org> Cc: Grant Likely <grant.likely@linaro.org> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Magnus Damm <damm@opensource.se> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ben Dooks <ben.dooks@codethink.co.uk> Cc: linux-pci@vger.kernel.org Cc: linux-sh@vger.kernel.org
[robh: drop wrappers for of_n_addr_cells and of_n_size_cells which are
low-level functions that should not be used for !OF] Signed-off-by: Rob Herring <robh@kernel.org>
Linus Torvalds [Tue, 3 Jun 2014 22:48:23 +0000 (15:48 -0700)]
Merge branch 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull x86/UV changes from Ingo Molnar:
"Continued updates for SGI UV 3 hardware support"
* 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/UV: Fix conditional in gru_exit()
x86/UV: Set n_lshift based on GAM_GR_CONFIG MMR for UV3
Linus Torvalds [Tue, 3 Jun 2014 22:47:40 +0000 (15:47 -0700)]
Merge branch 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull x86 RAS changes from Ingo Molnar:
"Improve mcheck device initialization and bootstrap robustness"
* 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
mce: Panic when a core has reached a timeout
x86/mce: Improve mcheck_init_device() error handling
Linus Torvalds [Tue, 3 Jun 2014 22:46:38 +0000 (15:46 -0700)]
Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull x86 IOSF platform updates from Ingo Molnar:
"IOSF (Intel OnChip System Fabric) updates:
- generalize the IOSF interface to allow mixed mode drivers: non-IOSF
drivers to utilize of IOSF features on IOSF platforms.
- add 'Quark X1000' IOSF/MBI support
- clean up BayTrail and Quark PCI ID enumeration"
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, iosf: Add PCI ID macros for better readability
x86, iosf: Add Quark X1000 PCI ID
x86, iosf: Added Quark MBI identifiers
x86, iosf: Make IOSF driver modular and usable by more drivers
Linus Torvalds [Tue, 3 Jun 2014 22:45:50 +0000 (15:45 -0700)]
Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull x86 mm update from Ingo Molnar:
- speed up 256 GB PCI BAR ioremap()s
- speed up PTE swapout page reclaim case
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, ioremap: Speed up check for RAM pages
x86/mm: In the PTE swapout page reclaim case clear the accessed bit instead of flushing the TLB
Linus Torvalds [Tue, 3 Jun 2014 22:44:50 +0000 (15:44 -0700)]
Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull x86 microcode changes from Ingo Molnar:
"A microcode-debugging boot flag plus related refactoring"
* 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, microcode: Add a disable chicken bit
x86, boot: Carve out early cmdline parsing function
Linus Torvalds [Tue, 3 Jun 2014 22:43:38 +0000 (15:43 -0700)]
Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull x86 irq cleanup from Ingo Molnar:
"A single, trivial cleanup"
* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/irq: Clean up VECTOR_UNDEFINED and VECTOR_RETRIGGERED definition
Linus Torvalds [Tue, 3 Jun 2014 22:43:00 +0000 (15:43 -0700)]
Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull x86 build cleanups from Ingo Molnar:
"Two small build related cleanups"
* 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/build: Supress realmode.bin is up to date message
compiler-intel.h: Remove duplicate definition
Linus Torvalds [Tue, 3 Jun 2014 22:41:57 +0000 (15:41 -0700)]
Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull x86 boot changes from Ingo Molnar:
"Two small cleanups"
* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, boot: Remove misc.h inclusion from compressed/string.c
x86, boot: Do not include boot.h in string.c
* acpi-tools:
ACPI / tools: Introduce ec_access.c - tool to access the EC
* pm-tools:
cpupower: Remove mc and smt power aware scheduler info/settings
cpupower: cpupower info -b should return 0 on success, not the perf bias value
cpupower: If root, try to load msr driver on x86 if /dev/cpu/0/msr is not available
cpupower: Install recently added cpupower-idle-{set, info} manpages
cpupower: Introduce idle state disable-by-latency and enable-all
cpupower: Remove all manpages on make uninstall
cpupower: Remove dead link to homepage, and update the targets built.
cpupower: Rename cpufrequtils -> cpupower, and libcpufreq -> libcpupower.
PM / tools: cpupower: add option to display values without round offs
tools / power: turbostat: Drop temperature checks
* pm-cpufreq: (28 commits)
cpufreq: handle calls to ->target_index() in separate routine
cpufreq: s5pv210: drop check for CONFIG_PM_VERBOSE
cpufreq: intel_pstate: Remove unused member name of cpudata
cpufreq: Break out early when frequency equals target_freq
cpufreq: Tegra: drop wrapper around tegra_update_cpu_speed()
cpufreq: imx6q: Remove unused include
cpufreq: imx6q: Drop devm_clk/regulator_get usage
cpufreq: powernow-k8: Suppress checkpatch warnings
cpufreq: powernv: make local function static
cpufreq: Enable big.LITTLE cpufreq driver on arm64
cpufreq: nforce2: remove DEFINE_PCI_DEVICE_TABLE macro
intel_pstate: Add CPU IDs for Broadwell processors
cpufreq: Fix build error on some platforms that use cpufreq_for_each_*
PM / OPP: Move cpufreq specific OPP functions out of generic OPP library
PM / OPP: Remove cpufreq wrapper dependency on internal data organization
cpufreq: Catch double invocations of cpufreq_freq_transition_begin/end
intel_pstate: Remove sample parameter in intel_pstate_calc_busy
cpufreq: Kconfig: Fix spelling errors
cpufreq: Make linux-pm@vger.kernel.org official mailing list
cpufreq: exynos: Use dev_err/info function instead of pr_err/info
...
Merge branches 'pnp', 'powercap', 'pm-runtime' and 'pm-opp'
* pnp:
MAINTAINERS: Remove Bjorn Helgaas as PNP maintainer
PNP / resources: remove positive test on unsigned values
* powercap:
powercap / RAPL: add new CPU IDs
powercap / RAPL: further relax energy counter checks
* pm-runtime:
PM / runtime: Update documentation to reflect the current code flow
* pm-opp:
PM / OPP: discard duplicate OPPs
PM / OPP: Make OPP invisible to users in Kconfig
PM / OPP: fix incorrect OPP count handling in of_init_opp_table
* acpi-video:
ACPI / video: Add 4 new models to the use_native_backlight DMI list
ACPI / video: Add use native backlight quirk for the ThinkPad W530
ACPI / video: Unregister the backlight device if a raw one shows up later
backlight: Add backlight device (un)registration notification
nouveau: Don't check acpi_video_backlight_support() before registering backlight
acer-wmi: Add Aspire 5741 to video_vendor_dmi_table
acer-wmi: Switch to acpi_video_unregister_backlight
ACPI / video: Add an acpi_video_unregister_backlight function
ACPI / video: Don't register acpi_video_resume notifier without backlight devices
ACPI / video: change acpi-video brightness_switch_enabled default to 0
* acpica: (63 commits)
ACPICA: Namespace: Remove _PRP method support.
ACPI: Fix x86 regression related to early mapping size limitation
ACPICA: Tables: Add mechanism to control early table checksum verification.
ACPICA: acpidump: Fix repetitive table dump in -n mode.
ACPI: Clean up acpi_os_map/unmap_memory() to eliminate __iomem.
ACPICA: Clean up redudant definitions already defined elsewhere
ACPICA: Linux headers: Add <asm/acenv.h> to remove mis-ordered inclusion of <asm/acpi.h>
ACPICA: Linux headers: Add <acpi/platform/aclinuxex.h>
ACPICA: Linux headers: Remove ACPI_PREEMPTION_POINT() due to no usages.
ACPICA: Update version to 20140424.
ACPICA: Comment/format update, no functional change.
ACPICA: Events: Update GPE handling and initialization code.
ACPICA: Remove extraneous error message for large number of GPEs.
ACPICA: Tables: Remove old mechanism to validate if XSDT contains NULL entries.
ACPICA: Tables: Add new mechanism to skip NULL entries in RSDT and XSDT.
ACPICA: acpidump: Add support to force using RSDT.
ACPICA: Back port of improvements on exception code.
ACPICA: Back port of _PRP update.
ACPICA: acpidump: Fix truncated RSDP signature validation.
ACPICA: Linux header: Add support for stubbed externals.
...
* acpi-enumeration:
ACPI / scan: use platform bus type by default for _HID enumeration
ACPI / scan: always register ACPI LPSS scan handler
ACPI / scan: always register memory hotplug scan handler
ACPI / scan: always register container scan handler
ACPI / scan: Change the meaning of missing .attach() in scan handlers
ACPI / scan: introduce platform_id device PNP type flag
ACPI / scan: drop unsupported serial IDs from PNP ACPI scan handler ID list
ACPI / scan: drop IDs that do not comply with the ACPI PNP ID rule
ACPI / PNP: use device ID list for PNPACPI device enumeration
ACPI / scan: .match() callback for ACPI scan handlers
* acpi-pm:
ACPI / PM: Export rest of the subsys PM callbacks
ACPI / PM: Avoid resuming devices in ACPI PM domain during system suspend
ACPI / PM: Hold ACPI scan lock over the "freeze" sleep state
ACPI / PM: Export acpi_target_system_state() to modules
* acpi-battery:
ACPI / battery: wakeup the system only when necessary
power_supply: allow power supply devices registered w/o wakeup source
ACPI / battery: introduce support for POWER_SUPPLY_PROP_CAPACITY_LEVEL
ACPI / battery: Accelerate battery resume callback
* pm-sleep:
PM / hibernate: fixed typo in comment
PM / sleep: unregister wakeup source when disabling device wakeup
PM / sleep: Introduce command line argument for sleep state enumeration
PM / sleep: Use valid_state() for platform-dependent sleep states only
PM / sleep: Add state field to pm_states[] entries
PM / sleep: Update device PM documentation to cover direct_complete
PM / sleep: Mechanism to avoid resuming runtime-suspended devices unnecessarily
PM / hibernate: Fix memory corruption in resumedelay_setup()
PM / hibernate: convert simple_strtoul to kstrtoul
PM / hibernate: Documentation: Fix script for unswapping
PM / hibernate: no kernel_power_off when pm_power_off NULL
PM / hibernate: use unsigned local variables in swsusp_show_speed()
* pm-cpuidle:
PM / suspend: Always use deepest C-state in the "freeze" sleep state
cpuidle / menu: move repeated correction factor check to init
cpuidle / menu: Return (-1) if there are no suitable states
cpuidle: Combine cpuidle_enabled() with cpuidle_select()
ARM: clps711x: Add cpuidle driver
Linus Torvalds [Tue, 3 Jun 2014 21:00:15 +0000 (14:00 -0700)]
Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull scheduler updates from Ingo Molnar:
"The main scheduling related changes in this cycle were:
- various sched/numa updates, for better performance
- tree wide cleanup of open coded nice levels
- nohz fix related to rq->nr_running use
- cpuidle changes and continued consolidation to improve the
kernel/sched/idle.c high level idle scheduling logic. As part of
this effort I pulled cpuidle driver changes from Rafael as well.
- standardized idle polling amongst architectures
- continued work on preparing better power/energy aware scheduling
- sched/rt updates
- misc fixlets and cleanups"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (49 commits)
sched/numa: Decay ->wakee_flips instead of zeroing
sched/numa: Update migrate_improves/degrades_locality()
sched/numa: Allow task switch if load imbalance improves
sched/rt: Fix 'struct sched_dl_entity' and dl_task_time() comments, to match the current upstream code
sched: Consolidate open coded implementations of nice level frobbing into nice_to_rlimit() and rlimit_to_nice()
sched: Initialize rq->age_stamp on processor start
sched, nohz: Change rq->nr_running to always use wrappers
sched: Fix the rq->next_balance logic in rebalance_domains() and idle_balance()
sched: Use clamp() and clamp_val() to make sys_nice() more readable
sched: Do not zero sg->cpumask and sg->sgp->power in build_sched_groups()
sched/numa: Fix initialization of sched_domain_topology for NUMA
sched: Call select_idle_sibling() when not affine_sd
sched: Simplify return logic in sched_read_attr()
sched: Simplify return logic in sched_copy_attr()
sched: Fix exec_start/task_hot on migrated tasks
arm64: Remove TIF_POLLING_NRFLAG
metag: Remove TIF_POLLING_NRFLAG
sched/idle: Make cpuidle_idle_call() void
sched/idle: Reflow cpuidle_idle_call()
sched/idle: Delay clearing the polling bit
...
Linus Torvalds [Tue, 3 Jun 2014 20:18:00 +0000 (13:18 -0700)]
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull perf updates from Ingo Molnar:
"The tooling changes maintained by Jiri Olsa until Arnaldo is on
vacation:
User visible changes:
- Add -F option for specifying output fields (Namhyung Kim)
- Propagate exit status of a command line workload for record command
(Namhyung Kim)
- Use tid for finding thread (Namhyung Kim)
- Clarify the output of perf sched map plus small sched command
fixes (Dongsheng Yang)
- Wire up perf_regs and unwind support for ARM64 (Jean Pihet)
- Factor hists statistics counts processing which in turn also fixes
several bugs in TUI report command (Namhyung Kim)
- Add --percentage option to control absolute/relative percentage
output (Namhyung Kim)
- Add --list-cmds to 'kmem', 'mem', 'lock' and 'sched', for use by
completion scripts (Ramkumar Ramachandra)
Development/infrastructure changes and fixes:
- Android related fixes for pager and map dso resolving (Michael
Lentine)
- Add libdw DWARF post unwind support for ARM (Jean Pihet)
- Consolidate types.h for ARM and ARM64 (Jean Pihet)
- Fix possible null pointer dereference in session.c (Masanari Iida)
- Cleanup, remove unused variables in map_switch_event() (Dongsheng
Yang)
- Remove nr_state_machine_bugs in perf latency (Dongsheng Yang)
- Remove usage of trace_sched_wakeup(.success) (Peter Zijlstra)
- Cleanups for perf.h header (Jiri Olsa)
- Consolidate types.h and export.h within tools (Borislav Petkov)
- Move u64_swap union to its single user's header, evsel.h (Borislav
Petkov)
- Fix for s390 to properly parse tracepoints plus test code
(Alexander Yarygin)
- Handle EINTR error for readn/writen (Namhyung Kim)
- Add a test case for hists filtering (Namhyung Kim)
- Share map_groups among threads of the same group (Arnaldo Carvalho
de Melo, Jiri Olsa)
- Making some code (cpu node map and report parse callchain callback)
global to be usable by upcomming changes (Don Zickus)
- Fix pmu object compilation error (Jiri Olsa)
Kernel side changes:
- intrusive uprobes fixes from Oleg Nesterov. Since the interface is
admin-only, and the bug only affects user-space ("any probed
jmp/call can kill the application"), we queued these fixes via the
development tree, as a special exception.
- more fuzzer motivated race fixes and related refactoring and
robustization.
- allow PMU drivers to be built as modules. (No actual module yet,
because the x86 Intel uncore module wasn't ready in time for this)"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits)
perf tools: Add automatic remapping of Android libraries
perf tools: Add cat as fallback pager
perf tests: Add a testcase for histogram output sorting
perf tests: Factor out print_hists_*()
perf tools: Introduce reset_output_field()
perf tools: Get rid of obsolete hist_entry__sort_list
perf hists: Reset width of output fields with header length
perf tools: Skip elided sort entries
perf top: Add --fields option to specify output fields
perf report/tui: Fix a bug when --fields/sort is given
perf tools: Add ->sort() member to struct sort_entry
perf report: Add -F option to specify output fields
perf tools: Call perf_hpp__init() before setting up GUI browsers
perf tools: Consolidate management of default sort orders
perf tools: Allow hpp fields to be sort keys
perf ui: Get rid of callback from __hpp__fmt()
perf tools: Consolidate output field handling to hpp format routines
perf tools: Use hpp formats to sort final output
perf tools: Support event grouping in hpp ->sort()
perf tools: Use hpp formats to sort hist entries
...