Chao Yu [Wed, 6 May 2015 05:11:13 +0000 (13:11 +0800)]
f2fs: support FALLOC_FL_ZERO_RANGE
Now, FALLOC_FL_ZERO_RANGE flag in ->fallocate is supported in ext4/xfs.
In commit, the semantics of this flag is descripted as following:"
1) Make sure that both offset and len are block size aligned.
2) Update the i_size of inode by len bytes.
3) Compute the file's logical block number against offset. If the computed
block number is not the starting block of the extent, split the extent
such that the block number is the starting block of the extent.
4) Shift all the extents which are lying between
[offset, last allocated extent] towards right by len bytes. This step
will make a hole of len bytes at offset."
This patch implements fallocate's FALLOC_FL_ZERO_RANGE for f2fs.
Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Chao Yu [Wed, 6 May 2015 05:09:46 +0000 (13:09 +0800)]
f2fs: support FALLOC_FL_COLLAPSE_RANGE
Now, FALLOC_FL_COLLAPSE_RANGE flag in ->fallocate is supported in ext4/xfs.
In commit, the semantics of this flag is descripted as following:"
1) It collapses the range lying between offset and length by removing any
data blocks which are present in this range and than updates all the
logical offsets of extents beyond "offset + len" to nullify the hole
created by removing blocks. In short, it does not leave a hole.
2) It should be used exclusively. No other fallocate flag in combination.
3) Offset and length supplied to fallocate should be fs block size aligned
in case of xfs and ext4.
4) Collaspe range does not work beyond i_size."
This patch implements fallocate's FALLOC_FL_COLLAPSE_RANGE for f2fs.
Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Chao Yu [Wed, 6 May 2015 05:08:06 +0000 (13:08 +0800)]
f2fs: introduce f2fs_replace_block() for reuse
Introduce a generic function replace_block base on recover_data_page,
and export it. So with it we can operate file's meta data which is in
CP/SSA area when we invoke fallocate with FALLOC_FL_COLLAPSE_RANGE
flag.
Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
In set_node_addr, we try to lookup cached nat entry of inode and then
set flag in it.
But previously in this function, we have already grabbed nat entry with
current node id, if the node id is the same as the one of inode, we
do not need to lookup it in cache again.
So this patch adds condition judgment for reducing unneeded lookup.
Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Jaegeuk Kim [Fri, 1 May 2015 05:37:50 +0000 (22:37 -0700)]
f2fs: introduce discard_map for f2fs_trim_fs
This patch adds a bitmap for discard issues from f2fs_trim_fs.
There-in rule is to issue discard commands only for invalidated blocks
after mount.
Once mount is done, f2fs_trim_fs trims out whole invalid area.
After ehn, it will not issue and discrads redundantly.
Jaegeuk Kim [Wed, 29 Apr 2015 18:18:42 +0000 (11:18 -0700)]
f2fs: fix race on allocating and deallocating a dentry block
There are two threads:
f2fs_delete_entry() get_new_data_page()
f2fs_reserve_block()
dn.blkaddr = XXX
lock_page(dentry_block)
truncate_hole()
dn.blkaddr = NULL
unlock_page(dentry_block)
lock_page(dentry_block)
fill the block from XXX address
add new dentries
unlock_page(dentry_block)
Later, f2fs_write_data_page() will truncate the dentry_block, since
its block address is NULL.
The reason for this was due to the wrong lock order.
In this case, we should do f2fs_reserve_block() after locking its dentry block.
f2fs: add offset check routine before punch_hole() in f2fs_fallocate()
In the punch_hole(), if offset bigger than inode size, it returns SUCCESS.
Then f2fs_fallocate() will update time and dirty mark.
In that case, inode has not been modified actually.
So I have added offset check routine that prevent to call the punch_hole().
Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Our f2fs_acl_create is copied from posix_acl_create in ./fs/posix_acl.c and
modified to avoid deadlock bug when inline_dentry feature is enabled.
Dan Carpenter rewrites posix_acl_create in commit 2799563b281f
("fs/posix_acl.c: make posix_acl_create() safer and cleaner") to make this
function more safer, so that we can avoid potential bug in its caller,
especially for ocfs2.
Let's back port the patch to f2fs.
Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Linus Torvalds [Thu, 7 May 2015 15:27:38 +0000 (08:27 -0700)]
Merge tag 'pinctrl-v4.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Here is a smallish set of pin control fixes for the v4.1 cycle,
collected the last two weeks:
- fix a real nasty legacy bug that has screwed up the protection of
adding pinctrl maps dynamically. Normally this didn't happen so
much but Dough Anderson ran into it and fixed it, kudos!
- minor driver fixes for Qualcomm spmi, mediatek and Marvell drivers"
* tag 'pinctrl-v4.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: Don't just pretend to protect pinctrl_maps, do it for real
pinctrl: mediatek: mtk-common: initialize unmask
pinctrl: qcom-spmi-mpp: Fix input value report
pinctrl: qcom-spmi: Fix pin direction configuration
pinctrl: mvebu: Fix mapping of pin 63 (gpo -> gpio)
Linus Torvalds [Thu, 7 May 2015 14:04:33 +0000 (07:04 -0700)]
Merge tag 'for-linus' of git://github.com/dledford/linux
Pull infiniband updates from Doug Ledford:
"Minor updates for 4.1-rc
Most of the changes are fairly small and well confined. The iWARP
address reporting changes are the only ones that are a medium size. I
had these queued up prior to rc1, but due to the shuffle in
maintainers, they did not get submitted when I expected. My apologies
for that. I feel comfortable with them however due to the testing
they've received, so I left them in this submission"
* tag 'for-linus' of git://github.com/dledford/linux:
MAINTAINERS: Update InfiniBand subsystem maintainer
MAINTAINERS: add include/rdma/ to InfiniBand subsystem
IPoIB/CM: Fix indentation level
iw_cxgb4: Remove negative advice dmesg warnings
IB/core: Fix unaligned accesses
IB/core: change rdma_gid2ip into void function as it always return zero
IB/qib: use arch_phys_wc_add()
IB/qib: add acounting for MTRR
IB/core: dma unmap optimizations
IB/core: dma map/unmap locking optimizations
RDMA/cxgb4: Report the actual address of the remote connecting peer
RDMA/nes: Report the actual address of the remote connecting peer
RDMA/core: Enable the iWarp Port Mapper to provide the actual address of the connecting peer to its clients
iw_cxgb4: enforce qp/cq id requirements
iw_cxgb4: use BAR2 GTS register for T5 kernel mode CQs
iw_cxgb4: 32b platform fixes
iw_cxgb4: Cleanup register defines/MACROS
RDMA/CMA: Canonize IPv4 on IPV6 sockets properly
Linus Torvalds [Wed, 6 May 2015 22:58:06 +0000 (15:58 -0700)]
Merge tag 'for-linus-4.1b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen bug fixes from David Vrabel:
- fix blkback regression if using persistent grants
- fix various event channel related suspend/resume bugs
- fix AMD x86 regression with X86_BUG_SYSRET_SS_ATTRS
- SWIOTLB on ARM now uses frames <4 GiB (if available) so device only
capable of 32-bit DMA work.
* tag 'for-linus-4.1b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages on ARM
hypervisor/x86/xen: Unset X86_BUG_SYSRET_SS_ATTRS on Xen PV guests
xen/events: Set irq_info->evtchn before binding the channel to CPU in __startup_pirq()
xen/console: Update console event channel on resume
xen/xenbus: Update xenbus event channel on resume
xen/events: Clear cpu_evtchn_mask before resuming
xen-pciback: Add name prefix to global 'permissive' variable
xen: Suspend ticks on all CPUs during suspend
xen/grant: introduce func gnttab_unmap_refs_sync()
xen/blkback: safely unmap purge persistent grants
Linus Torvalds [Wed, 6 May 2015 17:57:37 +0000 (10:57 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"EFI fixes, and FPU fix, a ticket spinlock boundary condition fix and
two build fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fpu: Always restore_xinit_state() when use_eager_cpu()
x86: Make cpu_tss available to external modules
efi: Fix error handling in add_sysfs_runtime_map_entry()
x86/spinlocks: Fix regression in spinlock contention detection
x86/mm: Clean up types in xlate_dev_mem_ptr()
x86/efi: Store upper bits of command line buffer address in ext_cmd_line_ptr
efivarfs: Ensure VariableName is NUL-terminated
Linus Torvalds [Wed, 6 May 2015 17:47:25 +0000 (10:47 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Mostly tooling fixes, but also an uncore PMU driver fix and an uncore
PMU driver hardware-enablement addition"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf probe: Fix segfault if passed with ''.
perf report: Fix -T/--threads option to work again
perf bench numa: Fix immediate meeting of convergence condition
perf bench numa: Fixes of --quiet argument
perf bench futex: Fix hung wakeup tasks after requeueing
perf probe: Fix bug with global variables handling
perf top: Fix a segfault when kernel map is restricted.
tools lib traceevent: Fix build failure on 32-bit arch
perf kmem: Fix compiles on RHEL6/OL6
tools lib api: Undefine _FORTIFY_SOURCE before setting it
perf kmem: Consistently use PRIu64 for printing u64 values
perf trace: Disable events and drain events when forked workload ends
perf trace: Enable events when doing system wide tracing and starting a workload
perf/x86/intel/uncore: Move PCI IDs for IMC to uncore driver
perf/x86/intel/uncore: Add support for Intel Haswell ULT (lower power Mobile Processor) IMC uncore PMUs
perf/x86/intel: Add cpu_(prepare|starting|dying) for core_pmu
Doug Anderson [Fri, 1 May 2015 16:01:27 +0000 (09:01 -0700)]
pinctrl: Don't just pretend to protect pinctrl_maps, do it for real
Way back, when the world was a simpler place and there was no war, no
evil, and no kernel bugs, there was just a single pinctrl lock. That
was how the world was when (57291ce pinctrl: core device tree mapping
table parsing support) was written. In that case, there were
instances where the pinctrl mutex was already held when
pinctrl_register_map() was called, hence a "locked" parameter was
passed to the function to indicate that the mutex was already locked
(so we shouldn't lock it again).
A few years ago in (42fed7b pinctrl: move subsystem mutex to
pinctrl_dev struct), we switched to a separate pinctrl_maps_mutex.
...but (oops) we forgot to re-think about the whole "locked" parameter
for pinctrl_register_map(). Basically the "locked" parameter appears
to still refer to whether the bigger pinctrl_dev mutex is locked, but
we're using it to skip locks of our (now separate) pinctrl_maps_mutex.
That's kind of a bad thing(TM). Probably nobody noticed because most
of the calls to pinctrl_register_map happen at boot time and we've got
synchronous device probing. ...and even cases where we're
asynchronous don't end up actually hitting the race too often. ...but
after banging my head against the wall for a bug that reproduced 1 out
of 1000 reboots and lots of looking through kgdb, I finally noticed
this.
Anyway, we can now safely remove the "locked" parameter and go back to
a war-free, evil-free, and kernel-bug-free world.
Fixes: 42fed7ba44e4 ("pinctrl: move subsystem mutex to pinctrl_dev struct") Signed-off-by: Doug Anderson <dianders@chromium.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages on ARM
Make sure that xen_swiotlb_init allocates buffers that are DMA capable
when at least one memblock is available below 4G. Otherwise we assume
that all devices on the SoC can cope with >4G addresses. We do this on
ARM and ARM64, where dom0 is mapped 1:1, so pfn == mfn in this case.
Bobby Powers [Mon, 27 Apr 2015 15:10:41 +0000 (08:10 -0700)]
x86/fpu: Always restore_xinit_state() when use_eager_cpu()
The following commit:
f893959b0898 ("x86/fpu: Don't abuse drop_init_fpu() in flush_thread()")
removed drop_init_fpu() usage from flush_thread(). This seems to break
things for me - the Go 1.4 test suite fails all over the place with
floating point comparision errors (offending commit found through
bisection).
The functional change was that flush_thread() after this commit
only calls restore_init_xstate() when both use_eager_fpu() and
!used_math() are true. drop_init_fpu() (now fpu_reset_state()) calls
restore_init_xstate() regardless of whether current used_math() - apply
the same logic here.
Switch used_math() -> tsk_used_math(tsk) to consistently use the grabbed
tsk instead of current, like in the rest of flush_thread().
Tested-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Bobby Powers <bobbypowers@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Riikonen <priikone@iki.fi> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: f893959b ("x86/fpu: Don't abuse drop_init_fpu() in flush_thread()") Link: http://lkml.kernel.org/r/1430147441-9820-1-git-send-email-bobbypowers@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Wed, 6 May 2015 02:42:01 +0000 (19:42 -0700)]
Merge tag 'for-linus-4.1-1' of git://git.code.sf.net/p/openipmi/linux-ipmi
Pull IPMI fixes from Corey Minyard:
"Lots of minor IPMI fixes, especially ones that have have come up since
the SSIF driver has been in the main kernel for a while"
* tag 'for-linus-4.1-1' of git://git.code.sf.net/p/openipmi/linux-ipmi:
ipmi: Fix multi-part message handling
ipmi: Add alert handling to SSIF
ipmi: Fix a problem that messages are not issued in run_to_completion mode
ipmi: Report an error if ACPI _IFT doesn't exist
ipmi: Remove unused including <linux/version.h>
ipmi: Don't report err in the SI driver for SSIF devices
ipmi: Remove incorrect use of seq_has_overflowed
ipmi:ssif: Ignore spaces when comparing I2C adapter names
ipmi_ssif: Fix the logic on user-supplied addresses
Linus Torvalds [Wed, 6 May 2015 01:52:13 +0000 (18:52 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"16 patches
This includes a new rtc driver for the Abracon AB x80x and isn't very
appropriate for -rc2. It was still being fiddled with a bit during
the merge window and I fell asleep during -rc1"
[ So I took the new driver, it seems small and won't regress anything.
I'm a softy. - Linus ]
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
rtc: armada38x: fix concurrency access in armada38x_rtc_set_time
ocfs2: dlm: fix race between purge and get lock resource
nilfs2: fix sanity check of btree level in nilfs_btree_root_broken()
util_macros.h: have array pointer point to array of constants
configfs: init configfs module earlier at boot time
mm/hwpoison-inject: check PageLRU of hpage
mm/hwpoison-inject: fix refcounting in no-injection case
mm: soft-offline: fix num_poisoned_pages counting on concurrent events
rtc: add rtc-abx80x, a driver for the Abracon AB x80x i2c rtc
Documentation: bindings: add abracon,abx80x
kasan: show gcc version requirements in Kconfig and Documentation
mm/memory-failure: call shake_page() when error hits thp tail page
lib: delete lib/find_last_bit.c
MAINTAINERS: add co-maintainer for LED subsystem
zram: add Designated Reviewer for zram in MAINTAINERS
revert "zram: move compact_store() to sysfs functions area"
Linus Torvalds [Wed, 6 May 2015 01:14:04 +0000 (18:14 -0700)]
Merge tag 'platform-drivers-x86-v4.1-2' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86
Pull x86 platform driver fixes from Darren Hart:
"This includes a trivial warning and adding a Lenovo laptop to an
existing quirk.
I've held off on things like the latter in the past, but I didn't feel
it was risky enough to push out to 4.2.
- thinkpad_acpi:
Fix warning for static not at beginning
- ideapad_laptop:
Add Lenovo G40-30 to devices without radio switch"
* tag 'platform-drivers-x86-v4.1-2' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
thinkpad_acpi: Fix warning for static not at beginning
ideapad_laptop: Add Lenovo G40-30 to devices without radio switch
The values was not being re-initialized, if something went wrong
handling a multi-part message and it got left in a bad state, it
might be an issue.
The commands were not correct when issuing multi-part reads, the
code was not passing in the proper value for commands. Also clean
up some minor formatting issues.
Get the block number from the right location, limit the maximum send
message size to 63 bytes and explain why, and fix some minor sylistic
issues.
The SSIF interface can optionally have an SMBus alert come in when
data is ready. Unfortunately, the IPMI spec gives wiggle room to
the implementer to allow them to always have the alert enabled,
even if the driver doesn't enable it. So implement alerts.
If you don't in this situation, the SMBus alert handling will
constantly complain.
ipmi: Fix a problem that messages are not issued in run_to_completion mode
start_next_msg() issues a message placed in smi_info->waiting_msg
if it is non-NULL. However, sender() sets a message to
smi_info->curr_msg and NULL to smi_info->waiting_msg in the context
of run_to_completion mode. As the result, it leads an infinite
loop by waiting the completion of unissued message when leaving
dying message after kernel panic.
sender() should set the message to smi_info->waiting_msg not
curr_msg.
Gregory CLEMENT [Tue, 5 May 2015 23:24:05 +0000 (16:24 -0700)]
rtc: armada38x: fix concurrency access in armada38x_rtc_set_time
While setting the time, the RTC TIME register should not be accessed.
However due to hardware constraints, setting the RTC time involves
sleeping during 100ms. This sleep was done outside the critical section
protected by the spinlock, so it was possible to read the RTC TIME
register and get an incorrect value. This patch introduces a mutex for
protecting the RTC TIME access, unlike the spinlock it is allowed to
sleep in a critical section protected by a mutex.
The RTC STATUS register can still be used from the interrupt handler but
it has no effect on setting the time.
Junxiao Bi [Tue, 5 May 2015 23:24:02 +0000 (16:24 -0700)]
ocfs2: dlm: fix race between purge and get lock resource
There is a race window in dlm_get_lock_resource(), which may return a
lock resource which has been purged. This will cause the process to
hang forever in dlmlock() as the ast msg can't be handled due to its
lock resource not existing.
dlm_get_lock_resource {
...
spin_lock(&dlm->spinlock);
tmpres = __dlm_lookup_lockres_full(dlm, lockid, namelen, hash);
if (tmpres) {
spin_unlock(&dlm->spinlock);
>>>>>>>> race window, dlm_run_purge_list() may run and purge
the lock resource
spin_lock(&tmpres->spinlock);
...
spin_unlock(&tmpres->spinlock);
}
}
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ryusuke Konishi [Tue, 5 May 2015 23:24:00 +0000 (16:24 -0700)]
nilfs2: fix sanity check of btree level in nilfs_btree_root_broken()
The range check for b-tree level parameter in nilfs_btree_root_broken()
is wrong; it accepts the case of "level == NILFS_BTREE_LEVEL_MAX" even
though the level is limited to values in the range of 0 to
(NILFS_BTREE_LEVEL_MAX - 1).
Since the level parameter is read from storage device and used to index
nilfs_btree_path array whose element count is NILFS_BTREE_LEVEL_MAX, it
can cause memory overrun during btree operations if the boundary value
is set to the level parameter on device.
This fixes the broken sanity check and adds a comment to clarify that
the upper bound NILFS_BTREE_LEVEL_MAX is exclusive.
Guenter Roeck [Tue, 5 May 2015 23:23:57 +0000 (16:23 -0700)]
util_macros.h: have array pointer point to array of constants
Using the new find_closest() macro can result in the following sparse
warnings.
drivers/hwmon/lm85.c:194:16: warning:
incorrect type in initializer (different modifiers)
drivers/hwmon/lm85.c:194:16: expected int *__fc_a
drivers/hwmon/lm85.c:194:16: got int static const [toplevel] *<noident>
drivers/hwmon/lm85.c:210:16: warning:
incorrect type in initializer (different modifiers)
drivers/hwmon/lm85.c:210:16: expected int *__fc_a
drivers/hwmon/lm85.c:210:16: got int const *map
This is because the array passed to find_closest() will typically be
declared as array of constants, but the macro declares a non-constant
pointer to it.
Naoya Horiguchi [Tue, 5 May 2015 23:23:52 +0000 (16:23 -0700)]
mm/hwpoison-inject: check PageLRU of hpage
Hwpoison injector checks PageLRU of the raw target page to find out
whether the page is an appropriate target, but current code now filters
out thp tail pages, which prevents us from testing for such cases via this
interface. So let's check hpage instead of p.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Acked-by: Dean Nelson <dnelson@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Naoya Horiguchi [Tue, 5 May 2015 23:23:49 +0000 (16:23 -0700)]
mm/hwpoison-inject: fix refcounting in no-injection case
Hwpoison injection via debugfs:hwpoison/corrupt-pfn takes a refcount of
the target page. But current code doesn't release it if the target page
is not supposed to be injected, which results in memory leak. This patch
simply adds the refcount releasing code.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Acked-by: Dean Nelson <dnelson@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Naoya Horiguchi [Tue, 5 May 2015 23:23:46 +0000 (16:23 -0700)]
mm: soft-offline: fix num_poisoned_pages counting on concurrent events
If multiple soft offline events hit one free page/hugepage concurrently,
soft_offline_page() can handle the free page/hugepage multiple times,
which makes num_poisoned_pages counter increased more than once. This
patch fixes this wrong counting by checking TestSetPageHWPoison for normal
papes and by checking the return value of dequeue_hwpoisoned_huge_page()
for hugepages.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Acked-by: Dean Nelson <dnelson@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: <stable@vger.kernel.org> [3.14+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
rtc: add rtc-abx80x, a driver for the Abracon AB x80x i2c rtc
This is a basic driver for the ultra-low-power Abracon AB x80x series of RTC
chips. It supports in particular, the supersets AB0805 and AB1805.
It allows reading and writing the time, and enables the supercapacitor/
battery charger.
[arnd@arndb.de: abx805 depends on i2c]
[alexandre.belloni@free-electrons.com: renam buffer from date to buf in abx80x_rtc_read_time()] Signed-off-by: Philippe De Muyter <phdm@macqel.be> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Paul Bolle <pebolle@tiscali.nl> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 5 May 2015 23:23:38 +0000 (16:23 -0700)]
kasan: show gcc version requirements in Kconfig and Documentation
The documentation shows a need for gcc > 4.9.2, but it's really >=. The
Kconfig entries don't show require versions so add them. Correct a
latter/later typo too. Also mention that gcc 5 required to catch out of
bounds accesses to global and stack variables.
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Naoya Horiguchi [Tue, 5 May 2015 23:23:35 +0000 (16:23 -0700)]
mm/memory-failure: call shake_page() when error hits thp tail page
Currently memory_failure() calls shake_page() to sweep pages out from
pcplists only when the victim page is 4kB LRU page or thp head page.
But we should do this for a thp tail page too.
Consider that a memory error hits a thp tail page whose head page is on
a pcplist when memory_failure() runs. Then, the current kernel skips
shake_pages() part, so hwpoison_user_mappings() returns without calling
split_huge_page() nor try_to_unmap() because PageLRU of the thp head is
still cleared due to the skip of shake_page().
As a result, me_huge_page() runs for the thp, which is broken behavior.
One effect is a leak of the thp. And another is to fail to isolate the
memory error, so later access to the error address causes another MCE,
which kills the processes which used the thp.
This patch fixes this problem by calling shake_page() for thp tail case.
Fixes: 385de35722c9 ("thp: allow a hwpoisoned head page to be put back to LRU") Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Acked-by: Dean Nelson <dnelson@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Jin Dongming <jin.dongming@np.css.fujitsu.com> Cc: <stable@vger.kernel.org> [3.4+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yury Norov [Tue, 5 May 2015 23:23:33 +0000 (16:23 -0700)]
lib: delete lib/find_last_bit.c
The file lib/find_last_bit.c was no longer used and supposed to be
deleted by commit 8f6f19dd51 ("lib: move find_last_bit to
lib/find_next_bit.c") but that delete didn't happen. This gets rid of
it.
Minchan Kim [Tue, 5 May 2015 23:23:28 +0000 (16:23 -0700)]
zram: add Designated Reviewer for zram in MAINTAINERS
Sergey Senozhatsky has contributed/reviewed to zram for a long time. He
is really helpful for maintaining zram so I want for him to continue
helping me as Designated Reviewer unless he hates it.
Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jean Delvare [Mon, 27 Apr 2015 07:45:06 +0000 (09:45 +0200)]
thinkpad_acpi: Fix warning for static not at beginning
Fix the following warning:
warning: "static" is not at beginning of declaration
void static hotkey_mask_warn_incomplete_mask(void)
^
Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Henrique de Moraes Holschuh <ibm-acpi@hmh.eng.br> Cc: Darren Hart <dvhart@infradead.org> Signed-off-by: Darren Hart <dvhart@linux.intel.com>
Joe Perches [Sun, 22 Feb 2015 18:21:07 +0000 (10:21 -0800)]
ipmi: Remove incorrect use of seq_has_overflowed
commit d6c5dc18d863 ("ipmi: Remove uses of return value of seq_printf")
incorrectly changed the return value of various proc_show functions
to use seq_has_overflowed().
These functions should return 0 on completion rather than 1/true
on overflow. 1 is the same as #define SEQ_SKIP which would cause
the output to not be emitted (skipped) instead.
This is a logical defect only as the length of these outputs are
all smaller than the initial allocation done by the seq filesystem.
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Corey Minyard <cminyard@mvista.com>
Marc Dionne [Mon, 4 May 2015 18:16:44 +0000 (15:16 -0300)]
x86: Make cpu_tss available to external modules
Commit 75182b1632 ("x86/asm/entry: Switch all C consumers of
kernel_stack to this_cpu_sp0()") changed current_thread_info
to use this_cpu_sp0, and indirectly made it rely on init_tss
which was exported with EXPORT_PER_CPU_SYMBOL_GPL.
As a result some macros and inline functions such as set/get_fs,
test_thread_flag and variants have been made unusable for
external modules.
Make cpu_tss exported with EXPORT_PER_CPU_SYMBOL so that these
functions are accessible again, as they were previously.
Boris Ostrovsky [Mon, 4 May 2015 15:02:15 +0000 (11:02 -0400)]
hypervisor/x86/xen: Unset X86_BUG_SYSRET_SS_ATTRS on Xen PV guests
Commit 61f01dd941ba ("x86_64, asm: Work around AMD SYSRET SS descriptor
attribute issue") makes AMD processors set SS to __KERNEL_DS in
__switch_to() to deal with cases when SS is NULL.
This breaks Xen PV guests who do not want to load SS with__KERNEL_DS.
Since the problem that the commit is trying to address would have to be
fixed in the hypervisor (if it in fact exists under Xen) there is no
reason to set X86_BUG_SYSRET_SS_ATTRS flag for PV VPCUs here.
This can be easily achieved by adding x86_hyper_xen_hvm.set_cpu_features
op which will clear this flag. (And since this structure is no longer
HVM-specific we should do some renaming).
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Boris Ostrovsky [Wed, 29 Apr 2015 21:10:15 +0000 (17:10 -0400)]
xen/events: Set irq_info->evtchn before binding the channel to CPU in __startup_pirq()
.. because bind_evtchn_to_cpu(evtchn, cpu) will map evtchn to
'info' and pass 'info' down to xen_evtchn_port_bind_to_cpu().
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Tested-by: Annie Li <annie.li@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Boris Ostrovsky [Wed, 29 Apr 2015 21:10:12 +0000 (17:10 -0400)]
xen/events: Clear cpu_evtchn_mask before resuming
When a guest is resumed, the hypervisor may change event channel
assignments. If this happens and the guest uses 2-level events it
is possible for the interrupt to be claimed by wrong VCPU since
cpu_evtchn_mask bits may be stale. This can happen even though
evtchn_2l_bind_to_cpu() attempts to clear old bits: irq_info that
is passed in is not necessarily the original one (from pre-migration
times) but instead is freshly allocated during resume and so any
information about which CPU the channel was bound to is lost.
Thus we should clear the mask during resume.
We also need to make sure that bits for xenstore and console channels
are set when these two subsystems are resumed. While rebind_evtchn_irq()
(which is invoked for both of them on a resume) calls irq_set_affinity(),
the latter will in fact postpone setting affinity until handling the
interrupt. But because cpu_evtchn_mask will have bits for these two
cleared we won't be able to take the interrupt.
With that in mind, we need to bind those two channels explicitly in
rebind_evtchn_irq(). We will keep irq_set_affinity() so that we have a
pass through generic irq affinity code later, in case something needs
to be updated there as well.
(Also replace cpumask_of(0) with cpumask_of(info->cpu) in
rebind_evtchn_irq(): it should be set to zero in preceding
xen_irq_info_evtchn_setup().)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reported-by: Annie Li <annie.li@oracle.com> Cc: <stable@vger.kernel.org> # 3.14+ Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Since Roland stepped down, the community asked me to take his place, and
the nomination was followed by sufficient votes and no dissensions that
we can move forward with the change.
Pull crypto fixes from Herbert Xu:
"This fixes a build problem with bcm63xx and yet another fix to the
memzero_explicit function to ensure that the memset is not elided"
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
hwrng: bcm63xx - Fix driver compilation
lib: make memzero_explicit more robust against dead store elimination
Linus Torvalds [Tue, 5 May 2015 15:42:06 +0000 (08:42 -0700)]
Merge tag 'media/v4.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
"Three driver fixes:
- fix for omap4, fixing a regression due to a subsystem API that got
removed for 4.1 (commit efde234674d9);
- fix for one of the formats supported by Marvel ccic driver;
- fix rcar_vin driver that, when stopping abnormally, the driver
can't return from wait_for_completion"
* tag 'media/v4.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] v4l: omap4iss: Replace outdated OMAP4 control pad API with syscon
[media] media: soc_camera: rcar_vin: Fix wait_for_completion
[media] marvell-ccic: fix Y'CbCr ordering
Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/r/1430210769-94177-1-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Dan Carpenter [Tue, 21 Apr 2015 13:46:28 +0000 (16:46 +0300)]
efi: Fix error handling in add_sysfs_runtime_map_entry()
I spotted two (difficult to hit) bugs while reviewing this.
1) There is a double free bug because we unregister "map_kset" in
add_sysfs_runtime_map_entry() and also efi_runtime_map_init().
2) If we fail to allocate "entry" then we should return
ERR_PTR(-ENOMEM) instead of NULL.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Dave Young <dyoung@redhat.com> Cc: Guangyu Sun <guangyu.sun@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
This driver already makes use of ioremap_wc() on PIO buffers,
so convert it to use arch_phys_wc_add().
The qib driver uses a mmap() special case for when PAT is
not used, this behaviour used to be determined with a
module parameter but since we have been asked to just
remove that module parameter this checks for the WC cookie,
if not set we can assume PAT was used. If its set we do
what we used to do for the mmap for when MTRR was enabled.
The removal of the module parameter is OK given that Andy
notes that even if users of module parameter are still around
it will not prevent loading of the module on recent kernels.
Guy Shapiro [Wed, 15 Apr 2015 15:17:57 +0000 (18:17 +0300)]
IB/core: dma unmap optimizations
While unmapping an ODP writable page, the dirty bit of the page is set. In
order to do so, the head of the compound page is found.
Currently, the compound head is found even on non-writable pages, where it is
never used, leading to unnecessary cpu barrier that impacts performance.
This patch moves the search for the compound head to be done only when needed.
Guy Shapiro [Wed, 15 Apr 2015 15:17:56 +0000 (18:17 +0300)]
IB/core: dma map/unmap locking optimizations
Currently, while mapping or unmapping pages for ODP, the umem mutex is locked
and unlocked once for each page. Such lock/unlock operation take few tens to
hundreds of nsecs. This makes a significant impact when mapping or unmapping few
MBs of memory.
To avoid this, the mutex should be locked only once per operation, and not per
page.
RDMA/nes: Report the actual address of the remote connecting peer
Get the actual (non-mapped) ip/tcp address of the connecting peer from
the port mapper and report the address info to the user space application
at the time of connection establishment
RDMA/core: Enable the iWarp Port Mapper to provide the actual address of the connecting peer to its clients
Add functionality to enable the port mapper on the passive side to provide to its
clients the actual (non-mapped) ip/tcp address information of the connecting peer
1) Adding remote_info_cb() to process the address info of the connecting peer
The address info is provided by the user space port mapper service when
the connection is initiated by the peer
2) Adding a hash list to store the remote address info
3) Adding functionality to add/remove the remote address info
After the info has been provided to the port mapper client,
it is removed from the hash list
Hariprasad S [Tue, 21 Apr 2015 20:15:01 +0000 (01:45 +0530)]
iw_cxgb4: enforce qp/cq id requirements
Currently the iw_cxgb4 implementation requires the qp and cq qid densities
to match as well as the qp and cq id ranges. So fail a device open if
the device configuration doesn't meet the requirements.
The reason for these restictions has to do with the fact that IQ qid X
has a UGTS register in the same bar2 page as EQ qid X. Thus both qids
need to be allocated to the same user process for security reasons.
The logic that does this (the qpid allocator in iw_cxgb4/resource.c)
handles this but requires the above restrictions.
Jason Gunthorpe [Mon, 20 Apr 2015 20:01:11 +0000 (14:01 -0600)]
RDMA/CMA: Canonize IPv4 on IPV6 sockets properly
When accepting a new IPv4 connect to an IPv6 socket, the CMA tries to
canonize the address family to IPv4, but does not properly process
the listening sockaddr to get the listening port, and does not properly
set the address family of the canonized sockaddr.
Fixes: e51060f08a61 ("IB: IP address based RDMA connection manager") Cc: <stable@vger.kernel.org> Reported-By: Yotam Kenneth <yotamke@mellanox.com> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>