lib/scatterlist: do not re-write gfp_flags in __sg_alloc_table()
We are seeing a lot of sg_alloc_table allocation failures using the new
drm prime infrastructure. We isolated the cause to code in
__sg_alloc_table that was re-writing the gfp_flags.
There is a comment in the code that suggest that there is an assumption
about the allocation coming from a memory pool. This was likely true
when sg lists were primarily used for disk I/O.
Signed-off-by: Mandeep Singh Baines <msb@chromium.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Cong Wang <amwang@redhat.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Rob Clark <rob.clark@linaro.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Inki Dae <inki.dae@samsung.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Sonny Rao <sonnyrao@chromium.org> Cc: Olof Johansson <olofj@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Mon, 30 Jul 2012 21:43:17 +0000 (14:43 -0700)]
fault-injection: add selftests for cpu and memory hotplug
This adds two selftests
* tools/testing/selftests/cpu-hotplug/on-off-test.sh is testing script
for CPU hotplug
1. Online all hot-pluggable CPUs
2. Offline all hot-pluggable CPUs
3. Online all hot-pluggable CPUs again
4. Exit if cpu-notifier-error-inject.ko is not available
5. Offline all hot-pluggable CPUs in preparation for testing
6. Test CPU hot-add error handling by injecting notifier errors
7. Online all hot-pluggable CPUs in preparation for testing
8. Test CPU hot-remove error handling by injecting notifier errors
* tools/testing/selftests/memory-hotplug/on-off-test.sh is doing the
similar thing for memory hotplug.
1. Online all hot-pluggable memory
2. Offline 10% of hot-pluggable memory
3. Online all hot-pluggable memory again
4. Exit if memory-notifier-error-inject.ko is not available
5. Offline 10% of hot-pluggable memory in preparation for testing
6. Test memory hot-add error handling by injecting notifier errors
7. Online all hot-pluggable memory in preparation for testing
8. Test memory hot-remove error handling by injecting notifier errors
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Greg KH <greg@kroah.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Dave Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This provides the ability to inject artifical errors to pSeries reconfig
notifier chain callbacks. It is controlled through debugfs interface
under /sys/kernel/debug/notifier-error-inject/pSeries-reconfig
If the notifier call chain should be failed with some events
notified, write the error code to "actions/<notifier event>/error".
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Greg KH <greg@kroah.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Dave Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Mon, 30 Jul 2012 21:43:10 +0000 (14:43 -0700)]
memory: memory notifier error injection module
This provides the ability to inject artifical errors to memory hotplug
notifier chain callbacks. It is controlled through debugfs interface
under /sys/kernel/debug/notifier-error-inject/memory
If the notifier call chain should be failed with some events notified,
write the error code to "actions/<notifier event>/error".
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Greg KH <greg@kroah.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Dave Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Mon, 30 Jul 2012 21:43:07 +0000 (14:43 -0700)]
PM: PM notifier error injection module
This provides the ability to inject artifical errors to PM notifier chain
callbacks. It is controlled through debugfs interface under
/sys/kernel/debug/notifier-error-inject/pm
Each of the files in "error" directory represents an event which can be
failed and contains the error code. If the notifier call chain should be
failed with some events notified, write the error code to the files.
If the notifier call chain should be failed with some events notified,
write the error code to "actions/<notifier event>/error".
Example: Inject PM suspend error (-12 = -ENOMEM)
# cd /sys/kernel/debug/notifier-error-inject/pm
# echo -12 > actions/PM_SUSPEND_PREPARE/error
# echo mem > /sys/power/state
bash: echo: write error: Cannot allocate memory
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Pavel Machek <pavel@ucw.cz> Cc: Greg KH <greg@kroah.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Dave Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Mon, 30 Jul 2012 21:43:03 +0000 (14:43 -0700)]
cpu: rewrite cpu-notifier-error-inject module
Rewrite existing cpu-notifier-error-inject module to use debugfs based new
framework.
This change removes cpu_up_prepare_error and cpu_down_prepare_error module
parameters which were used to specify error code to be injected. We could
keep these module parameters for backward compatibility by module_param_cb
but it seems overkill for this module.
This provides the ability to inject artifical errors to CPU notifier chain
callbacks. It is controlled through debugfs interface under
/sys/kernel/debug/notifier-error-inject/cpu
If the notifier call chain should be failed with some events notified,
write the error code to "actions/<notifier event>/error".
Example1: inject CPU offline error (-1 == -EPERM)
# cd /sys/kernel/debug/notifier-error-inject/cpu
# echo -1 > actions/CPU_DOWN_PREPARE/error
# echo 0 > /sys/devices/system/cpu/cpu1/online
bash: echo: write error: Operation not permitted
Example2: inject CPU online error (-2 == -ENOENT)
# cd /sys/kernel/debug/notifier-error-inject/cpu
# echo -2 > actions/CPU_UP_PREPARE/error
# echo 1 > /sys/devices/system/cpu/cpu1/online
bash: echo: write error: No such file or directory
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Greg KH <greg@kroah.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Dave Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Mon, 30 Jul 2012 21:43:02 +0000 (14:43 -0700)]
fault-injection: notifier error injection
This patchset provides kernel modules that can be used to test the error
handling of notifier call chain failures by injecting artifical errors to
the following notifier chain callbacks.
# cd /sys/kernel/debug/notifier-error-inject/cpu
# echo -1 > actions/CPU_DOWN_PREPARE/error
# echo 0 > /sys/devices/system/cpu/cpu1/online
bash: echo: write error: Operation not permitted
The patchset also adds cpu and memory hotplug tests to
tools/testing/selftests These tests first do simple online and offline
test and then do fault injection tests if notifier error injection
module is available.
This patch:
The notifier error injection provides the ability to inject artifical
errors to specified notifier chain callbacks. It is useful to test the
error handling of notifier call chain failures.
This adds common basic functions to define which type of events can be
fail and to initialize the debugfs interface to control what error code
should be returned and which event should be failed.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Greg KH <greg@kroah.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Dave Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
resource: make sure requested range is included in the root range
When the requested range is outside of the root range the logic in
__reserve_region_with_split will cause an infinite recursion which will
overflow the stack as seen in the warning bellow.
This particular stack overflow was caused by requesting the
(100000000-107ffffff) range while the root range was (0-ffffffff). In
this case __request_resource would return the whole root range as
conflict range (i.e. 0-ffffffff). Then, the logic in
__reserve_region_with_split would continue the recursion requesting the
new range as (conflict->end+1, end) which incidentally in this case
equals the originally requested range.
This patch aborts looking for an usable range when the request does not
intersect with the root range. When the request partially overlaps with
the root range, it ajust the request to fall in the root range and then
continues with the new request.
When the request is modified or aborted errors and a stack trace are
logged to allow catching the errors in the upper layers.
Andrew Morton [Mon, 30 Jul 2012 21:42:56 +0000 (14:42 -0700)]
include/linux/aio.h: cpp->C conversions
Convert init_sync_kiocb() from a nasty macro into a nice C function. The
struct assignment trick takes care of zeroing all unmentioned fields.
Shrinks fs/read_write.o's .text from 9857 bytes to 9714.
Also demacroize is_sync_kiocb() and aio_ring_avail(). The latter fixes an
arg-referenced-multiple-times hand grenade.
Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Acked-by: Jeff Moyer <jmoyer@redhat.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Emil Goode [Mon, 30 Jul 2012 21:42:51 +0000 (14:42 -0700)]
pps: return PTR_ERR on error in device_create
We should return PTR_ERR if the call to the device_create function fails.
Without this patch we instead return the value from a successful call to
cdev_add if the call to device_create fails.
Signed-off-by: Emil Goode <emilgoode@gmail.com> Acked-by: Devendra Naga <devendra.aaru@gmail.com> Cc: Alexander Gordeev <lasaine@lvk.cs.msu.su> Cc: Rodolfo Giometti <giometti@enneenne.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Steven Rostedt [Mon, 30 Jul 2012 21:42:48 +0000 (14:42 -0700)]
sysctl: suppress kmemleak messages
register_sysctl_table() is a strange function, as it makes internal
allocations (a header) to register a sysctl_table. This header is a
handle to the table that is created, and can be used to unregister the
table. But if the table is permanent and never unregistered, the header
acts the same as a static variable.
Unfortunately, this allocation of memory that is never expected to be
freed fools kmemleak in thinking that we have leaked memory. For those
sysctl tables that are never unregistered, and have no pointer referencing
them, kmemleak will think that these are memory leaks:
Will Deacon [Mon, 30 Jul 2012 21:42:46 +0000 (14:42 -0700)]
ipc: use Kconfig options for __ARCH_WANT_[COMPAT_]IPC_PARSE_VERSION
Rather than #define the options manually in the architecture code, add
Kconfig options for them and select them there instead. This also allows
us to select the compat IPC version parsing automatically for platforms
using the old compat IPC interface.
Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Will Deacon [Mon, 30 Jul 2012 21:42:43 +0000 (14:42 -0700)]
ipc: compat: use signed size_t types for msgsnd and msgrcv
The msgsnd and msgrcv system calls use size_t to represent the size of the
message being transferred. POSIX states that values of msgsz greater than
SSIZE_MAX cause the result to be implementation-defined. On Linux, this
equates to returning -EINVAL if (long) msgsz < 0.
For compat tasks where !CONFIG_ARCH_WANT_OLD_COMPAT_IPC and compat_size_t
is smaller than size_t, negative size values passed from userspace will be
interpreted as positive values by do_msg{rcv,snd} and will fail to exit
early with -EINVAL.
This patch changes the compat prototypes for msg{rcv,snd} so that the
message size is represented as a compat_ssize_t, which we cast to the
native ssize_t type for the core IPC code.
Cc: Arnd Bergmann <arnd@arndb.de> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Will Deacon [Mon, 30 Jul 2012 21:42:40 +0000 (14:42 -0700)]
ipc: allow compat IPC version field parsing if !ARCH_WANT_OLD_COMPAT_IPC
Commit 48b25c43e6ee ("ipc: provide generic compat versions of IPC
syscalls") added a new ARCH_WANT_OLD_COMPAT_IPC config option for
architectures to select if their compat target requires the old IPC
syscall interface.
For architectures (such as AArch64) that do not require the internal
calling conventions provided by this option, but have a compat target
where the C library passes the IPC_64 flag explicitly,
compat_ipc_parse_version no longer strips out the flag before calling
the native system call implementation, resulting in unknown SHM/IPC
commands and -EINVAL being returned to userspace.
This patch separates the selection of the internal calling conventions
for the IPC syscalls from the version parsing, allowing architectures to
select __ARCH_WANT_COMPAT_IPC_PARSE_VERSION if they want to use version
parsing whilst retaining the newer syscall calling conventions.
Acked-by: Chris Metcalf <cmetcalf@tilera.com> Cc: Arnd Bergmann <arnd@arndb.de> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Will Deacon [Mon, 30 Jul 2012 21:42:38 +0000 (14:42 -0700)]
ipc: add COMPAT_SHMLBA support
If the SHMLBA definition for a native task differs from the definition for
a compat task, the do_shmat() function would need to handle both.
This patch introduces COMPAT_SHMLBA, which is used by the compat shmat
syscall when calling the ipc code and allows architectures such as AArch64
(where the native SHMLBA is 64k but the compat (AArch32) definition is
16k) to provide the correct semantics for compat IPC system calls.
Cc: David S. Miller <davem@davemloft.net> Cc: Chris Zankel <chris@zankel.net> Cc: Arnd Bergmann <arnd@arndb.de> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kdump: append newline to the last lien of vmcoreinfo note
The last line of vmcoreinfo note does not end with \n. Parsing all the
lines in note becomes easier if all lines end with \n instead of trying to
special case the last line.
I know at least one tool, vmcore-dmesg in kexec-tools tree which made the
assumption that all lines end with \n. I think it is a good idea to fix
it.
Error by 0) is not a matter, it can just return. But error by 1) requires
releasing task_struct allocated by 0) before it returns. Likewise, error
by 2) requires releasing task_struct and thread_info allocated by 0) and
1).
The existing error handling calls free_task_struct() and
free_thread_info() which do not only release task_struct and thread_info,
but also call architecture specific arch_release_task_struct() and
arch_release_thread_info().
The problem is that task_struct and thread_info are not fully initialized
yet at this point, but arch_release_task_struct() and
arch_release_thread_info() are called with them.
For example, x86 defines its own arch_release_task_struct() that releases
a task_xstate. If alloc_thread_info_node() fails in dup_task(),
arch_release_task_struct() is called with task_struct which is just
allocated and filled with garbage in this error handling.
This actually happened with tools/testing/fault-injection/failcmd.sh
In order to fix this issue, make free_{task_struct,thread_info}() not to
call arch_release_{task_struct,thread_info}() and call
arch_release_{task_struct,thread_info}() implicitly where needed.
Default arch_release_task_struct() and arch_release_thread_info() are
defined as empty by default. So this change only affects the
architectures which implement their own arch_release_task_struct() or
arch_release_thread_info() as listed below.
arch_release_task_struct(): x86, sh
arch_release_thread_info(): mn10300, tile
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: David Howells <dhowells@redhat.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Salman Qazi <sqazi@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
proc: do not allow negative offsets on /proc/<pid>/environ
__mem_open() which is called by both /proc/<pid>/environ and
/proc/<pid>/mem ->open() handlers will allow the use of negative offsets.
/proc/<pid>/mem has negative offsets but not /proc/<pid>/environ.
Clean this by moving the 'force FMODE_UNSIGNED_OFFSET flag' to mem_open()
to allow negative offsets only on /proc/<pid>/mem.
proc: environ_read() make sure offset points to environment address range
Currently the following offset and environment address range check in
environ_read() of /proc/<pid>/environ is buggy:
int this_len = mm->env_end - (mm->env_start + src);
if (this_len <= 0)
break;
Large or negative offsets on /proc/<pid>/environ converted to 'unsigned
long' may pass this check since '(mm->env_start + src)' can overflow and
'this_len' will be positive.
This can turn /proc/<pid>/environ to act like /proc/<pid>/mem since
(mm->env_start + src) will point and read from another VMA.
There are two fixes here plus some code cleaning:
1) Fix the overflow by checking if the offset that was converted to
unsigned long will always point to the [mm->env_start, mm->env_end]
address range.
2) Remove the truncation that was made to the result of the check,
storing the result in 'int this_len' will alter its value and we can
not depend on it.
For kernels that have commit b409e578d ("proc: clean up
/proc/<pid>/environ handling") which adds the appropriate ptrace check and
saves the 'mm' at ->open() time, this is not a security issue.
This patch is taken from the grsecurity patch since it was just made
available.
coredump: fix wrong comments on core limits of pipe coredump case
In commit 898b374af6f7 ("exec: replace call_usermodehelper_pipe with use
of umh init function and resolve limit"), the core limits recursive
check value was changed from 0 to 1, but the corresponding comments were
not updated.
Signed-off-by: Jovi Zhang <bookjovi@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The system deadlocks (at least since 2.6.10) when
call_usermodehelper(UMH_WAIT_EXEC) request triggers
call_usermodehelper(UMH_WAIT_PROC) request.
This is because "khelper thread is waiting for the worker thread at
wait_for_completion() in do_fork() since the worker thread was created
with CLONE_VFORK flag" and "the worker thread cannot call complete()
because do_execve() is blocked at UMH_WAIT_PROC request" and "the khelper
thread cannot start processing UMH_WAIT_PROC request because the khelper
thread is waiting for the worker thread at wait_for_completion() in
do_fork()".
The easiest example to observe this deadlock is to use a corrupted
/sbin/hotplug binary (like shown below).
call_usermodehelper("/tmp/dummy", UMH_WAIT_EXEC) is called from
kobject_uevent_env() in lib/kobject_uevent.c upon loading/unloading a
module. do_execve("/tmp/dummy") triggers a call to
request_module("binfmt-0000") from search_binary_handler() which in turn
calls call_usermodehelper(UMH_WAIT_PROC).
In order to avoid deadlock, as a for-now and easy-to-backport solution, do
not try to call wait_for_completion() in call_usermodehelper_exec() if the
worker thread was created by khelper thread with CLONE_VFORK flag. Future
and fundamental solution might be replacing singleton khelper thread with
some workqueue so that recursive calls up to max_active dependency loop
can be handled without deadlock.
[akpm@linux-foundation.org: add comment to kmod_thread_locker] Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nearly identical shortname parsing is performed in fat_search_long() and
__fat_readdir(). Extract this code into a function that may be called by
both.
Signed-off-by: Steven J. Magnani <steve@digidescorp.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
where the thaw ioctl deadlocked at thaw_super() when called while chcp was
waiting at nilfs_transaction_begin() called from
nilfs_ioctl_change_cpmode(). This deadlock is 100% reproducible.
This is because nilfs_ioctl_change_cpmode() first locks sb->s_umount in
read mode and then waits for unfreezing in nilfs_transaction_begin(),
whereas thaw_super() locks sb->s_umount in write mode. The locking of
sb->s_umount here was intended to make snapshot mounts and the downgrade
of snapshots to checkpoints exclusive.
This fixes the deadlock issue by replacing the sb->s_umount usage in
nilfs_ioctl_change_cpmode() with a dedicated mutex which protects snapshot
mounts.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
nilfs2: fix timing issue between rmcp and chcp ioctls
The checkpoint deletion ioctl (rmcp ioctl) has potential for breaking
snapshot because it is not fully exclusive with checkpoint mode change
ioctl (chcp ioctl).
The rmcp ioctl first tests if the specified checkpoint is a snapshot or
not within nilfs_cpfile_delete_checkpoint function, and then calls
nilfs_cpfile_delete_checkpoints function to actually invalidate the
checkpoint only if it's not a snapshot. However, the checkpoint can be
changed into a snapshot by the chcp ioctl between these two operations.
In that case, calling nilfs_cpfile_delete_checkpoints() wrongly
invalidates the snapshot, which leads to snapshot list corruption and
snapshot count mismatch.
This fixes the issue by changing nilfs_cpfile_delete_checkpoints() so
that it reconfirms the target checkpoints are snapshot or not.
This second check is exclusive with the chcp operation since it is
protected by an existing semaphore.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
nilfs2: remove references to long gone super operations
->delete_inode(), ->write_super_lockfs(), ->unlockfs() are gone so remove
references to them in the NTFS code. Noticed while cleaning up the
fsfreeze mess.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On minix2 and minix3 usually max_size is 7fffffff and the check in
question prohibits creation of last block spanning right before 7fffffff,
due to downward rounding during the division. Fix it by using
multiplication instead.
[akpm@linux-foundation.org: fix up code layout, use local `sb'] Signed-off-by: Vladimir Serbinenko <phcoder@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nick Bowler [Mon, 30 Jul 2012 21:41:57 +0000 (14:41 -0700)]
drivers/rtc/rtc-pcf8563.c: set owner field in driver struct
The owner member is supposed to be set to the module implementing the
device driver, i.e., THIS_MODULE. This enables the appropriate module
link in sysfs.
Signed-off-by: Nick Bowler <nbowler@elliptictech.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Mon, 30 Jul 2012 21:41:47 +0000 (14:41 -0700)]
drivers/rtc/rtc-r9701.c: check that r9701_set_datetime() succeeded
When the driver detects that the clock time is invalid, it attempts to
write a sane time into the hardware. We curently assume that everything
is OK if those writes succeeded. But it is better to re-read the time
from the hardware to ensure that the new settings got there OK.
drivers/rtc/rtc-r9701.c: avoid second call to rtc_valid_tm()
r9701_get_datetime() calls rtc_valid_tm() and returns the value returned
by rtc_valid_tm(), which can be used in the `if', so calling
rtc_valid_tm() a second time is not required.
The pl031 interrupt is shared between the timer part and the clockwatch
part of the same HW block on the ux500, so mark it IRQF_SHARED on this
variant.
This patch also adds the IRQF_NO_SUSPEND flag to the rtc irq on all
variants as we don't want this pretty important IRQ to be disabled in
suspend.
rtc: pl031: use per-vendor variables for special init
Instead of hard-checking for certain vendor codes, follow the pattern of
other AMBA (PrimeCell) drivers and use variables in the vendor data.
Get rid of the locally cached vendor and hardware revision since we
already have the nice vendor data variable in the state.
firmware_map: make firmware_map_add_early() argument consistent with firmware_map_add_hotplug()
There are two ways to create /sys/firmware/memmap/X sysfs:
- firmware_map_add_early
When the system starts, it is calledd from e820_reserve_resources()
- firmware_map_add_hotplug
When the memory is hot plugged, it is called from add_memory()
But these functions are called without unifying value of end argument as
below:
- end argument of firmware_map_add_early() : start + size - 1
- end argument of firmware_map_add_hogplug() : start + size
The patch unifies them to "start + size". Even if applying the patch,
/sys/firmware/memmap/X/end file content does not change.
[akpm@linux-foundation.org: clarify comments] Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Reviewed-by: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stephen Boyd [Mon, 30 Jul 2012 21:41:11 +0000 (14:41 -0700)]
spinlock_debug: print offset in addition to symbol name
If there are two spinlocks embedded in a structure that kallsyms knows
about and one of the spinlocks locks up we will print the name of the
containing structure instead of the address of the lock. This is quite
bad, so let's use %pS instead of %ps so we get an offset in addition to
the symbol so we can determine which particular lock is having problems.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Mon, 30 Jul 2012 21:41:08 +0000 (14:41 -0700)]
ext4: use memweight()
Convert ext4_count_free() to use memweight() instead of table lookup
based counting clear bits implementation. This change only affects the
code segments enabled by EXT4FS_DEBUG.
Note that this memweight() call can't be replaced with a single
bitmap_weight() call, although the pointer to the memory area is aligned
to long-word boundary. Because the size of the memory area may not be a
multiple of BITS_PER_LONG, then it returns wrong value on big-endian
architecture.
This also includes the following change.
- Remove unnecessary map == NULL check in ext4_count_free() which
always takes non-null pointer as the memory area.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Mon, 30 Jul 2012 21:41:06 +0000 (14:41 -0700)]
ext3: use memweight()
Convert ext3_count_free() to use memweight() instead of table lookup
based counting clear bits implementation. This change only affects the
code segments enabled by EXT3FS_DEBUG.
Note that this memweight() call can't be replaced with a single
bitmap_weight() call, although the pointer to the memory area is aligned
to long-word boundary. Because the size of the memory area may not be a
multiple of BITS_PER_LONG, then it returns wrong value on big-endian
architecture.
This also includes the following changes.
- Remove unnecessary map == NULL check in ext3_count_free() which
always takes non-null pointer as the memory area.
- Fix printk format warning that only reveals with EXT3FS_DEBUG.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Jan Kara <jack@suse.cz> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Mon, 30 Jul 2012 21:41:05 +0000 (14:41 -0700)]
ext2: use memweight()
Convert ext2_count_free() to use memweight() instead of table lookup
based counting clear bits implementation. This change only affects the
code segments enabled by EXT2FS_DEBUG.
Note that this memweight() call can't be replaced with a single
bitmap_weight() call, although the pointer to the memory area is aligned
to long-word boundary. Because the size of the memory area may not be a
multiple of BITS_PER_LONG, then it returns wrong value on big-endian
architecture.
This also includes the following changes.
- Remove unnecessary map == NULL check in ext2_count_free() which
always takes non-null pointer as the memory area.
- Fix printk format warning that only reveals with EXT2FS_DEBUG.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Mon, 30 Jul 2012 21:41:03 +0000 (14:41 -0700)]
ocfs2: use memweight()
Use memweight to count the total number of bits set in memory area.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Mon, 30 Jul 2012 21:40:57 +0000 (14:40 -0700)]
qnx4fs: use memweight()
Use memweight() to count the total number of bits clear in memory area.
Note that this memweight() call can't be replaced with a single
bitmap_weight() call, although the pointer to the memory area is aligned
to long-word boundary. Because the size of the memory area may not be a
multiple of BITS_PER_LONG, then it returns wrong value on big-endian
architecture.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Anders Larsen <al@alarsen.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Mon, 30 Jul 2012 21:40:55 +0000 (14:40 -0700)]
string: introduce memweight()
memweight() is the function that counts the total number of bits set in
memory area. Unlike bitmap_weight(), memweight() takes pointer and size
in bytes to specify a memory area which does not need to be aligned to
long-word boundary.
[akpm@linux-foundation.org: rename `w' to `ret'] Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Anders Larsen <al@alarsen.net> Cc: Alasdair Kergon <agk@redhat.com> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jan Kara <jack@suse.cz> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: Tony Luck <tony.luck@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jingoo Han [Mon, 30 Jul 2012 21:40:47 +0000 (14:40 -0700)]
backlight: l4f00242t03: export and use devm_gpio_request_one()
The devm_ functions allocate memory that is released when a driver
detaches. This patch uses devm_gpio_request_one() for these functions.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Alberto Panizzo <alberto@amarulasolutions.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jingoo Han [Mon, 30 Jul 2012 21:40:45 +0000 (14:40 -0700)]
backlight: corgi_lcd: use devm_gpio_request()
The devm_ functions allocate memory that is released when a driver
detaches. This patch uses devm_gpio_request() for these functions.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Eric Miao <eric.y.miao@gmail.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jingoo Han [Mon, 30 Jul 2012 21:40:43 +0000 (14:40 -0700)]
backlight: lms283gf05: use devm_gpio_request()
The devm_ functions allocate memory that is released when a driver
detaches. This patch uses devm_gpio_request() for these functions.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Acked-by: Marek Vasut <marek.vasut@gmail.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jingoo Han [Mon, 30 Jul 2012 21:40:37 +0000 (14:40 -0700)]
backlight: ot200_bl: use devm_gpio_request()
The devm_ functions allocate memory that is released when a driver
detaches. This patch uses devm_gpio_request() for these functions.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jingoo Han [Mon, 30 Jul 2012 21:40:34 +0000 (14:40 -0700)]
drivers/video/backlight/lm3533_bl.c: use devm_ functions
The devm_ functions allocate memory that is released when a driver
detaches. This patch uses devm_kzalloc of these functions.
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Acked-by: Johan Hovold <jhovold@gmail.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jingoo Han [Mon, 30 Jul 2012 21:40:33 +0000 (14:40 -0700)]
drivers/video/backlight/ot200_bl.c: use devm_ functions
The devm_ functions allocate memory that is released when a driver
detaches. This patch uses devm_kzalloc of these functions
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andy Shevchenko [Mon, 30 Jul 2012 21:40:27 +0000 (14:40 -0700)]
vsprintf: add support of '%*ph[CDN]'
There are many places in the kernel where the drivers print small buffers
as a hex string. This patch adds a support of the variable width buffer
to print it as a hex string with a delimiter. The idea came from Pavel
Roskin here: http://www.digipedia.pl/usenet/thread/18835/17449/
Sample output of
pr_info("buf[%d:%d] %*phC\n", from, len, len, &buf[from]);
could be look like this:
[ 0.726130] buf[51:8] e8:16:b6:ef:e3:74:45:6e
[ 0.750736] buf[59:15] 31:81:b8:3f:35:49:06:ae:df:32:06:05:4a:af:55
[ 0.757602] buf[17:5] ac:16:d5:2c:ef
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The intent behind this behavior was to return "pK-error" in cases where
the %pK format specifier was used in interrupt context, because the
CAP_SYSLOG check wouldn't be meaningful. Clearly this should only apply
when kptr_restrict is actually enabled though.
Reported-by: Stevie Trujillo <stevie.trujillo@gmail.com> Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Mon, 30 Jul 2012 21:40:21 +0000 (14:40 -0700)]
printk: remove the now unnecessary "C" annotation for KERN_CONT
Now that all KERN_<LEVEL> uses are prefixed with ASCII SOH, there is no
need for a KERN_CONT. Keep it backward compatible by adding #define
KERN_CONT ""
Reduces kernel image size a thousand bytes.
Signed-off-by: Joe Perches <joe@perches.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Mon, 30 Jul 2012 21:40:13 +0000 (14:40 -0700)]
btrfs: use printk_get_level and printk_skip_level, add __printf, fix fallout
Use the generic printk_get_level() to search a message for a kern_level.
Add __printf to verify format and arguments. Fix a few messages that
had mismatches in format and arguments. Add #ifdef CONFIG_PRINTK blocks
to shrink the object size a bit when not using printk.
[akpm@linux-foundation.org: whitespace tweak] Signed-off-by: Joe Perches <joe@perches.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Chris Mason <chris.mason@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Mon, 30 Jul 2012 21:40:12 +0000 (14:40 -0700)]
arch: remove direct definitions of KERN_<LEVEL> uses
Add #include <linux/kern_levels.h> so that the #define KERN_<LEVEL> macros
don't have to be duplicated.
Signed-off-by: Joe Perches <joe@perches.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Kay Sievers <kay@vrfy.org> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Mon, 30 Jul 2012 21:40:09 +0000 (14:40 -0700)]
printk: add generic functions to find KERN_<LEVEL> headers
The current form of a KERN_<LEVEL> is "<.>".
Add printk_get_level and printk_skip_level functions to handle these
formats.
These functions centralize tests of KERN_<LEVEL> so a future modification
can change the KERN_<LEVEL> style and shorten the number of bytes consumed
by these headers.
[akpm@linux-foundation.org: fix build error and warning] Signed-off-by: Joe Perches <joe@perches.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Wu Fengguang <wfg@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>