Just like files-layout, blocks & objects layouts are part of the
NFS 4.1 protocol and should be automatically selected if NFS_4_1
is selected. The small problem is that these depend on other
Kernel support being present, while files only depends on NFS
itself.
This patch removes from the user choice the presence of objects
and blocks layout. But makes sure these are selected only if
the depended subsystems are present in the Kernel.
Eric Sandeen [Thu, 11 Aug 2011 14:54:31 +0000 (09:54 -0500)]
ext4: Properly count journal credits for long symlinks
Commit df5e6223407e ("ext4: fix deadlock in ext4_symlink() in ENOSPC
conditions") recalculated the number of credits needed for a long
symlink, in the process of splitting it into two transactions. However,
the first credit calculation under-counted because if selinux is
enabled, credits are needed to create the selinux xattr as well.
Overrunning the reservation will result in an OOPS in
jbd2_journal_dirty_metadata() due to this assert:
J_ASSERT_JH(jh, handle->h_buffer_credits > 0);
Fix this by increasing the reservation size.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Sandeen [Thu, 11 Aug 2011 14:51:46 +0000 (09:51 -0500)]
ext3: Properly count journal credits for long symlinks
Commit ae54870a1dc9 ("ext3: Fix lock inversion in ext3_symlink()")
recalculated the number of credits needed for a long symlink, in the
process of splitting it into two transactions. However, the first
credit calculation under-counted because if selinux is enabled, credits
are needed to create the selinux xattr as well.
Overrunning the reservation will result in an OOPS in
journal_dirty_metadata() due to this assert:
J_ASSERT_JH(jh, handle->h_buffer_credits > 0);
Fix this by increasing the reservation size.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vasiliy Kulikov [Mon, 8 Aug 2011 15:02:04 +0000 (19:02 +0400)]
move RLIMIT_NPROC check from set_user() to do_execve_common()
The patch http://lkml.org/lkml/2003/7/13/226 introduced an RLIMIT_NPROC
check in set_user() to check for NPROC exceeding via setuid() and
similar functions.
Before the check there was a possibility to greatly exceed the allowed
number of processes by an unprivileged user if the program relied on
rlimit only. But the check created new security threat: many poorly
written programs simply don't check setuid() return code and believe it
cannot fail if executed with root privileges. So, the check is removed
in this patch because of too often privilege escalations related to
buggy programs.
The NPROC can still be enforced in the common code flow of daemons
spawning user processes. Most of daemons do fork()+setuid()+execve().
The check introduced in execve() (1) enforces the same limit as in
setuid() and (2) doesn't create similar security issues.
Neil Brown suggested to track what specific process has exceeded the
limit by setting PF_NPROC_EXCEEDED process flag. With the change only
this process would fail on execve(), and other processes' execve()
behaviour is not changed.
Solar Designer suggested to re-check whether NPROC limit is still
exceeded at the moment of execve(). If the process was sleeping for
days between set*uid() and execve(), and the NPROC counter step down
under the limit, the defered execve() failure because NPROC limit was
exceeded days ago would be unexpected. If the limit is not exceeded
anymore, we clear the flag on successful calls to execve() and fork().
The flag is also cleared on successful calls to set_user() as the limit
was exceeded for the previous user, not the current one.
Similar check was introduced in -ow patches (without the process flag).
v3 - clear PF_NPROC_EXCEEDED on successful calls to set_user().
Linus Torvalds [Thu, 11 Aug 2011 16:03:48 +0000 (09:03 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf symbols: Check '/tmp/perf-' symbol file ownership
perf sched: Usage leftover from trace -> script rename
perf sched: Do not delete session object prematurely
perf tools: Check $HOME/.perfconfig ownership
perf, x86: Add model 45 SandyBridge support
perf tools: Add support to install perf python extension
perf tools: do not look at ./config for configuration
perf tools: Make clean leaves some files
perf lock: Dropping unsupported ':r' modifier
perf probe: Fix coredump introduced by probe module option
jump label: Reduce the cycle count by changing the link order
perf report: Use ui__warning in some more places
perf python: Add PERF_RECORD_{LOST,READ,SAMPLE} routine tables
perf evlist: Introduce 'disable' method
trace events: Update version number reference to new 3.x scheme for EVENT_POWER_TRACING_DEPRECATED
perf buildid-cache: Zero out buffer of filenames when adding/removing buildid
It turns out that one was meant to be applied on top of the edac.git
tree in -next that has more i7core_edac changes, but that wasn't clear
in the original email.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peng Tao [Wed, 10 Aug 2011 22:29:21 +0000 (18:29 -0400)]
NFS41: make PNFS_BLOCK selectable
PNFS_BLOCK needs BLK_DEV_DM/MD, which is not a dependency for other
pnfs layout drivers. Seperate it out so others can still build when
BLK_DEV_DM/MD is not enabled.
Also change select to depends on to avoid build failures.
Reported-and-tested-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Peng Tao <peng_tao@emc.com> Acked-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 11 Aug 2011 00:37:17 +0000 (17:37 -0700)]
Merge branch 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm
* 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm:
ARM: drop experimental status for ARM_PATCH_PHYS_VIRT
ARM: 7008/1: alignment: Make SIGBUS sent to userspace POSIXly correct
ARM: 7007/1: alignment: Prevent ignoring of faults with ARMv6 unaligned access model
ARM: 7010/1: mm: fix invalid loop for poison_init_mem
ARM: 7005/1: freshen up mm/proc-arm946.S
dmaengine: PL08x: Fix trivial build error
ARM: Fix build error for SMP=n builds
Linus Torvalds [Wed, 10 Aug 2011 19:36:45 +0000 (12:36 -0700)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Really fix build without CONFIG_PCI
powerpc: Fix build without CONFIG_PCI
powerpc/4xx: Fix build of PCI code on 405
powerpc/pseries: Simplify vpa deregistration functions
powerpc/pseries: Cleanup VPA registration and deregistration errors
powerpc/pseries: Fix kexec on recent firmware versions
MAINTAINERS: change maintainership of mpc5xxx
powerpc: Make KVM_GUEST default to n
powerpc/kvm: Fix build errors with older toolchains
powerpc: Lack of ibm,io-events not that important!
powerpc: Move kdump default base address to half RMO size on 64bit
powerpc/perf: Disable pagefaults during callchain stack read
ppc: Remove duplicate definition of PV_POWER7
powerpc: pseries: Fix kexec on machines with more than 4TB of RAM
powerpc: Jump label misalignment causes oops at boot
powerpc: Clean up some panic messages in prom_init
powerpc: Fix device tree claim code
powerpc: Return the_cpu_ spec from identify_cpu
powerpc: mtspr/mtmsr should take an unsigned long
Borislav Petkov [Wed, 10 Aug 2011 12:43:30 +0000 (14:43 +0200)]
EDAC: Correct Kconfig dependencies
Both AMD and Intel i7 EDAC drivers use MCE features and are thus
dependent of this functionality present in the kernel. Express this in
Kconfig so that randconfig builds don't break.
Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
John Johansen [Fri, 22 Jul 2011 15:14:15 +0000 (08:14 -0700)]
Ecryptfs: Add mount option to check uid of device being mounted = expect uid
Close a TOCTOU race for mounts done via ecryptfs-mount-private. The mount
source (device) can be raced when the ownership test is done in userspace.
Provide Ecryptfs a means to force the uid check at mount time.
Signed-off-by: John Johansen <john.johansen@canonical.com> Cc: <stable@kernel.org> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Jonathan Nieder [Mon, 8 Aug 2011 04:22:43 +0000 (06:22 +0200)]
cap_syslog: don't use WARN_ONCE for CAP_SYS_ADMIN deprecation warning
syslog-ng versions before 3.3.0beta1 (2011-05-12) assume that
CAP_SYS_ADMIN is sufficient to access syslog, so ever since CAP_SYSLOG
was introduced (2010-11-25) they have triggered a warning.
Commit ee24aebffb75 ("cap_syslog: accept CAP_SYS_ADMIN for now")
improved matters a little by making syslog-ng work again, just keeping
the WARN_ONCE(). But still, this is a warning that writes a stack trace
we don't care about to syslog, sets a taint flag, and alarms sysadmins
when nothing worse has happened than use of an old userspace with a
recent kernel.
Convert the WARN_ONCE to a printk_once to avoid that while continuing to
give userspace developers a hint that this is an unwanted
backward-compatibility feature and won't be around forever.
Reported-by: Ralf Hildebrandt <ralf.hildebrandt@charite.de> Reported-by: Niels <zorglub_olsen@hotmail.com> Reported-by: Paweł Sikora <pluto@agmk.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Liked-by: Gergely Nagy <algernon@madhouse-project.org> Acked-by: Serge Hallyn <serge@hallyn.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The patch incorrectly assumes that using atomic FLUSHING_CACHED_CHARGE
bit operations is sufficient but that is not true. Johannes Weiner has
reported a crash during parallel memory cgroup removal:
We are crashing because we try to dereference cached memcg when we are
checking whether we should wait for draining on the cache. The cache is
already cleaned up, though.
There is also a theoretical chance that the cached memcg gets freed
between we test for the FLUSHING_CACHED_CHARGE and dereference it in
mem_cgroup_same_or_subtree:
CPU0 CPU1 CPU2
mem=stock->cached
stock->cached=NULL
clear_bit
test_and_set_bit
test_bit() ...
<preempted> mem_cgroup_destroy
use after free
The percpu_charge_mutex protected from this race because sync draining
is exclusive.
It is safer to revert now and come up with a more parallel
implementation later.
Signed-off-by: Michal Hocko <mhocko@suse.cz> Reported-by: Johannes Weiner <jweiner@redhat.com> Acked-by: Johannes Weiner <jweiner@redhat.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fs/ecryptfs/keystore.c: In function ‘ecryptfs_generate_key_packet_set’:
fs/ecryptfs/keystore.c:1991:28: warning: ‘payload_len’ may be used uninitialized in this function [-Wuninitialized]
fs/ecryptfs/keystore.c:1976:9: note: ‘payload_len’ was declared here
Roberto Sassu [Mon, 1 Aug 2011 11:33:38 +0000 (13:33 +0200)]
eCryptfs: fix compile error
This patch fixes the compile error reported at the address:
https://bugzilla.kernel.org/show_bug.cgi?id=40292
The problem arises when compiling eCryptfs as built-in and the 'encrypted'
key type as a module. The patch prevents this combination from being set in
the kernel configuration, by fixing the eCryptfs dependencies.
Signed-off-by: Roberto Sassu <roberto.sassu@polito.it> Reported-by: David Hill <hilld@binarystorm.net> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Tyler Hicks [Fri, 5 Aug 2011 03:58:51 +0000 (22:58 -0500)]
eCryptfs: Return error when lower file pointer is NULL
When an eCryptfs inode's lower file has been closed, and the pointer has
been set to NULL, return an error when trying to do a lower read or
write rather than calling BUG().
deactivate_slab() has the comparison if more than the minimum number of
partial pages are in the partial list wrong. An effect of this may be that
empty pages are not freed from deactivate_slab(). The result could be an
OOM due to growth of the partial slabs per node. Frees mostly occur from
__slab_free which is okay so this would only affect use cases where a lot
of switching around of per cpu slabs occur.
Switching per cpu slabs occurs with high frequency if debugging options are
enabled.
Reported-and-tested-by: Xiaotian Feng <xtfeng@gmail.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
Jiri Olsa [Mon, 8 Aug 2011 21:03:34 +0000 (23:03 +0200)]
perf sched: Do not delete session object prematurely
The session object is released prematurely when processing events for
latency command. The session's thread objects are used within the
output_lat_thread function.
Runnning following commands:
# perf sched record
# perf sched latency
the latter displays incorrect data and might cause access violation.
Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1312837414-3819-1-git-send-email-jolsa@redhat.com Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Linus Torvalds [Tue, 9 Aug 2011 15:42:16 +0000 (08:42 -0700)]
Merge branch 'slab/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6
* 'slab/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
slub: fix check_bytes() for slub debugging
slub: Fix full list corruption if debugging is on
Linus Torvalds [Tue, 9 Aug 2011 15:41:36 +0000 (08:41 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
sound: pss - don't use the deprecated function check_region
ALSA: timer - Add NULL-check for invalid slave timer
ALSA: timer - Fix Oops at closing slave timer
ASoC: Acknowledge WM8996 interrupts before acting on them
ASoC: Rename WM8915 to WM8996
ALSA: Fix dependency of CONFIG_SND_TEA575X
ALSA: asihpi - use kzalloc()
ALSA: snd-usb-caiaq: Fix keymap for RigKontrol3
ALSA: snd-usb: Fix uninitialized variable usage
ALSA: hda - Fix a complile warning in patch_via.c
ALSA: hdspm - Fix uninitialized compile warnings
ALSA: usb-audio - add quirk for Keith McMillen StringPort
ALSA: snd-usb: operate on given mixer interface only
ALSA: snd-usb: avoid dividing by zero on invalid input
ALSA: snd-usb: Accept UAC2 FORMAT_TYPE descriptors with bLength > 6
sound: oss/pas2: Remove CLOCK_TICK_RATE dependency from PAS16 driver
ALSA: hda - Use auto-parser for ASUS UX50, Eee PC P901, S101 and P1005
ALSA: hda - Fix digital-mic mono recording on ASUS Eee PC
ASoC: sgtl5000: fix cache handling
ASoC: Disable wm_hubs periodic DC servo update
Akinobu Mita [Sun, 7 Aug 2011 09:30:38 +0000 (18:30 +0900)]
slub: fix check_bytes() for slub debugging
The check_bytes() function is used by slub debugging. It returns a pointer
to the first unmatching byte for a character in the given memory area.
If the character for matching byte is greater than 0x80, check_bytes()
doesn't work. Becuase 64-bit pattern is generated as below.
value64 = value | value << 8 | value << 16 | value << 24;
value64 = value64 | value64 << 32;
The integer promotions are performed and sign-extended as the type of value
is u8. The upper 32 bits of value64 is 0xffffffff in the first line, and
the second line has no effect.
This fixes the 64-bit pattern generation.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Matt Mackall <mpm@selenic.com> Reviewed-by: Marcin Slusarz <marcin.slusarz@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
When a slab is freed by __slab_free() and the slab can only contain a
single object ever then it was full (and therefore not on the partial
lists but on the full list in the debug case) before we reached
slab_empty.
This caused the following full list corruption when SLUB debugging was enabled:
[ Full discussion here: https://lkml.org/lkml/2011/8/4/375 ]
Make sure that we remove such a slab also from the full lists.
Reported-and-tested-by: Dave Jones <davej@redhat.com> Reported-and-tested-by: Xiaotian Feng <xtfeng@gmail.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
Dave Martin [Thu, 28 Jul 2011 13:29:40 +0000 (14:29 +0100)]
ARM: 7008/1: alignment: Make SIGBUS sent to userspace POSIXly correct
With the UM_SIGNAL alignment fault mode, no siginfo structure is
passed to userspace.
POSIX specifies how siginfo_t should be populated for alignment
faults, so this patch does just that:
* si_signo = SIGBUS
* si_code = BUS_ADRALN
* si_addr = misaligned data address at which access was attempted
Signed-off-by: Dave Martin <dave.martin@linaro.org> Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org> Acked-by: Kirill A. Shutemov <kirill@shutemov.name> Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Dave Martin [Thu, 28 Jul 2011 13:28:52 +0000 (14:28 +0100)]
ARM: 7007/1: alignment: Prevent ignoring of faults with ARMv6 unaligned access model
Currently, it's possible to set the kernel to ignore alignment
faults when changing the alignment fault handling mode at runtime
via /proc/sys/alignment, even though this is undesirable on ARMv6
and above, where it can result in infinite spins where an un-fixed-
up instruction repeatedly faults.
In addition, the kernel clobbers any alignment mode specified on
the command-line if running on ARMv6 or above.
This patch factors out the necessary safety check into a couple of
new helper functions, and checks and modifies the fault handling
mode as appropriate on boot and on writes to /proc/cpu/alignment.
Prior to ARMv6, the behaviour is unchanged.
For ARMv6 and above, the behaviour changes as follows:
* Attempting to ignore faults on ARMv6 results in the mode being
forced to UM_FIXUP instead. A warning is printed if this
happened as a result of a write to /proc/cpu/alignment. The
user's UM_WARN bit (if present) is still honoured.
* An alignment= argument from the kernel command-line is now
honoured, except that the kernel will modify the specified mode
as described above. This is allows modes such as UM_SIGNAL and
UM_WARN to be active immediately from boot, which is useful for
debugging purposes.
Signed-off-by: Dave Martin <dave.martin@linaro.org> Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Jamie Iles [Thu, 4 Aug 2011 08:39:31 +0000 (09:39 +0100)]
ARM: 7010/1: mm: fix invalid loop for poison_init_mem
poison_init_mem() used a loop of:
while ((count = count - 4))
which has 2 problems - an off by one error so that we do one less word
than we should, and the other is that if count == 0 then we loop forever
and poison too much. On a platform with HAVE_TCM=y but nothing in the
TCM's, this caused corruption and the platform failed to boot.
Acked-by: Stephen Boyd <sboyd@codeaurora.org> Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org> Signed-off-by: Jamie Iles <jamie@jamieiles.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Brian S. Julin [Sun, 24 Jul 2011 15:53:50 +0000 (16:53 +0100)]
ARM: 7005/1: freshen up mm/proc-arm946.S
The file mm/proc-arm946.S contains a typo and is missing a structure
member in __arm946_proc_info. The former prevents compilation
and the latter causes problems during boot. It is likely this
file was manually copied from a similar file and not tested, then
later updates to the *_proc_info structures missed this file.
This patch will apply (with offset) with or without the
recent macro unification work that has been done in this directory.
This was verified against linux-next/stable last week.
See arm-linux-kernel thread:
http://lists.arm.linux.org.uk/lurker/message/20110718.103237.0106d468.en.html
Signed-off-by: Brian S. Julin <bri@abrij.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Sat, 6 Aug 2011 08:34:26 +0000 (09:34 +0100)]
dmaengine: PL08x: Fix trivial build error
Something changed during the 3.1 merge window in the include files
which now causes the pl08x DMA engine driver to fail to build. Fix
this by adding the now necessary dma-mapping.h include:
drivers/dma/amba-pl08x.c: In function ■pl08x_unmap_buffers■:
drivers/dma/amba-pl08x.c:1524: error: implicit declaration of function ■dma_unmap_single■
drivers/dma/amba-pl08x.c:1527: error: implicit declaration of function ■dma_unmap_page■
Acked-by: Vinod Koul <vinod.koul@intel.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Linus Torvalds [Mon, 8 Aug 2011 19:14:51 +0000 (12:14 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
TOMOYO: Fix incomplete read of /sys/kernel/security/tomoyo/profile
Linus Torvalds [Mon, 8 Aug 2011 18:55:20 +0000 (11:55 -0700)]
autofs4: fix debug printk warning uncovered by cleanup
The previous comit made the autofs4 debug printouts check types against
the printout format, and uncovered this bug:
fs/autofs4/waitq.c:106:2: warning: format ‘%08lx’ expects type ‘long unsigned int’, but argument 4 has type ‘autofs_wqt_t’
which is due to the insane type for wait_queue_token. That thing should
be some fixed well-defined size (preferably just 'unsigned int' or
'u32') but for unexplained reasons it is randomly either 'unsigned long'
or 'unsigned int' depending on the architecture.
For now, cast it to 'unsigned long' for printing, the way we do
elsewhere. Somebody else can try to explain the typedef mess.
(There's a reason we don't support excessive use of typedefs in the
kernel: it's usually just a good way of confusing yourself).
Linus Torvalds [Mon, 8 Aug 2011 18:35:17 +0000 (11:35 -0700)]
autofs4: clean up uaotfs use of debug/info/warning printouts
Use 'pr_debug()' for DPRINTK, which will do the proper type checking on
the arguments (without generating code) even when DEBUG isn't #defined.
Also, use the standard __VA_ARGS__ for the macros, and stop the
pointless abuse of 'do { xyz } while (0)' when the macro is already a
perfectly well-formed single statement.
Reported-by: David Howells <dhowells@redhat.com> Suggested-by: Joe Perches <joe@perches.com> Cc: Ian Kent <raven@themaw.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Howells [Mon, 8 Aug 2011 14:54:53 +0000 (15:54 +0100)]
CRED: Restore const to current_cred()
Commit 3295514841c2 ("fix rcu annotations noise in cred.h") accidentally
dropped the const of current->cred inside current_cred() by the
insertion of a cast to deal with an RCU annotation loss warning from
sparce.
Use an appropriate RCU wrapper instead so as not to lose the const.
Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiri Olsa [Fri, 22 Jul 2011 11:33:07 +0000 (13:33 +0200)]
perf tools: Add support to install perf python extension
Adding install-python_ext target to install python extension related
files. Installation directory is governed by python distutils package
and follows the DESTDIR variable settings.
Also moving python extension build output into '$(O)python_ext_build'
directory and making it configurable via PYTHON_EXTBUILD variable.
Keeping the '$(O)python/perf.so' file, so it could be used for testing
as of until now.
Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110722113307.GA1931@jolsa.brq.redhat.com Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jonathan Nieder [Fri, 5 Aug 2011 16:58:38 +0000 (18:58 +0200)]
perf tools: do not look at ./config for configuration
In addition to /etc/perfconfig and $HOME/.perfconfig, perf looks for
configuration in the file ./config, imitating git which looks at
$GIT_DIR/config. If ./config is not a perf configuration file, it
fails, or worse, treats it as a configuration file and changes behavior
in some unexpected way.
"config" is not an unusual name for a file to be lying around and perf
does not have a private directory dedicated for its own use, so let's
just stop looking for configuration in the cwd. Callers needing
context-sensitive configuration can use the PERF_CONFIG environment
variable.
Requested-by: Christian Ohm <chr.ohm@gmx.net> Cc: 632923@bugs.debian.org Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Christian Ohm <chr.ohm@gmx.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110805165838.GA7237@elie.gateway.2wire.net Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Wang Shaoyan [Mon, 8 Aug 2011 11:10:26 +0000 (19:10 +0800)]
sound: pss - don't use the deprecated function check_region
sound/oss/pss.c: In function 'configure_nonsound_components':
sound/oss/pss.c:676: warning: 'check_region' is deprecated (declared at include/linux/ioport.h:201)
Signed-off-by: Wang Shaoyan <wangshaoyan.pt@taobao.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
Mark Brown [Wed, 20 Jul 2011 12:49:58 +0000 (13:49 +0100)]
ASoC: Acknowledge WM8996 interrupts before acting on them
This closes the small race between a status being read in response to an
interrupt and clearing the interrupt, meaning that if the status changes
between those periods we might not get a reassertion of the interrupt.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Commit d006199e72a9 ("serial: sh-sci: Regtype probing doesn't need to be
fatal.") made sci_init_single() return when sci_probe_regmap() succeeds,
although it should return when sci_probe_regmap() fails. This causes
systems using the serial sh-sci driver to crash during boot.
Fix the problem by using the right return condition.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 7 Aug 2011 21:07:03 +0000 (14:07 -0700)]
arm: remove "optimized" SHA1 routines
Since commit 1eb19a12bd22 ("lib/sha1: use the git implementation of
SHA-1"), the ARM SHA1 routines no longer work. The reason? They
depended on the larger 320-byte workspace, and now the sha1 workspace is
just 16 words (64 bytes). So the assembly version would overwrite the
stack randomly.
The optimized asm version is also probably slower than the new improved
C version, so there's no reason to keep it around. At least that was
the case in git, where what appears to be the same assembly language
version was removed two years ago because the optimized C BLK_SHA1 code
was faster.
Reported-and-tested-by: Joachim Eastwood <manabian@gmail.com> Cc: Andreas Schwab <schwab@linux-m68k.org> Cc: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Sun, 7 Aug 2011 17:55:11 +0000 (18:55 +0100)]
fix rcu annotations noise in cred.h
task->cred is declared as __rcu, and access to other tasks' ->cred is,
indeed, protected. Access to current->cred does not need rcu_dereference()
at all, since only the task itself can change its ->cred. sparse, of
course, has no way of knowing that...
Add force-cast in current_cred(), make current_fsuid() et.al. use it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 7 Aug 2011 16:53:20 +0000 (09:53 -0700)]
vfs: rename 'do_follow_link' to 'should_follow_link'
Al points out that the do_follow_link() helper function really is
misnamed - it's about whether we should try to follow a symlink or not,
not about actually doing the following.
Takashi Iwai [Sun, 7 Aug 2011 15:34:07 +0000 (17:34 +0200)]
ALSA: Fix dependency of CONFIG_SND_TEA575X
CONFIG_SND_TEA575X is enabled by RADIO_SF16FMR2, but the latter one is
no PCI device. Since tea575x-tuner itself is independent from the board
bus type, the config should be moved out of SND_PCI dependency.
Reported-by: Randy Dunlap <rdunlap@xenotime.net> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Takashi Iwai <tiwai@suse.de>
Thomas Meyer [Sat, 6 Aug 2011 11:26:20 +0000 (13:26 +0200)]
ALSA: asihpi - use kzalloc()
Use kzalloc rather than kmalloc followed by memset with 0
This considers some simple cases that are common and easy to validate
Note in particular that there are no ...s in the rule, so all of the
matched code has to be contiguous
The semantic patch that makes this output is available
in scripts/coccinelle/api/alloc/kzalloc-simple.cocci.
More information about semantic patching is available at
http://coccinelle.lip6.fr/
Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Takashi Iwai <tiwai@suse.de>
Ari Savolainen [Sat, 6 Aug 2011 16:43:07 +0000 (19:43 +0300)]
Fix POSIX ACL permission check
After commit 3567866bf261: "RCUify freeing acls, let check_acl() go ahead in
RCU mode if acl is cached" posix_acl_permission is being called with an
unsupported flag and the permission check fails. This patch fixes the issue.
Signed-off-by: Ari Savolainen <ari.m.savolainen@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Linus Torvalds [Sun, 7 Aug 2011 05:56:03 +0000 (22:56 -0700)]
Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd
* 'for-linus' of git://git.open-osd.org/linux-open-osd:
ore: Make ore its own module
exofs: Rename raid engine from exofs/ios.c => ore
exofs: ios: Move to a per inode components & device-table
exofs: Move exofs specific osd operations out of ios.c
exofs: Add offset/length to exofs_get_io_state
exofs: Fix truncate for the raid-groups case
exofs: Small cleanup of exofs_fill_super
exofs: BUG: Avoid sbi realloc
exofs: Remove pnfs-osd private definitions
nfs_xdr: Move nfs4_string definition out of #ifdef CONFIG_NFS_V4
Linus Torvalds [Sun, 7 Aug 2011 05:45:50 +0000 (22:45 -0700)]
vfs: optimize inode cache access patterns
The inode structure layout is largely random, and some of the vfs paths
really do care. The path lookup in particular is already quite D$
intensive, and profiles show that accessing the 'inode->i_op->xyz'
fields is quite costly.
We already optimized the dcache to not unnecessarily load the d_op
structure for members that are often NULL using the DCACHE_OP_xyz bits
in dentry->d_flags, and this does something very similar for the inode
ops that are used during pathname lookup.
It also re-orders the fields so that the fields accessed by 'stat' are
together at the beginning of the inode structure, and roughly in the
order accessed.
The effect of this seems to be in the 1-2% range for an empty kernel
"make -j" run (which is fairly kernel-intensive, mostly in filename
lookup), so it's visible. The numbers are fairly noisy, though, and
likely depend a lot on exact microarchitecture. So there's more tuning
to be done.
Linus Torvalds [Sun, 7 Aug 2011 05:41:50 +0000 (22:41 -0700)]
vfs: renumber DCACHE_xyz flags, remove some stale ones
Gcc tends to generate better code with small integers, including the
DCACHE_xyz flag tests - so move the common ones to be first in the list.
Also just remove the unused DCACHE_INOTIFY_PARENT_WATCHED and
DCACHE_AUTOFS_PENDING values, their users no longer exists in the source
tree.
And add a "unlikely()" to the DCACHE_OP_COMPARE test, since we want the
common case to be a nice straight-line fall-through.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
net: Compute protocol sequence numbers and fragment IDs using MD5.
crypto: Move md5_transform to lib/md5.c
Boaz Harrosh [Sun, 7 Aug 2011 02:26:31 +0000 (19:26 -0700)]
exofs: Rename raid engine from exofs/ios.c => ore
ORE stands for "Objects Raid Engine"
This patch is a mechanical rename of everything that was in ios.c
and its API declaration to an ore.c and an osd_ore.h header. The ore
engine will later be used by the pnfs objects layout driver.
* File ios.c => ore.c
* Declaration of types and API are moved from exofs.h to a new
osd_ore.h
* All used types are prefixed by ore_ from their exofs_ name.
* Shift includes from exofs.h to osd_ore.h so osd_ore.h is
independent, include it from exofs.h.
Other than a pure rename there are no other changes. Next patch
will move the ore into it's own module and will export the API
to be used by exofs and later the layout driver
Boaz Harrosh [Fri, 5 Aug 2011 22:06:04 +0000 (15:06 -0700)]
exofs: ios: Move to a per inode components & device-table
Exofs raid engine was saving on memory space by having a single layout-info,
single pid, and a single device-table, global to the filesystem. Then passing
a credential and object_id info at the io_state level, private for each
inode. It would also devise this contraption of rotating the device table
view for each inode->ino to spread out the device usage.
This is not compatible with the pnfs-objects standard, demanding that
each inode can have it's own layout-info, device-table, and each object
component it's own pid, oid and creds.
So: Bring exofs raid engine to be usable for generic pnfs-objects use by:
* Define an exofs_comp structure that holds obj_id and credential info.
* Break up exofs_layout struct to an exofs_components structure that holds a
possible array of exofs_comp and the array of devices + the size of the
arrays.
* Add a "comps" parameter to get_io_state() that specifies the ids creds
and device array to use for each IO.
This enables to keep the layout global, but the device-table view, creds
and IDs at the inode level. It only adds two 64bit to each inode, since
some of these members already existed in another form.
* ios raid engine now access layout-info and comps-info through the passed
pointers. Everything is pre-prepared by caller for generic access of
these structures and arrays.
At the exofs Level:
* Super block holds an exofs_components struct that holds the device
array, previously in layout. The devices there are in device-table
order. The device-array is twice bigger and repeats the device-table
twice so now each inode's device array can point to a random device
and have a round-robin view of the table, making it compatible to
previous exofs versions.
* Each inode has an exofs_components struct that is initialized at
load time, with it's own view of the device table IDs and creds.
When doing IO this gets passed to the io_state together with the
layout.
While preforming this change. Bugs where found where credentials with the
wrong IDs where used to access the different SB objects (super.c). As well
as some dead code. It was never noticed because the target we use does not
check the credentials.
Boaz Harrosh [Tue, 16 Nov 2010 18:09:58 +0000 (20:09 +0200)]
exofs: Add offset/length to exofs_get_io_state
In future raid code we will need to know the IO offset/length
and if it's a read or write to determine some of the array
sizes we'll need.
So add a new exofs_get_rw_state() API for use when
writeing/reading. All other simple cases are left using the
old way.
The major change to this is that now we need to call
exofs_get_io_state later at inode.c::read_exec and
inode.c::write_exec when we actually know these things. So this
patch is kept separate so I can test things apart from other
changes.
David S. Miller [Thu, 4 Aug 2011 03:50:44 +0000 (20:50 -0700)]
net: Compute protocol sequence numbers and fragment IDs using MD5.
Computers have become a lot faster since we compromised on the
partial MD4 hash which we use currently for performance reasons.
MD5 is a much safer choice, and is inline with both RFC1948 and
other ISS generators (OpenBSD, Solaris, etc.)
Furthermore, only having 24-bits of the sequence number be truly
unpredictable is a very serious limitation. So the periodic
regeneration and 8-bit counter have been removed. We compute and
use a full 32-bit sequence number.
For ipv6, DCCP was found to use a 32-bit truncated initial sequence
number (it needs 43-bits) and that is fixed here as well.
Reported-by: Dan Kaminsky <dan@doxpara.com> Tested-by: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
cifs: cope with negative dentries in cifs_get_root
cifs: convert prefixpath delimiters in cifs_build_path_to_root
CIFS: Fix missing a decrement of inFlight value
cifs: demote DFS referral lookup errors to cFYI
Revert "cifs: advertise the right receive buffer size to the server"
Linus Torvalds [Sat, 6 Aug 2011 19:22:30 +0000 (12:22 -0700)]
Merge branch 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/trace: Fix compile error when CONFIG_XEN_PRIVILEGED_GUEST is not set
xen: Fix misleading WARN message at xen_release_chunk
xen: Fix printk() format in xen/setup.c
xen/tracing: it looks like we wanted CONFIG_FTRACE
xen/self-balloon: Add dependency on tmem.
xen/balloon: Fix compile errors - missing header files.
xen/grant: Fix compile warning.
xen/pciback: remove duplicated #include
John Stanley [Thu, 4 Aug 2011 00:41:00 +0000 (20:41 -0400)]
savagedb: Fix typo causing regression in savage4 series video chip detection
Two additional savage4 variants were added, but the S3_SAVAGE4_SERIES
macro was incompletely modified, resulting in a false positive detection
of a savage4 card regardless of which savage card is actually present.
For non-savage4 series cards, such as a Savage/IX-MV card, this results
in garbled video and/or a hard-hang at boot time. Fix this by changing
an '||' to an '&&' in the S3_SAVAGE4_SERIES macro.
Signed-off-by: John P. Stanley <jpsinthemix@verizon.net> Reviewed-by: Tormod Volden <debian.tormod@gmail.com>
[ The macros have incomplete parenthesis too, but whatever .. -Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Josh Triplett [Wed, 3 Aug 2011 19:19:07 +0000 (12:19 -0700)]
CodingStyle: Document the exception of not splitting user-visible strings, for grepping
Patch reviewers now recommend not splitting long user-visible strings,
such as printk messages, even if they exceed 80 columns. This avoids
breaking grep. However, that recommendation did not actually appear
anywhere in Documentation/CodingStyle.
See, for example, the thread at
http://news.gmane.org/find-root.php?message_id=%3c1312215262.11635.15.camel%40Joe%2dLaptop%3e
Linus Torvalds [Sat, 6 Aug 2011 18:51:33 +0000 (11:51 -0700)]
vfs: show O_CLOEXE bit properly in /proc/<pid>/fdinfo/<fd> files
The CLOEXE bit is magical, and for performance (and semantic) reasons we
don't actually maintain it in the file descriptor itself, but in a
separate bit array. Which means that when we show f_flags, the CLOEXE
status is shown incorrectly: we show the status not as it is now, but as
it was when the file was opened.
Fix that by looking up the bit properly in the 'fdt->close_on_exec' bit
array.
Uli needs this in order to re-implement the pfiles program:
"For normal file descriptors (not sockets) this was the last piece of
information which wasn't available. This is all part of my 'give
Solaris users no reason to not switch' effort. I intend to offer the
code to the util-linux-ng maintainers."
Linus Torvalds [Sat, 6 Aug 2011 18:43:08 +0000 (11:43 -0700)]
oom_ajd: don't use WARN_ONCE, just use printk_once
WARN_ONCE() is very annoying, in that it shows the stack trace that we
don't care about at all, and also triggers various user-level "kernel
oopsed" logic that we really don't care about. And it's not like the
user can do anything about the applications (sshd) in question, it's a
distro issue.
For ChromiumOS, we use SHA-1 to verify the integrity of the root
filesystem. The speed of the kernel sha-1 implementation has a major
impact on our boot performance.
To improve boot performance, we investigated using the heavily optimized
sha-1 implementation used in git. With the git sha-1 implementation, we
see a 11.7% improvement in boot time.
10 reboots, remove slowest/fastest.
Before:
Mean: 6.58 seconds Stdev: 0.14
After (with git sha-1, this patch):
Mean: 5.89 seconds Stdev: 0.07
The other cool thing about the git SHA-1 implementation is that it only
needs 64 bytes of stack for the workspace while the original kernel
implementation needed 320 bytes.
Signed-off-by: Mandeep Singh Baines <msb@chromium.org> Cc: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Cc: Nicolas Pitre <nico@cam.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: David S. Miller <davem@davemloft.net> Cc: linux-crypto@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Daniel Mack [Fri, 5 Aug 2011 22:23:18 +0000 (00:23 +0200)]
ALSA: snd-usb: Fix uninitialized variable usage
Purely cosmetic, but fixes the following build warning.
CC [M] sound/usb/quirks.o
sound/usb/quirks.c: In function ‘snd_usb_apply_boot_quirk’:
sound/usb/quirks.c:429:6: warning: ‘err’ may be used uninitialized in this function [-Wuninitialized]
Signed-off-by: Daniel Mack <zonque@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
This patch introduces separate lock to struct acpi_battery to
grab in sysfs_remove_battery() instead of battery->lock.
So fix by Lan Tianyu is still there, we just grab independent lock.
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Tested-by: Lan Tianyu <tianyu.lan@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
Jason Baron [Fri, 5 Aug 2011 20:40:40 +0000 (16:40 -0400)]
jump label: Reduce the cycle count by changing the link order
In the course of testing jump labels for use with the CFS
bandwidth controller, Paul Turner, discovered that using jump
labels reduced the branch count and the instruction count, but
did not reduce the cycle count or wall time.
I noticed that having the jump_label.o included in the kernel
but not used in any way still caused this increase in cycle
count and wall time. Thus, I moved jump_label.o in the
kernel/Makefile, thus changing the link order, and presumably
moving it out of hot icache areas. This brought down the cycle
count/time as expected.
In addition to Paul's testing, I've tested the patch using a
single 'static_branch()' in the getppid() path, and basically
running tight loops of calls to getppid(). Here are my results
for the branch disabled case:
With jump labels turned on (CONFIG_JUMP_LABEL), branch disabled:
Performance counter stats for 'bash -c /tmp/getppid;true' (50 runs):
Kevin Hilman [Fri, 5 Aug 2011 19:45:20 +0000 (21:45 +0200)]
PM / Runtime: Allow _put_sync() from interrupts-disabled context
Currently the use of pm_runtime_put_sync() is not safe from
interrupts-disabled context because rpm_idle() will release the
spinlock and enable interrupts for the idle callbacks. This enables
interrupts during a time where interrupts were expected to be
disabled, and can have strange side effects on drivers that expected
interrupts to be disabled.
This is not a bug since the documentation clearly states that only
_put_sync_suspend() is safe in IRQ-safe mode.
However, pm_runtime_put_sync() could be made safe when in IRQ-safe
mode by releasing the spinlock but not re-enabling interrupts, which
is what this patch aims to do.
Problem was found when using some buggy drivers that set
pm_runtime_irq_safe() and used _put_sync() in interrupts-disabled
context.
Reported-by: Colin Cross <ccross@google.com> Tested-by: Nishanth Menon <nm@ti.com> Signed-off-by: Kevin Hilman <khilman@ti.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
The local variable ret is defined twice in pm_genpd_poweron(), which
causes this function to always return 0, even if the PM domain's
.power_on() callback fails, in which case an error code should be
returned.
Remove the wrong second definition of ret and additionally remove an
unnecessary definition of wait from pm_genpd_poweron().
The AMW0 function in acer-wmi works on Lenovo ideapad S205 for control
the wifi hardware state. We also found there have a 0x78 EC register
exposes the state of wifi hardware switch on the machine.
So, add this patch to support Lenovo ideapad S205 wifi hardware switch
in acer-wmi driver.