Linus Torvalds [Thu, 12 Oct 2006 14:49:46 +0000 (07:49 -0700)]
Merge branch 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block:
[PATCH] block layer: ioprio_best function fix
[PATCH] ide-cd: fix breakage with internally queued commands
[PATCH] block layer: elv_iosched_show should get elv_list_lock
[PATCH] splice: fix pipe_to_file() ->prepare_write() error path
[PATCH] block layer: elevator_find function cleanup
[PATCH] elevator: elevator_type member not used
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[PKT_SCHED] sch_htb: use rb_first() cleanup
[RTNETLINK]: Fix use of wrong skb in do_getlink()
[DECNET]: Fix sfuzz hanging on 2.6.18
[NET]: Do not memcmp() over pad bytes of struct flowi.
[NET]: Introduce protocol-specific destructor for time-wait sockets.
[NET]: Use typesafe inet_twsk() inline function instead of cast.
[NET]: Use hton{l,s}() for non-initializers.
[TCP]: Use TCPOLEN_TSTAMP_ALIGNED macro instead of magic number.
[IPV6]: Seperate sit driver to extra module (addrconf.c changes)
[IPV6]: Seperate sit driver to extra module
[NET]: File descriptor loss while receiving SCM_RIGHTS
[SCTP]: Fix the RX queue size shown in /proc/net/sctp/assocs output.
[SCTP]: Fix receive buffer accounting.
SELinux: Bug fix in polidydb_destroy
IPsec: fix handling of errors for socket policies
IPsec: correct semantics for SELinux policy matching
IPsec: propagate security module errors up from flow_cache_lookup
NetLabel: use SECINITSID_UNLABELED for a base SID
NetLabel: fix a cache race condition
Vasily Tarasov [Thu, 12 Oct 2006 13:09:51 +0000 (15:09 +0200)]
[PATCH] block layer: ioprio_best function fix
Currently ioprio_best function first checks wethere aioprio or bioprio equals
IOPRIO_CLASS_NONE (ioprio_valid() macros does that) and if it is so it returns
bioprio/aioprio appropriately. Thus the next four lines, that set aclass/bclass
to IOPRIO_CLASS_BE, if aclass/bclass == IOPRIO_CLASS_NONE, are never executed.
The second problem: if aioprio from class IOPRIO_CLASS_NONE and bioprio from
class IOPRIO_CLASS_IDLE are passed to ioprio_best function, it will return
IOPRIO_CLASS_IDLE. It means that during __make_request we can merge two
requests and set the priority of merged request to IDLE, while one of
the initial requests originates from a process with NONE (default) priority.
So we can get a situation when a process with default ioprio will experience
IO starvation, while there is no process from real-time class in the system.
Just removing ioprio_valid check should correct situation.
Joerg Roedel [Tue, 10 Oct 2006 21:49:53 +0000 (14:49 -0700)]
[IPV6]: Seperate sit driver to extra module (addrconf.c changes)
This patch contains the changes to net/ipv6/addrconf.c to remove sit
specific code if the sit driver is not selected.
Signed-off-by: Joerg Roedel <joro-lkml@zlug.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Joerg Roedel [Tue, 10 Oct 2006 21:47:44 +0000 (14:47 -0700)]
[IPV6]: Seperate sit driver to extra module
This patch removes the driver of the IPv6-in-IPv4 tunnel driver (sit)
from the IPv6 module. It adds an option to Kconfig which makes it
possible to compile it as a seperate module.
Signed-off-by: Joerg Roedel <joro-lkml@zlug.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Miklos Szeredi [Tue, 10 Oct 2006 04:42:14 +0000 (21:42 -0700)]
[NET]: File descriptor loss while receiving SCM_RIGHTS
If more than one file descriptor was sent with an SCM_RIGHTS message,
and on the receiving end, after installing a nonzero (but not all)
file descritpors the process runs out of fds, then the already
installed fds will be lost (userspace will have no way of knowing
about them).
The following patch makes sure, that at least the already installed
fds are sent to userspace. It doesn't solve the issue of losing file
descriptors in case of an EFAULT on the userspace buffer.
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Tue, 10 Oct 2006 04:34:26 +0000 (21:34 -0700)]
[SCTP]: Fix the RX queue size shown in /proc/net/sctp/assocs output.
Show the true receive buffer usage.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Tue, 10 Oct 2006 04:34:04 +0000 (21:34 -0700)]
[SCTP]: Fix receive buffer accounting.
When doing receiver buffer accounting, we always used skb->truesize.
This is problematic when processing bundled DATA chunks because for
every DATA chunk that could be small part of one large skb, we would
charge the size of the entire skb. The new approach is to store the
size of the DATA chunk we are accounting for in the sctp_ulpevent
structure and use that stored value for accounting.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Chad Sellers [Fri, 6 Oct 2006 20:09:52 +0000 (16:09 -0400)]
SELinux: Bug fix in polidydb_destroy
This patch fixes two bugs in policydb_destroy. Two list pointers
(policydb.ocontexts[i] and policydb.genfs) were not being reset to NULL when
the lists they pointed to were being freed. This caused a problem when the
initial policy load failed, as the policydb being destroyed was not a
temporary new policydb that was thrown away, but rather was the global
(active) policydb. Consequently, later functions, particularly
sys_bind->selinux_socket_bind->security_node_sid and
do_rw_proc->selinux_sysctl->selinux_proc_get_sid->security_genfs_sid tried
to dereference memory that had previously been freed.
Signed-off-by: Chad Sellers <csellers@tresys.com> Signed-off-by: James Morris <jmorris@namei.org>
This treats the security errors encountered in the case of
socket policy matching, the same as how these are treated in
the case of main/sub policies, which is to return a full lookup
failure.
Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: James Morris <jmorris@namei.org>
IPsec: correct semantics for SELinux policy matching
Currently when an IPSec policy rule doesn't specify a security
context, it is assumed to be "unlabeled" by SELinux, and so
the IPSec policy rule fails to match to a flow that it would
otherwise match to, unless one has explicitly added an SELinux
policy rule allowing the flow to "polmatch" to the "unlabeled"
IPSec policy rules. In the absence of such an explicitly added
SELinux policy rule, the IPSec policy rule fails to match and
so the packet(s) flow in clear text without the otherwise applicable
xfrm(s) applied.
The above SELinux behavior violates the SELinux security notion of
"deny by default" which should actually translate to "encrypt by
default" in the above case.
This was first reported by Evgeniy Polyakov and the way James Morris
was seeing the problem was when connecting via IPsec to a
confined service on an SELinux box (vsftpd), which did not have the
appropriate SELinux policy permissions to send packets via IPsec.
With this patch applied, SELinux "polmatching" of flows Vs. IPSec
policy rules will only come into play when there's a explicit context
specified for the IPSec policy rule (which also means there's corresponding
SELinux policy allowing appropriate domains/flows to polmatch to this context).
Secondly, when a security module is loaded (in this case, SELinux), the
security_xfrm_policy_lookup() hook can return errors other than access denied,
such as -EINVAL. We were not handling that correctly, and in fact
inverting the return logic and propagating a false "ok" back up to
xfrm_lookup(), which then allowed packets to pass as if they were not
associated with an xfrm policy.
The solution for this is to first ensure that errno values are
correctly propagated all the way back up through the various call chains
from security_xfrm_policy_lookup(), and handled correctly.
Then, flow_cache_lookup() is modified, so that if the policy resolver
fails (typically a permission denied via the security module), the flow
cache entry is killed rather than having a null policy assigned (which
indicates that the packet can pass freely). This also forces any future
lookups for the same flow to consult the security module (e.g. SELinux)
for current security policy (rather than, say, caching the error on the
flow cache entry).
This patch: Fix the selinux side of things.
This makes sure SELinux polmatching of flow contexts to IPSec policy
rules comes into play only when an explicit context is associated
with the IPSec policy rule.
Also, this no longer defaults the context of a socket policy to
the context of the socket since the "no explicit context" case
is now handled properly.
Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: James Morris <jmorris@namei.org>
James Morris [Thu, 5 Oct 2006 20:42:27 +0000 (15:42 -0500)]
IPsec: propagate security module errors up from flow_cache_lookup
When a security module is loaded (in this case, SELinux), the
security_xfrm_policy_lookup() hook can return an access denied permission
(or other error). We were not handling that correctly, and in fact
inverting the return logic and propagating a false "ok" back up to
xfrm_lookup(), which then allowed packets to pass as if they were not
associated with an xfrm policy.
The way I was seeing the problem was when connecting via IPsec to a
confined service on an SELinux box (vsftpd), which did not have the
appropriate SELinux policy permissions to send packets via IPsec.
The first SYNACK would be blocked, because of an uncached lookup via
flow_cache_lookup(), which would fail to resolve an xfrm policy because
the SELinux policy is checked at that point via the resolver.
However, retransmitted SYNACKs would then find a cached flow entry when
calling into flow_cache_lookup() with a null xfrm policy, which is
interpreted by xfrm_lookup() as the packet not having any associated
policy and similarly to the first case, allowing it to pass without
transformation.
The solution presented here is to first ensure that errno values are
correctly propagated all the way back up through the various call chains
from security_xfrm_policy_lookup(), and handled correctly.
Then, flow_cache_lookup() is modified, so that if the policy resolver
fails (typically a permission denied via the security module), the flow
cache entry is killed rather than having a null policy assigned (which
indicates that the packet can pass freely). This also forces any future
lookups for the same flow to consult the security module (e.g. SELinux)
for current security policy (rather than, say, caching the error on the
flow cache entry).
Testing revealed a problem with the NetLabel cache where a cached entry could
be freed while in use by the LSM layer causing an oops and other problems.
This patch fixes that problem by introducing a reference counter to the cache
entry so that it is only freed when it is no longer in use.
Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
Martin Habets [Wed, 11 Oct 2006 21:58:30 +0000 (14:58 -0700)]
[SPARC32]: Fix sparc32 modpost warnings.
Fix these 2.6.19-rc1 build warnings from modpost:
WARNING: vmlinux - Section mismatch: reference to .init.text:_sinittext from .text between 'core_kernel_text' (at offset 0x3e060) and '__kernel_text_address'
WARNING: vmlinux - Section mismatch: reference to .init.text:_sinittext from .text between 'core_kernel_text' (at offset 0x3e064) and '__kernel_text_address'
WARNING: vmlinux - Section mismatch: reference to .init.text:_einittext from .text between 'core_kernel_text' (at offset 0x3e07c) and '__kernel_text_address'
WARNING: vmlinux - Section mismatch: reference to .init.text:_einittext from .text between 'core_kernel_text' (at offset 0x3e080) and '__kernel_text_address'
WARNING: vmlinux - Section mismatch: reference to .init.text:_sinittext from .text between 'is_ksym_addr' (at offset 0x4b3a4) and 'kallsyms_expand_symbol'
WARNING: vmlinux - Section mismatch: reference to .init.text:_sinittext from .text between 'is_ksym_addr' (at offset 0x4b3a8) and 'kallsyms_expand_symbol'
WARNING: vmlinux - Section mismatch: reference to .init.text:_einittext from .text between 'is_ksym_addr' (at offset 0x4b3b4) and 'kallsyms_expand_symbol'
WARNING: vmlinux - Section mismatch: reference to .init.text:_einittext from .text between 'is_ksym_addr' (at offset 0x4b3e4) and 'kallsyms_expand_symbol'
WARNING: vmlinux - Section mismatch: reference to .init.text:_sinittext from .text between 'get_symbol_pos' (at offset 0x4b640) and 'kallsyms_lookup_size_offset'
WARNING: vmlinux - Section mismatch: reference to .init.text:_sinittext from .text between 'get_symbol_pos' (at offset 0x4b644) and 'kallsyms_lookup_size_offset'
WARNING: vmlinux - Section mismatch: reference to .init.text:_einittext from .text between 'get_symbol_pos' (at offset 0x4b654) and 'kallsyms_lookup_size_offset'
WARNING: vmlinux - Section mismatch: reference to .init.text:_einittext from .text between 'get_symbol_pos' (at offset 0x4b658) and 'kallsyms_lookup_size_offset'
WARNING: vmlinux - Section mismatch: reference to .init.text:_sinittext from .text between 'get_symbol_pos' (at offset 0x4b68c) and 'kallsyms_lookup_size_offset'
The crux of the matter is that modpost only checks the relocatable
sections. i386 vmlinux has none, so modpost does no checking on it (it
does on the modules). However, sparc vmlinux has plenty of
relocatable sections because it is being built with 'ld -r' (to allow
for btfixup processing). So for sparc, modpost does do a lot of
checking. Sure enough, running modpost on arch/sparc/boot/image yields
no output (i.e. all is well).
modpost.c check_sec_ref() has:
/* We want to process only relocation sections and not .init */
if (sechdrs[i].sh_type == SHT_RELA) {
// check here
} else if (sechdrs[i].sh_type == SHT_REL) {
// check here
}
Signed-off-by: Martin Habets <errandir_news@mph.eclipse.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Habets [Tue, 10 Oct 2006 21:44:01 +0000 (14:44 -0700)]
[SPARC32]: Fix sparc32 modpost warnings with sunzilog
Fix this 2.6.19-rc1 build warnings from modpost:
WARNING: vmlinux - Section mismatch: reference to .init.text:sunzilog_console_setup from .data between 'sunzilog_console' (at offset 0x8394) and 'devices_subsys'
Signed-off-by: Martin Habets <errandir_news@mph.eclipse.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Habets [Tue, 10 Oct 2006 21:36:47 +0000 (14:36 -0700)]
[SPARC32]: Mark srmmu_nocache_init as __init.
Fix these 2.6.19-rc1 build warnings from modpost:
WARNING: vmlinux - Section mismatch: reference to .init.text:__alloc_bootmem from .text between 'srmmu_nocache_init' (at offset 0x1a0f8) and 'srmmu_mmu_info'
WARNING: vmlinux - Section mismatch: reference to .init.text:__alloc_bootmem from .text between 'srmmu_nocache_init' (at offset 0x1a118) and 'srmmu_mmu_info'
WARNING: vmlinux - Section mismatch: reference to .init.text:srmmu_early_allocate_ptable_skeleton from .text between 'srmmu_nocache_init' (at offset 0x1a188) and 'srmmu_mmu_info'
Signed-off-by: Martin Habets <errandir_news@mph.eclipse.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 11 Oct 2006 22:30:14 +0000 (15:30 -0700)]
Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
[MIPS] Pass NULL not 0 for pointer value.
[MIPS] IP27: Make declaration of setup_replication_mask a proper prototype.
[MIPS] BigSur: More useful defconfig.
[MIPS] Cleanup definitions of speed_t and tcflag_t.
[MIPS] Fix compilation warnings in arch/mips/sibyte/bcm1480/smp.c
[MIPS] Optimize and cleanup get_saved_sp, set_saved_sp
[MIPS] <asm/irq.h> does not need pt_regs anymore.
[MIPS] Workaround for bug in gcc -EB / -EL options.
[MIPS] Fix timer setup for Jazz
Ralf Baechle [Tue, 10 Oct 2006 14:44:10 +0000 (15:44 +0100)]
[MIPS] Workaround for bug in gcc -EB / -EL options.
Certain gcc versions upto gcc 4.1.1 (probably 4.2-subversion as of
2006-10-10 don't properly change the the predefined symbols if -EB / -EL
are used, so we kludge that here. A bug has been filed at
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29413.
Linus Torvalds [Wed, 11 Oct 2006 18:19:47 +0000 (11:19 -0700)]
Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[PATCH] pata-qdi: fix le32 in data_xfer
[libata] sata_promise: add PCI ID
[PATCH] libata: return sense data in HDIO_DRIVE_CMD ioctl
[PATCH] libata: Don't believe bogus claims in the older PIO mode register
Jeff Garzik [Wed, 11 Oct 2006 08:22:25 +0000 (01:22 -0700)]
[PATCH] ISDN: several minor fixes
pcbit: kill 'may be used uninitialized' warning. although the code does
eventually fill the 32 bits it cares about, the variable truly is
accessed uninitialized in each macro. Easier to just clean it up now.
sc: fix a ton of obviously incorrect printk's (some with missing
arguments even)
Signed-off-by: Jeff Garzik <jeff@garzik.org> Acked-by: Karsten Keil <kkeil@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andreas Mohr [Wed, 11 Oct 2006 08:22:24 +0000 (01:22 -0700)]
[PATCH] fs/bio.c: tweaks
- Calculate a variable in bvec_alloc_bs() only once needed, not earlier
(bio.o down from 18408 to 18376 Bytes, 32 Bytes saved, probably due to
data locality improvements).
- Init variable idx to silence a gcc warning which already existed in the
unmodified original base file (bvec_alloc_bs() handles idx correctly, so
there's no need for the warning):
fs/bio.c: In function `bio_alloc_bioset':
fs/bio.c:169: warning: `idx' may be used uninitialized in this function
Signed-off-by: Andreas Mohr <andi@lisas.de> Acked-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
David Howells [Wed, 11 Oct 2006 08:22:19 +0000 (01:22 -0700)]
[PATCH] VFS: Destroy the dentries contributed by a superblock on unmounting
The attached patch destroys all the dentries attached to a superblock in one go
by:
(1) Destroying the tree rooted at s_root.
(2) Destroying every entry in the anon list, one at a time.
(3) Each entry in the anon list has its subtree consumed from the leaves
inwards.
This reduces the amount of work generic_shutdown_super() does, and avoids
iterating through the dentry_unused list.
Note that locking is almost entirely absent in the shrink_dcache_for_umount*()
functions added by this patch. This is because:
(1) at the point the filesystem calls generic_shutdown_super(), it is not
permitted to further touch the superblock's set of dentries, and nor may
it remove aliases from inodes;
(2) the dcache memory shrinker now skips dentries that are being unmounted;
and
(3) the superblock no longer has any external references through which the VFS
can reach it.
Given these points, the only locking we need to do is when we remove dentries
from the unused list and the name hashes, which we do a directory's worth at a
time.
We also don't need to guard against reference counts going to zero unexpectedly
and removing bits of the tree we're working on as nothing else can call dput().
A cut down version of dentry_iput() has been folded into
shrink_dcache_for_umount_subtree() function. Apart from not needing to unlock
things, it also doesn't need to check for inotify watches.
In this version of the patch, the complaint about a dentry still being in use
has been expanded from a single BUG_ON() and now gives much more information.
Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: NeilBrown <neilb@suse.de> Acked-by: Ian Kent <raven@themaw.net> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
David Howells [Wed, 11 Oct 2006 08:22:15 +0000 (01:22 -0700)]
[PATCH] AUTOFS: Make sure all dentries refs are released before calling kill_anon_super()
Make sure all dentries refs are released before calling kill_anon_super() so
that the assumption that generic_shutdown_super() can completely destroy the
dentry tree for there will be no external references holds true.
What was being done in the put_super() superblock op, is now done in the
kill_sb() filesystem op instead, prior to calling kill_anon_super().
This makes the struct autofs_sb_info::root member variable redundant (since
sb->s_root is still available), and so that is removed. The calls to
shrink_dcache_sb() are also removed since they're also redundant as
shrink_dcache_for_umount() will now be called after the cleanup routine.
Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Ian Kent <raven@themaw.net> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
David Howells [Wed, 11 Oct 2006 08:22:14 +0000 (01:22 -0700)]
[PATCH] ReiserFS: Make sure all dentries refs are released before calling kill_block_super()
Make sure all dentries refs are released before calling kill_block_super()
so that the assumption that generic_shutdown_super() can completely destroy
the dentry tree for there will be no external references holds true.
What was being done in the put_super() superblock op, is now done in the
kill_sb() filesystem op instead, prior to calling kill_block_super().
Changes made in [try #2]:
(*) reiserfs_kill_sb() now checks that the superblock FS info pointer is set
before trying to dereference it.
Signed-off-by: David Howells <dhowells@redhat.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: <reiserfs-dev@namesys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Wed, 11 Oct 2006 08:22:13 +0000 (01:22 -0700)]
[PATCH] dell_rbu: printk() warning fix
drivers/firmware/dell_rbu.c: In function 'packetize_data':
drivers/firmware/dell_rbu.c:252: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'int'
Cc: Matt Domsch <Matt_Domsch@dell.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Randy Dunlap [Wed, 11 Oct 2006 08:22:12 +0000 (01:22 -0700)]
[PATCH] kernel-doc: make parameter description indentation uniform
- In parameter descriptions, strip all whitespace between the parameter
name (e.g., @len) and its description so that the description is
indented uniformly in text and man page modes. Previously, spaces
or tabs (which are used for cleaner source code viewing) affected
the produced output in a negative way.
Before (man mode):
to Destination address, in user space.
from Source address, in kernel space.
n Number of bytes to copy.
After (man mode):
to Destination address, in user space.
from Source address, in kernel space.
n Number of bytes to copy.
- Fix/clarify a few function description comments.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ingo Molnar [Wed, 11 Oct 2006 08:22:08 +0000 (01:22 -0700)]
[PATCH] lockdep: fix printk recursion logic
Bug reported and fixed by Tilman Schmidt <tilman@imap.cc>: if lockdep is
enabled then log messages make it to /var/log/messages belatedly. The
reason is a missed wakeup of klogd.
Initially there was only a lockdep_internal() protection against lockdep
recursion within vprintk() - it grew the 'outer' lockdep_off()/on()
protection only later on. But that lockdep_off() made the
release_console_sem() within vprintk() always happen under the
lockdep_internal() condition, causing the bug.
The right solution to remove the inner protection against recursion here -
the outer one is enough.
+ '.' acquired while irqs enabled
+ '+' acquired in irq context
+ '-' acquired in process context with irqs disabled
+ '?' read-acquired both with irqs enabled and in irq context
+
But the get_usage_chars() function does this for '-'
if (class->usage_mask & LOCKF_ENABLED_HARDIRQS)
*c1 = '-';
So i guess what would be correct would be
'.' acquired while irqs disabled
'+' acquired in irq context
'-' acquired with irqs enabled
'?' read acquired in irq context with irqs enabled.
Adrian Bunk [Wed, 11 Oct 2006 08:22:04 +0000 (01:22 -0700)]
[PATCH] HT_IRQ must depend on PCI
CONFIG_PCI=n, CONFIG_HT_IRQ=y results in the following compile error:
...
LD vmlinux
arch/i386/mach-generic/built-in.o: In function `apicid_to_node':
summit.c:(.text+0x53): undefined reference to `apicid_2_node'
arch/i386/kernel/built-in.o: In function `arch_setup_ht_irq':
(.text+0xcf79): undefined reference to `write_ht_irq_low'
arch/i386/kernel/built-in.o: In function `arch_setup_ht_irq':
(.text+0xcf85): undefined reference to `write_ht_irq_high'
arch/i386/kernel/built-in.o: In function `k7nops':
alternative.c:(.data+0x1358): undefined reference to `mask_ht_irq'
alternative.c:(.data+0x1360): undefined reference to `unmask_ht_irq'
make[1]: *** [vmlinux] Error 1
Bug report by Jesper Juhl.
Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Matthew Wilcox [Wed, 11 Oct 2006 08:22:02 +0000 (01:22 -0700)]
[PATCH] Consolidate check_signature
There's nothing arch-specific about check_signature(), so move it to
<linux/io.h>. Use a cross between the Alpha and i386 implementations as
the generic one.
Signed-off-by: Matthew Wilcox <willy@parisc-linux.org> Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Trond Myklebust [Wed, 11 Oct 2006 08:21:58 +0000 (01:21 -0700)]
[PATCH] VM: Fix the gfp_mask in invalidate_complete_page2
If try_to_release_page() is called with a zero gfp mask, then the
filesystem is effectively denied the possibility of sleeping while
attempting to release the page. There doesn't appear to be any valid
reason why this should be banned, given that we're not calling this from a
memory allocation context.
For this reason, change the gfp_mask argument of the call to GFP_KERNEL.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Steve Dickson <SteveD@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Reinette Chatre [Wed, 11 Oct 2006 08:21:55 +0000 (01:21 -0700)]
[PATCH] bitmap: parse input from kernel and user buffers
lib/bitmap.c:bitmap_parse() is a library function that received as input a
user buffer. This seemed to have originated from the way the write_proc
function of the /proc filesystem operates.
This has been reworked to not use kmalloc and eliminates a lot of
get_user() overhead by performing one access_ok before using __get_user().
We need to test if we are in kernel or user space (is_user) and access the
buffer differently. We cannot use __get_user() to access kernel addresses
in all cases, for example in architectures with separate address space for
kernel and user.
This function will be useful for other uses as well; for example, taking
input for /sysfs instead of /proc, so it was changed to accept kernel
buffers. We have this use for the Linux UWB project, as part as the
upcoming bandwidth allocator code.
Only a few routines used this function and they were changed too.
Signed-off-by: Reinette Chatre <reinette.chatre@linux.intel.com> Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Cc: Paul Jackson <pj@sgi.com> Cc: Joe Korty <joe.korty@ccur.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Wed, 11 Oct 2006 08:21:53 +0000 (01:21 -0700)]
[PATCH] invalidate_inode_pages2_range() debug
A failure in invalidate_inode_pages2_range() can result in unpleasant things
happening in NFS (at least). Stick a WARN_ON_ONCE() in there so we can find
out if it happens, and maybe why.
(akpm: might be a -mm-only patch, we'll see..)
Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Steve Dickson <SteveD@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Wed, 11 Oct 2006 08:21:52 +0000 (01:21 -0700)]
[PATCH] sched: likely profiling
This likely profiling is pretty fun. I found a few possible problems
in sched.c.
This patch may be not measurable, but when I did measure long ago,
nooping (un)likely cost a couple of % on scheduler heavy benchmarks, so
it all adds up.
Tweak some branch hints:
- the 2nd 64 bits in the bitmask is likely to be populated, because it
contains the first 28 bits (nearly 3/4) of the normal priorities.
(ratio of 669669:691 ~= 1000:1).
- it isn't unlikely that context switching switches to another process. it
might be very rapidly switching to and from the idle process (ratio of
475815:419004 and 471330:423544). Let the branch predictor decide.
- preempt_enable seems to be very often called in a nested preempt_disable
or with interrupts disabled (ratio of 3567760:87965 ~= 40:1)
Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Daniel Walker <dwalker@mvista.com> Cc: Hua Zhong <hzhong@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jeff Garzik [Wed, 11 Oct 2006 08:21:51 +0000 (01:21 -0700)]
[PATCH] tpm: fix error handling
- handle sysfs error
- handle driver model errors
- de-obfuscate platform_device_register_simple() call, which included an
assignment in between two function calls, in the same C statement.
Signed-off-by: Jeff Garzik <jeff@garzik.org> Acked-by: Kylene Hall <kjhall@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Define the behaviour when an error is encountered. (Either ignore
errors and just mark the file system erroneous and continue, or remount
the file system read-only, or panic and halt the system.) The default is
set in the filesystem superblock, and can be changed using tune2fs(8).
---- end of quote ----
However EXT3_ERRORS_CONTINUE is not read from the superblock, and thus
ERRORS_CONT is not saved on the sbi->s_mount_opt. It leads to the incorrect
handle of errors on ext3.
Then we've checked corresponding code in ext2 and discovered that it is buggy
as well:
- EXT2_ERRORS_CONTINUE is not read from the superblock (the same);
- parse_option() does not clean the alternative values and thus something
like (ERRORS_CONT|ERRORS_RO) can be set;
- if options are omitted, parse_option() does not set any of these options.
Therefore it is possible to set any combination of these options on the ext2:
- none of them may be set: EXT2_ERRORS_CONTINUE on superblock / empty mount
options;
- any of them may be set using mount options;
- 2 any options may be set: by using EXT2_ERRORS_RO/EXT2_ERRORS_PANIC on the
superblock and other value in mount options;
- and finally all three options may be set by adding third option in remount.
Currently ext2 uses these values only in ext2_error() and it is not leading to
any noticeable troubles. However somebody may be discouraged when he will try
to workaround EXT2_ERRORS_PANIC on the superblock by using errors=continue in
mount options.
This patch:
EXT2_ERRORS_CONTINUE should be read from the superblock as default value for
error behaviour. parse_option() should clean the alternative options and
should not change default value taken from the superblock.