David S. Miller [Wed, 2 Aug 2006 21:37:06 +0000 (14:37 -0700)]
[SECURITY]: Fix build with CONFIG_SECURITY disabled.
include/linux/security.h: In function ‘security_release_secctx’:
include/linux/security.h:2757: warning: ‘return’ with a value, in function returning void
Signed-off-by: David S. Miller <davem@davemloft.net>
Chris Leech [Wed, 2 Aug 2006 21:21:19 +0000 (14:21 -0700)]
[I/OAT]: Remove CPU hotplug lock from net_dma_rebalance
Remove the lock_cpu_hotplug()/unlock_cpu_hotplug() calls from
net_dma_rebalance
The lock_cpu_hotplug()/unlock_cpu_hotplug() sequence in
net_dma_rebalance is both incorrect (as pointed out by David Miller)
because lock_cpu_hotplug() may sleep while the net_dma_event_lock
spinlock is held, and unnecessary (as pointed out by Andrew Morton) as
spin_lock() disables preemption which protects from CPU hotplug
events.
Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a bug in the DECnet routing code where we were
selecting a loopback device in preference to an outward facing device
even when the destination was known non-local. This patch should fix
the problem.
Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com> Signed-off-by: Steven Whitehouse <steve@chygwyn.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Catherine Zhang [Wed, 2 Aug 2006 21:12:06 +0000 (14:12 -0700)]
[AF_UNIX]: Kernel memory leak fix for af_unix datagram getpeersec patch
From: Catherine Zhang <cxzhang@watson.ibm.com>
This patch implements a cleaner fix for the memory leak problem of the
original unix datagram getpeersec patch. Instead of creating a
security context each time a unix datagram is sent, we only create the
security context when the receiver requests it.
This new design requires modification of the current
unix_getsecpeer_dgram LSM hook and addition of two new hooks, namely,
secid_to_secctx and release_secctx. The former retrieves the security
context and the latter releases it. A hook is required for releasing
the security context because it is up to the security module to decide
how that's done. In the case of Selinux, it's a simple kfree
operation.
Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: David S. Miller <davem@davemloft.net>
Adrian Bunk [Wed, 2 Aug 2006 21:07:58 +0000 (14:07 -0700)]
[NET]: skb_queue_lock_key() is no longer used.
Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[NET]: Remove lockdep_set_class() call from skb_queue_head_init().
The skb_queue_head_init() function is used both in drivers for private use
and in the core networking code. The usage models are vastly set of
functions that is only softirq safe; while the driver usage tends to be
more limited to a few hardirq safe accessor functions. Rather than
annotating all 133+ driver usages, for now just split this lock into a per
queue class. This change is obviously safe and probably should make
2.6.18.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
When I tested linux kernel 2.6.71.7 about statistics
"ipv6IfStatsOutFragCreates", and found that it couldn't increase
correctly. The criteria is RFC 2465:
ipv6IfStatsOutFragCreates OBJECT-TYPE
SYNTAX Counter32
MAX-ACCESS read-only
STATUS current
DESCRIPTION
"The number of output datagram fragments that have
been generated as a result of fragmentation at
this output interface."
::= { ipv6IfStatsEntry 15 }
I think there are two issues in Linux kernel.
1st:
RFC2465 specifies the counter is "The number of output datagram
fragments...". I think increasing this counter after output a fragment
successfully is better. And it should not be increased even though a
fragment is created but failed to output.
2nd:
If we send a big ICMP/ICMPv6 echo request to a host, and receive
ICMP/ICMPv6 echo reply consisted of some fragments. As we know that in
Linux kernel first fragmentation occurs in ICMP layer(maybe saying
transport layer is better), but this is not the "real"
fragmentation,just do some "pre-fragment" -- allocate space for date,
and form a frag_list, etc. The "real" fragmentation happens in IP layer
-- set offset and MF flag and so on. So I think in "fast path" for
ip_fragment/ip6_fragment, if we send a fragment which "pre-fragment" by
upper layer we should also increase "ipv6IfStatsOutFragCreates".
Signed-off-by: Wei Dong <weid@nanjing-fnst.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
When I tested Linux kernel 2.6.17.7 about statistics
"ipv6IfStatsInHdrErrors", found that this counter couldn't increase
correctly. The criteria is RFC2465:
ipv6IfStatsInHdrErrors OBJECT-TYPE
SYNTAX Counter3
MAX-ACCESS read-only
STATUS current
DESCRIPTION
"The number of input datagrams discarded due to
errors in their IPv6 headers, including version
number mismatch, other format errors, hop count
exceeded, errors discovered in processing their
IPv6 options, etc."
::= { ipv6IfStatsEntry 2 }
When I send TTL=0 and TTL=1 a packet to a router which need to be
forwarded, router just sends an ICMPv6 message to tell the sender that
TIME_EXCEED and HOPLIMITS, but no increments for this counter(in the
function ip6_forward).
Signed-off-by: Wei Dong <weid@nanjing-fnst.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 1 Aug 2006 07:00:12 +0000 (00:00 -0700)]
[NET]: Kill the WARN_ON() calls for checksum fixups.
We have a more complete solution in the works, involving
the seperation of CHECKSUM_HW on input vs. output, and
having netfilter properly do incremental checksums.
But that is a very involved patch and is thus 2.6.19
material.
What we have now is infinitely better than the past,
wherein all TSO packets were dropped due to corrupt
checksums as soon at the NAT module was loaded. At
least now, the checksums do get fixed up, it just
isn't the cleanest nor most optimal solution.
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Tue, 1 Aug 2006 06:46:18 +0000 (23:46 -0700)]
[NETFILTER]: SIP helper: expect RTP streams in both directions
Since we don't know in which direction the first packet will arrive, we
need to create one expectation for each direction, which is currently
prevented by max_expected beeing set to 1.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Add a dev_alloc_skb variant that takes a struct net_device * paramater.
For now that paramater is unused, but I'll use it to allocate the skb
from node-local memory in a follow-up patch. Also there have been some
other plans mentioned on the list that can use it.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 1 Aug 2006 05:32:09 +0000 (22:32 -0700)]
[TCP]: Process linger2 timeout consistently.
Based upon guidance from Alexey Kuznetsov.
When linger2 is active, we check to see if the fin_wait2
timeout is longer than the timewait. If it is, we schedule
the keepalive timer for the difference between the timewait
timeout and the fin_wait2 timeout.
When this orphan socket is seen by tcp_keepalive_timer()
it will try to transform this fin_wait2 socket into a
fin_wait2 mini-socket, again if linger2 is active.
Not all paths were setting this initial keepalive timer correctly.
The tcp input path was doing it correctly, but tcp_close() wasn't,
potentially making the socket linger longer than it really needs to.
Signed-off-by: David S. Miller <davem@davemloft.net>
James Morris [Mon, 31 Jul 2006 03:46:38 +0000 (20:46 -0700)]
[SECURITY] secmark: nul-terminate secdata
The patch below fixes a problem in the iptables SECMARK target, where
the user-supplied 'selctx' string may not be nul-terminated.
From initial analysis, it seems that the strlen() called from
selinux_string_to_sid() could run until it arbitrarily finds a zero,
and possibly cause a kernel oops before then.
The impact of this appears limited because the operation requires
CAP_NET_ADMIN, which is essentially always root. Also, the module is
not yet in wide use.
Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Tucker [Mon, 31 Jul 2006 03:44:19 +0000 (20:44 -0700)]
[NET] infiniband: Cleanup ib_addr module to use the netevents
Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Tucker [Mon, 31 Jul 2006 03:43:26 +0000 (20:43 -0700)]
[NET]: Network Event Notifier Mechanism.
This patch uses notifier blocks to implement a network event
notifier mechanism.
Clients register their callback function by calling
register_netevent_notifier() like this:
static struct notifier_block nb = {
.notifier_call = my_callback_func
};
...
register_netevent_notifier(&nb);
Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Refer to RFC2012, tcpAttemptFails is defined as following:
tcpAttemptFails OBJECT-TYPE
SYNTAX Counter32
MAX-ACCESS read-only
STATUS current
DESCRIPTION
"The number of times TCP connections have made a direct
transition to the CLOSED state from either the SYN-SENT
state or the SYN-RCVD state, plus the number of times TCP
connections have made a direct transition to the LISTEN
state from the SYN-RCVD state."
::= { tcp 7 }
When I lookup into RFC793, I found that the state change should occured
under following condition:
1. SYN-SENT -> CLOSED
a) Received ACK,RST segment when SYN-SENT state.
2. SYN-RCVD -> CLOSED
b) Received SYN segment when SYN-RCVD state(came from LISTEN).
c) Received RST segment when SYN-RCVD state(came from SYN-SENT).
d) Received SYN segment when SYN-RCVD state(came from SYN-SENT).
3. SYN-RCVD -> LISTEN
e) Received RST segment when SYN-RCVD state(came from LISTEN).
In my test, those direct state transition can not be counted to
tcpAttemptFails.
Signed-off-by: Wei Yongjun <yjwei@nanjing-fnst.com> Signed-off-by: David S. Miller <davem@davemloft.net>
James Morris [Mon, 31 Jul 2006 03:21:45 +0000 (20:21 -0700)]
[TCP]: fix memory leak in net/ipv4/tcp_probe.c::tcpprobe_read()
Based upon a patch by Jesper Juhl.
Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Stephen Hemminger <shemminger@osdl.org> Acked-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 31 Jul 2006 03:20:54 +0000 (20:20 -0700)]
[ATALK]: Make CONFIG_DEV_APPLETALK a tristate.
Otherwise we allow building appletalk drivers in-kernel when
CONFIG_ATALK is modular. That doesn't work because these drivers use
symbols such as "alloc_talkdev" which is exported from code built
by CONFIG_ATALK.
Noticed by Toralf Förster.
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Mon, 31 Jul 2006 03:20:28 +0000 (20:20 -0700)]
[NET]: Fix ___pskb_trim when entire frag_list needs dropping
When the trim point is within the head and there is no paged data,
___pskb_trim fails to drop the first element in the frag_list.
This patch fixes this by moving the len <= offset case out of the
page data loop.
This patch also adds a missing kfree_skb on the frag that we just
cloned.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Mon, 31 Jul 2006 03:19:33 +0000 (20:19 -0700)]
[IPV6]: Audit all ip6_dst_lookup/ip6_dst_store calls
The current users of ip6_dst_lookup can be divided into two classes:
1) The caller holds no locks and is in user-context (UDP).
2) The caller does not want to lookup the dst cache at all.
The second class covers everyone except UDP because most people do
the cache lookup directly before calling ip6_dst_lookup. This patch
adds ip6_sk_dst_lookup for the first class.
Similarly ip6_dst_store users can be divded into those that need to
take the socket dst lock and those that don't. This patch adds
__ip6_dst_store for those (everyone except UDP/datagram) that don't
need an extra lock.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Mon, 31 Jul 2006 03:19:11 +0000 (20:19 -0700)]
[XFRM]: Fix protocol field value for outgoing IPv6 GSO packets
Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (26 commits)
V4L/DVB (4380): Bttv: Revert VBI_OFFSET to previous value, it works better
V4L/DVB (4379): Videodev: Check return value of class_device_register() correctly
V4L/DVB (4373): Correctly handle sysfs error leg file removal in pvrusb2
V4L/DVB (4368): Bttv: use class_device_create_file and handle errors
V4L/DVB (4367): Videodev: Handle class_device related errors
V4L/DVB (4365): OVERLAY flag were enabled by mistake
V4L/DVB (4344): Fix broken dependencies on media Kconfig
V4L/DVB (4343): Fix for compilation without V4L1 or V4L1_COMPAT
V4L/DVB (4342): Fix ext_controls align on 64 bit architectures
V4L/DVB (4341): VIDIOCSMICROCODE were missing on compat_ioctl32
V4L/DVB (4322): Fix dvb-pll autoprobing
V4L/DVB (4311): Fix possible dvb-pll oops
V4L/DVB (4337): Refine dead code elimination in pvrusb2
V4L/DVB (4323): [budget/budget-av/budget-ci/budget-patch drivers] fixed DMA start/stop code
V4L/DVB (4316): Check __must_check warnings
V4L/DVB (4314): Set the Auxiliary Byte when tuning LG H06xF in analog mode
V4L/DVB (4313): Bugfix for keycode calculation on NPG remotes
V4L/DVB (4310): Saa7134: rename dmasound_{init, exit}
V4L/DVB (4306): Support non interlaced capture by default for saa713x
V4L/DVB (4298): Check all __must_check warnings in bttv.
...
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
[POWERPC] Minor comment fix for misc_64.S
[POWERPC] Use H_CEDE on non-SMT
[POWERPC] force 64bit mode in fwnmi handlers to workaround firmware bugs
[POWERPC] PMAC_APM_EMU should depend on ADB_PMU
[POWERPC] Fix new interrupt code (MPIC detection)
[POWERPC] Fix new interrupt code (MPIC endianness)
[POWERPC] Add cpufreq support for Xserve G5
[POWERPC] Xserve G5 thermal control fixes
[POWERPC] Fix mem= handling when the memory limit is > RMO size
[POWERPC] More offb/bootx fixes
[POWERPC] Fix legacy_serial.c error handling on 32 bits
[POWERPC] Fix default clock for udbg_16550
[POWERPC] Fix non-MPIC CHRPs with CONFIG_SMP set
[POWERPC] Fix 32 bits warning in prom_init.c
[POWERPC] Workaround Pegasos incorrect ISA "ranges"
[POWERPC] fix up front-LED Kconfig
[PATCH] rivafb/nvidiafb: race between register_framebuffer and *_bl_init
Since we now use the generic backlight infrastructure, I think we need to
call rivafb_bl_init before calling register_framebuffer since otherwise
rivafb_bl_init might race with the framebuffer layer already opening the
device and setting up the video mode. In this case we might end up with a
not yet fully intialized backlight (info->bl_dev still NULL) when calling
riva_bl_set_power via rivafb_set_par/rivafb_load_video_mode and the kernel
dies without any further notice during boot.
This fixes booting current git on a PB 6,1. In this case radeonfb/atyfb
would be affected too - I can fix that too but don't have any hardware to
test this on.
Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch fixes several problems:
- The legacy backlight value might be set at interrupt time. Introduced
a worker to prevent it from directly calling the backlight code.
- via-pmu allows the backlight to be grabbed, in which case we need to
prevent other kernel code from changing the brightness.
- Don't send PMU requests in via-pmu-backlight when the machine is about
to sleep or waking up.
- More Kconfig fixes.
Signed-off-by: Michael Hanselmann <linux-kernel@hansmi.ch> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Many IBM Thinkpad T4* models and some R* and X* with radeon video cards draw
too much power when suspended to RAM, reducing drastically the battery
lifetime. The solution is to enable suspend-to-D2 on these machines. They
are whitelisted through their subsystem vendor/device ID. This fixes
http://bugzilla.kernel.org/show_bug.cgi?id=3022
The patch introduces a framework to alter the pm_mode and reinit_func fields
of the radeonfb_info structure based on a whitelist. This should facilitate
future hardware-dependent workarounds. The workaround for the Samsung P35
that is already in the radeonfb code has been rewritten using this framework.
The behavior can be overridden with module options:
i) video=radeonfb:force_sleep=1
enable suspend-to-D2 also on non-whitelisted machines (useful for
testing new notebook models),
ii) video=radeonfb:ignore_devlist=1
Disable checking the whitelist and do not apply any workarounds.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] fbdev: statically link the framebuffer notification functions
The backlight and lcd subsystems can be notified by the framebuffer layer
of blanking events. However, these subsystems, as a whole, can function
independently from the framebuffer layer. But in order to enable to the
lcd and backlight subsystems, the framebuffer has to be compiled also,
effectively sucking in a huge amount of unneeded code.
To prevent dependency problems, separate out the framebuffer notification
mechanism from the framebuffer layer and permanently link it to the kernel.
Based on a bug report from Russ Ross <russruss@gmail.com>
According to the spec:
"The remove request asks the file server both to remove the file
represented by fid and to clunk the fid, even if the remove fails."
but the Linux client seems to expect the fid to be valid after a failed
remove attempt. Specifically, I'm getting this behavior when attempting to
remove a non-empty directory.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Russ Ross [Sun, 30 Jul 2006 10:04:15 +0000 (03:04 -0700)]
[PATCH] 9p: fix marshalling bug in tcreate with empty extension field
Signed-off-by: Russ Ross <russross@gmail.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
For files other than IFREG, nobh option doesn't make sense. Modifications
to them are journalled and needs buffer heads to do that. Without this
patch, we get kernel oops in page_buffers().
kernel/timer.c defines a (per-cpu) pointer to tvec_base_t, but initializes
it using { &a_tvec_base_t }, which sparse warns about; change this to just
&a_tvec_base_t.
[PATCH] freevxfs: Add missing lock_kernel() to vxfs_readdir
Commit 7b2fd697427e73c81d5fa659efd91bd07d303b0e in the historical GIT tree
stopped calling the readdir member of a file_operations struct with the big
kernel lock held, and fixed up all the readdir functions to do their own
locking. However, that change added calls to unlock_kernel() in
vxfs_readdir, but no call to lock_kernel(). Fix this by adding a call to
lock_kernel().
Signed-off-by: Josh Triplett <josh@freedesktop.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Thomas Horsley [Sun, 30 Jul 2006 10:04:12 +0000 (03:04 -0700)]
[PATCH] documentation: Documentation/initrd.txt
I spent a long time the other day trying to examine an initrd image on a
fedora core 5 system because the initrd.txt file is apparently obsolete.
Here is a patch which I hope will reduce future confusion for others.
Signed-off-by: Thomas Horsley <tom.horsley@ccur.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It is entirely possible (though rare) that jiffies half-wraps around, while a
dentry/inode remains in the cache. This could mean that the dentry/inode is
not invalidated for another half wraparound-time.
To get around this problem, use 64-bit jiffies. The only problem with this is
that dentry->d_time is 32 bits on 32-bit archs. So use d_fsdata as the high
32 bits. This is an ugly hack, but far simpler, than having to allocate
private data just for this purpose.
Since 64-bit jiffies can be assumed never to wrap around, simple comparison
can be used, and a zero time value can represent "invalid".
An attribute and entry timeout of zero should mean, that the entity is
invalidated immediately after the operation. Previously invalidation only
happened at the next clock tick.
Roland McGrath [Sun, 30 Jul 2006 10:04:06 +0000 (03:04 -0700)]
[PATCH] vDSO hash-style fix
The latest toolchains can produce a new ELF section in DSOs and
dynamically-linked executables. The new section ".gnu.hash" replaces
".hash", and allows for more efficient runtime symbol lookups by the
dynamic linker. The new ld option --hash-style={sysv|gnu|both} controls
whether to produce the old ".hash", the new ".gnu.hash", or both. In some
new systems such as Fedora Core 6, gcc by default passes --hash-style=gnu
to the linker, so that a standard invocation of "gcc -shared" results in
producing a DSO with only ".gnu.hash". The new ".gnu.hash" sections need
to be dealt with the same way as ".hash" sections in all respects; only the
dynamic linker cares about their contents. To work with older dynamic
linkers (i.e. preexisting releases of glibc), a binary must have the old
".hash" section. The --hash-style=both option produces binaries that a new
dynamic linker can use more efficiently, but an old dynamic linker can
still handle.
The new section runs afoul of the custom linker scripts used to build vDSO
images for the kernel. On ia64, the failure mode for this is a boot-time
panic because the vDSO's PT_IA_64_UNWIND segment winds up ill-formed.
This patch addresses the problem in two ways.
First, it mentions ".gnu.hash" in all the linker scripts alongside ".hash".
This produces correct vDSO images with --hash-style=sysv (or old tools),
with --hash-style=gnu, or with --hash-style=both.
Second, it passes the --hash-style=sysv option when building the vDSO
images, so that ".gnu.hash" is not actually produced. This is the most
conservative choice for compatibility with any old userland. There is some
concern that some ancient glibc builds (though not any known old production
system) might choke on --hash-style=both binaries. The optimizations
provided by the new style of hash section do not really matter for a DSO
with a tiny number of symbols, as the vDSO has. If someone wants to use
=gnu or =both for their vDSO builds and worry less about that
compatibility, just change the option and the linker script changes will
make any choice work fine.
Signed-off-by: Roland McGrath <roland@redhat.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Andi Kleen <ak@muc.de> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Steven Rostedt [Sun, 30 Jul 2006 10:04:03 +0000 (03:04 -0700)]
[PATCH] reference rt-mutex-design in rtmutex.c
In order to prevent Doc Rot, this patch adds a reference to the design
document for rtmutex.c in rtmutex.c. So when someone needs to update or
change the design of that file they will know that a document actually
exists that explains the design (helping them change it), and hopefully
that they will update the document if they too change the design.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Tim Chen [Sun, 30 Jul 2006 10:04:02 +0000 (03:04 -0700)]
[PATCH] Reducing local_bh_enable/disable overhead in irqtrace
The recent changes from irqtrace feature has added overheads to
local_bh_disable and local_bh_enable that reduces UDP performance across
x86_64 and IA64, even though IA64 does not support the irqtrace feature.
Patch in question is
Prior to this patch, local_bh_disable was a short macro. Now it is a
function which calls __local_bh_disable with added irq flags save and
restore. The irq flags save and restore were also added to
local_bh_enable, probably for injecting the trace irqs code.
This overhead is on the generic code path across all architectures. On a
IA_64 test machine (Itanium-2 1.6 GHz) running a benchmark like netperf's
UDP streaming test, the added overhead results in a drop of 3% in
throughput, as udp_sendmsg calls the local_bh_enable/disable several times.
Other workloads that have heavy usages of local_bh_enable/disable could
also be affected. The patch ideally should not have affected IA-64
performance as it does not have IRQ tracing support. A significant portion
of the overhead is in the added irq flags save and restore, which I think
is not needed if IRQ tracing is unused. A suggested patch is attached
below that recovers the lost performance. However, the "ifdef"s in the
patch are a bit ugly.
Signed-off-by: Tim Chen <tim.c.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] ufs: remove incorrect unlock_kernel from failure path in ufs_symlink()
ufs_symlink, in one of its error paths, calls unlock_kernel without ever
having called lock_kernel(); fix this by creating and jumping to a new
label out_notlocked rather than the out label used after calling
lock_kernel().
[PATCH] efs: add entry for EFS filesystem to MAINTAINERS as Orphan
The EFS filesystem does not have an entry in MAINTAINERS; add one, giving
the EFS filesystem and listing the status as Orphan, per the note on that
page saying "I'm no longer actively maintaining EFS".
[PATCH] efs: Remove incorrect unlock_kernel from failure path in efs_symlink_readpage()
If efs_symlink_readpage hits the -ENAMETOOLONG error path, it will call
unlock_kernel without ever having called lock_kernel(); fix this by
creating and jumping to a new label fail_notlocked rather than the fail
label used after calling lock_kernel().
[PATCH] Remove incorrect unlock_kernel from allocation failure path in coda_open()
Commit 398c53a757702e1e3a7a2c24860c7ad26acb53ed (in the historical GIT
tree) moved the lock_kernel() in coda_open after the allocation of a
coda_file_info struct, but left an unlock_kernel() in the allocation
failure error path; remove it.
Signed-off-by: Josh Triplett <josh@freedesktop.org> Acked-by: Jan Harkes <jaharkes@cs.cmu.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ondrej Zary [Sun, 30 Jul 2006 10:03:55 +0000 (03:03 -0700)]
[PATCH] Fix swsusp with PNP BIOS
swsusp is unable to suspend my machine (DTK FortisPro TOP-5A notebook) with
kernel 2.6.17.5 because it's unable to suspend PNP device 00:16 (mouse).
The problem is in PNP BIOS. pnp_bus_suspend() calls pnp_stop_dev() for the
device if the device can be disabled according to pnp_can_disable(). The
problem is that pnpbios_disable_resources() returns -EPERM if the device is
not dynamic (!pnpbios_is_dynamic()) but insert_device() happily sets
PNP_DISABLE capability/flag even if the device is not dynamic. So we try
to disable non-dynamic devices which will fail. This patch prevents
insert_device() from setting PNP_DISABLE if the device is not dynamic and
fixes suspend on my system.
Signed-off-by: Ondrej Zary <linux@rainbow-software.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
where inotify_umount_inodes() takes the
mutex_lock(&inode->inotify_mutex);
The BC relationship comes directly from inotify_find_update_watch():
s32 inotify_find_update_watch(struct inotify_handle *ih, struct inode *inode,
u32 mask)
{
...
mutex_lock(&inode->inotify_mutex);
mutex_lock(&ih->mutex);
The CD relationship comes from inotify_rm_wd:
inotify_rm_wd does
mutex_lock(&inode->inotify_mutex);
mutex_lock(&ih->mutex)
and then calls inotify_remove_watch_locked() which calls
notify_dev_queue_event() which does
mutex_lock(&dev->ev_mutex);
(this strictly is a BCD relationship)
The DA relationship comes from the most interesting part:
inotify_dev_queue_event schedules a kernel_event which does a
kmem_cache_alloc( , GFP_KERNEL) which may try to shrink slabs, including
the inode cache .. which then takes iprune_mutex.
And voila, there is an AB, a BC, a CD relationship (even a direct BCD),
and also now a DA relationship -> a circular type AB-BA deadlock but
involving 4 locks.
The solution is simple: kernel_event() is NOT allowed to use GFP_KERNEL,
but must use GFP_NOFS to not cause recursion into the VFS.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Robert Love <rml@novell.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Steven Rostedt [Sun, 30 Jul 2006 10:03:53 +0000 (03:03 -0700)]
[PATCH] Add linux-mm mailing list for memory management in MAINTAINERS file
Since I didn't know about the linux-mm mailing list until I spammed all
those that had their names anywhere in the mm directory, I'm sending this
patch to add the linux-mm mailing list to the MAINTAINERS file.
Also, since mm is so broad, it doesn't have a single person to maintain it,
and thus no maintainer is listed. I also left the status as Maintained,
since it obviously is.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Olaf Hering [Sun, 30 Jul 2006 10:03:52 +0000 (03:03 -0700)]
[PATCH] hide onboard graphics drivers on G5
Hide the video drivers for onboard graphics found in early PCI PowerMacs in
Apple G5 config files.
drivers/built-in.o: In function `.platinumfb_probe':
platinumfb.c:(.text+0x377a0): undefined reference to `.nvram_read_byte'
platinumfb.c:(.text+0x37830): undefined reference to `.nvram_read_byte'
drivers/built-in.o: In function `.control_init':
controlfb.c:(.init.text+0x1938): undefined reference to `.nvram_read_byte'
controlfb.c:(.init.text+0x1968): undefined reference to `.nvram_read_byte'
drivers/built-in.o: In function `.valkyriefb_init':
(.init.text+0x2300): undefined reference to `.nvram_read_byte'
drivers/built-in.o:(.init.text+0x239c): more undefined references to `.nvram_read_byte' follow
Signed-off-by: Olaf Hering <olh@suse.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Daniel Ritz [Sun, 30 Jul 2006 10:03:49 +0000 (03:03 -0700)]
[PATCH] pcmcia: fix ioctl GET_CONFIGURATION_INFO for pcmcia_cards
Values displayed when by cardctl config are horribly wrong for 16bit cards.
this fixes it up by not using memcpy() since source and target struct are
very different.
Signed-off-by: Daniel Ritz <daniel.ritz@gmx.ch> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Daniel Ritz [Sun, 30 Jul 2006 10:03:47 +0000 (03:03 -0700)]
[PATCH] pcmcia: fix ioctl for GET_STATUS and GET_CONFIGURATION_INFO
the p_dev == NULL checks are wrong. the called functions handle a NULL
p_dev on their own. w/o this patch output of cardcctl status and cardctl
config is broken for cardbus cards or when the slot is empty.
Signed-off-by: Daniel Ritz <daniel.ritz@gmx.ch> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
WARNING: drivers/video/console/mdacon.o - Section mismatch: reference to .init.text: from .text between 'mdacon_startup' (at offset 0x123) and 'mdacon_init'
WARNING: drivers/video/console/mdacon.o - Section mismatch: reference to .init.text: from .text between 'mdacon_startup' (at offset 0x18b) and 'mdacon_init'
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The SGI IOC4 IDE device always shares an interrupt with other devices which
are part of IOC4. As such, IDEPCI_SHARE_IRQ should always be enabled when
BLK_DEV_SGIIOC4 is enabled.
[PATCH] Add DocBook documentation for workqueue functions
kernel/workqueue.c was omitted from generating kernel documentation. This
adds a new section "Workqueues and Kevents" and adds documentation for some
of the functions.
Some functions in this file already had DocBook-style comments, now they
finally become visible.
Randy Dunlap [Sun, 30 Jul 2006 10:03:40 +0000 (03:03 -0700)]
[PATCH] fix kernel-api doc for kernel/resource.c
insert_resource() was unexported, so kernel-doc needs to be told to search
kernel/resource.c for internal functions instead of exported functions so that
it won't report an error.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jim Houston [Sun, 30 Jul 2006 10:03:39 +0000 (03:03 -0700)]
[PATCH] fix cond_resched() fix
In cond_resched_lock() it calls __resched_legal() before dropping the spin
lock. __resched_legal() will always finds the preempt_count non-zero and
will prevent the call to __cond_resched().
The attached patch adds a parameter to __resched_legal() with the expected
preempt_count value.
So, due to underparenthesisation, this INDEX(i+1) is now a ... (TVR_BITS + i
+ 1 * TVN_BITS)) ...
So this bugfix changes behaviour. It worked before by sheer luck:
"If i was anything but 0, it was broken. But this was only used by
s390 and arm. Since it was for the next interrupt, could that next
interrupt be a problem (going into the second cascade)? But it was
probably seldom wrong. That is, this would fail if the next
interrupt was in the second cascade, and was wrapped. Which may
never of happened. Also if it did happen, it would have just missed
the interrupt.
If an interrupt was missed, and no one was there to miss it, was it
really missed :-)"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] cpu hotplug: replace __devinit* with __cpuinit* for cpu notifications
Few of the callback functions and notifier blocks that are associated with cpu
notifications incorrectly have __devinit and __devinitdata. They should be
__cpuinit and __cpuinitdata instead.
It makes no functional difference but wastes text area when CONFIG_HOTPLUG is
enabled and CONFIG_HOTPLUG_CPU is not.
This patch is part of an effort to unify the panic_on_oops behaviour across
all architectures that implement it.
It was pointed out to me by Andi Kleen that if an oops has occured in
interrupt context, then calling sleep() in the oops path will only cause a
panic, and that it would be really better for it not to be in the path at
all.
This patch removes the ssleep() call and reworks the console message
accordinly. I have a slght concern that the resulting console message is
too long, feedback welcome.
For powerpc it also unifies the 32bit and 64bit behaviour.
Fror x86_64, this patch only updates the console message, as ssleep() is
already not present.
Signed-off-by: Horms <horms@verge.net.au> Acked-by: Paul Mackerras <paulus@samba.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Michal Feix [Sun, 30 Jul 2006 10:03:32 +0000 (03:03 -0700)]
[PATCH] nbd: Abort request on data reception failure
When reading from nbd device, we need to receive all the data after
receiving reply packet from the server - otherwise such request will never
be ended.
If socket is closed right after accepting reply control packet and in the
middle of waiting for read data, nbd_read_stat() returns NULL and
nbd_end_request() is not called.
This patch fixes it.
Signed-off-by: Michal Feix <michal@feix.cz> Acked-by: Paul Clements <paul.clements@steeleye.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Michal Feix [Sun, 30 Jul 2006 10:03:31 +0000 (03:03 -0700)]
[PATCH] nbd: Check magic before doing anything else
We should check magic sequence in reply packet before trying to find
request with it's request handle. This also solves the problem with
"Unexpected reply" message beeing logged, when packet with invalid magic is
received.
Signed-off-by: Michal Feix <michal@feix.cz> Acked-by: Paul Clements <paul.clements@steeleye.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The pkt_*_dev functions operate on not-this-blockdevice, and that is
sufficiently checked at setup time. As a result there is a natural
hierarchy, which needs nesting annotations
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Peter Osterlund <petero2@telia.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Michal Schmidt [Sun, 30 Jul 2006 10:03:29 +0000 (03:03 -0700)]
[PATCH] IDE: Touch NMI watchdog during resume from STR
When resuming from suspend-to-RAM, the NMI watchdog detects a lockup in
ide_wait_not_busy. Here's a screenshot of the trace taken by a digital
camera: http://www.uamt.feec.vutbr.cz/rizeni/pom/DSC03510-2.JPG
Let's touch the NMI watchdog in ide_wait_not_busy. The system then resumes
correctly from STR.
[akpm@osdl.org: modular build fix] Signed-off-by: Michal Schmidt <xschmi00@stud.feec.vutbr.cz> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
bibo, mao [Sun, 30 Jul 2006 10:03:26 +0000 (03:03 -0700)]
[PATCH] IA64: kprobe invalidate icache of jump buffer
Kprobe inserts breakpoint instruction in probepoint and then jumps to
instruction slot when breakpoint is hit, the instruction slot icache must
be consistent with dcache. Here is the patch which invalidates instruction
slot icache area.
Without this patch, in some machines there will be fault when executing
instruction slot where icache content is inconsistent with dcache.
Signed-off-by: bibo,mao <bibo.mao@intel.com> Acked-by: "Luck, Tony" <tony.luck@intel.com> Acked-by: Keshavamurthy Anil S <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] kprobe-booster: disable in preemptible kernel
The kprobe-booster's safety check against preemption does not work well
now, because the preemption count has been modified by read_rcu_lock() in
atomic_notifier_call_chain() before we check it. So, I'd like to prevent
boosting kprobe temporarily if the kernel is preemptable.
Now we are searching for the good solution.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>