The storage probe feature of the PS3 hypervisor returns device IDs. Add
the corresponding repository routine ps3_repository_find_device_by_id()
which can be used to retrieve the device info from the repository.
Change the PS3 bus_id and dev_id from type unsigned int to u64. These
IDs are 64-bit in the repository, and the special storage notification
device has a device ID of ULONG_MAX.
Michael Neuling [Fri, 18 Jan 2008 04:50:30 +0000 (15:50 +1100)]
[POWERPC] kdump shutdown hook support
This adds hooks into the default_machine_crash_shutdown so drivers can
register a function to be run in the first kernel before we hand off
to the second kernel. This should only be used in exceptional
circumstances, like where the device can't be reset in the second
kernel alone (as is the case with eHEA). To emphasize this, the
number of handles allowed to be registered is currently #def to 1.
This uses the setjmp/longjmp code around the call out to the
registered hooks, so any bogus exceptions we encounter will hopefully
be recoverable.
Tested with bogus data and instruction exceptions.
Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Michael Neuling [Fri, 18 Jan 2008 04:50:30 +0000 (15:50 +1100)]
[POWERPC] Make setjmp/longjmp code usable outside of xmon
This makes the setjmp/longjmp code used by xmon, generically available
to other code. It also removes the requirement for debugger hooks to
be only called on 0x300 (data storage) exception.
Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Olof Johansson [Fri, 28 Dec 2007 04:11:09 +0000 (15:11 +1100)]
[POWERPC] Make smp_send_stop() handle panic and xmon reboot
smp_send_stop() will send an IPI to all other cpus to shut them down.
However, for the case of xmon-based reboots (as well as potentially some
panics), the other cpus are (or might be) spinning with interrupts off,
and won't take the IPI.
Current code will drop us into the debugger when the IPI fails, which
means we're in an infinite loop that we can't get out of without an
external reset of some sort.
Instead, make the smp_send_stop() IPI call path just print the warning
about being unable to send IPIs, but make it return so the rest of the
shutdown sequence can continue. It's not perfect, but the lesser of
two evils.
Also move the call_lock handling outside of smp_call_function_map so we
can avoid deadlocks in smp_send_stop().
Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>
Kumar Gala [Wed, 23 Jan 2008 11:53:47 +0000 (05:53 -0600)]
[RAPIDIO] Fix compile error and warning
drivers/rapidio/rio.c: In function 'rio_get_asm':
drivers/rapidio/rio.c:413: error: implicit declaration of function 'in_interrupt'
drivers/rapidio/rio.c: In function 'rio_init_mports':
drivers/rapidio/rio.c:480: warning: format '%8.8lx' expects type 'long unsigned int', but argument 2 has type 'resource_size_t'
drivers/rapidio/rio.c:480: warning: format '%8.8lx' expects type 'long unsigned int', but argument 3 has type 'resource_size_t'
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Dale Farnsworth [Thu, 22 Nov 2007 15:46:20 +0000 (08:46 -0700)]
[POWERPC] 85xx: Respect KERNELBASE, PAGE_OFFSET, and PHYSICAL_START on e500
The e500 MMU init code previously assumed KERNELBASE always equaled
PAGE_OFFSET and PHYSICAL_START was 0. This is useful for kdump
support as well as asymetric multicore.
For the initial kdump support the secondary kernel will run at 32M
but need access to all of memory so we bump the initial TLB up to
64M. This also matches with the forth coming ePAPR spec.
Signed-off-by: Dale Farnsworth <dale@farnsworth.org> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Scott Wood [Mon, 14 Jan 2008 16:29:35 +0000 (10:29 -0600)]
[POWERPC] fsl_soc: Fix get_immrbase() to use ranges, rather than reg.
Don't depend on the reg property as a way to determine the base
of the immr space. The reg property might be defined differently for
different SoC families.
Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Timur Tabi [Wed, 9 Jan 2008 23:35:05 +0000 (17:35 -0600)]
[POWERPC] QE: Add support for Freescale QUICCEngine UART
Add support for UART serial ports using a Freescale QUICCEngine. Update
booting-without-of.txt to define new properties for a QE UART node. Update
the MPC8323E-MDS device tree to add UCC5 as a UART. Update the QE library
to support slow UCC devices and modules.
Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Timur Tabi [Tue, 8 Jan 2008 16:30:58 +0000 (10:30 -0600)]
[POWERPC] QE: Add ability to upload QE firmware
Define the layout of a binary blob that contains a QE firmware and instructions
on how to upload it. Add function qe_upload_firmware() to parse the blob
and perform the actual upload. Fully define 'struct rsp' in immap_qe.h to
include the actual RISC Special Registers. Added description of a new
QE firmware node to booting-without-of.txt.
Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Vitaly Bordug [Thu, 6 Dec 2007 22:51:22 +0000 (01:51 +0300)]
phy/fixed.c: rework to not duplicate PHY layer functionality
With that patch fixed.c now fully emulates MDIO bus, thus no need
to duplicate PHY layer functionality. That, in turn, drastically
simplifies the code, and drops down line count.
As an additional bonus, now there is no need to register MDIO bus
for each PHY, all emulated PHYs placed on the platform fixed MDIO bus.
There is also no more need to pre-allocate PHYs via .config option,
this is all now handled dynamically.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Vitaly Bordug <vitb@kernel.crashing.org> Acked-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Kumar Gala [Mon, 14 Jan 2008 23:02:19 +0000 (17:02 -0600)]
[POWERPC] FSL: Rework PCI/PCIe support for 85xx/86xx
The current PCI code for Freescale 85xx/86xx was treating the virtual
P2P PCIe bridge as a transparent bridge. Rather than doing that fixup
the virtual P2P bridge by copying the resources from the PHB.
Also, fixup a bit of the code for dealing with resource_size_t being
64-bits and how we set ATMU registers for >4G.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Kumar Gala [Mon, 14 Jan 2008 15:41:36 +0000 (09:41 -0600)]
[POWERPC] Fixup transparent P2P resources
For transparent P2P bridges the first 3 resources may get set from based on
BAR registers and need to get fixed up. Where as the remainder come from the
parent bus and have already been fixed up.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Kumar Gala [Sat, 12 Jan 2008 23:23:26 +0000 (17:23 -0600)]
[POWERPC] Ensure we only handle PowerMac PCI bus fixup for memory resources
The fixup code that handles the case for PowerMac's that leave bridge
windows open over an inaccessible region should only be applied to
memory resources (IORESOURCE_MEM). If not we can get it trying to fixup
IORESOURCE_IO on some systems since the other conditions that are used to
detect the case can easily match for IORESOURCE_IO.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Kumar Gala [Wed, 9 Jan 2008 17:27:23 +0000 (11:27 -0600)]
[POWERPC] Fix handling of memreserve if the range lands in highmem
There were several issues if a memreserve range existed and happened
to be in highmem:
* The bootmem allocator is only aware of lowmem so calling
reserve_bootmem with a highmem address would cause a BUG_ON
* All highmem pages were provided to the buddy allocator
Added a lmb_is_reserved() api that we now use to determine if a highem
page should continue to be PageReserved or provided to the buddy
allocator.
Also, we incorrectly reported the amount of pages reserved since all
highmem pages are initally marked reserved and we clear the
PageReserved flag as we "free" up the highmem pages.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Paul Mackerras [Wed, 23 Jan 2008 21:35:13 +0000 (08:35 +1100)]
[POWERPC] Provide a way to protect 4k subpages when using 64k pages
Using 64k pages on 64-bit PowerPC systems makes life difficult for
emulators that are trying to emulate an ISA, such as x86, which use a
smaller page size, since the emulator can no longer use the MMU and
the normal system calls for controlling page protections. Of course,
the emulator can emulate the MMU by checking and possibly remapping
the address for each memory access in software, but that is pretty
slow.
This provides a facility for such programs to control the access
permissions on individual 4k sub-pages of 64k pages. The idea is
that the emulator supplies an array of protection masks to apply to a
specified range of virtual addresses. These masks are applied at the
level where hardware PTEs are inserted into the hardware page table
based on the Linux PTEs, so the Linux PTEs are not affected. Note
that this new mechanism does not allow any access that would otherwise
be prohibited; it can only prohibit accesses that would otherwise be
allowed. This new facility is only available on 64-bit PowerPC and
only when the kernel is configured for 64k pages.
The masks are supplied using a new subpage_prot system call, which
takes a starting virtual address and length, and a pointer to an array
of protection masks in memory. The array has a 32-bit word per 64k
page to be protected; each 32-bit word consists of 16 2-bit fields,
for which 0 allows any access (that is otherwise allowed), 1 prevents
write accesses, and 2 or 3 prevent any access.
Implicit in this is that the regions of the address space that are
protected are switched to use 4k hardware pages rather than 64k
hardware pages (on machines with hardware 64k page support). In fact
the whole process is switched to use 4k hardware pages when the
subpage_prot system call is used, but this could be improved in future
to switch only the affected segments.
The subpage protection bits are stored in a 3 level tree akin to the
page table tree. The top level of this tree is stored in a structure
that is appended to the top level of the page table tree, i.e., the
pgd array. Since it will often only be 32-bit addresses (below 4GB)
that are protected, the pointers to the first four bottom level pages
are also stored in this structure (each bottom level page contains the
protection bits for 1GB of address space), so the protection bits for
addresses below 4GB can be accessed with one fewer loads than those
for higher addresses.
Jordan Crouse [Tue, 22 Jan 2008 22:30:16 +0000 (23:30 +0100)]
x86: GEODE fix a race condition in the MFGPT timer tick
When we set the MFGPT timer tick, there is a chance that we'll
immediately assert an event. If for some reason the IRQ routing
for this clock has been setup for some other purpose, then we
could end up firing an interrupt into the SMM handler or worse.
This rearranges the timer tick init function to initalize the handler
before we set up the MFGPT clock to make sure that even if we get
an event, it will go to the handler.
Furthermore, in the handler we need to make sure that we clear the
event, even if the timer isn't running.
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Tested-by: Arnd Hannemann <hannemann@i4.informatik.rwth-aachen.de>
Fix typo in arch/powerpc/boot/flatdevtree_env.h.
There is no Documentation/networking/ixgbe.txt.
README.cycladesZ is now in Documentation/.
wavelan.p.h is now in drivers/net/wireless/.
HFS.txt is now Documentation/filesystems/hfs.txt.
OSS-files are now in sound/oss/.
Signed-off-by: Johann Felix Soden <johfel@users.sourceforge.net> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Fries [Tue, 22 Jan 2008 11:31:39 +0000 (03:31 -0800)]
W1: w1_therm.c is flagging 0C etc as invalid
The extra rom[0] check is flagging valid temperatures as invalid when
there is already a CRC data transmission check.
w1_therm_read_bin()
if (rom[8] == crc && rom[0])
verdict = 1;
Requiring rom[0] to be non-zero will flag as invalid temperature
conversions when the low byte is zero, specifically the temperatures 0C,
16C, 32C, 48C, -16C, -32C, and -48C.
The CRC check is produced on the device for the previous 8 bytes and is
required to ensure the data integrity in transmission. I don't see why the
extra check for rom[0] being non-zero is in there. Evgeniy Polyakov didn't
know either. Just for a check I unplugged the sensor, executed a
temperature conversion, and read the results. The read was all ff's, which
also failed the CRC, so it doesn't need to protect against a disconnected
sensor.
I have more extensive patches in the work, but these two trivial ones will
do for today. I would like to hear from people who use the ds2490 USB to
one wire dongle. 1 if you would be willing to test the patches as I
currently only have the one sensor on a short parisite powered wire, 2 if
there is any cheap sources for the ds2490.
Signed-off-by: David Fries <david@fries.net> Acked-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Correct the decoding of negative C temperatures. The code did a binary OR
of two bytes to make a 16 bit value, but assignd it to an integer. This
caused the value to not be sign extended and to loose that it was a
negative number in the assignment.
Before the patch (in my freezer),
w1_slave
ed fe 4b 46 7f ff 03 10 e4 : crc=e4 YES
ed fe 4b 46 7f ff 03 10 e4 t=4078
With the patch,
e3 fe 4b 46 7f ff 0d 10 81 : crc=81 YES
e3 fe 4b 46 7f ff 0d 10 81 t=-17
Signed-off-by: David Fries <david@fries.net> Acked-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bjorn Helgaas [Tue, 22 Jan 2008 12:21:03 +0000 (07:21 -0500)]
hwmon: (it87) request only Environment Controller ports
The IT8705F and related parts are Super I/O controllers that contain
many separate devices.
Some BIOSes describe IT8705F I/O port usage under a motherboard device
(PNP0C02) with overlapping regions, e.g., 0x290-0x29f and 0x290-0x294.
The it87 driver supports only the Environment Controller, which requires
only two ISA ports, but it used to request an eight-port range. If that
range exceeds a range reported by the BIOS, as 0x290-0x297 would, the
request fails, and the it87 driver cannot claim the device.
This patch makes the it87 driver request only the two ports used for the
Environment Controller device.
Systems where this problem has been reported:
Gigabyte GA-K8N Ultra 9
Gigabyte M56S-S3
Gigabyte GA-965G-DS3
The patch above increases the number of PNP port resources we support.
Prior to this patch, we ignored some port resources, which masked the
it87 problem.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>
It tried to fix long standing bugzilla entries, but the solution was
reported to break other systems. The reporter of
http://bugzilla.kernel.org/show_bug.cgi?id=9791
tracked it down to this commit and confirmed that reverting the patch
restores the correct behaviour. It's too late in the release cycle to
find a better solution than reverting the commit to avoid regressions.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Tue, 22 Jan 2008 03:40:05 +0000 (19:40 -0800)]
Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
tc35815: Use irq number for tc35815-mac platform device id
[MIPS] Malta: Fix reading the PCI clock frequency on big-endian
[MIPS] SMTC: Fix build error.
Andrew G. Morgan [Tue, 22 Jan 2008 01:18:30 +0000 (17:18 -0800)]
Fix filesystem capability support
In linux-2.6.24-rc1, security/commoncap.c:cap_inh_is_capped() was
introduced. It has the exact reverse of its intended behavior. This
led to an unintended privilege esculation involving a process'
inheritable capability set.
To be exposed to this bug, you need to have Filesystem Capabilities
enabled and in use. That is:
- CONFIG_SECURITY_FILE_CAPABILITIES must be defined for the buggy code
to be compiled in.
- You also need to have files on your system marked with fI bits raised.
Signed-off-by: Andrew G. Morgan <morgan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@akpm@linux-foundation.org>
Stefan Schmidt [Tue, 22 Jan 2008 01:18:27 +0000 (17:18 -0800)]
s3c2410_fb: fix line length calculation
Fix line length calculation. var->width is the size of the display in mm. We
like to use the pixel size.
Without this fix, dynamic (fbset) based resolution and depths changes with
s3c2410_fb don't work at all.
Spotted by john cass <johnpcass@yahoo.com>
Signed-off-by: Stefan Schmidt <stefan@openmoko.org> Signed-off-by: Harald Welte <laforge@openmoko.org> Acked-by: Ben Dooks <ben-linux@fluff.org> Acked-by: Arnaud Patard <arnaud.patard@rtp-net.org> Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl> Cc: "Antonino A. Daplas" <adaplas@pol.net> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@akpm@linux-foundation.org>
Randy Dunlap [Tue, 22 Jan 2008 01:18:25 +0000 (17:18 -0800)]
timer: fix section mismatch
The caller is __cpuinit.
Also, this code block and its caller are inside #ifdef CONFIG_HOTPLUG_CPU
blocks, so this code should reflect that config symbol's usage.
WARNING: vmlinux.o(.text+0x4252f): Section mismatch: reference to .init.text: (between 'timer_cpu_notify' and 'msleep')
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@akpm@linux-foundation.org>
Alan Cox [Tue, 22 Jan 2008 01:18:24 +0000 (17:18 -0800)]
keyspan: fix oops
If we get a data URB back from the hardware after we have put the tty to
bed we go kaboom. Fortunately all we need to do is process the URB without
trying to ram its contents down the throat of an ex-tty.
Signed-off-by: Alan Cox <alan@redhat.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@akpm@linux-foundation.org>
Atsushi Nemoto [Fri, 18 Jan 2008 16:15:52 +0000 (01:15 +0900)]
tc35815: Use irq number for tc35815-mac platform device id
The tc35815-mac platform device used a pci bus number and a devfn to
identify its target device, but the pci bus number may vary if some
bus-bridges are found. Use irq number which is be unique for embedded
controllers.
Dmitri Vorobiev [Mon, 14 Jan 2008 21:27:46 +0000 (00:27 +0300)]
[MIPS] Malta: Fix reading the PCI clock frequency on big-endian
The JMPRS register on Malta boards keeps a 32-bit CPU-endian
value. The readw() function assumes that the value it reads is a
little-endian 16-bit number. Therefore, using readw() to obtain
the value of the JMPRS register is a mistake. This error leads
to incorrect reading of the PCI clock frequency on big-endian
during board start-up.
Change readw() to __raw_readl().
This was tested by injecting a call to printk() and verifying
that the value of the jmpr variable was consistent with current
setting of the JP4 "PCI CLK" jumper.
Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@gmail.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Grant Likely [Mon, 21 Jan 2008 18:22:22 +0000 (11:22 -0700)]
[POWERPC] mpc5200: merge defconfigs for all mpc5200 boards
There is no reason to have separate defconfigs for each mpc5200 board.
Instead, here is a common defconfig that can be used for all supported
platforms.
Merging the defconfigs means there are fewer configuration to test when
compile testing all of arch/powerpc and should make support easier.
In that patch, David L added a icmp_out_count() in
ip_push_pending_frames(), remove icmp_out_count() from
icmp_reply(). But he forgot to remove icmp_out_count() from
icmp_send() too. Since icmp_send and icmp_reply will call
icmp_push_reply, which will call ip_push_pending_frames, a duplicated
increment happened in icmp_send.
This patch remove the icmp_out_count from icmp_send too.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Chen [Mon, 21 Jan 2008 11:05:43 +0000 (03:05 -0800)]
[IPV6]: RFC 2011 compatibility broken
The snmp6 entry name was changed, and it broke compatibility
to RFC 2011.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Chen [Mon, 21 Jan 2008 11:04:47 +0000 (03:04 -0800)]
[IPV6]: ICMP6_MIB_OUTMSGS increment duplicated
icmpv6_send() calls ip6_push_pending_frames() indirectly.
Both ip6_push_pending_frames() and icmpv6_send() increment
counter ICMP6_MIB_OUTMSGS.
This patch remove the increment from icmpv6_send.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Mon, 21 Jan 2008 01:25:14 +0000 (17:25 -0800)]
[NET]: rtnl_link: fix use-after-free
When unregistering the rtnl_link_ops, all existing devices using
the ops are destroyed. With nested devices this may lead to a
use-after-free despite the use of for_each_netdev_safe() in case
the upper device is next in the device list and is destroyed
by the NETDEV_UNREGISTER notifier.
The easy fix is to restart scanning the device list after removing
a device. Alternatively we could add new devices to the front of
the list to avoid having dependant devices follow the device they
depend on. A third option would be to only restart scanning if
dev->iflink of the next device matches dev->ifindex of the current
one. For now this seems like the safest solution.
With this patch, the veth rtnl_link_ops unregistration can use
rtnl_link_unregister() directly since it now also handles destruction
of multiple devices at once.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Jesper Juhl [Mon, 21 Jan 2008 00:58:04 +0000 (16:58 -0800)]
[IrDA]: af_irda memory leak fixes
Here goes an IrDA patch against your latest net-2.6 tree.
This patch fixes some af_irda memory leaks. It also checks for
irias_new_obect() return value.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 21 Jan 2008 00:39:03 +0000 (16:39 -0800)]
[NEIGH]: Revert 'Fix race between neigh_parms_release and neightbl_fill_parms'
Commit 9cd40029423701c376391da59d2c6469672b4bed (Fix race between
neigh_parms_release and neightbl_fill_parms) introduced device
reference counting regressions for several people, see:
http://bugzilla.kernel.org/show_bug.cgi?id=9778
for example.
Signed-off-by: David S. Miller <davem@davemloft.net>
When packets are flood-forwarded to multiple output devices, the
bridge-netfilter code reuses skb->nf_bridge for each clone to store
the bridge port. When queueing packets using NFQUEUE netfilter takes
a reference to skb->nf_bridge->physoutdev, which is overwritten
when the packet is forwarded to the second port. This causes
refcount unterflows for the first device and refcount leaks for all
others. Additionally this provides incorrect data to the iptables
physdev match.
Unshare skb->nf_bridge by copying it if it is shared before assigning
the physoutdev device.
Reported, tested and based on initial patch by
Jan Christoph Nordholz <hesso@pool.math.tu-berlin.de>.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
[IPV6] ROUTE: Make sending algorithm more friendly with RFC 4861.
We omit (or delay) sending NSes for known-to-unreachable routers (in
NUD_FAILED state) according to RFC 4191 (Default Router Preferences
and More-Specific Routes). But this is not fully compatible with RFC
4861 (Neighbor Discovery Protocol for IPv6), which does not remember
unreachability of neighbors.
So, let's avoid mixing sending algorithm of RFC 4191 and that of RFC
4861, and make the algorithm more friendly with RFC 4861 if RFC 4191
is disabled.
Issue was found by IPv6 Ready Logo Core Self_Test 1.5.0b2 (by TAHI
Project), and has been tracked down by Mitsuru Chinen
<mitch@linux.vnet.ibm.com>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 18 Jan 2008 12:30:21 +0000 (04:30 -0800)]
[IPV4] FIB_HASH : Avoid unecessary loop in fn_hash_dump_zone()
I noticed "ip route list" was slower than "cat /proc/net/route" on a
machine with a full Internet routing table (214392 entries : Special
thanks to Robert ;) )
David S. Miller [Fri, 18 Jan 2008 12:21:39 +0000 (04:21 -0800)]
[NET]: Fix interrupt semaphore corruption in Intel drivers.
Several of the Intel ethernet drivers keep an atomic counter used to
manage when to actually hit the hardware with a disable or an enable.
The way the net_rx_work() breakout logic works during a pending
napi_disable() is that it simply unschedules the poll even if it
still has work.
This can potentially leave interrupts disabled, but that is OK
because all of the drivers are about to disable interrupts
anyways in all such code paths that do a napi_disable().
Unfortunately, this trips up the semaphore used here in the Intel
drivers. If you hit this case, when you try to bring the interface
back up it won't enable interrupts. A reload of the driver module
fixes it of course.
So what we do is make sure all the sequences now go: