git.proxmox.com Git - mirror_ubuntu-focal-kernel.git/log

net: dcb: add CEE notify calls

This adds code to trigger CEE events when an APP change or setall
command is made from user space. This simplifies user space code
significantly by creating a single interface to listen on that
works with both firmware and userland agents.

And if we end up with multiple agents this keeps every thing in
sync userland agents, firmware agents, and kernel notifier consumers.

For an example agent that listens for these events see:

https://github.com/jrfastab/cgdcbxd

cgdcbxd is a daemon used to monitor DCB netlink events and manage
the net_prio control group sub-system.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Shmulik Ravid <shmulikr@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: remove inline instances from C source files.

Untie gcc's hands and let it do what it wants within the
individual source files. There are two files, node.c and
port.c -- only the latter effectively changes (gcc-4.5.2).
Objdump shows gcc deciding to not inline port_peernode().

Suggested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

af_netlink: drop_monitor/dropwatch friendly

Need to consume_skb() instead of kfree_skb() in netlink_dump() and
netlink_unicast_kernel() to avoid false dropwatch positives.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

af_netlink: cleanups

netlink_destroy_callback() move to avoid forward reference

CodingStyle cleanups

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: skb_can_coalesce returns a boolean

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: tcp_try_coalesce returns a boolean

This clarifies code intention, as suggested by David.

Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: make spd_fill_page() linear argument a bool

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Fix merge between commit 3adadc08cc1e ("net ax25: Reorder ax25_exit to
remove races") and commit 0ca7a4c87d27 ("net ax25: Simplify and
cleanup the ax25 sysctl handling")

The former moved around the sysctl register/unregister calls, the
later simply removed them.

With help from Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>

net: Use bool and remove inline in skb_splice_bits() code.

Signed-off-by: David S. Miller <davem@davemloft.net>

net: speedup skb_splice_bits()

Commit 35f3d14db (pipe: add support for shrinking and growing pipes)
added a slowdown for splice(socket -> pipe), as we might grow the spd
used in skb_splice_bits() for each skb we process in splice() syscall.

Its not needed since skb lengths are capped. The default on-stack arrays
are more than enough.

Use MAX_SKB_FRAGS instead of PIPE_DEF_BUFFERS to describe the reasonable
limit per skb.

Add coalescing support to help splicing of GRO skbs built from linear
skbs (linked into frag_list)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild

Pull build system failure fix from Michal Marek:
"This fixes build failure with newer gcc that adds some internal
  symbols that end in "__mod_*_device_table", but are not actually the
  tables themselves."

* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  Fix modpost failures in fedora 17

tcp: introduce tcp_try_coalesce

commit c8628155ece3 (tcp: reduce out_of_order memory use) took care of
coalescing tcp segments provided by legacy devices (linear skbs)

We extend this idea to fragged skbs, as their truesize can be heavy.

ixgbe for example uses 256+1024+PAGE_SIZE/2 = 3328 bytes per segment.

Use this coalescing strategy for receive queue too.

This contributes to reduce number of tcp collapses, at minimal cost, and
reduces memory overhead and packets drops.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Update driver version to 1.72.50-0

Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: remove gro workaround

Removes GRO workaround, as issue is fixed in FW 7.2.51.

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: add afex support

Following patch adds afex multifunction support to the driver (afex
multifunction is based on vntag header) and updates FW version used to 7.2.51.

Support includes the following:

1. Configure vif parameters in firmware (default vlan, vif id, default
priority, allowed priorities) according to values received from NIC.
2. Configure FW to strip/add default vlan according to afex vlan mode.
3. Notify link up to OS only after vif is fully initialized.
4. Support vif list set/get requests and configure FW accordingly.
5. Supply afex statistics upon request from NIC.
6. Special handling to L2 interface in case of FCoE vif.

Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlx4_en: Byte Queue Limit support

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlx4_en: Moving to Interrupts for TX completions

Moving to interrupts instead of polling fpr TX completions
Avoiding situations where skb can be held in by the driver for
a long time (till timer expires).
The change is also necessary for supporting BQL.

Removing comp_lock that was required because we could handle TX
completions from several contexts: Interrupts, timer, polling.
Now there is only interrupts

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlx4_en: Added Ethtool support for TX Interrupt coalescing

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: sk_add_backlog() is too agressive for TCP

While investigating TCP performance problems on 10Gb+ links, we found a
tcp sender was dropping lot of incoming ACKS because of sk_rcvbuf limit
in sk_add_backlog(), especially if receiver doesnt use GRO/LRO and sends
one ACK every two MSS segments.

A sender usually tweaks sk_sndbuf, but sk_rcvbuf stays at its default
value (87380), allowing a too small backlog.

A TCP ACK, even being small, can consume nearly same truesize space than
outgoing packets. Using sk_rcvbuf + sk_sndbuf as a limit makes sense and
is fast to compute.

Performance results on netperf, single flow, receiver with disabled
GRO/LRO : 7500 Mbits instead of 6050 Mbits, no more TCPBacklogDrop
increments at sender.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add a limit parameter to sk_add_backlog()

sk_add_backlog() & sk_rcvqueues_full() hard coded sk_rcvbuf as the
memory limit. We need to make this limit a parameter for TCP use.

No functional change expected in this patch, all callers still using the
old sk_rcvbuf limit.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net ax25: Fix the build when sysctl support is disabled.

Randy Dunlap <rdunlap@xenotime.net> reported:

> On 04/23/2012 12:07 AM, Stephen Rothwell wrote:
>
>> Hi all,
>>
>> Changes since 20120420:
>
>
> include/net/ax25.h:447:75: error: expected ';' before '}' token
>
> static inline int ax25_register_dev_sysctl(ax25_dev *ax25_dev) { return 0 };
> static inline void ax25_unregister_dev_sysctl(ax25_dev *ax25_dev) {};
>
> First function: move ';' inside braces.
> Second function: drop the ';'.

Put the semicolons where it makes sense.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'md-3.4-fixes' of git://neil.brown.name/md

Pull a few more md bug fixes from NeilBrown:
"2 are tagged for -stable, one being for a fairly serious bug that can
  corrupt metadata and make it hard to recovery an array.  The other is
  for a more recent regression since 3.3"

* tag 'md-3.4-fixes' of git://neil.brown.name/md:
  md: fix possible corruption of array metadata on shutdown.
  md: don't call ->add_disk unless there is good reason.
  DM RAID: Use safe version of rdev_for_each

Merge tag 'dlm-fixes-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm

Pull dlm fixes from David Teigland:
"This includes one short patch fixing the behavior of the QUECVT flag,
which the gfs2 folks are waiting on."

* tag 'dlm-fixes-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
dlm: fix QUECVT when convert queue is empty

mm: fix s390 BUG by __set_page_dirty_no_writeback on swap

Mel reports a BUG_ON(slot == NULL) in radix_tree_tag_set() on s390
3.0.13: called from __set_page_dirty_nobuffers() when page_remove_rmap()
tries to transfer dirty flag from s390 storage key to struct page and
radix_tree.

That would be because of reclaim's shrink_page_list() calling
add_to_swap() on this page at the same time: first PageSwapCache is set
(causing page_mapping(page) to appear as &swapper_space), then
page->private set, then tree_lock taken, then page inserted into
radix_tree - so there's an interval before taking the lock when the
radix_tree slot is empty.

We could fix this by moving __add_to_swap_cache()'s spin_lock_irq up
before the SetPageSwapCache. But a better fix is simply to do what's
five years overdue: Ken Chen introduced __set_page_dirty_no_writeback()
(if !PageDirty TestSetPageDirty) for tmpfs to skip all the radix_tree
overhead, and swap is just the same - it ignores the radix_tree tag, and
does not participate in dirty page accounting, so should be using
__set_page_dirty_no_writeback() too.

s390 testing now confirms that this does indeed fix the problem.

Reported-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Ken Chen <kenchen@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: fix possible corruption of array metadata on shutdown.

commit c744a65c1e2d59acc54333ce8
md: don't set md arrays to readonly on shutdown.

removed the possibility of a 'BUG' when data is written to an array
that has just been switched to read-only, but also introduced the
possibility that the array metadata could be corrupted.

If, when md_notify_reboot gets the mddev lock, the array is
in a state where it is assembled but hasn't been started (as can
happen if the personality module is not available, or in other unusual
situations), then incorrect metadata will be written out making it
impossible to re-assemble the array.

So only call __md_stop_writes() if the array has actually been
activated.

This patch is needed for any stable kernel which has had the above
commit applied.

Cc: stable@vger.kernel.org
Reported-by: Christoph Nelles <evilazrael@evilazrael.de>
Signed-off-by: NeilBrown <neilb@suse.de>

md: don't call ->add_disk unless there is good reason.

Commit 7bfec5f35c68121e7b18

md/raid5: If there is a spare and a want_replacement device, start replacement.

cause md_check_recovery to call ->add_disk much more often.
Instead of only when the array is degraded, it is now called whenever
md_check_recovery finds anything useful to do, which includes
updating the metadata for clean<->dirty transition.
This causes unnecessary work, and causes info messages from ->add_disk
to be reported much too often.

So refine md_check_recovery to only do any actual recovery checking
(including ->add_disk) if MD_RECOVERY_NEEDED is set.

This fix is suitable for 3.3.y:

Cc: stable@vger.kernel.org
Reported-by: Jan Ceuleers <jan.ceuleers@computer.org>
Signed-off-by: NeilBrown <neilb@suse.de>

DM RAID: Use safe version of rdev_for_each

Fix segfault caused by using rdev_for_each instead of rdev_for_each_safe

Commit dafb20fa34320a472deb7442f25a0c086e0feb33 mistakenly replaced a safe
iterator with an unsafe one when making some macro changes.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>

net sysctl: Add place holder functions for when sysctl support is compiled out of the kernel.

Randy Dunlap <rdunlap@xenotime.net> reported:
> On 04/23/2012 12:07 AM, Stephen Rothwell wrote:
>
>> Hi all,
>>
>> Changes since 20120420:
>
>
>
> ERROR: "unregister_net_sysctl_table" [net/phonet/phonet.ko] undefined!
> ERROR: "register_net_sysctl" [net/phonet/phonet.ko] undefined!
>
> when CONFIG_SYSCTL is not enabled.

Add static inline stub functions to gracefully handle the case when sysctl
support is not present.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: fix ethtool get settings

ethtool get settings was not displaying all the settings correctly.
use the get_phy_info to get more information about the PHY to fix this.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dlm: fix QUECVT when convert queue is empty

The QUECVT flag should not prevent conversions from
being granted immediately when the convert queue is
empty.

Signed-off-by: David Teigland <teigland@redhat.com>

tcp: Fix build warning after tcp_{v4,v6}_init_sock consolidation.

net/ipv4/tcp_ipv4.c: In function 'tcp_v4_init_sock':
net/ipv4/tcp_ipv4.c:1891:19: warning: unused variable 'tp' [-Wunused-variable]
net/ipv6/tcp_ipv6.c: In function 'tcp_v6_init_sock':
net/ipv6/tcp_ipv6.c:1836:19: warning: unused variable 'tp' [-Wunused-variable]

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm

Pull ARM fixes from Russell King:
"Here's my usual Sunday push, just for one revert which PeterZ hollered
  about after last weeks push.  Other than that, all seems strangely
  quiet as far as fixes go in non-platform ARM land at the moment."

* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
  Revert "ARM: 7359/2: smp_twd: Only wait for reprogramming on active cpus"

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc

Pull powerpc fixes from Benjamin Herrenschmidt:
"Here are a few fixes for powerpc.  Note the addition to the generic
  irq.h.  This is part of a 3-patches regression fix for mpic due to
  changes in how IRQ_TYPE_NONE is being handled.  Thomas agreed to the
  addition of the new IRQ_TYPE_DEFAULT contant, however he hasn't
  replied with an Ack to the actual patch yet.  I don't to wait much
  longer with these patches tho."

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/mpic: Properly set default triggers
  irq: Add IRQ_TYPE_DEFAULT for use by PIC drivers
  powerpc/mpic: Fix confusion between hw_irq and virq
  powerpc/pmac: Don't add_timer() twice
  powerpc/eeh: Fix crash caused by null eeh_dev
  powerpc/mpc85xx: add MPIC message dts node
  powerpc/mpic_msgr: fix offset error when setting mer register
  powerpc/mpic_msgr: add lock for MPIC message global variable
  powerpc/mpic_msgr: fix compile error when SMP disabled
  powerpc: fix build when CONFIG_BOOKE_WDT is enabled
  powerpc/85xx: don't call of_platform_bus_probe() twice

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Fix namespace init and cleanup in phonet to fix some oopses, from
    Eric W. Biederman.

2) Missing kfree_skb() in AF_KEY, from Julia Lawall.

3) Refcount leak and source address handling fix in l2tp from James
    Chapman.

4) Memory leak fix in CAIF from Tomasz Gregorek.

5) When routes are cloned from ipv6 addrconf routes, we don't process
    expirations properly.  Fix from Gao Feng.

6) Fix panic on DMA errors in atl1 driver, from Tony Zelenoff.

7) Only enable interrupts in 8139cp driver after we've registered the
    IRQ handler.  From Jason Wang.

8) Fix too many reads of KS_CIDER register in ks8851 during probe,
    fixing crashes on spurious interrupts.  From Matt Renzelmann.

9) Missing include in ath5k driver and missing iounmap on probe
    failure, from Jonathan Bither.

10) Fix RX packet handling in smsc911x driver, from Will Deacon.

11) Fix ixgbe WoL on fiber by leaving the laser on during shutdown.

12) ks8851 needs MAX_RECV_FRAMES increased otherwise the internal MAC
    buffers are easily overflown.  Fix from Davide Cimingahi.

13) Fix memory leaks in peak_usb CAN driver, from Jesper Juhl.

14) gred packet scheduler can dump in WRED more when doing a netlink
    dump.  Fix from David Ward.

15) Fix MTU in USB smsc75xx driver, from Stephane Fillod.

16) Dummy device needs ->ndo_uninit handler to properly handle
    ->ndo_init failures.  From Hiroaki SHIMODA.

17) Fix TX fragmentation in ath9k driver, from Sujith Manoharan.

18) Missing RTNL lock in ixgbe PM resume, from Benjamin Poirier.

19) Missing iounmap in farsync WAN driver, from Julia Lawall.

20) With LRO/GRO, tcp_grow_window() is easily tricked into not growing
    the receive window properly, and this hurts performance.  Fix from
    Eric Dumazet.

21) Network namespace init failure can leak net_generic data, fix from
    Julian Anastasov.

22) Fix skb_over_panic due to mis-accounting in TCP for partially ACK'd
    SKBs.  From Eric Dumazet.

23) New IDs for qmi_wwan driver, from Bjørn Mork.

24) Fix races in ax25_exit(), from Eric W. Biederman.

25) IPV6 TCP doesn't handle TCP_MAXSEG socket option properly, copy over
    logic from the IPV4 side.  From Neal Cardwell.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (59 commits)
  tcp: fix TCP_MAXSEG for established IPv6 passive sockets
  drivers/net: Do not free an IRQ if its request failed
  drop_monitor: allow more events per second
  ks8851: Fix request_irq/free_irq mismatch
  net/hyperv: Adding cancellation to ensure rndis filter is closed
  ks8851: Fix mutex deadlock in ks8851_net_stop()
  net ax25: Reorder ax25_exit to remove races.
  icplus: fix interrupt for IC+ 101A/G and 1001LF
  net: qmi_wwan: support Sierra Wireless MC77xx devices in QMI mode
  bnx2x: off by one in bnx2x_ets_e3b0_sp_pri_to_cos_set()
  ksz884x: don't copy too much in netdev_set_mac_address()
  tcp: fix retransmit of partially acked frames
  netns: do not leak net_generic data on failed init
  net/sock.h: fix sk_peek_off kernel-doc warning
  tcp: fix tcp_grow_window() for large incoming frames
  drivers/net/wan/farsync.c: add missing iounmap
  davinci_mdio: Fix MDIO timeout check
  ipv6: clean up rt6_clean_expires
  ipv6: fix rt6_update_expires
  arcnet: rimi: Fix device name in debug output
  ...

powerpc/mpic: Properly set default triggers

This gets rid of the unused default senses array, and replaces the
incorrect use of IRQ_TYPE_NONE with the new IRQ_TYPE_DEFAULT for
the initial set_trigger() call when mapping an interrupt.

This in turn makes us read the HW state and update the irq desc
accordingly.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

irq: Add IRQ_TYPE_DEFAULT for use by PIC drivers

This is meant typically to allow a PIC driver's irq domain map() callback
to establish sane defaults for the interrupt (and make sure that the HW
and the irq_desc are in sync as far as the trigger is concerned).

The irq core may not call the set_trigger callback if it thinks the
trigger is already set to the right setting, so we need to ensure new
descriptors are properly synchronized with the hardware.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/mpic: Fix confusion between hw_irq and virq

mpic_is_ipi() takes a virq and immediately converts it to a hw_irq.

However, one of the two call sites calls it with a ... hw_irq. The
other call site also happens to have the hw_irq at hand, so let's
change it to just take that as an argument. Also change mpic_is_tm()
for consistency.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/pmac: Don't add_timer() twice

If the interrupt and the timeout happen roughly at the same
time, we can get into a situation where the timer function
is run while the interrupt has already been processed. In
this case, the timer function might end up doing an add_timer
on an already pending timer, causing a BUG_ON() to trigger.

Instead, just skip the whole timeout operation if we see that
the timer is pending. The spinlock ensures that the only way
that happens is if we already started a new operation and thus
the timeout can be ignored.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/eeh: Fix crash caused by null eeh_dev

The problem was reported by Anton Blanchard. While EEH error
happened to the PCI device without the corresponding device
driver, kernel crash was seen. Eventually, I successfully
reproduced the problem on Firebird-L machine with utility
"errinjct". Initially, the device driver for Emulex ethernet
MAC has been disabled from .config and force data parity on
the Emulex ethernet MAC with help of "errinjct". Eventually,
I saw the kernel crash after issueing couple of "lspci -v"
command.

The root cause behind is that the PCI device, including the
reference to the corresponding eeh device, will be removed
from the system while EEH does recovery. Afterwards, the
PCI device will be probed again and added into the system
accordingly. So it's not safe to retrieve the eeh device from
the corresponding PCI device after the PCI device has been removed
and not added again.

The patch fixes the issue and retrieve the eeh device from OF node
instead of PCI device after the PCI device has been removed.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Tested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

Merge remote-tracking branch 'kumar/merge' into merge

tcp: fix TCP_MAXSEG for established IPv6 passive sockets

Commit f5fff5d forgot to fix TCP_MAXSEG behavior IPv6 sockets, so IPv6
TCP server sockets that used TCP_MAXSEG would find that the advmss of
child sockets would be incorrect. This commit mirrors the advmss logic
from tcp_v4_syn_recv_sock in tcp_v6_syn_recv_sock. Eventually this
logic should probably be shared between IPv4 and IPv6, but this at
least fixes this issue.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Linux 3.4-rc4

gianfar: add GRO support

Replace netif_receive_skb with napi_gro_receive.

Signed-off-by: Jiajun Wu <b06378@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers/net: Do not free an IRQ if its request failed

Refrain from attempting to free an interrupt line if the request
fails and hence, there is no IRQ to free.

CC: netdev@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

af_packet: packet_getsockopt() cleanup

Factorize code, since most fetched values are int type.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: move duplicate code from tcp_v4_init_sock()/tcp_v6_init_sock()

This commit moves the (substantial) common code shared between
tcp_v4_init_sock() and tcp_v6_init_sock() to a new address-family
independent function, tcp_init_sock().

Centralizing this functionality should help avoid drift issues,
e.g. where the IPv4 side is updated without a corresponding update to
IPv6. There was already some drift: IPv4 initialized snd_cwnd to
TCP_INIT_CWND, while the IPv6 side was still initializing snd_cwnd to
2 (in this case it should not matter, since snd_cwnd is also
initialized in tcp_init_metrics(), but the general risks and
maintenance overhead remain).

When diffing the old and new code, note that new tcp_init_sock()
function uses the order of steps from the tcp_v4_init_sock()
implementation (the order is slightly different in
tcp_v6_init_sock()).

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: allow better page reuse in splice(sock -> pipe)

splice() from socket to pipe needs linear_to_page() helper to transfert
skb header to part of page.

We can reset the offset in the current sk->sk_sndmsg_page if we are the
last user of the page.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sparc32,leon: add notify_cpu_starting()

Otherwise cpu_active_mask will not set, which lead to other issue.

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Signed-off-by: Konrad Eisele <konrad@gaisler.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drop_monitor: allow more events per second

It seems there is a logic error in trace_drop_common(), since we store
only 64 drops, even if they are from same location.

This fix is a one liner, but we probably need more work to avoid useless
atomic dec/inc

Now I can watch 1 Mpps drops through dropwatch...

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

team: add per-port option for enabling/disabling ports

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

team: allow to enable/disable ports

This patch changes content of hashlist (used to get port struct by
computed index (0...en_port_count-1)). Now the hash list contains only
enabled ports so userspace will be able to say what ports can be used
for tx/rx. This becomes handy when userspace will need to disable ports
which does not belong to active aggregator. By default, newly added port
is enabled.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

team: lb: let userspace care about port macs

Better to leave this for userspace

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: change big iov allocations

iov of more than 8 entries are allocated in sendmsg()/recvmsg() through
sock_kmalloc()

As these allocations are temporary only and small enough, it makes sense
to use plain kmalloc() and avoid sk_omem_alloc atomic overhead.

Slightly changed fast path to be even faster.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Mike Waychison <mikew@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ks8851: Fix request_irq/free_irq mismatch

The dev_id parameter passed to free_irq needs to match the one passed
to the corresponding request_irq.

Signed-off-by: Matt Renzelmann <mjr@cs.wisc.edu>
Acked-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: Repair connection-time negotiated parameters

There are options, which are set up on a socket while performing
TCP handshake. Need to resurrect them on a socket while repairing.
A new sockoption accepts a buffer and parses it. The buffer should
be CODE:VALUE sequence of bytes, where CODE is standard option
code and VALUE is the respective value.

Only 4 options should be handled on repaired socket.

To read 3 out of 4 of these options the TCP_INFO sockoption can be
used. An ability to get the last one (the mss_clamp) was added by
the previous patch.

Now the restore. Three of these options -- timestamp_ok, mss_clamp
and snd_wscale -- are just restored on a coket.

The sack_ok flags has 2 issues. First, whether or not to do sacks
at all. This flag is just read and set back. No other sack info is
saved or restored, since according to the standart and the code
dropping all sack-ed segments is OK, the sender will resubmit them
again, so after the repair we will probably experience a pause in
connection. Next, the fack bit. It's just set back on a socket if
the respective sysctl is set. No collected stats about packets flow
is preserved. As far as I see (plz, correct me if I'm wrong) the
fack-based congestion algorithm survives dropping all of the stats
and repairs itself eventually, probably losing the performance for
that period.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: Report mss_clamp with TCP_MAXSEG option in repair mode

The mss_clamp is the only connection-time negotiated option which
cannot be obtained from the user space. Make the TCP_MAXSEG sockopt
report one in the repair mode.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: Repair socket queues

Reading queues under repair mode is done with recvmsg call.
The queue-under-repair set by TCP_REPAIR_QUEUE option is used
to determine which queue should be read. Thus both send and
receive queue can be read with this.

Caller must pass the MSG_PEEK flag.

Writing to queues is done with sendmsg call and yet again --
the repair-queue option can be used to push data into the
receive queue.

When putting an skb into receive queue a zero tcp header is
appented to its head to address the tcp_hdr(skb)->syn and
the ->fin checks by the (after repair) tcp_recvmsg. These
flags flags are both set to zero and that's why.

The fin cannot be met in the queue while reading the source
socket, since the repair only works for closed/established
sockets and queueing fin packet always changes its state.

The syn in the queue denotes that the respective skb's seq
is "off-by-one" as compared to the actual payload lenght. Thus,
at the rcv queue refill we can just drop this flag and set the
skb's sequences to precice values.

When the repair mode is turned off, the write queue seqs are
updated so that the whole queue is considered to be 'already sent,
waiting for ACKs' (write_seq = snd_nxt <= snd_una). From the
protocol POV the send queue looks like it was sent, but the data
between the write_seq and snd_nxt is lost in the network.

This helps to avoid another sockoption for setting the snd_nxt
sequence. Leaving the whole queue in a 'not yet sent' state (as
it will be after sendmsg-s) will not allow to receive any acks
from the peer since the ack_seq will be after the snd_nxt. Thus
even the ack for the window probe will be dropped and the
connection will be 'locked' with the zero peer window.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: Initial repair mode

This includes (according the the previous description):

* TCP_REPAIR sockoption

This one just puts the socket in/out of the repair mode.
Allowed for CAP_NET_ADMIN and for closed/establised sockets only.
When repair mode is turned off and the socket happens to be in
the established state the window probe is sent to the peer to
'unlock' the connection.

* TCP_REPAIR_QUEUE sockoption

This one sets the queue which we're about to repair. The
'no-queue' is set by default.

* TCP_QUEUE_SEQ socoption

Sets the write_seq/rcv_nxt of a selected repaired queue.
Allowed for TCP_CLOSE-d sockets only. When the socket changes
its state the other seq-s are changed by the kernel according
to the protocol rules (most of the existing code is actually
reused).

* Ability to forcibly bind a socket to a port

The sk->sk_reuse is set to SK_FORCE_REUSE.

* Immediate connect modification

The connect syscall initializes the connection, then directly jumps
to the code which finalizes it.

* Silent close modification

The close just aborts the connection (similar to SO_LINGER with 0
time) but without sending any FIN/RST-s to peer.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: Move code around

This is just the preparation patch, which makes the needed for
TCP repair code ready for use.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sock: Introduce named constants for sk_reuse

Name them in a "backward compatible" manner, i.e. reuse or not
are still 1 and 0 respectively. The reuse value of 2 means that
the socket with it will forcibly reuse everyone else's port.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull "ARM: SoC fixes" from Olof Johansson:
* at91, ux500, imx, omap and bcmring:
  - at91 fixes for =m driver build issues, irqdomain fixes and config
    dependency fixes
  - ux500 kconfig dependency fixes and a  smp wakeup bugfix
  - imx idle bugfix and build fix due to irq domain changes
  - omap uart pinmux fixes, softreset regression revert and misc fixes
  - bcmring build error regression fix

* ux500 and imx had some small defconfig updates in this branch

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (27 commits)
  ARM: bcmring: fix UART declarations
  ARM: imx: Fix imx5 idle logic bug
  ARM: imx27-dt: Fix build due to removal of irq_domain_add_simple()
  ARM: imx_v4_v5_defconfig: Add support for CONFIG_REGULATOR_FIXED_VOLTAGE
  ARM: OMAP1: DMTIMER: fix broken timer clock source selection
  ARM: OMAP: serial: Fix the ocp smart idlemode handling bug
  ARM: OMAP2+: UART: Fix incorrect population of default uart pads
  ARM: OMAP: sram: fix BUG in dpll code for !PM case
  dmaengine: Kconfig: fix Atmel at_hdmac entry
  USB: gadget/at91_udc: add gpio_to_irq() function to vbus interrupt
  USB: ohci-at91: change annotations for probe/remove functions
  leds-atmel-pwm.c: Make pwmled_probe() __devinit
  ARM: at91: fix at91sam9261ek Ethernet dm9000 irq
  ARM: at91: fix rm9200ek flash size
  ARM: at91: remove empty at91_init_serial function
  ARM: at91: fix typo in at91_pmc_base assembly declaration
  ARM: at91: Export at91_matrix_base
  ARM: at91: Export at91_pmc_base
  ARM: at91: Export at91_ramc_base
  ARM: at91: Export at91_st_base
  ...

Merge tag 'mmc-fixes-for-3.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc

Pull MMC fixes from Chris Ball:
- Build fix for omap_hsmmc with OF against 3.4-rc1.
- Fix CONFIG_MMC_UNSAFE_RESUME semantics regression against 3.3, which
   broke hotplug card detection when UNSAFE_RESUME is set.
- Fix a race condition in omap_hsmmc with runtime PM.
- Fix two libertas SDIO-powered-resume regressions.
- Small fixes for discard/sanitize, dw_mmc, cd-gpio and esdhc-imx.

* tag 'mmc-fixes-for-3.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  mmc: core: Do not pre-claim host in suspend
  mmc: dw_mmc: prevent NULL dereference for dma_ops
  mmc: unbreak sdhci-esdhc-imx on i.MX25
  mmc: cd-gpio: Include header to pickup exported symbol prototypes
  mmc: sdhci: refine non-removable card checking for card detection
  mmc: dw_mmc: Fix switch from DMA to PIO
  mmc: remove MMC bus legacy suspend/resume method
  mmc: omap_hsmmc: Get rid of of_have_populated_dt() usage
  mmc: omap_hsmmc: build fix for CONFIG_OF=y and CONFIG_MMC_OMAP_HS=m
  mmc: fixes for eMMC v4.5 sanitize operation
  mmc: fixes for eMMC v4.5 discard operation

Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:
- Fixes a regression at DVB core when switching from DVB-S2 to DVB-S on
   Kaffeine (Fedora 16 Bugzilla #812895);
- Fixes a mutex unlock at an error condition at drx-k;
- Fix winbond-cir set mode;
- mt9m032: Fix a compilation breakage with some random Kconfig;
- mt9m032: fix two dead locks;
- xc5000: don't require an special firmware (that won't be provided by
   the vendor) just because the xtal frequency is different;
- V4L DocBook: fix some typos at multi-plane formats description.

* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] xc5000: support 32MHz & 31.875MHz xtal using the 41.024.5 firmware
  [media] V4L: mt9m032: fix compilation breakage
  [media] V4L: DocBook: Fix typos in the multi-plane formats description
  [media] V4L: mt9m032: fix two dead-locks
  [media] rc-core: set mode for winbond-cir
  [media] drxk: Does not unlock mutex if sanity check failed in scu_command()
  [media] dvb_frontend: Fix a regression when switching back to DVB-S

Merge tag 'mfd-for-linus-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6

Pull MFD fixes from Samuel Ortiz:
"We have 3 build fixes, a OMAP USB host PHY reset fix and the twl6040
  conversion to an i2c driver.  The latter may not sound like a fix but
  the twl6040 MFD driver won't probe without it, triggering an OMAP4
  audio regression."

* tag 'mfd-for-linus-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
  mfd: Fix modular builds of rc5t583 regulator support
  mfd: Fix asic3_gpio_to_irq
  ARM: OMAP3: USB: Fix the EHCI ULPI PHY reset issue
  mfd: Convert twl6040 to i2c driver, and separate it from twl core
  mfd : Fix dbx500 compilation error

net/hyperv: Adding cancellation to ensure rndis filter is closed

Although the network interface is down, the RX packets number which
could be observed by ifconfig may keep on increasing.

This is because the WORK scheduled in netvsc_set_multicast_list()
may be executed after netvsc_close(). That means the rndis filter
may be re-enabled by do_set_multicast() even if it was closed by
netvsc_close().

By canceling possible WORK before close the rndis filter, the issue
could be never happened.

Signed-off-by: Wenqi Ma <wenqi_ma@trendmicro.com.cn>
Reviewed-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>

ks8851: Fix mutex deadlock in ks8851_net_stop()

There is a potential deadlock scenario when the ks8851 driver
is removed. The interrupt handler schedules a workqueue which
acquires a mutex that ks8851_net_stop() also acquires before
flushing the workqueue. Previously lockdep wouldn't be able
to find this problem but now that it has the support we can
trigger this lockdep warning by rmmoding the driver after
an ifconfig up.

Fix the possible deadlock by disabling the interrupts in
the chip and then release the lock across the workqueue
flushing. The mutex is only there to proect the registers
anyway so this should be ok.

=======================================================
[ INFO: possible circular locking dependency detected ]
3.0.21-00021-g8b33780-dirty #2911
-------------------------------------------------------
rmmod/125 is trying to acquire lock:
((&ks->irq_work)){+.+...}, at: [<c019e0b8>] flush_work+0x0/0xac

but task is already holding lock:
(&ks->lock){+.+...}, at: [<bf00b850>] ks8851_net_stop+0x64/0x138 [ks8851]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&ks->lock){+.+...}:
       [<c01b89c8>] __lock_acquire+0x940/0x9f8
       [<c01b9058>] lock_acquire+0x10c/0x130
       [<c083dbec>] mutex_lock_nested+0x68/0x3dc
       [<bf00bd48>] ks8851_irq_work+0x24/0x46c [ks8851]
       [<c019c580>] process_one_work+0x2d8/0x518
       [<c019cb98>] worker_thread+0x220/0x3a0
       [<c01a2ad4>] kthread+0x88/0x94
       [<c0107008>] kernel_thread_exit+0x0/0x8

-> #0 ((&ks->irq_work)){+.+...}:
       [<c01b7984>] validate_chain+0x914/0x1018
       [<c01b89c8>] __lock_acquire+0x940/0x9f8
       [<c01b9058>] lock_acquire+0x10c/0x130
       [<c019e104>] flush_work+0x4c/0xac
       [<bf00b858>] ks8851_net_stop+0x6c/0x138 [ks8851]
       [<c06b209c>] __dev_close_many+0x98/0xcc
       [<c06b2174>] dev_close_many+0x68/0xd0
       [<c06b22ec>] rollback_registered_many+0xcc/0x2b8
       [<c06b2554>] rollback_registered+0x28/0x34
       [<c06b25b8>] unregister_netdevice_queue+0x58/0x7c
       [<c06b25f4>] unregister_netdev+0x18/0x20
       [<bf00c1f4>] ks8851_remove+0x64/0xb4 [ks8851]
       [<c049ddf0>] spi_drv_remove+0x18/0x1c
       [<c0468e98>] __device_release_driver+0x7c/0xbc
       [<c0468f64>] driver_detach+0x8c/0xb4
       [<c0467f00>] bus_remove_driver+0xb8/0xe8
       [<c01c1d20>] sys_delete_module+0x1e8/0x27c
       [<c0105ec0>] ret_fast_syscall+0x0/0x3c

other info that might help us debug this:

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&ks->lock);
                               lock((&ks->irq_work));
                               lock(&ks->lock);
  lock((&ks->irq_work));

*** DEADLOCK ***

4 locks held by rmmod/125:
#0:  (&__lockdep_no_validate__){+.+.+.}, at: [<c0468f44>] driver_detach+0x6c/0xb4
#1:  (&__lockdep_no_validate__){+.+.+.}, at: [<c0468f50>] driver_detach+0x78/0xb4
#2:  (rtnl_mutex){+.+.+.}, at: [<c06b25e8>] unregister_netdev+0xc/0x20
#3:  (&ks->lock){+.+...}, at: [<bf00b850>] ks8851_net_stop+0x64/0x138 [ks8851]

Cc: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers/net: decouple ISA and ISA_DMA_API

The two options are separate, and some platforms (e.g. arm pxa)
have ISA slots but no ISA dma controller, so they cannot build
drivers using the DMA API functions.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

sungem: use mdelay instead of udelay where necessary

Some architectures like ARM cannot handle large numbers as
arguments to udelay, so the drivers should use mdelay when
delaying for multiple miliseconds.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

donauboe: replace excessive udelay with msleep

No driver should spin the CPU for 10ms, so better use
an msleep, which is allowed in the ->suspend function.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

8390: select CRC32 support

The ax88796 driver uses the CRC32 functions, so make sure that
they are actually enabled.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers/net: iwmc3200 depends on EXPERIMENTAL

The iwmc3200 driver selects other code in Kconfig that depends on
EXPERIMENTAL. Kconfig warns about this when CONFIG_EXPERIMENTAL
is not already set, so logically, these options should also
be marked experimental or promoted to stable.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

caif: include linux/io.h

The caif_shmcore requires io.h in order to use ioremap, so include that
explicitly to compile in all configurations.

Also add a note about the use of ioremap(), which is not a proper way
to map a DMA buffer into kernel space. It's not completely clear what
the intention is for using ioremap, but it is clear that the result
of ioremap must not simply be accessed using kernel pointers but
should use readl/writel or memcopy_{to,from}io. Assigning the result
of ioremap to a regular pointer that can also be set to something
else is not ok.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers/net: add missing __devexit_p() annotations

Drivers that refer to a __devexit function in an operations
structure need to annotate that pointer with __devexit_p so
replace it with a NULL pointer when the section gets discarded.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

davinci_cpdma: export symbols used by other drivers

The davinci_emac driver can be a module, so the symbols
it needs from the cpdma driver must be exported.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

pch_gbe: remove suspicious comment

The time stamping code in this driver appears to have been copied from
the ixp4xx_eth.c driver, including this timing comment. I had actually
measured the time stamp delay on an IXP425, but I really doubt that this
value also applies here.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pch_gbe: run the ptp bpf just once per packet

This patch fixes code which needlessly ran the BPF twice per
packet. Instead, we just run the classifier once and test
whether the packet is any kind of PTP event message.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pch_gbe: correct receive time stamp filtering

This patch fixes the driver so that multicast PTP event messages can
be recognized by the hardware time stamping unit. The station address
register must be set according to the desired transport type.

[ RC - Rebased Takahiro's changes and wrote a commit message
explaining the changes. ]

Signed-off-by: Takahiro Shimizu <tshimizu818@gmail.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pch_gbe: do not set the channel control register

We will let the pch_gbe code do that according to the receive time stamp
filter.

[ RC - Rebased Takahiro's changes and wrote a commit message
explaining the changes. ]

Signed-off-by: Takahiro Shimizu <tshimizu818@gmail.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pch_gbe: improve coding style

This patch clears up a few coding style issues:

- Makes two function definitions a bit nicer looking.
- Remove unneeded parentheses.
- Simplify macros for register bits.

[ RC - Rebased Takahiro's changes and wrote a commit message
explaining the changes. ]

Signed-off-by: Takahiro Shimizu <tshimizu818@gmail.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pch_gbe: export a method to set the receive match address

The code in phc_gbe_main will need to call this method in order to set the
station address register according to the receive time stamping filter.

[ RC - Rebased Takahiro's changes and wrote a commit message
explaining the changes. ]

Signed-off-by: Takahiro Shimizu <tshimizu818@gmail.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pch_gbe: reprogram multicast address register on reset

The reset logic after a Rx FIFO overrun will clear the programmed
multicast addresses. This patch fixes the issue by reprogramming the
registers after the reset.

[ RC - Rebased Takahiro's changes and wrote a commit message
explaining the changes. ]

Signed-off-by: Takahiro Shimizu <tshimizu818@gmail.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pch_gbe: simplify transmit time stamping flag test

This patch makes logic surrounding the test of the
transmit time stamping flag more readable.

[ RC - Rebased Takahiro's changes and wrote a commit message
explaining the changes. ]

Signed-off-by: Takahiro Shimizu <tshimizu818@gmail.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pch_gbe: scale time stamps to nanoseconds

This patch fixes the helper functions that give the transmit and
receive time stamps to return nanoseconds, instead of arbitrary clock
ticks.

[ RC - Rebased Takahiro's changes and wrote a commit message
explaining the changes. ]

Signed-off-by: Takahiro Shimizu <tshimizu818@gmail.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

kill mm argument of vm_munmap()

it's always current->mm

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

perfmon: kill some helpers and arguments

pfm_vm_munmap() is simply vm_munmap() and pfm_remove_smpl_mapping()
always get current as the first argument.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

aio: don't bother with unmapping when aio_free_ring() is coming from exit_aio()

... since exit_mmap() is coming and it will munmap() everything anyway.
In all other cases aio_free_ring() has ctx->mm == current->mm; moreover,
all other callers of vm_munmap() have mm == current->mm, so this will
allow us to get rid of mm argument of vm_munmap().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

mmc: core: Do not pre-claim host in suspend

Since SDIO drivers may want to do some SDIO operations in their suspend
callback functions, we must not keep the host claimed when calling them.

Daniel Drake reported that libertas_sdio encountered a deadlock in its
suspend function.

Signed-off-by: Ulf Hansson <ulf.hansson@stericsson.com>
Tested-by: Daniel Drake <dsd@laptop.org>
[stable@: please apply to 3.2-stable and 3.3-stable]
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>

mmc: dw_mmc: prevent NULL dereference for dma_ops

Now, dma_ops is assumed that use the IDMAC. But if dma_ops is assigned
the pdata->dma_ops, we didn't ensure that callback function is defined.

If the callback isn't defined, then we should run in PIO mode.

Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Will Newton <will.newton@imgtec.com>
Signed-off-by: Chris Ball <cjb@laptop.org>

net: Remove register_net_sysctl_table

All of the users have been converted to use registera_net_sysctl so we
no longer need register_net_sysctl.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Delete all remaining instances of ctl_path

We don't use struct ctl_path anymore so delete the exported constants.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Convert all sysctl registrations to register_net_sysctl

This results in code with less boiler plate that is a bit easier
to read.

Additionally stops us from using compatibility code in the sysctl
core, hastening the day when the compatibility code can be removed.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Convert nf_conntrack_proto to use register_net_sysctl

There isn't much advantage here except that strings paths are a bit
easier to read, and converting everything to them allows me to kill off
ctl_path.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net ipv4: Convert devinet to use register_net_sysctl

Using an ascii path to register_net_sysctl as opposed to the slightly
awkward ctl_path allows for much simpler code.

We no longer need to malloc dev_name to keep it alive the length of our
sysctl register instead we can use a small temporary buffer on the
stack.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net ipv6: Convert addrconf to use register_net_sysctl

Using an ascii path to register_net_sysctl as opposed to the slightly
awkward ctl_path allows for much simpler code.

We no longer need to malloc dev_name to keep it alive the length of our
sysctl register instead we can use a small temporary buffer on the
stack.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net decnet: Convert to use register_net_sysctl

Using an ascii path to register_net_sysctl as opposed to the slightly
awkward ctl_path allows for much simpler code.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net neighbour: Convert to use register_net_sysctl

Using an ascii path to register_net_sysctl as opposed to the slightly
awkward ctl_path allows for much simpler code.

We no longer need to malloc dev_name to keep it alive the length of our
sysctl register instead we can use a small temporary buffer on the
stack.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net ipv6: Don't use sysctl tables with .child entries.

The sysctl core no longer natively understands sysctl tables
with .child entries.

Split the ipv6_table to remove the .child entries.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net llc: Don't use sysctl tables with .child entries.

The sysctl core no longer natively understands sysctl tables with .child
entries.

Kill the intermediate tables and use register_net_sysctl directly to
remove the need for compatibility code.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net ax25: Simplify and cleanup the ax25 sysctl handling.

Don't register/unregister every ax25 table in a batch. Instead register
and unregister per device ax25 sysctls as ax25 devices come and go.

This moves ax25 to be a completely modern sysctl user. Registering the
sysctls in just the initial network namespace, removing the use of
.child entries that are no longer natively supported by the sysctl core
and taking advantage of the fact that there are no longer any ordering
constraints between registering and unregistering different sysctl
tables.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net ipv4: Remove the unneeded registration of an empty net/ipv4/neigh

sysctl no longer requires explicit creation of directories. The neigh
directory is always populated with at least a default entry so this
won't cause any user visible changes.

Delete the ipv4_path and the ipv4_skeleton these are no longer needed.

Directly register the ipv4_route_table.

And since I am an idiot remove the header definitions that I should
have removed in the previous patch.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>