git.proxmox.com Git - mirror_ubuntu-eoan-kernel.git/log

ixgbe: ethtool DCB registers dump for 82599 and x540

Added support for DCB registers dump using ethtool -d option both for
82599 and x540 ethernet controllers

Signed-off-by: Leonardo Potenza <leonardo.potenza@intel.com>
Signed-off-by: Maryam Tahhan <maryam.tahhan@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbevf: move API neg to reset path

After this patch the API negotiation will occur in the reset path. So now
the PF will be informed of the API version earlier. This will also require
the mailbox lock to be initialize sooner as well.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbevf: add wait for Rx queue disable

New function was added to wait for Rx queues to be disabled before
disabling NAPI. This function also allows us to modify
ixgbevf_rx_desc_queue_enable() to better match ixgbe. I also cleaned up
some msleep calls to usleep_range while I was in this code anyway.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbevf: cleanup redundant mailbox read failure check

Since we are already checking for read failure in check_link we don't need
to do it here. Instead just make sure the watchdog task gets scheduled, if
we are up, and it can be done there. This will better follow igbvf method
of handling a mailbox event and message timeout.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbevf: do not print registers to dmesg in ixgbevf_get_regs

This patch removes the use of hw_dbg in ixgbevf when the ixgbe_get_regs function
is called from ethtool. This goes along side a patch to ethtool which enables
proper support for ixgbevf pretty-printing of registers.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: add a counter for pkts dropped in xmit path

In the xmit path, the driver may drop some pkts due to reasons such as
DMA mapping errors, out of memory conditions or to protect HW from
unrecoverable errors. Add a counter in TX-stats for such drops.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: fix adaptive interrupt coalescing

The current EQ delay calculation for AIC is based only on RX packet rate.
This fails to be effective when there's only TX and no RX.
This patch inclues:
- Calculating EQ-delay based on both RX and TX pps.
- Modifying EQ-delay of all EQs via one cmd, instead of issuing a separate
cmd for each EQ.
- A new structure to store interrupt coalescing parameters, in a separate
cache-line from the EQ-obj.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: call ENABLE_VF cmd for Skyhawk-R too

This cmd needs to be sent to FW when enabling VFs (currently used only
for Lancer.) Also, avoid calling the cmd when driver loads and finds that
VFs are already enabled from a previous load.

Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Create single TXQ on BE3-R 1G ports

On BE3-R 1G ports (identified by port numbers 2 and 3) the FW cannot properly
support multiple TXQs. This also makes the number of RX and TX queues symmetric
as only a single RXQ is available on 1G ports.

Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: pass if_id for v1 and V2 versions of TX_CREATE cmd

It is a required field for all TX_CREATE cmd versions > 0.
Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Call be_vf_setup() even when VFs are enbaled from previous load

Re-define the sriov_want() macro to check for number of VFs that need
to be enabled in the current load of the driver or the number of VFs that
still remain enabled from the previous load (attached VFs cannot be disabled.)

Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: can: c_can_platform: Remove redundant of_match_ptr

The data structure of_match_ptr() protects is always compiled in.
Hence of_match_ptr() is not needed.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: linux-can@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: cpsw-phy-sel: Remove redundant of_match_ptr

The data structure of_match_ptr() protects is always compiled in.
Hence of_match_ptr() is not needed.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: cpsw: Remove redundant of_match_ptr

The data structure of_match_ptr() protects is always compiled in.
Hence of_match_ptr() is not needed.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: RCUify bond_set_rx_mode()

Currently we rely on rtnl locking in bond_set_rx_mode(), however it's not
always the case:

RTNL: assertion failed at drivers/net/bonding/bond_main.c (3391)
...
[<ffffffff81651ca5>] dump_stack+0x54/0x74
[<ffffffffa029e717>] bond_set_rx_mode+0xc7/0xd0 [bonding]
[<ffffffff81553af7>] __dev_set_rx_mode+0x57/0xa0
[<ffffffff81557ff8>] __dev_mc_add+0x58/0x70
[<ffffffff81558020>] dev_mc_add+0x10/0x20
[<ffffffff8161e26e>] igmp6_group_added+0x18e/0x1d0
[<ffffffff81186f76>] ? kmem_cache_alloc_trace+0x236/0x260
[<ffffffff8161f80f>] ipv6_dev_mc_inc+0x29f/0x320
[<ffffffff8161f9e7>] ipv6_sock_mc_join+0x157/0x260
...

Fix this by using RCU primitives.

Reported-by: Joe Lawrence <joe.lawrence@stratus.com>
Tested-by: Joe Lawrence <joe.lawrence@stratus.com>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers: net: phy: marvell.c: removed checkpatch.pl warnings

removes following warnings-
drivers/net/phy/marvell.c:37: WARNING: Use #include <linux/io.h> instead of <asm/io.h>
drivers/net/phy/marvell.c:39: WARNING: Use #include <linux/uaccess.h> instead of <asm/uaccess.h>

Signed-off-by: Avinash Kumar <avi.kp.137@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: skb_is_gso_v6() requires skb_is_gso()

bnx2x makes a dangerous use of skb_is_gso_v6().

It should first make sure skb is a gso packet

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Eilon Greenstein <eilong@broadcom.com>
Acked-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add missing sk_max_pacing_rate doc

Warning(include/net/sock.h:411): No description found for parameter
'sk_max_pacing_rate'

Lets please "make htmldocs" and kbuild bot.

Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bgmac: add support for Byte Queue Limits

This makes it possible to use some more advanced queuing
techniques with this driver.

When multi queue support will be added some changes to Byte Queue
handling is needed.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b44: add support for Byte Queue Limits

This makes it possible to use some more advanced queuing
techniques with this driver.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net ipv4: Convert ipv4.ip_local_port_range to be per netns v3

- Move sysctl_local_ports from a global variable into struct netns_ipv4.
- Modify inet_get_local_port_range to take a struct net, and update all
  of the callers.
- Move the initialization of sysctl_local_ports into
   sysctl_net_ipv4.c:ipv4_sysctl_init_net from inet_connection_sock.c

v2:
- Ensure indentation used tabs
- Fixed ip.h so it applies cleanly to todays net-next

v3:
- Compile fixes of strange callers of inet_get_local_port_range.
  This patch now successfully passes an allmodconfig build.
  Removed manual inlining of inet_get_local_port_range in ipv4_local_port_range

Originally-by: Samya <samya@twitter.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethernet: use likely() for common Ethernet encap

Mark code path's likely/unlikely based on most common usage.
  * Very few devices use dsa tags.
  * Most traffic is Ethernet (not 802.2)
  * No sane person uses trailer type or Novell encapsulation

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethernet: cleanup eth_type_trans

Remove old legacy comment and weird if condition.
The comment has outlived it's stay and is throwback to some
early net code (before my time). Maybe Dave remembers what it meant.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-davem' of git://gitorious.org/linux-can/linux-can-next

Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: Not need to set fl6.flowi6_flags as zero

setting fl6.flowi6_flags as zero after memset is redundant, Remove it.

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

xen-netback: improve ring effeciency for guest RX

There was a bug that netback routines netbk/xenvif_skb_count_slots and
netbk/xenvif_gop_frag_copy disagreed with each other, which caused
netback to push wrong number of responses to netfront, which caused
netfront to eventually crash. The bug was fixed in 6e43fc04a
("xen-netback: count number required slots for an skb more carefully").

Commit 6e43fc04a focused on backport-ability. The drawback with the
existing packing scheme is that the ring is not used effeciently, as
stated in 6e43fc04a.

skb->data like:
    |        1111|222222222222|3333        |

is arranged as:
    |1111        |222222222222|3333        |

If we can do this:
    |111122222222|22223333    |
That would save one ring slot, which improves ring effeciency.

This patch effectively reverts 6e43fc04a. That patch made count_slots
agree with gop_frag_copy, while this patch goes the other way around --
make gop_frag_copy agree with count_slots. The end result is that they
still agree with each other, and the ring is now arranged like:
    |111122222222|22223333    |

The patch that improves packing was first posted by Xi Xong and Matt
Wilson. I only rebase it on top of net-next and rewrite commit message,
so I retain all their SoBs. For more infomation about the original bug
please refer to email listed below and commit message of 6e43fc04a.

Original patch:
http://lists.xen.org/archives/html/xen-devel/2013-07/msg00760.html

Signed-off-by: Xi Xiong <xixiong@amazon.com>
Reviewed-by: Matt Wilson <msw@amazon.com>
[ msw: minor code cleanups, rewrote commit message, adjusted code
  to count RX slots instead of meta structures ]
Signed-off-by: Matt Wilson <msw@amazon.com>
Cc: Annie Li <annie.li@oracle.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
[ liuw: rebased on top of net-next tree, rewrote commit message, coding
  style cleanup. ]
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qdisc: basic classifier - remove unnecessary initialization

err is set once, then first code resets it.
err = tcf_exts_validate(...)

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jamal Hadi Salim <hadi@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qdisc: meta return ENOMEM on alloc failure

Rather than returning earlier value (EINVAL), return ENOMEM if
kzalloc fails. Found while reviewing to find another EINVAL condition.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: trivial: remove forgotten bond_next_vlan()

It's a forgotten function declaration, which was removed some time ago
already.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch '20130926_include_linux_networking_externs' of git://repo.or.cz/linux-2.6/trivial-mods

Conflicts:
include/linux/netdevice.h

More extern removals from Joe Perches.

Minor conflict with the dev_notify_flags changes which added a new
argument to __dev_notify_flags().

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next

Conflicts:
include/net/xfrm.h

Simple conflict between Joe Perches "extern" removal for function
declarations in header files and the changes in Steffen's tree.

Steffen Klassert says:

====================
Two patches that are left from the last development cycle.
Manual merging of include/net/xfrm.h is needed. The conflict
can be solved as it is currently done in linux-next.

1) We announce the creation of temporary acquire state via an asyc event,
   so the deletion should be annunced too. From Nicolas Dichtel.

2) The VTI tunnels do not real tunning, they just provide a routable
   IPsec tunnel interface. So introduce and use xfrm_tunnel_notifier
   instead of xfrm_tunnel for xfrm tunnel mode callback. From Fan Du.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

hamradio: baycom: remove deprecated IRQF_DISABLED

This patch proposes to remove the IRQF_DISABLED flag
from drivers/net/hamradio/baycom_*

It's a NOOP since 2.6.35 and it will be removed one day.

Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dev: always advertise rx_flags changes via netlink

When flags IFF_PROMISC and IFF_ALLMULTI are changed, netlink messages are not
consistent. For example, if a multicast daemon is running (flag IFF_ALLMULTI
set in dev->flags but not dev->gflags, ie not exported to userspace) and then a
user sets it via netlink (flag IFF_ALLMULTI set in dev->flags and dev->gflags, ie
exported to userspace), no netlink message is sent.
Same for IFF_PROMISC and because dev->promiscuity is exported via
IFLA_PROMISCUITY, we may send a netlink message after each change of this
counter.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dev: update __dev_notify_flags() to send rtnl msg

This patch only prepares the next one, there is no functional change.
Now, __dev_notify_flags() can also be used to notify flags changes via
rtnetlink.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qmi_wwan: fix checkpatch warnings

Signed-off-by: Fabio Porcedda <fabio.porcedda@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qmi_wwan: add Telit LE920 newer firmware support

Newer firmware use a new pid and a different interface.

Signed-off-by: Fabio Porcedda <fabio.porcedda@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: introduce SO_MAX_PACING_RATE

As mentioned in commit afe4fd062416b ("pkt_sched: fq: Fair Queue packet
scheduler"), this patch adds a new socket option.

SO_MAX_PACING_RATE offers the application the ability to cap the
rate computed by transport layer. Value is in bytes per second.

u32 val = 1000000;
setsockopt(sockfd, SOL_SOCKET, SO_MAX_PACING_RATE, &val, sizeof(val));

To be effectively paced, a flow must use FQ packet scheduler.

Note that a packet scheduler takes into account the headers for its
computations. The effective payload rate depends on MSS and retransmits
if any.

I chose to make this pacing rate a SOL_SOCKET option instead of a
TCP one because this can be used by other protocols.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: remove bond_next_slave()

There are no users left, so it's safe to remove.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: don't use bond_next_slave() in bond_info_seq_next()

We don't need the circular loop there and it's the only current user of
bond_next_slave() - so just use the standard bond_for_each_slave().

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: remove unused __get_next_agg()

It has no users, so it's safe to remove it completely.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: make bond_3ad_unbind_slave() use bond_for_each_slave()

Convert all instances of

for (agg = __get_first_agg(); agg; agg = __get_next_port)

to the standard bond_for_each_slave().

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: make ad_agg_selection_logic() use bond_for_each_slave()

Convert all instances of

for (agg = __get_first_agg(); agg; agg = __get_next_port)

to the standard bond_for_each_slave(). Also, remove the useless checks
before calling bond_3ad_set_carrier() - if we have something NULL - it
would fire long ago, in __get_first/next_port(), per example.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: make __get_active_agg() use bond_for_each_slave()

Currently we're relying on suboptimal construct

for (; aggregator; aggregator = __get_next_agg(aggregator)) {

where aggregator is an argument of __get_active_agg() which is _always_ the
first slave's aggregator - judging by all the callers, comments in the
ad_agg_selection_logic() and by logic.

Convert it to use the standard bond_for_each_slave().

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: make ad_port_selection_logic() use bond_for_each_slave()

Currently, ad_port_selection_logic() uses

for (aggregator = __get_first_agg(port); aggregator;
aggregator = __get_next_agg(aggregator)) {

construct, however it's suboptimal, difficult to read and understand.

Change it to a standard bond_for_each_slave(), so that we won't need
__get_first/next_agg() and have it more readable.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: remove __get_first_port()

Currently we have only one user of it, so it's kind of useless and just
obfusicates things.

Remove it and move the logic to the only user -
bond_3ad_state_machine_handler().

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: remove __get_next_port()

Currently this function is only used in constructs like

for (port = __get_first_port(bond); port; port = __get_next_port(port))

which is basicly the same as

bond_for_each_slave(bond, slave, iter) {
port = &(SLAVE_AD_INFO(slave).port);

but a more time consuming.

Remove the function and convert the users to bond_for_each_slave().

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: verify if we still have slaves in bond_3ad_unbind_slave()

After commit 1f718f0f4f97145f4072d2d72dcf85069ca7226d ("bonding: populate
neighbour's private on enslave"), we've moved the unlinking of the slave
to the earliest position possible - so that nobody will see an
half-uninited slave.

However, bond_3ad_unbind_slave() relied that, even while removing the last
slave, it is still accessible - via __get_first_agg() (and, eventually,
bond_first_slave()).

Fix that by verifying if the aggregator return is an actual aggregator, but
not NULL.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: correctly verify for the first slave in bond_enslave

After commit 1f718f0f4f97145f4072d2d72dcf85069ca7226d ("bonding: populate
neighbour's private on enslave"), we've moved the actual 'linking' in the
end of the function - so that, once linked, the slave is ready to be used,
and is not still in the process of enslaving.

However, 802.3ad verified if it's the first slave by looking at the

if (bond_first_slave(bond) == new_slave)

which, because we've moved the linking to the end, became broken - on the
first slave bond_first_slave(bond) returns NULL.

Fix this by verifying if the prev_slave, that equals bond_last_slave(), is
actually populated - if it is - then it's not the first slave, and vice
versa.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: use pcie_get_minimum_link()

Use common code for getting the pcie link speed/width for debug printing.

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Add support for EXTPHY2 LED mode

Add new LED mode for the BCM848xx to support new board type.

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Change function prototype

Change bnx2x_bsc_read function prototype (more of a cosmetic change).

Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Don't disable/enable SR-IOV when loading

Current bnx2x implementation controls the number of VFs only by
standard sysfs support, and will reject setting the number of VFs
when the PF is not loaded.
As a result, there is no need to schedule a delayed work to enable
SR-IOV when PF is loaded, as the number of VFs at that point
must be 0.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Correct VF driver info

When running ethtool on VF interfaces, returning values should indicate
that the interface does not support self-test or register dump.

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Test nvram when interface is down

Since commit 3fb43eb ("bnx2x: Change to D3hot only on removal") nvram
is accessible whenever the driver is loaded - Thus it is possible to
test it during self-test even if the interface is down

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: processing ancillary IP_TOS or IP_TTL

If IP_TOS or IP_TTL are specified as ancillary data, then sendmsg() sends out
packets with the specified TTL or TOS overriding the socket values specified
with the traditional setsockopt().

The struct inet_cork stores the values of TOS, TTL and priority that are
passed through the struct ipcm_cookie. If there are user-specified TOS
(tos != -1) or TTL (ttl != 0) in the struct ipcm_cookie, these values are
used to override the per-socket values. In case of TOS also the priority
is changed accordingly.

Two helper functions get_rttos and get_rtconn_flags are defined to take
into account the presence of a user specified TOS value when computing
RT_TOS and RT_CONN_FLAGS.

Signed-off-by: Francesco Fusco <ffusco@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: IP_TOS and IP_TTL can be specified as ancillary data

This patch enables the IP_TTL and IP_TOS values passed from userspace to
be stored in the ipcm_cookie struct. Three fields are added to the struct:

- the TTL, expressed as __u8.
  The allowed values are in the [1-255].
  A value of 0 means that the TTL is not specified.

- the TOS, expressed as __s16.
  The allowed values are in the range [0,255].
  A value of -1 means that the TOS is not specified.

- the priority, expressed as a char and computed when
  handling the ancillary data.

Signed-off-by: Francesco Fusco <ffusco@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

lance: Fix hardcoded interrupt name lp->name to use system device value

The lance interrupt handler was using the hard-coded name which would make it difficult to tell where the interrupt came from. Changed to use the device name that made the interrupt.

Signed-off-by: Nate Levesque <thenaterhood@gmail.com>
Reviewed-by: Matthew Whitehead <tedheadster@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

hp100: replace hardcoded name in /proc/interrupts with interface name

The /proc/interrupts file displays hp100, which is not the accepted style. Printing eth%d is more helpful.

Signed-off-by: Mihir Singh <me@mihirsingh.com>
Reviewed-By: Matthew Whitehead <tedheadster@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: compare sernum when walking fib for /proc/net/ipv6_route as safety net

This patch provides an additional safety net against NULL
pointer dereferences while walking the fib trie for the new
/proc/net/ipv6_route walkers. I never needed it myself and am unsure
if it is needed at all, but the same checks where introduced in
2bec5a369ee79576a3eea2c23863325089785a2c ("ipv6: fib: fix crash when
changing large fib while dumping it") to fix NULL pointer bugs.

This patch is separated from the first patch to make it easier to revert
if we are sure we can drop this logic.

Cc: Ben Greear <greearb@candelatech.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: avoid high order memory allocations for /proc/net/ipv6_route

Dumping routes on a system with lots rt6_infos in the fibs causes up to
11-order allocations in seq_file (which fail). While we could switch
there to vmalloc we could just implement the streaming interface for
/proc/net/ipv6_route. This patch switches /proc/net/ipv6_route from
single_open_net to seq_open_net.

loff_t *pos tracks dst entries.

Also kill never used struct rt6_proc_arg and now unused function
fib6_clean_all_ro.

Cc: Ben Greear <greearb@candelatech.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: at803x: add suspend/resume callbacks

When WOL is enabled, the chip can't be put into power-down (BMCR_PDOWN)
mode, as that will also switch off the MAC, which consequently leads to
a link loss.

Use BMCR_ISOLATE in that case, which will at least save us some
milliamperes in comparison to normal operation mode.

Signed-off-by: Daniel Mack <zonque@gmail.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: at803x: don't pass function pointers with &

Just a cosmetic cleanup.

Signed-off-by: Daniel Mack <zonque@gmail.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'qlge'

Jitendra Kalsaria says:

====================
This patch series enhance the handling of nested vlan tags in Rx path.

V2 changes:
* removed module parameter.

V3 changes:
* Users can enable or disable hardware VLAN acceleration using ethtool
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

qlge: Update version to 1.00.00.33

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlge: Enhance nested VLAN (Q-in-Q) handling.

o Adapter doesn’t handle packets with nested VLAN tags in
Rx path. User can turn off VLAN tag stripping in the hardware
and let the stack handle stripping of VLAN tags in the Rx path.

o Users can enable or disable hardware VLAN acceleration using ethtool

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netxen_nic: Update version to 4.0.82

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netxen_nic: Print ULA information

This patch reads CAMRAM(0x178) where FW writes a key for ULA and non-ULA
adapter and based on the key, driver logs the message.

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[networking]device.h: Remove extern from function prototypes

There are a mix of function prototypes with and without extern
in the kernel sources. Standardize on not using extern for
function prototypes.

Function prototypes don't need to be written with extern.
extern is assumed by the compiler. Its use is as unnecessary as
using auto to declare automatic/local variables in a block.

Signed-off-by: Joe Perches <joe@perches.com>

net.h/skbuff.h: Remove extern from function prototypes

There are a mix of function prototypes with and without extern
in the kernel sources. Standardize on not using extern for
function prototypes.

Function prototypes don't need to be written with extern.
extern is assumed by the compiler. Its use is as unnecessary as
using auto to declare automatic/local variables in a block.

Signed-off-by: Joe Perches <joe@perches.com>

netfilter: Remove extern from function prototypes

There are a mix of function prototypes with and without extern
in the kernel sources. Standardize on not using extern for
function prototypes.

Function prototypes don't need to be written with extern.
extern is assumed by the compiler. Its use is as unnecessary as
using auto to declare automatic/local variables in a block.

Signed-off-by: Joe Perches <joe@perches.com>

Merge branch 'bonding_neighbours'

bonding: use neighbours instead of own lists

Veaceslav Falico says:

====================
This patchset introduces all the needed infrastructure, on top of current
adjacent lists, to be able to remove bond's slave_list/slave->list. The
overhead in memory/CPU is minimal, and after the patchset bonding can rely
on its slave-related functions, given the proper locking. I've done some
netperf benchmarks on a vm, and the delta was about 0.1gbps for 35gbps as a
whole, so no speed fluctuations.

It also automatically creates lower/upper and master symlinks in dev's
sysfs directory.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: create sysfs symlinks for neighbour devices

Also, remove the same functionality from bonding - it will be already done
for any device that links to its lower/upper neighbour.

The links will be created for dev's kobject, and will look like
lower_eth0 for lower device eth0 and upper_bridge0 for upper device
bridge0.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: expose the master link to sysfs, and remove it from bond

Currently, we can have only one master upper neighbour, so it would be
useful to create a symlink to it in the sysfs device directory, the way
that bonding now does it, for every device. Lower devices from
bridge/team/etc will automagically get it, so we could rely on it.

Also, remove the same functionality from bonding.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: unlink the upper neighbour before unregistering

On netdev unregister we're removing also all of its sysfs-associated stuff,
including the sysfs symlinks that are controlled by netdev neighbour code.
Also, it's a subtle race condition - cause we can still access it after
unregistering.

Move the unlinking right before the unregistering to fix both.

CC: Patrick McHardy <kaber@trash.net>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: link the upper neighbour only after registering

Otherwise users might access it without being fully registered, as per
sysfs - it only inits in register_netdevice(), so is unusable till it is
called.

CC: Patrick McHardy <kaber@trash.net>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: remove slave lists

And all the initialization.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: use neighbours for bond_next_slave()

Use the new function __bond_next_slave().

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: add __bond_next_slave() which uses neighbours

Add a new function, __bond_next_slave(), which uses neighbours to find the
next slave after the slave provided. It will be further used to gradually
go start using neighbour netdev_adjacent infrastructure instead of
bonding's own lists.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: remove bond_prev_slave()

We don't really need it, and it's really hard to RCUify the list->prev.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: convert first/last slave logic to use neighbours

For that, use netdev_adjacent_get_private(list_head) on bond's lower
neighbour list members. Also, add a small macro - bond_slave_list(bond),
which returns the bond list via neighbour list.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add a possibility to get private from netdev_adjacent->list

It will be useful to get first/last element.

CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: convert bond_has_slaves() to use the neighbour list

The same way as it was used for its own slave_list.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: add bond_has_slaves() and use it

Currently we verify if we have slaves by checking if bond->slave_list is
empty. Create a define bond_has_slaves() and use it, a bit more readable
and easier to change in the future.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: remove unused bond_for_each_slave_from()

It has no users, so we can remove it.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: rework bond_ab_arp_probe() to use bond_for_each_slave()

Currently it uses the hard-to-rcuify bond_for_each_slave_from(), and also
it doesn't check every slave for disrepencies between the actual
IS_UP(slave) and the slave->link == BOND_LINK_UP, but only till we find the
next suitable slave.

Fix this by using bond_for_each_slave() and storing the first good slave in
*before till we find the current_arp_slave, after that we store the first good
slave in new_slave. If new_slave is empty - use the slave stored in before,
and if it's also empty - then we didn't find any suitable slave.

Also, in the meanwhile, check for each slave status.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: rework bond_find_best_slave() to use bond_for_each_slave()

bond_find_best_slave() does not have to be balanced - i.e. return the slave
that is *after* some other slave, but rather return the best slave that
suits, except of bond->primary_slave - in which case we just return it if
it's suitable.

After that we just look through all the slaves and return either first up
slave or the slave whose link came back earliest.

We also don't care about curr_active_slave lock cause we use it in
bond_should_change_active() only and there we take it right away - i.e. it
won't go away.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: rework rlb_next_rx_slave() to use bond_for_each_slave()

Currently, we're using bond_for_each_slave_from(), which is really hard to
implement under RCU and/or neighbour list.

Remove it and use bond_for_each_slave() instead, taking care of the last
used slave.

Also, rename next_rx_slave to rx_slave and store the current (last)
rx_slave.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: rework bond_3ad_xmit_xor() to use bond_for_each_slave() only

Currently, there are two loops - first we find the first slave in an
aggregator after the xmit_hash_policy() returned number, and after that we
loop from that slave, over bonding head, and till that slave to find any
suitable slave to send the packet through.

Replace it by just one bond_for_each_slave() loop, which first loops
through the requested number of slaves, saving the first suitable one, and
after that we've hit the requested number of slaves to skip - search for
any up slave to send the packet through. If we don't find such kind of
slave - then just send the packet through the first suitable slave found.

Logic remains unchainged, and we skip two loops. Also, refactor it a bit
for readability.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: use bond_for_each_slave() in bond_uninit()

We're safe agains removal there, cause we use neighbours primitives.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: make bond_for_each_slave() use lower neighbour's private

It needs a list_head *iter, so add it wherever needed. Use both non-rcu and
rcu variants.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Dimitris Michailidis <dm@chelsio.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: remove bond_for_each_slave_continue_reverse()

We only use it in rollback scenarios and can easily use the standart
bond_for_each_dev() instead.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add for_each iterators through neighbour lower link's private

Add a possibility to iterate through netdev_adjacent's private, currently
only for lower neighbours.

Add both RCU and RTNL/other locking variants of iterators, and make the
non-rcu variant to be safe from removal.

CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: modify bond_get_slave_by_dev() to use neighbours

It should be used under rtnl/bonding lock, so use the non-RCU version.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: populate neighbour's private on enslave

Use the new provided function when attaching the lower slave to populate
its ->private with struct slave *new_slave. Also, move it to the end to
be able to 'find' it only after it was completely initialized, and
deinitialize in the first place on release.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add netdev_adjacent->private and allow to use it

Currently, even though we can access any linked device, we can't attach
anything to it, which is vital to properly manage them.

To fix this, add a new void *private to netdev_adjacent and functions
setting/getting it (per link), so that we can save, per example, bonding's
slave structures there, per slave device.

netdev_master_upper_dev_link_private(dev, upper_dev, private) links dev to
upper dev and populates the neighbour link only with private.

netdev_lower_dev_get_private{,_rcu}() returns the private, if found.

CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add RCU variant to search for netdev_adjacent link

Currently we have only the RTNL flavour, however we can traverse it while
holding only RCU, so add the RCU search. Add an RCU variant that uses
list_head * as an argument, so that it can be universally used afterwards.

CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Cong Wang <amwang@redhat.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add adj_list to save only neighbours

Currently, we distinguish neighbours (first-level linked devices) from
non-neighbours by the neighbour bool in the netdev_adjacent. This could be
quite time-consuming in case we would like to traverse *only* through
neighbours - cause we'd have to traverse through all devices and check for
this flag, and in a (quite common) scenario where we have lots of vlans on
top of bridge, which is on top of a bond - the bonding would have to go
through all those vlans to get its upper neighbour linked devices.

This situation is really unpleasant, cause there are already a lot of cases
when a device with slaves needs to go through them in hot path.

To fix this, introduce a new upper/lower device lists structure -
adj_list, which contains only the neighbours. It works always in
pair with the all_adj_list structure (renamed from upper/lower_dev_list),
i.e. both of them contain the same links, only that all_adj_list contains
also non-neighbour device links. It's really a small change visible,
currently, only for __netdev_adjacent_dev_insert/remove(), and doesn't
change the main linked logic at all.

Also, add some comments a fix a name collision in
netdev_for_each_upper_dev_rcu() and rework the naming by the following
rules:

netdev_(all_)(upper|lower)_*

If "all_" is present, then we work with the whole list of upper/lower
devices, otherwise - only with direct neighbours. Uninline functions - to
get better stack traces.

CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Cong Wang <amwang@redhat.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: use lists as arguments instead of bool upper

Currently we make use of bool upper when we want to specify if we want to
work with upper/lower list. It's, however, harder to read, debug and
occupies a lot more code.

Fix this by just passing the correct upper/lower_dev_list list_head pointer
instead of bool upper, and work internally with it.

CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Cong Wang <amwang@redhat.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: neighbour: use source address of last enqueued packet for solicitation

Currently we always use the first member of the arp_queue to determine
the sender ip address of the arp packet (or in case of IPv6 - source
address of the ndisc packet). This skb is fixed as long as the queue is
not drained by a complete purge because of a timeout or by a successful
response.

If the first packet enqueued on the arp_queue is from a local application
with a manually set source address and the to be discovered system
does some kind of uRPF checks on the source address in the arp packet
the resolving process hangs until a timeout and restarts. This hurts
communication with the participating network node.

This could be mitigated a bit if we use the latest enqueued skb's
source address for the resolving process, which is not as static as
the arp_queue's head. This change of the source address could result in
better recovery of a failed solicitation.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Julian Anastasov <ja@ssi.bg>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>