git.proxmox.com Git - mirror_ubuntu-focal-kernel.git/log

ixgbe: Correct thermal sensor event check

The thermal sensor event logic is messed up, because it can execute
the code when there is no thermal event. The current logic is that
it will exit when !capable && !event whereas it really should exit
when !capable || !event. For one thing, it means that the service
task is doing too much work. It probably has some other symptoms as
well. So, correct the logic, simplifying to only execute when there
is a thermal event. The capable check is redundant.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
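
The difference between the two exit conditions in the thermal sensor fix above is easiest to see as a truth table. The following is only an illustrative Python model, not the driver code: `capable` and `event` stand in for the capability flag and the pending thermal event, and all function names are made up for this sketch.

```python
from itertools import product

def old_exits_early(capable: bool, event: bool) -> bool:
    # Buggy condition: bail out only when BOTH are false.
    return (not capable) and (not event)

def new_exits_early(capable: bool, event: bool) -> bool:
    # Simplified condition from the commit: bail out whenever there is no event.
    return not event

def runs_without_event(exits_early) -> list:
    """List the (capable, event) inputs where the handler runs with no event."""
    return [(c, e) for c, e in product([False, True], repeat=2)
            if not exits_early(c, e) and not e]
```

With the old condition the handler still runs for a capable device that has no pending event; with the new one it never does.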

ixgbe: enable L3/L4 filtering for Tx switched packets

This will ensure that VF-to-VF traffic on the same PF
is filtered to allow RSS operation.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: Remove MAC X550EM_X 1Gbase-t led_[on|off] support

Since FW configures the PHY and MAC X550EM_X has no
PHY access, led_[on|off] is not supported with the 1Gbase-t design.

Removed MAC X550EM_X 1Gbase-t led_[on|off] support by setting
function pointers to NULL and added NULL pointer checks. Also set
init_led_link_act to NULL and added NULL pointer check.

Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: initialize u64_stats_sync structures early at ixgbe_probe

Fix the following CallTrace:
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 71 PID: 1 Comm: swapper/0 Not tainted 4.8.8-WR9.0.0.1_standard #11
Hardware name: Intel Corporation S2600WTT/S2600WTT,
BIOS GRNDSDP1.86B.0036.R05.1407140519 07/14/2014
00200086 00200086 eb5e1ab8 c144dd70 00000000 00000000 eb5e1af8 c10af89a
c1d23de4 eb5e1af8 00000009 eb5d8600 eb5d8638 eb5e1af8 c10b14d8 00000009
0000000a c1d32911 00000000 00000000 e44c826c eb5d8000 eb5e1b74 c10b214e
Call Trace:
[<c144dd70>] dump_stack+0x5f/0x8f
[<c10af89a>] register_lock_class+0x25a/0x4c0
[<c10b14d8>] ? check_irq_usage+0x88/0xc0
[<c10b214e>] __lock_acquire+0x5e/0x17a0
[<c1abdb9b>] ? _raw_spin_unlock_irqrestore+0x3b/0x70
[<c10cf14a>] ? rcu_read_lock_sched_held+0x8a/0x90
[<c10b3c5f>] lock_acquire+0x9f/0x1f0
[<c1922dcf>] ? dev_get_stats+0x5f/0x110
[<c176e6b3>] ixgbe_get_stats64+0x113/0x320
[<c1922dcf>] ? dev_get_stats+0x5f/0x110
[<c1922dcf>] dev_get_stats+0x5f/0x110
[<c1ab5415>] rtnl_fill_stats+0x40/0x105
[<c193dd45>] rtnl_fill_ifinfo+0x4c5/0xd20
[<c11c5115>] ? __kmalloc_node_track_caller+0x1a5/0x410
[<c1917487>] ? __kmalloc_reserve.isra.42+0x27/0x80
[<c191754f>] ? __alloc_skb+0x6f/0x270
[<c1942291>] rtmsg_ifinfo_build_skb+0x71/0xd0
[<c194230a>] rtmsg_ifinfo.part.23+0x1a/0x50
[<c1923dad>] ? call_netdevice_notifiers_info+0x2d/0x60
[<c194236b>] rtmsg_ifinfo+0x2b/0x40
[<c192f997>] register_netdevice+0x3d7/0x4d0
[<c192faa7>] register_netdev+0x17/0x30
[<c177b83d>] ixgbe_probe+0x118d/0x1610
[<c1498202>] local_pci_probe+0x32/0x80
[<c1498172>] ? pci_match_device+0xd2/0x100
[<c14991e0>] pci_device_probe+0xc0/0x110
[<c1652cc5>] driver_probe_device+0x1c5/0x280
[<c1498172>] ? pci_match_device+0xd2/0x100
[<c1652e09>] __driver_attach+0x89/0x90
[<c1652d80>] ? driver_probe_device+0x280/0x280
[<c165114f>] bus_for_each_dev+0x4f/0x80
[<c165269e>] driver_attach+0x1e/0x20
[<c1652d80>] ? driver_probe_device+0x280/0x280
[<c1652317>] bus_add_driver+0x1a7/0x220
[<c1653a79>] driver_register+0x59/0xe0
[<c1f897b8>] ? igb_init_module+0x49/0x49
[<c1497b2a>] __pci_register_driver+0x4a/0x50
[<c1f8985d>] ixgbe_init_module+0xa5/0xc4
[<c1000485>] do_one_initcall+0x35/0x150
[<c107e818>] ? parameq+0x18/0x70
[<c1f395d8>] ? repair_env_string+0x12/0x51
[<c107ead0>] ? parse_args+0x260/0x3b0
[<c1074f73>] ? __usermodehelper_set_disable_depth+0x43/0x50
[<c1f39e90>] kernel_init_freeable+0x19b/0x267
[<c1f395c6>] ? set_debug_rodata+0xf/0xf
[<c10b1e7b>] ? trace_hardirqs_on+0xb/0x10
[<c1abdc02>] ? _raw_spin_unlock_irq+0x32/0x50
[<c1085f0b>] ? finish_task_switch+0xab/0x1f0
[<c1085ec9>] ? finish_task_switch+0x69/0x1f0
[<c1ab6a30>] kernel_init+0x10/0x110
[<c108bd65>] ? schedule_tail+0x25/0x80
[<c1abe422>] ret_from_kernel_thread+0xe/0x24
[<c1ab6a20>] ? rest_init+0x130/0x130

This CallTrace occurred on a 32-bit kernel with CONFIG_PROVE_LOCKING
enabled.

This happens during the ixgbe driver probe stage: when
ixgbe_get_stats64 is reached, the seqcount/seqlock is not yet
initialized. It was initialized in the TX/RX resource setup routine,
but that is too late, so lockdep emits this warning.

To fix this, move the u64_stats_init call to the driver probe stage,
after the RX/TX rings have been initialized and before the seqcount is
first read.

Signed-off-by: Liwei Song <liwei.song@windriver.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>

ixgbe/ixgbevf: Enables TSO for MPLS encapsulated packets

This patch advertises TSO & GSO features in netdev->mpls_features.
In ixgbe(vf)_tso() where we set up segmentation offload, the IP
header will be the inner network header when eth_p_mpls() indicates
the Ethernet protocol is MPLS (UC or MC).

Suggested-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Scott Peterson <scott.d.peterson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

bnxt_en: Fix xmit_more with BQL.

We need to write the doorbell if BQL has stopped the queue and
skb->xmit_more is set. Otherwise it is possible for the tx queue to
rot and cause tx timeout.

Fixes: 4d172f21cefe ("bnxt_en: Implement xmit_more.")
Suggested-by: Yuval Mintz <yuval.mintz@cavium.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
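
The doorbell rule described in the xmit_more fix above can be written as a single predicate. This is a hedged Python model rather than the bnxt_en code: `xmit_more` mirrors the skb flag, while `queue_stopped` is an assumed boolean meaning the TX queue has been stopped (ring full, or stopped by BQL).

```python
def must_write_doorbell(xmit_more: bool, queue_stopped: bool) -> bool:
    """Return True when the TX doorbell has to be written for this packet."""
    # Deferring the doorbell is only safe if another packet will follow.
    # A stopped queue means no further transmit call is coming, so the
    # descriptors queued so far must be flushed now to avoid a stalled ring.
    return (not xmit_more) or queue_stopped
```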

Merge branch 'bnxt_en-Misc-updates-for-net-next'

Michael Chan says:

====================
bnxt_en: Misc. updates for net-next.

The 1st 2 patches add short firmware message support for new VF devices.
The 3rd patch adds a pci shutdown callback for the RDMA driver for proper
shutdown. The next 3 patches improve the doorbell operations by
eliminating the double doorbell workaround on newer chips, and by adding
xmit_more support. The last patch adds a parameter to bnxt_set_dflt_rings().
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Pass in sh parameter to bnxt_set_dflt_rings().

In the existing code, the local variable sh is hardcoded to true to
calculate default rings for shared ring configuration. It is better
to have the caller determine the value of sh.

Reported-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Implement xmit_more.

Do not write the TX doorbell if skb->xmit_more is set unless the TX
queue is full.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Optimize doorbell write operations for newer chips.

Older chips require the doorbells to be written twice, but newer chips
do not. Add a new common function bnxt_db_write() to write all
doorbells appropriately depending on the chip. Eliminating the extra
doorbell on newer chips has a significant performance improvement
on pktgen.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
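
As a rough illustration of a common doorbell helper that depends on the chip generation, the Python sketch below records register writes in a list. The name `db_write_model` and the `needs_double_write` flag are hypothetical; the real bnxt_db_write() operates on device registers and is not reproduced here.

```python
def db_write_model(writes: list, value: int, needs_double_write: bool) -> None:
    """Record one doorbell write, or two on chips that need the workaround."""
    writes.append(value)
    if needs_double_write:
        # Older chips: repeat the same doorbell value a second time.
        writes.append(value)
```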

bnxt_en: Add additional chip ID definitions.

Add additional chip definitions and macros for all supported chips.
Add a new macro BNXT_CHIP_P4_PLUS for the newer generation of chips and
use the macro to properly determine the features supported by these
newer chips.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Add a callback to inform RDMA driver during PCI shutdown.

When bnxt_en gets a PCI shutdown call, we need to have a new callback
to inform the RDMA driver to do proper shutdown and removal.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Add PCI IDs for BCM57454 VF devices.

Signed-off-by: Deepak Khungar <deepak.khungar@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Support for Short Firmware Message

The new short message format is used on the new BCM57454 VFs. Each
firmware message is a fixed 16-byte message sent using the standard
firmware communication channel. The short message has a DMA address
pointing to the legacy long firmware message.

Signed-off-by: Deepak Khungar <deepak.khungar@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
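
To make the idea of a fixed 16-byte message carrying a DMA address concrete, here is a small Python sketch. The field layout (request type, signature, padding, long-message length, 64-bit address, all little-endian) is an assumption chosen to add up to 16 bytes; the commit message itself only states the size and that the message points at the legacy long message.

```python
import struct

# Assumed layout: four 16-bit fields followed by one 64-bit DMA address.
SHORT_MSG_FORMAT = "<HHHHQ"

def build_short_msg(req_type: int, signature: int,
                    long_len: int, dma_addr: int) -> bytes:
    """Pack a short message that refers to a long message at dma_addr."""
    padding = 0
    return struct.pack(SHORT_MSG_FORMAT, req_type, signature,
                       padding, long_len, dma_addr)
```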

net: dsa: b53: remove unused dev argument

The port net device passed to b53_fdb_copy is not used. Remove it.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: remove dsa_port_is_bridged

The helper is only used once and makes the code more complicated than it
should be. Remove it and reorganize the variables so that it fits in 80
columns.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4: Fix netdev_features flag

GRO is not supported by Chelsio HW when rx_csum is disabled.
Update the netdev features flag when rx_csum is modified.

Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4: FW upgrade fixes

Disable FW_OK flag while flashing Firmware. This will help to fix any
potential mailbox timeouts during Firmware flash.

Grab new devlog parameters after Firmware restart. When we FLASH new
Firmware onto an adapter, the new Firmware may have the Firmware Device Log
located at a different memory address or have a different size for it.

Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4: add new T5 pci device id

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4: keep carrier off before registering netdev

Mark carrier off before registering netdev to ensure that the vlan device
picks up the correct state of the carrier.

Signed-off-by: Surendra Mobiya <surendra@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-qualcomm-add-QCA7000-UART-driver'

Stefan Wahren says:

====================
net: qualcomm: add QCA7000 UART driver

The Qualcomm QCA7000 HomePlug GreenPHY supports two interfaces:
UART and SPI. This patch series adds the missing support for UART.

This driver is based on the Qualcomm code [1], but contains some changes:
* use random MAC address per default
* use net_device_stats from device
* share frame decoding between SPI and UART driver
* improve error handling
* reimplement tty_wakeup with work queue (based on slcan)
* use new serial device bus instead of ldisc

The patches 1 - 3 are just for clean up and are not related to
the UART support. Patch 4 adds SET_NETDEV_DEV() to qca_spi.
Patches 5 - 16 prepare the existing QCA7000 code for UART support.
The last patch contains the new driver.

The code itself has been tested on a Freescale i.MX28 board and
a Raspberry Pi Zero.

Changes in v8:
  * add necessary header includes to qca_7k.c in order to reflect
    dependencies

Changes in v7:
  * fix race between tx workqueue and device deregistration (reported by Lino)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: qualcomm: add QCA7000 UART driver

This patch adds the Ethernet over UART driver for the
Qualcomm QCA7000 HomePlug GreenPHY.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

dt-bindings: qca7000: append UART interface to binding

This merges the serdev binding for the QCA7000 UART driver (Ethernet over
UART) into the existing document.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dt-bindings: slave-device: add current-speed property

This adds a new DT property to define the current baud rate of the
slave device.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

dt-bindings: qca7000: rename binding

Before we can merge the QCA7000 UART binding the document needs to be
renamed.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dt-bindings: qca7000-spi: Rework binding

In preparation for the QCA7000 UART binding rework the binding document.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qualcomm: make qca_7k_common a separate kernel module

In order to share common functions between QCA7000 SPI and UART protocol
driver the qca_7k_common needs to be a separate kernel module.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qualcomm: prepare frame decoding for UART driver

Unfortunately the frame format is not exactly identical between SPI
and UART. In case of SPI there is an additional HW length at the
beginning. So store the initial state to make the decoding state machine
more flexible and easy to extend for UART support.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qualcomm: rename qca_framing.c to qca_7k_common.c

As preparation for the upcoming UART driver we need a module
which contains common functions for both interfaces. The module
qca_framing is a good candidate but renaming to qca_7k_common would
make it clear.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qca_spi: Clarify MODULE_DESCRIPTION

Since this driver is specific to the QCA7000, we should make the module
description more precise.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qualcomm: move qcaspi_tx_cmd to qca_spi.c

The function qcaspi_tx_cmd() is only called from qca_spi.c. So we better
move it there.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qca_spi: remove QCASPI_MTU

There is no need for an additional MTU define.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qualcomm: Improve readability of length defines

In order to avoid mixing things up, make the MTU and frame length
defines easier to read.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qualcomm: use net_device_ops instead of direct call

There is no need to export qcaspi_netdev_open and qcaspi_netdev_close
because they are also accessible via the net_device_ops.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qca_spi: Use SET_NETDEV_DEV()

Use SET_NETDEV_DEV() in qca_spi to create the "/sys/class/net/<if>/device"
symlink.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qca_7k: Use BIT macro

Use the BIT macro for the CONFIG and INT register values.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qca_framing: use u16 for frame offset

It doesn't make sense to use a signed variable for offset here, so
fix it up.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qualcomm: qca_7k: clean up header includes

Currently the includes don't reflect the dependencies. So
fix this up by removing all unnecessary entries and adding the
necessary ones explicitly.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-phy-Support-managed-Cortina-phys'

Bogdan Purcareata says:

====================
net: phy: Support managed Cortina phys

So far, the Cortina family phys (CS4340 in this particular case) are only
supported in fixed link mode (via fixed_phy_register). The generic 10G
phy driver does not work well with the phylib state machine, when the phy
is registered via of_phy_connect. This prohibits the user from describing the
phy nodes in the device tree.

In order to support this scenario, and to properly describe the board
device tree, add a minimal Cortina driver that reads the status from the
right register. With the generic 10G C45 driver, the kernel will print
messages like:
[    0.226521] mdio_bus 8b96000: Error while reading PHY16 reg at 1.6
[    0.232780] mdio_bus 8b96000: Error while reading PHY16 reg at 1.5

v3 -> v4:
- Add trademark info.
- Minor documentation entry consistency nit.

v2 -> v3:
- Add documentation entry.

v1 -> v2:
- Change approach for getting the phy_id from hacking get_phy_c45_ids to
  describing the device in the device tree via ethernet-phy-id.

More patch version changes per individual patches.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

dt-bindings: net: Add Cortina device tree bindings

Add device tree description info for Cortina 10G phy devices.

Signed-off-by: Bogdan Purcareata <bogdan.purcareata@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: Add Cortina CS4340 driver

Add basic support for Cortina PHY drivers. Support only CS4340 for now.
The phys are not compatible with IEEE 802.3 clause 22/45 registers.

Implement proper read_status support. The generic 10G phy driver causes
bus register access errors.

The driver should be described using the "ethernet-phy-id" device tree
compatible.

Signed-off-by: Bogdan Purcareata <bogdan.purcareata@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'qed-DCBx-and-Attentions-series'

Yuval Mintz says:

====================
qed: DCBx and Attentions series

The series contains 2 major components [& some odd bits]:
- The first 3 patches are DCBx-related, containing missing bits in the
   implementation, correcting existing API and removing code no longer
   necessary.
- Most of the remaining patches are interrupt/hw-attention related,
   adding handling for the differences between QL41xxx and QL45xxx.
   While at it, they also remove a large chunk of unnecessary structure
   definitions.

The series also contains a patch [#10] that was accidentally missing
from a previous series.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Cache alignment padding to match host

Improve PCI performance by adjusting padding sizes to match those of the
host machine's cacheline.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Mask parities after occurrence

Parities might exhibit a flood behavior since we re-enable the
attention line without preventing the parity from re-triggering the
assertion.
Mask the source in AEU until the parity is handled.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Print multi-bit attentions properly

In the structure reflecting the AEU hw block some entries
represent multiple HW bits, and the associated name is in fact
a pattern.
Today, whenever such an attention is asserted, the resulting
prints show the pattern string instead of indicating which
of the possible bits was set.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Differentiate adapter-specific attentions

There are 4 attention bits in AEU that have different meaning
for QL45xxx and QL41xxx adapters.

Instead of doing a massive infrastructure change in favor of these
bits, we implement a point fix where only those four would change
meaning dependent on the adapter involved.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Get rid of the attention-arrays

We have almost all the necessary information regarding attentions
in the logic employed for taking register dumps.
Add some more and get rid of the separate implementation we have today
for identifying & printing various attention sources.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Support dynamic s-tag change

In case management firmware indicates a change in the used S-tag,
propagate the configuration to HW and FW.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: QL41xxx VF MSI-x table

The QL41xxx adapters' PCI allows a single configuration for the
MSI-x table size of all child VFs of a given PF.
The existing code wouldn't cause the management firmware to set
that value, meaning the VFs would retain the default MSI-x table
size.

Introduce a new scheme so that whenever a VF is enabled, driver
would set the number of MSI-x to be the maximum over the various
VFs' needs.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Don't inherit RoCE DCBx for V2

Older firmware used by the device didn't distinguish between RoCE and RoCE
V2 from a DCBx configuration perspective, and as a result we used to
take the RoCE-related configuration and apply it to both.

Since we now support configuring each with its own values, there's no reason
to reflect [& configure] that both are using the same values.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Correct DCBx update scheme

Instead of using a boolean value that propagates to FW configuration,
use the proper firmware HSI values.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Add missing static/local dcbx info

Some getters are not getting filled with the correct information
regarding local DCBx.

Fixes: 49632b5822ea ("qed: Add support for static dcbx.")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-more-extack'

David Ahern says:

====================
net: another round of extack handling for routing

This set focuses on passing extack through lwtunnel and MPLS with
additional catches for IPv4 route add and minor cleanups in MPLS
encountered passing the extack arg around.

v2
- mindful of bloat adding duplicate messages
  + refactored prefix and prefix length checks in ipv4's fib_table_insert
    and fib_table_del
  + refactored label check in mpls

- split mpls cleanups into 2 patches
  + move nla_get_via up in af_mpls to avoid forward declaration
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: mpls: remove unnecessary initialization of err

err is initialized to EINVAL and not used before it is set again.
Remove the unnecessary initialization.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mpls: Make nla_get_via in af_mpls.c

nla_get_via is only used in af_mpls.c. Remove declaration from internal.h
and move up in af_mpls.c before first use. Code move only; no
functional change intended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mpls: Add extack messages for route add and delete failures

Add error messages for failures in adding and deleting mpls routes.
This covers most of the annoying EINVAL errors.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mpls: Pull common label check into helper

mpls_route_add and mpls_route_del have the same checks on the label.
Move to a helper. Avoid duplicate extack messages in the next patch.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
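
A shared label check used by both the add and delete paths might look like the Python sketch below. The specific checks (reserved range below 16, upper bound from a `platform_labels` setting) and the message strings are assumptions for illustration; the commit message only says that both functions perform the same label checks.

```python
MPLS_LABEL_FIRST_UNRESERVED = 16  # labels 0-15 are reserved

def label_error(label: int, platform_labels: int):
    """Return an error string for an unusable label, or None if it is fine."""
    if label < MPLS_LABEL_FIRST_UNRESERVED:
        return "label is in the reserved range"
    if label >= platform_labels:
        return "label is not below the configured platform_labels"
    return None

def route_add(label: int, platform_labels: int):
    # Both entry points call the one helper, so each message exists once.
    return label_error(label, platform_labels)

def route_del(label: int, platform_labels: int):
    return label_error(label, platform_labels)
```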

net: Fill in extack for mpls lwt encap

Fill in extack for errors in build_state for mpls lwt encap including
passing extack to nla_get_labels and adding error messages for failures
in it.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add extack arg to lwtunnel build state

Pass extack arg down to lwtunnel_build_state and the build_state callbacks.
Add messages for failures in lwtunnel_build_state, and add the extack arg to
nla_parse where possible in the build_state callbacks.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: lwtunnel: Add extack to encap attr validation

Pass extack down to lwtunnel_valid_encap_type and
lwtunnel_valid_encap_type_attr. Add messages for unknown
or unsupported encap types.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipv4: Add extack message for invalid prefix or length

Add extack error message for invalid prefix length and invalid prefix.
Example of the latter is a route spec containing 172.16.100.1/24, where
the /24 mask means the lower 8-bits should be 0. Amazing how easy that
one is to overlook when an EINVAL is returned.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
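
The 172.16.100.1/24 example above comes down to a mask test: any bit set below the prefix length makes the prefix invalid. The Python sketch below shows that check under stated assumptions; the function name and the returned strings are illustrative and are not claimed to be the kernel's exact messages.

```python
import ipaddress

def check_prefix(prefix: str, plen: int):
    """Return an error string, or None when prefix/plen is consistent."""
    if not 0 <= plen <= 32:
        return "invalid prefix length"
    key = int(ipaddress.IPv4Address(prefix))
    host_mask = (1 << (32 - plen)) - 1   # bits that must be zero
    if key & host_mask:
        return "invalid prefix for the given prefix length"
    return None
```

For instance, `check_prefix("172.16.100.1", 24)` reports an error because the low 8 bits are not zero, while `check_prefix("172.16.100.0", 24)` passes.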

net: ipv4: refactor key and length checks

fib_table_insert and fib_table_delete have the same checks on the prefix
and length. Refactor into a helper. Avoids duplicate extack messages in
the next patch.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'nfp-pci-core-hwmon-live-mac-addr-change'

Jakub Kicinski says:

====================
nfp: pci core, hwmon, live mac addr change

This series brings updates to core PCI code, SR-IOV, exposes
firmware's capability to change MAC address at runtime and HWMON
interfaces.

The PCI code updates include resiliency improvement in conditions
which are quite unusual, but still shouldn't make the driver oops.
We also handle very large device memory operation more gracefully.
A timeout is added to acquiring mutexes in device memory.

Pablo provides a patch to expose to the stack the ability to change
MAC addresses under traffic while David adds HWMON interface for
reading device temperature and power consumption.

Last three patches are minor improvements to the netdev code.

v2:
- add patch 1 - fix for devlink build;
- fix build issue with the hwmon patch.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: don't keep count for free buffers delayed kick

We only kick RX free buffer queue controller every NFP_NET_FL_BATCH
(currently 16) entries. This means that we will always kick the QC
when write ring index is divisible by NFP_NET_FL_BATCH. There is
no need to keep counts.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
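
The reasoning above can be modelled without a separate counter: the write index alone decides when to kick. This is an illustrative Python sketch, not the nfp code; `refill` and the `kicks` list are hypothetical names, and the batch size of 16 is the value quoted in the commit message.

```python
NFP_NET_FL_BATCH = 16  # value quoted in the commit message

def refill(wr_p: int, count: int, kicks: list) -> int:
    """Add `count` buffers one at a time; record a kick per full batch."""
    for _ in range(count):
        wr_p += 1
        # No pending-buffer counter needed: the index itself tells us
        # when a whole batch has been written.
        if wr_p % NFP_NET_FL_BATCH == 0:
            kicks.append(NFP_NET_FL_BATCH)
    return wr_p
```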

nfp: don't add ring size to index calculations

Adding ring size to index calculation is pointless, since index
will be masked with ring size - 1.

Suggested-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
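
Why adding the ring size is pointless follows from the masking itself, provided the ring size is a power of two. A minimal Python check, with `ring_index` as a made-up helper name:

```python
def ring_index(pos: int, ring_size: int) -> int:
    """Map a free-running position onto a slot of a power-of-two ring."""
    # Masking only works for power-of-two sizes, so enforce that here.
    assert ring_size > 0 and ring_size & (ring_size - 1) == 0
    return pos & (ring_size - 1)
```

Since `ring_size` has no bits in common with `ring_size - 1`, `ring_index(pos + ring_size, ring_size)` always equals `ring_index(pos, ring_size)`.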

nfp: fix print format for ring pointers in ring dumps

Ring pointers are unsigned. Fix the print formats to avoid
showing users negative values.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: don't wait for resources indefinitely

There is currently no timeout to the resource and lock acquiring
loops.  We printed warnings and depended on user sending a signal
to the waiting process to stop the waiting.  This doesn't work
very well when wait happens out of a work queue.  The simplest
example of that is PCI probe.  When user loads the module and card
is in a broken state modprobe will wait forever and signals sent
to it will not actually reach the probing thread.

Make sure all wait loops have a time out.  Set the upper wait time
to 60 seconds to stay on the safe side.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
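
A wait loop with an upper bound, as described above, can be sketched like this in Python. The helper name, the polling interval, and the use of a callable condition are assumptions for illustration; only the 60 second limit comes from the commit message.

```python
import time

def wait_for(condition, timeout_s: float = 60.0, poll_s: float = 0.01) -> bool:
    """Poll `condition` until it holds or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False      # give up instead of waiting for a signal
        time.sleep(poll_s)
```

The caller gets a definite failure it can report, which matters when nothing is able to interrupt the waiting thread.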

nfp: add hwmon support

Add support for retrieving temperature and power sensor and limits via NSP.

Signed-off-by: David Brunecz <david.brunecz@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: support variable NSP response lengths

We want to support extendable commands, where newer versions
of the management FW may provide more information. Zero out
the communication buffer before passing control to NSP. This
way if management FW is old and only fills in first N bytes,
the remaining ones will be zeros which extended ABI fields
should reserve as not supported/not available.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
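
The effect of zeroing the buffer first can be shown with a short Python model. Here `fw_response` stands for whatever bytes the management firmware writes back and `nsp_command` is a hypothetical name; the real transfer goes through the NSP interface and is not modelled.

```python
def nsp_command(fw_response: bytes, buf_len: int) -> bytes:
    """Return the buffer contents the driver sees after the command."""
    buf = bytearray(buf_len)             # zeroed before handing over control
    n = min(len(fw_response), buf_len)
    buf[:n] = fw_response[:n]            # old firmware fills only the first n bytes
    return bytes(buf)
```

Fields beyond what an older firmware fills in read back as zero, so an extended ABI can treat zero as "not supported".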

nfp: shorten CPP core probe logs

We currently print reserved BAR mappings info as we create them.
This makes the probe logs longer than necessary. Print into a
buffer instead and log all the info as a single line.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: support long reads and writes with the cpp helpers

nfp_cpp_{read,write}() helpers perform device memory mapping (setting
the PCIe -> NOC translation BARs) and accessing it.  They, however,
currently implicitly expect that the length of entire operation will
fit in one BAR translation window.  There is a number of 16MB windows
available, and we don't really need to access such large areas today.

If the user, however, manages to trick the driver into making a big
mapping (e.g. by providing a huge fake FW file), the driver will
print a warning saying "No suitable BAR found for request" and a
stack trace - which most users find concerning.

To be future-proof and not scare users with warnings, make the
nfp_cpp_{read,write}() helpers do accesses chunk by chunk if the area
size is large.  Set the notion of "large" to 2MB, which is the size
of the smallest BAR window.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
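
Splitting a large access into pieces of at most 2MB can be sketched as a generator. This Python version is an assumption about the shape of the loop, not the driver's code: it cuts fixed-size pieces starting at the requested offset and makes no claim about how the driver aligns them.

```python
CHUNK = 2 * 1024 * 1024  # 2MB, the "large" threshold named in the commit message

def split_access(offset: int, length: int, chunk: int = CHUNK):
    """Yield (offset, length) pieces, each no larger than `chunk`."""
    end = offset + length
    while offset < end:
        n = min(chunk, end - offset)
        yield offset, n
        offset += n
```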

nfp: only try to get to PCIe ctrl memory if BARs are wide enough

For accessing PCIe ctrl memory we depend on the BAR aperture being
large enough to reach all registers. Since the BAR aperture can
be set in the flash make sure the driver won't oops the kernel
when the PCIe configuration is unusual.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: don't set aux pointers if ioremap failed

If ioremap of PCIe ctrl memory failed we can still get to it through
PCI config space, therefore we allow ioremap() to fail. When it fails,
however, we must leave all the IOMEM pointers as NULL. Currently we
would calculate csr and em pointers, adding offsets to the potential
NULL value and therefore making the NULL-checks throughout the code
ineffective.
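
A minimal userspace sketch of the rule described above, assuming hypothetical names (`struct aux_ptrs`, `aux_ptrs_init()`) and made-up offsets; it does not reproduce the driver's structures. The point is only that derived pointers are computed after the base mapping is known to be valid, so later NULL checks stay meaningful.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical offsets, chosen only for illustration. */
#define CSR_OFF 0x1000
#define EM_OFF  0x2000

struct aux_ptrs {
	uint8_t *iomem;	/* base mapping, NULL if the mapping failed */
	uint8_t *csr;	/* derived pointers, must stay NULL in that case */
	uint8_t *em;
};

/* Derive the auxiliary pointers only when the base mapping exists. */
static void aux_ptrs_init(struct aux_ptrs *p, uint8_t *mapped)
{
	p->iomem = mapped;
	p->csr = NULL;
	p->em = NULL;
	if (!mapped)
		return;	/* leave everything NULL so NULL checks keep working */
	p->csr = mapped + CSR_OFF;
	p->em = mapped + EM_OFF;
}
```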

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: set driver VF limit

PCI subsystem has support for drivers limiting the number of VFs
available below what the IOV capability claims. Make use of it.

While at it remove the #ifdef/#endif on CONFIG_PCI_IOV, it was
there to avoid unnecessary warnings in case device read failed
but kernel doesn't have SR-IOV support anyway. Device reads
should not fail.

Note that we still need the driver-internal check for the case
where max VFs is 0 since PCI subsystem treats 0 as limit not set.
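
Why the driver-internal check is still needed can be shown with a small model. All names here (`ex_subsystem_total_vfs()`, `ex_subsystem_allows()`, `ex_driver_allows()`) are hypothetical, and the model only encodes the one rule stated above, that a limit of 0 is treated as "limit not set".

```c
#include <stdbool.h>

/* Models the rule from the text: a driver limit of 0 means "not set",
 * so the total falls back to what the IOV capability claims. */
static int ex_subsystem_total_vfs(int cap_total, int driver_limit)
{
	return driver_limit ? driver_limit : cap_total;
}

/* What a subsystem-level check alone would permit. */
static bool ex_subsystem_allows(int requested, int cap_total, int driver_limit)
{
	return requested <= ex_subsystem_total_vfs(cap_total, driver_limit);
}

/* Driver-internal check: a limit of 0 really means "no VFs". */
static bool ex_driver_allows(int requested, int driver_limit)
{
	return requested <= driver_limit;
}
```

In this model a request for 8 VFs with a limit of 0 passes the subsystem-level check but is rejected by the driver-internal one.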

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: add set_mac_address support while the interface is up

Expose FW app ability to change MAC address at runtime. Make sure
we only depend on it if FW app advertised the right capability.

Signed-off-by: Pablo Cascón <pablo.cascon@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: add MAY_USE_DEVLINK dependency

Fix build with DEVLINK=m and NFP=y.

Fixes: 1851f93fd2ee ("nfp: add devlink support")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: Relax error checking on sysfs_create_link()

Some Ethernet drivers will attach/connect to a PHY device before calling
register_netdevice() which is responsible for calling netdev_register_kobject()
which would do the network device's kobject initialization. In such a case,
sysfs_create_link() would return -ENOENT because the network device's kobject
is not ready yet, and we would fail to connect to the PHY device.

In order to keep things simple and symmetrical, we just take the success path as
indicative of the ability to access the network device's kobject, and create
the second link if that's the case.
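
The control flow can be sketched with a mock in userspace C. `mock_create_link()` and `attach_with_links()` are hypothetical stand-ins, not the phylib code; the mock fails with -ENOENT whenever the simulated network device kobject is not ready.

```c
#include <errno.h>
#include <stdbool.h>

/* Mock state: whether the network device's kobject is initialized. */
static bool netdev_kobj_ready;
static int links_created;

/* Hypothetical stand-in for link creation. */
static int mock_create_link(void)
{
	if (!netdev_kobj_ready)
		return -ENOENT;
	links_created++;
	return 0;
}

/* Returns 0 on a successful attach; *have_links tells whether both
 * links exist afterwards. */
static int attach_with_links(bool *have_links)
{
	int err;

	*have_links = false;
	err = mock_create_link();		/* first link */
	if (!err) {
		err = mock_create_link();	/* second link, only on success */
		if (err)
			return err;
		*have_links = true;
	}
	return 0;	/* a failed first link does not fail the attach */
}
```

Recording whether the links were created lets a later detach path remove only what actually exists.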

Fixes: 5568363f0cb3 ("net: phy: Create sysfs reciprocal links for attached_dev/phydev")
Reported-by: Woojung Hung <Woojung.Huh@microchip.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: mv88e6xxx: handle SERDES error appropriately

mv88e6xxx_serdes_power returns an error, so no need to print an error
message inside of it. Rather print it in its caller when the error is
ignored, which is in the mv88e6xxx_port_disable void function.

Catch and return its error in the counterpart mv88e6xxx_port_enable.
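
As a rough illustration of that split of responsibilities, here is a userspace sketch with hypothetical function names and a test knob in place of real hardware: the helper stays silent and returns the error, the enable path propagates it, and the void disable path logs it.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

static bool serdes_broken;	/* test knob: make the power call fail */
static int errors_logged;

/* Hypothetical helper: reports failure through its return value only. */
static int serdes_power(int port, bool on)
{
	(void)port;
	(void)on;
	return serdes_broken ? -EIO : 0;
}

/* The enable path can propagate the error to its caller. */
static int port_enable(int port)
{
	return serdes_power(port, true);
}

/* The disable path returns void, so logging is all it can do. */
static void port_disable(int port)
{
	if (serdes_power(port, false)) {
		fprintf(stderr, "port %d: failed to power off SERDES\n", port);
		errors_logged++;
	}
}
```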

Fixes: 04aca9938255 ("dsa: mv88e6xxx: Enable/Disable SERDES on port enable/disable")
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'rtnetlink-Updates-to-rtnetlink_event'

Vladislav Yasevich says:

====================
rtnetlink: Updates to rtnetlink_event()

First is the patch to add IFLA_EVENT attribute to the netlink message.  It
supports only currently white-listed events.
Like before, this is just an attribute that gets added to the rtnetlink
message only when the message was generated as a result of a netdev event.
In my case, this is necessary since I want to trap NETDEV_NOTIFY_PEERS
event (also possibly NETDEV_RESEND_IGMP event) and perform certain actions
in user space.  This is not possible since the messages generated as
a result of netdev events do not usually contain any changed data.  They
are just notifications.  This patch exposes this notification type to
userspace.

Second, I remove duplicate messages that are a result of a change to bonding
options.  If netlink is used to configure bonding options, 2 messages
are generated, one as a result of the NETDEV_CHANGEINFODATA event triggered by
bonding code and one as a result of device state changes triggered by
netdev_state_change (called from do_setlink).

V6: Updated names and refactored to make it less tied to netdev events.
    (From David Ahern)
V5: Rebased.  Added iproute2 patch to the series.
V4:
  * Removed the patch that removed NETDEV_CHANGENAME from event whitelist.
    It doesn't trigger duplicate messages since name changes can only be
    done while device is down and netdev_state_change() doesn't report
    changes while device is down.
  * Added a patch to clean-up duplicate messages on bonding option changes.

V3: Rebased.  Cleaned-up duplicate event.

V2: Added missed events (from David Ahern)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: Prevent duplicate userspace notification

Whenever a user changes bonding options, a NETDEV_CHANGEINFODATA
notification is generated which results in a rtnetlink message
being sent.  While running 'ip monitor', we can actually see 2 messages,
one as a result of the event, and the other as a result of the state change
that is generated by netdev_state_change().  However, this is not
always the case. If bonding changes were done via sysfs or ifenslave
(old ioctl interface), then only 1 message is seen.

This patch removes duplicate messages in the case of using netlink
to configure bonding.  It introduces a separate function that
triggers a netdev event and uses that function in the sysfs and ioctl
cases.

This was discovered while auditing all the different events and
continues the effort of cleaning up duplicated netlink messages.

CC: David Ahern <dsa@cumulusnetworks.com>
CC: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

rtnl: Add support for netdev event to link messages

When netdev events happen, a rtnetlink_event() handler will send
messages for every event in its white list.  These messages contain
current information about a particular device, but they do not include
the information about which event just happened.  So, it is impossible
to tell what just happened for these events.

This patch adds a new extension to RTM_NEWLINK message called IFLA_EVENT
that would carry an encoding of the event that triggered this
message.  This would allow the message consumer to easily determine
if it needs to perform certain actions.
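
The idea of encoding the triggering event into the message can be sketched as follows. The enums and functions below are local to this illustration and deliberately prefixed `ex_`/`EX_`; they are not the kernel's definitions or values. A result of `EX_EVENT_NONE` stands for "no event attribute is attached".

```c
#include <stdbool.h>

/* Illustrative netdev events; names and values are local to this sketch. */
enum ex_netdev_event {
	EX_NETDEV_UP,
	EX_NETDEV_NOTIFY_PEERS,
	EX_NETDEV_RESEND_IGMP,
};

/* Illustrative attribute encoding; 0 means "no event attribute". */
enum ex_link_event {
	EX_EVENT_NONE,
	EX_EVENT_NOTIFY_PEERS,
	EX_EVENT_IGMP_RESEND,
};

/* Producer side: map the triggering event to the value put in the message. */
static enum ex_link_event ex_encode_event(enum ex_netdev_event ev)
{
	switch (ev) {
	case EX_NETDEV_NOTIFY_PEERS:
		return EX_EVENT_NOTIFY_PEERS;
	case EX_NETDEV_RESEND_IGMP:
		return EX_EVENT_IGMP_RESEND;
	default:
		return EX_EVENT_NONE;
	}
}

/* Consumer side: decide from the encoded value whether to act. */
static bool ex_consumer_must_act(enum ex_link_event ev)
{
	return ev == EX_EVENT_NOTIFY_PEERS || ev == EX_EVENT_IGMP_RESEND;
}
```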

Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Overlapping changes in drivers/net/phy/marvell.c, bug fix in 'net'
restricting a HW workaround alongside cleanups in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'led_fixes_for_4-12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds

Pull LED fix from Jacek Anaszewski:
"A single LED fix for 4.12-rc3.

  leds-pca955x driver uses only i2c_smbus API and thus it should pass
  I2C_FUNC_SMBUS_BYTE_DATA flag to i2c_check_functionality"

* tag 'led_fixes_for_4-12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
  leds: pca955x: Correct I2C Functionality

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Fix state pruning in bpf verifier wrt. alignment, from Daniel
    Borkmann.

2) Handle non-linear SKBs properly in SCTP ICMP parsing, from Davide
    Caratti.

3) Fix bit field definitions for rss_hash_type of descriptors in mlx5
    driver, from Jesper Brouer.

4) Defer slave->link updates until bonding is ready to do a full commit
    to the new settings, from Nithin Sujir.

5) Properly reference count ipv4 FIB metrics to avoid use after free
    situations, from Eric Dumazet and several others including Cong Wang
    and Julian Anastasov.

6) Fix races in llc_ui_bind(), from Lin Zhang.

7) Fix regression of ESP UDP encapsulation for TCP packets, from
    Steffen Klassert.

8) Fix mdio-octeon driver Kconfig deps, from Randy Dunlap.

9) Fix regression in setting DSCP on ipv6/GRE encapsulation, from Peter
    Dawson.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits)
  ipv4: add reference counting to metrics
  net: ethernet: ax88796: don't call free_irq without request_irq first
  ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets
  sctp: fix ICMP processing if skb is non-linear
  net: llc: add lock_sock in llc_ui_bind to avoid a race condition
  bonding: Don't update slave->link until ready to commit
  test_bpf: Add a couple of tests for BPF_JSGE.
  bpf: add various verifier test cases
  bpf: fix wrong exposure of map_flags into fdinfo for lpm
  bpf: add bpf_clone_redirect to bpf_helper_changes_pkt_data
  bpf: properly reset caller saved regs after helper call and ld_abs/ind
  bpf: fix incorrect pruning decision when alignment must be tracked
  arp: fixed -Wuninitialized compiler warning
  tcp: avoid fastopen API to be used on AF_UNSPEC
  net: move somaxconn init from sysctl code
  net: fix potential null pointer dereference
  geneve: fix fill_info when using collect_metadata
  virtio-net: enable TSO/checksum offloads for Q-in-Q vlans
  be2net: Fix offload features for Q-in-Q packets
  vlan: Fix tcp checksum offloads in Q-in-Q vlans
  ...

Merge branch 'ibmvnic-Driver-updates'

Nathan Fontenot says:

====================
ibmvnic: Driver updates

This set of patches implements several updates to the ibmvnic driver
to fix issues that have been found in testing. Most of the updates
involve updating queue handling during driver close and reset
operations.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Reset sub-crqs during driver reset

When the ibmvnic driver is resetting, we can just reset the sub crqs
instead of releasing all of their resources and re-allocating them.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Reset tx/rx pools on driver reset

When resetting the ibmvnic driver there is not a need to release
and re-allocate the resources for the tx and rx pools. These
resources can just be reset to avoid the re-allocations.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Reset the CRQ queue during driver reset

When a driver reset operation occurs there is not a need to release
the CRQ resources and re-allocate them. Instead a reset of the CRQ
will suffice.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Check adapter state during ibmvnic_poll

We do not want to process any receive frames if the ibmvnic_poll
routine is invoked while a reset is in progress. Also, before
replenishing the rx pools in the ibmvnic_poll, we want to
make sure the adapter is not in the process of closing.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Deactivate RX pool buffer replenishment on H_CLOSED

If H_CLOSED is returned, halt RX buffer replenishment activity
until firmware sends a notification that the driver can reset.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Halt TX and report carrier off on H_CLOSED return code

This patch disables transmissions and reports carrier off if xmit
function returns that the hardware TX queue is closed. The driver can
then await a signal from firmware to determine the correct reset method.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Non-fatal error handling

Handle non-fatal error conditions. The process to do this when
resetting the driver is to just do __ibmvnic_close followed by
__ibmvnic_open.

Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Fix cleanup of SKB's on driver close

A race condition occurs when closing the driver. Free'ing of skb's
can race between the close routine and ibmvnic_tx_interrupt. To fix
this we move the cleanup of tx pools during close to after the
sub-CRQ interrupts are disabled.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Send gratuitous arp on reset

Send gratuitous arp after any reset.

Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Handle failover after failed init crq

Handle case where phyp sends a failover after failing to send the
init crq.

Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Track state of adapter napis

Track the state of ibmvnic napis. The driver can get into states where it
can be reset when napis are already disabled and attempting to disable them
again will cause the driver to hang.
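
A minimal sketch of such state tracking, assuming a hypothetical `struct ex_adapter` with a single flag; counters stand in for the real enable and disable operations so the effect is observable.

```c
#include <stdbool.h>

struct ex_adapter {
	bool napi_enabled;	/* tracked state */
	int enable_ops;		/* how often the real enable would have run */
	int disable_ops;	/* how often the real disable would have run */
};

/* Enable only if currently disabled. */
static void ex_napi_enable(struct ex_adapter *a)
{
	if (a->napi_enabled)
		return;
	a->enable_ops++;	/* a real driver would enable its napis here */
	a->napi_enabled = true;
}

/* Disable only if currently enabled, so a repeated call is a no-op. */
static void ex_napi_disable(struct ex_adapter *a)
{
	if (!a->napi_enabled)
		return;
	a->disable_ops++;	/* a real driver would disable its napis here */
	a->napi_enabled = false;
}
```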

Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mlxsw-Improve-extensibility'

Jiri Pirko says:

====================
mlxsw: Improve extensibility

Ido says:

Since the initial introduction of the bridge offload in commit
56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
the per-port struct was used to store both physical properties of the
port as well as logical bridge properties such as learning and active
VLANs in the VLAN-aware bridge.

The above resulted in a bloated struct and code that is getting
increasingly difficult to extend when stacked devices are taken into
account as well as more advanced use cases such as IGMP snooping.

Due to the incremental development nature of this driver as well as the
complexity of the underlying hardware, subsequent design decisions failed
to generalize the FID and RIF resources, which could've benefited from
a more generic design, resulting in consolidated code paths and better
extensibility with regards to future ASICs and use cases.

This patchset tries to solve both of these design problems, as they're
tightly coupled. To ease the code review, the changes are done in a
bottom-up manner, in which the port struct is the first to be patched,
then the FIDs the ports are mapped to and finally the RIFs configured on
top.

The first half of the patchset gradually moves away from the previous
design to a design that is more in sync with the underlying hardware and
which clearly separates between hardware-specific structs and logical
ones such as a bridge port.

All the bridge-specific information is removed from the port struct, as
well as the list of VLAN devices ("vPorts") configured on top of it.
Instead, a linked list of VLANs is introduced, which allows each VLAN
to hold a state, such as mapping to a particular FID and membership in
a bridge. The data structures are depicted in the following figure:

                                  mlxsw_sp_bridge_device
                                       +----------+
                                       |          |
                                  +----+          |
                                  |    |          |
                                  |    +----------+
                                  |
             mlxsw_sp_bridge_port |
                 +----------+     |
                 |          |     |
              +-->          +-----+--> ..
              |  |          |
              |  +----+-----+
              |       |
              |       v
              | mlxsw_sp_bridge_vlan
              |  +----------+
              |  | vid X    |
              |  |          +--> ..
              |  |          |
              |  +----+-----+
              |       |
              +--+----v-----+
                 | vid X    |
              +--+          +--> ..
              |  |          |
mlxsw_sp_port |  +----------+
+----------+  | mlxsw_sp_port_vlan
|          |  |
|          +--+
|          |
+----------+

This model allows us to consolidate many of the code paths relating to
VLAN-aware and VLAN-unaware bridges, as the latter is simply represented
using a bridge port with a VLAN list size of one. Another advantage of
the model is that it's easy to extend it with future per-VLAN
attributes - such as mrouter indication - by merely pushing these down
from the bridge port struct to the bridge VLAN one.
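
The relationships in the figure can be approximated with a few plain C structs. This is a simplified sketch, not the driver's definitions: the `ex_` names, the singly linked lists and the helper functions are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Per-VLAN state of a bridge port. */
struct ex_bridge_vlan {
	uint16_t vid;
	struct ex_bridge_vlan *next;
};

/* Logical bridge port holding its VLAN list. */
struct ex_bridge_port {
	struct ex_bridge_vlan *vlans;
};

/* A VLAN configured on a physical port, optionally a bridge member. */
struct ex_port_vlan {
	uint16_t vid;
	struct ex_bridge_port *bridge_port;	/* NULL when not bridged */
	struct ex_port_vlan *next;
};

/* Physical port: only a list of VLANs, no bridge-specific state. */
struct ex_port {
	struct ex_port_vlan *vlans;
};

/* Number of VLANs on a bridge port. */
static size_t ex_bridge_port_vlan_count(const struct ex_bridge_port *bp)
{
	size_t n = 0;
	const struct ex_bridge_vlan *v;

	for (v = bp->vlans; v; v = v->next)
		n++;
	return n;
}

/* Look up a port VLAN by VID; returns NULL if absent. */
static struct ex_port_vlan *ex_port_vlan_find(struct ex_port *port, uint16_t vid)
{
	struct ex_port_vlan *pv;

	for (pv = port->vlans; pv; pv = pv->next)
		if (pv->vid == vid)
			return pv;
	return NULL;
}
```

In this sketch a VLAN-unaware bridge port is simply one whose list holds a single entry, and a future per-VLAN attribute would become another field of `struct ex_bridge_vlan`.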

The second half of the patchset builds on top of previous work and
prepares the driver for the common FID and RIF cores, which are finally
implemented in the last two patches. These exploit the fact that despite
the different kinds of FIDs and RIFs, they do share a common object on
which the core operations can operate.

By hiding both objects from the rest of the driver and modeling their
operations using a VFT, it'll be easier to extend the driver for future
use cases such as VXLAN.

Tested using following LNST recipes:
https://github.com/jpirko/lnst/tree/master/recipes/switchdev
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_router: Implement common RIF core

The mlxsw driver currently implements three types of RIFs. VLAN and FID
RIFs for L3 interfaces on top of VLAN-aware and VLAN-unaware bridges
(respectively) and Subport RIFs for all other L3 interfaces.

All the RIF types follow a common configuration procedure, which only
differs in the type-specific bits. The patch exploits this fact and
consolidates the common code paths, thereby simplifying the code and
making it more extensible.

This work also prepares the driver for use with future ASICs, where the
range of the Subport RIFs will be extended and their configuration
modified accordingly. By merely implementing a new set of RIF operations and
selecting it during initialization, the same driver could be re-used.
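
The pattern of a common procedure with type-specific bits behind an operations table can be sketched like this. Everything below (`ex_rif`, `ex_rif_ops`, the three configure callbacks and the bit values) is hypothetical and only mirrors the three RIF types named above.

```c
enum ex_rif_type {
	EX_RIF_SUBPORT,
	EX_RIF_VLAN,
	EX_RIF_FID,
	EX_RIF_TYPE_MAX,
};

struct ex_rif {
	enum ex_rif_type type;
	int common_configured;	/* set by the shared code path */
	int type_bits;		/* set by the type-specific callback */
};

/* Per-type operations; only the type-specific step differs. */
struct ex_rif_ops {
	int (*configure)(struct ex_rif *rif);
};

static int ex_subport_configure(struct ex_rif *rif)
{
	rif->type_bits = 0x1;
	return 0;
}

static int ex_vlan_configure(struct ex_rif *rif)
{
	rif->type_bits = 0x2;
	return 0;
}

static int ex_fid_configure(struct ex_rif *rif)
{
	rif->type_bits = 0x4;
	return 0;
}

/* Operations table selected by RIF type. */
static const struct ex_rif_ops ex_rif_ops_arr[EX_RIF_TYPE_MAX] = {
	[EX_RIF_SUBPORT] = { .configure = ex_subport_configure },
	[EX_RIF_VLAN]    = { .configure = ex_vlan_configure },
	[EX_RIF_FID]     = { .configure = ex_fid_configure },
};

/* Common creation path: shared setup first, then the per-type callback. */
static int ex_rif_create(struct ex_rif *rif, enum ex_rif_type type)
{
	if ((unsigned int)type >= EX_RIF_TYPE_MAX)
		return -1;
	rif->type = type;
	rif->common_configured = 1;
	return ex_rif_ops_arr[type].configure(rif);
}
```

Supporting a different kind of RIF in this sketch means adding one callback and one table entry, while `ex_rif_create()` stays unchanged.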

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Implement common FID core

The device supports three types of FIDs. 802.1Q and 802.1D FIDs for
VLAN-aware and VLAN-unaware bridges (respectively) and rFIDs to
transport packets to the router block.

The different users (e.g., bridge, router, ACLs) of the FIDs
infrastructure need not know about the internal FIDs implementation and
can therefore interact with it using a restricted set of exported
functions.

By encapsulating the entire FID logic and hiding it from the rest of the
driver we get a code base that is much simpler and easier to work with
and extend.

For example, in the current Spectrum ASIC only 802.1D FIDs can be
assigned a VNI, but future ASICs will also support 802.1Q FIDs. With
this patch in place, support for future ASICs can be easily added by
implementing a new set of FID operations according to their capabilities.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_router: Determine VR first when creating RIF

All RIF types are associated with a virtual router (VR), so determine VR
first when creating a RIF.

That way, we can more easily integrate the common RIF core in the
following patches.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>