git.proxmox.com Git - mirror_ubuntu-kernels.git/log

net: dsa: mt7530: set STP state on filter ID 1

As filter ID 1 is the only one used for bridges, set STP state on it.

Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: mt7530: use independent VLAN learning on VLAN-unaware bridges

Consider the following bridge configuration, where bond0 is not
offloaded:

         +-- br0 --+
        / /   |     \
       / /    |      \
      /  |    |     bond0
     /   |    |     /   \
   swp0 swp1 swp2 swp3 swp4
     .        .       .
     .        .       .
     A        B       C

Ideally, when the switch receives a packet from swp3 or swp4, it should
forward the packet to the CPU, according to the port matrix and unknown
unicast flood settings.

But packet loss will happen if the destination address is at one of the
offloaded ports (swp0~2). For example, when client C sends a packet to
A, the FDB lookup will indicate that it should be forwarded to swp0, but
the port matrix of swp3 and swp4 is configured to only allow the CPU to
be its destination, so it is dropped.

However, this issue does not happen if the bridge is VLAN-aware. That is
because VLAN-aware bridges use independent VLAN learning, i.e. use VID
for FDB lookup, on offloaded ports. As swp3 and swp4 are not offloaded,
shared VLAN learning with default filter ID of 0 is used instead. So the
lookup for A with filter ID 0 never hits and the packet can be forwarded
to the CPU.

In the current code, only two combinations were used to toggle user
ports' VLAN awareness: one is PCR.PORT_VLAN set to port matrix mode with
PVC.VLAN_ATTR set to transparent port, the other is PCR.PORT_VLAN set to
security mode with PVC.VLAN_ATTR set to user port.

It turns out that only PVC.VLAN_ATTR contributes to VLAN awareness, and
port matrix mode just skips the VLAN table lookup. The reference manual
is somehow misleading when describing PORT_VLAN modes. It states that
PORT_MEM (VLAN port member) is used for destination if the VLAN table
lookup hits, but actually **PORT_MEM & PORT_MATRIX** (bitwise AND of
VLAN port member and port matrix) is used instead, which means we can
have two or more separate VLAN-aware bridges with the same PVID and
traffic won't leak between them.

Therefore, to solve this, enable independent VLAN learning with PVID 0
on VLAN-unaware bridges, by setting their PCR.PORT_VLAN to fallback
mode, while leaving standalone ports in port matrix mode. The CPU port
is always set to fallback mode to serve those bridges.

During testing, it is found that FDB lookup with filter ID of 0 will
also hit entries with VID 0 even with independent VLAN learning. To
avoid that, install all VLANs with filter ID of 1.

Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: mt7530: enable assisted learning on CPU port

Consider the following bridge configuration, where bond0 is not
offloaded:

         +-- br0 --+
        / /   |     \
       / /    |      \
      /  |    |     bond0
     /   |    |     /   \
   swp0 swp1 swp2 swp3 swp4
     .        .       .
     .        .       .
     A        B       C

Address learning is enabled on offloaded ports (swp0~2) and the CPU
port, so when client A sends a packet to C, the following will happen:

1. The switch learns that client A can be reached at swp0.
2. The switch probably already knows that client C can be reached at the
   CPU port, so it forwards the packet to the CPU.
3. The bridge core knows client C can be reached at bond0, so it
   forwards the packet back to the switch.
4. The switch learns that client A can be reached at the CPU port.
5. The switch forwards the packet to either swp3 or swp4, according to
   the packet's tag.

That makes client A's MAC address flap between swp0 and the CPU port. If
client B sends a packet to A, it is possible that the packet is
forwarded to the CPU. With offload_fwd_mark = 1, the bridge core won't
forward it back to the switch, resulting in packet loss.

As we have the assisted_learning_on_cpu_port in DSA core now, enable
that and disable hardware learning on the CPU port.

Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Vladimir Oltean <oltean@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'ipa-pm-irqs'

Alex Elder says:

====================
net: ipa: prepare GSI interrupts for runtime PM

The last patch in this series arranges for GSI interrupts to be
disabled when the IPA hardware is suspended.  This ensures the clock
is always operational when a GSI interrupt fires.  Leading up to
that are patches that rearrange the code a bit to allow this to
be done.

The first two patches aren't *directly* related.  They remove some
flag arguments to some GSI suspend/resume related functions, using
the version field now present in the GSI structure.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: disable GSI interrupts while suspended

Introduce new functions gsi_suspend() and gsi_resume(), which will
disable the GSI interrupt handler after all endpoints are suspended
and re-enable it before endpoints are resumed.  This will ensure no
GSI interrupt handler will fire when the hardware is suspended.

Here's a little further explanation.  There are seven GSI interrupt
types, and most are disabled except when needed.
  - These two are not used (never enabled):
      GSI_INTER_EE_CH_CTRL
      GSI_INTER_EE_EV_CTRL
  - These two are only used to implement channel and event ring
    commands, and are only enabled while a command is underway:
      GSI_CH_CTRL
      GSI_EV_CTRL
  - The IEOB interrupt signals I/O completion.  It will not fire
    when a channel is stopped (or "suspended").
      GSI_IEOB
  - This interrupt is used to allocate or halt modem channels,
    and is only enabled while such a command is underway.
      GSI_GLOB_EE
    However it also is used to signal certain errors, and this could
    occur at any time.
  - The general interrupt signals general errors, and could occur at
    any time.
      GSI_GENERAL

The purpose for this change is to ensure no global or general
interrupts fire due to errors while the hardware is suspended.
We enable the clock on resume, and at that time we can "handle"
(at least report) these error conditions.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: move gsi_irq_init() code into setup

The GSI IRQ handler could be triggered as soon as it is registered
with request_irq().  The handler function, gsi_isr(), touches
hardware, meaning the IPA clock must be operational.  The IPA clock
is not operating when the handler is registered (in gsi_irq_init()),
so this is a problem.

Move the call to request_irq() for the GSI interrupt handler into
gsi_irq_setup(), which is called when the IPA clock is known to be
operational (and furthermore, the GSI firmware will have been
loaded).  Request the IRQ at the end of that function, after all
interrupt types have been disabled and masked.

Move the matching free_irq() call into gsi_irq_teardown(), and get
rid of the now empty gsi_irq_exit(),

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: have gsi_irq_setup() return an error code

Change gsi_irq_setup() so it returns an error value, and introduce
gsi_irq_teardown() as its inverse. Set the interrupt type (IRQ
rather than MSI) in gsi_irq_setup().

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: move some GSI setup functions

Move gsi_irq_setup() and gsi_ring_setup() so they're defined right
above gsi_setup() where they're called. This is a trivial movement
of code to prepare for upcoming patches.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: move version check for channel suspend/resume

Change the Boolean flags passed to __gsi_channel_start() and
__gsi_channel_stop() so they represent whether the request is being
made to implement suspend (versus stop) or resume (versus start).

Then stop or start the channel for suspend/resume requests only if
the hardware version indicates it should be done.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: use gsi->version for channel suspend/resume

The GSI layer has the IPA version now, so there's no need for
version-specific flags to be passed from IPA. One instance of
this is in gsi_channel_suspend() and gsi_channel_resume(), which
indicate whether or not the endpoint suspend is implemented by
GSI stopping the channel. We can make that determination based
on gsi->version, eliminating the need for a Boolean flag in those
functions.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mhi-mbim'

Loic Poulain says:

====================
net: mhi: move MBIM to WWAN

Implement a proper WWAN driver for MBIM network protocol, with multi link
management supported through the WWAN framework (wwan rtnetlink).

Until now, MBIM over MHI was supported directly in the mhi_net driver, via
some protocol rx/tx fixup callbacks, but with only one session supported
(no multilink muxing). We can then remove that part from mhi_net and restore
the driver to a simpler version for 'raw' ip transfer (or QMAP via rmnet link).

Note that a wwan0 link is created by default for session-id 0. Additional links
can be managed via ip tool:

$ ip link add dev wwan0mms parentdev wwan0 type wwan linkid 1
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: mhi: Remove MBIM protocol

The MBIM protocol has now been integrated in a proper WWAN driver. We
can then revert back to a simpler driver for mhi_net, which is used
for raw IP or QMAP protocol (via rmnet link).

- Remove protocol management
- Remove WWAN framework usage (only valid for mbim)
- Remove net/mhi directory for simpler mhi_net.c file

Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: wwan: Add MHI MBIM network driver

Add new wwan driver for MBIM over MHI. MBIM is a transport protocol
for IP packets, allowing packet aggregation and muxing. Initially
designed for USB bus, it is also exposed through MHI bus for QCOM
based PCIe wwan modems.

This driver supports the new wwan rtnetlink interface for multi-link
management and has been tested with Quectel EM120R-GL M2 module.

Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'queues'

Jakub Kicinski says:

====================
net: add netif_set_real_num_queues() for device reconfig

This short set adds a helper to make the implementation of
two-phase NIC reconfig easier.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: use netif_set_real_num_queues()

Avoid reconfig problems due to failures in netif_set_real_num_tx_queues()
by using netif_set_real_num_queues().

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add netif_set_real_num_queues() for device reconfig

netif_set_real_num_rx_queues() and netif_set_real_num_tx_queues()
can fail which breaks drivers trying to implement reconfiguration
in a way that can't leave the device half-broken. In other words
those functions are incompatible with prepare/commit approach.

Luckily setting real number of queues can fail only if the number
is increased, meaning that if we order operations correctly we
can guarantee ending up with either new config (success), or
the old one (on error).

Provide a helper implementing such logic so that drivers don't
have to duplicate it.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add extack arg for link ops

Pass extack arg to validate_linkmsg and validate_link_af callbacks.
If a netlink attribute has a reject_message, use the extended ack
mechanism to carry the message back to user space.

Signed-off-by: Rocco Yue <rocco.yue@mediatek.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

af_unix: Add OOB support

This patch adds OOB support for AF_UNIX sockets.
The semantics is same as TCP.

The last byte of a message with the OOB flag is
treated as the OOB byte. The byte is separated into
a skb and a pointer to the skb is stored in unix_sock.
The pointer is used to enforce OOB semantics.

Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'dpaa2-switch-next'

Ioana Ciornei says:

====================
dpaa2-switch: integrate the MAC endpoint support

This patch set integrates the already available MAC support into the
dpaa2-switch driver as well.

The first 4 patches are fixing up some minor problems or optimizing the
code, while the remaining ones are actually integrating the dpaa2-mac
support into the switch driver by calling the dpaa2_mac_* provided
functions. While at it, we also export the MAC statistics in ethtool
like we do for dpaa2-eth.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-switch: export MAC statistics in ethtool

If a switch port is connected to a MAC, use the common dpaa2-mac support
for exporting the available MAC statistics.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-switch: add a prefix to HW ethtool stats

In the next patch, we'll add support for also exporting the MAC
statistics in the ethtool stats. Annotate already present HW stats with
a suggestive prefix.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-switch: integrate the MAC endpoint support

Integrate the common MAC endpoint management support into the
dpaa2-switch driver as well. Nothing special happens here, just that the
already available dpaa2-mac functions are also called from dpaa2-switch.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bus: fsl-mc: extend fsl_mc_get_endpoint() to pass interface ID

In case of a switch DPAA2 object, the interface ID is also needed when
querying for the object endpoint. Extend fsl_mc_get_endpoint() so that
users can also pass the interface ID that are interested in.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-switch: no need to check link state right after ndo_open

The call to dpaa2_switch_port_link_state_update is a leftover from the
time when on DPAA2 platforms the PHYs were started at boot time so when
an ifconfig was issued on the associated interface, the link status
needed to be checked directly from the ndo_open() callback. This is not
needed anymore since we are now properly integrated with the PHY layer
thus a link interrupt will come directly from the PHY eventually without
the need to call the sync function.
Fix this up by removing the call to dpaa2_switch_port_link_state_update.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-switch: do not enable the DPSW at probe time

We should not enable the switch interfaces at probe time since this is
trigged by the open callback. Remove the call dpsw_enable() which does
exactly this.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-switch: use the port index in the IRQ handler

The MC firmware supplies us the switch interface index for which an
interrupt was triggered. Use this to our advantage instead of looping
through all the switch ports and doing unnecessary work.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa2-switch: request all interrupts sources on the DPSW

Request all interrupt sources to be read and then cleared on the DPSW
object. In the next patches we'll also add support for treating other
interrupts.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fec: fix MAC internal delay doesn't work

This patch intends to fix MAC internal delay doesn't work, due to use
of_property_read_u32() incorrectly, and improve this feature a bit:
1) check the delay value if valid, only program register when it's 2000ps.
2) only enable "enet_2x_txclk" clock when require MAC internal delay.

Fixes: fc539459e900 ("net: fec: add MAC internal delayed clock feature support")
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Link: https://lore.kernel.org/r/20210803052424.19008-1-qiangqing.zhang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: tag_sja1105: consistently fail with arbitrary input

Dan Carpenter's smatch tests report that the "vid" variable, populated
by sja1105_vlan_rcv when an skb is received by the tagger that has a
VLAN ID which cannot be decoded by tag_8021q, may be uninitialized when
used here:

if (source_port == -1 || switch_id == -1)
skb->dev = dsa_find_designated_bridge_port_by_vid(netdev, vid);

The sja1105 driver, by construction, sets up the switch in a way that
all data plane packets sent towards the CPU port are VLAN-tagged. So it
is practically impossible, in a functional system, for a packet to be
processed by sja1110_rcv() which is not a control packet and does not
have a VLAN header either.

However, it would be nice if the sja1105 tagging driver could
consistently do something valid, for example fail, even if presented with
packets that do not hold valid sja1105 tags. Currently it is a bit hard
to argue that it does that, given the fact that a data plane packet with
no VLAN tag will trigger a call to dsa_find_designated_bridge_port_by_vid
with a vid argument that is an uninitialized stack variable.

To fix this, we can initialize the u16 vid variable with 0, a value that
can never be a bridge VLAN, so dsa_find_designated_bridge_port_by_vid
will always return a NULL skb->dev.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20210802195137.303625-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: bridge: switchdev: fix incorrect use of FDB flags when picking the dst device

Nikolay points out that it is incorrect to assume that it is impossible
to have an fdb entry with fdb->dst == NULL and the BR_FDB_LOCAL bit in
fdb->flags not set. This is because there are reader-side places that
test_bit(BR_FDB_LOCAL, &fdb->flags) without the br->hash_lock, and if
the updating of the FDB entry happens on another CPU, there are no
memory barriers at writer or reader side which would ensure that the
reader sees the updates to both fdb->flags and fdb->dst in the same
order, i.e. the reader will not see an inconsistent FDB entry.

So we must be prepared to deal with FDB entries where fdb->dst and
fdb->flags are in a potentially inconsistent state, and that means that
fdb->dst == NULL should remain a condition to pick the net_device that
we report to switchdev as being the bridge device, which is what the
code did prior to the blamed patch.

Fixes: 52e4bec15546 ("net: bridge: switchdev: treat local FDBs the same as entries towards the bridge")
Suggested-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/r/20210802113633.189831-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Revert "Merge branch 'qcom-dts-updates'"

This reverts commit b79c6fba6cd7c49a7dbea9999e182f74cca63e19, reversing
these changes made to 0ac26271344478ff718329fa9d4ef81d4bcbc43b:

  commit 6a0eb6c9d934 ("dt-bindings: net: qcom,ipa: make imem interconnect
                       optional")
  commit f8bd3c82bf7d ("arm64: dts: qcom: sc7280: add IPA information")
  commit fd0f72c34bd9 ("arm64: dts: qcom: sc7180: define ipa_fw_mem node")

I intend for these commits to go through the Qualcomm repository, to
avoid conflicting with other activity being merged there.

Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20210802233019.800250-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

octeontx2-af: Fix spelling mistake "Makesure" -> "Make sure"

There is a spelling mistake in a NL_SET_ERR_MSG_MOD message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210803105617.338546-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: decnet: Fix refcount warning for new dn_fib_info

fib_treeref needs to be set after kzalloc. The old code had a ++ which
led to the confusion when the int was replaced by a refcount_t.

Fixes: 79976892f7ea ("net: convert fib_treeref from int to refcount_t")
Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210803073739.22339-1-yajun.deng@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'Space-cleanup'

Arnd Bergmann says:

====================
drivers/net/Space.c cleanup

I discovered that there are still a couple of drivers that rely on
beiong statically initialized from drivers/net/Space.c the way
we did in the last century. As it turns out, there are a couple
of simplifications that can be made here, as well as some minor
bugfixes.

There are four classes of drivers that use this:

- most 10mbit ISA bus ethernet drivers (and one 100mbit one)
- both ISA localtalk drivers
- several m68k ethernet drivers
- one obsolete WAN driver

I found that the drivers using in arch/m68k/ don't actually benefit
from being probed this way as they do not rely on the netdev= command
line arguments, they have simply never been changed to work like a
modern driver.

I had previously sent a patch to remove the sbni/granch driver, and
there were no objections to this patch but forgot to resend it after
some discussion about another patch in the same series.

For the ISA drivers, there is usually no way to probe multiple devices
at boot time other than the netdev= arguments, so all that logic is left
in place for the moment, but centralized in a single file that only gets
included in the kernel build if one or more of the drivers are built-in.

I'm also changing the old-style init_module() functions in these drivers
to static functions with a module_init() annotation, to more closely
resemble modern drivers. These are the last drivers in the kernel to
still use init_module/cleanup_module, removing those may enable future
cleanups to the module loading process.

Arnd

Changes in v2:

- replace xsurf100 change with Michael's version
- make it PATCH instead of RFC
- rebase to net-next as of August 3
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

ethernet: isa: convert to module_init/module_exit

There are a couple of ISA ethernet drivers that use the old
init_module/cleanup_module function names for the main entry
points, nothing else uses those any more.

Change them to the documented method with module_init()
and module_exit() markers next to static functions.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

wan: hostess_sv11: use module_init/module_exit helpers

This is one of very few drivers using the old init_module/cleanup_module
function names. Change it over to the modern method.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

wan: remove sbni/granch driver

The driver was merged in 1999 and has only ever seen treewide cleanups
since then, with no indication whatsoever that anyone has actually
had access to hardware for testing the patches.

>From the information in the link below, it appears that the hardware
is for some leased line system in Russia that has since been
discontinued, and useless without any remote end to connect to.

As the driver still feels like a Linux-2.2 era artifact today, it
appears that the best way forward is to just delete it.

Link: https://www.tms.ru/%D0%90%D0%B4%D0%B0%D0%BF%D1%82%D0%B5%D1%80_%D0%B4%D0%BB%D1%8F_%D0%B2%D1%8B%D0%B4%D0%B5%D0%BB%D0%B5%D0%BD%D0%BD%D1%8B%D1%85_%D0%BB%D0%B8%D0%BD%D0%B8%D0%B9_Granch_SBNI12-10
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

wan: remove stale Kconfig entries

The dscc4 driver was removed in 2019 but these Kconfig entries remain,
so remove them as well.

Fixes: 28c9eb9042a9 ("net/wan: dscc4: remove broken dscc4 driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

make legacy ISA probe optional

There are very few ISA drivers left that rely on the static probing from
drivers/net/Space.o. Make them all select a new CONFIG_NETDEV_LEGACY_INIT
symbol, and drop the entire probe logic when that is disabled.

The 9 drivers that are called from Space.c are the same set that
calls netdev_boot_setup_check().

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

move netdev_boot_setup into Space.c

This is now only used by a handful of old ISA drivers,
and can be moved into the file they already all depend on.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

xsurf100: drop include of lib8390.c

Now that ax88796.c exports the ax_NS8390_reinit() symbol, we can
include 8390.h instead of lib8390.c, avoiding duplication of that
function and killing a few compile warnings in the bargain.

Fixes: 861928f4e60e826c ("net-next: New ax88796 platform
driver for Amiga X-Surf 100 Zorro board (m68k)")

Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

ax88796: export ax_NS8390_init() hook

The block I/O code for the new X-Surf 100 ax88796 driver needs
ax_NS8390_init() for error fixup in its block_output function.

Export this static function through the ax_NS8390_reinit()
wrapper so we can lose the lib8380.c include in the X-Surf 100
driver.

[arnd: add the declaration in the header to avoid a
-Wmissing-prototypes warning]
Fixes: 861928f4e60e826c ("net-next: New ax88796 platform
driver for Amiga X-Surf 100 Zorro board (m68k)")
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

m68k: remove legacy probing

There are six m68k specific drivers that use the legacy probe method
in drivers/net/Space.c. However, all of these only support a single
device, and they completely ignore the command line settings from
netdev_boot_setup_check, so there is really no point at all.

Aside from sun3_82586, these already have a module_init function that
can be used for built-in mode as well, simply by removing the #ifdef.

Note that the 82596 driver was previously used on ISA as well, but
that got dropped long ago.

Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

cs89x0: rework driver configuration

There are two drivers in the cs89x0 file, with the CONFIG_CS89x0_PLATFORM
symbol deciding which one is getting built. This is somewhat confusing
and makes it more likely ton configure a driver that works nowhere.

Split up the Kconfig option into separate ISA and PLATFORM drivers,
with the ISA symbol explicitly connecting to the static probing in
drivers/net/Space.c

The two drivers are still mutually incompatible at compile time,
which could be lifted by splitting them into multiple files,
but in practice this will make no difference.

The platform driver can now be enabled for compile-testing on
non-ARM machines.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

3c509: stop calling netdev_boot_setup_check

This driver never uses the information returned by
netdev_boot_setup_check, and is not called by the boot-time probing from
driver/net/Space.c, so just remove these stale references.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

appletalk: ltpc: remove static probing

This driver never relies on the netdev_boot_setup_check()
to get its configuration, so it can just as well do its
own probing all the time.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

natsemi: sonic: stop calling netdev_boot_setup_check

The data from the kernel command line is no longer used since the
probe function gets it from the platform device resources instead.

The jazz version was changed to be like this in 2007, the xtensa
version apparently copied the code from there.

Fixes: ed9f0e0bf3ce ("remove setup of platform device from jazzsonic.c")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

bcmgenet: remove call to netdev_boot_setup_check

The driver has never used the netdev->{irq,base_addr,mem_start}
members, so this call is completely unnecessary.

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'ethtool-runtime-pm'

Heiner Kallweit says:

====================
ethtool: runtime-resume netdev parent before ethtool ops

If a network device is runtime-suspended then:
- network device may be flagged as detached and all ethtool ops (even if
  not accessing the device) will fail because netif_device_present()
  returns false
- ethtool ops may fail because device is not accessible (e.g. because being
  in D3 in case of a PCI device)

It may not be desirable that userspace can't use even simple ethtool ops
that not access the device if interface or link is down. To be more friendly
to userspace let's ensure that device is runtime-resumed when executing
ethtool ops in kernel.

This patch series covers the typical case that the netdev parent is power-
managed, e.g. a PCI device. Not sure whether cases exist where the netdev
itself is power-managed. If yes then we may need an extension for this.
But the series as-is at least shouldn't cause problems in that case.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

ethtool: runtime-resume netdev parent in ethnl_ops_begin

If a network device is runtime-suspended then:
- network device may be flagged as detached and all ethtool ops (even if not
  accessing the device) will fail because netif_device_present() returns
  false
- ethtool ops may fail because device is not accessible (e.g. because being
  in D3 in case of a PCI device)

It may not be desirable that userspace can't use even simple ethtool ops
that not access the device if interface or link is down. To be more friendly
to userspace let's ensure that device is runtime-resumed when executing the
respective ethtool op in kernel.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethtool: move netif_device_present check from ethnl_parse_header_dev_get to ethnl_ops_begin

If device is runtime-suspended and not accessible then it may be
flagged as not present. If checking whether device is present is
done too early then we may bail out before we have the chance to
runtime-resume the device. Therefore move this check to
ethnl_ops_begin(). This is in preparation of a follow-up patch
that tries to runtime-resume the device before executing ethtool
ops.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethtool: move implementation of ethnl_ops_begin/complete to netlink.c

In preparation of subsequent extensions to both functions move the
implementations from netlink.h to netlink.c.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethtool: runtime-resume netdev parent before ethtool ioctl ops

If a network device is runtime-suspended then:
- network device may be flagged as detached and all ethtool ops (even if not
  accessing the device) will fail because netif_device_present() returns
  false
- ethtool ops may fail because device is not accessible (e.g. because being
  in D3 in case of a PCI device)

It may not be desirable that userspace can't use even simple ethtool ops
that not access the device if interface or link is down. To be more friendly
to userspace let's ensure that device is runtime-resumed when executing the
respective ethtool op in kernel.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

virtio-net: realign page_to_skb() after merges

We ended up merging two versions of the same patch set:

commit 8fb7da9e9907 ("virtio_net: get build_skb() buf by data ptr")
commit 5c37711d9f27 ("virtio-net: fix for unable to handle page fault for address")

into net, and

commit 7bf64460e3b2 ("virtio-net: get build_skb() buf by data ptr")
commit 6c66c147b9a4 ("virtio-net: fix for unable to handle page fault for address")

into net-next. Redo the merge from commit 126285651b7f ("Merge
ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net"), so that
the most recent code remains.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'bnxt_en-rx-ring'

Michael Chan says:

====================
bnxt_en: Increase maximum RX ring size when jumbo ring is unused

The RX jumbo ring is automatically enabled when HW GRO/LRO is enabled or
when the MTU exceeds the page size.  The RX jumbo ring provides a lot
more RX buffer space when it is in use.  When the RX jumbo ring is not
in use, some users report that the current maximum of 2K buffers is
too limiting.  This patchset increases the maximum to 8K buffers when
the RX jumbo ring is not used.  The default RX ring size is unchanged
at 511.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Increase maximum RX ring size if jumbo ring is not used

The current maximum RX ring size is defined assuming the RX jumbo ring
(aka aggregation ring) is used.  The RX jumbo ring is automicatically used
when the MTU exceeds a threshold or when rx-gro-hw/lro is enabled.  The RX
jumbo ring is automatically sized up to 4 times the size of the RX ring
size.

The BNXT_MAX_RX_DESC_CNT constant is the upper limit on the size of the
RX ring whether or not the RX jumbo ring is used.  Obviously, the
maximum amount of RX buffer space is significantly less when the RX jumbo
ring is not used.

To increase flexibility for the user who does not use the RX jumbo ring,
we now define a bigger maximum RX ring size when the RX jumbo ring is not
used.  The maximum RX ring size is now up to 8K when the RX jumbo ring
is not used.  The maximum completion ring size also needs to be scaled
up to accomodate the larger maximum RX ring size.

Note that when the RX jumbo ring is re-enabled, the RX ring size will
automatically drop if it exceeds the maximum.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Don't use static arrays for completion ring pages

We currently store these page addresses and DMA addreses in static
arrays.  On systems with 4K pages, we support up to 64 pages per
completion ring.  The actual number of pages for each completion ring
may be much less than 64.  For example, when the RX ring size is set
to the default 511 entries, only 16 completion ring pages are needed
per ring.

In the next patch, we'll be doubling the maximum number of completion
pages.  So we convert to allocate these arrays as needed instead of
declaring them statically.

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Keep vertical alignment

Those files under /proc/net/stat/ don't have vertical alignment, it looks
very difficult. Modify the seq_printf statement, keep vertical alignment.

v2:
- Use seq_puts() and seq_printf() correctly.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: add new option lacp_active

Add an option lacp_active, which is similar with team's runner.active.
This option specifies whether to send LACPDU frames periodically. If set
on, the LACPDU frames are sent along with the configured lacp_rate
setting. If set off, the LACPDU frames acts as "speak when spoken to".

Note, the LACPDU state frames still will be sent when init or unbind port.

v2: remove module parameter

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Remove duplicated include of kernel.h

Duplicate include header file <linux/kernel.h>
line 4: #include <linux/kernel.h>
line 7: #include <linux/kernel.h>

Signed-off-by: zhouchuangao <zhouchuangao@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers/net/usb: Remove all strcpy() uses

strcpy() performs no bounds checking on the destination buffer. This
could result in linear overflows beyond the end of the buffer, leading
to all kinds of misbehaviors. The safe replacement is strscpy().

Signed-off-by: Len Baker <len.baker@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Remove redundant prints from the iWARP SYN handling

Remove redundant prints from the iWARP SYN handling.

Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Skip DORQ attention handling during recovery

The device recovery flow will reset the entire HW device, in that case
the DORQ HW block attention is redundant.

Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qed: Avoid db_recovery during recovery

Avoid calling the qed doorbell recovery - qed_db_rec_handler()
during device recovery.

Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'skb_expand_head'

Vasily Averin says:

====================
skbuff: introduce skb_expand_head()

currently if skb does not have enough headroom skb_realloc_headrom is called.
It is not optimal because it creates new skb.

this patch set introduces new helper skb_expand_head()
Unlike skb_realloc_headroom, it does not allocate a new skb if possible;
copies skb->sk on new skb when as needed and frees original skb in case of failures.

This helps to simplify ip[6]_finish_output2(), ip6_xmit() and few other
functions in vrf, ax25 and bpf.

There are few other cases where this helper can be used
but it requires an additional investigations.

v3 changes:
- ax25 compilation warning fixed
- v5.14-rc4 rebase
- now it does not depend on non-committed pathces

v2 changes:
- helper's name was changed to skb_expand_head
- fixed few mistakes inside skb_expand_head():
    skb_set_owner_w should set sk on nskb
    kfree was replaced by kfree_skb()
    improved warning message
- added minor refactoring in changed functions in vrf and bpf patches
- removed kfree_skb() in ax25_rt_build_path caller ax25_ip_xmit
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

bpf: use skb_expand_head in bpf_out_neigh_v4/6

Unlike skb_realloc_headroom, new helper skb_expand_head
does not allocate a new skb if possible.

Additionally this patch replaces commonly used dereferencing with variables.

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ax25: use skb_expand_head

Use skb_expand_head() in ax25_transmit_buffer and ax25_rt_build_path.
Unlike skb_realloc_headroom, new helper does not allocate a new skb if possible.

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vrf: use skb_expand_head in vrf_finish_output

Unlike skb_realloc_headroom, new helper skb_expand_head
does not allocate a new skb if possible.

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: use skb_expand_head in ip_finish_output2

Unlike skb_realloc_headroom, new helper skb_expand_head
does not allocate a new skb if possible.

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: use skb_expand_head in ip6_xmit

Unlike skb_realloc_headroom, new helper skb_expand_head
does not allocate a new skb if possible.

Additionally this patch replaces commonly used dereferencing with variables.

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: use skb_expand_head in ip6_finish_output2

Unlike skb_realloc_headroom, new helper skb_expand_head does not allocate
a new skb if possible.

Additionally this patch replaces commonly used dereferencing with variables.

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

skbuff: introduce skb_expand_head()

Like skb_realloc_headroom(), new helper increases headroom of specified skb.
Unlike skb_realloc_headroom(), it does not allocate a new skb if possible;
copies skb->sk on new skb when as needed and frees original skb in case
of failures.

This helps to simplify ip[6]_finish_output2() and a few other similar cases.

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'mlx5-updates-2021-08-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
This patch-set changes the TTC (Traffic Type Classification) logic
to be independent from the mlx5 ethernet driver by renaming the traffic
types enums and making the TTC API generic to the mlx5 core driver.

It allows to decouple TTC logic from mlx5e and reused by other parts
of mlx5 drivers, namely ADQ and lag TX steering hashing.

Patches overview:
1 - Rename traffic type enums to be mlx5 generic.
2 - Rename related TTC arguments and functions.
3 - Remove dependency in the mlx5e driver from the TTC implementation.
4 - Move TTC logic to fs_ttc.
5 - Embed struct mlx5_ttc_table in fs_ttc.

The refactoring series is followed by misc' cleanup patches.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5: Fix missing return value in mlx5_devlink_eswitch_inline_mode_set()

The return value is missing in this code scenario, add the return value
'0' to the return value 'err'.

Eliminate the follow smatch warning:

drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c:3083
mlx5_devlink_eswitch_inline_mode_set() warn: missing error code 'err'.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Fixes: 8e0aa4bc959c ("net/mlx5: E-switch, Protect eswitch mode changes")
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Return -EOPNOTSUPP if more relevant when parsing tc actions

Instead of returning -EINVAL.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Remove redundant assignment of counter to null

counter is being initialized before being used.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Remove redundant parse_attr arg

Passing parse_attr is redundant in parse_tc_nic_actions() and
mlx5e_tc_add_nic_flow() as we can get it from flow.
This is the same as with parse_tc_fdb_actions() and mlx5e_tc_add_fdb_flow().

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Remove redundant cap check for flow counter

The cap is very old and today will always exists.
The cap is not being checked anywhere else. Remove the check from
drop action when parsing tc rules in nic mode.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Remove redundant filter_dev arg from parse_tc_fdb_actions()

filter_dev is saved in parse_attr. and being used in other cases from
there. use it also for the leftover case.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Remove redundant tc act includes

Since the code changed to use the flow action infra
there is no usage of tcf values from those includes.
Remove those.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Embed mlx5_ttc_table

mlx5_ttc_table struct shouldn't be exposed to the users so
this patch make it internal to ttc.

In addition add a getter function to get the TTC flow table for users
that need to add a rule which points on it.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>

net/mlx5: Move TTC logic to fs_ttc

Now that TTC logic is not dependent on mlx5e structs, move it to
lib/fs_ttc.c so it could be used other part of the mlx5 driver.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Decouple TTC logic from mlx5e

Remove dependency in the mlx5e driver from the TTC implementation
by changing the TTC related functions to receive mlx5 generic arguments.
It allows to decouple TTC logic from mlx5e and reused by other parts of
mlx5 driver.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Rename some related TTC args and functions

Since TTC logic is going to be moved to a separate file, make the
relevant functions and arguments that used by TTC to be mlx5 generic.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Rename traffic type enums

Rename traffic type enums as part of the preparation for moving
the traffic type logic to a separate file.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Allocate the array of channels according to the real max_nch

The channels array in struct mlx5e_rx_res is converted to a dynamic one,
which will use the dynamic value of max_nch instead of
implementation-defined maximum of MLX5E_MAX_NUM_CHANNELS.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Hide all implementation details of mlx5e_rx_res

This commit moves all implementation details of struct mlx5e_rx_res
under en/rx_res.c. All access to RX resources is now done using methods.
Encapsulating RX resources into an object allows for better
manageability, because all the implementation details are now in a
single place, and external code can use only a limited set of API
methods to init/teardown the whole thing, reconfigure RSS and LRO
parameters, connect TIRs to flow steering and activate/deactivate TIRs.

mlx5e_rx_res is self-contained and doesn't depend on struct mlx5e_priv
or include en.h.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Introduce mlx5e_channels API to get RQNs

Currently, struct mlx5e_channels is defined in en.h, along with a lot of
other stuff. In the following commit mlx5e_rx_res will need to get RQNs
(RQ hardware IDs), given a pointer to mlx5e_channels and the channel
index. In order to make it possible without including the whole en.h,
this commit introduces functions that will hide the implementation
details of mlx5e_channels.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Use a new initializer to build uniform indir table

Replace mlx5e_build_default_indir_rqt with a new initializer of struct
mlx5e_rss_params_indir that works directly with the struct, rather than
its internals.

The new initializer is called mlx5e_rss_params_indir_init_uniform, which
also reflects the purpose (uniform spreading) better.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx4: make the array states static const, makes object smaller

Don't populate the array states on the stack but instead it
static const. Makes the object code smaller by 79 bytes.

Before:
   text   data   bss    dec    hex filename
  21309   8304   192  29805   746d drivers/net/ethernet/mellanox/mlx4/qp.o

After:
   text   data   bss    dec    hex filename
  21166   8368   192  29726   741e drivers/net/ethernet/mellanox/mlx4/qp.o

(gcc version 10.2.0)

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20210801153742.147304-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: 3c509: make the array if_names static const, makes object smaller

Don't populate the array if_names on the stack but instead it
static const. Makes the object code smaller by 99 bytes.

Before:
   text    data     bss     dec     hex filename
  27886   10752     672   39310    998e ./drivers/net/ethernet/3com/3c509.o

After:
   text    data     bss     dec     hex filename
  27723   10816     672   39211    992b ./drivers/net/ethernet/3com/3c509.o

(gcc version 10.2.0)

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210801152650.146572-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

dpaa2-eth: make the array faf_bits static const, makes object smaller

Don't populate the array faf_bits on the stack but instead it
static const. Makes the object code smaller by 175 bytes.

Before:
   text  data   bss     dec   hex filename
   9645  4552     0   14197  3775 ../freescale/dpaa2/dpaa2-eth-devlink.o

After:
   text  data   bss     dec   hex filename
   9406  4616     0   14022  36c6 ../freescale/dpaa2/dpaa2-eth-devlink.o

(gcc version 10.2.0)

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210801152209.146359-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

qlcnic: make the array random_data static const, makes object smaller

Don't populate the array random_data on the stack but instead it
static const. Makes the object code smaller by 66 bytes.

Before:
   text    data     bss     dec     hex filename
  52895   10976       0   63871    f97f ../qlogic/qlcnic/qlcnic_ethtool.o

After:
   text    data     bss     dec     hex filename
  52701   11104       0   63805    f93d ../qlogic//qlcnic/qlcnic_ethtool.o

(gcc version 10.2.0)

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210801151659.146113-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: marvell: make the array name static, makes object smaller

Don't populate the const array name on the stack but instead it
static. Makes the object code smaller by 28 bytes. Add a missing
const to clean up a checkpatch warning.

Before:
   text    data   bss     dec     hex filename
124565   31565   384  156514   26362 drivers/net/ethernet/marvell/sky2.o

After:
   text    data   bss     dec     hex filename
124441   31661   384  156486   26346 drivers/net/ethernet/marvell/sky2.o

(gcc version 10.2.0)

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210801150647.145728-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

cxgb4: make the array match_all_mac static, makes object smaller

Don't populate the array match_all_mac on the stack but instead it
static const. Makes the object code smaller by 75 bytes.

Before:
   text    data     bss     dec     hex filename
  46701    8960      64   55725    d9ad ../chelsio/cxgb4/cxgb4_filter.o

After:
   text    data     bss     dec     hex filename
  46338    9120     192   55650    d962 ../chelsio/cxgb4/cxgb4_filter.o

(gcc version 10.2.0)

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210801151205.145924-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv4: Fix refcount warning for new fib_info

Ioana reported a refcount warning when booting over NFS:

[    5.042532] ------------[ cut here ]------------
[    5.047184] refcount_t: addition on 0; use-after-free.
[    5.052324] WARNING: CPU: 7 PID: 1 at lib/refcount.c:25 refcount_warn_saturate+0xa4/0x150
...
[    5.167201] Call trace:
[    5.169635]  refcount_warn_saturate+0xa4/0x150
[    5.174067]  fib_create_info+0xc00/0xc90
[    5.177982]  fib_table_insert+0x8c/0x620
[    5.181893]  fib_magic.isra.0+0x110/0x11c
[    5.185891]  fib_add_ifaddr+0xb8/0x190
[    5.189629]  fib_inetaddr_event+0x8c/0x140

fib_treeref needs to be set after kzalloc. The old code had a ++ which
led to the confusion when the int was replaced by a refcount_t.

Fixes: 79976892f7ea ("net: convert fib_treeref from int to refcount_t")
Signed-off-by: David Ahern <dsahern@kernel.org>
Reported-by: Ioana Ciornei <ciorneiioana@gmail.com>
Cc: Yajun Deng <yajun.deng@linux.dev>
Tested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/r/20210802160221.27263-1-dsahern@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: mscc: make some arrays static const, makes object smaller

Don't populate arrays on the stack but instead them static const.
Makes the object code smaller by 280 bytes.

Before:
   text    data     bss     dec     hex filename
  24142    4368     192   28702    701e ./drivers/net/phy/mscc/mscc_ptp.o

After:
   text    data     bss     dec     hex filename
  23830    4400     192   28422    6f06 ./drivers/net/phy/mscc/mscc_ptp.o

(gcc version 10.2.0)

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210801070155.139057-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

netdevsim: make array res_ids static const, makes object smaller

Don't populate the array res_ids on the stack but instead it
static const. Makes the object code smaller by 14 bytes.

Before:
   text    data     bss     dec     hex filename
  50833    8314     256   59403    e80b ./drivers/net/netdevsim/fib.o

After:
   text    data     bss     dec     hex filename
  50755    8378     256   59389    e7fd ./drivers/net/netdevsim/fib.o

(gcc version 10.2.0)

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210801065328.138906-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/ipv4: Replace one-element array with flexible-array member

There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Use an anonymous union with a couple of anonymous structs in order to
keep userspace unchanged:

$ pahole -C ip_msfilter net/ipv4/ip_sockglue.o
struct ip_msfilter {
union {
struct {
__be32     imsf_multiaddr_aux;   /*     0     4 */
__be32     imsf_interface_aux;   /*     4     4 */
__u32      imsf_fmode_aux;       /*     8     4 */
__u32      imsf_numsrc_aux;      /*    12     4 */
__be32     imsf_slist[1];        /*    16     4 */
};                                       /*     0    20 */
struct {
__be32     imsf_multiaddr;       /*     0     4 */
__be32     imsf_interface;       /*     4     4 */
__u32      imsf_fmode;           /*     8     4 */
__u32      imsf_numsrc;          /*    12     4 */
__be32     imsf_slist_flex[0];   /*    16     0 */
};                                       /*     0    16 */
};                                               /*     0    20 */

/* size: 20, cachelines: 1, members: 1 */
/* last cacheline: 20 bytes */
};

Also, refactor the code accordingly and make use of the struct_size()
and flex_array_size() helpers.

This helps with the ongoing efforts to globally enable -Warray-bounds
and get us closer to being able to tighten the FORTIFY_SOURCE routines
on memcpy().

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.10/process/deprecated.html#zero-length-and-one-element-arrays

Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/109
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: remove the struct packet_type argument from dsa_device_ops::rcv()

No tagging driver uses this.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>