Ideally, when the switch receives a packet from swp3 or swp4, it should
forward the packet to the CPU, according to the port matrix and unknown
unicast flood settings.
But packet loss will happen if the destination address is at one of the
offloaded ports (swp0~2). For example, when client C sends a packet to
A, the FDB lookup will indicate that it should be forwarded to swp0, but
the port matrix of swp3 and swp4 is configured to only allow the CPU to
be its destination, so it is dropped.
However, this issue does not happen if the bridge is VLAN-aware. That is
because VLAN-aware bridges use independent VLAN learning, i.e. use VID
for FDB lookup, on offloaded ports. As swp3 and swp4 are not offloaded,
shared VLAN learning with default filter ID of 0 is used instead. So the
lookup for A with filter ID 0 never hits and the packet can be forwarded
to the CPU.
In the current code, only two combinations were used to toggle user
ports' VLAN awareness: one is PCR.PORT_VLAN set to port matrix mode with
PVC.VLAN_ATTR set to transparent port, the other is PCR.PORT_VLAN set to
security mode with PVC.VLAN_ATTR set to user port.
It turns out that only PVC.VLAN_ATTR contributes to VLAN awareness, and
port matrix mode just skips the VLAN table lookup. The reference manual
is somehow misleading when describing PORT_VLAN modes. It states that
PORT_MEM (VLAN port member) is used for destination if the VLAN table
lookup hits, but actually **PORT_MEM & PORT_MATRIX** (bitwise AND of
VLAN port member and port matrix) is used instead, which means we can
have two or more separate VLAN-aware bridges with the same PVID and
traffic won't leak between them.
Therefore, to solve this, enable independent VLAN learning with PVID 0
on VLAN-unaware bridges, by setting their PCR.PORT_VLAN to fallback
mode, while leaving standalone ports in port matrix mode. The CPU port
is always set to fallback mode to serve those bridges.
During testing, it is found that FDB lookup with filter ID of 0 will
also hit entries with VID 0 even with independent VLAN learning. To
avoid that, install all VLANs with filter ID of 1.
Signed-off-by: DENG Qingfang <dqfext@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Address learning is enabled on offloaded ports (swp0~2) and the CPU
port, so when client A sends a packet to C, the following will happen:
1. The switch learns that client A can be reached at swp0.
2. The switch probably already knows that client C can be reached at the
CPU port, so it forwards the packet to the CPU.
3. The bridge core knows client C can be reached at bond0, so it
forwards the packet back to the switch.
4. The switch learns that client A can be reached at the CPU port.
5. The switch forwards the packet to either swp3 or swp4, according to
the packet's tag.
That makes client A's MAC address flap between swp0 and the CPU port. If
client B sends a packet to A, it is possible that the packet is
forwarded to the CPU. With offload_fwd_mark = 1, the bridge core won't
forward it back to the switch, resulting in packet loss.
As we have the assisted_learning_on_cpu_port in DSA core now, enable
that and disable hardware learning on the CPU port.
Signed-off-by: DENG Qingfang <dqfext@gmail.com> Reviewed-by: Vladimir Oltean <oltean@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 4 Aug 2021 09:12:05 +0000 (10:12 +0100)]
Merge branch 'ipa-pm-irqs'
Alex Elder says:
====================
net: ipa: prepare GSI interrupts for runtime PM
The last patch in this series arranges for GSI interrupts to be
disabled when the IPA hardware is suspended. This ensures the clock
is always operational when a GSI interrupt fires. Leading up to
that are patches that rearrange the code a bit to allow this to
be done.
The first two patches aren't *directly* related. They remove some
flag arguments to some GSI suspend/resume related functions, using
the version field now present in the GSI structure.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Tue, 3 Aug 2021 14:01:03 +0000 (09:01 -0500)]
net: ipa: disable GSI interrupts while suspended
Introduce new functions gsi_suspend() and gsi_resume(), which will
disable the GSI interrupt handler after all endpoints are suspended
and re-enable it before endpoints are resumed. This will ensure no
GSI interrupt handler will fire when the hardware is suspended.
Here's a little further explanation. There are seven GSI interrupt
types, and most are disabled except when needed.
- These two are not used (never enabled):
GSI_INTER_EE_CH_CTRL
GSI_INTER_EE_EV_CTRL
- These two are only used to implement channel and event ring
commands, and are only enabled while a command is underway:
GSI_CH_CTRL
GSI_EV_CTRL
- The IEOB interrupt signals I/O completion. It will not fire
when a channel is stopped (or "suspended").
GSI_IEOB
- This interrupt is used to allocate or halt modem channels,
and is only enabled while such a command is underway.
GSI_GLOB_EE
However it also is used to signal certain errors, and this could
occur at any time.
- The general interrupt signals general errors, and could occur at
any time.
GSI_GENERAL
The purpose for this change is to ensure no global or general
interrupts fire due to errors while the hardware is suspended.
We enable the clock on resume, and at that time we can "handle"
(at least report) these error conditions.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Tue, 3 Aug 2021 14:01:02 +0000 (09:01 -0500)]
net: ipa: move gsi_irq_init() code into setup
The GSI IRQ handler could be triggered as soon as it is registered
with request_irq(). The handler function, gsi_isr(), touches
hardware, meaning the IPA clock must be operational. The IPA clock
is not operating when the handler is registered (in gsi_irq_init()),
so this is a problem.
Move the call to request_irq() for the GSI interrupt handler into
gsi_irq_setup(), which is called when the IPA clock is known to be
operational (and furthermore, the GSI firmware will have been
loaded). Request the IRQ at the end of that function, after all
interrupt types have been disabled and masked.
Move the matching free_irq() call into gsi_irq_teardown(), and get
rid of the now empty gsi_irq_exit(),
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Tue, 3 Aug 2021 14:01:01 +0000 (09:01 -0500)]
net: ipa: have gsi_irq_setup() return an error code
Change gsi_irq_setup() so it returns an error value, and introduce
gsi_irq_teardown() as its inverse. Set the interrupt type (IRQ
rather than MSI) in gsi_irq_setup().
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Tue, 3 Aug 2021 14:01:00 +0000 (09:01 -0500)]
net: ipa: move some GSI setup functions
Move gsi_irq_setup() and gsi_ring_setup() so they're defined right
above gsi_setup() where they're called. This is a trivial movement
of code to prepare for upcoming patches.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Tue, 3 Aug 2021 14:00:59 +0000 (09:00 -0500)]
net: ipa: move version check for channel suspend/resume
Change the Boolean flags passed to __gsi_channel_start() and
__gsi_channel_stop() so they represent whether the request is being
made to implement suspend (versus stop) or resume (versus start).
Then stop or start the channel for suspend/resume requests only if
the hardware version indicates it should be done.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Tue, 3 Aug 2021 14:00:58 +0000 (09:00 -0500)]
net: ipa: use gsi->version for channel suspend/resume
The GSI layer has the IPA version now, so there's no need for
version-specific flags to be passed from IPA. One instance of
this is in gsi_channel_suspend() and gsi_channel_resume(), which
indicate whether or not the endpoint suspend is implemented by
GSI stopping the channel. We can make that determination based
on gsi->version, eliminating the need for a Boolean flag in those
functions.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 4 Aug 2021 09:10:12 +0000 (10:10 +0100)]
Merge branch 'mhi-mbim'
Loic Poulain says:
====================
net: mhi: move MBIM to WWAN
Implement a proper WWAN driver for MBIM network protocol, with multi link
management supported through the WWAN framework (wwan rtnetlink).
Until now, MBIM over MHI was supported directly in the mhi_net driver, via
some protocol rx/tx fixup callbacks, but with only one session supported
(no multilink muxing). We can then remove that part from mhi_net and restore
the driver to a simpler version for 'raw' ip transfer (or QMAP via rmnet link).
Note that a wwan0 link is created by default for session-id 0. Additional links
can be managed via ip tool:
$ ip link add dev wwan0mms parentdev wwan0 type wwan linkid 1
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Loic Poulain [Tue, 3 Aug 2021 13:36:29 +0000 (15:36 +0200)]
net: mhi: Remove MBIM protocol
The MBIM protocol has now been integrated in a proper WWAN driver. We
can then revert back to a simpler driver for mhi_net, which is used
for raw IP or QMAP protocol (via rmnet link).
Loic Poulain [Tue, 3 Aug 2021 13:36:28 +0000 (15:36 +0200)]
net: wwan: Add MHI MBIM network driver
Add new wwan driver for MBIM over MHI. MBIM is a transport protocol
for IP packets, allowing packet aggregation and muxing. Initially
designed for USB bus, it is also exposed through MHI bus for QCOM
based PCIe wwan modems.
This driver supports the new wwan rtnetlink interface for multi-link
management and has been tested with Quectel EM120R-GL M2 module.
Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Tue, 3 Aug 2021 13:05:26 +0000 (06:05 -0700)]
net: add netif_set_real_num_queues() for device reconfig
netif_set_real_num_rx_queues() and netif_set_real_num_tx_queues()
can fail which breaks drivers trying to implement reconfiguration
in a way that can't leave the device half-broken. In other words
those functions are incompatible with prepare/commit approach.
Luckily setting real number of queues can fail only if the number
is increased, meaning that if we order operations correctly we
can guarantee ending up with either new config (success), or
the old one (on error).
Provide a helper implementing such logic so that drivers don't
have to duplicate it.
Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Rocco Yue [Tue, 3 Aug 2021 12:02:50 +0000 (20:02 +0800)]
net: add extack arg for link ops
Pass extack arg to validate_linkmsg and validate_link_af callbacks.
If a netlink attribute has a reject_message, use the extended ack
mechanism to carry the message back to user space.
Signed-off-by: Rocco Yue <rocco.yue@mediatek.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Rao Shoaib [Sun, 1 Aug 2021 07:57:07 +0000 (00:57 -0700)]
af_unix: Add OOB support
This patch adds OOB support for AF_UNIX sockets.
The semantics is same as TCP.
The last byte of a message with the OOB flag is
treated as the OOB byte. The byte is separated into
a skb and a pointer to the skb is stored in unix_sock.
The pointer is used to enforce OOB semantics.
Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 4 Aug 2021 08:53:34 +0000 (09:53 +0100)]
Merge branch 'dpaa2-switch-next'
Ioana Ciornei says:
====================
dpaa2-switch: integrate the MAC endpoint support
This patch set integrates the already available MAC support into the
dpaa2-switch driver as well.
The first 4 patches are fixing up some minor problems or optimizing the
code, while the remaining ones are actually integrating the dpaa2-mac
support into the switch driver by calling the dpaa2_mac_* provided
functions. While at it, we also export the MAC statistics in ethtool
like we do for dpaa2-eth.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Tue, 3 Aug 2021 16:57:44 +0000 (19:57 +0300)]
dpaa2-switch: add a prefix to HW ethtool stats
In the next patch, we'll add support for also exporting the MAC
statistics in the ethtool stats. Annotate already present HW stats with
a suggestive prefix.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Tue, 3 Aug 2021 16:57:43 +0000 (19:57 +0300)]
dpaa2-switch: integrate the MAC endpoint support
Integrate the common MAC endpoint management support into the
dpaa2-switch driver as well. Nothing special happens here, just that the
already available dpaa2-mac functions are also called from dpaa2-switch.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Tue, 3 Aug 2021 16:57:42 +0000 (19:57 +0300)]
bus: fsl-mc: extend fsl_mc_get_endpoint() to pass interface ID
In case of a switch DPAA2 object, the interface ID is also needed when
querying for the object endpoint. Extend fsl_mc_get_endpoint() so that
users can also pass the interface ID that are interested in.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Tue, 3 Aug 2021 16:57:41 +0000 (19:57 +0300)]
dpaa2-switch: no need to check link state right after ndo_open
The call to dpaa2_switch_port_link_state_update is a leftover from the
time when on DPAA2 platforms the PHYs were started at boot time so when
an ifconfig was issued on the associated interface, the link status
needed to be checked directly from the ndo_open() callback. This is not
needed anymore since we are now properly integrated with the PHY layer
thus a link interrupt will come directly from the PHY eventually without
the need to call the sync function.
Fix this up by removing the call to dpaa2_switch_port_link_state_update.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Tue, 3 Aug 2021 16:57:40 +0000 (19:57 +0300)]
dpaa2-switch: do not enable the DPSW at probe time
We should not enable the switch interfaces at probe time since this is
trigged by the open callback. Remove the call dpsw_enable() which does
exactly this.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Tue, 3 Aug 2021 16:57:39 +0000 (19:57 +0300)]
dpaa2-switch: use the port index in the IRQ handler
The MC firmware supplies us the switch interface index for which an
interrupt was triggered. Use this to our advantage instead of looping
through all the switch ports and doing unnecessary work.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Tue, 3 Aug 2021 16:57:38 +0000 (19:57 +0300)]
dpaa2-switch: request all interrupts sources on the DPSW
Request all interrupt sources to be read and then cleared on the DPSW
object. In the next patches we'll also add support for treating other
interrupts.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Joakim Zhang [Tue, 3 Aug 2021 05:24:24 +0000 (13:24 +0800)]
net: fec: fix MAC internal delay doesn't work
This patch intends to fix MAC internal delay doesn't work, due to use
of_property_read_u32() incorrectly, and improve this feature a bit:
1) check the delay value if valid, only program register when it's 2000ps.
2) only enable "enet_2x_txclk" clock when require MAC internal delay.
Vladimir Oltean [Mon, 2 Aug 2021 19:51:37 +0000 (22:51 +0300)]
net: dsa: tag_sja1105: consistently fail with arbitrary input
Dan Carpenter's smatch tests report that the "vid" variable, populated
by sja1105_vlan_rcv when an skb is received by the tagger that has a
VLAN ID which cannot be decoded by tag_8021q, may be uninitialized when
used here:
The sja1105 driver, by construction, sets up the switch in a way that
all data plane packets sent towards the CPU port are VLAN-tagged. So it
is practically impossible, in a functional system, for a packet to be
processed by sja1110_rcv() which is not a control packet and does not
have a VLAN header either.
However, it would be nice if the sja1105 tagging driver could
consistently do something valid, for example fail, even if presented with
packets that do not hold valid sja1105 tags. Currently it is a bit hard
to argue that it does that, given the fact that a data plane packet with
no VLAN tag will trigger a call to dsa_find_designated_bridge_port_by_vid
with a vid argument that is an uninitialized stack variable.
To fix this, we can initialize the u16 vid variable with 0, a value that
can never be a bridge VLAN, so dsa_find_designated_bridge_port_by_vid
will always return a NULL skb->dev.
Vladimir Oltean [Mon, 2 Aug 2021 11:36:33 +0000 (14:36 +0300)]
net: bridge: switchdev: fix incorrect use of FDB flags when picking the dst device
Nikolay points out that it is incorrect to assume that it is impossible
to have an fdb entry with fdb->dst == NULL and the BR_FDB_LOCAL bit in
fdb->flags not set. This is because there are reader-side places that
test_bit(BR_FDB_LOCAL, &fdb->flags) without the br->hash_lock, and if
the updating of the FDB entry happens on another CPU, there are no
memory barriers at writer or reader side which would ensure that the
reader sees the updates to both fdb->flags and fdb->dst in the same
order, i.e. the reader will not see an inconsistent FDB entry.
So we must be prepared to deal with FDB entries where fdb->dst and
fdb->flags are in a potentially inconsistent state, and that means that
fdb->dst == NULL should remain a condition to pick the net_device that
we report to switchdev as being the bridge device, which is what the
code did prior to the blamed patch.
Fixes: 52e4bec15546 ("net: bridge: switchdev: treat local FDBs the same as entries towards the bridge") Suggested-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Link: https://lore.kernel.org/r/20210802113633.189831-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yajun Deng [Tue, 3 Aug 2021 07:37:39 +0000 (15:37 +0800)]
net: decnet: Fix refcount warning for new dn_fib_info
fib_treeref needs to be set after kzalloc. The old code had a ++ which
led to the confusion when the int was replaced by a refcount_t.
Fixes: 79976892f7ea ("net: convert fib_treeref from int to refcount_t") Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20210803073739.22339-1-yajun.deng@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David S. Miller [Tue, 3 Aug 2021 12:05:27 +0000 (13:05 +0100)]
Merge branch 'Space-cleanup'
Arnd Bergmann says:
====================
drivers/net/Space.c cleanup
I discovered that there are still a couple of drivers that rely on
beiong statically initialized from drivers/net/Space.c the way
we did in the last century. As it turns out, there are a couple
of simplifications that can be made here, as well as some minor
bugfixes.
There are four classes of drivers that use this:
- most 10mbit ISA bus ethernet drivers (and one 100mbit one)
- both ISA localtalk drivers
- several m68k ethernet drivers
- one obsolete WAN driver
I found that the drivers using in arch/m68k/ don't actually benefit
from being probed this way as they do not rely on the netdev= command
line arguments, they have simply never been changed to work like a
modern driver.
I had previously sent a patch to remove the sbni/granch driver, and
there were no objections to this patch but forgot to resend it after
some discussion about another patch in the same series.
For the ISA drivers, there is usually no way to probe multiple devices
at boot time other than the netdev= arguments, so all that logic is left
in place for the moment, but centralized in a single file that only gets
included in the kernel build if one or more of the drivers are built-in.
I'm also changing the old-style init_module() functions in these drivers
to static functions with a module_init() annotation, to more closely
resemble modern drivers. These are the last drivers in the kernel to
still use init_module/cleanup_module, removing those may enable future
cleanups to the module loading process.
Arnd
Changes in v2:
- replace xsurf100 change with Michael's version
- make it PATCH instead of RFC
- rebase to net-next as of August 3
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Tue, 3 Aug 2021 11:40:51 +0000 (13:40 +0200)]
ethernet: isa: convert to module_init/module_exit
There are a couple of ISA ethernet drivers that use the old
init_module/cleanup_module function names for the main entry
points, nothing else uses those any more.
Change them to the documented method with module_init()
and module_exit() markers next to static functions.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Tue, 3 Aug 2021 11:40:49 +0000 (13:40 +0200)]
wan: remove sbni/granch driver
The driver was merged in 1999 and has only ever seen treewide cleanups
since then, with no indication whatsoever that anyone has actually
had access to hardware for testing the patches.
>From the information in the link below, it appears that the hardware
is for some leased line system in Russia that has since been
discontinued, and useless without any remote end to connect to.
As the driver still feels like a Linux-2.2 era artifact today, it
appears that the best way forward is to just delete it.
Arnd Bergmann [Tue, 3 Aug 2021 11:40:47 +0000 (13:40 +0200)]
make legacy ISA probe optional
There are very few ISA drivers left that rely on the static probing from
drivers/net/Space.o. Make them all select a new CONFIG_NETDEV_LEGACY_INIT
symbol, and drop the entire probe logic when that is disabled.
The 9 drivers that are called from Space.c are the same set that
calls netdev_boot_setup_check().
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Schmitz [Tue, 3 Aug 2021 11:40:45 +0000 (13:40 +0200)]
xsurf100: drop include of lib8390.c
Now that ax88796.c exports the ax_NS8390_reinit() symbol, we can
include 8390.h instead of lib8390.c, avoiding duplication of that
function and killing a few compile warnings in the bargain.
Fixes: 861928f4e60e826c ("net-next: New ax88796 platform
driver for Amiga X-Surf 100 Zorro board (m68k)")
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Schmitz [Tue, 3 Aug 2021 11:40:44 +0000 (13:40 +0200)]
ax88796: export ax_NS8390_init() hook
The block I/O code for the new X-Surf 100 ax88796 driver needs
ax_NS8390_init() for error fixup in its block_output function.
Export this static function through the ax_NS8390_reinit()
wrapper so we can lose the lib8380.c include in the X-Surf 100
driver.
[arnd: add the declaration in the header to avoid a
-Wmissing-prototypes warning] Fixes: 861928f4e60e826c ("net-next: New ax88796 platform
driver for Amiga X-Surf 100 Zorro board (m68k)") Signed-off-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Tue, 3 Aug 2021 11:40:43 +0000 (13:40 +0200)]
m68k: remove legacy probing
There are six m68k specific drivers that use the legacy probe method
in drivers/net/Space.c. However, all of these only support a single
device, and they completely ignore the command line settings from
netdev_boot_setup_check, so there is really no point at all.
Aside from sun3_82586, these already have a module_init function that
can be used for built-in mode as well, simply by removing the #ifdef.
Note that the 82596 driver was previously used on ISA as well, but
that got dropped long ago.
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Tue, 3 Aug 2021 11:40:42 +0000 (13:40 +0200)]
cs89x0: rework driver configuration
There are two drivers in the cs89x0 file, with the CONFIG_CS89x0_PLATFORM
symbol deciding which one is getting built. This is somewhat confusing
and makes it more likely ton configure a driver that works nowhere.
Split up the Kconfig option into separate ISA and PLATFORM drivers,
with the ISA symbol explicitly connecting to the static probing in
drivers/net/Space.c
The two drivers are still mutually incompatible at compile time,
which could be lifted by splitting them into multiple files,
but in practice this will make no difference.
The platform driver can now be enabled for compile-testing on
non-ARM machines.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Tue, 3 Aug 2021 11:40:41 +0000 (13:40 +0200)]
3c509: stop calling netdev_boot_setup_check
This driver never uses the information returned by
netdev_boot_setup_check, and is not called by the boot-time probing from
driver/net/Space.c, so just remove these stale references.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
The data from the kernel command line is no longer used since the
probe function gets it from the platform device resources instead.
The jazz version was changed to be like this in 2007, the xtensa
version apparently copied the code from there.
Fixes: ed9f0e0bf3ce ("remove setup of platform device from jazzsonic.c") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 3 Aug 2021 11:58:22 +0000 (12:58 +0100)]
Merge branch 'ethtool-runtime-pm'
Heiner Kallweit says:
====================
ethtool: runtime-resume netdev parent before ethtool ops
If a network device is runtime-suspended then:
- network device may be flagged as detached and all ethtool ops (even if
not accessing the device) will fail because netif_device_present()
returns false
- ethtool ops may fail because device is not accessible (e.g. because being
in D3 in case of a PCI device)
It may not be desirable that userspace can't use even simple ethtool ops
that not access the device if interface or link is down. To be more friendly
to userspace let's ensure that device is runtime-resumed when executing
ethtool ops in kernel.
This patch series covers the typical case that the netdev parent is power-
managed, e.g. a PCI device. Not sure whether cases exist where the netdev
itself is power-managed. If yes then we may need an extension for this.
But the series as-is at least shouldn't cause problems in that case.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Sun, 1 Aug 2021 10:41:31 +0000 (12:41 +0200)]
ethtool: runtime-resume netdev parent in ethnl_ops_begin
If a network device is runtime-suspended then:
- network device may be flagged as detached and all ethtool ops (even if not
accessing the device) will fail because netif_device_present() returns
false
- ethtool ops may fail because device is not accessible (e.g. because being
in D3 in case of a PCI device)
It may not be desirable that userspace can't use even simple ethtool ops
that not access the device if interface or link is down. To be more friendly
to userspace let's ensure that device is runtime-resumed when executing the
respective ethtool op in kernel.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Sun, 1 Aug 2021 10:40:05 +0000 (12:40 +0200)]
ethtool: move netif_device_present check from ethnl_parse_header_dev_get to ethnl_ops_begin
If device is runtime-suspended and not accessible then it may be
flagged as not present. If checking whether device is present is
done too early then we may bail out before we have the chance to
runtime-resume the device. Therefore move this check to
ethnl_ops_begin(). This is in preparation of a follow-up patch
that tries to runtime-resume the device before executing ethtool
ops.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Sun, 1 Aug 2021 10:36:48 +0000 (12:36 +0200)]
ethtool: runtime-resume netdev parent before ethtool ioctl ops
If a network device is runtime-suspended then:
- network device may be flagged as detached and all ethtool ops (even if not
accessing the device) will fail because netif_device_present() returns
false
- ethtool ops may fail because device is not accessible (e.g. because being
in D3 in case of a PCI device)
It may not be desirable that userspace can't use even simple ethtool ops
that not access the device if interface or link is down. To be more friendly
to userspace let's ensure that device is runtime-resumed when executing the
respective ethtool op in kernel.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 2 Aug 2021 17:57:29 +0000 (10:57 -0700)]
virtio-net: realign page_to_skb() after merges
We ended up merging two versions of the same patch set:
commit 8fb7da9e9907 ("virtio_net: get build_skb() buf by data ptr")
commit 5c37711d9f27 ("virtio-net: fix for unable to handle page fault for address")
into net, and
commit 7bf64460e3b2 ("virtio-net: get build_skb() buf by data ptr")
commit 6c66c147b9a4 ("virtio-net: fix for unable to handle page fault for address")
into net-next. Redo the merge from commit 126285651b7f ("Merge
ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net"), so that
the most recent code remains.
Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 3 Aug 2021 11:38:47 +0000 (12:38 +0100)]
Merge branch 'bnxt_en-rx-ring'
Michael Chan says:
====================
bnxt_en: Increase maximum RX ring size when jumbo ring is unused
The RX jumbo ring is automatically enabled when HW GRO/LRO is enabled or
when the MTU exceeds the page size. The RX jumbo ring provides a lot
more RX buffer space when it is in use. When the RX jumbo ring is not
in use, some users report that the current maximum of 2K buffers is
too limiting. This patchset increases the maximum to 8K buffers when
the RX jumbo ring is not used. The default RX ring size is unchanged
at 511.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 2 Aug 2021 14:52:39 +0000 (10:52 -0400)]
bnxt_en: Increase maximum RX ring size if jumbo ring is not used
The current maximum RX ring size is defined assuming the RX jumbo ring
(aka aggregation ring) is used. The RX jumbo ring is automicatically used
when the MTU exceeds a threshold or when rx-gro-hw/lro is enabled. The RX
jumbo ring is automatically sized up to 4 times the size of the RX ring
size.
The BNXT_MAX_RX_DESC_CNT constant is the upper limit on the size of the
RX ring whether or not the RX jumbo ring is used. Obviously, the
maximum amount of RX buffer space is significantly less when the RX jumbo
ring is not used.
To increase flexibility for the user who does not use the RX jumbo ring,
we now define a bigger maximum RX ring size when the RX jumbo ring is not
used. The maximum RX ring size is now up to 8K when the RX jumbo ring
is not used. The maximum completion ring size also needs to be scaled
up to accomodate the larger maximum RX ring size.
Note that when the RX jumbo ring is re-enabled, the RX ring size will
automatically drop if it exceeds the maximum.
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 2 Aug 2021 14:52:38 +0000 (10:52 -0400)]
bnxt_en: Don't use static arrays for completion ring pages
We currently store these page addresses and DMA addreses in static
arrays. On systems with 4K pages, we support up to 64 pages per
completion ring. The actual number of pages for each completion ring
may be much less than 64. For example, when the RX ring size is set
to the default 511 entries, only 16 completion ring pages are needed
per ring.
In the next patch, we'll be doubling the maximum number of completion
pages. So we convert to allocate these arrays as needed instead of
declaring them statically.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Mon, 2 Aug 2021 03:02:19 +0000 (11:02 +0800)]
bonding: add new option lacp_active
Add an option lacp_active, which is similar with team's runner.active.
This option specifies whether to send LACPDU frames periodically. If set
on, the LACPDU frames are sent along with the configured lacp_rate
setting. If set off, the LACPDU frames acts as "speak when spoken to".
Note, the LACPDU state frames still will be sent when init or unbind port.
v2: remove module parameter
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Len Baker [Sun, 1 Aug 2021 17:12:26 +0000 (19:12 +0200)]
drivers/net/usb: Remove all strcpy() uses
strcpy() performs no bounds checking on the destination buffer. This
could result in linear overflows beyond the end of the buffer, leading
to all kinds of misbehaviors. The safe replacement is strscpy().
Signed-off-by: Len Baker <len.baker@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net>
currently if skb does not have enough headroom skb_realloc_headrom is called.
It is not optimal because it creates new skb.
this patch set introduces new helper skb_expand_head()
Unlike skb_realloc_headroom, it does not allocate a new skb if possible;
copies skb->sk on new skb when as needed and frees original skb in case of failures.
This helps to simplify ip[6]_finish_output2(), ip6_xmit() and few other
functions in vrf, ax25 and bpf.
There are few other cases where this helper can be used
but it requires an additional investigations.
v3 changes:
- ax25 compilation warning fixed
- v5.14-rc4 rebase
- now it does not depend on non-committed pathces
v2 changes:
- helper's name was changed to skb_expand_head
- fixed few mistakes inside skb_expand_head():
skb_set_owner_w should set sk on nskb
kfree was replaced by kfree_skb()
improved warning message
- added minor refactoring in changed functions in vrf and bpf patches
- removed kfree_skb() in ax25_rt_build_path caller ax25_ip_xmit
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasily Averin [Mon, 2 Aug 2021 08:52:15 +0000 (11:52 +0300)]
skbuff: introduce skb_expand_head()
Like skb_realloc_headroom(), new helper increases headroom of specified skb.
Unlike skb_realloc_headroom(), it does not allocate a new skb if possible;
copies skb->sk on new skb when as needed and frees original skb in case
of failures.
This helps to simplify ip[6]_finish_output2() and a few other similar cases.
Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 3 Aug 2021 10:16:13 +0000 (11:16 +0100)]
Merge tag 'mlx5-updates-2021-08-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
This patch-set changes the TTC (Traffic Type Classification) logic
to be independent from the mlx5 ethernet driver by renaming the traffic
types enums and making the TTC API generic to the mlx5 core driver.
It allows to decouple TTC logic from mlx5e and reused by other parts
of mlx5 drivers, namely ADQ and lag TX steering hashing.
Patches overview:
1 - Rename traffic type enums to be mlx5 generic.
2 - Rename related TTC arguments and functions.
3 - Remove dependency in the mlx5e driver from the TTC implementation.
4 - Move TTC logic to fs_ttc.
5 - Embed struct mlx5_ttc_table in fs_ttc.
The refactoring series is followed by misc' cleanup patches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Roi Dayan [Mon, 26 Jul 2021 12:11:43 +0000 (15:11 +0300)]
net/mlx5e: Remove redundant parse_attr arg
Passing parse_attr is redundant in parse_tc_nic_actions() and
mlx5e_tc_add_nic_flow() as we can get it from flow.
This is the same as with parse_tc_fdb_actions() and mlx5e_tc_add_fdb_flow().
Roi Dayan [Sun, 25 Jul 2021 13:04:00 +0000 (16:04 +0300)]
net/mlx5e: Remove redundant cap check for flow counter
The cap is very old and today will always exists.
The cap is not being checked anywhere else. Remove the check from
drop action when parsing tc rules in nic mode.
Maor Gottlieb [Fri, 2 Jul 2021 11:25:14 +0000 (14:25 +0300)]
net/mlx5e: Decouple TTC logic from mlx5e
Remove dependency in the mlx5e driver from the TTC implementation
by changing the TTC related functions to receive mlx5 generic arguments.
It allows to decouple TTC logic from mlx5e and reused by other parts of
mlx5 driver.
Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
net/mlx5e: Allocate the array of channels according to the real max_nch
The channels array in struct mlx5e_rx_res is converted to a dynamic one,
which will use the dynamic value of max_nch instead of
implementation-defined maximum of MLX5E_MAX_NUM_CHANNELS.
net/mlx5e: Hide all implementation details of mlx5e_rx_res
This commit moves all implementation details of struct mlx5e_rx_res
under en/rx_res.c. All access to RX resources is now done using methods.
Encapsulating RX resources into an object allows for better
manageability, because all the implementation details are now in a
single place, and external code can use only a limited set of API
methods to init/teardown the whole thing, reconfigure RSS and LRO
parameters, connect TIRs to flow steering and activate/deactivate TIRs.
mlx5e_rx_res is self-contained and doesn't depend on struct mlx5e_priv
or include en.h.
net/mlx5e: Introduce mlx5e_channels API to get RQNs
Currently, struct mlx5e_channels is defined in en.h, along with a lot of
other stuff. In the following commit mlx5e_rx_res will need to get RQNs
(RQ hardware IDs), given a pointer to mlx5e_channels and the channel
index. In order to make it possible without including the whole en.h,
this commit introduces functions that will hide the implementation
details of mlx5e_channels.
net/mlx5e: Use a new initializer to build uniform indir table
Replace mlx5e_build_default_indir_rqt with a new initializer of struct
mlx5e_rss_params_indir that works directly with the struct, rather than
its internals.
The new initializer is called mlx5e_rss_params_indir_init_uniform, which
also reflects the purpose (uniform spreading) better.
Colin Ian King [Sun, 1 Aug 2021 15:06:47 +0000 (16:06 +0100)]
net: marvell: make the array name static, makes object smaller
Don't populate the const array name on the stack but instead it
static. Makes the object code smaller by 28 bytes. Add a missing
const to clean up a checkpatch warning.
Before:
text data bss dec hex filename
124565 31565 384 156514 26362 drivers/net/ethernet/marvell/sky2.o
After:
text data bss dec hex filename
124441 31661 384 156486 26346 drivers/net/ethernet/marvell/sky2.o
net/ipv4: Replace one-element array with flexible-array member
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].
Use an anonymous union with a couple of anonymous structs in order to
keep userspace unchanged:
Also, refactor the code accordingly and make use of the struct_size()
and flex_array_size() helpers.
This helps with the ongoing efforts to globally enable -Warray-bounds
and get us closer to being able to tighten the FORTIFY_SOURCE routines
on memcpy().