]> git.proxmox.com Git - mirror_ubuntu-kernels.git/log
mirror_ubuntu-kernels.git
2 years agoipv6: move from strlcpy with unused retval to strscpy
Wolfram Sang [Thu, 18 Aug 2022 21:02:20 +0000 (23:02 +0200)]
ipv6: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20220818210220.8491-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoipv4: move from strlcpy with unused retval to strscpy
Wolfram Sang [Thu, 18 Aug 2022 21:02:19 +0000 (23:02 +0200)]
ipv4: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20220818210219.8467-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agocaif: move from strlcpy with unused retval to strscpy
Wolfram Sang [Thu, 18 Aug 2022 21:02:14 +0000 (23:02 +0200)]
caif: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20220818210214.8371-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agobridge: move from strlcpy with unused retval to strscpy
Wolfram Sang [Thu, 18 Aug 2022 21:02:12 +0000 (23:02 +0200)]
bridge: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20220818210212.8347-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoax25: move from strlcpy with unused retval to strscpy
Wolfram Sang [Thu, 18 Aug 2022 21:02:05 +0000 (23:02 +0200)]
ax25: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20220818210206.8299-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agovlan: move from strlcpy with unused retval to strscpy
Wolfram Sang [Thu, 18 Aug 2022 21:02:04 +0000 (23:02 +0200)]
vlan: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20220818210204.8275-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoisdn: move from strlcpy with unused retval to strscpy
Wolfram Sang [Thu, 18 Aug 2022 21:00:23 +0000 (23:00 +0200)]
isdn: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20220818210023.6889-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'validate-of-nodes-for-dsa-shared-ports'
Jakub Kicinski [Tue, 23 Aug 2022 00:45:51 +0000 (17:45 -0700)]
Merge branch 'validate-of-nodes-for-dsa-shared-ports'

Vladimir Oltean says:

====================
Validate OF nodes for DSA shared ports

This is the first set of measures taken so that more drivers can be
transitioned towards phylink on shared (CPU and DSA) ports some time in
the future. It consists of:

- expanding the DT schema for DSA and related drivers to clarify the new
  requirements.

- introducing warnings for drivers that currently skip phylink due to
  incomplete DT descriptions.

- introducing warning for drivers that currently skip phylink due to
  using platform data (search for struct dsa_chip_data).

- closing the possibility for new(ish) drivers to skip phylink, by
  validating their DT descriptions.

- making the code paths used by shared ports more evident.

- preparing the code paths used by shared ports for further work to fake
  a link description where that is possible.

More details in patch 10/10.

DT binding (patches 1-6) and kernel (7-10) are in principle separable,
but are submitted together since they're part of the same story.

Patches 8 and 9 are DSA cleanups, and patch 7 is a dependency for patch
10.

v1 at
https://patchwork.kernel.org/project/netdevbpf/patch/20220723164635.1621911-1-vladimir.oltean@nxp.com/

v2 at
https://patchwork.kernel.org/project/netdevbpf/patch/20220729132119.1191227-5-vladimir.oltean@nxp.com/

v3 at
https://patchwork.kernel.org/project/netdevbpf/cover/20220806141059.2498226-1-vladimir.oltean@nxp.com/
====================

Link: https://lore.kernel.org/r/20220818115500.2592578-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: make phylink-related OF properties mandatory on DSA and CPU ports
Vladimir Oltean [Thu, 18 Aug 2022 11:55:00 +0000 (14:55 +0300)]
net: dsa: make phylink-related OF properties mandatory on DSA and CPU ports

Early DSA drivers were kind of simplistic in that they assumed a fairly
narrow hardware layout. User ports would have integrated PHYs at an
internal MDIO address that is derivable from the port number, and shared
(DSA and CPU) ports would have an MII-style (serial or parallel)
connection to another MAC. Phylib and then phylink were used to drive
the internal PHYs, and this needed little to no description through the
platform data structures. Bringing up the shared ports at the maximum
supported link speed was the responsibility of the drivers.

As a result of this, when these early drivers were converted from
platform data to the new DSA OF bindings, there was no link information
translated into the first DT bindings.

https://lore.kernel.org/all/YtXFtTsf++AeDm1l@lunn.ch/

Later, phylink was adopted for shared ports as well, and today we have a
workaround in place, introduced by commit a20f997010c4 ("net: dsa: Don't
instantiate phylink for CPU/DSA ports unless needed"). There, DSA checks
for the presence of phy-handle/fixed-link/managed OF properties, and if
missing, phylink registration would be skipped. This is because phylink
is optional for some drivers (the shared ports already work without it),
but the process of starting to register a port with phylink is
irreversible: if phylink_create() fails to find the fwnode properties it
needs, it bails out and it leaves the ports inoperational (because
phylink expects ports to be initially down, so DSA necessarily takes
them down, and doesn't know how to put them back up again).

DSA being a common framework, new drivers opt into this workaround
willy-nilly, but the ideal behavior from the DSA core's side would have
been to not interfere with phylink's process of failing at all. This
isn't possible because of regression concerns with pre-phylink DT blobs,
but at least DSA should put a stop to the proliferation of more of such
cases that rely on the workaround to skip phylink registration, and
sanitize the environment that new drivers work in.

To that end, create a list of compatible strings for which the
workaround is preserved, and don't apply the workaround for any drivers
outside that list (this includes new drivers).

In some cases, we make the assumption that even existing drivers don't
rely on DSA's workaround, and we do this by looking at the device trees
in which they appear. We can't fully know what is the situation with
downstream DT blobs, but we can guess the overall trend by studying the
DT blobs that were submitted upstream. If there are upstream blobs that
have lacking descriptions, we take it as very likely that there are many
more downstream blobs that do so too. If all upstream blobs have
complete descriptions, we take that as a hint that the driver is a
candidate for enforcing strict DT bindings (considering that most
bindings are copy-pasted). If there are no upstream DT blobs, we take
the conservative route of allowing the workaround, unless the driver
maintainer instructs us otherwise.

The driver situation is as follows:

ar9331
~~~~~~

    compatible strings:
    - qca,ar9331-switch

    1 occurrence in mainline device trees, part of SoC dtsi
    (arch/mips/boot/dts/qca/ar9331.dtsi), description is not problematic.

    Verdict: opt into strict DT bindings and out of workarounds.

b53
~~~

    compatible strings:
    - brcm,bcm5325
    - brcm,bcm53115
    - brcm,bcm53125
    - brcm,bcm53128
    - brcm,bcm5365
    - brcm,bcm5389
    - brcm,bcm5395
    - brcm,bcm5397
    - brcm,bcm5398

    - brcm,bcm53010-srab
    - brcm,bcm53011-srab
    - brcm,bcm53012-srab
    - brcm,bcm53018-srab
    - brcm,bcm53019-srab
    - brcm,bcm5301x-srab
    - brcm,bcm11360-srab
    - brcm,bcm58522-srab
    - brcm,bcm58525-srab
    - brcm,bcm58535-srab
    - brcm,bcm58622-srab
    - brcm,bcm58623-srab
    - brcm,bcm58625-srab
    - brcm,bcm88312-srab
    - brcm,cygnus-srab
    - brcm,nsp-srab
    - brcm,omega-srab

    - brcm,bcm3384-switch
    - brcm,bcm6328-switch
    - brcm,bcm6368-switch
    - brcm,bcm63xx-switch

    I've found at least these mainline DT blobs with problems:

    arch/arm/boot/dts/bcm47094-linksys-panamera.dts
    - lacks phy-mode
    arch/arm/boot/dts/bcm47189-tenda-ac9.dts
    - lacks phy-mode and fixed-link
    arch/arm/boot/dts/bcm47081-luxul-xap-1410.dts
    arch/arm/boot/dts/bcm47081-luxul-xwr-1200.dts
    arch/arm/boot/dts/bcm47081-buffalo-wzr-600dhp2.dts
    - lacks phy-mode and fixed-link
    arch/arm/boot/dts/bcm47094-luxul-xbr-4500.dts
    arch/arm/boot/dts/bcm4708-smartrg-sr400ac.dts
    arch/arm/boot/dts/bcm4708-luxul-xap-1510.dts
    arch/arm/boot/dts/bcm953012er.dts
    arch/arm/boot/dts/bcm4708-netgear-r6250.dts
    arch/arm/boot/dts/bcm4708-buffalo-wzr-1166dhp-common.dtsi
    arch/arm/boot/dts/bcm4708-luxul-xwc-1000.dts
    arch/arm/boot/dts/bcm47094-luxul-abr-4500.dts
    - lacks phy-mode and fixed-link
    arch/arm/boot/dts/bcm53016-meraki-mr32.dts
    - lacks phy-mode

    Verdict: opt into DSA workarounds.

bcm_sf2
~~~~~~~

    compatible strings:
    - brcm,bcm4908-switch
    - brcm,bcm7445-switch-v4.0
    - brcm,bcm7278-switch-v4.0
    - brcm,bcm7278-switch-v4.8

    A single occurrence in mainline
    (arch/arm64/boot/dts/broadcom/bcm4908/bcm4908.dtsi), part of a SoC
    dtsi, valid description. Florian Fainelli explains that most of the
    bcm_sf2 device trees lack a full description for the internal IMP
    ports.

    Verdict: opt the BCM4908 into strict DT bindings, and opt the rest
    into the workarounds. Note that even though BCM4908 has strict DT
    bindings, it still does not register with phylink on the IMP port
    due to it implementing ->adjust_link().

hellcreek
~~~~~~~~~

    compatible strings:
    - hirschmann,hellcreek-de1soc-r1

    No occurrence in mainline device trees. Kurt Kanzenbach explains
    that the downstream device trees lacked phy-mode and fixed link, and
    needed work, but were fixed in the meantime.

    Verdict: opt into strict DT bindings and out of workarounds.

lan9303
~~~~~~~

    compatible strings:
    - smsc,lan9303-mdio
    - smsc,lan9303-i2c

    1 occurrence in mainline device trees:
    arch/arm/boot/dts/imx53-kp-hsc.dts
    - no phy-mode, no fixed-link

    Verdict: opt out of strict DT bindings and into workarounds.

lantiq_gswip
~~~~~~~~~~~~

    compatible strings:
    - lantiq,xrx200-gswip
    - lantiq,xrx300-gswip
    - lantiq,xrx330-gswip

    No occurrences in mainline device trees. Martin Blumenstingl
    confirms that the downstream OpenWrt device trees lack a proper
    fixed-link and need work, and that the incomplete description can
    even be seen in the example from
    Documentation/devicetree/bindings/net/dsa/lantiq-gswip.txt.

    Verdict: opt out of strict DT bindings and into workarounds.

microchip ksz
~~~~~~~~~~~~~

    compatible strings:
    - microchip,ksz8765
    - microchip,ksz8794
    - microchip,ksz8795
    - microchip,ksz8863
    - microchip,ksz8873
    - microchip,ksz9477
    - microchip,ksz9897
    - microchip,ksz9893
    - microchip,ksz9563
    - microchip,ksz8563
    - microchip,ksz9567
    - microchip,lan9370
    - microchip,lan9371
    - microchip,lan9372
    - microchip,lan9373
    - microchip,lan9374

    5 occurrences in mainline device trees, all descriptions are valid.
    But we had a snafu for the ksz8795 and ksz9477 drivers where the
    phy-mode property would be expected to be located directly under the
    'switch' node rather than under a port OF node. It was fixed by
    commit edecfa98f602 ("net: dsa: microchip: look for phy-mode in port
    nodes"). The driver still has compatibility with the old DT blobs.
    The lan937x support was added later than the above snafu was fixed,
    and even though it has support for the broken DT blobs by virtue of
    sharing a common probing function, I'll take it that its DT blobs
    are correct.

    Verdict: opt lan937x into strict DT bindings, and the others out.

mt7530
~~~~~~

    compatible strings
    - mediatek,mt7621
    - mediatek,mt7530
    - mediatek,mt7531

    Multiple occurrences in mainline device trees, one is part of an SoC
    dtsi (arch/mips/boot/dts/ralink/mt7621.dtsi), all descriptions are fine.

    Verdict: opt into strict DT bindings and out of workarounds.

mv88e6060
~~~~~~~~~

    compatible string:
    - marvell,mv88e6060

    no occurrences in mainline, nobody knows anybody who uses it.

    Verdict: opt out of strict DT bindings and into workarounds.

mv88e6xxx
~~~~~~~~~

    compatible strings:
    - marvell,mv88e6085
    - marvell,mv88e6190
    - marvell,mv88e6250

    Device trees that have incomplete descriptions of CPU or DSA ports:
    arch/arm64/boot/dts/freescale/imx8mq-zii-ultra.dtsi
    - lacks phy-mode
    arch/arm64/boot/dts/marvell/cn9130-crb.dtsi
    - lacks phy-mode and fixed-link
    arch/arm/boot/dts/vf610-zii-ssmb-spu3.dts
    - lacks phy-mode
    arch/arm/boot/dts/kirkwood-mv88f6281gtw-ge.dts
    - lacks phy-mode
    arch/arm/boot/dts/vf610-zii-spb4.dts
    - lacks phy-mode
    arch/arm/boot/dts/vf610-zii-cfu1.dts
    - lacks phy-mode
    arch/arm/boot/dts/vf610-zii-dev-rev-c.dts
    - lacks phy-mode on CPU port, fixed-link on DSA ports
    arch/arm/boot/dts/vf610-zii-dev-rev-b.dts
    - lacks phy-mode on CPU port
    arch/arm/boot/dts/armada-381-netgear-gs110emx.dts
    - lacks phy-mode
    arch/arm/boot/dts/vf610-zii-scu4-aib.dts
    - lacks fixed-link on xgmii DSA ports and/or in-band-status on
      2500base-x DSA ports, and phy-mode on CPU port
    arch/arm/boot/dts/imx6qdl-gw5904.dtsi
    - lacks phy-mode and fixed-link
    arch/arm/boot/dts/armada-385-clearfog-gtr-l8.dts
    - lacks phy-mode and fixed-link
    arch/arm/boot/dts/vf610-zii-ssmb-dtu.dts
    - lacks phy-mode
    arch/arm/boot/dts/kirkwood-dir665.dts
    - lacks phy-mode
    arch/arm/boot/dts/kirkwood-rd88f6281.dtsi
    - lacks phy-mode
    arch/arm/boot/dts/orion5x-netgear-wnr854t.dts
    - lacks phy-mode and fixed-link
    arch/arm/boot/dts/armada-388-clearfog.dts
    - lacks phy-mode
    arch/arm/boot/dts/armada-xp-linksys-mamba.dts
    - lacks phy-mode
    arch/arm/boot/dts/armada-385-linksys.dtsi
    - lacks phy-mode
    arch/arm/boot/dts/imx6q-b450v3.dts
    arch/arm/boot/dts/imx6q-b850v3.dts
    - has a phy-handle but not a phy-mode?
    arch/arm/boot/dts/armada-370-rd.dts
    - lacks phy-mode
    arch/arm/boot/dts/kirkwood-linksys-viper.dts
    - lacks phy-mode
    arch/arm/boot/dts/imx51-zii-rdu1.dts
    - lacks phy-mode
    arch/arm/boot/dts/imx51-zii-scu2-mezz.dts
    - lacks phy-mode
    arch/arm/boot/dts/imx6qdl-zii-rdu2.dtsi
    - lacks phy-mode
    arch/arm/boot/dts/armada-385-clearfog-gtr-s4.dts
    - lacks phy-mode and fixed-link

    Verdict: opt out of strict DT bindings and into workarounds.

ocelot
~~~~~~

    compatible strings:
    - mscc,vsc9953-switch
    - felix (arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi) is a PCI
      device, has no compatible string

    2 occurrences in mainline, both are part of SoC dtsi and complete.

    Verdict: opt into strict DT bindings and out of workarounds.

qca8k
~~~~~

    compatible strings:
    - qca,qca8327
    - qca,qca8328
    - qca,qca8334
    - qca,qca8337

    5 occurrences in mainline device trees, none of the descriptions are
    problematic.

    Verdict: opt into strict DT bindings and out of workarounds.

realtek
~~~~~~~

    compatible strings:
    - realtek,rtl8366rb
    - realtek,rtl8365mb

    2 occurrences in mainline, both descriptions are fine, additionally
    rtl8365mb.c has a comment "The device tree firmware should also
    specify the link partner of the extension port - either via a
    fixed-link or other phy-handle."

    Verdict: opt into strict DT bindings and out of workarounds.

rzn1_a5psw
~~~~~~~~~~

    compatible strings:
    - renesas,rzn1-a5psw

    One single occurrence, part of SoC dtsi
    (arch/arm/boot/dts/r9a06g032.dtsi), description is fine.

    Verdict: opt into strict DT bindings and out of workarounds.

sja1105
~~~~~~~

    Driver already validates its port OF nodes in
    sja1105_parse_ports_node().

    Verdict: opt into strict DT bindings and out of workarounds.

vsc73xx
~~~~~~~

    compatible strings:
    - vitesse,vsc7385
    - vitesse,vsc7388
    - vitesse,vsc7395
    - vitesse,vsc7398

    2 occurrences in mainline device trees, both descriptions are fine.

    Verdict: opt into strict DT bindings and out of workarounds.

xrs700x
~~~~~~~

    compatible strings:
    - arrow,xrs7003e
    - arrow,xrs7003f
    - arrow,xrs7004e
    - arrow,xrs7004f

    no occurrences in mainline, we don't know.

    Verdict: opt out of strict DT bindings and into workarounds.

Because there is a pattern where newly added switches reuse existing
drivers more often than introducing new ones, I've opted for deciding
who gets to opt into the workaround based on an OF compatible match
table in the DSA core. The alternative would have been to add another
boolean property to struct dsa_switch, like configure_vlan_while_not_filtering.
But this avoids situations where sometimes driver maintainers obfuscate
what goes on by sharing a common probing function, and therefore making
new switches inherit old quirks.

Side note, we also warn about missing properties for drivers that rely
on the workaround. This isn't an indication that we'll break
compatibility with those DT blobs any time soon, but is rather done to
raise awareness about the change, for future DT blob authors.

Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Acked-by: Alvin Šipraga <alsi@bang-olufsen.dk> # realtek
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: rename dsa_port_link_{,un}register_of
Vladimir Oltean [Thu, 18 Aug 2022 11:54:59 +0000 (14:54 +0300)]
net: dsa: rename dsa_port_link_{,un}register_of

There is a subset of functions that applies only to shared (DSA and CPU)
ports, yet this is difficult to comprehend by looking at their code alone.
These are dsa_port_link_register_of(), dsa_port_link_unregister_of(),
and the functions that only these 2 call.

Rename this class of functions to dsa_shared_port_* to make this fact
more evident, even if this goes against the apparent convention that
function names in port.c must start with dsa_port_.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: avoid dsa_port_link_{,un}register_of() calls with platform data
Vladimir Oltean [Thu, 18 Aug 2022 11:54:58 +0000 (14:54 +0300)]
net: dsa: avoid dsa_port_link_{,un}register_of() calls with platform data

dsa_port_link_register_of() and dsa_port_link_unregister_of() are not
written with the fact in mind that they can be called with a dp->dn that
is NULL (as evidenced even by the _of suffix in their name), but this is
exactly what happens.

How this behaves will differ depending on whether the backing driver
implements ->adjust_link() or not.

If it doesn't, the "if (of_phy_is_fixed_link(dp->dn) || phy_np)"
condition will return false, and dsa_port_link_register_of() will do
nothing and return 0.

If the driver does implement ->adjust_link(), the
"if (of_phy_is_fixed_link(dp->dn))" condition will return false
(dp->dn is NULL) and we will call dsa_port_setup_phy_of(). This will
call dsa_port_get_phy_device(), which will also return NULL, and we will
also do nothing and return 0.

It is hard to maintain this code and make future changes to it in this
state, so just suppress calls to these 2 functions if dp->dn is NULL.
The only functional effect is that if the driver does implement
->adjust_link(), we'll stop printing this to the console:

Using legacy PHYLIB callbacks. Please migrate to PHYLINK!

but instead we'll always print:

[    8.539848] dsa-loop fixed-0:1f: skipping link registration for CPU port 5

This is for the better anyway, since "using legacy phylib callbacks"
was misleading information - we weren't issuing _any_ callbacks due to
dsa_port_get_phy_device() returning NULL.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoof: base: export of_device_compatible_match() for use in modules
Vladimir Oltean [Thu, 18 Aug 2022 11:54:57 +0000 (14:54 +0300)]
of: base: export of_device_compatible_match() for use in modules

Modules such as net/dsa/dsa_core.ko might want to iterate through an
array of compatible strings for things such as validation (or rather,
skipping it for some potentially broken drivers).

of_device_is_compatible() is exported, by of_device_compatible_match()
isn't. Export the latter as well, so we don't have to open-code the
iteration.

Cc: Frank Rowand <frowand.list@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodt-bindings: net: dsa: make phylink bindings required for CPU/DSA ports
Vladimir Oltean [Thu, 18 Aug 2022 11:54:56 +0000 (14:54 +0300)]
dt-bindings: net: dsa: make phylink bindings required for CPU/DSA ports

It is desirable that new DSA drivers are written to expect that all
their ports register with phylink, and not rely on the DSA core's
workarounds to skip this process.

To that end, DSA is being changed to warn existing drivers when such DT
blobs are in use, and to opt new drivers out of the workarounds.

Introduce another layer of validation in the DSA DT schema, and assert
that CPU and DSA ports must have phylink-related properties present.

Suggested-by: Rob Herring <robh+dt@kernel.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodt-bindings: net: dsa: rzn1-a5psw: add missing CPU port phy-mode to example
Vladimir Oltean [Thu, 18 Aug 2022 11:54:55 +0000 (14:54 +0300)]
dt-bindings: net: dsa: rzn1-a5psw: add missing CPU port phy-mode to example

To prevent warnings during "make dt_bindings_check" after dsa-port.yaml
will make phylink properties mandatory, add phy-mode = "internal" to the
example.

This new property is taken straight out of the SoC dtsi at
arch/arm/boot/dts/r9a06g032.dtsi, so it seems likely that only the
example needs to be fixed, rather than DT blobs in circulation.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodt-bindings: net: dsa: microchip: add missing CPU port phy-mode to example
Vladimir Oltean [Thu, 18 Aug 2022 11:54:54 +0000 (14:54 +0300)]
dt-bindings: net: dsa: microchip: add missing CPU port phy-mode to example

The ksz_switch_chips[] element for KSZ9477 says that port 5 is an xMII
port and it supports speeds of 10/100/1000. The device tree example does
declare a fixed-link at 1000, and RGMII is the only one of those modes
that supports this speed, so use this phy-mode.

The microchip,ksz8565 compatible string is not supported by the
microchip driver, however on Microchip's product page it says that there
are 5 ports, 4 of which have internal PHYs and the 5th is an
MII/RMII/RGMII port. It's a bit strange that this is port@6, but it is
probably just the way it is. Select an RGMII phy-mode for this one as
well.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodt-bindings: net: dsa: b53: add missing CPU port phy-mode to example
Vladimir Oltean [Thu, 18 Aug 2022 11:54:53 +0000 (14:54 +0300)]
dt-bindings: net: dsa: b53: add missing CPU port phy-mode to example

Looking at b53_srab_phylink_get_caps() I get no indication of what PHY
modes does port 8 support, since it is implemented circularly based on
the p->mode retrieved from the device tree (and in PHY_INTERFACE_MODE_NA
it reports nothing to supported_interfaces).

However if I look at the b53_switch_chips[] element for BCM58XX_DEVICE_ID,
I see that port 8 is the IMP port, and SRAB means the IMP port is
internal to the SoC. So use phy-mode = "internal" in the example.

Note that this will make b53_srab_phylink_get_caps() go through the
"default" case and report PHY_INTERFACE_MODE_INTERNAL to
supported_interfaces, which is probably a good thing.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodt-bindings: net: dsa: hellcreek: add missing CPU port phy-mode/fixed-link to example
Vladimir Oltean [Thu, 18 Aug 2022 11:54:52 +0000 (14:54 +0300)]
dt-bindings: net: dsa: hellcreek: add missing CPU port phy-mode/fixed-link to example

Looking at hellcreek_phylink_get_caps(), I see that depending on whether
is_100_mbits is set, speeds of 1G or of 100M will be advertised. The
de1soc_r1_pdata sets is_100_mbits to true.

The PHY modes declared in the capabilities are MII, RGMII and GMII. GMII
doesn't support 100Mbps, and as for RGMII, it would be a bit implausible
to me to support this PHY mode but limit it to only 25 MHz. So I've
settled on MII as a phy-mode in the example, and a fixed-link of
100Mbps.

As a side note, there exists such a thing as "rev-mii", because the MII
protocol is asymmetric, and "mii" is the designation for the MAC side
(expected to be connected to a PHY), and "rev-mii" is the designation
for the PHY side (expected to be connected to a MAC). I wonder whether
"mii" or "rev-mii" should actually be used here, since this is a CPU
port and presumably connected to another MAC.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodt-bindings: net: dsa: xrs700x: add missing CPU port phy-mode to example
Vladimir Oltean [Thu, 18 Aug 2022 11:54:51 +0000 (14:54 +0300)]
dt-bindings: net: dsa: xrs700x: add missing CPU port phy-mode to example

Judging by xrs700x_phylink_get_caps(), I deduce that this switch
supports the RGMII modes on port 3, so state this phy-mode in the
example, such that users are encouraged to not rely on avoiding phylink
for this port.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: prestera: cache port state for non-phylink ports too
Maksym Glubokiy [Thu, 18 Aug 2022 11:18:21 +0000 (14:18 +0300)]
net: prestera: cache port state for non-phylink ports too

Port event data must stored to port-state cache regardless of whether
the port uses phylink or not since this data is used by ethtool.

Fixes: 52323ef75414 ("net: marvell: prestera: add phylink support")
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: freescale: xgmac: Do not dereference fwnode in struct device
zhaoxiao [Thu, 18 Aug 2022 09:50:59 +0000 (17:50 +0800)]
net: freescale: xgmac: Do not dereference fwnode in struct device

In order to make the underneath API easier to change in the future,
prevent users from dereferencing fwnode from struct device.
Instead, use the specific dev_fwnode() API for that.

Signed-off-by: zhaoxiao <zhaoxiao@uniontech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoRemove DECnet support from kernel
Stephen Hemminger [Thu, 18 Aug 2022 00:43:21 +0000 (17:43 -0700)]
Remove DECnet support from kernel

DECnet is an obsolete network protocol that receives more attention
from kernel janitors than users. It belongs in computer protocol
history museum not in Linux kernel.

It has been "Orphaned" in kernel since 2010. The iproute2 support
for DECnet was dropped in 5.0 release. The documentation link on
Sourceforge says it is abandoned there as well.

Leave the UAPI alone to keep userspace programs compiling.
This means that there is still an empty neighbour table
for AF_DECNET.

The table of /proc/sys/net entries was updated to match
current directories and reformatted to be alphabetical.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: David Ahern <dsahern@kernel.org>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'lan966x-lag-support'
David S. Miller [Mon, 22 Aug 2022 13:00:54 +0000 (14:00 +0100)]
Merge branch 'lan966x-lag-support'

Horatiu Vultur says:

====================
net: lan966x: Add lag support

Add lag support for lan966x.
First 4 patches don't do any changes to the current behaviour, they
just prepare for lag support. While the rest is to add the lag support.

v3->v4:
- aggregation configuration is global for all bonds, so make sure that
  there can't be enabled multiple configurations at the same time
- return error faster from lan966x_foreign_bridging_check, don't
  continue the search if the error is seen already
- flush fdb workqueue when a port leaves a bridge or lag.

v2->v3:
- return error code from 'switchdev_bridge_port_offload()'
- fix lan966x_foreign_dev_check(), it was missing lag support
- remove lan966x_lag_mac_add_entry and lan966x_mac_del_entry as
  they are not needed
- fix race conditions when accessing port->bond
- move FDB entries when a new port joins the lag if it has a lower

v1->v2:
- fix the LAG PGIDs when ports go down, in this way is not
  needed anymore the last patch of the series.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Extend MAC to support also lag interfaces.
Horatiu Vultur [Wed, 17 Aug 2022 19:34:49 +0000 (21:34 +0200)]
net: lan966x: Extend MAC to support also lag interfaces.

Extend MAC support to support also lag interfaces:
1. In case an entry is learned on a port that is part of lag interface,
   then notify the upper layers that the entry is learned on the bond
   interface
2. If a port leaves the bond and the port is the first port in the lag
   group, then it is required to update all MAC entries to change the
   destination port.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Extend FDB to support also lag
Horatiu Vultur [Wed, 17 Aug 2022 19:34:48 +0000 (21:34 +0200)]
net: lan966x: Extend FDB to support also lag

Offload FDB entries when the original device is a lag interface. Because
all the ports under the lag have the same chip id, which is the chip id
of first port, then add the entries only for the first port.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Add lag support for lan966x
Horatiu Vultur [Wed, 17 Aug 2022 19:34:47 +0000 (21:34 +0200)]
net: lan966x: Add lag support for lan966x

Add link aggregation hardware offload support for lan966x

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Extend lan966x_foreign_bridging_check
Horatiu Vultur [Wed, 17 Aug 2022 19:34:46 +0000 (21:34 +0200)]
net: lan966x: Extend lan966x_foreign_bridging_check

Extend lan966x_foreign_bridging_check to check also if the upper
interface is a lag device. Don't allow a lan966x port to be part of a
lag if it has foreign interfaces.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Expose lan966x_switchdev_nb and lan966x_switchdev_blocking_nb
Horatiu Vultur [Wed, 17 Aug 2022 19:34:45 +0000 (21:34 +0200)]
net: lan966x: Expose lan966x_switchdev_nb and lan966x_switchdev_blocking_nb

Expose lan966x_switchdev_nb and lan966x_switchdev_blocking_nb to the
lan966x_main.h file because they will be needed by the lag driver.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Flush fdb workqueue when port is leaving a bridge.
Horatiu Vultur [Wed, 17 Aug 2022 19:34:44 +0000 (21:34 +0200)]
net: lan966x: Flush fdb workqueue when port is leaving a bridge.

Whenever a port leaves a bridge, flush the workqueue of the FDB work.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Split lan966x_fdb_event_work
Horatiu Vultur [Wed, 17 Aug 2022 19:34:43 +0000 (21:34 +0200)]
net: lan966x: Split lan966x_fdb_event_work

Split the function lan966x_fdb_event_work. One case for when the
orig_dev is a bridge and one case when orig_dev is lan966x port.
This is preparation for lag support. There is no functional change.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Add registers used to configure lag interfaces
Horatiu Vultur [Wed, 17 Aug 2022 19:34:42 +0000 (21:34 +0200)]
net: lan966x: Add registers used to configure lag interfaces

Add the registers used by lan966x to configure the lag interface.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'tsnep-minor-improvements'
David S. Miller [Mon, 22 Aug 2022 12:56:37 +0000 (13:56 +0100)]
Merge branch 'tsnep-minor-improvements'

Gerhard Engleder says:

====================
tsnep: Various minor driver improvements

During XDP development some general driver improvements has been done
which I want to keep out of future patch series.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotsnep: Record RX queue
Gerhard Engleder [Wed, 17 Aug 2022 19:30:17 +0000 (21:30 +0200)]
tsnep: Record RX queue

Other drivers record RX queue so it should make sense to do that also.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotsnep: Support full DMA mask
Gerhard Engleder [Wed, 17 Aug 2022 19:30:16 +0000 (21:30 +0200)]
tsnep: Support full DMA mask

DMA addresses up to 64bit are supported by the device. Configure DMA
mask according to the capabilities of the device.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotsnep: Improve TX length handling
Gerhard Engleder [Wed, 17 Aug 2022 19:30:15 +0000 (21:30 +0200)]
tsnep: Improve TX length handling

TX length can by calculated more efficient during map and unmap of
fragments. Another reason is that, by moving TX statistic counting to
tsnep_tx_poll() it can be used there for XDP too.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotsnep: Add loopback support
Gerhard Engleder [Wed, 17 Aug 2022 19:30:14 +0000 (21:30 +0200)]
tsnep: Add loopback support

Add support for NETIF_F_LOOPBACK feature. Loopback mode is used for
testing.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotsnep: Fix TSNEP_INFO_TX_TIME register define
Gerhard Engleder [Wed, 17 Aug 2022 19:30:13 +0000 (21:30 +0200)]
tsnep: Fix TSNEP_INFO_TX_TIME register define

Fixed register define is not used, but register definition shall be kept
in sync.

Fixes: 403f69bbdbad ("tsnep: Add TSN endpoint Ethernet MAC driver")
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoopenvswitch: Fix overreporting of drops in dropwatch
Mike Pattrick [Wed, 17 Aug 2022 15:06:35 +0000 (11:06 -0400)]
openvswitch: Fix overreporting of drops in dropwatch

Currently queue_userspace_packet will call kfree_skb for all frames,
whether or not an error occurred. This can result in a single dropped
frame being reported as multiple drops in dropwatch. This functions
caller may also call kfree_skb in case of an error. This patch will
consume the skbs instead and allow caller's to use kfree_skb.

Signed-off-by: Mike Pattrick <mkp@redhat.com>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2109957
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoopenvswitch: Fix double reporting of drops in dropwatch
Mike Pattrick [Wed, 17 Aug 2022 15:06:34 +0000 (11:06 -0400)]
openvswitch: Fix double reporting of drops in dropwatch

Frames sent to userspace can be reported as dropped in
ovs_dp_process_packet, however, if they are dropped in the netlink code
then netlink_attachskb will report the same frame as dropped.

This patch checks for error codes which indicate that the frame has
already been freed.

Signed-off-by: Mike Pattrick <mkp@redhat.com>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2109946
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'net-phy-QUSGMII'
David S. Miller [Mon, 22 Aug 2022 12:46:26 +0000 (13:46 +0100)]
Merge branch 'net-phy-QUSGMII'

Maxime Chevallier says:

====================
net: Introduce QUSGMII phy mode

Re-sending, since the previous v4 was sent while net-next was closed.

This is a resend of the V4 of a previous series [1] initially aimed at
introducing inband extensions, with modes like QUSGMII. This mode allows
passing info in the ethernet preamble between the MAC and the PHY, such as
timestamps.

This series has now become a preliminary series, that simply introduces
the new interface mode, without support for inband extensions, that will
come later.

The reasonning is that work will need to be done in the networking
subsystem, but also in the generic phy driver subsystem to allow serdes
configuration for qusgmii.

This series add the mode, the relevant binding changes, adds support for
it in the lan966x driver, and also introduces a small helper to get the
number of links a given phy mode can carry (think 1 for SGMII and 4 for
QSGMII). This allows for better readability and will prove useful
when (if) we support PSGMII (5 links on 1 interface) and OUSGMII (8
links on one interface).

V4 contains no change but the collected Reviewed-by from Andrew.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Add QUSGMII support for lan966x
Maxime Chevallier [Wed, 17 Aug 2022 12:32:55 +0000 (14:32 +0200)]
net: lan966x: Add QUSGMII support for lan966x

The Lan996x controller supports the QUSGMII mode, which is very similar
to QSGMII in the way it's configured and the autonegociation
capababilities it provides.

This commit adds support for that mode, treating it most of the time
like QSGMII, making sure that we do configure the PCS how we should.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: phy: Add helper to derive the number of ports from a phy mode
Maxime Chevallier [Wed, 17 Aug 2022 12:32:54 +0000 (14:32 +0200)]
net: phy: Add helper to derive the number of ports from a phy mode

Some phy modes such as QSGMII multiplex several MAC<->PHY links on one
single physical interface. QSGMII used to be the only one supported, but
other modes such as QUSGMII also carry multiple links.

This helper allows getting the number of links that are multiplexed
on a given interface.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodt-bindings: net: ethernet-controller: add QUSGMII mode
Maxime Chevallier [Wed, 17 Aug 2022 12:32:53 +0000 (14:32 +0200)]
dt-bindings: net: ethernet-controller: add QUSGMII mode

Add a new QUSGMII mode, standing for "Quad Universal Serial Gigabit
Media Independent Interface", a derivative of USGMII which, similarly to
QSGMII, allows to multiplex 4 1Gbps links to a Quad-PHY.

The main difference with QSGMII is that QUSGMII can include an extension
instead of the standard 7bytes ethernet preamble, allowing to convey
arbitrary data such as Timestamps.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: phy: Introduce QUSGMII PHY mode
Maxime Chevallier [Wed, 17 Aug 2022 12:32:52 +0000 (14:32 +0200)]
net: phy: Introduce QUSGMII PHY mode

The QUSGMII mode is a derivative of Cisco's USXGMII standard. This
standard is pretty similar to SGMII, but allows for faster speeds, and
has the build-in bits for Quad and Octa variants (like QSGMII).

The main difference with SGMII/QSGMII is that USXGMII/QUSGMII re-uses
the preamble to carry various information, named 'Extensions'.

As of today, the USXGMII standard only mentions the "PCH" extension,
which is used to convey timestamps, allowing in-band signaling of PTP
timestamps without having to modify the frame itself.

This commit adds support for that mode. When no extension is in use, it
behaves exactly like QSGMII, although it's not compatible with QSGMII.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ethernet: ti: davinci_mdio: Add workaround for errata i2329
Ravi Gunasekaran [Wed, 17 Aug 2022 09:44:06 +0000 (15:14 +0530)]
net: ethernet: ti: davinci_mdio: Add workaround for errata i2329

On the CPSW and ICSS peripherals, there is a possibility that the MDIO
interface returns corrupt data on MDIO reads or writes incorrect data
on MDIO writes. There is also a possibility for the MDIO interface to
become unavailable until the next peripheral reset.

The workaround is to configure the MDIO in manual mode and disable the
MDIO state machine and emulate the MDIO protocol by reading and writing
appropriate fields in MDIO_MANUAL_IF_REG register of the MDIO controller
to manipulate the MDIO clock and data pins.

More details about the errata i2329 and the workaround is available in:
https://www.ti.com/lit/er/sprz487a/sprz487a.pdf

Add implementation to disable MDIO state machine, configure MDIO in manual
mode and achieve MDIO read and writes via MDIO Bitbanging

Signed-off-by: Ravi Gunasekaran <r-gunasekaran@ti.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: phy: realtek: add support for RTL8211F(D)(I)-VD-CG
Clark Wang [Wed, 17 Aug 2022 01:36:18 +0000 (09:36 +0800)]
net: phy: realtek: add support for RTL8211F(D)(I)-VD-CG

RTL8211F(D)(I)-VD-CG is the pin-to-pin upgrade chip from
RTL8211F(D)(I)-CG.

Add new PHY ID for this chip.
It does not support RTL8211F_PHYCR2 anymore, so remove the w/r operation
of this register.

Signed-off-by: Clark Wang <xiaoning.wang@nxp.com>
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoaf_unix: Show number of inflight fds for sockets in TCP_LISTEN state too
Kirill Tkhai [Tue, 16 Aug 2022 21:51:54 +0000 (00:51 +0300)]
af_unix: Show number of inflight fds for sockets in TCP_LISTEN state too

TCP_LISTEN sockets is a special case. They preserve skb with a newly
connected sock till accept() makes it fully functional socket.
Receive queue of such socket may grow after connected peer
send messages there. Since these messages may contain scm_fds,
we should expose correct fdinfo::scm_fds for listening socket too.

Signed-off-by: Kirill Tkhai <tkhai@ya.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: prestera: add missing ABI compatibility check
Maksym Glubokiy [Thu, 18 Aug 2022 11:14:19 +0000 (14:14 +0300)]
net: prestera: add missing ABI compatibility check

Size-check a type used for FW communication is packed as expected.

Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu>
Link: https://lore.kernel.org/r/20220818111419.414877-1-maksym.glubokiy@plvision.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoamt: remove unnecessary skb pointer check
Yang Yingliang [Thu, 18 Aug 2022 09:31:14 +0000 (17:31 +0800)]
amt: remove unnecessary skb pointer check

The skb pointer will be checked in kfree_skb(), so remove the outside check.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Taehee Yoo <ap420073@gmail.com>
Link: https://lore.kernel.org/r/20220818093114.2449179-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoselftests/net: test l2 tunnel TOS/TTL inheriting
Matthias May [Wed, 17 Aug 2022 07:36:49 +0000 (09:36 +0200)]
selftests/net: test l2 tunnel TOS/TTL inheriting

There are currently 3 ip tunnels that are capable of carrying
L2 traffic: gretap, vxlan and geneve.
They all are capable to inherit the TOS/TTL for the outer
IP-header from the inner frame.

Add a test that verifies that these fields are correctly inherited.

These tests failed before the following commits:
b09ab9c92e50 ("ip6_tunnel: allow to inherit from VLAN encapsulated IP")
3f8a8447fd0b ("ip6_gre: use actual protocol to select xmit")
41337f52b967 ("ip6_gre: set DSCP for non-IP")
7ae29fd1be43 ("ip_tunnel: allow to inherit from VLAN encapsulated IP")
7074732c8fae ("ip_tunnels: allow VXLAN/GENEVE to inherit TOS/TTL from VLAN")
ca2bb69514a8 ("geneve: do not use RT_TOS for IPv6 flowlabel")
b4ab94d6adaa ("geneve: fix TOS inheriting for ipv4")

Signed-off-by: Matthias May <matthias.may@westermo.com>
Link: https://lore.kernel.org/r/20220817073649.26117-1-matthias.may@westermo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'net-dpaa-cleanups-in-preparation-for-phylink-conversion'
Jakub Kicinski [Fri, 19 Aug 2022 23:36:06 +0000 (16:36 -0700)]
Merge branch 'net-dpaa-cleanups-in-preparation-for-phylink-conversion'

Sean Anderson says:

====================
net: dpaa: Cleanups in preparation for phylink conversion

This series contains several cleanup patches for dpaa/fman. While they
are intended to prepare for a phylink conversion, they stand on their
own. This series was originally submitted as part of [1].

[1] https://lore.kernel.org/netdev/20220715215954.1449214-1-sean.anderson@seco.com
====================

Link: https://lore.kernel.org/r/20220818161649.2058728-1-sean.anderson@seco.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fman: memac: Use params instead of priv for max_speed
Sean Anderson [Thu, 18 Aug 2022 16:16:35 +0000 (12:16 -0400)]
net: fman: memac: Use params instead of priv for max_speed

This option is present in params, so use it instead of the fman private
version.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fman: Export/rename some common functions
Sean Anderson [Thu, 18 Aug 2022 16:16:34 +0000 (12:16 -0400)]
net: fman: Export/rename some common functions

In preparation for moving each of the initialization functions to their
own file, export some common functions so they can be re-used. This adds
an fman prefix to set_multi to make it a bit less genericly-named.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fman: Configure fixed link in memac_initialization
Sean Anderson [Thu, 18 Aug 2022 16:16:33 +0000 (12:16 -0400)]
net: fman: Configure fixed link in memac_initialization

memac is the only mac which parses fixed links. Move the
parsing/configuring to its initialization function.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fman: Move struct dev to mac_device
Sean Anderson [Thu, 18 Aug 2022 16:16:32 +0000 (12:16 -0400)]
net: fman: Move struct dev to mac_device

Move the reference to our device to mac_device. This way, macs can use
it in their log messages.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fman: Store initialization function in match data
Sean Anderson [Thu, 18 Aug 2022 16:16:31 +0000 (12:16 -0400)]
net: fman: Store initialization function in match data

Instead of re-matching the compatible string in order to determine the init
function, just store it in the match data. The separate setup functions
aren't needed anymore. Merge their content into init as well. To ensure
everything compiles correctly, we move them to the bottom of the file.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fman: Get PCS node in per-mac init
Sean Anderson [Thu, 18 Aug 2022 16:16:30 +0000 (12:16 -0400)]
net: fman: Get PCS node in per-mac init

This moves the reading of the PCS property out of the generic probe and
into the mac-specific initialization function. This reduces the
mac-specific jobs done in the top-level probe function.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fman: dtsec: Always gracefully stop/start
Sean Anderson [Thu, 18 Aug 2022 16:16:29 +0000 (12:16 -0400)]
net: fman: dtsec: Always gracefully stop/start

There are two ways that GRS can be set: graceful_stop and dtsec_isr. It
is cleared by graceful_start. If it is already set before calling
graceful_stop, then that means that dtsec_isr set it. In that case, we
will not set GRS nor will we clear it (which seems like a bug?). For GTS
the logic is similar, except that there is no one else messing with this
bit (so we will always set and clear it). Simplify the logic by always
setting/clearing GRS/GTS. This is less racy that the previous behavior,
and ensures that we always end up clearing the bits. This can of course
clear GRS while dtsec_isr is waiting, but because we have already done
our own waiting it should be fine.

This is the last user of enum comm_mode, so remove it.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Tested-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fman: Store en/disable in mac_device instead of mac_priv_s
Sean Anderson [Thu, 18 Aug 2022 16:16:28 +0000 (12:16 -0400)]
net: fman: Store en/disable in mac_device instead of mac_priv_s

All macs use the same start/stop functions. The actual mac-specific code
lives in enable/disable. Move these functions to an appropriate struct,
and inline the phy enable/disable calls to the caller of start/stop.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Tested-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fman: Don't pass comm_mode to enable/disable
Sean Anderson [Thu, 18 Aug 2022 16:16:27 +0000 (12:16 -0400)]
net: fman: Don't pass comm_mode to enable/disable

mac_priv_s->enable() and ->disable() are always called with
a comm_mode of COMM_MODE_RX_AND_TX. Remove this parameter, and refactor
the macs appropriately.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Tested-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fman: Convert to SPDX identifiers
Sean Anderson [Thu, 18 Aug 2022 16:16:26 +0000 (12:16 -0400)]
net: fman: Convert to SPDX identifiers

This converts the license text of files in the fman directory to use
SPDX license identifiers instead.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Tested-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodt-bindings: net: Convert FMan MAC bindings to yaml
Sean Anderson [Thu, 18 Aug 2022 16:16:25 +0000 (12:16 -0400)]
dt-bindings: net: Convert FMan MAC bindings to yaml

This converts the MAC portion of the FMan MAC bindings to yaml.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoRevert "Merge branch 'wwan-t7xx-fw-flashing-and-coredump-support'"
Jakub Kicinski [Fri, 19 Aug 2022 22:29:38 +0000 (15:29 -0700)]
Revert "Merge branch 'wwan-t7xx-fw-flashing-and-coredump-support'"

This reverts commit 5417197dd516a8e115aa69f62a7b7554b0c3829c, reversing
changes made to 0630f64d25a0f0a8c6a9ce9fde8750b3b561e6f5.

Reverting to allow addressing review comments.

Link: https://lore.kernel.org/all/4c5dbea0-52a9-1c3d-7547-00ea54c90550@linux.intel.com/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Fri, 19 Aug 2022 04:17:10 +0000 (21:17 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoigc: add xdp frags support to ndo_xdp_xmit
Lorenzo Bianconi [Wed, 17 Aug 2022 17:36:28 +0000 (10:36 -0700)]
igc: add xdp frags support to ndo_xdp_xmit

Add the capability to map non-linear xdp frames in XDP_TX and
ndo_xdp_xmit callback.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220817173628.109102-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'selftests-mlxsw-add-ordering-tests-for-unified-bridge-model'
Jakub Kicinski [Fri, 19 Aug 2022 03:50:44 +0000 (20:50 -0700)]
Merge branch 'selftests-mlxsw-add-ordering-tests-for-unified-bridge-model'

Petr Machata says:

====================
selftests: mlxsw: Add ordering tests for unified bridge model

Amit Cohen writes:

Commit 798661c73672 ("Merge branch 'mlxsw-unified-bridge-conversion-part-6'")
converted mlxsw driver to use unified bridge model. In the legacy model,
when a RIF was created / destroyed, it was firmware's responsibility to
update it in the relevant FID classification records. In the unified bridge
model, this responsibility moved to software.

This set adds tests to check the order of configuration for the following
classifications:
1. {Port, VID} -> FID
2. VID -> FID
3. VNI -> FID (after decapsulation)

In addition, in the legacy model, software is responsible to update a
table which is used to determine the packet's egress VID. Add a test to
check that the order of configuration does not impact switch behavior.

See more details in the commit messages.

Note that the tests supposed to pass also using the legacy model, they
are added now as with the new model they test the driver and not the
firmware.

Patch set overview:
Patch #1 adds test for {Port, VID} -> FID
Patch #2 adds test for VID -> FID
Patch #3 adds test for VNI -> FID
Patch #4 adds test for egress VID classification
====================

Link: https://lore.kernel.org/r/cover.1660747162.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoselftests: mlxsw: Add egress VID classification test
Amit Cohen [Wed, 17 Aug 2022 15:28:28 +0000 (17:28 +0200)]
selftests: mlxsw: Add egress VID classification test

After routing, the device always consults a table that determines the
packet's egress VID based on {egress RIF, egress local port}. In the
unified bridge model, it is up to software to maintain this table via
REIV register.

The table needs to be updated in the following flows:
1. When a RIF is set on a FID, for each FID's {Port, VID} mapping, a new
   {RIF, Port}->VID mapping should be created.
2. When a {Port, VID} is mapped to a FID and the FID already has a RIF,
   a new {RIF, Port}->VID mapping should be created.

Add a test to verify that packets get the correct VID after routing,
regardless of the order of the configuration.

 # ./egress_vid_classification.sh
 TEST: Add RIF for existing {port, VID}->FID mapping                 [ OK ]
 TEST: Add {port, VID}->FID mapping for FID with a RIF               [ OK ]

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoselftests: mlxsw: Add ingress RIF configuration test for VXLAN
Amit Cohen [Wed, 17 Aug 2022 15:28:27 +0000 (17:28 +0200)]
selftests: mlxsw: Add ingress RIF configuration test for VXLAN

Before layer 2 forwarding, the device classifies an incoming packet to a
FID. After classification, the FID is known, but also all the attributes of
the FID, such as the router interface (RIF) via which a packet that needs
to be routed will ingress the router block.

For VXLAN decapsulation, the FID classification is done according to the
VNI. When a RIF is added on top of a FID, the existing VNI->FID mapping
should be updated by the software with the new RIF. In addition, when a new
mapping is added for FID which already has a RIF, the correct RIF should
be used for it.

Add a test to verify that packets can be routed after decapsulation which
is done after VNI->FID classification, regardless of the order of the
configuration.

 # ./ingress_rif_conf_vxlan.sh
 TEST: Add RIF for existing VNI->FID mapping                         [ OK ]
 TEST: Add VNI->FID mapping for FID with a RIF                       [ OK ]

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoselftests: mlxsw: Add ingress RIF configuration test for 802.1Q bridge
Amit Cohen [Wed, 17 Aug 2022 15:28:26 +0000 (17:28 +0200)]
selftests: mlxsw: Add ingress RIF configuration test for 802.1Q bridge

Before layer 2 forwarding, the device classifies an incoming packet to a
FID. After classification, the FID is known, but also all the attributes of
the FID, such as the router interface (RIF) via which a packet that needs
to be routed will ingress the router block.

For VLAN-aware bridges (802.1Q), the FID classification is done according
to VID. When a RIF is added on top of a FID, the existing VID->FID mapping
should be updated by the software with the new RIF.

We never map multiple VLANs to the same FID using VID->FID, so we cannot
create VID->FID for FID which already has a RIF using 802.1Q. Anyway,
verify that packets can be routed via port which is added after the FID
already has a RIF.

Add a test to verify that packets can be routed after VID->FID
classification, regardless of the order of the configuration.

 # ./ingress_rif_conf_1q.sh
 TEST: Add RIF for existing VID->FID mapping                         [ OK ]
 TEST: Add port to VID->FID mapping for FID with a RIF               [ OK ]

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoselftests: mlxsw: Add ingress RIF configuration test for 802.1D bridge
Amit Cohen [Wed, 17 Aug 2022 15:28:25 +0000 (17:28 +0200)]
selftests: mlxsw: Add ingress RIF configuration test for 802.1D bridge

Before layer 2 forwarding, the device classifies an incoming packet to a
FID. After classification, the FID is known, but also all the attributes of
the FID, such as the router interface (RIF) via which a packet that needs
to be routed will ingress the router block.

For VLAN-unaware bridges (802.1D), the FID classification is done according
to {Port, VID}. When a RIF is added on top of a FID, all the existing
{Port, VID}->FID mappings should be updated by the software with the new
RIF. In addition, when a new mapping is added for FID which already has a
RIF, the correct RIF should be used for it.

Add a test to verify that packets can be routed after {Port, VID}->FID
classification, regardless of the order of the configuration.

 # ./ingress_rif_conf_1d.sh
 TEST: Add RIF for existing {port, VID}->FID mapping                 [ OK ]
 TEST: Add {port, VID}->FID mapping for FID with a RIF               [ OK ]

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ethernet: mtk_eth_soc: remove unused txd_pdma pointer in mtk_xdp_submit_frame
Lorenzo Bianconi [Tue, 16 Aug 2022 17:14:30 +0000 (19:14 +0200)]
net: ethernet: mtk_eth_soc: remove unused txd_pdma pointer in mtk_xdp_submit_frame

Get rid of unnecessary txd_pdma pointer in mtk_xdp_submit_frame for loop
since it is actually used at the end of the routine using latest mtk_tx_dma
consumed pointer as reference.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/2c40b0fbb9163a0d62ff897abae17db84a9f3b99.1660669138.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: macsec: Expose MACSEC_SALT_LEN definition to user space
Emeel Hakim [Thu, 18 Aug 2022 15:32:30 +0000 (18:32 +0300)]
net: macsec: Expose MACSEC_SALT_LEN definition to user space

Expose MACSEC_SALT_LEN definition to user space to be
used in various user space applications such as iproute.
Iproute will use this as part of adding macsec extended
packet number support.

Reviewed-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
Link: https://lore.kernel.org/r/20220818153229.4721-1-ehakim@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'net-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Fri, 19 Aug 2022 02:37:15 +0000 (19:37 -0700)]
Merge tag 'net-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter.

  Current release - regressions:

   - tcp: fix cleanup and leaks in tcp_read_skb() (the new way BPF
     socket maps get data out of the TCP stack)

   - tls: rx: react to strparser initialization errors

   - netfilter: nf_tables: fix scheduling-while-atomic splat

   - net: fix suspicious RCU usage in bpf_sk_reuseport_detach()

  Current release - new code bugs:

   - mlxsw: ptp: fix a couple of races, static checker warnings and
     error handling

  Previous releases - regressions:

   - netfilter:
      - nf_tables: fix possible module reference underflow in error path
      - make conntrack helpers deal with BIG TCP (skbs > 64kB)
      - nfnetlink: re-enable conntrack expectation events

   - net: fix potential refcount leak in ndisc_router_discovery()

  Previous releases - always broken:

   - sched: cls_route: disallow handle of 0

   - neigh: fix possible local DoS due to net iface start/stop loop

   - rtnetlink: fix module refcount leak in rtnetlink_rcv_msg

   - sched: fix adding qlen to qcpu->backlog in gnet_stats_add_queue_cpu

   - virtio_net: fix endian-ness for RSS

   - dsa: mv88e6060: prevent crash on an unused port

   - fec: fix timer capture timing in `fec_ptp_enable_pps()`

   - ocelot: stats: fix races, integer wrapping and reading incorrect
     registers (the change of register definitions here accounts for
     bulk of the changed LoC in this PR)"

* tag 'net-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (77 commits)
  net: moxa: MAC address reading, generating, validity checking
  tcp: handle pure FIN case correctly
  tcp: refactor tcp_read_skb() a bit
  tcp: fix tcp_cleanup_rbuf() for tcp_read_skb()
  tcp: fix sock skb accounting in tcp_read_skb()
  igb: Add lock to avoid data race
  dt-bindings: Fix incorrect "the the" corrections
  net: genl: fix error path memory leak in policy dumping
  stmmac: intel: Add a missing clk_disable_unprepare() call in intel_eth_pci_remove()
  net: ethernet: mtk_eth_soc: fix possible NULL pointer dereference in mtk_xdp_run
  net/mlx5e: Allocate flow steering storage during uplink initialization
  net: mscc: ocelot: report ndo_get_stats64 from the wraparound-resistant ocelot->stats
  net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset
  net: mscc: ocelot: make struct ocelot_stat_layout array indexable
  net: mscc: ocelot: fix race between ndo_get_stats64 and ocelot_check_stats_work
  net: mscc: ocelot: turn stats_lock into a spinlock
  net: mscc: ocelot: fix address of SYS_COUNT_TX_AGING counter
  net: mscc: ocelot: fix incorrect ndo_get_stats64 packet counters
  net: dsa: felix: fix ethtool 256-511 and 512-1023 TX packet counters
  net: dsa: don't warn in dsa_port_set_state_now() when driver doesn't support it
  ...

2 years agoMerge tag 'linux-kselftest-next-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 19 Aug 2022 02:24:57 +0000 (19:24 -0700)]
Merge tag 'linux-kselftest-next-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest fix from Shuah Khan:

 - fix landlock test build regression

* tag 'linux-kselftest-next-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/landlock: fix broken include of linux/landlock.h

2 years agoMerge tag 'trace-rtla-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Fri, 19 Aug 2022 02:18:28 +0000 (19:18 -0700)]
Merge tag 'trace-rtla-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull rtla tool fixes from Steven Rostedt:
 "Fixes for the Real-Time Linux Analysis tooling:

   - Fix tracer name in comments and prints

   - Fix setting up symlinks

   - Allow extra flags to be set in build

   - Consolidate and show all necessary libraries not found in build
     error"

* tag 'trace-rtla-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  rtla: Consolidate and show all necessary libraries that failed for building
  tools/rtla: Build with EXTRA_{C,LD}FLAGS
  tools/rtla: Fix command symlinks
  rtla: Fix tracer name

2 years agoMerge branch 'add-dt-property-to-disable-hibernation-mode'
Jakub Kicinski [Thu, 18 Aug 2022 21:16:36 +0000 (14:16 -0700)]
Merge branch 'add-dt-property-to-disable-hibernation-mode'

Wei Fang says:

====================
Add DT property to disable hibernation mode

The patches add the ability to disable the hibernation mode of AR803x
PHYs. Hibernation mode defaults to enabled after hardware reset on
these PHYs. If the AR803x PHYs enter hibernation mode, they will not
provide any clock. For some MACs, they might need the clocks which
provided by the PHYs to support their own hardware logic.
So, the patches add the support to disable hibernation mode by adding
a boolean:

        qca,disable-hibernation-mode

If one wished to disable hibernation mode to better match with the
specifical MAC, just add this property in the phy node of DT.
====================

Link: https://lore.kernel.org/r/20220818030054.1010660-1-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: phy: at803x: add disable hibernation mode support
Wei Fang [Thu, 18 Aug 2022 03:00:54 +0000 (11:00 +0800)]
net: phy: at803x: add disable hibernation mode support

When the cable is unplugged, the Atheros AR803x PHYs will enter
hibernation mode after about 10 seconds if the hibernation mode
is enabled and will not provide any clock to the MAC. But for
some MACs, this feature might cause unexpected issues due to the
logic of MACs.
Taking SYNP MAC (stmmac) as an example, if the cable is unplugged
and the "eth0" interface is down, the AR803x PHY will enter
hibernation mode. Then perform the "ifconfig eth0 up" operation,
the stmmac can't be able to complete the software reset operation
and fail to init it's own DMA. Therefore, the "eth0" interface is
failed to ifconfig up. Why does it cause this issue? The truth is
that the software reset operation of the stmmac is designed to
depend on the RX_CLK of PHY.
So, this patch offers an option for the user to determine whether
to disable the hibernation mode of AR803x PHYs.

Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodt-bindings: net: ar803x: add disable-hibernation-mode propetry
Wei Fang [Thu, 18 Aug 2022 03:00:53 +0000 (11:00 +0800)]
dt-bindings: net: ar803x: add disable-hibernation-mode propetry

The hibernation mode of Atheros AR803x PHYs defaults to be
enabled after hardware reset. When the cable is unplugged,
the PHY will enter hibernation mode after about 10 seconds
and the PHY clocks will be stopped to save power.
However, some MACs need the phy output clock for proper
functioning of their logic. For instance, stmmac needs the
RX_CLK of PHY for software reset to complete.
Therefore, add a DT property to configure the PHY to disable
this hardware hibernation mode.

Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: moxa: MAC address reading, generating, validity checking
Sergei Antonov [Thu, 18 Aug 2022 09:23:17 +0000 (12:23 +0300)]
net: moxa: MAC address reading, generating, validity checking

This device does not remember its MAC address, so add a possibility
to get it from the platform. If it fails, generate a random address.
This will provide a MAC address early during boot without user space
being involved.

Also remove extra calls to is_valid_ether_addr().

Made after suggestions by Andrew Lunn:
1) Use eth_hw_addr_random() to assign a random MAC address during probe.
2) Remove is_valid_ether_addr() from moxart_mac_open()
3) Add a call to platform_get_ethdev_address() during probe
4) Remove is_valid_ether_addr() from moxart_set_mac_address(). The core does this

v1 -> v2:
Handle EPROBE_DEFER returned from platform_get_ethdev_address().
Move MAC reading code to the beginning of the probe function.

Signed-off-by: Sergei Antonov <saproj@gmail.com>
Suggested-by: Andrew Lunn <andrew@lunn.ch>
CC: Yang Yingliang <yangyingliang@huawei.com>
CC: Pavel Skripkin <paskripkin@gmail.com>
CC: Guobin Huang <huangguobin4@huawei.com>
CC: Yang Wei <yang.wei9@zte.com.cn>
CC: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20220818092317.529557-1-saproj@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'tcp-some-bug-fixes-for-tcp_read_skb'
Jakub Kicinski [Thu, 18 Aug 2022 18:04:58 +0000 (11:04 -0700)]
Merge branch 'tcp-some-bug-fixes-for-tcp_read_skb'

Cong Wang says:

====================
tcp: some bug fixes for tcp_read_skb()

This patchset contains 3 bug fixes and 1 minor refactor patch for
tcp_read_skb(). V1 only had the first patch, as Eric prefers to fix all
of them together, I have to group them together.
====================

Link: https://lore.kernel.org/r/20220817195445.151609-1-xiyou.wangcong@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agotcp: handle pure FIN case correctly
Cong Wang [Wed, 17 Aug 2022 19:54:45 +0000 (12:54 -0700)]
tcp: handle pure FIN case correctly

When skb->len==0, the recv_actor() returns 0 too, but we also use 0
for error conditions. This patch amends this by propagating the errors
to tcp_read_skb() so that we can distinguish skb->len==0 case from
error cases.

Fixes: 04919bed948d ("tcp: Introduce tcp_read_skb()")
Reported-by: Eric Dumazet <edumazet@google.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agotcp: refactor tcp_read_skb() a bit
Cong Wang [Wed, 17 Aug 2022 19:54:44 +0000 (12:54 -0700)]
tcp: refactor tcp_read_skb() a bit

As tcp_read_skb() only reads one skb at a time, the while loop is
unnecessary, we can turn it into an if. This also simplifies the
code logic.

Cc: Eric Dumazet <edumazet@google.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agotcp: fix tcp_cleanup_rbuf() for tcp_read_skb()
Cong Wang [Wed, 17 Aug 2022 19:54:43 +0000 (12:54 -0700)]
tcp: fix tcp_cleanup_rbuf() for tcp_read_skb()

tcp_cleanup_rbuf() retrieves the skb from sk_receive_queue, it
assumes the skb is not yet dequeued. This is no longer true for
tcp_read_skb() case where we dequeue the skb first.

Fix this by introducing a helper __tcp_cleanup_rbuf() which does
not require any skb and calling it in tcp_read_skb().

Fixes: 04919bed948d ("tcp: Introduce tcp_read_skb()")
Cc: Eric Dumazet <edumazet@google.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agotcp: fix sock skb accounting in tcp_read_skb()
Cong Wang [Wed, 17 Aug 2022 19:54:42 +0000 (12:54 -0700)]
tcp: fix sock skb accounting in tcp_read_skb()

Before commit 965b57b469a5 ("net: Introduce a new proto_ops
->read_skb()"), skb was not dequeued from receive queue hence
when we close TCP socket skb can be just flushed synchronously.

After this commit, we have to uncharge skb immediately after being
dequeued, otherwise it is still charged in the original sock. And we
still need to retain skb->sk, as eBPF programs may extract sock
information from skb->sk. Therefore, we have to call
skb_set_owner_sk_safe() here.

Fixes: 965b57b469a5 ("net: Introduce a new proto_ops ->read_skb()")
Reported-and-tested-by: syzbot+a0e6f8738b58f7654417@syzkaller.appspotmail.com
Tested-by: Stanislav Fomichev <sdf@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoigb: Add lock to avoid data race
Lin Ma [Wed, 17 Aug 2022 18:49:21 +0000 (11:49 -0700)]
igb: Add lock to avoid data race

The commit c23d92b80e0b ("igb: Teardown SR-IOV before
unregister_netdev()") places the unregister_netdev() call after the
igb_disable_sriov() call to avoid functionality issue.

However, it introduces several race conditions when detaching a device.
For example, when .remove() is called, the below interleaving leads to
use-after-free.

 (FREE from device detaching)      |   (USE from netdev core)
igb_remove                         |  igb_ndo_get_vf_config
 igb_disable_sriov                 |  vf >= adapter->vfs_allocated_count?
  kfree(adapter->vf_data)          |
  adapter->vfs_allocated_count = 0 |
                                   |    memcpy(... adapter->vf_data[vf]

Moreover, the igb_disable_sriov() also suffers from data race with the
requests from VF driver.

 (FREE from device detaching)      |   (USE from requests)
igb_remove                         |  igb_msix_other
 igb_disable_sriov                 |   igb_msg_task
  kfree(adapter->vf_data)          |    vf < adapter->vfs_allocated_count
  adapter->vfs_allocated_count = 0 |

To this end, this commit first eliminates the data races from netdev
core by using rtnl_lock (similar to commit 719479230893 ("dpaa2-eth: add
MAC/PHY support through phylink")). And then adds a spinlock to
eliminate races from driver requests. (similar to commit 1e53834ce541
("ixgbe: Add locking to prevent panic when setting sriov_numvfs to zero")

Fixes: c23d92b80e0b ("igb: Teardown SR-IOV before unregister_netdev()")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220817184921.735244-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net...
Jakub Kicinski [Thu, 18 Aug 2022 18:02:11 +0000 (11:02 -0700)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-08-17 (ice)

This series contains updates to ice driver only.

Grzegorz prevents modifications to VLAN 0 when setting VLAN promiscuous
as it will already be set. He also ignores -EEXIST error when attempting
to set promiscuous and ensures promiscuous mode is properly cleared from
the hardware when being removed.

Benjamin ignores additional -EEXIST errors when setting promiscuous mode
since the existing mode is the desired mode.

Sylwester fixes VFs to allow sending of tagged traffic when no VLAN filters
exist.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ice: Fix VF not able to send tagged traffic with no VLAN filters
  ice: Ignore error message when setting same promiscuous mode
  ice: Fix clearing of promisc mode with bridge over bond
  ice: Ignore EEXIST when setting promisc mode
  ice: Fix double VLAN error when entering promisc mode
====================

Link: https://lore.kernel.org/r/20220817171329.65285-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodt-bindings: Fix incorrect "the the" corrections
Geert Uytterhoeven [Wed, 17 Aug 2022 16:54:51 +0000 (18:54 +0200)]
dt-bindings: Fix incorrect "the the" corrections

Lots of double occurrences of "the" were replaced by single occurrences,
but some of them should become "to the" instead.

Fixes: 12e5bde18d7f6ca4 ("dt-bindings: Fix typo in comment")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/c5743c0a1a24b3a8893797b52fed88b99e56b04b.1660755148.git.geert+renesas@glider.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ethernet: altera: Add use of ethtool_op_get_ts_info
Maxime Chevallier [Wed, 17 Aug 2022 09:57:25 +0000 (11:57 +0200)]
net: ethernet: altera: Add use of ethtool_op_get_ts_info

Add the ethtool_op_get_ts_info() callback to ethtool ops, so that we can
at least use software timestamping.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://lore.kernel.org/r/20220817095725.97444-1-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: genl: fix error path memory leak in policy dumping
Jakub Kicinski [Tue, 16 Aug 2022 16:19:39 +0000 (09:19 -0700)]
net: genl: fix error path memory leak in policy dumping

If construction of the array of policies fails when recording
non-first policy we need to unwind.

netlink_policy_dump_add_policy() itself also needs fixing as
it currently gives up on error without recording the allocated
pointer in the pstate pointer.

Reported-by: syzbot+dc54d9ba8153b216cae0@syzkaller.appspotmail.com
Fixes: 50a896cf2d6f ("genetlink: properly support per-op policy dumping")
Link: https://lore.kernel.org/r/20220816161939.577583-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agostmmac: intel: Add a missing clk_disable_unprepare() call in intel_eth_pci_remove()
Christophe JAILLET [Tue, 16 Aug 2022 14:23:57 +0000 (16:23 +0200)]
stmmac: intel: Add a missing clk_disable_unprepare() call in intel_eth_pci_remove()

Commit 09f012e64e4b ("stmmac: intel: Fix clock handling on error and remove
paths") removed this clk_disable_unprepare()

This was partly revert by commit ac322f86b56c ("net: stmmac: Fix clock
handling on remove path") which removed this clk_disable_unprepare()
because:
"
   While unloading the dwmac-intel driver, clk_disable_unprepare() is
   being called twice in stmmac_dvr_remove() and
   intel_eth_pci_remove(). This causes kernel panic on the second call.
"

However later on, commit 5ec55823438e8 ("net: stmmac: add clocks management
for gmac driver") has updated stmmac_dvr_remove() which do not call
clk_disable_unprepare() anymore.

So this call should now be called from intel_eth_pci_remove().

Fixes: 5ec55823438e8 ("net: stmmac: add clocks management for gmac driver")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/d7c8c1dadf40df3a7c9e643f76ffadd0ccc1ad1b.1660659689.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agotee: add overflow check in register_shm_helper()
Jens Wiklander [Thu, 18 Aug 2022 11:08:59 +0000 (13:08 +0200)]
tee: add overflow check in register_shm_helper()

With special lengths supplied by user space, register_shm_helper() has
an integer overflow when calculating the number of pages covered by a
supplied user space memory region.

This causes internal_get_user_pages_fast() a helper function of
pin_user_pages_fast() to do a NULL pointer dereference:

  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
  Modules linked in:
  CPU: 1 PID: 173 Comm: optee_example_a Not tainted 5.19.0 #11
  Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
  pc : internal_get_user_pages_fast+0x474/0xa80
  Call trace:
   internal_get_user_pages_fast+0x474/0xa80
   pin_user_pages_fast+0x24/0x4c
   register_shm_helper+0x194/0x330
   tee_shm_register_user_buf+0x78/0x120
   tee_ioctl+0xd0/0x11a0
   __arm64_sys_ioctl+0xa8/0xec
   invoke_syscall+0x48/0x114

Fix this by adding an an explicit call to access_ok() in
tee_shm_register_user_buf() to catch an invalid user space address
early.

Fixes: 033ddf12bcf5 ("tee: add register user memory")
Cc: stable@vger.kernel.org
Reported-by: Nimish Mishra <neelam.nimish@gmail.com>
Reported-by: Anirban Chakraborty <ch.anirban00727@gmail.com>
Reported-by: Debdeep Mukhopadhyay <debdeep.mukhopadhyay@gmail.com>
Suggested-by: Jerome Forissier <jerome.forissier@linaro.org>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agonet: ethernet: mtk_eth_soc: fix possible NULL pointer dereference in mtk_xdp_run
Lorenzo Bianconi [Tue, 16 Aug 2022 14:16:15 +0000 (16:16 +0200)]
net: ethernet: mtk_eth_soc: fix possible NULL pointer dereference in mtk_xdp_run

Fix possible NULL pointer dereference in mtk_xdp_run() if the
ebpf program returns XDP_TX and xdp_convert_buff_to_frame routine fails
returning NULL.

Fixes: 5886d26fd25bb ("net: ethernet: mtk_eth_soc: add xmit XDP support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/627a07d759020356b64473e09f0855960e02db28.1660659112.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet/mlx5e: Allocate flow steering storage during uplink initialization
Leon Romanovsky [Tue, 16 Aug 2022 08:47:23 +0000 (11:47 +0300)]
net/mlx5e: Allocate flow steering storage during uplink initialization

IPsec code relies on valid priv->fs pointer that is the case in NIC
flow, but not correct in uplink. Before commit that mentioned in the
Fixes line, that pointer was valid in all flows as it was allocated
together with priv struct.

In addition, the cleanup representors routine called to that
not-initialized priv->fs pointer and its internals which caused NULL
deference.

So, move FS allocation to be as early as possible.

Fixes: af8bbf730068 ("net/mlx5e: Convert mlx5e_flow_steering member of mlx5e_priv to pointer")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/ae46fa5bed3c67f937bfdfc0370101278f5422f1.1660639564.git.leonro@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'fixes-for-ocelot-driver-statistics'
Jakub Kicinski [Thu, 18 Aug 2022 04:58:48 +0000 (21:58 -0700)]
Merge branch 'fixes-for-ocelot-driver-statistics'

Vladimir Oltean says:

====================
Fixes for Ocelot driver statistics

This series contains bug fixes for the ocelot drivers (both switchdev
and DSA). Some concern the counters exposed to ethtool -S, and others to
the counters exposed to ifconfig. I'm aware that the changes are fairly
large, but I wanted to prioritize on a proper approach to addressing the
issues rather than a quick hack.

Some of the noticed problems:
- bad register offsets for some counters
- unhandled concurrency leading to corrupted counters
- unhandled 32-bit wraparound of ifconfig counters

The issues on the ocelot switchdev driver were noticed through code
inspection, I do not have the hardware to test.

This patch set necessarily converts ocelot->stats_lock from a mutex to a
spinlock. I know this affects Colin Foster's development with the SPI
controlled VSC7512. I have other changes prepared for net-next that
convert this back into a mutex (along with other changes in this area).
====================

Link: https://lore.kernel.org/r/20220816135352.1431497-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: mscc: ocelot: report ndo_get_stats64 from the wraparound-resistant ocelot->stats
Vladimir Oltean [Tue, 16 Aug 2022 13:53:52 +0000 (16:53 +0300)]
net: mscc: ocelot: report ndo_get_stats64 from the wraparound-resistant ocelot->stats

Rather than reading the stats64 counters directly from the 32-bit
hardware, it's better to rely on the output produced by the periodic
ocelot_port_update_stats().

It would be even better to call ocelot_port_update_stats() right from
ocelot_get_stats64() to make sure we report the current values rather
than the ones from 2 seconds ago. But we need to export
ocelot_port_update_stats() from the switch lib towards the switchdev
driver for that, and future work will largely undo that.

There are more ocelot-based drivers waiting to be introduced, an example
of which is the SPI-controlled VSC7512. In that driver's case, it will
be impossible to call ocelot_port_update_stats() from ndo_get_stats64
context, since the latter is atomic, and reading the stats over SPI is
sleepable. So the compromise taken here, which will also hold going
forward, is to report 64-bit counters to stats64, which are not 100% up
to date.

Fixes: a556c76adc05 ("net: mscc: Add initial Ocelot switch support")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset
Vladimir Oltean [Tue, 16 Aug 2022 13:53:51 +0000 (16:53 +0300)]
net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset

With so many counter addresses recently discovered as being wrong, it is
desirable to at least have a central database of information, rather
than two: one through the SYS_COUNT_* registers (used for
ndo_get_stats64), and the other through the offset field of struct
ocelot_stat_layout elements (used for ethtool -S).

The strategy will be to keep the SYS_COUNT_* definitions as the single
source of truth, but for that we need to expand our current definitions
to cover all registers. Then we need to convert the ocelot region
creation logic, and stats worker, to the read semantics imposed by going
through SYS_COUNT_* absolute register addresses, rather than offsets
of 32-bit words relative to SYS_COUNT_RX_OCTETS (which should have been
SYS_CNT, by the way).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: mscc: ocelot: make struct ocelot_stat_layout array indexable
Vladimir Oltean [Tue, 16 Aug 2022 13:53:50 +0000 (16:53 +0300)]
net: mscc: ocelot: make struct ocelot_stat_layout array indexable

The ocelot counters are 32-bit and require periodic reading, every 2
seconds, by ocelot_port_update_stats(), so that wraparounds are
detected.

Currently, the counters reported by ocelot_get_stats64() come from the
32-bit hardware counters directly, rather than from the 64-bit
accumulated ocelot->stats, and this is a problem for their integrity.

The strategy is to make ocelot_get_stats64() able to cherry-pick
individual stats from ocelot->stats the way in which it currently reads
them out from SYS_COUNT_* registers. But currently it can't, because
ocelot->stats is an opaque u64 array that's used only to feed data into
ethtool -S.

To solve that problem, we need to make ocelot->stats indexable, and
associate each element with an element of struct ocelot_stat_layout used
by ethtool -S.

This makes ocelot_stat_layout a fat (and possibly sparse) array, so we
need to change the way in which we access it. We no longer need
OCELOT_STAT_END as a sentinel, because we know the array's size
(OCELOT_NUM_STATS). We just need to skip the array elements that were
left unpopulated for the switch revision (ocelot, felix, seville).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: mscc: ocelot: fix race between ndo_get_stats64 and ocelot_check_stats_work
Vladimir Oltean [Tue, 16 Aug 2022 13:53:49 +0000 (16:53 +0300)]
net: mscc: ocelot: fix race between ndo_get_stats64 and ocelot_check_stats_work

The 2 methods can run concurrently, and one will change the window of
counters (SYS_STAT_CFG_STAT_VIEW) that the other sees. The fix is
similar to what commit 7fbf6795d127 ("net: mscc: ocelot: fix mutex lock
error during ethtool stats read") has done for ethtool -S.

Fixes: a556c76adc05 ("net: mscc: Add initial Ocelot switch support")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: mscc: ocelot: turn stats_lock into a spinlock
Vladimir Oltean [Tue, 16 Aug 2022 13:53:48 +0000 (16:53 +0300)]
net: mscc: ocelot: turn stats_lock into a spinlock

ocelot_get_stats64() currently runs unlocked and therefore may collide
with ocelot_port_update_stats() which indirectly accesses the same
counters. However, ocelot_get_stats64() runs in atomic context, and we
cannot simply take the sleepable ocelot->stats_lock mutex. We need to
convert it to an atomic spinlock first. Do that as a preparatory change.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: mscc: ocelot: fix address of SYS_COUNT_TX_AGING counter
Vladimir Oltean [Tue, 16 Aug 2022 13:53:47 +0000 (16:53 +0300)]
net: mscc: ocelot: fix address of SYS_COUNT_TX_AGING counter

This register, used as part of stats->tx_dropped in
ocelot_get_stats64(), has a wrong address. At the address currently
given, there is actually the c_tx_green_prio_6 counter.

Fixes: a556c76adc05 ("net: mscc: Add initial Ocelot switch support")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: mscc: ocelot: fix incorrect ndo_get_stats64 packet counters
Vladimir Oltean [Tue, 16 Aug 2022 13:53:46 +0000 (16:53 +0300)]
net: mscc: ocelot: fix incorrect ndo_get_stats64 packet counters

Reading stats using the SYS_COUNT_* register definitions is only used by
ocelot_get_stats64() from the ocelot switchdev driver, however,
currently the bucket definitions are incorrect.

Separately, on both RX and TX, we have the following problems:
- a 256-1023 bucket which actually tracks the 256-511 packets
- the 1024-1526 bucket actually tracks the 512-1023 packets
- the 1527-max bucket actually tracks the 1024-1526 packets

=> nobody tracks the packets from the real 1527-max bucket

Additionally, the RX_PAUSE, RX_CONTROL, RX_LONGS and RX_CLASSIFIED_DROPS
all track the wrong thing. However this doesn't seem to have any
consequence, since ocelot_get_stats64() doesn't use these.

Even though this problem only manifests itself for the switchdev driver,
we cannot split the fix for ocelot and for DSA, since it requires fixing
the bucket definitions from enum ocelot_reg, which makes us necessarily
adapt the structures from felix and seville as well.

Fixes: 84705fc16552 ("net: dsa: felix: introduce support for Seville VSC9953 switch")
Fixes: 56051948773e ("net: dsa: ocelot: add driver for Felix switch family")
Fixes: a556c76adc05 ("net: mscc: Add initial Ocelot switch support")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>