Haiyue Wang [Tue, 13 Apr 2021 00:48:43 +0000 (08:48 +0800)]
iavf: Support for modifying UDP RSS flow hashing
Provides the ability to enable UDP RSS hashing by ethtool.
It gives users option of generating RSS hash based on the UDP source
and destination ports numbers, IPv4 or IPv6 source and destination
addresses.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Haiyue Wang [Tue, 13 Apr 2021 00:48:42 +0000 (08:48 +0800)]
iavf: Support for modifying TCP RSS flow hashing
Provides the ability to enable TCP RSS hashing by ethtool.
It gives users option of generating RSS hash based on the TCP source
and destination ports numbers, IPv4 or IPv6 source and destination
addresses.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Haiyue Wang [Tue, 13 Apr 2021 00:48:41 +0000 (08:48 +0800)]
iavf: Add framework to enable ethtool RSS config
Add the virtchnl message interface to VF, so that VF can request RSS
input set(s) based on PF's capability.
This framework allows ethtool RSS config support on the VF driver.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Currently, RSS hash input is not available to AVF by ethtool, it is set
by the PF directly.
Add the RSS configure support for AVF through new virtchnl message, and
define the capability flag VIRTCHNL_VF_OFFLOAD_ADV_RSS_PF to query this
new RSS offload support.
Signed-off-by: Jia Guo <jia.guo@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Tested-by: Bo Chen <BoX.C.Chen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Brett Creeley [Tue, 2 Mar 2021 18:15:39 +0000 (10:15 -0800)]
ice: Add helper function to get the VF's VSI
Currently, the driver gets the VF's VSI by using a long string of
dereferences (i.e. vf->pf->vsi[vf->lan_vsi_idx]). If the method to get
the VF's VSI were to change the driver would have to change it in every
location. Fix this by adding the helper ice_get_vf_vsi().
Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Colin Ian King [Sat, 22 Feb 2020 17:10:54 +0000 (17:10 +0000)]
ice: remove redundant assignment to pointer vsi
Pointer vsi is being re-assigned a value that is never read,
the assignment is redundant and can be removed.
Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
As the hardware is capable of supporting UDP segmentation offload, add a
capability bit to virtchnl.h to communicate this and have the driver
advertise its support.
Suggested-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Declare bitmap of allowed commands on VF. Initialize default
opcodes list that should be always supported. Declare array of
supported opcodes for each caps used in virtchnl code.
Change allowed bitmap by setting or clearing corresponding
bit to allowlist (bit set) or denylist (bit clear).
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Vignesh Sridhar [Tue, 2 Mar 2021 18:12:00 +0000 (10:12 -0800)]
ice: warn about potentially malicious VFs
Attempt to detect malicious VFs and, if suspected, log the information but
keep going to allow the user to take any desired actions.
Potentially malicious VFs are identified by checking if the VFs are
transmitting too many messages via the PF-VF mailbox which could cause an
overflow of this channel resulting in denial of service. This is done by
creating a snapshot or static capture of the mailbox buffer which can be
traversed and in which the messages sent by VFs are tracked.
Co-developed-by: Yashaswini Raghuram Prathivadi Bhayankaram <yashaswini.raghuram.prathivadi.bhayankaram@intel.com> Signed-off-by: Yashaswini Raghuram Prathivadi Bhayankaram <yashaswini.raghuram.prathivadi.bhayankaram@intel.com> Co-developed-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Co-developed-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Vignesh Sridhar <vignesh.sridhar@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Marek Behún [Wed, 21 Apr 2021 14:08:03 +0000 (16:08 +0200)]
net: phy: marvell: don't use empty switch default case
This causes error reported by kernel test robot.
Signed-off-by: Marek Behún <kabel@kernel.org> Fixes: 41d26bf4aba0 ("net: phy: marvell: refactor HWMON OOP style") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Wed, 21 Apr 2021 18:44:20 +0000 (21:44 +0300)]
net: bridge: fix error in br_multicast_add_port when CONFIG_NET_SWITCHDEV=n
When CONFIG_NET_SWITCHDEV is disabled, the shim for switchdev_port_attr_set
inside br_mc_disabled_update returns -EOPNOTSUPP. This is not caught,
and propagated to the caller of br_multicast_add_port, preventing ports
from joining the bridge.
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Fixes: ae1ea84b33da ("net: bridge: propagate error code and extack from br_mc_disabled_update") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Adam Ford [Wed, 21 Apr 2021 14:05:05 +0000 (09:05 -0500)]
net: ethernet: ravb: Fix release of refclk
The call to clk_disable_unprepare() can happen before priv is
initialized. This means moving clk_disable_unprepare out of
out_release into a new label.
Fixes: 8ef7adc6beb2 ("net: ethernet: ravb: Enable optional refclk") Signed-off-by: Adam Ford <aford173@gmail.com> Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: dsa: fix bridge support for drivers without port_bridge_flags callback
Starting with patch: a8b659e7ff75 ("net: dsa: act as passthrough for bridge port flags")
drivers without "port_bridge_flags" callback will fail to join the bridge.
Looking at the code, -EOPNOTSUPP seems to be the proper return value,
which makes at least microchip and atheros switches work again.
Fixes: 5961d6a12c13 ("net: dsa: inherit the actual bridge port flags at join time") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 21 Apr 2021 13:22:50 +0000 (16:22 +0300)]
stmmac: intel: unlock on error path in intel_crosststamp()
We recently added some new locking to this function but one error path
was overlooked. We need to drop the lock before returning.
Fixes: f4da56529da6 ("net: stmmac: Add support for external trigger timestamping") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: dsa: mv88e6xxx: Export cross-chip PVT as devlink region
Export the raw PVT data in a devlink region so that it can be
inspected from userspace and compared to the current bridge
configuration.
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
net: dsa: mv88e6xxx: Fix off-by-one in VTU devlink region size
In the unlikely event of the VTU being loaded to the brim with 4k
entries, the last one was placed in the buffer, but the size reported
to devlink was off-by-one. Make sure that the final entry is available
to the caller.
Fixes: ca4d632aef03 ("net: dsa: mv88e6xxx: Export VTU as devlink region") Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
net: dsa: mv88e6xxx: Correct spelling of define "ADRR" -> "ADDR"
Because ADRR is not a thing.
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 21 Apr 2021 17:23:17 +0000 (10:23 -0700)]
Merge branch 'octeontx2-af-cn10k'
Srujana Challa says:
====================
Add support for CN10K CPT block
OcteonTX3 (CN10K) silicon is a Marvell next-gen silicon. CN10K CPT
introduces new features like reassembly support and some feature
enhancements.
This patchset adds new mailbox messages and some minor changes to
existing mailbox messages to support CN10K CPT.
v1-v2
Fixed sparse warnings.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds a new mailbox to get CPT stats, includes performance
counters, CPT engines status and RXC status.
Signed-off-by: Narayana Prasad Raju Atherya <pathreya@marvell.com> Signed-off-by: Srujana Challa <schalla@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
octeontx2-af: cn10k: Add mailbox to configure reassembly timeout
CN10K CPT coprocessor includes a component named RXC which
is responsible for reassembly of inner IP packets. RXC has
the feature to evict oldest entries based on age/threshold.
This patch adds a new mailbox to configure reassembly age
or threshold.
Signed-off-by: Jerin Jacob Kollanukkaran <jerinj@marvell.com> Signed-off-by: Srujana Challa <schalla@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
octeontx2-af: cn10k: Mailbox changes for CN10K CPT
Adds changes to existing CPT mailbox messages to support
CN10K CPT block. This patch also adds new register defines
for CN10K CPT.
Signed-off-by: Vidya Sagar Velumuri <vvelumuri@marvell.com> Signed-off-by: Srujana Challa <schalla@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The mhi_wwan_rx_budget_dec function is supposed to return true if
RX buffer budget has been successfully decremented, allowing to queue
a new RX buffer for transfer. However the current implementation is
broken when RX budget is '1', in which case budget is decremented but
false is returned, preventing to requeue one buffer, and leading to
RX buffer starvation.
Fixes: fa588eba632d ("net: Add Qcom WWAN control driver") Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Walle [Tue, 20 Apr 2021 10:29:29 +0000 (12:29 +0200)]
net: phy: at803x: fix probe error if copper page is selected
The commit c329e5afb42f ("net: phy: at803x: select correct page on
config init") selects the copper page during probe. This fails if the
copper page was already selected. In this case, the value of the copper
page (which is 1) is propagated through phy_restore_page() and is
finally returned for at803x_probe(). Fix it, by just using the
at803x_page_write() directly.
Also in case of an error, the regulator is not disabled and leads to a
WARN_ON() when the probe fails. This couldn't happen before, because
at803x_parse_dt() was the last call in at803x_probe(). It is hard to
see, that the parse_dt() actually enables the regulator. Thus move the
regulator_enable() to the probe function and undo it in case of an
error.
Fixes: c329e5afb42f ("net: phy: at803x: select correct page on config init") Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: David Bauer <mail@david-bauer.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Tue, 20 Apr 2021 12:27:30 +0000 (13:27 +0100)]
net: mana: remove redundant initialization of variable err
The variable err is being initialized with a value that is
never read and it is being updated later with a new value. The
initialization is redundant and can be removed
Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 20 Apr 2021 09:43:41 +0000 (02:43 -0700)]
virtio-net: fix use-after-free in page_to_skb()
KASAN/syzbot had 4 reports, one of them being:
BUG: KASAN: slab-out-of-bounds in memcpy include/linux/fortify-string.h:191 [inline]
BUG: KASAN: slab-out-of-bounds in page_to_skb+0x5cf/0xb70 drivers/net/virtio_net.c:480
Read of size 12 at addr ffff888014a5f800 by task systemd-udevd/8445
Fixes: fb32856b16ad ("virtio-net: page_to_skb() use build_skb when there's sufficient tailroom") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Reported-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Jason Wang <jasowang@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: virtualization@lists.linux-foundation.org Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Walle [Tue, 20 Apr 2021 14:28:21 +0000 (16:28 +0200)]
net: enetc: automatically select IERB module
Now that enetc supports flow control we have to make sure the settings in
the IERB are correct. Therefore, we actually depend on the enetc-ierb
module. Previously it was possible that this module was disabled while the
enetc was enabled. Fix it by automatically select the enetc-ierb module.
Fixes: e7d48e5fbf30 ("net: enetc: add a mini driver for the Integrated Endpoint Register Block") Signed-off-by: Michael Walle <michael@walle.cc> Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 20 Apr 2021 20:01:44 +0000 (13:01 -0700)]
virtio-net: restrict build_skb() use to some arches
build_skb() is supposed to be followed by
skb_reserve(skb, NET_IP_ALIGN), so that IP headers are word-aligned.
(Best practice is to reserve NET_IP_ALIGN+NET_SKB_PAD, but the NET_SKB_PAD
part is only a performance optimization if tunnel encaps are added.)
Unfortunately virtio_net has not provisioned this reserve.
We can only use build_skb() for arches where NET_IP_ALIGN == 0
We might refine this later, with enough testing.
Fixes: fb32856b16ad ("virtio-net: page_to_skb() use build_skb when there's sufficient tailroom") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Jason Wang <jasowang@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: virtualization@lists.linux-foundation.org Signed-off-by: David S. Miller <davem@davemloft.net>
bit operation helpers such as test_bit, clear_bit, etc take bit
position as parameter and not value. Current usage causes double
shift => BIT(BIT(0)). Fix that in wwan_core and mhi_wwan_ctrl.
Fixes: 9a44c1cc6388 ("net: Add a WWAN subsystem") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 20 Apr 2021 23:51:20 +0000 (16:51 -0700)]
Merge branch 'dsa-tag-override'
Tobias Waldekranz says:
====================
net: dsa: Allow default tag protocol to be overridden from DT
This is a continuation of the work started in this patch:
https://lore.kernel.org/netdev/20210323102326.3677940-1-tobias@waldekranz.com/
In addition to the mv88e6xxx support to dynamically change the
protocol, it is now possible to override the protocol from the device
tree. This means that when a board vendor finds an incompatibility,
they can specify a working protocol in the DT, and users will not have
to worry about it.
Some background information:
In a system using an NXP T1023 SoC connected to a 6390X switch, we
noticed that TO_CPU frames where not reaching the CPU. This only
happened on hardware port 8. Looking at the DSA master interface
(dpaa-ethernet) we could see that an Rx error counter was bumped at
the same rate. The logs indicated a parser error.
It just so happens that a TO_CPU coming in on device 0, port 8, will
result in the first two bytes of the DSA tag being one of:
00 40
00 44
00 46
My guess was that since these values looked like 802.3 length fields,
the controller's parser would signal an error if the frame length did
not match what was in the header.
This was later confirmed using two different workarounds provided by
Vladimir. Unfortunately these either bypass or ignore the hardware
parser and thus robs working combinations of the ability to do RSS and
other nifty things. It was therefore decided to go with the option of
a DT override.
v1 -> v2:
- Fail if the device does not support changing protocols instead of
falling back to the default. (Andrew)
- Only call change_tag_protocol on CPU ports. (Andrew/Vladimir)
- Only allow changing the protocol on chips that have at least
"undocumented" level of support for EDSA. (Andrew).
- List the supported protocols in the binding documentation. I opted
for only listing the protocols that I have tested. As more people
test their drivers, they can add them. (Rob)
v2 -> v3:
- Rename "dsa,tag-protocol" -> "dsa-tag-protocol". (Rob)
- Some cleanups to 4/5. (Vladimir)
- Add a comment detailing how tree/driver agreement on the tag
protocol is enforced. (Vladimir).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The 'dsa-tag-protocol' is used to force a switch tree to use a
particular tag protocol, typically because the Ethernet controller
that it is connected to is not compatible with the default one.
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: dsa: Allow default tag protocol to be overridden from DT
Some combinations of tag protocols and Ethernet controllers are
incompatible, and it is hard for the driver to keep track of these.
Therefore, allow the device tree author (typically the board vendor)
to inform the driver of this fact by selecting an alternate protocol
that is known to work.
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: dsa: Only notify CPU ports of changes to the tag protocol
Previously DSA ports were also included, on the assumption that the
protocol used by the CPU port had to the matched throughout the entire
tree.
As there is not yet any consumer in need of this, drop the call.
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: dsa: mv88e6xxx: Allow dynamic reconfiguration of tag protocol
For devices that supports both regular and Ethertyped DSA tags, allow
the user to change the protocol.
Additionally, because there are ethernet controllers that do not
handle regular DSA tags in all cases, also allow the protocol to be
changed on devices with undocumented support for EDSA. But, in those
cases, make sure to log the fact that an undocumented feature has been
enabled.
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: dsa: mv88e6xxx: Mark chips with undocumented EDSA tag support
All devices are capable of using regular DSA tags. Support for
Ethertyped DSA tags sort into three categories:
1. No support. Older chips fall into this category.
2. Full support. Datasheet explicitly supports configuring the CPU
port to receive FORWARDs with a DSA tag.
3. Undocumented support. Datasheet lists the configuration from
category 2 as "reserved for future use", but does empirically
behave like a category 2 device.
So, instead of listing the one true protocol that should be used by a
particular chip, specify the level of support for EDSA (support for
regular DSA is implicit on all chips). As before, we use EDSA for all
chips that fully supports it.
In upcoming changes, we will use this information to support
dynamically changing the tag protocol.
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 20 Apr 2021 23:44:04 +0000 (16:44 -0700)]
Merge tag 'mac80211-next-for-net-next-2021-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
Another set of updates, all over the map:
* set sk_pacing_shift for 802.3->802.11 encap offload
* some monitor support for 802.11->802.3 decap offload
* HE (802.11ax) spec updates
* userspace API for TDLS HE support
* along with various other small features, cleanups and
fixups
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, mlxsw admits for offload a suitable root qdisc, and its
children. Thus up to two levels of hierarchy are offloaded. Often, this is
enough: one can configure TCs with RED and TCs with a shaper, and can even
see counters for each TC by looking at a qdisc at a sufficiently shallow
position.
While simple, the system has obvious shortcomings. It is not possible to
configure both RED and shaping on one TC. It is not possible to place a
PRIO below root TBF, which would then be offloaded as port shaper. FIFOs
are only offloaded at root or directly below, which is confusing to users,
because RED and TBF of course have their own FIFO.
This patchset is a step towards the end goal of allowing more comprehensive
qdisc tree offload and cleans up the qdisc offload code.
- Patches #1-#4 contain small cleanups.
- Up until now, since mlxsw offloaded only a very simple qdisc
configurations, basically all bookkeeping was done using one container
for the root qdisc, and 8 containers for its children. Patches #5, #6, #8
and #9 gradually introduce a more dynamic structure, where parent-child
relationships are tracked directly at qdiscs, instead of being implicit.
- This tree management assumes only one qdisc is created at a time. In FIFO
handlers, this condition was enforced simply by asserting RTNL lock. But
instead of furthering this RTNL dependence, patch #7 converts the whole
qdisc offload logic to a per-port mutex.
- Patch #10 adds a selftest.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Tue, 20 Apr 2021 14:53:48 +0000 (16:53 +0200)]
selftests: mlxsw: sch_red_ets: Test proper counter cleaning in ETS
There was a bug introduced during the rework which cause non-zero backlog
being stuck at ETS. Introduce a selftest that would have caught the issue
earlier.
Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Tue, 20 Apr 2021 14:53:47 +0000 (16:53 +0200)]
mlxsw: spectrum_qdisc: Index future FIFOs by band number
mlxsw used to hold an array of qdiscs indexed by the TC number. In the
previous patch, it was changed to allocate child qdiscs dynamically, and
they are now indexed by band number. Follow suit with the array of future
FIFOs.
Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of keeping qdiscs in globally-preallocated arrays, introduce a
per-qdisc-kind value num_classes, and then allocate the necessary child
qdiscs (if any) based on that value. Since now dynamic allocation is
involved, mlxsw_sp_qdisc_replace() gets messy enough that it is worth it to
split it to two cases: a new qdisc allocation and a change of existing
qdisc. (Note that the change also includes what TC formally calls replace,
if the qdisc kind is the same.)
Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Tue, 20 Apr 2021 14:53:45 +0000 (16:53 +0200)]
mlxsw: spectrum_qdisc: Guard all qdisc accesses with a lock
The FIFO handler currently guards accesses to the future FIFO tracking by
asserting RTNL. In the future, the changes to the qdisc state will be more
thorough, so other qdiscs will need this guarding is as well. In order
to not further the RTNL infestation, instead convert to a custom lock that
will guard accesses to the qdisc state.
Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Tue, 20 Apr 2021 14:53:44 +0000 (16:53 +0200)]
mlxsw: spectrum_qdisc: Track children per qdisc
mlxsw currently allows a two-level structure of qdiscs: the root and
possibly a number of children. In order to support offloading more general
qdisc trees, introduce to struct mlxsw_sp_qdisc a pointer to child qdiscs.
Refer to the child qdiscs through this pointer, instead of going through
the tclass_qdiscs in qdisc_state. Additionally introduce a field
num_classes, which holds number of given qdisc's children.
Also introduce a generic function for walking qdisc trees. Rewrite
mlxsw_sp_qdisc_find() and _find_by_handle() to use the generic walker.
For now, keep the qdisc_state.tclass_qdisc, and just point root_qdiscs's
children to this array. Following patches will make the allocation dynamic.
Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Tue, 20 Apr 2021 14:53:43 +0000 (16:53 +0200)]
mlxsw: spectrum_qdisc: Promote backlog reduction to mlxsw_sp_qdisc_destroy()
When a qdisc is removed, it is necessary to update the backlog value at its
parent--unless the qdisc is at root position. RED, TBF and FIFO all do
that, each separately. Since all of them need to do this, just promote the
operation directly to mlxsw_sp_qdisc_destroy(), instead of deferring it to
individual destructors. Since FIFO dtor thus becomes trivial, remove it.
Add struct mlxsw_sp_qdisc.parent to point at the parent qdisc. This will be
handy later as deeper structures are offloaded. Use the parent qdisc to
find the chain of parents whose backlog value needs to be updated.
Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Tue, 20 Apr 2021 14:53:42 +0000 (16:53 +0200)]
mlxsw: spectrum_qdisc: Track tclass_num as int, not u8
tclass_num is just a number, a value that would be ordinarily passed around
as an int. (Which is unlike a u8 prio_bitmap.) In several places,
tclass_num already is an int. Convert the remaining instances.
Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Tue, 20 Apr 2021 14:53:41 +0000 (16:53 +0200)]
mlxsw: spectrum_qdisc: Drop an always-true condition
The function mlxsw_sp_qdisc_compare() is invoked a couple lines above this
check, which will bounce any requests where this condition does not hold.
Therefore drop it.
Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The purpose of this function is to filter out events that are related to
qdiscs that are not offloaded, or are not offloaded anymore. But the
function is unnecessarily thorough:
- mlxsw_sp_qdisc pointer is never NULL in the context where it is called
- Two qdiscs with the same handle will never have different types. Even
when replacing one qdisc with another in the same class, Linux will not
permit handle reuse unless the qdisc type also matches.
Simplify the function by omitting these two unnecessary conditions.
Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Marek Behún [Tue, 20 Apr 2021 07:54:00 +0000 (09:54 +0200)]
net: phy: marvell: fix HWMON enable register for 6390
Register 27_6.15:14 has the following description in 88E6393X
documentation:
Temperature Sensor Enable
0x0 - Sample every 1s
0x1 - Sense rate decided by bits 10:8 of this register
0x2 - Use 26_6.5 (One shot Temperature Sample) to enable
0x3 - Disable
This is compatible with how the 6390 code uses this register currently,
but the 6390 code handles it as two 1-bit registers (somewhat), instead
of one register with 4 possible values.
(A newer version of the 6390 documentation removed temperature sensor
section completely. In an older version, the above mentioned register
is reserved, although it is R/W. Since the code works, I think we can
assume that it is correct.)
Rename this register and define all 4 values according to 6393X
documentation.
Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Marek Behún [Tue, 20 Apr 2021 07:53:59 +0000 (09:53 +0200)]
net: phy: marvell: refactor HWMON OOP style
Use a structure of Marvell PHY specific HWMON methods to reduce code
duplication. Store a pointer to this structure into the PHY driver's
driver_data member.
Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 20 Apr 2021 23:14:02 +0000 (16:14 -0700)]
Merge tag 'mlx5-updates-2021-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2021-04-19
This patchset provides some updates to mlx5e and mlx5 SW steering drivers:
1) Tariq and Vladyslav they both provide some trivial update to mlx5e netdev.
The next 12 patches in the patchset are focused toward mlx5 SW steering:
2) 3 trivial cleanup patches
3) Dynamic Flex parser support:
Flex parser is a HW parser that can support protocols that are not
natively supported by the HCA, such as Geneve (TLV options) and GTP-U.
There are 8 such parsers, and each of them can be assigned to parse a
specific set of protocols.
4) Enable matching on Geneve TLV options
5) Use Flex parser for MPLS over UDP/GRE
6) Enable matching on tunnel GTP-U and GTP-U first extension
header using
7) Improved QoS for SW steering internal QPair for a better insertion rate
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Xiaoliang Yang [Mon, 19 Apr 2021 10:25:30 +0000 (18:25 +0800)]
net: dsa: felix: disable always guard band bit for TAS config
ALWAYS_GUARD_BAND_SCH_Q bit in TAS config register is descripted as
this:
0: Guard band is implemented for nonschedule queues to schedule
queues transition.
1: Guard band is implemented for any queue to schedule queue
transition.
The driver set guard band be implemented for any queue to schedule queue
transition before, which will make each GCL time slot reserve a guard
band time that can pass the max SDU frame. Because guard band time could
not be set in tc-taprio now, it will use about 12000ns to pass 1500B max
SDU. This limits each GCL time interval to be more than 12000ns.
This patch change the guard band to be only implemented for nonschedule
queues to schedule queues transition, so that there is no need to reserve
guard band on each GCL. Users can manually add guard band time for each
schedule queues in their configuration if they want.
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 20 Apr 2021 23:08:02 +0000 (16:08 -0700)]
Merge branch 'net-generic-selftest-support'
Oleksij Rempel says:
====================
provide generic net selftest support
changes v3:
- make more granular tests
- enable loopback for all PHYs by default
- fix allmodconfig build errors
- poll for link status update after switching to the loopback mode
changes v2:
- make generic selftests available for all networking devices.
- make use of net_selftest* on FEC, ag71xx and all DSA switches.
- add loopback support on more PHYs.
This patch set provides diagnostic capabilities for some iMX, ag71xx or
any DSA based devices. For proper functionality, PHY loopback support is
needed.
So far there is only initial infrastructure with basic tests.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
net: dsa: enable selftest support for all switches by default
Most of generic selftest should be able to work with probably all ethernet
controllers. The DSA switches are not exception, so enable it by default at
least for DSA.
This patch was tested with SJA1105 and AR9331.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Port some parts of the stmmac selftest and reuse it as basic generic selftest
library. This patch was tested with following combinations:
- iMX6DL FEC -> AT8035
- iMX6DL FEC -> SJA1105Q switch -> KSZ8081
- iMX6DL FEC -> SJA1105Q switch -> KSZ9031
- AR9331 ag71xx -> AR9331 PHY
- AR9331 ag71xx -> AR9331 switch -> AR9331 PHY
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
net: phy: genphy_loopback: add link speed configuration
In case of loopback, in most cases we need to disable autoneg support
and force some speed configuration. Otherwise, depending on currently
active auto negotiated link speed, the loopback may or may not work.
This patch was tested with following PHYs: TJA1102, KSZ8081, KSZ9031,
AT8035, AR9331.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
net: phy: execute genphy_loopback() per default on all PHYs
The generic loopback is really generic and is defined by the 802.3
standard, we should just mandate that drivers implement a custom
loopback if the generic one cannot work.
Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
When using SW steering, rule insertion rate depends on the RDMA RC QP
performance used for writing to the ICM. During stress this QP is competing
on the HW resources with all the other QPs that are used to send data.
To protect SW steering QP's performance in such cases, we set this QP to
use isolated VL. The VL number is reserved by FW and is not exposed to the
driver.
Support for this QP on isolated VL exists only when both force-loopback and
isolate_vl_tc capabilities are set.
When supported by the device, SW steering RoCE RC QP that is used to
write/read to/from ICM will be created with force-loopback attribute.
Such QP doesn't require GID index upon creation.
Flex parser is a HW parser that can support protocols that are not
natively supported by the HCA, such as Geneve (TLV options) and GTP-U.
There are 8 such parsers, and each of them can be assigned to parse a
specific set of protocols.
This patch adds misc4 match params which allows using a correct flex parser
that was programmed to the required protocol.
Added the required definitions for supporting more protocols by flex parsers
(GTP-U, Geneve TLV options), and for using the right flex parser that was
configured for this protocol.
net/mlx5: E-Switch, Improve error messages in term table creation
Add error code to the error messages and removed duplicated message:
if termination table creation failed, we already get an error message
in mlx5_eswitch_termtbl_create, so no need for the additional error print
in the calling function.
Tariq Toukan [Sun, 21 Jun 2020 18:35:34 +0000 (21:35 +0300)]
net/mlx5e: RX, Add checks for calculated Striding RQ attributes
Striding RQ attributes below are mutually dependent. An unaware
change to one might take the others out of the valid range derived
by the HW caps:
- The MPWQE size in bytes
- The number of strides in a MPWQE
- The stride size
Add checks to verify they are valid and comply to the HW spec
and SW assumptions/requirements.
This is not a fix, no particular issue exists today.
net/mlx5e: Fix possible non-initialized struct usage
If mlx5e_devlink_port_register() fails, driver may try to register
devlink health TX and RX reporters on non-registered devlink port.
Instead, create health reporters only if mlx5e_devlink_port_register()
does not fail. And destroy reporters only if devlink_port is registered.
Also, change mlx5e_get_devlink_port() behavior and return NULL in case
port is not registered to replicate devlink's wrapper when ndo is not
implemented.
Jakub Kicinski [Mon, 19 Apr 2021 21:52:35 +0000 (14:52 -0700)]
ethtool: add missing EEPROM to list of messages
ETHTOOL_MSG_MODULE_EEPROM_GET is missing from the list of messages.
ETHTOOL_MSG_MODULE_EEPROM_GET_REPLY is sadly a rather long name
so we need to adjust column length.
v2: use spaces (Andrew)
Fixes: c781ff12a2f3 ("ethtool: Allow network drivers to dump arbitrary EEPROM data") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Suggested-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 19 Apr 2021 23:19:44 +0000 (16:19 -0700)]
Merge branch 'tja1103-driver'
Radu Pirea says:
====================
TJA1103 driver
This small series adds the TJA1103 PHY driver.
Changes in v3:
- use phy_read_mmd_poll_timeout instead of spin_until_cond
- changed the phy name from a generic one to something specific
- minor indentation change
Changes in v2:
- implemented genphy_c45_pma_suspend/genphy_c45_pma_suspend
- set default internal delays set to 2ns(90 degrees)
- added "VEND1_" prefix to the register definitions
- disable rxid in case of txid
- disable txid in case of rxid
- disable internal delays in rgmii mode
- reduced max line length to 80 characters
- rebased on top of net-next/master
- use genphy_c45_loopback as .set_loopback callback
- renamed the driver from nxp-c45 to nxp-c45-tja11xx
- used phy phy_set_bits_mmd/phy_clear_bits_mmd instead on phy_write_mmd where
I had to set/clear one bit.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add generic PMA suspend and resume callback functions for C45 PHYs.
Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes: bba2556efad6 ("net: stmmac: Enable RX via AF_XDP zero-copy") Cc: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 19 Apr 2021 22:58:15 +0000 (15:58 -0700)]
Merge branch 'korina-next'
Thomas Bogendoerfer says:
====================
net: Korina improvements
While converting Mikrotik RB532 support to use device tree I stumbled
over the korina ethernet driver, which used way too many MIPS specific
hacks. This series cleans this all up and adds support for device tree.
Changes in v6:
- remove korina from resource names and adapt DT binding to it
- removed superfluous braces around of_get_mac_address
Changes in v5:
- fixed email address in binding document, which prevented sending it
Changes in v4:
- improve error returns in mdio_read further
- added clock name and improved clk handling
- fixed binding errors
Changes in v3:
- use readl_poll_timeout_atomic in mdio_wait
- return -ETIMEDOUT, if mdio_wait failed
- added DT binding and changed name to idt,3243x-emac
- fixed usage of of_get_mac_address for net-next
Changes in v2:
- added device tree support to get rid of idt_cpu_freq
- fixed compile test on 64bit archs
- fixed descriptor current address handling by storing/using mapped
dma addresses (dma controller modifies current address)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Move structs/defines for ethernet/dma register into driver, since they
are only used for this driver and remove any MIPS specific includes.
This makes it possible to COMPILE_TEST the driver.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: David S. Miller <davem@davemloft.net>
net: korina: Get mdio input clock via common clock framework
With device tree clock is provided via CCF. For non device tree
use a maximum clock value to not overclock the PHY. The non device
tree usage will go away after platform is converted to DT.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of messing with MIPS specific macros use DMA API for mapping
descriptors and skbs.
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Remove helpers, which are only used in one call site.
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Descriptors are mapped uncached so there is no need to do any cache
handling for them.
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: David S. Miller <davem@davemloft.net>