]> git.proxmox.com Git - mirror_ubuntu-kernels.git/log
mirror_ubuntu-kernels.git
2 years agoMerge branch 'dsa-changes-for-multiple-cpu-ports-part-3'
Paolo Abeni [Tue, 23 Aug 2022 09:39:36 +0000 (11:39 +0200)]
Merge branch 'dsa-changes-for-multiple-cpu-ports-part-3'

Vladimir Oltean says:

====================
DSA changes for multiple CPU ports (part 3)

Those who have been following part 1:
https://patchwork.kernel.org/project/netdevbpf/cover/20220511095020.562461-1-vladimir.oltean@nxp.com/
and part 2:
https://patchwork.kernel.org/project/netdevbpf/cover/20220521213743.2735445-1-vladimir.oltean@nxp.com/
will know that I am trying to enable the second internal port pair from
the NXP LS1028A Felix switch for DSA-tagged traffic via "ocelot-8021q".
This series represents part 3 of that effort.

Covered here are some preparations in DSA for handling multiple DSA
masters:
- when changing the tagging protocol via sysfs
- when the masters go down
as well as preparation for monitoring the upper devices of a DSA master
(to support DSA masters under a LAG).

There are also 2 small preparations for the ocelot driver, for the case
where multiple tag_8021q CPU ports are used in a LAG. Both those changes
have to do with PGID forwarding domains.

Compared to v1, the patches were trimmed down to just another
preparation stage, and the UAPI changes were pushed further out to part 4.
https://patchwork.kernel.org/project/netdevbpf/cover/20220523104256.3556016-1-olteanv@gmail.com/

Compared to v2, I had to export a symbol I forgot to
(ocelot_port_teardown_dsa_8021q_cpu), to avoid a build breakage when the
felix and seville drivers are built as modules.
====================

Link: https://lore.kernel.org/r/20220819174820.3585002-1-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: mscc: ocelot: adjust forwarding domain for CPU ports in a LAG
Vladimir Oltean [Fri, 19 Aug 2022 17:48:20 +0000 (20:48 +0300)]
net: mscc: ocelot: adjust forwarding domain for CPU ports in a LAG

Currently when we have 2 CPU ports configured for DSA tag_8021q mode and
we put them in a LAG, a PGID dump looks like this:

PGID_SRC[0] = ports 4,
PGID_SRC[1] = ports 4,
PGID_SRC[2] = ports 4,
PGID_SRC[3] = ports 4,
PGID_SRC[4] = ports 0, 1, 2, 3, 4, 5,
PGID_SRC[5] = no ports

(ports 0-3 are user ports, ports 4 and 5 are CPU ports)

There are 2 problems with the configuration above:

- user ports should enable forwarding towards both CPU ports, not just 4,
  and the aggregation PGIDs should prune one CPU port or the other from
  the destination port mask, based on a hash computed from packet headers.

- CPU ports should not be allowed to forward towards themselves and also
  not towards other ports in the same LAG as themselves

The first problem requires fixing up the PGID_SRC of user ports, when
ocelot_port_assigned_dsa_8021q_cpu_mask() is called. We need to say that
when a user port is assigned to a tag_8021q CPU port and that port is in
a LAG, it should forward towards all ports in that LAG.

The second problem requires fixing up the PGID_SRC of port 4, to remove
ports 4 and 5 (in a LAG) from the allowed destinations.

After this change, the PGID source masks look as follows:

PGID_SRC[0] = ports 4, 5,
PGID_SRC[1] = ports 4, 5,
PGID_SRC[2] = ports 4, 5,
PGID_SRC[3] = ports 4, 5,
PGID_SRC[4] = ports 0, 1, 2, 3,
PGID_SRC[5] = no ports

Note that PGID_SRC[5] still looks weird (it should say "0, 1, 2, 3" just
like PGID_SRC[4] does), but I've tested forwarding through this CPU port
and it doesn't seem like anything is affected (it appears that PGID_SRC[4]
is being looked up on forwarding from the CPU, since both ports 4 and 5
have logical port ID 4). The reason why it looks weird is because
we've never called ocelot_port_assign_dsa_8021q_cpu() for any user port
towards port 5 (all user ports are assigned to port 4 which is in a LAG
with 5).

Since things aren't broken, I'm willing to leave it like that for now
and just document the oddity.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: mscc: ocelot: set up tag_8021q CPU ports independent of user port affinity
Vladimir Oltean [Fri, 19 Aug 2022 17:48:19 +0000 (20:48 +0300)]
net: mscc: ocelot: set up tag_8021q CPU ports independent of user port affinity

This is a partial revert of commit c295f9831f1d ("net: mscc: ocelot:
switch from {,un}set to {,un}assign for tag_8021q CPU ports"), because
as it turns out, this isn't how tag_8021q CPU ports under a LAG are
supposed to work.

Under that scenario, all user ports are "assigned" to the single
tag_8021q CPU port represented by the logical port corresponding to the
bonding interface. So one CPU port in a LAG would have is_dsa_8021q_cpu
set to true (the one whose physical port ID is equal to the logical port
ID), and the other one to false.

In turn, this makes 2 undesirable things happen:

(1) PGID_CPU contains only the first physical CPU port, rather than both
(2) only the first CPU port will be added to the private VLANs used by
    ocelot for VLAN-unaware bridging

To make the driver behave in the same way for both bonded CPU ports, we
need to bring back the old concept of setting up a port as a tag_8021q
CPU port, and this is what deals with VLAN membership and PGID_CPU
updating. But we also need the CPU port "assignment" (the user to CPU
port affinity), and this is what updates the PGID_SRC forwarding rules.

All DSA CPU ports are statically configured for tag_8021q mode when the
tagging protocol is changed to ocelot-8021q. User ports are "assigned"
to one CPU port or the other dynamically (this will be handled by a
future change).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: dsa: use dsa_tree_for_each_cpu_port in dsa_tree_{setup,teardown}_master
Vladimir Oltean [Fri, 19 Aug 2022 17:48:18 +0000 (20:48 +0300)]
net: dsa: use dsa_tree_for_each_cpu_port in dsa_tree_{setup,teardown}_master

More logic will be added to dsa_tree_setup_master() and
dsa_tree_teardown_master() in upcoming changes.

Reduce the indentation by one level in these functions by introducing
and using a dedicated iterator for CPU ports of a tree.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: dsa: all DSA masters must be down when changing the tagging protocol
Vladimir Oltean [Fri, 19 Aug 2022 17:48:17 +0000 (20:48 +0300)]
net: dsa: all DSA masters must be down when changing the tagging protocol

The fact that the tagging protocol is set and queried from the
/sys/class/net/<dsa-master>/dsa/tagging file is a bit of a quirk from
the single CPU port days which isn't aging very well now that DSA can
have more than a single CPU port. This is because the tagging protocol
is a switch property, yet in the presence of multiple CPU ports it can
be queried and set from multiple sysfs files, all of which are handled
by the same implementation.

The current logic ensures that the net device whose sysfs file we're
changing the tagging protocol through must be down. That net device is
the DSA master, and this is fine for single DSA master / CPU port setups.

But exactly because the tagging protocol is per switch [ tree, in fact ]
and not per DSA master, this isn't fine any longer with multiple CPU
ports, and we must iterate through the tree and find all DSA masters,
and make sure that all of them are down.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: dsa: only bring down user ports assigned to a given DSA master
Vladimir Oltean [Fri, 19 Aug 2022 17:48:16 +0000 (20:48 +0300)]
net: dsa: only bring down user ports assigned to a given DSA master

This is an adaptation of commit c0a8a9c27493 ("net: dsa: automatically
bring user ports down when master goes down") for multiple DSA masters.
When a DSA master goes down, only the user ports under its control
should go down too, the others can still send/receive traffic.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: dsa: existing DSA masters cannot join upper interfaces
Vladimir Oltean [Fri, 19 Aug 2022 17:48:15 +0000 (20:48 +0300)]
net: dsa: existing DSA masters cannot join upper interfaces

All the traffic to/from a DSA master is supposed to be distributed among
its DSA switch upper interfaces, so we should not allow other upper
device kinds.

An exception to this is DSA_TAG_PROTO_NONE (switches with no DSA tags),
and in that case it is actually expected to create e.g. VLAN interfaces
on the master. But for those, netdev_uses_dsa(master) returns false, so
the restriction doesn't apply.

The motivation for this change is to allow LAG interfaces of DSA masters
to be DSA masters themselves. We want to restrict the user's degrees of
freedom by 1: the LAG should already have all DSA masters as lowers, and
while lower ports of the LAG can be removed, none can be added after the
fact.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: bridge: move DSA master bridging restriction to DSA
Vladimir Oltean [Fri, 19 Aug 2022 17:48:14 +0000 (20:48 +0300)]
net: bridge: move DSA master bridging restriction to DSA

When DSA gains support for multiple CPU ports in a LAG, it will become
mandatory to monitor the changeupper events for the DSA master.

In fact, there are already some restrictions to be imposed in that area,
namely that a DSA master cannot be a bridge port except in some special
circumstances.

Centralize the restrictions at the level of the DSA layer as a
preliminary step.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: dsa: don't stop at NOTIFY_OK when calling ds->ops->port_prechangeupper
Vladimir Oltean [Fri, 19 Aug 2022 17:48:13 +0000 (20:48 +0300)]
net: dsa: don't stop at NOTIFY_OK when calling ds->ops->port_prechangeupper

dsa_slave_prechangeupper_sanity_check() is supposed to enforce some
adjacency restrictions, and calls ds->ops->port_prechangeupper if the
driver implements it.

We convert the error code from the port_prechangeupper() call to a
notifier code, and 0 is converted to NOTIFY_OK, but the caller of
dsa_slave_prechangeupper_sanity_check() stops at any notifier code
different from NOTIFY_DONE.

Avoid this by converting back the notifier code to an error code, so
that both NOTIFY_OK and NOTIFY_DONE will be seen as 0. This allows more
parallel sanity check functions to be added.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: dsa: walk through all changeupper notifier functions
Vladimir Oltean [Fri, 19 Aug 2022 17:48:12 +0000 (20:48 +0300)]
net: dsa: walk through all changeupper notifier functions

Traditionally, DSA has had a single netdev notifier handling function
for each device type.

For the sake of code cleanliness, we would like to introduce more
handling functions which do one thing, but the conditions for entering
these functions start to overlap. Example: a handling function which
tracks whether any bridges contain both DSA and non-DSA interfaces.
Either this is placed before dsa_slave_changeupper(), case in which it
will prevent that function from executing, or we place it after
dsa_slave_changeupper(), case in which we will prevent it from
executing. The other alternative is to ignore errors from the new
handling function (not ideal).

To support this usage, we need to change the pattern. In the new model,
we enter all notifier handling sub-functions, and exit with NOTIFY_DONE
if there is nothing to do. This allows the sub-functions to be
relatively free-form and independent from each other.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoMerge branch 'vsock-updates-for-so_rcvlowat-handling'
Paolo Abeni [Tue, 23 Aug 2022 08:45:02 +0000 (10:45 +0200)]
Merge branch 'vsock-updates-for-so_rcvlowat-handling'

Arseniy Krasnov says:

====================
vsock: updates for SO_RCVLOWAT handling

This patchset includes some updates for SO_RCVLOWAT:

1) af_vsock:
   During my experiments with zerocopy receive, i found, that in some
   cases, poll() implementation violates POSIX: when socket has non-
   default SO_RCVLOWAT(e.g. not 1), poll() will always set POLLIN and
   POLLRDNORM bits in 'revents' even number of bytes available to read
   on socket is smaller than SO_RCVLOWAT value. In this case,user sees
   POLLIN flag and then tries to read data(for example using  'read()'
   call), but read call will be blocked, because  SO_RCVLOWAT logic is
   supported in dequeue loop in af_vsock.c. But the same time,  POSIX
   requires that:

   "POLLIN     Data other than high-priority data may be read without
               blocking.
    POLLRDNORM Normal data may be read without blocking."

   See https://www.open-std.org/jtc1/sc22/open/n4217.pdf, page 293.

   So, we have, that poll() syscall returns POLLIN, but read call will
   be blocked.

   Also in man page socket(7) i found that:

   "Since Linux 2.6.28, select(2), poll(2), and epoll(7) indicate a
   socket as readable only if at least SO_RCVLOWAT bytes are available."

   I checked TCP callback for poll()(net/ipv4/tcp.c, tcp_poll()), it
   uses SO_RCVLOWAT value to set POLLIN bit, also i've tested TCP with
   this case for TCP socket, it works as POSIX required.

   I've added some fixes to af_vsock.c and virtio_transport_common.c,
   test is also implemented.

2) virtio/vsock:
   It adds some optimization to wake ups, when new data arrived. Now,
   SO_RCVLOWAT is considered before wake up sleepers who wait new data.
   There is no sense, to kick waiter, when number of available bytes
   in socket's queue < SO_RCVLOWAT, because if we wake up reader in
   this case, it will wait for SO_RCVLOWAT data anyway during dequeue,
   or in poll() case, POLLIN/POLLRDNORM bits won't be set, so such
   exit from poll() will be "spurious". This logic is also used in TCP
   sockets.

3) vmci/vsock:
   Same as 2), but i'm not sure about this changes. Will be very good,
   to get comments from someone who knows this code.

4) Hyper-V:
   As Dexuan Cui mentioned, for Hyper-V transport it is difficult to
   support SO_RCVLOWAT, so he suggested to disable this feature for
   Hyper-V.
====================

Link: https://lore.kernel.org/r/de41de4c-0345-34d7-7c36-4345258b7ba8@sberdevices.ru
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agovsock_test: POLLIN + SO_RCVLOWAT test
Arseniy Krasnov [Fri, 19 Aug 2022 05:43:50 +0000 (05:43 +0000)]
vsock_test: POLLIN + SO_RCVLOWAT test

This adds test to check, that when poll() returns POLLIN, POLLRDNORM bits,
next read call won't block.

Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agovmci/vsock: check SO_RCVLOWAT before wake up reader
Arseniy Krasnov [Fri, 19 Aug 2022 05:41:35 +0000 (05:41 +0000)]
vmci/vsock: check SO_RCVLOWAT before wake up reader

This adds extra condition to wake up data reader: do it only when number
of readable bytes >= SO_RCVLOWAT. Otherwise, there is no sense to kick
user, because it will wait until SO_RCVLOWAT bytes will be dequeued. This
check is performed in vsock_data_ready().

Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Vishnu Dasa <vdasa@vmware.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agovirtio/vsock: check SO_RCVLOWAT before wake up reader
Arseniy Krasnov [Fri, 19 Aug 2022 05:39:24 +0000 (05:39 +0000)]
virtio/vsock: check SO_RCVLOWAT before wake up reader

This adds extra condition to wake up data reader: do it only when number
of readable bytes >= SO_RCVLOWAT. Otherwise, there is no sense to kick
user,because it will wait until SO_RCVLOWAT bytes will be dequeued. This
check is performed in vsock_data_ready().

Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agovsock: add API call for data ready
Arseniy Krasnov [Fri, 19 Aug 2022 05:36:52 +0000 (05:36 +0000)]
vsock: add API call for data ready

This adds 'vsock_data_ready()' which must be called by transport to kick
sleeping data readers. It checks for SO_RCVLOWAT value before waking
user, thus preventing spurious wake ups. Based on 'tcp_data_ready()' logic.

Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agovsock: pass sock_rcvlowat to notify_poll_in as target
Arseniy Krasnov [Fri, 19 Aug 2022 05:33:47 +0000 (05:33 +0000)]
vsock: pass sock_rcvlowat to notify_poll_in as target

Passing 1 as the target to notify_poll_in(), we don't honor
what the user has set via SO_RCVLOWAT, going to set POLLIN
and POLLRDNORM, even if we don't have the amount of bytes
expected by the user.

Let's use sock_rcvlowat() to get the right target to pass to
notify_poll_in();

Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agovmci/vsock: use 'target' in notify_poll_in callback
Arseniy Krasnov [Fri, 19 Aug 2022 05:31:43 +0000 (05:31 +0000)]
vmci/vsock: use 'target' in notify_poll_in callback

This callback controls setting of POLLIN, POLLRDNORM output bits of poll()
syscall, but in some cases, it is incorrectly to set it, when socket has
at least 1 bytes of available data. Use 'target' which is already exists.

Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Vishnu Dasa <vdasa@vmware.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agovirtio/vsock: use 'target' in notify_poll_in callback
Arseniy Krasnov [Fri, 19 Aug 2022 05:29:34 +0000 (05:29 +0000)]
virtio/vsock: use 'target' in notify_poll_in callback

This callback controls setting of POLLIN, POLLRDNORM output bits of poll()
syscall, but in some cases, it is incorrectly to set it, when socket has
at least 1 bytes of available data. Use 'target' which is already exists.

Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agohv_sock: disable SO_RCVLOWAT support
Arseniy Krasnov [Fri, 19 Aug 2022 05:27:34 +0000 (05:27 +0000)]
hv_sock: disable SO_RCVLOWAT support

For Hyper-V it is quiet difficult to support this socket option,due to
transport internals, so disable it.

Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agovsock: SO_RCVLOWAT transport set callback
Arseniy Krasnov [Fri, 19 Aug 2022 05:25:19 +0000 (05:25 +0000)]
vsock: SO_RCVLOWAT transport set callback

This adds transport specific callback for SO_RCVLOWAT, because in some
transports it may be difficult to know current available number of bytes
ready to read. Thus, when SO_RCVLOWAT is set, transport may reject it.

Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: sched: remove duplicate check of user rights in qdisc
Zhengchao Shao [Fri, 19 Aug 2022 04:18:54 +0000 (12:18 +0800)]
net: sched: remove duplicate check of user rights in qdisc

In rtnetlink_rcv_msg function, the permission for all user operations
is checked except the GET operation, which is the same as the checking
in qdisc. Therefore, remove the user rights check in qdisc.

Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Message-Id: <20220819041854.83372-1-shaozhengchao@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoMerge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next...
Jakub Kicinski [Tue, 23 Aug 2022 03:24:45 +0000 (20:24 -0700)]
Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-08-18 (ixgbe)

This series contains updates to ixgbe driver only.

Fabio M. De Francesco replaces kmap() call to page_address() for
rx_buffer->page().

Jeff Daly adds a manual AN-37 restart to help resolve issues with some link
partners.

* '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ixgbe: Manual AN-37 for troublesome link partners for X550 SFI
  ixgbe: Don't call kmap() on page allocated with GFP_ATOMIC
====================

Link: https://lore.kernel.org/r/20220818223402.1294091-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next...
Jakub Kicinski [Tue, 23 Aug 2022 03:10:49 +0000 (20:10 -0700)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-08-18 (ice)

This series contains updates to ice driver only.

Jesse and Anatolii add support for controlling FCS/CRC stripping via
ethtool.

Anirudh allows for 100M speeds on devices which support it.

Sylwester removes ucast_shared field and the associated dead code related
to it.

Mikael removes non-inclusive language from the driver.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ice: remove non-inclusive language
  ice: Remove ucast_shared
  ice: Allow 100M speeds for some devices
  ice: Implement FCS/CRC and VLAN stripping co-existence policy
  ice: Implement control of FCS/CRC stripping
====================

Link: https://lore.kernel.org/r/20220818155207.996297-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: tag_8021q: remove old comment regarding dsa_8021q_netdev_ops
Vladimir Oltean [Thu, 18 Aug 2022 14:38:08 +0000 (17:38 +0300)]
net: dsa: tag_8021q: remove old comment regarding dsa_8021q_netdev_ops

Since commit 129bd7ca8ac0 ("net: dsa: Prevent usage of NET_DSA_TAG_8021Q
as tagging protocol"), dsa_8021q_netdev_ops no longer exists, so remove
the comment that talks about it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220818143808.2808393-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet_sched: move from strlcpy with unused retval to strscpy
Wolfram Sang [Thu, 18 Aug 2022 21:02:28 +0000 (23:02 +0200)]
net_sched: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20220818210228.8635-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoopenvswitch: move from strlcpy with unused retval to strscpy
Wolfram Sang [Thu, 18 Aug 2022 21:02:25 +0000 (23:02 +0200)]
openvswitch: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20220818210226.8587-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoethtool: move from strlcpy with unused retval to strscpy
Wolfram Sang [Thu, 18 Aug 2022 21:02:18 +0000 (23:02 +0200)]
ethtool: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20220818210218.8443-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodsa: move from strlcpy with unused retval to strscpy
Wolfram Sang [Thu, 18 Aug 2022 21:02:16 +0000 (23:02 +0200)]
dsa: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220818210216.8419-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: move from strlcpy with unused retval to strscpy
Wolfram Sang [Thu, 18 Aug 2022 21:02:15 +0000 (23:02 +0200)]
net: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20220818210215.8395-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agopacket: move from strlcpy with unused retval to strscpy
Wolfram Sang [Thu, 18 Aug 2022 21:02:27 +0000 (23:02 +0200)]
packet: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20220818210227.8611-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agol2tp: move from strlcpy with unused retval to strscpy
Wolfram Sang [Thu, 18 Aug 2022 21:02:21 +0000 (23:02 +0200)]
l2tp: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20220818210222.8515-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoipv6: move from strlcpy with unused retval to strscpy
Wolfram Sang [Thu, 18 Aug 2022 21:02:20 +0000 (23:02 +0200)]
ipv6: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20220818210220.8491-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoipv4: move from strlcpy with unused retval to strscpy
Wolfram Sang [Thu, 18 Aug 2022 21:02:19 +0000 (23:02 +0200)]
ipv4: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20220818210219.8467-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agocaif: move from strlcpy with unused retval to strscpy
Wolfram Sang [Thu, 18 Aug 2022 21:02:14 +0000 (23:02 +0200)]
caif: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20220818210214.8371-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agobridge: move from strlcpy with unused retval to strscpy
Wolfram Sang [Thu, 18 Aug 2022 21:02:12 +0000 (23:02 +0200)]
bridge: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20220818210212.8347-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoax25: move from strlcpy with unused retval to strscpy
Wolfram Sang [Thu, 18 Aug 2022 21:02:05 +0000 (23:02 +0200)]
ax25: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20220818210206.8299-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agovlan: move from strlcpy with unused retval to strscpy
Wolfram Sang [Thu, 18 Aug 2022 21:02:04 +0000 (23:02 +0200)]
vlan: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20220818210204.8275-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoisdn: move from strlcpy with unused retval to strscpy
Wolfram Sang [Thu, 18 Aug 2022 21:00:23 +0000 (23:00 +0200)]
isdn: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20220818210023.6889-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'validate-of-nodes-for-dsa-shared-ports'
Jakub Kicinski [Tue, 23 Aug 2022 00:45:51 +0000 (17:45 -0700)]
Merge branch 'validate-of-nodes-for-dsa-shared-ports'

Vladimir Oltean says:

====================
Validate OF nodes for DSA shared ports

This is the first set of measures taken so that more drivers can be
transitioned towards phylink on shared (CPU and DSA) ports some time in
the future. It consists of:

- expanding the DT schema for DSA and related drivers to clarify the new
  requirements.

- introducing warnings for drivers that currently skip phylink due to
  incomplete DT descriptions.

- introducing warning for drivers that currently skip phylink due to
  using platform data (search for struct dsa_chip_data).

- closing the possibility for new(ish) drivers to skip phylink, by
  validating their DT descriptions.

- making the code paths used by shared ports more evident.

- preparing the code paths used by shared ports for further work to fake
  a link description where that is possible.

More details in patch 10/10.

DT binding (patches 1-6) and kernel (7-10) are in principle separable,
but are submitted together since they're part of the same story.

Patches 8 and 9 are DSA cleanups, and patch 7 is a dependency for patch
10.

v1 at
https://patchwork.kernel.org/project/netdevbpf/patch/20220723164635.1621911-1-vladimir.oltean@nxp.com/

v2 at
https://patchwork.kernel.org/project/netdevbpf/patch/20220729132119.1191227-5-vladimir.oltean@nxp.com/

v3 at
https://patchwork.kernel.org/project/netdevbpf/cover/20220806141059.2498226-1-vladimir.oltean@nxp.com/
====================

Link: https://lore.kernel.org/r/20220818115500.2592578-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: make phylink-related OF properties mandatory on DSA and CPU ports
Vladimir Oltean [Thu, 18 Aug 2022 11:55:00 +0000 (14:55 +0300)]
net: dsa: make phylink-related OF properties mandatory on DSA and CPU ports

Early DSA drivers were kind of simplistic in that they assumed a fairly
narrow hardware layout. User ports would have integrated PHYs at an
internal MDIO address that is derivable from the port number, and shared
(DSA and CPU) ports would have an MII-style (serial or parallel)
connection to another MAC. Phylib and then phylink were used to drive
the internal PHYs, and this needed little to no description through the
platform data structures. Bringing up the shared ports at the maximum
supported link speed was the responsibility of the drivers.

As a result of this, when these early drivers were converted from
platform data to the new DSA OF bindings, there was no link information
translated into the first DT bindings.

https://lore.kernel.org/all/YtXFtTsf++AeDm1l@lunn.ch/

Later, phylink was adopted for shared ports as well, and today we have a
workaround in place, introduced by commit a20f997010c4 ("net: dsa: Don't
instantiate phylink for CPU/DSA ports unless needed"). There, DSA checks
for the presence of phy-handle/fixed-link/managed OF properties, and if
missing, phylink registration would be skipped. This is because phylink
is optional for some drivers (the shared ports already work without it),
but the process of starting to register a port with phylink is
irreversible: if phylink_create() fails to find the fwnode properties it
needs, it bails out and it leaves the ports inoperational (because
phylink expects ports to be initially down, so DSA necessarily takes
them down, and doesn't know how to put them back up again).

DSA being a common framework, new drivers opt into this workaround
willy-nilly, but the ideal behavior from the DSA core's side would have
been to not interfere with phylink's process of failing at all. This
isn't possible because of regression concerns with pre-phylink DT blobs,
but at least DSA should put a stop to the proliferation of more of such
cases that rely on the workaround to skip phylink registration, and
sanitize the environment that new drivers work in.

To that end, create a list of compatible strings for which the
workaround is preserved, and don't apply the workaround for any drivers
outside that list (this includes new drivers).

In some cases, we make the assumption that even existing drivers don't
rely on DSA's workaround, and we do this by looking at the device trees
in which they appear. We can't fully know what is the situation with
downstream DT blobs, but we can guess the overall trend by studying the
DT blobs that were submitted upstream. If there are upstream blobs that
have lacking descriptions, we take it as very likely that there are many
more downstream blobs that do so too. If all upstream blobs have
complete descriptions, we take that as a hint that the driver is a
candidate for enforcing strict DT bindings (considering that most
bindings are copy-pasted). If there are no upstream DT blobs, we take
the conservative route of allowing the workaround, unless the driver
maintainer instructs us otherwise.

The driver situation is as follows:

ar9331
~~~~~~

    compatible strings:
    - qca,ar9331-switch

    1 occurrence in mainline device trees, part of SoC dtsi
    (arch/mips/boot/dts/qca/ar9331.dtsi), description is not problematic.

    Verdict: opt into strict DT bindings and out of workarounds.

b53
~~~

    compatible strings:
    - brcm,bcm5325
    - brcm,bcm53115
    - brcm,bcm53125
    - brcm,bcm53128
    - brcm,bcm5365
    - brcm,bcm5389
    - brcm,bcm5395
    - brcm,bcm5397
    - brcm,bcm5398

    - brcm,bcm53010-srab
    - brcm,bcm53011-srab
    - brcm,bcm53012-srab
    - brcm,bcm53018-srab
    - brcm,bcm53019-srab
    - brcm,bcm5301x-srab
    - brcm,bcm11360-srab
    - brcm,bcm58522-srab
    - brcm,bcm58525-srab
    - brcm,bcm58535-srab
    - brcm,bcm58622-srab
    - brcm,bcm58623-srab
    - brcm,bcm58625-srab
    - brcm,bcm88312-srab
    - brcm,cygnus-srab
    - brcm,nsp-srab
    - brcm,omega-srab

    - brcm,bcm3384-switch
    - brcm,bcm6328-switch
    - brcm,bcm6368-switch
    - brcm,bcm63xx-switch

    I've found at least these mainline DT blobs with problems:

    arch/arm/boot/dts/bcm47094-linksys-panamera.dts
    - lacks phy-mode
    arch/arm/boot/dts/bcm47189-tenda-ac9.dts
    - lacks phy-mode and fixed-link
    arch/arm/boot/dts/bcm47081-luxul-xap-1410.dts
    arch/arm/boot/dts/bcm47081-luxul-xwr-1200.dts
    arch/arm/boot/dts/bcm47081-buffalo-wzr-600dhp2.dts
    - lacks phy-mode and fixed-link
    arch/arm/boot/dts/bcm47094-luxul-xbr-4500.dts
    arch/arm/boot/dts/bcm4708-smartrg-sr400ac.dts
    arch/arm/boot/dts/bcm4708-luxul-xap-1510.dts
    arch/arm/boot/dts/bcm953012er.dts
    arch/arm/boot/dts/bcm4708-netgear-r6250.dts
    arch/arm/boot/dts/bcm4708-buffalo-wzr-1166dhp-common.dtsi
    arch/arm/boot/dts/bcm4708-luxul-xwc-1000.dts
    arch/arm/boot/dts/bcm47094-luxul-abr-4500.dts
    - lacks phy-mode and fixed-link
    arch/arm/boot/dts/bcm53016-meraki-mr32.dts
    - lacks phy-mode

    Verdict: opt into DSA workarounds.

bcm_sf2
~~~~~~~

    compatible strings:
    - brcm,bcm4908-switch
    - brcm,bcm7445-switch-v4.0
    - brcm,bcm7278-switch-v4.0
    - brcm,bcm7278-switch-v4.8

    A single occurrence in mainline
    (arch/arm64/boot/dts/broadcom/bcm4908/bcm4908.dtsi), part of a SoC
    dtsi, valid description. Florian Fainelli explains that most of the
    bcm_sf2 device trees lack a full description for the internal IMP
    ports.

    Verdict: opt the BCM4908 into strict DT bindings, and opt the rest
    into the workarounds. Note that even though BCM4908 has strict DT
    bindings, it still does not register with phylink on the IMP port
    due to it implementing ->adjust_link().

hellcreek
~~~~~~~~~

    compatible strings:
    - hirschmann,hellcreek-de1soc-r1

    No occurrence in mainline device trees. Kurt Kanzenbach explains
    that the downstream device trees lacked phy-mode and fixed link, and
    needed work, but were fixed in the meantime.

    Verdict: opt into strict DT bindings and out of workarounds.

lan9303
~~~~~~~

    compatible strings:
    - smsc,lan9303-mdio
    - smsc,lan9303-i2c

    1 occurrence in mainline device trees:
    arch/arm/boot/dts/imx53-kp-hsc.dts
    - no phy-mode, no fixed-link

    Verdict: opt out of strict DT bindings and into workarounds.

lantiq_gswip
~~~~~~~~~~~~

    compatible strings:
    - lantiq,xrx200-gswip
    - lantiq,xrx300-gswip
    - lantiq,xrx330-gswip

    No occurrences in mainline device trees. Martin Blumenstingl
    confirms that the downstream OpenWrt device trees lack a proper
    fixed-link and need work, and that the incomplete description can
    even be seen in the example from
    Documentation/devicetree/bindings/net/dsa/lantiq-gswip.txt.

    Verdict: opt out of strict DT bindings and into workarounds.

microchip ksz
~~~~~~~~~~~~~

    compatible strings:
    - microchip,ksz8765
    - microchip,ksz8794
    - microchip,ksz8795
    - microchip,ksz8863
    - microchip,ksz8873
    - microchip,ksz9477
    - microchip,ksz9897
    - microchip,ksz9893
    - microchip,ksz9563
    - microchip,ksz8563
    - microchip,ksz9567
    - microchip,lan9370
    - microchip,lan9371
    - microchip,lan9372
    - microchip,lan9373
    - microchip,lan9374

    5 occurrences in mainline device trees, all descriptions are valid.
    But we had a snafu for the ksz8795 and ksz9477 drivers where the
    phy-mode property would be expected to be located directly under the
    'switch' node rather than under a port OF node. It was fixed by
    commit edecfa98f602 ("net: dsa: microchip: look for phy-mode in port
    nodes"). The driver still has compatibility with the old DT blobs.
    The lan937x support was added later than the above snafu was fixed,
    and even though it has support for the broken DT blobs by virtue of
    sharing a common probing function, I'll take it that its DT blobs
    are correct.

    Verdict: opt lan937x into strict DT bindings, and the others out.

mt7530
~~~~~~

    compatible strings
    - mediatek,mt7621
    - mediatek,mt7530
    - mediatek,mt7531

    Multiple occurrences in mainline device trees, one is part of an SoC
    dtsi (arch/mips/boot/dts/ralink/mt7621.dtsi), all descriptions are fine.

    Verdict: opt into strict DT bindings and out of workarounds.

mv88e6060
~~~~~~~~~

    compatible string:
    - marvell,mv88e6060

    no occurrences in mainline, nobody knows anybody who uses it.

    Verdict: opt out of strict DT bindings and into workarounds.

mv88e6xxx
~~~~~~~~~

    compatible strings:
    - marvell,mv88e6085
    - marvell,mv88e6190
    - marvell,mv88e6250

    Device trees that have incomplete descriptions of CPU or DSA ports:
    arch/arm64/boot/dts/freescale/imx8mq-zii-ultra.dtsi
    - lacks phy-mode
    arch/arm64/boot/dts/marvell/cn9130-crb.dtsi
    - lacks phy-mode and fixed-link
    arch/arm/boot/dts/vf610-zii-ssmb-spu3.dts
    - lacks phy-mode
    arch/arm/boot/dts/kirkwood-mv88f6281gtw-ge.dts
    - lacks phy-mode
    arch/arm/boot/dts/vf610-zii-spb4.dts
    - lacks phy-mode
    arch/arm/boot/dts/vf610-zii-cfu1.dts
    - lacks phy-mode
    arch/arm/boot/dts/vf610-zii-dev-rev-c.dts
    - lacks phy-mode on CPU port, fixed-link on DSA ports
    arch/arm/boot/dts/vf610-zii-dev-rev-b.dts
    - lacks phy-mode on CPU port
    arch/arm/boot/dts/armada-381-netgear-gs110emx.dts
    - lacks phy-mode
    arch/arm/boot/dts/vf610-zii-scu4-aib.dts
    - lacks fixed-link on xgmii DSA ports and/or in-band-status on
      2500base-x DSA ports, and phy-mode on CPU port
    arch/arm/boot/dts/imx6qdl-gw5904.dtsi
    - lacks phy-mode and fixed-link
    arch/arm/boot/dts/armada-385-clearfog-gtr-l8.dts
    - lacks phy-mode and fixed-link
    arch/arm/boot/dts/vf610-zii-ssmb-dtu.dts
    - lacks phy-mode
    arch/arm/boot/dts/kirkwood-dir665.dts
    - lacks phy-mode
    arch/arm/boot/dts/kirkwood-rd88f6281.dtsi
    - lacks phy-mode
    arch/arm/boot/dts/orion5x-netgear-wnr854t.dts
    - lacks phy-mode and fixed-link
    arch/arm/boot/dts/armada-388-clearfog.dts
    - lacks phy-mode
    arch/arm/boot/dts/armada-xp-linksys-mamba.dts
    - lacks phy-mode
    arch/arm/boot/dts/armada-385-linksys.dtsi
    - lacks phy-mode
    arch/arm/boot/dts/imx6q-b450v3.dts
    arch/arm/boot/dts/imx6q-b850v3.dts
    - has a phy-handle but not a phy-mode?
    arch/arm/boot/dts/armada-370-rd.dts
    - lacks phy-mode
    arch/arm/boot/dts/kirkwood-linksys-viper.dts
    - lacks phy-mode
    arch/arm/boot/dts/imx51-zii-rdu1.dts
    - lacks phy-mode
    arch/arm/boot/dts/imx51-zii-scu2-mezz.dts
    - lacks phy-mode
    arch/arm/boot/dts/imx6qdl-zii-rdu2.dtsi
    - lacks phy-mode
    arch/arm/boot/dts/armada-385-clearfog-gtr-s4.dts
    - lacks phy-mode and fixed-link

    Verdict: opt out of strict DT bindings and into workarounds.

ocelot
~~~~~~

    compatible strings:
    - mscc,vsc9953-switch
    - felix (arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi) is a PCI
      device, has no compatible string

    2 occurrences in mainline, both are part of SoC dtsi and complete.

    Verdict: opt into strict DT bindings and out of workarounds.

qca8k
~~~~~

    compatible strings:
    - qca,qca8327
    - qca,qca8328
    - qca,qca8334
    - qca,qca8337

    5 occurrences in mainline device trees, none of the descriptions are
    problematic.

    Verdict: opt into strict DT bindings and out of workarounds.

realtek
~~~~~~~

    compatible strings:
    - realtek,rtl8366rb
    - realtek,rtl8365mb

    2 occurrences in mainline, both descriptions are fine, additionally
    rtl8365mb.c has a comment "The device tree firmware should also
    specify the link partner of the extension port - either via a
    fixed-link or other phy-handle."

    Verdict: opt into strict DT bindings and out of workarounds.

rzn1_a5psw
~~~~~~~~~~

    compatible strings:
    - renesas,rzn1-a5psw

    One single occurrence, part of SoC dtsi
    (arch/arm/boot/dts/r9a06g032.dtsi), description is fine.

    Verdict: opt into strict DT bindings and out of workarounds.

sja1105
~~~~~~~

    Driver already validates its port OF nodes in
    sja1105_parse_ports_node().

    Verdict: opt into strict DT bindings and out of workarounds.

vsc73xx
~~~~~~~

    compatible strings:
    - vitesse,vsc7385
    - vitesse,vsc7388
    - vitesse,vsc7395
    - vitesse,vsc7398

    2 occurrences in mainline device trees, both descriptions are fine.

    Verdict: opt into strict DT bindings and out of workarounds.

xrs700x
~~~~~~~

    compatible strings:
    - arrow,xrs7003e
    - arrow,xrs7003f
    - arrow,xrs7004e
    - arrow,xrs7004f

    no occurrences in mainline, we don't know.

    Verdict: opt out of strict DT bindings and into workarounds.

Because there is a pattern where newly added switches reuse existing
drivers more often than introducing new ones, I've opted for deciding
who gets to opt into the workaround based on an OF compatible match
table in the DSA core. The alternative would have been to add another
boolean property to struct dsa_switch, like configure_vlan_while_not_filtering.
But this avoids situations where sometimes driver maintainers obfuscate
what goes on by sharing a common probing function, and therefore making
new switches inherit old quirks.

Side note, we also warn about missing properties for drivers that rely
on the workaround. This isn't an indication that we'll break
compatibility with those DT blobs any time soon, but is rather done to
raise awareness about the change, for future DT blob authors.

Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Acked-by: Alvin Å ipraga <alsi@bang-olufsen.dk> # realtek
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: rename dsa_port_link_{,un}register_of
Vladimir Oltean [Thu, 18 Aug 2022 11:54:59 +0000 (14:54 +0300)]
net: dsa: rename dsa_port_link_{,un}register_of

There is a subset of functions that applies only to shared (DSA and CPU)
ports, yet this is difficult to comprehend by looking at their code alone.
These are dsa_port_link_register_of(), dsa_port_link_unregister_of(),
and the functions that only these 2 call.

Rename this class of functions to dsa_shared_port_* to make this fact
more evident, even if this goes against the apparent convention that
function names in port.c must start with dsa_port_.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: avoid dsa_port_link_{,un}register_of() calls with platform data
Vladimir Oltean [Thu, 18 Aug 2022 11:54:58 +0000 (14:54 +0300)]
net: dsa: avoid dsa_port_link_{,un}register_of() calls with platform data

dsa_port_link_register_of() and dsa_port_link_unregister_of() are not
written with the fact in mind that they can be called with a dp->dn that
is NULL (as evidenced even by the _of suffix in their name), but this is
exactly what happens.

How this behaves will differ depending on whether the backing driver
implements ->adjust_link() or not.

If it doesn't, the "if (of_phy_is_fixed_link(dp->dn) || phy_np)"
condition will return false, and dsa_port_link_register_of() will do
nothing and return 0.

If the driver does implement ->adjust_link(), the
"if (of_phy_is_fixed_link(dp->dn))" condition will return false
(dp->dn is NULL) and we will call dsa_port_setup_phy_of(). This will
call dsa_port_get_phy_device(), which will also return NULL, and we will
also do nothing and return 0.

It is hard to maintain this code and make future changes to it in this
state, so just suppress calls to these 2 functions if dp->dn is NULL.
The only functional effect is that if the driver does implement
->adjust_link(), we'll stop printing this to the console:

Using legacy PHYLIB callbacks. Please migrate to PHYLINK!

but instead we'll always print:

[    8.539848] dsa-loop fixed-0:1f: skipping link registration for CPU port 5

This is for the better anyway, since "using legacy phylib callbacks"
was misleading information - we weren't issuing _any_ callbacks due to
dsa_port_get_phy_device() returning NULL.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoof: base: export of_device_compatible_match() for use in modules
Vladimir Oltean [Thu, 18 Aug 2022 11:54:57 +0000 (14:54 +0300)]
of: base: export of_device_compatible_match() for use in modules

Modules such as net/dsa/dsa_core.ko might want to iterate through an
array of compatible strings for things such as validation (or rather,
skipping it for some potentially broken drivers).

of_device_is_compatible() is exported, by of_device_compatible_match()
isn't. Export the latter as well, so we don't have to open-code the
iteration.

Cc: Frank Rowand <frowand.list@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodt-bindings: net: dsa: make phylink bindings required for CPU/DSA ports
Vladimir Oltean [Thu, 18 Aug 2022 11:54:56 +0000 (14:54 +0300)]
dt-bindings: net: dsa: make phylink bindings required for CPU/DSA ports

It is desirable that new DSA drivers are written to expect that all
their ports register with phylink, and not rely on the DSA core's
workarounds to skip this process.

To that end, DSA is being changed to warn existing drivers when such DT
blobs are in use, and to opt new drivers out of the workarounds.

Introduce another layer of validation in the DSA DT schema, and assert
that CPU and DSA ports must have phylink-related properties present.

Suggested-by: Rob Herring <robh+dt@kernel.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodt-bindings: net: dsa: rzn1-a5psw: add missing CPU port phy-mode to example
Vladimir Oltean [Thu, 18 Aug 2022 11:54:55 +0000 (14:54 +0300)]
dt-bindings: net: dsa: rzn1-a5psw: add missing CPU port phy-mode to example

To prevent warnings during "make dt_bindings_check" after dsa-port.yaml
will make phylink properties mandatory, add phy-mode = "internal" to the
example.

This new property is taken straight out of the SoC dtsi at
arch/arm/boot/dts/r9a06g032.dtsi, so it seems likely that only the
example needs to be fixed, rather than DT blobs in circulation.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodt-bindings: net: dsa: microchip: add missing CPU port phy-mode to example
Vladimir Oltean [Thu, 18 Aug 2022 11:54:54 +0000 (14:54 +0300)]
dt-bindings: net: dsa: microchip: add missing CPU port phy-mode to example

The ksz_switch_chips[] element for KSZ9477 says that port 5 is an xMII
port and it supports speeds of 10/100/1000. The device tree example does
declare a fixed-link at 1000, and RGMII is the only one of those modes
that supports this speed, so use this phy-mode.

The microchip,ksz8565 compatible string is not supported by the
microchip driver, however on Microchip's product page it says that there
are 5 ports, 4 of which have internal PHYs and the 5th is an
MII/RMII/RGMII port. It's a bit strange that this is port@6, but it is
probably just the way it is. Select an RGMII phy-mode for this one as
well.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodt-bindings: net: dsa: b53: add missing CPU port phy-mode to example
Vladimir Oltean [Thu, 18 Aug 2022 11:54:53 +0000 (14:54 +0300)]
dt-bindings: net: dsa: b53: add missing CPU port phy-mode to example

Looking at b53_srab_phylink_get_caps() I get no indication of what PHY
modes does port 8 support, since it is implemented circularly based on
the p->mode retrieved from the device tree (and in PHY_INTERFACE_MODE_NA
it reports nothing to supported_interfaces).

However if I look at the b53_switch_chips[] element for BCM58XX_DEVICE_ID,
I see that port 8 is the IMP port, and SRAB means the IMP port is
internal to the SoC. So use phy-mode = "internal" in the example.

Note that this will make b53_srab_phylink_get_caps() go through the
"default" case and report PHY_INTERFACE_MODE_INTERNAL to
supported_interfaces, which is probably a good thing.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodt-bindings: net: dsa: hellcreek: add missing CPU port phy-mode/fixed-link to example
Vladimir Oltean [Thu, 18 Aug 2022 11:54:52 +0000 (14:54 +0300)]
dt-bindings: net: dsa: hellcreek: add missing CPU port phy-mode/fixed-link to example

Looking at hellcreek_phylink_get_caps(), I see that depending on whether
is_100_mbits is set, speeds of 1G or of 100M will be advertised. The
de1soc_r1_pdata sets is_100_mbits to true.

The PHY modes declared in the capabilities are MII, RGMII and GMII. GMII
doesn't support 100Mbps, and as for RGMII, it would be a bit implausible
to me to support this PHY mode but limit it to only 25 MHz. So I've
settled on MII as a phy-mode in the example, and a fixed-link of
100Mbps.

As a side note, there exists such a thing as "rev-mii", because the MII
protocol is asymmetric, and "mii" is the designation for the MAC side
(expected to be connected to a PHY), and "rev-mii" is the designation
for the PHY side (expected to be connected to a MAC). I wonder whether
"mii" or "rev-mii" should actually be used here, since this is a CPU
port and presumably connected to another MAC.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodt-bindings: net: dsa: xrs700x: add missing CPU port phy-mode to example
Vladimir Oltean [Thu, 18 Aug 2022 11:54:51 +0000 (14:54 +0300)]
dt-bindings: net: dsa: xrs700x: add missing CPU port phy-mode to example

Judging by xrs700x_phylink_get_caps(), I deduce that this switch
supports the RGMII modes on port 3, so state this phy-mode in the
example, such that users are encouraged to not rely on avoiding phylink
for this port.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: prestera: cache port state for non-phylink ports too
Maksym Glubokiy [Thu, 18 Aug 2022 11:18:21 +0000 (14:18 +0300)]
net: prestera: cache port state for non-phylink ports too

Port event data must stored to port-state cache regardless of whether
the port uses phylink or not since this data is used by ethtool.

Fixes: 52323ef75414 ("net: marvell: prestera: add phylink support")
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: freescale: xgmac: Do not dereference fwnode in struct device
zhaoxiao [Thu, 18 Aug 2022 09:50:59 +0000 (17:50 +0800)]
net: freescale: xgmac: Do not dereference fwnode in struct device

In order to make the underneath API easier to change in the future,
prevent users from dereferencing fwnode from struct device.
Instead, use the specific dev_fwnode() API for that.

Signed-off-by: zhaoxiao <zhaoxiao@uniontech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoRemove DECnet support from kernel
Stephen Hemminger [Thu, 18 Aug 2022 00:43:21 +0000 (17:43 -0700)]
Remove DECnet support from kernel

DECnet is an obsolete network protocol that receives more attention
from kernel janitors than users. It belongs in computer protocol
history museum not in Linux kernel.

It has been "Orphaned" in kernel since 2010. The iproute2 support
for DECnet was dropped in 5.0 release. The documentation link on
Sourceforge says it is abandoned there as well.

Leave the UAPI alone to keep userspace programs compiling.
This means that there is still an empty neighbour table
for AF_DECNET.

The table of /proc/sys/net entries was updated to match
current directories and reformatted to be alphabetical.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: David Ahern <dsahern@kernel.org>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'lan966x-lag-support'
David S. Miller [Mon, 22 Aug 2022 13:00:54 +0000 (14:00 +0100)]
Merge branch 'lan966x-lag-support'

Horatiu Vultur says:

====================
net: lan966x: Add lag support

Add lag support for lan966x.
First 4 patches don't do any changes to the current behaviour, they
just prepare for lag support. While the rest is to add the lag support.

v3->v4:
- aggregation configuration is global for all bonds, so make sure that
  there can't be enabled multiple configurations at the same time
- return error faster from lan966x_foreign_bridging_check, don't
  continue the search if the error is seen already
- flush fdb workqueue when a port leaves a bridge or lag.

v2->v3:
- return error code from 'switchdev_bridge_port_offload()'
- fix lan966x_foreign_dev_check(), it was missing lag support
- remove lan966x_lag_mac_add_entry and lan966x_mac_del_entry as
  they are not needed
- fix race conditions when accessing port->bond
- move FDB entries when a new port joins the lag if it has a lower

v1->v2:
- fix the LAG PGIDs when ports go down, in this way is not
  needed anymore the last patch of the series.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Extend MAC to support also lag interfaces.
Horatiu Vultur [Wed, 17 Aug 2022 19:34:49 +0000 (21:34 +0200)]
net: lan966x: Extend MAC to support also lag interfaces.

Extend MAC support to support also lag interfaces:
1. In case an entry is learned on a port that is part of lag interface,
   then notify the upper layers that the entry is learned on the bond
   interface
2. If a port leaves the bond and the port is the first port in the lag
   group, then it is required to update all MAC entries to change the
   destination port.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Extend FDB to support also lag
Horatiu Vultur [Wed, 17 Aug 2022 19:34:48 +0000 (21:34 +0200)]
net: lan966x: Extend FDB to support also lag

Offload FDB entries when the original device is a lag interface. Because
all the ports under the lag have the same chip id, which is the chip id
of first port, then add the entries only for the first port.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Add lag support for lan966x
Horatiu Vultur [Wed, 17 Aug 2022 19:34:47 +0000 (21:34 +0200)]
net: lan966x: Add lag support for lan966x

Add link aggregation hardware offload support for lan966x

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Extend lan966x_foreign_bridging_check
Horatiu Vultur [Wed, 17 Aug 2022 19:34:46 +0000 (21:34 +0200)]
net: lan966x: Extend lan966x_foreign_bridging_check

Extend lan966x_foreign_bridging_check to check also if the upper
interface is a lag device. Don't allow a lan966x port to be part of a
lag if it has foreign interfaces.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Expose lan966x_switchdev_nb and lan966x_switchdev_blocking_nb
Horatiu Vultur [Wed, 17 Aug 2022 19:34:45 +0000 (21:34 +0200)]
net: lan966x: Expose lan966x_switchdev_nb and lan966x_switchdev_blocking_nb

Expose lan966x_switchdev_nb and lan966x_switchdev_blocking_nb to the
lan966x_main.h file because they will be needed by the lag driver.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Flush fdb workqueue when port is leaving a bridge.
Horatiu Vultur [Wed, 17 Aug 2022 19:34:44 +0000 (21:34 +0200)]
net: lan966x: Flush fdb workqueue when port is leaving a bridge.

Whenever a port leaves a bridge, flush the workqueue of the FDB work.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Split lan966x_fdb_event_work
Horatiu Vultur [Wed, 17 Aug 2022 19:34:43 +0000 (21:34 +0200)]
net: lan966x: Split lan966x_fdb_event_work

Split the function lan966x_fdb_event_work. One case for when the
orig_dev is a bridge and one case when orig_dev is lan966x port.
This is preparation for lag support. There is no functional change.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Add registers used to configure lag interfaces
Horatiu Vultur [Wed, 17 Aug 2022 19:34:42 +0000 (21:34 +0200)]
net: lan966x: Add registers used to configure lag interfaces

Add the registers used by lan966x to configure the lag interface.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'tsnep-minor-improvements'
David S. Miller [Mon, 22 Aug 2022 12:56:37 +0000 (13:56 +0100)]
Merge branch 'tsnep-minor-improvements'

Gerhard Engleder says:

====================
tsnep: Various minor driver improvements

During XDP development some general driver improvements has been done
which I want to keep out of future patch series.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotsnep: Record RX queue
Gerhard Engleder [Wed, 17 Aug 2022 19:30:17 +0000 (21:30 +0200)]
tsnep: Record RX queue

Other drivers record RX queue so it should make sense to do that also.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotsnep: Support full DMA mask
Gerhard Engleder [Wed, 17 Aug 2022 19:30:16 +0000 (21:30 +0200)]
tsnep: Support full DMA mask

DMA addresses up to 64bit are supported by the device. Configure DMA
mask according to the capabilities of the device.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotsnep: Improve TX length handling
Gerhard Engleder [Wed, 17 Aug 2022 19:30:15 +0000 (21:30 +0200)]
tsnep: Improve TX length handling

TX length can by calculated more efficient during map and unmap of
fragments. Another reason is that, by moving TX statistic counting to
tsnep_tx_poll() it can be used there for XDP too.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotsnep: Add loopback support
Gerhard Engleder [Wed, 17 Aug 2022 19:30:14 +0000 (21:30 +0200)]
tsnep: Add loopback support

Add support for NETIF_F_LOOPBACK feature. Loopback mode is used for
testing.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotsnep: Fix TSNEP_INFO_TX_TIME register define
Gerhard Engleder [Wed, 17 Aug 2022 19:30:13 +0000 (21:30 +0200)]
tsnep: Fix TSNEP_INFO_TX_TIME register define

Fixed register define is not used, but register definition shall be kept
in sync.

Fixes: 403f69bbdbad ("tsnep: Add TSN endpoint Ethernet MAC driver")
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoopenvswitch: Fix overreporting of drops in dropwatch
Mike Pattrick [Wed, 17 Aug 2022 15:06:35 +0000 (11:06 -0400)]
openvswitch: Fix overreporting of drops in dropwatch

Currently queue_userspace_packet will call kfree_skb for all frames,
whether or not an error occurred. This can result in a single dropped
frame being reported as multiple drops in dropwatch. This functions
caller may also call kfree_skb in case of an error. This patch will
consume the skbs instead and allow caller's to use kfree_skb.

Signed-off-by: Mike Pattrick <mkp@redhat.com>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2109957
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoopenvswitch: Fix double reporting of drops in dropwatch
Mike Pattrick [Wed, 17 Aug 2022 15:06:34 +0000 (11:06 -0400)]
openvswitch: Fix double reporting of drops in dropwatch

Frames sent to userspace can be reported as dropped in
ovs_dp_process_packet, however, if they are dropped in the netlink code
then netlink_attachskb will report the same frame as dropped.

This patch checks for error codes which indicate that the frame has
already been freed.

Signed-off-by: Mike Pattrick <mkp@redhat.com>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2109946
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'net-phy-QUSGMII'
David S. Miller [Mon, 22 Aug 2022 12:46:26 +0000 (13:46 +0100)]
Merge branch 'net-phy-QUSGMII'

Maxime Chevallier says:

====================
net: Introduce QUSGMII phy mode

Re-sending, since the previous v4 was sent while net-next was closed.

This is a resend of the V4 of a previous series [1] initially aimed at
introducing inband extensions, with modes like QUSGMII. This mode allows
passing info in the ethernet preamble between the MAC and the PHY, such as
timestamps.

This series has now become a preliminary series, that simply introduces
the new interface mode, without support for inband extensions, that will
come later.

The reasonning is that work will need to be done in the networking
subsystem, but also in the generic phy driver subsystem to allow serdes
configuration for qusgmii.

This series add the mode, the relevant binding changes, adds support for
it in the lan966x driver, and also introduces a small helper to get the
number of links a given phy mode can carry (think 1 for SGMII and 4 for
QSGMII). This allows for better readability and will prove useful
when (if) we support PSGMII (5 links on 1 interface) and OUSGMII (8
links on one interface).

V4 contains no change but the collected Reviewed-by from Andrew.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: lan966x: Add QUSGMII support for lan966x
Maxime Chevallier [Wed, 17 Aug 2022 12:32:55 +0000 (14:32 +0200)]
net: lan966x: Add QUSGMII support for lan966x

The Lan996x controller supports the QUSGMII mode, which is very similar
to QSGMII in the way it's configured and the autonegociation
capababilities it provides.

This commit adds support for that mode, treating it most of the time
like QSGMII, making sure that we do configure the PCS how we should.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: phy: Add helper to derive the number of ports from a phy mode
Maxime Chevallier [Wed, 17 Aug 2022 12:32:54 +0000 (14:32 +0200)]
net: phy: Add helper to derive the number of ports from a phy mode

Some phy modes such as QSGMII multiplex several MAC<->PHY links on one
single physical interface. QSGMII used to be the only one supported, but
other modes such as QUSGMII also carry multiple links.

This helper allows getting the number of links that are multiplexed
on a given interface.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodt-bindings: net: ethernet-controller: add QUSGMII mode
Maxime Chevallier [Wed, 17 Aug 2022 12:32:53 +0000 (14:32 +0200)]
dt-bindings: net: ethernet-controller: add QUSGMII mode

Add a new QUSGMII mode, standing for "Quad Universal Serial Gigabit
Media Independent Interface", a derivative of USGMII which, similarly to
QSGMII, allows to multiplex 4 1Gbps links to a Quad-PHY.

The main difference with QSGMII is that QUSGMII can include an extension
instead of the standard 7bytes ethernet preamble, allowing to convey
arbitrary data such as Timestamps.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: phy: Introduce QUSGMII PHY mode
Maxime Chevallier [Wed, 17 Aug 2022 12:32:52 +0000 (14:32 +0200)]
net: phy: Introduce QUSGMII PHY mode

The QUSGMII mode is a derivative of Cisco's USXGMII standard. This
standard is pretty similar to SGMII, but allows for faster speeds, and
has the build-in bits for Quad and Octa variants (like QSGMII).

The main difference with SGMII/QSGMII is that USXGMII/QUSGMII re-uses
the preamble to carry various information, named 'Extensions'.

As of today, the USXGMII standard only mentions the "PCH" extension,
which is used to convey timestamps, allowing in-band signaling of PTP
timestamps without having to modify the frame itself.

This commit adds support for that mode. When no extension is in use, it
behaves exactly like QSGMII, although it's not compatible with QSGMII.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ethernet: ti: davinci_mdio: Add workaround for errata i2329
Ravi Gunasekaran [Wed, 17 Aug 2022 09:44:06 +0000 (15:14 +0530)]
net: ethernet: ti: davinci_mdio: Add workaround for errata i2329

On the CPSW and ICSS peripherals, there is a possibility that the MDIO
interface returns corrupt data on MDIO reads or writes incorrect data
on MDIO writes. There is also a possibility for the MDIO interface to
become unavailable until the next peripheral reset.

The workaround is to configure the MDIO in manual mode and disable the
MDIO state machine and emulate the MDIO protocol by reading and writing
appropriate fields in MDIO_MANUAL_IF_REG register of the MDIO controller
to manipulate the MDIO clock and data pins.

More details about the errata i2329 and the workaround is available in:
https://www.ti.com/lit/er/sprz487a/sprz487a.pdf

Add implementation to disable MDIO state machine, configure MDIO in manual
mode and achieve MDIO read and writes via MDIO Bitbanging

Signed-off-by: Ravi Gunasekaran <r-gunasekaran@ti.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: phy: realtek: add support for RTL8211F(D)(I)-VD-CG
Clark Wang [Wed, 17 Aug 2022 01:36:18 +0000 (09:36 +0800)]
net: phy: realtek: add support for RTL8211F(D)(I)-VD-CG

RTL8211F(D)(I)-VD-CG is the pin-to-pin upgrade chip from
RTL8211F(D)(I)-CG.

Add new PHY ID for this chip.
It does not support RTL8211F_PHYCR2 anymore, so remove the w/r operation
of this register.

Signed-off-by: Clark Wang <xiaoning.wang@nxp.com>
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoaf_unix: Show number of inflight fds for sockets in TCP_LISTEN state too
Kirill Tkhai [Tue, 16 Aug 2022 21:51:54 +0000 (00:51 +0300)]
af_unix: Show number of inflight fds for sockets in TCP_LISTEN state too

TCP_LISTEN sockets is a special case. They preserve skb with a newly
connected sock till accept() makes it fully functional socket.
Receive queue of such socket may grow after connected peer
send messages there. Since these messages may contain scm_fds,
we should expose correct fdinfo::scm_fds for listening socket too.

Signed-off-by: Kirill Tkhai <tkhai@ya.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: prestera: add missing ABI compatibility check
Maksym Glubokiy [Thu, 18 Aug 2022 11:14:19 +0000 (14:14 +0300)]
net: prestera: add missing ABI compatibility check

Size-check a type used for FW communication is packed as expected.

Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu>
Link: https://lore.kernel.org/r/20220818111419.414877-1-maksym.glubokiy@plvision.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoamt: remove unnecessary skb pointer check
Yang Yingliang [Thu, 18 Aug 2022 09:31:14 +0000 (17:31 +0800)]
amt: remove unnecessary skb pointer check

The skb pointer will be checked in kfree_skb(), so remove the outside check.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Taehee Yoo <ap420073@gmail.com>
Link: https://lore.kernel.org/r/20220818093114.2449179-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoselftests/net: test l2 tunnel TOS/TTL inheriting
Matthias May [Wed, 17 Aug 2022 07:36:49 +0000 (09:36 +0200)]
selftests/net: test l2 tunnel TOS/TTL inheriting

There are currently 3 ip tunnels that are capable of carrying
L2 traffic: gretap, vxlan and geneve.
They all are capable to inherit the TOS/TTL for the outer
IP-header from the inner frame.

Add a test that verifies that these fields are correctly inherited.

These tests failed before the following commits:
b09ab9c92e50 ("ip6_tunnel: allow to inherit from VLAN encapsulated IP")
3f8a8447fd0b ("ip6_gre: use actual protocol to select xmit")
41337f52b967 ("ip6_gre: set DSCP for non-IP")
7ae29fd1be43 ("ip_tunnel: allow to inherit from VLAN encapsulated IP")
7074732c8fae ("ip_tunnels: allow VXLAN/GENEVE to inherit TOS/TTL from VLAN")
ca2bb69514a8 ("geneve: do not use RT_TOS for IPv6 flowlabel")
b4ab94d6adaa ("geneve: fix TOS inheriting for ipv4")

Signed-off-by: Matthias May <matthias.may@westermo.com>
Link: https://lore.kernel.org/r/20220817073649.26117-1-matthias.may@westermo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'net-dpaa-cleanups-in-preparation-for-phylink-conversion'
Jakub Kicinski [Fri, 19 Aug 2022 23:36:06 +0000 (16:36 -0700)]
Merge branch 'net-dpaa-cleanups-in-preparation-for-phylink-conversion'

Sean Anderson says:

====================
net: dpaa: Cleanups in preparation for phylink conversion

This series contains several cleanup patches for dpaa/fman. While they
are intended to prepare for a phylink conversion, they stand on their
own. This series was originally submitted as part of [1].

[1] https://lore.kernel.org/netdev/20220715215954.1449214-1-sean.anderson@seco.com
====================

Link: https://lore.kernel.org/r/20220818161649.2058728-1-sean.anderson@seco.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fman: memac: Use params instead of priv for max_speed
Sean Anderson [Thu, 18 Aug 2022 16:16:35 +0000 (12:16 -0400)]
net: fman: memac: Use params instead of priv for max_speed

This option is present in params, so use it instead of the fman private
version.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fman: Export/rename some common functions
Sean Anderson [Thu, 18 Aug 2022 16:16:34 +0000 (12:16 -0400)]
net: fman: Export/rename some common functions

In preparation for moving each of the initialization functions to their
own file, export some common functions so they can be re-used. This adds
an fman prefix to set_multi to make it a bit less genericly-named.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fman: Configure fixed link in memac_initialization
Sean Anderson [Thu, 18 Aug 2022 16:16:33 +0000 (12:16 -0400)]
net: fman: Configure fixed link in memac_initialization

memac is the only mac which parses fixed links. Move the
parsing/configuring to its initialization function.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fman: Move struct dev to mac_device
Sean Anderson [Thu, 18 Aug 2022 16:16:32 +0000 (12:16 -0400)]
net: fman: Move struct dev to mac_device

Move the reference to our device to mac_device. This way, macs can use
it in their log messages.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fman: Store initialization function in match data
Sean Anderson [Thu, 18 Aug 2022 16:16:31 +0000 (12:16 -0400)]
net: fman: Store initialization function in match data

Instead of re-matching the compatible string in order to determine the init
function, just store it in the match data. The separate setup functions
aren't needed anymore. Merge their content into init as well. To ensure
everything compiles correctly, we move them to the bottom of the file.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fman: Get PCS node in per-mac init
Sean Anderson [Thu, 18 Aug 2022 16:16:30 +0000 (12:16 -0400)]
net: fman: Get PCS node in per-mac init

This moves the reading of the PCS property out of the generic probe and
into the mac-specific initialization function. This reduces the
mac-specific jobs done in the top-level probe function.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fman: dtsec: Always gracefully stop/start
Sean Anderson [Thu, 18 Aug 2022 16:16:29 +0000 (12:16 -0400)]
net: fman: dtsec: Always gracefully stop/start

There are two ways that GRS can be set: graceful_stop and dtsec_isr. It
is cleared by graceful_start. If it is already set before calling
graceful_stop, then that means that dtsec_isr set it. In that case, we
will not set GRS nor will we clear it (which seems like a bug?). For GTS
the logic is similar, except that there is no one else messing with this
bit (so we will always set and clear it). Simplify the logic by always
setting/clearing GRS/GTS. This is less racy that the previous behavior,
and ensures that we always end up clearing the bits. This can of course
clear GRS while dtsec_isr is waiting, but because we have already done
our own waiting it should be fine.

This is the last user of enum comm_mode, so remove it.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Tested-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fman: Store en/disable in mac_device instead of mac_priv_s
Sean Anderson [Thu, 18 Aug 2022 16:16:28 +0000 (12:16 -0400)]
net: fman: Store en/disable in mac_device instead of mac_priv_s

All macs use the same start/stop functions. The actual mac-specific code
lives in enable/disable. Move these functions to an appropriate struct,
and inline the phy enable/disable calls to the caller of start/stop.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Tested-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fman: Don't pass comm_mode to enable/disable
Sean Anderson [Thu, 18 Aug 2022 16:16:27 +0000 (12:16 -0400)]
net: fman: Don't pass comm_mode to enable/disable

mac_priv_s->enable() and ->disable() are always called with
a comm_mode of COMM_MODE_RX_AND_TX. Remove this parameter, and refactor
the macs appropriately.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Tested-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: fman: Convert to SPDX identifiers
Sean Anderson [Thu, 18 Aug 2022 16:16:26 +0000 (12:16 -0400)]
net: fman: Convert to SPDX identifiers

This converts the license text of files in the fman directory to use
SPDX license identifiers instead.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Tested-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodt-bindings: net: Convert FMan MAC bindings to yaml
Sean Anderson [Thu, 18 Aug 2022 16:16:25 +0000 (12:16 -0400)]
dt-bindings: net: Convert FMan MAC bindings to yaml

This converts the MAC portion of the FMan MAC bindings to yaml.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoRevert "Merge branch 'wwan-t7xx-fw-flashing-and-coredump-support'"
Jakub Kicinski [Fri, 19 Aug 2022 22:29:38 +0000 (15:29 -0700)]
Revert "Merge branch 'wwan-t7xx-fw-flashing-and-coredump-support'"

This reverts commit 5417197dd516a8e115aa69f62a7b7554b0c3829c, reversing
changes made to 0630f64d25a0f0a8c6a9ce9fde8750b3b561e6f5.

Reverting to allow addressing review comments.

Link: https://lore.kernel.org/all/4c5dbea0-52a9-1c3d-7547-00ea54c90550@linux.intel.com/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Fri, 19 Aug 2022 04:17:10 +0000 (21:17 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoigc: add xdp frags support to ndo_xdp_xmit
Lorenzo Bianconi [Wed, 17 Aug 2022 17:36:28 +0000 (10:36 -0700)]
igc: add xdp frags support to ndo_xdp_xmit

Add the capability to map non-linear xdp frames in XDP_TX and
ndo_xdp_xmit callback.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220817173628.109102-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'selftests-mlxsw-add-ordering-tests-for-unified-bridge-model'
Jakub Kicinski [Fri, 19 Aug 2022 03:50:44 +0000 (20:50 -0700)]
Merge branch 'selftests-mlxsw-add-ordering-tests-for-unified-bridge-model'

Petr Machata says:

====================
selftests: mlxsw: Add ordering tests for unified bridge model

Amit Cohen writes:

Commit 798661c73672 ("Merge branch 'mlxsw-unified-bridge-conversion-part-6'")
converted mlxsw driver to use unified bridge model. In the legacy model,
when a RIF was created / destroyed, it was firmware's responsibility to
update it in the relevant FID classification records. In the unified bridge
model, this responsibility moved to software.

This set adds tests to check the order of configuration for the following
classifications:
1. {Port, VID} -> FID
2. VID -> FID
3. VNI -> FID (after decapsulation)

In addition, in the legacy model, software is responsible to update a
table which is used to determine the packet's egress VID. Add a test to
check that the order of configuration does not impact switch behavior.

See more details in the commit messages.

Note that the tests supposed to pass also using the legacy model, they
are added now as with the new model they test the driver and not the
firmware.

Patch set overview:
Patch #1 adds test for {Port, VID} -> FID
Patch #2 adds test for VID -> FID
Patch #3 adds test for VNI -> FID
Patch #4 adds test for egress VID classification
====================

Link: https://lore.kernel.org/r/cover.1660747162.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoselftests: mlxsw: Add egress VID classification test
Amit Cohen [Wed, 17 Aug 2022 15:28:28 +0000 (17:28 +0200)]
selftests: mlxsw: Add egress VID classification test

After routing, the device always consults a table that determines the
packet's egress VID based on {egress RIF, egress local port}. In the
unified bridge model, it is up to software to maintain this table via
REIV register.

The table needs to be updated in the following flows:
1. When a RIF is set on a FID, for each FID's {Port, VID} mapping, a new
   {RIF, Port}->VID mapping should be created.
2. When a {Port, VID} is mapped to a FID and the FID already has a RIF,
   a new {RIF, Port}->VID mapping should be created.

Add a test to verify that packets get the correct VID after routing,
regardless of the order of the configuration.

 # ./egress_vid_classification.sh
 TEST: Add RIF for existing {port, VID}->FID mapping                 [ OK ]
 TEST: Add {port, VID}->FID mapping for FID with a RIF               [ OK ]

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoselftests: mlxsw: Add ingress RIF configuration test for VXLAN
Amit Cohen [Wed, 17 Aug 2022 15:28:27 +0000 (17:28 +0200)]
selftests: mlxsw: Add ingress RIF configuration test for VXLAN

Before layer 2 forwarding, the device classifies an incoming packet to a
FID. After classification, the FID is known, but also all the attributes of
the FID, such as the router interface (RIF) via which a packet that needs
to be routed will ingress the router block.

For VXLAN decapsulation, the FID classification is done according to the
VNI. When a RIF is added on top of a FID, the existing VNI->FID mapping
should be updated by the software with the new RIF. In addition, when a new
mapping is added for FID which already has a RIF, the correct RIF should
be used for it.

Add a test to verify that packets can be routed after decapsulation which
is done after VNI->FID classification, regardless of the order of the
configuration.

 # ./ingress_rif_conf_vxlan.sh
 TEST: Add RIF for existing VNI->FID mapping                         [ OK ]
 TEST: Add VNI->FID mapping for FID with a RIF                       [ OK ]

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoselftests: mlxsw: Add ingress RIF configuration test for 802.1Q bridge
Amit Cohen [Wed, 17 Aug 2022 15:28:26 +0000 (17:28 +0200)]
selftests: mlxsw: Add ingress RIF configuration test for 802.1Q bridge

Before layer 2 forwarding, the device classifies an incoming packet to a
FID. After classification, the FID is known, but also all the attributes of
the FID, such as the router interface (RIF) via which a packet that needs
to be routed will ingress the router block.

For VLAN-aware bridges (802.1Q), the FID classification is done according
to VID. When a RIF is added on top of a FID, the existing VID->FID mapping
should be updated by the software with the new RIF.

We never map multiple VLANs to the same FID using VID->FID, so we cannot
create VID->FID for FID which already has a RIF using 802.1Q. Anyway,
verify that packets can be routed via port which is added after the FID
already has a RIF.

Add a test to verify that packets can be routed after VID->FID
classification, regardless of the order of the configuration.

 # ./ingress_rif_conf_1q.sh
 TEST: Add RIF for existing VID->FID mapping                         [ OK ]
 TEST: Add port to VID->FID mapping for FID with a RIF               [ OK ]

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoselftests: mlxsw: Add ingress RIF configuration test for 802.1D bridge
Amit Cohen [Wed, 17 Aug 2022 15:28:25 +0000 (17:28 +0200)]
selftests: mlxsw: Add ingress RIF configuration test for 802.1D bridge

Before layer 2 forwarding, the device classifies an incoming packet to a
FID. After classification, the FID is known, but also all the attributes of
the FID, such as the router interface (RIF) via which a packet that needs
to be routed will ingress the router block.

For VLAN-unaware bridges (802.1D), the FID classification is done according
to {Port, VID}. When a RIF is added on top of a FID, all the existing
{Port, VID}->FID mappings should be updated by the software with the new
RIF. In addition, when a new mapping is added for FID which already has a
RIF, the correct RIF should be used for it.

Add a test to verify that packets can be routed after {Port, VID}->FID
classification, regardless of the order of the configuration.

 # ./ingress_rif_conf_1d.sh
 TEST: Add RIF for existing {port, VID}->FID mapping                 [ OK ]
 TEST: Add {port, VID}->FID mapping for FID with a RIF               [ OK ]

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>